Commit 7b4d07e

more refactoring
alexottalenegro81 authored and committed
1 parent e99e6e3, commit 7b4d07e

12 files changed, +43 -49 lines changed

README.md

+1 -1
@@ -105,7 +105,7 @@ If you encounter problems in running some of the examples you can decide to eith
 
 All the scripts work perfectly with the community and the enterprise edition of Neo4j.
 Moreover, it is possible to use the [Neo4j desktop](https://neo4j.com/download/) to manage the Neo4j instances.
-The book has been written during the transition from 3.5.x to 4.x so the code has been all updated to run on the version 4.x.
+The book has been written during the transition from 3.5.x to 4.x so the code has been all updated to run on the version 4.x (all code was tested on 4.2.3).
 It introduced some changes, like the variable binding and the multi relationships syntax in Cypher that make the code incompatible with the previous version 3.5.x.
 
 You can find all the instruction for downloading and installing Neo4j in the way you prefer here:

ch04/imports/movielens/Makefile

+4 -1
@@ -4,9 +4,12 @@ PYTHON=python
 init:
 	$(PIP) install -r requirements.txt
 
-source:
+download:
 	mkdir -p ../../../dataset/movielens
 	curl -f http://files.grouplens.org/datasets/movielens/ml-latest-small.zip -o ../../../dataset/movielens/ml-latest-small.zip
 	unzip ../../../dataset/movielens/ml-latest-small.zip -d ../../../dataset/movielens/
+
+
+
 test:
 	nosetests tests
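
For readers without `make`, the renamed `download` target can be approximated in Python. This is a minimal sketch using only the standard library; the function name `download_and_extract` is hypothetical and not part of the repository.

```python
import urllib.request
import zipfile
from pathlib import Path


def download_and_extract(url: str, target_dir: str) -> Path:
    """Download a zip archive into target_dir and unpack it there.

    Hypothetical helper mirroring the Makefile's mkdir/curl/unzip steps.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)   # like `mkdir -p`
    archive = target / url.rsplit("/", 1)[-1]   # keep the remote file name
    urllib.request.urlretrieve(url, archive)    # like `curl -o`; HTTP errors raise
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target)                   # like `unzip -d`
    return archive
```

Calling `download_and_extract("http://files.grouplens.org/datasets/movielens/ml-latest-small.zip", "../../../dataset/movielens")` from `ch04/imports/movielens` would place the files where the Makefile target puts them.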

ch04/imports/movielens/README.md

+4 -4
@@ -8,12 +8,12 @@ To install what is necessary run:
 make
 ```
 
-## Download the data source
+## Download the source data
 The Makefile contains also the command to download the necessary data sources.
 Run:
 
 ```sh
-make source
+make download
 ```
 
 You can also download it manually from [project's site](http://files.grouplens.org/datasets/movielens/ml-latest-small.zip)
@@ -28,7 +28,7 @@ The default location is in the home of this code repository in the directory dat
 python import_movielens.py -u <neo4j username> -p <password> -b <bolt uri> -s <source directory>
 ```
 
-If you used the makefile for downloading the directory you don't need to specify the datasource.
+If you used the makefile for downloading the directory you don't need to specify the datasource. If you specified username, password & bolt URI in the config file, then you don't need to specify these parameters.
 The simple version takes a while to be completed. I recommend to run the parallel version as follows:
 
 ```sh
@@ -42,4 +42,4 @@ it can happen that it will start rejecting the requests. It is perfectly normal.
 
 After the chapter has been released the full version of the IMDB has been released [here](https://www.imdb.com/interfaces/).
 
-In the future I'll make some changes in order to load from files instead.
+In the future I'll make some changes in order to load from files instead.

ch05/recommendation/README.md renamed to ch05/recommendation/collaborative_filtering/README.md

+2 -2
@@ -16,8 +16,8 @@ pip install -r requirements.txt
 
 ## Import the data
 
-Before starting to work with the code, you need to import the data as described [here](../imports/retail_rocket/README.md).
+Before starting to work with the code, you need to import the data as described [here](../../imports/retail_rocket/README.md).
 
 ## Working with the code
 
-The code for chapter 5 is located in the [`collaborative_filtering`](collaborative_filtering/) directory. Just change into it
+Code is in the file `recommender.py`, just execute it with `python recommender.py`.

ch05/recommendation/collaborative_filtering/__init__.py

Whitespace-only changes.

ch05/recommendation/collaborative_filtering/recommender.py

+1 -1
@@ -122,7 +122,7 @@ class UserRecommender(BaseRecommender):
 WITH otherUser, count(otherUser) as size
 MATCH (otherUser)-[r:PURCHASES]->(target:Target)
 WHERE target.itemId = $itemId
-return (1.0f/size)*count(r) as score
+return (+1.0/size)*count(r) as score
 """
 
 

ch07/imports/depaulmovie/Makefile

+2 -1
@@ -5,4 +5,5 @@ test:
 	nosetests tests
 
 download:
-	curl -L -o ratings.txt https://raw.githubusercontent.com/JDonini/depaulmovie-recommender-system/master/dataset/Movie_DePaulMovie/ratings.txt
+	mkdir -p ../../../dataset/deeppaulmovie && \
+	curl -L -o ../../../dataset/deeppaulmovie/ratings.txt https://raw.githubusercontent.com/JDonini/depaulmovie-recommender-system/master/dataset/Movie_DePaulMovie/ratings.txt

ch07/imports/depaulmovie/import_depaulmovie.py

+1 -1
@@ -207,7 +207,7 @@ def write_movie_on_db(self):
     importing = DePaulMovieImporter(sys.argv[1:])
     base_path = importing.source_dataset_path
     if not base_path:
-        base_path = "/Users/ale/neo4j-servers/gpml/dataset/Movie_DePaulMovie"
+        base_path = "../../../dataset/deeppaulmovie"
     importing.import_event_data(file=os.path.join(base_path, "ratings.txt"))
     importing.import_movie_details()
     end = time.time() - start

ch08/import/ieee/import_ieee.py

+14 -23
@@ -2,24 +2,25 @@
 import time
 import threading
 from queue import Queue
-from neo4j import GraphDatabase
 import math
 import sys
 
-class IEEEImporter(object):
+from util.graphdb_base import GraphDBBase
 
-    def __init__(self, uri, user, password):
-        self._driver = GraphDatabase.driver(uri, auth=(user, password), encrypted=0)
+class IEEEImporter(GraphDBBase):
+
+    def __init__(self, argv):
+        super().__init__(command=__file__, argv=argv)
         self._transactions = Queue()
         self._dictionaries = {}
         self._print_lock = threading.Lock()
         with self._driver.session() as session:
-            self.executeNoException(session, "CREATE CONSTRAINT ON (s:Transaction) ASSERT s.transactionId IS UNIQUE")
-            self.executeNoException(session, "CREATE INDEX ON :Transaction(isFraud)")
-            self.executeNoException(session, "CREATE INDEX ON :Transaction(isTrain)")
+            self.execute_without_exception("CREATE CONSTRAINT ON (s:Transaction) ASSERT s.transactionId IS UNIQUE")
+            self.execute_without_exception("CREATE INDEX ON :Transaction(isFraud)")
+            self.execute_without_exception("CREATE INDEX ON :Transaction(isTrain)")
 
     def close(self):
-        self._driver.close()
+        self.close()
 
     def import_transaction(self, directory):
         j = 0
@@ -113,26 +114,16 @@ def write_transaction(self):
                 print(e, row)
             self._transactions.task_done()
 
-    def executeNoException(self, session, query):
-        try:
-            session.run(query)
-        except Exception as e:
-            pass
-
-
-def strip(string): return ''.join([c if 0 < ord(c) < 128 else ' ' for c in string])
-
 
 if __name__ == '__main__':
-    uri = "bolt://localhost:7687"
-    importer = IEEEImporter(uri=uri, user="neo4j", password="q1")
+    importer = IEEEImporter(sys.argv[1:])
 
     start = time.time()
-    base_path = "/Users/ale/neo4j-servers/gpml/dataset/ieee/"
-    if len(sys.argv) > 1:
-        base_path = sys.argv[1]
+    base_path = importer.source_dataset_path
+    if not base_path:
+        base_path = "../../../dataset/ieee"
     importer.import_transaction(directory=base_path)
-    print("Time to complete paysim ingestion:", time.time() - start)
+    print("Time to complete IEEE ingestion:", time.time() - start)
 
     # intermediate = time.time()
     # importer.post_processing(sess_clicks=sessions)
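
The new `close` override in `IEEEImporter` calls `self.close()`, which resolves to the same method and recurses until Python raises `RecursionError`. Below is a minimal sketch of the usual alternative, delegating to the parent with `super().close()`. `Base`, `RecursiveImporter`, `DelegatingImporter` and `try_close` are stand-in names for illustration only; whether `GraphDBBase` defines `close()` is an assumption, since `util/graphdb_base.py` is not shown in this diff.

```python
class Base:
    """Hypothetical stand-in for a base class that owns a resource."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class RecursiveImporter(Base):
    def close(self):
        # Resolves to this same method, so Base.close() is never reached.
        self.close()


class DelegatingImporter(Base):
    def close(self):
        # Hand over explicitly to the parent implementation.
        super().close()


def try_close(importer):
    """Return True if close() completed, False if it hit the recursion limit."""
    try:
        importer.close()
    except RecursionError:
        return False
    return True
```

If the subclass adds nothing of its own to `close`, dropping the override entirely would have the same effect as the delegating version.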

ch08/import/paysim/import_paysim.py

+14 -15
@@ -1,16 +1,18 @@
 import pandas as pd
 import numpy as np
 import time
-from neo4j import GraphDatabase
 import sys
+import os
 
-class PaySimImporter(object):
+from util.graphdb_base import GraphDBBase
 
-    def __init__(self, uri, user, password):
-        self._driver = GraphDatabase.driver(uri, auth=(user, password))
+class PaySimImporter(GraphDBBase):
+
+    def __init__(self, argv):
+        super().__init__(command=__file__, argv=argv)
 
     def close(self):
-        self._driver.close()
+        self.close()
 
     def import_paysim(self, file):
         dtype = {
@@ -144,19 +146,16 @@ def post_processing(self, sess_clicks):
         return sess_clicks
 
 
-def strip(string): return ''.join([c if 0 < ord(c) < 128 else ' ' for c in string])
-
-
 if __name__ == '__main__':
-    uri = "bolt://localhost:7687"
-    importer = PaySimImporter(uri=uri, user="neo4j", password="q1")
+    importer = PaySimImporter(sys.argv[1:])
 
     start = time.time()
-    file_path = "/Users/ale/neo4j-servers/gpml/dataset/paysim/PS_20174392719_1491204439457_log.csv"
-    if len(sys.argv) > 1:
-        file_path = sys.argv[1]
-    importer.import_paysim(file=file_path)
-    print("Time to complete paysim ingestion:", time.time() - start)
+    base_path = importer.source_dataset_path
+    if not base_path:
+        base_path = "../../../dataset/paysim"
+
+    importer.import_paysim(file=os.path.join(base_path, "PS_20174392719_1491204439457_log.csv"))
+    print("Time to complete PaySim ingestion:", time.time() - start)
 
     # intermediate = time.time()
     # importer.post_processing(sess_clicks=sessions)
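
Both importers now take `sys.argv[1:]` and fall back to a repository-relative dataset directory when `source_dataset_path` is empty. The sketch below shows one way such parsing could work with `argparse`. It is not the code in `util/graphdb_base.py`, which this diff does not show. The `-u`, `-p`, `-b` and `-s` flags are taken from the movielens README, and applying them to these importers is an assumption. The function names are hypothetical.

```python
import argparse


def parse_importer_args(argv):
    """Parse connection and dataset flags (hypothetical helper)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-u", dest="user", default=None)
    parser.add_argument("-p", dest="password", default=None)
    parser.add_argument("-b", dest="bolt_uri", default=None)
    parser.add_argument("-s", dest="source_dataset_path", default=None)
    return parser.parse_args(argv)


def resolve_dataset_path(argv, default_path):
    """Use the -s value when given, otherwise the repository default."""
    args = parse_importer_args(argv)
    return args.source_dataset_path or default_path
```

With this shape, `resolve_dataset_path([], "../../../dataset/paysim")` returns the default, while passing `["-s", "/data/paysim"]` overrides it, which matches the `if not base_path:` fallback in the main block.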
