GraphDatabases

This information relates specifically to Neo4j; YMMV for other graph databases.

A graph database has several types of data:

nodes - graph data records
relationships - connections between nodes
properties - named data values (name/value pairs) for nodes or relationships
labels - mechanism to group nodes together

A node can have properties. A relationship can have properties. A label does not have any properties. Similar nodes can have different sets of properties. Properties can be strings, numbers, booleans, or lists of those types. A node may have multiple labels.
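
As a rough illustration of these four kinds of data, the Cypher sketch below creates two nodes and one relationship. The names, labels, relationship type, and property values are made up for this example.

```cypher
// A node with two labels (Person, Author) and two properties
CREATE (a:Person:Author {name: "Alice", books: 3})
// A similar node with a different set of properties
CREATE (b:Person {name: "Bob", active: true})
// A relationship with a type, a direction, and its own property
CREATE (a)-[:MENTORS {since: 2010}]->(b)
```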

Relationships (property graph)

Relationships always have a direction
Relationships always have a type
Relationships always have a start node
Relationships always have an end node
Relationships form patterns of data
Relationships are data records
Relationships can contain properties

Most often, relationships have quantitative properties, like weight, rating, strength, or distance. Even though they are directed, relationships can be navigated regardless of direction.

The only rule: no broken links. You cannot delete a node without also deleting the relationships it participates in.
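
A minimal sketch of the rule in practice, assuming a Neo4j version that supports DETACH DELETE (older versions need the relationships matched and deleted explicitly). The two "Temp" nodes are throwaway data invented for this example.

```cypher
// Throwaway data: two nodes joined by one relationship
CREATE (a:Person {name: "Temp A"})-[:KNOWS]->(b:Person {name: "Temp B"});

// A plain DELETE is rejected while "Temp A" still has a relationship:
// MATCH (a:Person {name: "Temp A"}) DELETE a;

// DETACH DELETE removes the node and its relationships together
MATCH (a:Person {name: "Temp A"}) DETACH DELETE a;
```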

Graph relationships naturally form paths, so querying/traversing the graph involves following those paths.

Modeling With Graphs

When transitioning from a relational model, use the following guidelines:

A row is a node.
A table name is a label.
Adopt a "design for queryability" mindset.

The graph model describes the relationship in more detail: the name of the relationship already indicates its nature. Additionally, in a graph model, the data can be normalized without sacrificing performance.
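
The guidelines above might look like this for two hypothetical relational tables. The table names, columns, and the WORKS_IN relationship type are assumptions for illustration, and turning the foreign key into a relationship is a common convention rather than a rule stated here.

```cypher
// Hypothetical relational rows:
//   department(id = 10, name = "Engineering")
//   employee(id = 1, name = "Alice", department_id = 10)

// Each row becomes a node; each table name becomes a label
CREATE (d:Department {name: "Engineering"})
CREATE (e:Employee {name: "Alice"})
// The foreign key becomes a named, directed relationship
CREATE (e)-[:WORKS_IN]->(d)
```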

Define a statement describing a connection between two entities.
Identify each unique conceptual identity in the statement as a node.
Extract label names by identifying the roles of each of the nodes.
Connect the nodes with a relationship by describing their interactions.
Draw the data model.
Start asking pertinent questions about your data to identify properties of the node or relationship.
Create a simple dataset to validate your assumptions.
Translate your questions into queries.
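
The steps above can be walked through with one small, invented statement: "Alice wrote the book Graph Notes". Everything in this sketch (names, labels, the WROTE type, the year property) is hypothetical.

```cypher
// Nodes: Alice, Graph Notes. Labels (their roles): Author, Book.
// Relationship (their interaction): WROTE.
// Property found by asking "when was it written?": year.
CREATE (a:Author {name: "Alice"})-[:WROTE {year: 2014}]->(b:Book {title: "Graph Notes"});

// The question "Which books did Alice write?" translated into a query
MATCH (:Author {name: "Alice"})-[:WROTE]->(b:Book)
RETURN b.title;
```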

Use labels to group nodes into sets. Queries can limit their scope by using these labels instead of searching the entire graph. After identifying the roles of the objects, extract the label names.
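
A short sketch of how a label narrows a query, assuming Person nodes with a name property like those in the Cypher examples on this page:

```cypher
// Without a label, every node in the graph is a candidate
MATCH (n) WHERE n.name = "Emil" RETURN n;

// With a label, only Person nodes are considered
MATCH (n:Person) WHERE n.name = "Emil" RETURN n;
```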

When refactoring a graph database schema, the normal mechanism is to add new nodes and relationships rather than add new properties to existing ones. The rationale is data safety: changing properties might alter the results of existing queries. Graph databases are naturally additive. Always model for the questions you want answered and create new nodes and relationships that describe them.
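
One possible additive refactoring, sketched under the assumption that Person nodes carry a from property (as in the Cypher examples on this page). The Country label and the COMES_FROM type are invented for this example. Existing properties stay untouched, so queries that read them keep working.

```cypher
// Only consider Person nodes that actually have the property
MATCH (p:Person)
WHERE p.`from` IS NOT NULL
// MERGE reuses a Country node if one with this name already exists
MERGE (c:Country {name: p.`from`})
// Add the new relationship without modifying the Person node
MERGE (p)-[:COMES_FROM]->(c)
```

The backticks around from are a precaution, since the word is also a Cypher keyword.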

Avoid storing entities in relationships; keep relationship properties focused on how the entities are related rather than on what they are. Be careful about the nouns you use.

Neo4j Data Modeling Guide
Modeling Trees with Neo4j
Neo4j Mailing List - search for 'tree'

Neo4j Browser

This is a command-driven, web-based client. Use it to run ad-hoc graph queries or to prototype a simple Neo4j-based application. You can export any query results. It provides visualization mechanisms. It is built on top of the REST API.

Editor

:help
edit multi-line with <shift-enter>
execute a query with <ctrl-enter>
:play start|intro|concepts|graphs
:clear
:play sysinfo will get monitoring information
:help history will show the command history of the browser

Cypher

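When querying with Cypher, we frequently start with bound nodes, which are well-known starting points in the graph. In older Neo4j versions, the START clause queries the underlying indexes to find these nodes before exploring the rest of the graph; later versions treat START as legacy and use MATCH with labels and property predicates instead, as in the examples below.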

A simple example to create a small social graph:

CREATE (ee:Person { name: "Emil", from: "Sweden", klout: 99 })

CREATE creates the data
() indicates a node
ee:Person a variable ee and label Person for the new node
{} add properties to the node

A simple example to find the node representing Emil:

MATCH (ee:Person) WHERE ee.name = "Emil" RETURN ee;

MATCH specifies a pattern of nodes and relationships
(ee:Person) a single node pattern with label "Person" which will assign matches to the variable ee
WHERE constrains the results
ee.name = "Emil" compares the name property to the value "Emil"
RETURN used to request particular results
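
The same lookup can also be written with the property inlined in the node pattern, using the same map syntax as CREATE:

```cypher
// Same result as MATCH (ee:Person) WHERE ee.name = "Emil" RETURN ee
MATCH (ee:Person {name: "Emil"})
RETURN ee;
```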

You can create many nodes and relationships at the same time:

MATCH (ee:Person) WHERE ee.name = "Emil"
CREATE (js:Person { name: "Johan", from: "Sweden", learn: "surfing" }),
(ir:Person { name: "Ian", from: "England", title: "author" }),
(rvb:Person { name: "Rik", from: "Belgium", pet: "Orval" }),
(ally:Person { name: "Allison", from: "California", hobby: "surfing" }),
(ee)-[:KNOWS {since: 2001}]->(js),(ee)-[:KNOWS {rating: 5}]->(ir),
(js)-[:KNOWS]->(ir),(js)-[:KNOWS]->(rvb),
(ir)-[:KNOWS]->(js),(ir)-[:KNOWS]->(ally),
(rvb)-[:KNOWS]->(ally)

Pattern Matching

A pattern can be used to find Emil's friends:

MATCH (ee:Person)-[:KNOWS]-(friends)
WHERE ee.name = "Emil" RETURN ee, friends

MATCH describes the pattern from known nodes to found nodes
(ee:Person) starts the pattern with a Person (qualified by WHERE)
-[:KNOWS]- matches "KNOWS" relationships in either direction
(friends) results bound to this variable
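
Adding an arrow to the pattern restricts the match to one direction. The sketch below assumes the sample graph created earlier on this page and uses Ian, who has KNOWS relationships in both directions.

```cypher
// Outgoing only: people Ian knows
MATCH (ir:Person {name: "Ian"})-[:KNOWS]->(known)
RETURN known.name;

// Incoming only: people who know Ian
MATCH (ir:Person {name: "Ian"})<-[:KNOWS]-(knower)
RETURN knower.name;
```

If the sample graph was created exactly once, the first query should return Johan and Allison and the second Emil and Johan.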

Pattern matching can also make recommendations. For example, Johan is learning to surf, so he may want to find a new friend who already does:

MATCH (js:Person)-[:KNOWS]-()-[:KNOWS]-(surfer)
WHERE js.name = "Johan" AND surfer.hobby = "surfing"
RETURN DISTINCT surfer

() empty parens to ignore the nodes in-between
DISTINCT because more than one path will match the pattern
surfer will contain Allison, a friend of a friend who surfs
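
The two-hop pattern can also be written as a variable-length relationship. This is a sketch of an equivalent query against the same sample graph:

```cypher
// *2 means exactly two KNOWS hops, in either direction
MATCH (js:Person {name: "Johan"})-[:KNOWS*2]-(surfer)
WHERE surfer.hobby = "surfing"
RETURN DISTINCT surfer;
```

A range such as *1..3 would widen the search to paths of one to three hops.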
