DATA ARCHITECTURE

Graph Data Modeling

Chapter 17 · Specialized & Applied · Graphs

Orientation

What You'll Master Here

Most databases treat relationships as second-class: a foreign key, a join table, something you reconstruct at query time. Graph databases invert that: relationships are first-class data, stored and traversed directly. This chapter teaches you how to model data when the connections between things are the point.

You will learn the property graph model (nodes, relationships, properties) and how to turn a connected domain into a graph almost word-for-word from requirements. You will also see why multi-hop questions that are brutal self-joins in SQL are natural traversals in a graph, and exactly when a graph is the right tool and when it is not.

Everything is shown with Cypher (Neo4j’s query language) and direct relational comparisons, so you can see the same question expressed both ways and feel why the graph wins for connected data.

Why it matters

Recommendation engines, fraud rings, social networks, knowledge graphs, and supply chains are all about relationships and paths. On those problems a graph is usually simpler to query, and often faster on deep traversals, than forcing the connections through relational joins.

Core mental model

A graph stores entities as nodes and their relationships as first-class, typed, directed edges, so you walk connections instead of joining tables.

Key terms
node
An entity in the graph, with a label (type) and properties.
relationship
A typed, directed edge between two nodes, which can carry its own properties.
property
A key/value pair on a node or relationship.
traversal
Walking from node to node along relationships to answer a question.
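
To make the four terms concrete, here is a minimal in-memory sketch in Python. It is not how Neo4j or any other graph database stores data. The class names, the `FOLLOWS` relationship type, and the sample people are all hypothetical, chosen only to show a labeled node, a typed and directed relationship with its own property, and a one-hop traversal.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    label: str                                   # the node's type, e.g. "Person"
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    start: str                                   # id of the node the edge leaves
    rel_type: str                                # the edge's type, e.g. "FOLLOWS"
    end: str                                     # id of the node the edge enters
    properties: dict = field(default_factory=dict)


# Hypothetical sample data: two people and one directed, typed edge.
nodes = {
    "alice": Node("alice", "Person", {"name": "Alice"}),
    "bob": Node("bob", "Person", {"name": "Bob"}),
}
relationships = [
    Relationship("alice", "FOLLOWS", "bob", {"since": 2021}),
]


def neighbors(node_id, rel_type):
    """One-hop traversal: follow outgoing edges of the given type."""
    return [
        nodes[r.end]
        for r in relationships
        if r.start == node_id and r.rel_type == rel_type
    ]


followed = neighbors("alice", "FOLLOWS")
print([n.properties["name"] for n in followed])
```

Because the edge is directed, `neighbors("bob", "FOLLOWS")` returns an empty list: Alice follows Bob, not the other way around.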

Common mistake

Reaching for a graph for ordinary tabular, aggregation-heavy workloads.

Graph databases are not optimized for large set-based aggregation; you give up the warehouse’s strengths and gain no traversal benefit.
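
As a rough illustration of what "set-based aggregation" means, the sketch below runs a revenue rollup through Python's built-in `sqlite3` module. The `sales` table, its columns, and the figures are invented for the example. The query scans and groups rows and never follows a connection from one entity to another, so there is no traversal for a graph to speed up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, sale_date TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        # Hypothetical rows: (region, ISO date, amount)
        ("EMEA", "2024-01-05", 100.0),
        ("EMEA", "2024-01-20", 50.0),
        ("APAC", "2024-01-11", 70.0),
        ("EMEA", "2024-02-02", 30.0),
    ],
)

# Monthly revenue by region: one scan, one GROUP BY, no paths to walk.
rows = conn.execute(
    """
    SELECT region, substr(sale_date, 1, 7) AS month, SUM(amount) AS revenue
    FROM sales
    GROUP BY region, substr(sale_date, 1, 7)
    ORDER BY region, month
    """
).fetchall()

for region, month, revenue in rows:
    print(region, month, revenue)
```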

Better habit

  • Use a graph when relationships and paths are the core question.
  • Model relationships as first-class edges, not join tables.
  • Keep aggregation-heavy analytics in relational/dimensional stores.

The big idea

In a graph, a relationship is data you store and walk, not a join you compute. That makes deep, variable-length connection questions both natural to express and fast to run.
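
The contrast can be sketched in a few lines of Python, under stated assumptions: the `FOLLOWS` edges and names are hypothetical, the graph side is a plain adjacency dictionary rather than a real graph engine, and the relational side uses the standard-library `sqlite3` module. In Cypher, a question like this is typically written as a variable-length pattern such as `(a)-[:FOLLOWS*1..3]->(p)`. In SQL, each additional hop means another self-join on the edge table.

```python
import sqlite3
from collections import deque

# Hypothetical FOLLOWS edges, stored once as (start, end) pairs.
edges = [("alice", "bob"), ("bob", "carol"), ("carol", "dave"), ("dave", "erin")]

# Graph view: an adjacency index, so each hop is a direct lookup.
adjacency = {}
for start, end in edges:
    adjacency.setdefault(start, []).append(end)


def reachable(start, max_hops):
    """Breadth-first walk: everyone reachable in 1..max_hops FOLLOWS hops."""
    seen = {start}
    found = set()
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in adjacency.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                found.add(nxt)
                queue.append((nxt, depth + 1))
    return found


# Relational view: the same edges in a table. Three hops means three
# self-joins, and the query must be rewritten whenever the hop limit changes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE follows (follower TEXT, followee TEXT)")
conn.executemany("INSERT INTO follows VALUES (?, ?)", edges)

three_hops_sql = """
SELECT f1.followee FROM follows f1 WHERE f1.follower = :start
UNION
SELECT f2.followee FROM follows f1
JOIN follows f2 ON f2.follower = f1.followee
WHERE f1.follower = :start
UNION
SELECT f3.followee FROM follows f1
JOIN follows f2 ON f2.follower = f1.followee
JOIN follows f3 ON f3.follower = f2.followee
WHERE f1.follower = :start
"""
sql_result = {row[0] for row in conn.execute(three_hops_sql, {"start": "alice"})}

graph_result = reachable("alice", 3)
print(sorted(graph_result))
print(graph_result == sql_result)
```

Changing the hop limit on the traversal side is a single argument, for example `reachable("alice", 10)`. The SQL side needs either more joins or a recursive query. The two results agree here because the sample data has no cycles. With cycles, the self-join version could also return the starting person, which the walk excludes.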

How to study this chapter

For each idea, compare the Cypher to the SQL it replaces. The graph’s advantage is clearest on the multi-hop questions.

Practice prompts

  • Name three domains where relationships are the primary question.
  • Explain why a graph is a poor fit for "monthly revenue by region".

Remember this

Graph databases make relationships first-class so connected, multi-hop questions are natural traversals; use them when the connections are the point, not for tabular aggregation.