Memgraph: A Crash Course

The open-source, in-memory graph database built for real-time performance

#databases

Prelude

I have had a lot of fun learning Memgraph and integrating it into WaSQL. I also loaded a 13+ million distributor tree into Memgraph and tried it out. It was insanely fast! As I have done with other articles in the past, I reached out to Dominik Tomicevic, one of the inventors and authors of Memgraph, and sent him a private link to this article. He loved it! So this article has been read and approved by Dominik Tomicevic himself. :)

Memgraph

In a world drowning in connected data - social networks, fraud rings, supply chains, recommendation engines - traditional relational databases struggle to keep up. Enter Memgraph: a high-performance, in-memory graph database designed from the ground up for speed.

If you have ever waited seconds (or minutes) for a recursive SQL query to traverse a hierarchy, or wrestled with Neo4j's JVM memory tuning, Memgraph offers a compelling alternative. It is fast, it is open source, and it speaks Cypher - the same query language used by Neo4j.

This crash course will take you from zero to being productive with Memgraph, covering its history, architecture, installation, and practical query examples.

A Brief History

Memgraph was founded in 2016 in Zagreb, Croatia by software engineers Dominik Tomicevic and Marko Budiselic. Frustrated with the performance limitations of existing graph databases, they set out to build something better - a native, in-memory graph database written in C++ rather than Java.

Key milestones:

2016 - Company founded; engineering team begins building the core engine
2018 - First production release
2021 - Memgraph 2.0 launched as open source; raised 8M EUR with Microsoft M12 as investor
2022 - Community grows to 150,000+ developers across 100+ countries
2024 - Memgraph 3.0 released with enhanced GenAI/LLM integration

Today, Memgraph is headquartered in London with engineering operations in Croatia. The company has raised over $14M in funding from investors including Microsoft M12, Heavybit, In-Q-Tel, and Mundi Ventures.

Why Memgraph?

The Core Value Proposition

Memgraph differentiates itself through three pillars:

In-Memory Architecture: The entire graph lives in RAM. No disk seeks, no cold cache surprises. Traversals happen at memory speed.
Native C++ Implementation: Unlike JVM-based competitors, Memgraph avoids garbage collection pauses and JIT warmup time. Performance is consistent from the first query.
Neo4j Compatibility: Memgraph supports openCypher and the Bolt protocol, making it a drop-in replacement for many Neo4j deployments. Your existing drivers and queries often work unchanged.

Performance Numbers

Based on published benchmarks:

Latency: 1ms to 1 second on complex expansion queries (vs. 14ms to 3+ seconds for Neo4j)
Throughput: 32,000+ queries per second on single-hop expansions (vs. ~280 QPS for Neo4j)
Write Speed: 100,000 nodes created in ~400ms

These numbers come from Memgraph's own benchmarks - take them directionally, but the architectural advantages are real.

Use Cases

Memgraph excels in scenarios requiring real-time graph analytics:

Fraud Detection

Banks and payment processors use graph patterns to identify fraud rings in real time. When a transaction occurs, Memgraph can traverse connections to known bad actors in milliseconds.

Cybersecurity

Security teams model network topologies, identify attack paths, and detect anomalies by analyzing connection patterns across infrastructure.

Recommendation Engines

E-commerce and streaming platforms traverse user-item-user graphs to generate personalized recommendations without batch processing delays.

Network & IT Operations

Telecom and cloud providers model infrastructure dependencies, enabling rapid root cause analysis when failures occur.

Knowledge Graphs

Organizations build enterprise knowledge graphs for search, discovery, and AI/LLM augmentation (RAG pipelines).

Supply Chain & Logistics

Manufacturers trace component dependencies and optimize routing through complex supplier networks.

Installation

Memgraph offers multiple deployment options. Here's the fastest path to a running instance:

Option 1: Docker (Recommended)

# Pull and run Memgraph Platform (includes Memgraph + Lab UI)
docker run -d \
  --name memgraph \
  -p 7687:7687 \
  -p 7444:7444 \
  -p 3000:3000 \
  memgraph/memgraph-platform

# Verify it's running
docker logs memgraph

This gives you:

Port 7687: Bolt protocol (for queries)
Port 7444: WebSocket log streaming (used by Lab)
Port 3000: Memgraph Lab web UI
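
Once the container is up, a plain TCP check is a quick way to confirm that the published ports accept connections. The sketch below uses only the Python standard library; the helper name `port_open` is my own, and a successful connect only shows that something is listening on the port, not that it is Memgraph.

```python
import socket


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts
        return False


if __name__ == "__main__":
    # Ports published by the docker run command above
    for port in (7687, 7444, 3000):
        state = "open" if port_open("127.0.0.1", port) else "closed"
        print(f"port {port}: {state}")
```

If 7687 reports closed, `docker logs memgraph` is the next place to look.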

Option 2: Docker (Database Only)

# Minimal install without the Lab UI
docker run -d \
  --name memgraph \
  -p 7687:7687 \
  memgraph/memgraph

Option 3: Native Installation (Debian/Ubuntu)

# Add Memgraph repository
curl -fsSL https://download.memgraph.com/memgraph-archive-keyring.gpg | \
  sudo gpg --dearmor -o /usr/share/keyrings/memgraph-archive-keyring.gpg

echo "deb [signed-by=/usr/share/keyrings/memgraph-archive-keyring.gpg] \
  https://download.memgraph.com/memgraph/v2/debian/ stable main" | \
  sudo tee /etc/apt/sources.list.d/memgraph.list

# Install
sudo apt update
sudo apt install memgraph

# Start the service
sudo systemctl start memgraph

Option 4: Memgraph Cloud

For a fully managed experience, sign up at cloud.memgraph.com. No infrastructure to manage.

Connecting to Memgraph

Using mgconsole (CLI)

# If running in Docker
docker exec -it memgraph mgconsole

# Native installation
mgconsole --host 127.0.0.1 --port 7687

Using Memgraph Lab (Web UI)

Navigate to http://localhost:3000 in your browser. Lab provides:

Query editor with syntax highlighting
Graph visualization
Schema exploration
Query execution history

Using Drivers

Memgraph supports the Bolt protocol, so standard Neo4j drivers work:

Python

from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687")

with driver.session() as session:
    result = session.run("MATCH (n) RETURN count(n) AS count")
    print(result.single()["count"])

driver.close()

JavaScript

const neo4j = require('neo4j-driver');

// await is only valid inside an async function when using require()
async function main() {
  const driver = neo4j.driver('bolt://localhost:7687');
  const session = driver.session();
  try {
    const result = await session.run('MATCH (n) RETURN count(n) AS count');
    console.log(result.records[0].get('count').toNumber());
  } finally {
    await session.close();
    await driver.close();
  }
}

main().catch(console.error);

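Go

package main

import (
    "context"
    "github.com/neo4j/neo4j-go-driver/v5/neo4j"
)

func main() {
    ctx := context.Background()
    driver, err := neo4j.NewDriverWithContext("bolt://localhost:7687", neo4j.NoAuth())
    if err != nil {
        panic(err)
    }
    defer driver.Close(ctx)
}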

Supported Programming Languages

Memgraph supports the Bolt protocol and is compatible with Neo4j drivers, making it accessible from a wide range of programming languages:

Officially Supported Languages

Language	Driver	Documentation
Python	`neo4j` or `gqlalchemy`	Neo4j Python Driver / GQLAlchemy OGM
JavaScript/TypeScript/Node.js	`neo4j-driver`	Neo4j JavaScript Driver
Java	`neo4j-java-driver`	Neo4j Java Driver
Groovy	`neo4j-java-driver`	Neo4j Java Driver (JVM-based)
Go	`neo4j-go-driver`	Neo4j Go Driver
C#/.NET	`Neo4j.Driver`	Neo4j .NET Driver
PowerShell	`Neo4j.Driver` or `NEO4J-PowerShell-Driver`	Neo4j .NET Driver / PowerShell Wrapper
C/C++	`mgclient`	Memgraph C/C++ Client
Rust	`neo4rs` or `rsmgclient`	Neo4j Rust Driver / Memgraph Rust Client
PHP	`laudis/neo4j-php-client`	Neo4j PHP Client
Ruby	`neo4j-ruby-driver`	Neo4j Ruby Driver
Julia	`Neo4jBolt.jl`	Community Julia Bolt Driver

Additional Language Support

Since Memgraph uses the standard Bolt protocol, any language with a Neo4j-compatible driver can connect to Memgraph. Additional community-supported languages include:

Elixir (via Bolt.Sips)
Haskell (via Hasbolt)
Scala (via Neo4j Java Driver)
R (via neo4r)

For the most up-to-date list of client libraries, visit the Memgraph documentation.

Core Concepts

Before diving into queries, understand the property graph model:

Nodes (Vertices)

Entities in your graph. Each node can have:

One or more labels (like types): :Person, :Company
Properties (key-value pairs): {name: "Alice", age: 30}

Relationships (Edges)

Connections between nodes. Each relationship has:

A type: :WORKS_AT, :KNOWS, :PURCHASED
A direction: from one node to another
Optional properties: {since: 2020}

Visual Example

(:Person {name: "Alice"})-[:WORKS_AT {since: 2020}]->(:Company {name: "Acme"})

This represents Alice working at Acme since 2020.
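
To make the model concrete, here is a minimal sketch of the same structure as plain Python dataclasses. It is only a mental model: the class and field names are my own, and it says nothing about how Memgraph stores data internally.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    labels: set[str]                       # e.g. {"Person"}
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    rel_type: str                          # e.g. "WORKS_AT"
    start: Node                            # direction: start -> end
    end: Node
    properties: dict = field(default_factory=dict)


alice = Node({"Person"}, {"name": "Alice"})
acme = Node({"Company"}, {"name": "Acme"})
works_at = Relationship("WORKS_AT", alice, acme, {"since": 2020})

print(f'{works_at.start.properties["name"]} -[:{works_at.rel_type}]-> '
      f'{works_at.end.properties["name"]} since {works_at.properties["since"]}')
# prints: Alice -[:WORKS_AT]-> Acme since 2020
```

Note that the relationship is a first-class object with its own type and properties, which is what separates a property graph from a simple foreign-key link.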

Cypher Query Examples

Cypher is a declarative, pattern-matching query language. If you know SQL, you'll find it intuitive.

Creating Data

// Create a single node
CREATE (:Person {name: "Alice", age: 30});

// Create a node and return it
CREATE (p:Person {name: "Bob", age: 25})
RETURN p;

// Create multiple nodes
CREATE (:Person {name: "Charlie"}), (:Person {name: "Diana"});

// Create a relationship between existing nodes
MATCH (a:Person {name: "Alice"}), (b:Person {name: "Bob"})
CREATE (a)-[:KNOWS {since: 2023}]->(b);

// Create nodes and relationship in one statement
CREATE (a:Person {name: "Eve"})-[:MANAGES]->(b:Person {name: "Frank"});

Reading Data

// Find all nodes
MATCH (n) RETURN n;

// Find nodes by label
MATCH (p:Person) RETURN p;

// Find nodes by property
MATCH (p:Person {name: "Alice"}) RETURN p;

// Find nodes with WHERE clause
MATCH (p:Person)
WHERE p.age > 25
RETURN p.name, p.age;

// Find relationships
MATCH (a:Person)-[r:KNOWS]->(b:Person)
RETURN a.name, b.name, r.since;

// Traverse multiple hops
MATCH (a:Person {name: "Alice"})-[:KNOWS*1..3]->(friend)
RETURN DISTINCT friend.name;
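
The `*1..3` bound in the last query means "follow between one and three KNOWS relationships". As a rough mental model (not a description of Memgraph's actual execution strategy), the distinct result set is what a level-by-level expansion capped at three hops would find. The adjacency dict and the function name below are illustrative assumptions.

```python
def reachable_within(graph: dict[str, list[str]], start: str, max_hops: int) -> set[str]:
    """Nodes reachable from start by following 1..max_hops outgoing edges."""
    found: set[str] = set()
    frontier = {start}
    for _ in range(max_hops):
        # Expand every node in the current frontier by one hop
        frontier = {nxt for node in frontier for nxt in graph.get(node, [])}
        new = frontier - found
        if not new:
            break          # nothing new at this depth, so deeper hops add nothing
        found |= new
        frontier = new
    return found


# Hypothetical KNOWS edges: Alice -> Bob -> Charlie -> Diana -> Eve
knows = {
    "Alice": ["Bob"],
    "Bob": ["Charlie"],
    "Charlie": ["Diana"],
    "Diana": ["Eve"],
}

print(sorted(reachable_within(knows, "Alice", 3)))
# prints: ['Bob', 'Charlie', 'Diana']
```

Eve is four hops away from Alice, so she falls outside the `*1..3` bound.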

Updating Data

// Update a property
MATCH (p:Person {name: "Alice"})
SET p.age = 31;

// Add a new property
MATCH (p:Person {name: "Alice"})
SET p.email = "alice@example.com";

// Add a label
MATCH (p:Person {name: "Alice"})
SET p:Employee;

// Remove a property
MATCH (p:Person {name: "Alice"})
REMOVE p.email;

Deleting Data

// Delete a node (must have no relationships)
MATCH (p:Person {name: "Frank"})
DELETE p;

// Delete a node and all its relationships
MATCH (p:Person {name: "Eve"})
DETACH DELETE p;

// Delete a specific relationship
MATCH (a:Person {name: "Alice"})-[r:KNOWS]->(b:Person {name: "Bob"})
DELETE r;

// Delete all data (careful!)
MATCH (n) DETACH DELETE n;

Aggregations

// Count nodes
MATCH (p:Person) RETURN count(p);

// Group and count
MATCH (p:Person)-[:WORKS_AT]->(c:Company)
RETURN c.name, count(p) AS employee_count
ORDER BY employee_count DESC;

// Calculate averages
MATCH (p:Person)
RETURN avg(p.age) AS average_age;

// Collect into lists
MATCH (c:Company)<-[:WORKS_AT]-(p:Person)
RETURN c.name, collect(p.name) AS employees;

Real-World Example: Distributor Tree

Let's model an MLM/distributor hierarchy - a perfect use case for graph databases.

Create the Schema

// Create indexes for performance
CREATE INDEX ON :Distributor(id);

// Create some distributors
CREATE (d1:Distributor {id: 1, name: "Alice", level: 0})
CREATE (d2:Distributor {id: 2, name: "Bob", level: 1})
CREATE (d3:Distributor {id: 3, name: "Charlie", level: 1})
CREATE (d4:Distributor {id: 4, name: "Diana", level: 2})
CREATE (d5:Distributor {id: 5, name: "Eve", level: 2})
CREATE (d6:Distributor {id: 6, name: "Frank", level: 2})

// Create sponsor relationships
CREATE (d1)-[:SPONSORS]->(d2)
CREATE (d1)-[:SPONSORS]->(d3)
CREATE (d2)-[:SPONSORS]->(d4)
CREATE (d2)-[:SPONSORS]->(d5)
CREATE (d3)-[:SPONSORS]->(d6);

Common Queries

// Find someone's direct downline
MATCH (d:Distributor {id: 1})-[:SPONSORS]->(downline)
RETURN downline.name;

// Find full downline (all levels)
MATCH (d:Distributor {id: 1})-[:SPONSORS*]->(downline)
RETURN downline.name, downline.level;

// Find upline (sponsor chain to root)
MATCH (d:Distributor {id: 6})<-[:SPONSORS*]-(upline)
RETURN upline.name, upline.level
ORDER BY upline.level;

// Count total team size
MATCH (d:Distributor {id: 1})-[:SPONSORS*]->(downline)
RETURN count(downline) AS team_size;

// Find distributors at a specific depth (exactly 2 levels down in the sample tree)
MATCH (d:Distributor {id: 1})-[:SPONSORS*2]->(downline)
RETURN downline.name;

// Find the shortest path between two distributors
// (Memgraph uses built-in BFS traversal syntax rather than Neo4j's shortestPath())
MATCH path = (a:Distributor {id: 1})-[:SPONSORS *BFS]-(b:Distributor {id: 6})
RETURN [node IN nodes(path) | node.name] AS names;
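
For comparison, here is roughly what the "full downline" question looks like in a relational store, using Python's built-in sqlite3 module and a recursive CTE. The table layout (an adjacency list with a `sponsor_id` column) is my own assumption for illustration; the data mirrors the six sample distributors, so the output also shows what the Cypher downline query should return.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE distributor (id INTEGER PRIMARY KEY, name TEXT, sponsor_id INTEGER)"
)
conn.executemany(
    "INSERT INTO distributor VALUES (?, ?, ?)",
    [
        (1, "Alice", None),
        (2, "Bob", 1),
        (3, "Charlie", 1),
        (4, "Diana", 2),
        (5, "Eve", 2),
        (6, "Frank", 3),
    ],
)

# Relational counterpart of: MATCH (d {id: 1})-[:SPONSORS*]->(downline)
DOWNLINE_SQL = """
WITH RECURSIVE downline(id, name, depth) AS (
    SELECT id, name, 1 FROM distributor WHERE sponsor_id = :root
    UNION ALL
    SELECT d.id, d.name, downline.depth + 1
    FROM distributor AS d
    JOIN downline ON d.sponsor_id = downline.id
)
SELECT name, depth FROM downline ORDER BY depth, id
"""

rows = conn.execute(DOWNLINE_SQL, {"root": 1}).fetchall()
print(rows)
# prints: [('Bob', 1), ('Charlie', 1), ('Diana', 2), ('Eve', 2), ('Frank', 2)]
print(len(rows))
# prints: 5
```

The one-line `[:SPONSORS*]` pattern replaces the whole recursive CTE, which is much of the appeal of a graph query language for hierarchies.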

Advanced Features

MAGE: Graph Algorithms Library

Memgraph includes MAGE (Memgraph Advanced Graph Extensions), an open-source library of graph algorithms:

// PageRank
CALL pagerank.get() 
YIELD node, rank
RETURN node.name, rank
ORDER BY rank DESC
LIMIT 10;

// Community Detection (Louvain)
CALL community_detection.get()
YIELD node, community_id
RETURN community_id, collect(node.name) AS members;

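// Weighted shortest path (built into Memgraph's Cypher, no MAGE procedure needed)
// Every hop costs 1 here; use a relationship property such as r.weight for real weights
MATCH path = (a:Distributor {id: 1})-[:SPONSORS *WSHORTEST (r, n | 1) total_weight]-(b:Distributor {id: 6})
RETURN path, total_weight;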

Streaming Integration

Memgraph connects directly to streaming platforms:

// Create a Kafka stream
// (purchase_transform.purchase stands in for your own transformation procedure)
CREATE KAFKA STREAM purchases
TOPICS purchase_events
TRANSFORM purchase_transform.purchase
BATCH_SIZE 1000;

// Start consuming
START STREAM purchases;
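
A stream relies on a transformation that turns each incoming message into a Cypher query plus parameters. The sketch below shows only that mapping as a plain Python function so it can run anywhere; inside Memgraph it would be wrapped in a transformation module written against the `mgp` Python API, which is omitted here. The JSON field names (`user_id`, `product_id`, `amount`) and the `:User`/`:Product`/`:PURCHASED` schema are assumptions for illustration.

```python
import json

# Parameterized Cypher produced for every purchase event (hypothetical schema)
PURCHASE_QUERY = (
    "MERGE (u:User {id: $user_id}) "
    "MERGE (p:Product {id: $product_id}) "
    "CREATE (u)-[:PURCHASED {amount: $amount}]->(p)"
)


def purchase_to_query(payload: bytes) -> tuple[str, dict]:
    """Map one JSON-encoded purchase event to (query, parameters)."""
    event = json.loads(payload.decode("utf-8"))
    params = {
        "user_id": event["user_id"],
        "product_id": event["product_id"],
        "amount": float(event["amount"]),   # normalize "19.99" to a number
    }
    return PURCHASE_QUERY, params


message = b'{"user_id": 7, "product_id": 42, "amount": "19.99"}'
query, params = purchase_to_query(message)
print(params)
# prints: {'user_id': 7, 'product_id': 42, 'amount': 19.99}
```

Using MERGE for the endpoints keeps the mapping idempotent for users and products, while each event still creates its own PURCHASED relationship.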

Triggers

Execute logic automatically when data changes:

// notify.slack is a hypothetical custom query module, not part of Memgraph or MAGE
CREATE TRIGGER new_distributor
ON () CREATE AFTER COMMIT EXECUTE
CALL notify.slack("New distributor joined!");

Performance Tips

1. Create Indexes

// Label + property index (most common)
CREATE INDEX ON :Distributor(id);

// Check existing indexes
SHOW INDEX INFO;

2. Use Parameters (Avoid Query Injection)

# Bad - string concatenation
session.run(f"MATCH (d:Distributor {{id: {user_id}}}) RETURN d")

# Good - parameterized query
session.run("MATCH (d:Distributor {id: $id}) RETURN d", id=user_id)
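
The difference is easy to see without a database. In this sketch (the function names are my own), the interpolated version lets crafted input rewrite the query text, while the parameterized version keeps the text constant and ships the input separately.

```python
def unsafe_query(user_id: str) -> str:
    # String interpolation: the input becomes part of the query text
    return f"MATCH (d:Distributor {{id: {user_id}}}) RETURN d"


def safe_query(user_id: str) -> tuple[str, dict]:
    # Parameterized: the query text never changes, the input travels as data
    return "MATCH (d:Distributor {id: $id}) RETURN d", {"id": user_id}


malicious = "1}) DETACH DELETE d //"

print(unsafe_query(malicious))
# prints: MATCH (d:Distributor {id: 1}) DETACH DELETE d //}) RETURN d

text, params = safe_query(malicious)
print(text)
# prints: MATCH (d:Distributor {id: $id}) RETURN d
```

Because `//` starts a Cypher comment, the interpolated query would delete the distributor instead of returning it. The safe pair can be passed straight to the driver as `session.run(text, params)`.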

3. Limit Traversal Depth

// Unbounded - potentially expensive
MATCH (d)-[:SPONSORS*]->(x) RETURN x;

// Bounded - predictable performance
MATCH (d)-[:SPONSORS*1..10]->(x) RETURN x;
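
A back-of-the-envelope estimate shows why the bound matters. If every distributor sponsors roughly `b` others, a traversal to depth `d` can touch b + b^2 + ... + b^d nodes. The uniform fan-out is a simplifying assumption; real trees are uneven, but the growth pattern is the same.

```python
def nodes_reached(fan_out: int, depth: int) -> int:
    """Nodes at depths 1..depth in a tree where every node has fan_out children."""
    return sum(fan_out ** k for k in range(1, depth + 1))


for depth in (3, 5, 10):
    print(depth, nodes_reached(5, depth))
# prints:
# 3 155
# 5 3905
# 10 12207030
```

With a fan-out of five, going from depth 5 to depth 10 grows the work from a few thousand nodes to over twelve million, so an explicit upper bound keeps query cost predictable.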

4. Profile Your Queries

// See the execution plan
EXPLAIN MATCH (d:Distributor)-[:SPONSORS*1..5]->(x) RETURN x;

// Run and see actual metrics
PROFILE MATCH (d:Distributor)-[:SPONSORS*1..5]->(x) RETURN x;

5. Memory Configuration

# Set memory limit (in MiB; 4096 = 4 GiB) via a Memgraph flag after the image name
docker run -d \
  --name memgraph \
  -p 7687:7687 \
  memgraph/memgraph \
  --memory-limit=4096

When to Choose Memgraph

Choose Memgraph when:

You need real-time traversal performance (sub-millisecond to low milliseconds)
Your graph fits in memory (or you can shard appropriately)
You are already using Cypher/Neo4j and want a faster alternative
You need streaming data ingestion (Kafka, Pulsar)
You want predictable latency without JVM tuning

Consider alternatives when:

Your graph is too large for RAM and you cannot distribute it
You need enterprise features only available in Neo4j Enterprise
Your team is deeply invested in the Neo4j ecosystem (Bloom, etc.)
You prefer a managed service with more maturity (Neo4j Aura)

Resources

Documentation: memgraph.com/docs
GitHub: github.com/memgraph/memgraph
MAGE Algorithms: github.com/memgraph/mage
Memgraph Lab: memgraph.com/lab
Discord Community: discord.gg/memgraph
Playground: playground.memgraph.com

Conclusion

Memgraph represents a new generation of graph databases - one built for the real-time demands of modern applications. Its in-memory architecture, C++ implementation, and Neo4j compatibility make it an attractive choice for teams who need graph performance without graph complexity.

Whether you are detecting fraud, building recommendations, or modeling a distributor network with millions of nodes, Memgraph delivers the speed that relational databases simply cannot match.

Start with Docker, load your data, and see the difference for yourself.

docker run -p 7687:7687 -p 3000:3000 memgraph/memgraph-platform

Happy graphing.