chapter twelve

12 Graph-based RAG

This chapter covers

Why vector search fails at multi-hop, relationship-heavy questions
Graph-based RAG taxonomy: global summarizers, associative reasoners, and hybrid optimizers
Hierarchical community detection (Leiden algorithm) and Map-Reduce summarization in Microsoft's GraphRAG
How Personalized PageRank (PPR) lets HippoRAG mimic the associative memory of the human hippocampus
Dual-level retrieval in LightRAG that routes queries between graph structures and vector indices
Evaluating performance with Influence Score (IS) and Partial Information Decomposition (PID)

RAG has evolved by improving the resolution of similarity. We moved from keyword matching to dense embeddings, and we got better at finding chunks that look like the question. The chunks themselves, though, still sit in the index as isolated points: close to other chunks that share vocabulary, but otherwise unconnected. The pipeline measures distance, not relationship.

12.1 Graph-based RAG

12.1.1 From chunks and vectors to graphs

12.1.2 Three lineages of graph-based RAG

12.1.3 The core architecture of graph-based RAG

12.2 Microsoft GraphRAG

12.2.1 G-Indexing: Building the pyramid

12.2.2 G-Retrieval and G-Generation: Map-Reduce over the pyramid

12.2.3 Implementing GraphRAG

12.3 HippoRAG

12.3.1 From PageRank to Personalized PageRank

12.3.2 How HippoRAG works

12.3.3 G-Generation: ranking by activation

12.3.4 Implementing PPR activation

12.4 LightRAG

12.4.1 G-Indexing: Graph-enhanced indexing

12.4.2 G-Retrieval: Dual-path routing

12.4.3 Implementing LightRAG

12.5 Evaluation and real-world application

12.6 Emerging optimizations

12.6.1 TagRAG

12.6.2 Clue-RAG

12.6.3 HybGRAG

12.6.4 AcademicRAG

12.6.5 Future

12.7 Summary