Aussie AI

Chapter 16. Graph RAG

  • Book Excerpt from "RAG Optimization: Accurate and Efficient LLM Applications"
  • by David Spuler and Michael Sharpe

What is Graph RAG?

Maintaining a list of vectors corresponding to chunks, so that for any retrieved vector the chunks above and below it can also be fetched, is just one way to organize RAG data. More complex structures often make sense. One of the most general structures available is a graph. A graph is a set of “vertices,” here corresponding to chunks of the source data, related by “edges.” In the list case above, the graph degenerates to a list and the edges effectively represent an order (reading order, perhaps).
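As a minimal sketch (the class and edge-type names here are invented for illustration, not from any particular library), a chunk graph with typed edges might look like this in Python:

```python
from dataclasses import dataclass, field

@dataclass
class ChunkGraph:
    """Vertices are chunk IDs; each edge carries a type label."""
    edges: dict = field(default_factory=dict)  # chunk_id -> [(neighbor, edge_type)]

    def add_edge(self, src, dst, edge_type):
        self.edges.setdefault(src, []).append((dst, edge_type))

    def neighbors(self, chunk_id, edge_type=None):
        """Neighboring chunk IDs, optionally filtered by edge type."""
        return [dst for dst, t in self.edges.get(chunk_id, [])
                if edge_type is None or t == edge_type]

# The degenerate list case: a chain of "next" edges in reading order.
g = ChunkGraph()
for i in range(4):
    g.add_edge(f"chunk{i}", f"chunk{i + 1}", "next")
```

Richer edge types (e.g., "duplicate_of" or "cites") slot into the same structure without changing the traversal code.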

More complex structures can exist. Consider a ticketing system (e.g., Atlassian Jira, or Zendesk). Each ticket can have associated comments. Tickets are often related to other tickets: ticket relationships could indicate that one is a duplicate of another, is caused by another, or depends on another, and so on. Further, tickets might be linked to problems, and problems might have solutions. Tickets could be linked to a part of a knowledge base too.

Chunking all of this, whilst maintaining the relationships, can be very valuable. For example, if the retriever matches a ticket chunk, it can also pull in some comments, related tickets, perhaps a solution, and even content from the knowledge base, providing all of these as chunks to the LLM.

All of this helps to gather a good amount of likely related content from which the LLM can build a great answer. The type of the chunk initially matched can guide the retriever towards the good data and related data. The types of the edges also help with this guidance.
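To illustrate (with invented ticket IDs and edge types, not any real ticketing system's API), a retriever that has matched one ticket chunk can expand along the typed edges it cares about:

```python
# Hypothetical ticket graph: the edge types mirror the relationships above.
ticket_edges = {
    "ticket-42": [("comment-1", "has_comment"),
                  ("ticket-7", "duplicate_of"),
                  ("solution-3", "solved_by"),
                  ("kb-105", "documented_in")],
}

def expand_match(chunk_id, wanted_types, edges):
    """Given an initially matched chunk, pull in related chunks by edge type."""
    related = [chunk_id]
    for dst, edge_type in edges.get(chunk_id, []):
        if edge_type in wanted_types:
            related.append(dst)
    return related

# A retriever that matched ticket-42 also pulls in its comments and solution.
context = expand_match("ticket-42", {"has_comment", "solved_by"}, ticket_edges)
```

The set of wanted edge types is where the guidance from the matched chunk's type comes in: a matched solution chunk might expand along different edges than a matched ticket chunk.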

Document Structure

Many RAG sources have structure already. Ticketing systems and knowledge bases are inherently structured. Even websites have a lot of structure, and a lot of hidden data in them which can help classify chunks (e.g., SEO information in metatags). Chunks from different website pages linked to each other are likely useful to keep together. When analysing a website, it’s possible to “guess” the nature of a link (e.g., is it an ad? Is it a deeper dive? Is it an alternative?). LLMs are actually pretty good at classifying things like that.

Other types of documents are often structured, but less directly. For example, books have multiple chapters, where early chapters often introduce topics, and subsequent chapters revisit them and dive deeper. And some documents have no such structure at all.

If there is no “physical” structure to the RAG data source, then Graph RAG can be very useful. Wait, what, wasn’t all of the above Graph RAG, too? Well, yes, but it was speaking more to the actual structuring of the chunks in the data store that the retriever works with.

Graph RAG algorithms tend to focus on how vertices and edges are created, particularly when there is no inherent structure in the source data. But once those vertices and edges are identified, the graph is just a data structure that the retriever uses.

Scientific papers are often used as a source for RAG systems and are chunked and related using a citation graph. However, a citation graph does not directly form edges between chunks: a citation has a chunk on one side, but the whole cited paper on the other side. That is not useful. Also not useful is having edges from the source chunk to all the chunks in the other paper. As such, to get chunk-to-chunk edges, additional work needs to be done. The easiest approach is to calculate the vector embedding for the source chunk and for all the chunks in the cited paper, and then only create edges to the chunks whose embeddings best match the source chunk’s embedding. Fortunately, the vector embeddings need to be calculated for everything anyway, and the logic to find the best-matching chunks is part of any RAG system.
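A sketch of this chunk-to-chunk edge creation, using toy three-dimensional embeddings and plain cosine similarity (a real system would reuse the vector index and embedding model it already has; the names here are illustrative):

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

def citation_edges(source_vec, cited_chunks, top_k=2):
    """Keep edges only to the cited paper's chunks whose embeddings
    best match the source chunk's embedding."""
    ranked = sorted(cited_chunks,
                    key=lambda c: cosine(source_vec, cited_chunks[c]),
                    reverse=True)
    return ranked[:top_k]

# Toy embeddings: the cited paper is split into three chunks.
src = [1.0, 0.0, 0.2]
cited = {"cited-A": [0.9, 0.1, 0.1],
         "cited-B": [0.0, 1.0, 0.0],
         "cited-C": [0.8, 0.0, 0.3]}
best = citation_edges(src, cited, top_k=2)
```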

In fact, many documents have considerable structure associated with them. Some simple examples include:

  • Headings
  • Sections
  • Chapters
  • Table of Contents
  • Index

These inherent structural elements are all useful for building graphs which can help relate chunks. StructuGraphRAG is an example of such a system.

Microsoft GraphRAG

GraphRAG was proposed by Microsoft in 2024. The basic idea is to extract the key concepts/ideas (“entities”) out of each “chunk” using an LLM. Chunks which share an entity are linked to each other. Everything linked by a common entity forms a “community”.

The LLM is then used to summarise each community. Each summary becomes part of the graph too. At retrieval time, every community summary which matches the question/query is used to build a full answer.
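The community-building step can be sketched as follows. In real GraphRAG the entity lists come from an LLM, and the communities are then also summarised by the LLM; here the “extracted” entities are hard-coded toy data, and the communities are just the connected groups of chunks:

```python
from collections import defaultdict

def build_communities(chunk_entities):
    """Link chunks that share an entity; each connected group is a community."""
    entity_to_chunks = defaultdict(set)
    for chunk, entities in chunk_entities.items():
        for entity in entities:
            entity_to_chunks[entity].add(chunk)

    # Chunks sharing any entity become neighbours in the graph.
    adj = defaultdict(set)
    for group in entity_to_chunks.values():
        for a in group:
            adj[a] |= group - {a}

    # Communities are the connected components (simple DFS).
    seen, communities = set(), []
    for chunk in chunk_entities:
        if chunk in seen:
            continue
        component, stack = set(), [chunk]
        while stack:
            c = stack.pop()
            if c not in component:
                component.add(c)
                stack.extend(adj[c] - component)
        seen |= component
        communities.append(component)
    return communities

# Hypothetical entities, as an LLM might extract them from each chunk.
chunks = {"c1": {"dragons"}, "c2": {"dragons", "castles"},
          "c3": {"castles"}, "c4": {"submarines"}}
groups = build_communities(chunks)
```

Note how c1 and c3 end up in the same community despite sharing no entity directly, because both link to c2.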

GraphRAG does well at answering questions whose answers do not appear directly in the raw source text. For example, in the Microsoft example, they ran RAG over the transcripts of a podcast.

The question posed was “What are the oddest topics discussed?” This naturally did not directly match any particular thing discussed, but due to the concept/idea extraction, “oddness” did get into the graph. GraphRAG performs better than standard RAG in many cases.

LightRAG and Other Variants

It turns out that GraphRAG is computationally expensive, both for the initial chunking/indexing and at retrieval time. LightRAG (2024) is an attempt to speed things up whilst retaining similar ideas. LightRAG uses the LLM to find key concepts or ideas (“entities”) and then builds a graph similarly to GraphRAG. But instead of summarizing each “entity” and adding the summary to the graph, it just uses the graph to find related chunks. Retrieval works by first retrieving chunks using vector lookups, and then using the graph to find chunks related to the entities in the initially retrieved chunks.
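A sketch of this two-stage retrieval, with toy embeddings and invented names (this is an illustration of the idea, not LightRAG's actual API):

```python
def two_stage_retrieve(query_vec, embeddings, graph, top_k=1):
    """Stage 1: vector lookup for the best-matching chunks.
    Stage 2: one-hop graph expansion to pull in related chunks."""
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    # Stage 1: rank chunks by similarity to the query vector.
    seeds = sorted(embeddings, key=lambda c: dot(query_vec, embeddings[c]),
                   reverse=True)[:top_k]

    # Stage 2: follow graph edges from the seed chunks.
    result = list(seeds)
    for c in seeds:
        for neighbor in graph.get(c, []):
            if neighbor not in result:
                result.append(neighbor)
    return result

# Toy data: "c3" is related to "c1" by a graph edge, not by the query vector.
embeddings = {"c1": [1.0, 0.0], "c2": [0.0, 1.0], "c3": [0.5, 0.5]}
graph = {"c1": ["c3"], "c2": [], "c3": ["c1"]}
hits = two_stage_retrieve([1.0, 0.0], embeddings, graph)
```

The graph expansion is what the vector lookup alone would have missed: "c3" is returned because of its edge to "c1", not its similarity score.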

There are numerous other proposed methods which can be categorized as Graph-based RAG. Some include:

  • HybGRAG (2024)
  • Think-on-Graph (2024)
  • KnowGPT (2024)
  • KG²RAG (2025)
  • PathRAG (2025)

Every week seems to yield a new graph-based RAG mechanism.

Theory of Graph RAG

The key point to take away from this section is that to get maximal accuracy out of the retriever, it needs to be aligned with the structure of your data source. The more context that can be taken or derived from the source, the better the retriever can be implemented.

It should not be a surprise that a “graph” is a common type of structure to organize your data. A graph is effectively a catch-all data structure, and everything is a graph of some sort. All that is needed for a graph is a set of nodes and the arcs that represent one-way relationships between two nodes.

Tree RAG. Even the simple list in classic RAG implementations based on overlapping chunks is a degenerate type of graph. As a final example, if a family tree was the source of a RAG system, it would not be a surprise if a tree were the best way to represent the data source, too. A tree is just another special type of graph. Chunks can be related by family relationships: the tree itself comes from direct parent/child relationships, and other relationships, such as nieces or nephews, can be threaded into the tree hierarchy.
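A small sketch of this idea, with an invented family tree: the parent/child edges form the tree, and other relationships are derived by walking it rather than stored explicitly:

```python
# An invented family tree: parent -> list of children (the "physical" tree).
tree = {"grandma": ["mum", "aunt"], "mum": ["me", "sister"], "aunt": ["cousin"]}

def children(person):
    return tree.get(person, [])

def parent(person):
    for p, kids in tree.items():
        if person in kids:
            return p
    return None

def cousins(person):
    """A derived relationship threaded into the tree: the children
    of the parent's siblings."""
    p = parent(person)
    gp = parent(p) if p else None
    if gp is None:
        return []
    return [c for sibling in children(gp) if sibling != p
            for c in children(sibling)]
```

Chunks about each person can then be related both by the direct parent/child edges and by derived edges like cousins, without storing every relationship explicitly.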

References on Graph RAG

Another important point to note: most of the methods mentioned above have open-source implementations. References and project links are below:

  1. Edge, Darren, et al., 2024, From Local to Global: A Graph RAG Approach to Query-Focused Summarization, Microsoft Research Tech Report (Preprint), https://www.microsoft.com/en-us/research/publication/from-local-to-global-a-graph-rag-approach-to-query-focused-summarization, https://www.microsoft.com/en-us/research/project/graphrag/, https://github.com/microsoft/graphrag
  2. Zhu, Xishi, et al., 2024, StructuGraphRAG: Structured Document-Informed Knowledge Graphs for Retrieval-Augmented Generation, AAAI Symposium Series, https://www.researchgate.net/publication/385678132_StructuGraphRAG_Structured_Document-Informed_Knowledge_Graphs_for_Retrieval-Augmented_Generation
  3. Guo, Zirui, et al., 2024, LightRAG: Simple and Fast Retrieval-Augmented Generation, arXiv preprint 2410.05779, https://arxiv.org/html/2410.05779v1, https://github.com/HKUDS/LightRAG
  4. Lee, Meng-Chieh, et al., 2024, HybGRAG: Hybrid Retrieval-Augmented Generation on Textual and Relational Knowledge Bases, Findings of ACL, https://ar5iv.labs.arxiv.org/html/2412.16311
  5. Sun, Jiashuo, et al., 2024, Think-on-Graph: Deep and Responsible Reasoning of Large Language Model on Knowledge Graph, ICLR, https://arxiv.org/abs/2307.07697
  6. Zhang, Qinggang, et al., 2024, KnowGPT: Knowledge Graph Based Prompting for Large Language Models, NeurIPS, https://openreview.net/forum?id=PacBluO5m7, https://arxiv.org/abs/2312.06185
  7. Zhu, Xiangrong, et al., 2025, Knowledge Graph-Guided Retrieval Augmented Generation (KG²RAG), NAACL, https://arxiv.org/abs/2502.06864
  8. Chen, Boyu, et al., 2025, PathRAG: Pruning Graph-based Retrieval Augmented Generation with Relational Paths, arXiv preprint 2502.14902, https://arxiv.org/abs/2502.14902

Ontology RAG

Ontology-based RAG is the use of a special type of knowledge graph, known as an “ontology” or “taxonomy,” which organizes the concept space. Extra information can be extracted from the taxonomy as a special type of retrieval for RAG-based systems. The advantage is the ability to better capture structured information and the hierarchical relationships between concepts in the ontology.
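As a toy illustration (invented concepts, not a real ontology format such as OWL), one simple use of the hierarchy is to expand a query term with its broader concepts before retrieval:

```python
# A toy is-a taxonomy: each concept maps to its broader parent concept.
taxonomy = {"beagle": "dog", "dog": "mammal", "mammal": "animal",
            "siamese": "cat", "cat": "mammal"}

def ancestors(concept):
    """Walk up the is-a hierarchy, collecting broader concepts."""
    out = []
    while concept in taxonomy:
        concept = taxonomy[concept]
        out.append(concept)
    return out

def expand_query_terms(term):
    """Enrich a retrieval query with broader concepts, so chunks tagged
    with 'mammal' can still match a query about a 'beagle'."""
    return [term] + ancestors(term)

terms = expand_query_terms("beagle")
```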

Research papers on LLMs and ontologies or taxonomies include:

  1. Prajwal Kailas, Max Homilius, Rahul C. Deo, Calum A. MacRae, 16 Dec 2024, NoteContrast: Contrastive Language-Diagnostic Pretraining for Medical Text, https://arxiv.org/abs/2412.11477
  2. Muhayy Ud Din, Jan Rosell, Waseem Akram, Isiah Zaplana, Maximo A Roa, Lakmal Seneviratne, Irfan Hussain, 10 Dec 2024, Ontology-driven Prompt Tuning for LLM-based Task and Motion Planning, https://arxiv.org/abs/2412.07493 https://muhayyuddin.github.io/llm-tamp/ (Detecting objects in the prompt text and then using a RALM algorithm to query an ontology database.)
  3. Oleksandr Palagin, Vladislav Kaverinskiy, Anna Litvin, Kyrylo Malakhov, 11 Jul 2023, OntoChatGPT Information System: Ontology-Driven Structured Prompts for ChatGPT Meta-Learning, International Journal of Computing, 22(2), 170-183, https://arxiv.org/abs/2307.05082 https://doi.org/10.47839/ijc.22.2.3086 https://computingonline.net/computing/article/view/3086
  4. Alhassan Mumuni, Fuseini Mumuni, 6 Jan 2025, Large language models for artificial general intelligence (AGI): A survey of foundational principles and approaches, https://arxiv.org/abs/2501.03151
  5. Kartik Sharma, Peeyush Kumar, Yunqing Li, 12 Dec 2024, OG-RAG: Ontology-Grounded Retrieval-Augmented Generation For Large Language Models, https://arxiv.org/abs/2412.15235
  6. Chengshuai Zhao, Garima Agrawal, Tharindu Kumarage, Zhen Tan, Yuli Deng, Ying-Chih Chen, Huan Liu, 10 Dec 2024, Ontology-Aware RAG for Improved Question-Answering in Cybersecurity Education, https://arxiv.org/abs/2412.14191
  7. Ramona Kühn, Jelena Mitrović, Michael Granitzer, 18 Dec 2024, Enhancing Rhetorical Figure Annotation: An Ontology-Based Web Application with RAG Integration, https://arxiv.org/abs/2412.13799
  8. Xueli Pan, Jacco van Ossenbruggen, Victor de Boer, Zhisheng Huang, 13 Sep 2024, A RAG Approach for Generating Competency Questions in Ontology Engineering, https://arxiv.org/abs/2409.08820
  9. Rafael Teixeira de Lima, Shubham Gupta, Cesar Berrospi, Lokesh Mishra, Michele Dolfi, Peter Staar, Panagiotis Vagenas, 29 Nov 2024, Know Your RAG: Dataset Taxonomy and Generation Strategies for Evaluating RAG Systems, https://arxiv.org/abs/2411.19710
  10. Yuxing Lu, Sin Yee Goi, Xukai Zhao, Jinzhuo Wang, 22 Jan 2025 (v2), Biomedical Knowledge Graph: A Survey of Domains, Tasks, and Real-World Applications, https://arxiv.org/abs/2501.11632
  11. Battazza, I. F. C., Rodrigues, C. M. d. O., & Oliveira, J. F. L. d., 2025, A Framework for Market State Prediction with Ontological Asset Selection: A Multimodal Approach, Applied Sciences, 15(3), 1034. https://doi.org/10.3390/app15031034 https://www.mdpi.com/2076-3417/15/3/1034
  12. AD Al Hauna, AP Yunus, M Fukui, S Khomsah, Apr 2025, Enhancing LLM Efficiency: A Literature Review of Emerging Prompt Optimization Strategies, International Journal on Robotics, https://doi.org/10.33093/ijoras.2025.7.1.9 https://mmupress.com/index.php/ijoras/article/view/1311 PDF: https://mmupress.com/index.php/ijoras/article/view/1311/834
  13. Jean-Philippe Corbeil, Amin Dada, Jean-Michel Attendu, Asma Ben Abacha, Alessandro Sordoni, Lucas Caccia, François Beaulieu, Thomas Lin, Jens Kleesiek, Paul Vozila, 15 May 2025, A Modular Approach for Clinical SLMs Driven by Synthetic Data with Pre-Instruction Tuning, Model Merging, and Clinical-Tasks Alignment, https://arxiv.org/abs/2505.10717

 
