4 Generating Cypher queries from natural language questions

This chapter covers

  • The basics of query language generation
  • Where query language generation fits in the RAG pipeline
  • Useful practices for query language generation
  • Implementing a text2cypher retriever using a base model
  • Specialized (finetuned) LLMs for text2cypher

We’ve covered a lot of ground in the previous chapters: building a knowledge graph, extracting information from text, and using that information to answer questions. We’ve also seen how hardcoded Cypher queries can extend and improve plain vector search retrieval by supplying more relevant context to the LLM. In this chapter, we go a step further and generate Cypher queries directly from natural language questions. This lets us build a more flexible retrieval system, one that can adapt to different types of questions and different knowledge graphs instead of relying on a fixed set of prewritten queries.

4.1 The basics of query language generation

4.2 Where query language generation fits in the RAG pipeline

4.3 Useful practices for query language generation

4.3.1 Using few-shot examples for in-context learning

4.3.2 Using database schema in the prompt to show the LLM the structure of the knowledge graph

4.3.3 Adding terminology mapping to semantically map the user question to the schema

4.3.4 Format instructions
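
The four practices in this section (few-shot examples, the database schema, terminology mapping, and format instructions) all end up as parts of a single prompt. The following is a minimal sketch of how such a prompt might be assembled, not the implementation used later in the chapter. The `Text2CypherPrompt` class, its field names, the prompt wording, and the movie schema are all hypothetical, chosen only for illustration, and no LLM is called.

```python
from dataclasses import dataclass, field


@dataclass
class Text2CypherPrompt:
    """Hypothetical helper that assembles a text2cypher prompt."""

    # Textual description of node labels, relationship types, and properties.
    schema: str
    # Few-shot (question, Cypher) pairs for in-context learning.
    examples: list[tuple[str, str]] = field(default_factory=list)
    # Maps user vocabulary to schema elements.
    terminology: dict[str, str] = field(default_factory=dict)
    # Tells the model what shape the answer must have.
    format_instructions: str = (
        "Return only the Cypher statement, with no explanation and no code fences."
    )

    def render(self, question: str) -> str:
        parts = [
            "Translate the question into a Cypher query.",
            "",
            "Schema:",
            self.schema,
        ]
        if self.terminology:
            parts += ["", "Terminology:"]
            parts += [f"- {term}: {meaning}" for term, meaning in self.terminology.items()]
        if self.examples:
            parts += ["", "Examples:"]
            for example_question, cypher in self.examples:
                parts += [f"Question: {example_question}", f"Cypher: {cypher}"]
        # The user question goes last so the model completes the final "Cypher:" line.
        parts += ["", self.format_instructions, "", f"Question: {question}", "Cypher:"]
        return "\n".join(parts)


# Illustrative usage with an assumed movie graph.
prompt = Text2CypherPrompt(
    schema="(:Person {name})-[:ACTED_IN]->(:Movie {title, released})",
    examples=[
        (
            "Who acted in The Matrix?",
            "MATCH (p:Person)-[:ACTED_IN]->(m:Movie {title: 'The Matrix'}) RETURN p.name",
        )
    ],
    terminology={"film": "a node with the Movie label"},
)
text = prompt.render("Which films was Tom Hanks in?")
print(text)
```

Keeping each practice in its own field makes it easy to drop or swap one of them, for example to test how much the few-shot examples contribute to query quality for a given model.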

4.4 Implementing a text2cypher generator using a base model

4.5 Specialized (finetuned) LLMs for text2cypher

4.6 What we’ve learned and what text2cypher enables

Summary