12 Using LLMs to Query Your Local Data

 

This chapter covers

  • Using GPT4All to query your own private data
  • Using an LLM to query PDF documents
  • Loading CSV and JSON files for querying
  • Using LLMs to analyze your own data files

Up to this point, you've explored the capabilities of LLMs through hosted platforms like OpenAI and Hugging Face. While these services ease the burden of hosting models yourself, they come at a cost. Conversely, running powerful models locally avoids usage fees but requires significant setup effort and hardware resources.

Developers commonly face the challenge of using LLMs to answer questions about their own data, while businesses emphasize the need to keep that data private. In Chapter 8, we discussed sending data to OpenAI for embedding and querying with LangChain and LlamaIndex.

Let's delve deeper into this topic, focusing on querying local, private documents without compromising data privacy. We'll take two approaches:

  • Local LLM Querying for Text-based Data: We'll use a model from GPT4All to embed your text-based data locally and query it. This approach is particularly useful for content such as PDF documents.
  • LLM Querying for Structured Tabular Data: Whether running locally or hosted by third parties such as OpenAI or Hugging Face, LLMs can answer questions about tabular data (e.g., CSV or JSON). Instead of feeding the tabular data to the LLM directly, we'll instruct it to generate code that performs the analysis programmatically.
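The second approach can be sketched in a few lines. In the real workflow you would send the LLM a description of your file's schema and receive generated analysis code back; here the model's response is hard-coded as a plausible stand-in (the dataset, field names, and generated snippet are all hypothetical) so the pattern stays self-contained and runnable:

```python
import json
import tempfile
import textwrap

# Toy dataset standing in for your private JSON file (hypothetical data).
records = [
    {"product": "widget", "units": 120, "price": 9.5},
    {"product": "gadget", "units": 40, "price": 24.0},
    {"product": "widget", "units": 80, "price": 9.5},
]

with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump(records, f)
    path = f.name

# Stand-in for the code an LLM might return when asked:
# "Given a JSON list of {product, units, price}, total the units sold for widgets."
generated_code = textwrap.dedent("""
    import json
    with open(path) as fh:
        rows = json.load(fh)
    total_widget_units = sum(r["units"] for r in rows if r["product"] == "widget")
""")

# Execute the generated snippet in its own namespace and read the result.
# Only the file path crosses over -- the raw data never leaves your machine.
namespace = {"path": path}
exec(generated_code, namespace)
print(namespace["total_widget_units"])  # → 200
```

The key privacy property is that only the schema (field names and types) is shared with the model; the records themselves are read and processed locally by the generated code.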

12.1 Using GPT4All to Query with Your Own Data

12.1.1 Installing the Required Packages

12.1.2 Importing the Various Modules from the LangChain Package

12.1.3 Loading the PDF Documents

12.1.4 Splitting the Text into Chunks

12.1.5 Embedding

12.1.6 Loading the Embeddings

12.1.7 Downloading the Model

12.1.8 Asking Questions

12.1.9 Loading Multiple Documents

12.1.10 Loading CSV Files

12.1.11 Loading JSON Files

12.2 Using LLMs to Write Code to Analyze Your Own Data

12.2.1 Preparing the JSON File

12.2.2 Loading the JSON File

12.2.3 Asking Questions Using the Mistral 7B Model

12.2.4 Asking Questions Using OpenAI

12.3 Summary