12 Using LLMs to Query Your Local Data

 

This chapter covers

  • Using GPT4All to query your own private data
  • Loading PDF documents for querying by an LLM
  • Preparing PDF documents for embedding
  • Using a GPT4All model to answer questions about your own PDF documents
  • Loading CSV and JSON files for querying
  • Using LLMs to analyze your own data files

Up to this point, you've explored the capabilities of LLMs through platforms like OpenAI and Hugging Face. While these services ease the burden of hosting models, they come at a cost. Conversely, running powerful models locally requires significant setup effort and resources.

Developers commonly face the challenge of using LLMs to answer questions about their own data, while businesses emphasize the need to keep that data private. In Chapter 8, we discussed sending data to OpenAI for embedding and then querying it with LangChain and LlamaIndex.
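At the heart of that approach is a retrieval pipeline: split documents into chunks, embed each chunk, find the chunks most similar to a question, and hand those to the model as context. As a rough, illustrative sketch of the idea only, here is a plain-Python version in which a bag-of-words cosine similarity stands in for real embeddings. All names here are illustrative; the chapter builds the real pipeline with LangChain and GPT4All.

```python
import math
from collections import Counter

def chunk_text(text, size=120, overlap=20):
    """Split text into overlapping character chunks
    (a stand-in for LangChain's text splitters)."""
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks

def embed(text):
    """Toy 'embedding': a bag-of-words count vector.
    A real pipeline would use an embedding model instead."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve(question, chunks, k=1):
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

doc = ("GPT4All runs language models locally. "
       "Local inference keeps private data on your machine. "
       "CSV and JSON files can also be loaded for querying.")
context = retrieve("How does GPT4All handle private data?", chunk_text(doc))
print(context[0])  # this chunk would be passed to the local LLM as context
```

Because everything runs in your own process, the documents never leave your machine, which is exactly the privacy property the local approaches in this chapter preserve.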

In this chapter, we will go deeper into this topic, focusing on querying local, private documents without compromising data privacy. We will discuss two approaches: using GPT4All to query your own data directly, and using an LLM to write code that analyzes your own data files.

12.1 Using GPT4All to Query with Your Own Data

12.1.1 Installing the Required Packages

12.1.2 Importing the Various Modules from the LangChain Package

12.1.3 Loading the PDF Documents

12.1.4 Splitting the Text into Chunks

12.1.5 Embedding

12.1.6 Loading the Embeddings

12.1.7 Downloading the Model

12.1.8 Asking Questions

12.1.9 Loading Multiple Documents

12.1.10 Loading CSV Files

12.1.11 Loading JSON Files

12.2 Using LLMs to Write Code to Analyze Your Own Data

12.2.1 Preparing the JSON File

12.2.2 Loading the JSON File

12.2.3 Asking Questions Using the Mistral 7B Model

12.2.4 Asking Questions Using OpenAI

12.3 Summary