
5 Feeding Data to Your Generative AI Models

 

This chapter covers

  • Building and then querying an index based on a local data archive.
  • Uploading a PDF document to the ChatPDF service to query it the way you’d use ChatGPT.
  • Scripting the PDF-querying process using the ChatPDF API.
  • Using the Auto-GPT tool to give a GPT-fueled agent access to the full and open internet.

There’s only so long you’ll keep at it before the novelty of torturing secrets out of an always friendly (and occasionally outrageous) AI gets a bit stale. After all, how many versions of the perfect resume do you actually need? And do you really want to hear how John Lennon would have sounded singing Shakespearean sonnets?

The real power of an LLM lies in how quickly it can process (and "understand") insane volumes of data. It would be a shame to limit its scope to just the material it was shown during training. And in any case, you stand to gain more from the way your AI processes your own data than from what it can do with someone else’s. Just imagine how much value could be unleashed by identifying:

  • Patterns and trends in health records
  • Threats and attacks in digital network access logs
  • Potential financial opportunities or risks in banking records
  • Opportunities for introducing efficiencies in supply chain, infrastructure, and governance operations
  • Insurance, tax, or program fraud
  • Government corruption (and opportunities for operational improvements)

5.1 Indexing local data archives
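
To give you a feel for where this section is headed, here is a minimal sketch of building and then querying an index over a local folder of documents. It assumes the LlamaIndex library (the llama-index package, version 0.10 or later, since the module layout has changed between releases), a folder named data holding your files, and an OPENAI_API_KEY environment variable; treat the folder name and the sample question as placeholders.

# Minimal index-and-query sketch using LlamaIndex (llama-index >= 0.10).
# Assumes a local folder named "data" containing your documents and an
# OPENAI_API_KEY environment variable for the underlying model calls.
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

# Load every readable file from the local archive
documents = SimpleDirectoryReader("data").load_data()

# Build an in-memory vector index over those documents
index = VectorStoreIndex.from_documents(documents)

# Ask questions against your own data rather than the model's training set
query_engine = index.as_query_engine()
response = query_engine.query("What recurring themes show up across these files?")
print(response)

Behind the scenes, the index chunks and embeds your documents so that each query retrieves only the most relevant passages before the model composes an answer.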

5.2 Seeding a chat session with private data (ChatPDF)
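
The same upload-and-ask workflow can be scripted rather than clicked through. Here is a minimal sketch against the REST endpoints ChatPDF documents (register a source by URL, then send a chat message); the API key, the PDF URL, and the question are placeholders you would swap for your own.

# Scripted ChatPDF session: register a PDF by URL, then ask it a question.
# Replace the placeholder API key and document URL with your own values.
import requests

API_KEY = "sec_YOUR_CHATPDF_API_KEY"  # placeholder
HEADERS = {"x-api-key": API_KEY, "Content-Type": "application/json"}

# Step 1: hand ChatPDF a publicly reachable PDF and get back a source ID
add = requests.post(
    "https://api.chatpdf.com/v1/sources/add-url",
    headers=HEADERS,
    json={"url": "https://example.com/sample.pdf"},  # placeholder document
    timeout=30,
)
source_id = add.json()["sourceId"]

# Step 2: send a question about that document
chat = requests.post(
    "https://api.chatpdf.com/v1/chats/message",
    headers=HEADERS,
    json={
        "sourceId": source_id,
        "messages": [
            {"role": "user", "content": "Summarize this document in three bullet points."}
        ],
    },
    timeout=60,
)
print(chat.json()["content"])

Holding on to the returned sourceId lets you keep asking follow-up questions about the same document without uploading it again.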

5.3 Connecting your AI to the internet (Auto-GPT)
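
Auto-GPT wraps a GPT model in a full agent loop (goal planning, tool selection, memory), which is far more than a few lines can show. The stripped-down sketch below is not Auto-GPT's code; it only illustrates the underlying idea of handing live web content to a model, and it assumes the requests and openai (v1 interface) packages, an OPENAI_API_KEY environment variable, and placeholder values for the URL and model name.

# Illustration of the core idea only: fetch a live web page and let a GPT
# model reason over it. Auto-GPT wraps this pattern in a full agent loop.
import requests
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Grab some live content the model was never trained on
page_text = requests.get("https://example.com", timeout=10).text[:8000]

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder; use any chat model you have access to
    messages=[
        {"role": "system", "content": "Summarize the key points of the page you are given."},
        {"role": "user", "content": page_text},
    ],
)
print(response.choices[0].message.content)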

5.4 Summary

5.5 Try this for yourself