
8 Postgres for generative AI

 

This chapter covers

  • Exploring Postgres’s capabilities for generative AI
  • Using the pgvector extension to store and query vector embeddings
  • Optimizing similarity search with HNSW and IVFFlat indexes
  • Implementing RAG with Postgres

Generative artificial intelligence (gen AI) uses specialized models to produce text, images, videos, and other types of data. These generative models are trained on vast amounts of data and can generate new content based on patterns learned during training. For example, a large language model (LLM) is trained on diverse text data and can produce coherent text in response to user prompts in a natural language such as English or Japanese. An LLM can answer questions, engage in conversation, and perform various tasks by interpreting the intent behind the prompts.

Let’s explore Postgres’s capabilities for generative AI as we continue building the movie recommendation service that lets users find movies they like or might want to watch. We’ll learn to turn movie descriptions into vector embeddings, store them in Postgres, and use vector similarity search and retrieval-augmented generation (RAG) to provide users with more sophisticated and personalized recommendations.
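To preview where we're headed, here is a minimal sketch of the core building blocks: enabling pgvector, storing an embedding alongside each movie's description, and running a cosine-distance query. The table and column names (movies, title, overview, embedding) and the embedding dimension are placeholders for illustration; the chapter builds the real schema and dataset step by step.

-- Enable the pgvector extension, which adds the vector data type and distance operators.
CREATE EXTENSION IF NOT EXISTS vector;

-- A placeholder table: each movie keeps its description and a vector embedding.
-- The dimension (1536 here) must match the embedding model you choose.
CREATE TABLE movies (
    id        bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    title     text NOT NULL,
    overview  text,
    embedding vector(1536)
);

-- Find the 10 movies whose embeddings are closest, by cosine distance,
-- to the embedding of a search phrase passed in as a parameter ($1).
SELECT title, embedding <=> $1 AS distance
FROM movies
ORDER BY embedding <=> $1
LIMIT 10;

The <=> operator computes cosine distance; pgvector also provides <-> for Euclidean distance and <#> for negative inner product. Later sections show how IVFFlat and HNSW indexes speed up exactly this kind of query, and how the retrieved rows become context for an LLM in a RAG pipeline.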

8.1 How to use Postgres with gen AI

8.1.1 Postgres and LLMs

8.1.2 Postgres and embedding models

8.2 Starting Postgres with pgvector

8.3 Generating embeddings

8.3.1 Generating embeddings for movies

8.3.2 Loading the final dataset into Postgres

8.4 Performing vector similarity search

8.4.1 Using cosine distance for similarity search

8.4.2 Changing the search phrase for better results

8.5 Indexing embeddings

8.5.1 Using IVFFlat indexes

8.5.2 Using the HNSW index

8.6 Implementing RAG

8.6.1 Preparing the environment for the prototype

8.6.2 Interacting with the LLM

8.6.3 Retrieving context for LLM

8.6.4 Using RAG to answer questions

8.7 Summary