6 Postgres for full-text search

 

This chapter covers

  • Understanding how Postgres supports full-text search
  • Converting textual data into lexemes
  • Storing lexemes in the database
  • Performing full-text search over lexemes using built-in functions and operators
  • Optimizing full-text search performance with GIN and GiST indexes

Postgres has supported various text data types and search operators for years. For instance, we can store text values in a column of the TEXT type and then use the equality (=), pattern matching (LIKE and ILIKE), regular expression matching (~ and ~*), and other operators to query the stored text data. While these capabilities work well for simple text data and queries, they are less effective when searching through large text documents that require understanding linguistic features like word variations or relevance ranking.

Full-text search is another core feature of Postgres that lets us perform advanced searches over textual data of varying size and complexity. Unlike basic text comparison or regex matching, it uses linguistic techniques to tokenize, normalize, rank, and index text data, making searches faster and more relevant.

6.1 Basics of full-text search in Postgres

6.1.1 Tokenization and normalization

6.1.2 Full-text search configurations

6.2 Preparing data for text search

6.2.1 Generating lexemes with to_tsvector function

6.2.2 Storing tsvector lexemes in database

6.3 Performing full-text search

6.3.1 Using plainto_tsquery for simple queries

6.3.2 Using to_tsquery for advanced filtering

6.4 Ranking search results

6.5 Highlighting search results

6.6 Indexing lexemes

6.6.1 Using GIN index

6.6.2 Using GiST index

6.7 Summary