This chapter covers:
- Representing the meaning of text with dense vectors
- An introduction to Transformers, and their impact on text representation and retrieval
- Building a fast and accurate autocomplete using transformer models
- Using approximate nearest neighbor (ANN) search to speed up dense vector retrieval
- Semantic search using dense vectors
In this chapter, we’ll start our journey into the emerging future of search, where we see the swelling wave of hypercontextual vectors wash onto the beaches of information retrieval.
Our story begins with what you already learned in section 2.5: we can represent context as numerical vectors, and we can compare those vectors with a similarity metric to see which are closer. In chapter 2 we demonstrated the concept of searching on dense vectors, a technique known as "dense vector search", but our examples were simple and contrived (searching on made-up food attributes). In this chapter we pose two questions: how can we convert real-world unstructured text into a high-dimensional dense vector space that models the actual meaning of the text? And how can we leverage this representation of knowledge for advanced search applications?
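As a quick refresher on the idea from chapter 2, here is a minimal sketch of comparing dense vectors with cosine similarity. The attribute vectors below are hypothetical, made up in the spirit of the earlier food examples, not values from any real model:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two dense vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical food-attribute vectors, e.g. [sweetness, savoriness, crunchiness]
apple = [0.9, 0.1, 0.8]
pear  = [0.8, 0.2, 0.7]
steak = [0.1, 0.9, 0.05]

print(cosine_similarity(apple, pear))   # near 1.0 -> very similar
print(cosine_similarity(apple, steak))  # much lower -> less similar
```

The rest of this chapter is about replacing these hand-crafted attributes with vectors learned from real text, so that the same similarity comparison captures actual meaning.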