10 Large Language Models in the real world

This chapter covers

  • Understanding how conversational LLMs work
  • Recognizing errors, misinformation, and biases in LLM output
  • Fine-tuning LLMs on your own data
  • Finding meaningful search results for your queries (semantic search)
  • Speeding up your vector search with approximate nearest neighbor (ANN) algorithms
  • Generating fact-based, well-formed text with LLMs

If you increase the number of parameters of transformer-based language models to obscene sizes, you can achieve some surprisingly impressive results. Researchers call these surprises emergent properties, but they may be a mirage.[368] Since the general public became aware of what really large transformers can do, these models have increasingly been referred to as Large Language Models (LLMs). The most sensational of these surprises is that chatbots built on LLMs generate intelligent-sounding text. You’ve probably spent some time using LLM chatbots such as ChatGPT, Bard, Bing Chat, or Perplexity AI, and they may be helping you get ahead in your career by letting you craft words, code, or ideas faster. Like most people, you are probably relieved to finally have a search engine and virtual assistant that gives you direct, smart-sounding answers to your questions. This chapter will help you use LLMs smartly, so you can do more than merely sound intelligent.

10.1 Large Language Models (LLMs)

10.1.1 Smarter smaller LLMs

10.1.2 Generating warm words

10.1.3 Creating your own Generative LLM

10.1.4 Fine-tuning your generative model

10.1.5 Nonsense (hallucination)

10.2.1 Web-scale reverse indices

10.2.2 Improving your full-text search with trigram indices

10.3.2 Choose your index

10.3.3 Quantizing the math

10.3.4 Pulling it all together with haystack

10.3.5 Getting real