1 Large language models: The power of AI
This chapter covers
- Introducing large language models
- Understanding the intuition behind transformers
- Exploring the applications, limitations, and risks of large language models
- Surveying breakthrough large language models for dialogue
On November 30, 2022, San Francisco–based company OpenAI tweeted, “Try talking with ChatGPT, our new AI system which is optimized for dialogue. Your feedback will help us improve it” [1]. ChatGPT, a chatbot that interacts with users through a web interface, was described as a minor update to the existing models that OpenAI had already released and made available through APIs. But with the release of the web app, anyone could have conversations with ChatGPT, ask it to write poetry or code, recommend movies or workout plans, and summarize or explain pieces of text. Many of the responses felt like magic. ChatGPT set the tech world on fire, reaching 1 million users in a matter of days and 100 million users two months after launch. By some measures, it’s the fastest-growing internet service ever [2].
Since ChatGPT’s public release, it has captivated millions of users’ imaginations and prompted caution from longtime tech observers about the dialogue agent’s shortcomings. ChatGPT and similar models are part of a class of large language models (LLMs) that have transformed the field of natural language processing (NLP) and enabled new best performances in tasks such as question answering, text summarization, and text generation. Already, prognosticators have speculated that LLMs will transform how we teach, create, work, and communicate. People of nearly every profession will interact with these models and maybe even collaborate with them. Therefore, people who are best able to use LLMs for the results they want—while avoiding common pitfalls that we’ll discuss—will be positioned to lead in the ongoing moment of generative AI.
As artificial intelligence (AI) practitioners, we believe that a basic understanding of how these models work is imperative to building an intuition for when and how to use them. This chapter will discuss the breakthrough of LLMs, how they work, how they can be used, and their exciting possibilities, along with their potential problems. Importantly, we’ll also drive the rest of the book forward by explaining what makes these LLMs important, as well as why so many people are so excited (and worried!) by them. Bill Gates has referred to this type of AI as “every bit as important as the PC, as the internet,” and said that ChatGPT would change the world [3]. Thousands of people, including Elon Musk and Steve Wozniak, signed an open letter written by the Future of Life Institute, urging a pause in the research and development of these models until humanity was better equipped to handle the risks (see http://mng.bz/847B). It recalled the concerns of OpenAI in 2019 when the organization had built a predecessor to ChatGPT and decided not to release the full model at that time out of fear of misuse [4]. With all the buzz, competing viewpoints, and hyperbolic statements, it can be hard to cut through the hype to understand what LLMs are and are not capable of. This book will help you do just that, along with providing a useful framework for grappling with major problems in responsible technology today, including data privacy and algorithmic accountability.
Given that you’re here, you probably know a little bit about generative AI already. Maybe you’ve messaged with ChatGPT or another chatbot; maybe the experience delighted you, or maybe it perturbed you. Either reaction is understandable. In this book, we’ll take a nuanced and pragmatic approach to LLMs because we believe that while imperfect, LLMs are here to stay, and as many people as possible should be invested in making them work better for society.
Despite the fanfare around ChatGPT, it wasn’t a singular technical breakthrough but rather the latest iterative improvement in a rapidly advancing area of NLP: LLMs. ChatGPT is an LLM designed for conversational use; other models might be tailored for other purposes or for general use in any natural language task. This flexibility is one aspect of LLMs that makes them so powerful compared to their predecessors. In this chapter, we’ll define LLMs and discuss how they came to such preeminence in the field of NLP.
Evolution of natural language processing
NLP refers to building machines to manipulate human language and related data to accomplish useful tasks. It’s as old as computers themselves: when computers were invented, among the first imagined uses for the new machines was programmatically translating one human language to another. Of course, at that time, computer programming itself was a much different exercise in which desired behavior had to be designed as a series of logical operations specified by punch cards. Still, people recognized that for computers to reach their full potential, they would need to understand natural language, the world’s predominant communication form. In 1950, British computer scientist Alan Turing published a paper proposing a criterion for AI, now known as the Turing test [5]. Famously, a machine would be considered “intelligent” if it could produce responses in conversation indistinguishable from those of a human. Although Turing didn’t use this terminology, this is a standard natural language understanding and generation task. The Turing test is now understood to be an incomplete criterion for intelligence, given that it’s easily passed by many modern programs that imitate human speech yet are inflexible and incapable of reasoning [6]. Nevertheless, it stood as a benchmark for decades and remains a popular standard for advanced natural language models.
Early NLP programs took the same approach as other early AI applications, employing a series of rules and heuristics. In 1966, Joseph Weizenbaum, a professor at the Massachusetts Institute of Technology (MIT), released a chatbot he named ELIZA, after the character in Pygmalion. ELIZA was intended as a therapeutic tool, and it would respond to users in large part by asking open-ended questions and giving generic responses to words and phrases that it didn’t recognize, such as “Please go on.” The bot worked with simple pattern matching, yet people felt comfortable sharing intimate details with ELIZA—when testing the bot, Weizenbaum’s secretary asked him to leave the room [7]. Weizenbaum himself reported being stunned at the degree to which the people who spoke with ELIZA attributed real empathy and understanding to the model. The anthropomorphism applied to his tool worried Weizenbaum, and he spent much of his time afterward trying to convince people that ELIZA wasn’t the success they heralded it as.
Though rule-based text parsing remained common over the next several decades, these approaches were brittle, requiring complicated if-then logic and significant linguistic expertise. By the 1990s, some of the best results on tasks such as machine translation were instead being achieved through statistical methods, buoyed by the increased availability of both data and computing power. The transition from rule-based methods to statistical ones represented a major paradigm shift in NLP—instead of people teaching their models grammar by carefully defining and constructing concepts such as the parts of speech and tenses of a language, the new models did better by learning patterns on their own, through training on thousands of translated documents.
This type of machine learning is called supervised learning because the model has access to the desired output for its training data—what we typically call labels, or, in this case, the translated documents. Other systems might use unsupervised learning, where no labels are provided, or reinforcement learning, a technique that uses trial and error to teach the model to find the best result by either receiving rewards or penalties. A comparison between these three types is given in table 1.1.
Table 1.1 Types of machine learning

| | Supervised Learning | Unsupervised Learning | Reinforcement Learning |
| --- | --- | --- | --- |
| Description | The model learns by mapping labeled inputs to known outputs. | The model is trained without labels and without a specific reward. | The model learns from its environment based on rewards and penalties. |
| Data | Labeled data | Unlabeled data | No static dataset |
| Objective | To predict the output of unseen inputs | To discover underlying patterns in the data, such as clusters | To determine the optimal strategy via trial and error |
In reinforcement learning (shown in figure 1.1), rewards and penalties are numerical values that represent the model’s progress toward a particular task. When a behavior is rewarded, that positive feedback creates a reinforcing cycle in which the model is more likely to repeat the behavior; conversely, penalized behavior becomes less likely. As you’ll see, LLMs usually use a combination of these strategies.
Figure 1.1 The reinforcement learning cycle

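To make the cycle in figure 1.1 concrete, here is a minimal sketch in Python of an agent nudging its behavior toward rewards and away from penalties. The two-action environment, the reward values, and the update rule are invented purely for illustration and aren’t taken from any real LLM training setup.

```python
import random

# Minimal sketch of the reinforcement learning cycle: an agent tries actions,
# the environment returns rewards (+1) or penalties (-1), and the agent updates
# its estimates so rewarded behavior becomes more likely. Everything here
# (actions, rewards, learning rate) is invented for illustration.
action_values = {"action_a": 0.0, "action_b": 0.0}  # the agent's current estimates
learning_rate = 0.1

def environment(action):
    """Return a reward or penalty; in this toy world, action_b is secretly better."""
    return 1.0 if (action == "action_b" and random.random() < 0.8) else -1.0

for step in range(1000):
    # Explore occasionally; otherwise exploit the action with the best estimate.
    if random.random() < 0.1:
        action = random.choice(list(action_values))
    else:
        action = max(action_values, key=action_values.get)
    reward = environment(action)
    # Move the estimate toward the observed reward (the reinforcing cycle).
    action_values[action] += learning_rate * (reward - action_values[action])

print(action_values)  # action_b should end up with the higher value
```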
In addition to the type of learning used, several key components distinguish an NLP model. The first is data, which for natural language tasks is in the form of text. Second, there is an objective function, which is a mathematical statement of the model’s goal. An objective might be to minimize the number of errors made in a particular task or to minimize the difference between the model’s prediction of some value and the true value. Third, there are different model types and architectures, but virtually every advanced NLP model for the past several decades has been of one category: a neural network.
Neural networks, or neural nets, were proposed in 1944 as an algorithmic representation of the human brain [8]. Each network has an input layer, an output layer, and any number of “hidden” layers between them; each layer in turn has several neurons, or nodes, which can be connected in different ways. Each node assigns weights (representing the strength of connection between nodes) to the inputs passed to it, combines the weighted inputs, and “fires,” or passes, those inputs to the next layer when the weighted sum exceeds some threshold. In a neural network, the goal of training is to determine the optimal values for the weights and thresholds. Given training data, the training algorithm will iteratively update the weights and thresholds until it has found the ones that perform best in the model objective. The precise mathematics behind this process is beyond the scope of our discussion, but it’s important to note that large neural networks can approximate any function, no matter how complex, which makes them useful in scenarios with vast amounts of data, such as many NLP tasks. The number of parameters refers to the number of weights learned by the model and is shorthand for the level of complexity that the model can handle, which in turn informs the model’s capabilities. Today’s most capable LLMs have hundreds of billions of parameters.
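As a toy illustration of what “weights,” “thresholds,” and “parameters” mean in practice, the sketch below builds a single layer in Python with NumPy. The sizes, random values, and hard step activation are our own simplifications; real LLM layers use smooth activation functions and have billions of learned parameters.

```python
import numpy as np

# One toy layer: weighted inputs are combined, and each neuron "fires" when its
# weighted sum exceeds a threshold. Sizes and values are illustrative only.
rng = np.random.default_rng(0)

n_inputs, n_neurons = 4, 3
weights = rng.normal(size=(n_inputs, n_neurons))   # connection strengths (learned in training)
thresholds = rng.normal(size=n_neurons)            # firing thresholds (also learned)

def layer(x):
    weighted_sum = x @ weights
    return (weighted_sum > thresholds).astype(float)  # fire (1.0) or stay silent (0.0)

x = np.array([0.2, -1.0, 0.5, 0.7])  # an input passed to the layer
print(layer(x))

# The "number of parameters" counts every learned weight and threshold.
print(weights.size + thresholds.size)  # 4 * 3 + 3 = 15 parameters for this tiny layer
```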
In the past several decades, the availability of large amounts of data and processing power has served to cement the dominance of neural networks and led to countless experiments with different network architectures. Deep learning emerged as a subfield, where the “deep” simply refers to the depth of the neural nets involved, which is the number of hidden layers between the input and the output. People found that as the size and depth of neural nets increased, the performance of the models improved, as long as there was enough data.
The birth of LLMs: Attention is all you need
As people began training models for text generation, classification, and other natural language tasks, they sought to understand precisely what models learn. This isn’t a purely scientific inquiry; examining how models make their predictions is an important step in trusting models’ outputs enough to use them. Let’s take machine translation from English to Spanish as an example.
When we give the model an input sequence, such as “The cat wore red socks,” that sequence must first be encoded into a mathematical representation of the text. The sequence is split into tokens, typically either words or partial words. The neural network converts those tokens into its mathematical representation and applies the algorithm learned in training. Finally, the output is converted back into tokens, or decoded, to produce a readable result. The output sequence in this case is the translated version of the sentence (El gato usó calcetines rojos), which makes the model a sequence-to-sequence model. When the model’s output is the correct translation, we’re satisfied that the model has “learned” the translation function, at least for the vocabulary and grammar structures used in the input.
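The following toy sketch in Python walks through that encode, transform, and decode pipeline for the example sentence. The vocabularies and the hand-written “model” are invented stand-ins; a real sequence-to-sequence system learns these mappings from training data.

```python
# Toy encode -> model -> decode pipeline for "The cat wore red socks".
# The vocabularies and the rule inside toy_model are invented for illustration.
src_vocab = {"the": 0, "cat": 1, "wore": 2, "red": 3, "socks": 4}
tgt_vocab = {0: "el", 1: "gato", 2: "usó", 3: "rojos", 4: "calcetines"}

def encode(sentence):
    """Split the input into tokens and map each token to an integer ID."""
    return [src_vocab[token] for token in sentence.lower().split()]

def toy_model(token_ids):
    """Stand-in for the learned network: translate word by word, then swap the
    last two tokens because adjectives follow nouns in Spanish."""
    ids = list(token_ids)
    ids[-1], ids[-2] = ids[-2], ids[-1]
    return ids

def decode(token_ids):
    """Map output IDs back into target-language tokens."""
    return " ".join(tgt_vocab[i] for i in token_ids)

print(decode(toy_model(encode("The cat wore red socks"))))  # el gato usó calcetines rojos
```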
In 2014, machine learning researchers, again inspired by human cognition [9], proposed an alternative to the traditional approach of passing sequences through the encoder-decoder model piece by piece. In the new approach, the decoder could search the entire input sequence and try to find the pieces that were most relevant to each part of the generation. The mechanism is called attention. Let’s return to the example of machine translation. If you’re asked to pick out the key words from the sentence, “That cat chased a mouse, but it didn’t catch it,” then you would probably say “cat” and “mouse” because articles such as “that” and “a” aren’t as relevant in translation. As illustrated in figure 1.2, you focused your “attention” on the important words. The attention mechanism mimics this by adding attention weights to augment important parts of the sequence.
Figure 1.2 The distribution of attention for the word “it” in different contexts

A few years later, a paper from Google Brain aptly entitled “Attention Is All You Need” showed that models that discarded the lengthy sequential steps of other architectures and used only the attention information were much faster and more parallelizable. They called these models transformers. A transformer starts with an initial representation of each word in the input sentence and then repeatedly generates a new representation for every word using self-attention over the whole input. In this way, the model can capture long-term dependencies—because each step includes all context—but the representations can be computed in parallel. The “Attention Is All You Need” paper demonstrated that these models achieved state-of-the-art performance on English-to-German and English-to-French translation tasks [10]. It was the biggest NLP breakthrough of the decade, laying the foundation for all that followed.
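For readers who want to see the mechanism itself, here is a minimal sketch of single-head self-attention in Python with NumPy. The dimensions and random values are placeholders; a real transformer learns the query, key, and value projections and stacks many such layers alongside other components.

```python
import numpy as np

# Single-head self-attention: every token compares itself against every other
# token, producing attention weights that decide how much of each token's
# information flows into its new representation. All values here are random
# placeholders for illustration.
rng = np.random.default_rng(0)

seq_len, d_model = 5, 8                      # 5 tokens, 8-dimensional representations
x = rng.normal(size=(seq_len, d_model))      # initial token representations
W_q, W_k, W_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))

Q, K, V = x @ W_q, x @ W_k, x @ W_v
scores = Q @ K.T / np.sqrt(d_model)                                    # relevance of each token to every other token
weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)  # softmax: each row sums to 1
new_x = weights @ V                                                    # each token's new representation mixes the whole sequence

print(weights.round(2))  # the attention each token pays to the others
print(new_x.shape)       # (5, 8): same shape as the input, computed in parallel
```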
With transformers, because of the improvements in both time and resources required, it became possible to train models on much larger amounts of data. This marked the beginning of the LLM. In 2018, OpenAI introduced Generative Pre-training (GPT), a transformer-based LLM that was trained using massive amounts of unlabeled data from the internet and then could be fine-tuned to specific tasks, such as sentiment analysis, machine translation, text classification, and more [11]. Before this, most of the NLP models were trained for a particular task, which was a major bottleneck as they needed large amounts of annotated data for that task, and annotating data can be both time-consuming and expensive. These general-purpose LLMs were designed to overcome that challenge, using unlabeled data to build meaningful internal representations of the words and concepts themselves.
While experts debate what size model should be considered “large,” another early LLM, Google’s BERT (Bidirectional Encoder Representations from Transformers), was trained on billions of words and had more than 100 million parameters, or learned weights, using the transformer architecture [12]. For a timeline summarizing major events in NLP, see figure 1.3.
Explosion of LLMs
In the previous section, we discussed how language models could be trained for a particular task by learning from patterns in data. For translation, one might use a dataset of documents duplicated in multiple languages; for summarization tasks, a dataset of documents with handwritten summaries; and so on. But unlike these previous applications, LLMs aren’t intended to be task-specific. Instead, the task they are trained on is simply to predict what token (or word) fits best, given a particular context with one of the tokens hidden from the model. The beauty of this task is that it’s self-supervised: the model trains itself to learn one part of the input from another part of the input, so no labeling is required. This is also known as predictive or pretext learning.
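A toy example makes the self-supervised idea concrete: the “label” for each position is simply the next token in the text itself. In the sketch below, a simple bigram counter stands in for the neural network, and the corpus is invented; real LLMs learn far richer patterns, but the training signal comes from the text in the same way, with no human annotation required.

```python
from collections import Counter, defaultdict

# Self-supervised "training": every (current token, next token) pair in the text
# is a training example, so the data labels itself. A bigram counter plays the
# role of the model here, purely for illustration.
corpus = "the cat sat on the mat . the cat chased the mouse .".split()

next_token_counts = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    next_token_counts[current][nxt] += 1

def predict_next(token):
    """Return the token most often observed after `token` during training."""
    return next_token_counts[token].most_common(1)[0][0]

print(predict_next("the"))  # 'cat' -- the most frequent continuation of "the" in this corpus
```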
As LLMs are applied to diverse fields, they are becoming an integral part of our everyday lives. Conversational agents such as Apple’s Siri, Amazon’s Alexa, and Google Home use NLP to listen to user queries, turn sound into text, and then perform tasks or find answers. We see customer service chatbots in retail, and we’ll discuss more sophisticated dialogue agents, like ChatGPT, in a later section. NLP is also being used to interpret or summarize electronic health records in medicine, as well as to tackle mundane legal tasks, such as locating relevant precedents in case law or mining documents for discovery. Social media platforms, such as Facebook, Twitter, and Reddit, among others, also use NLP to improve online discourse by detecting hate speech or offensive comments.
Later, we’ll talk about how LLMs can be fine-tuned to excel in particular use cases, but the structure of the training phase means that LLMs can generate text fluidly in a variety of contexts. This attribute makes them ideal candidates for dialogue agents but has also given them some unexpected capabilities in tasks they weren’t explicitly trained for.
What are LLMs used for?
The general-purpose nature and versatility of LLMs result in a broad range of natural language tasks, including conversing with users, answering questions, and classifying or summarizing text. In this section, we’ll discuss several common LLM use cases and the problems they solve, as well as the promise they show in various novel tasks—such as coding assistants and logical reasoning—where language models haven’t historically been used.
Language modeling
Modeling language is the most natural application of language models. Specifically, for text completion, the model learns the features and characteristics of natural language and generates the next most probable word or character. When used to train LLMs, this technique can then be applied to a range of natural language tasks, as discussed in subsequent sections.
Language modeling tasks are often evaluated on a variety of datasets. Let’s look at an example of a long-range dependency task in which the model is asked to predict the last word of a sentence conditioned on a paragraph of context [13]. The context given to the model follows:
He shook his head, took a step back, and held his hands up as he tried to smile without losing a cigarette. “Yes, you can,” Julia said in a reassuring voice. “I’ve already focused on my friend. You just have to click the shutter, on top, here.”
Here, the target sentence where the model needs to predict the last word is the following: “He nodded sheepishly, threw his cigarette away and took the _____.” The correct word for the model to predict here would be “camera.”
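If you want to try this style of completion yourself, the sketch below shows one way to do it, assuming the Hugging Face transformers library and its small gpt2 checkpoint are installed. A model this small may well guess something other than “camera”; the point is simply the mechanics of asking for the next token.

```python
# Assumes `pip install transformers torch`; downloads the small gpt2 checkpoint on first run.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

context = (
    'He shook his head, took a step back, and held his hands up as he tried to '
    'smile without losing a cigarette. "Yes, you can," Julia said in a reassuring '
    'voice. "I\'ve already focused on my friend. You just have to click the shutter, '
    'on top, here." He nodded sheepishly, threw his cigarette away and took the'
)

# Greedy decoding of a single additional token: the model's best guess for the blank.
completion = generator(context, max_new_tokens=1, do_sample=False)
print(completion[0]["generated_text"][len(context):])
```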
Other tasks for evaluating model performance include picking the best ending to a story or a set of instructions [14] or selecting the correct ending sentence for a story that is a couple of sentences long. Let’s look at another example here where we have the following story [15]:
“Karen was assigned a roommate her first year of college. Her roommate asked her to go to a nearby city for a concert. Karen agreed happily. The show was absolutely exhilarating.” The most probable and desired ending for the model to select would be “Karen became good friends with her roommate,” while the least probable ending would be “Karen hated her roommate.”
These models are used for text generation, or natural language generation (NLG), as they are trained to produce text similar to text written by humans. Particularly useful for conversational chatbots and autocomplete, they can also be fine-tuned to produce text in different styles and formats, including social media posts, news articles, and even programming code. Text generation has been performed using BERT, GPT, and others.
Question answering
LLMs are widely used for question answering, which deals with answering questions from humans in a natural language. The two types of question-answering tasks are multiple-choice and open-domain. For the multiple-choice question-answering task, the model picks the correct answer from a set of possible answers, whereas for open-domain tasks, the model provides answers to questions in natural language without any options provided.
Based on their inputs and outputs, there are three main variations of QA models. The first is extractive QA, where the model extracts the answer from a context, which can be provided as text or a table. The second is open-book generative QA, which uses the provided context to generate free text. It’s like the first QA approach except instead of pulling the answer verbatim from the context, it uses the given context to generate an answer in its own words. The last variation is closed-book generative QA, where you don’t provide any context in your input, only a question, and the model generates the most likely answer according to its training.
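To make the three variations concrete, the sketch below shows roughly what the inputs look like in each case. The wording and field names are our own; actual formats vary across models and benchmarks.

```python
# Illustrative input shapes for the three QA variations (wording is ours).
context = "The James Webb Space Telescope launched on December 25, 2021."
question = "When did the James Webb Space Telescope launch?"

# 1. Extractive QA: the answer must be a span copied verbatim from the context.
extractive_qa = {"context": context, "question": question}  # expected span: "December 25, 2021"

# 2. Open-book generative QA: the context is provided, but the model answers in its own words.
open_book_prompt = (
    f"Using the passage below, answer the question.\n\n"
    f"Passage: {context}\nQuestion: {question}\nAnswer:"
)

# 3. Closed-book generative QA: no context at all; the model relies on what it learned in training.
closed_book_prompt = f"Question: {question}\nAnswer:"

print(open_book_prompt)
print(closed_book_prompt)
```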
Until the recent breakthroughs in LLMs, the question-answering task had normally been approached as open-book generative QA, given the infinite possibilities of queries and responses. Newer models such as GPT-3 have been evaluated in extremely strict closed-book settings, where external context isn’t allowed and the model isn’t allowed to train on, or “learn from,” the datasets it will be evaluated on in any capacity. Popular datasets for evaluation of QA tasks include trivia questions (see http://mng.bz/E9Rj) and Google search queries (see http://mng.bz/NVy7). Here, example questions might include “Which politician won the Nobel Peace Prize in 2009?” or “What music did Beethoven compose?”
Another application that aligns closely with the question-answering task is reading comprehension. In this task, the model is shown a few sentences or paragraphs and then asked to answer a specific question. To best mirror human-like performance, LLMs have often been tested on various formats of reading comprehension questions, including multiple-choice, dialogue acts, and abstractive datasets. Let’s look at an example from a conversational question-answering dataset [16]. Here, the task is to answer the next question in the conversation: “Jessica went to sit in her rocking chair. Today was her birthday, and she was turning 80. Her granddaughter Annie was coming over in the afternoon and Jessica was very excited to see her. Her daughter Melanie and Melanie’s husband Josh were coming as well. Jessica had . . . .” If the first question in the conversation is “Who had a birthday?” the correct answer would be “Jessica.” Then, given the next question in the conversation, “How old would she be?” the model should respond with “80.”
One of the most notable examples of a model designed for the question-answering task is IBM Research’s Watson. In 2011, the Watson computer competed on Jeopardy! against the TV show’s two biggest all-time champions and won [17].
Coding
Recently, code generation has become one of the most popular applications of LLMs. Such models take natural language input and produce code snippets for a given programming language. While there are certain challenges to address in this space—security, transparency, and licensing—developers and engineers of different levels of expertise use LLM-assisted tools to improve productivity every day.
Code-generation tools took off in mid-2022 with the release of GitHub Copilot. Described as “Your AI Pair Programmer,” Copilot was introduced as a subscription-based service for individual programmers (see https://github.com/features/copilot). Based on OpenAI’s Codex model, it quickly became a way to boost developer productivity as a “pair programming” sidekick. Codex is a version of GPT-3 that has been fine-tuned for coding tasks in more than a dozen programming languages. GitHub Copilot suggests code as you type, autofills repetitive code, shows alternative suggestions, and converts comments to code.
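As a hypothetical illustration of that comment-to-code workflow, the snippet below shows the kind of suggestion such a tool might offer when a developer writes only the comment. The completion is our own example, not actual Copilot output.

```python
# Hypothetical example: the developer types the comment, and the assistant
# proposes the function body underneath it.

# Write a function that checks whether a string is a palindrome, ignoring case.
def is_palindrome(text: str) -> bool:
    normalized = text.lower()
    return normalized == normalized[::-1]

print(is_palindrome("Racecar"))  # True
```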
Developers have found creative and unexpected ways to use the AI-assisted programmer, such as helping non-native English speakers, preparing for coding interviews, testing code, and more. Also in June 2022, Amazon announced a similar tool dubbed CodeWhisperer, described as an AI-based coding companion that improves developer productivity by generating code recommendations and security scans (see https://aws.amazon.com/codewhisperer/). It’s worth noting that these programming tools are marketed as “pair programmers” or “programming assistants” that complement the human rather than replace them. While Copilot and CodeWhisperer generally make good suggestions, they don’t think about programs the way a human does and sometimes make silly mistakes. In chapter 6, we’ll discuss the idea of machines improving productivity in detail.
An iteration of the GPT class models, GPT-4, released in March 2023, was evaluated on various coding tasks [18]. LeetCode is a well-known online platform for solving data structure and algorithm coding challenges that often show up in technical interviews for software engineering positions. While GPT-4 performs relatively well on “easy” LeetCode problems, it struggles to solve “medium” or “hard” problems, suggesting that many coding tasks still benefit from human intervention.
Content generation
An extremely promising and widely used application of LLMs is generating content. Generating human-like text for synthetic news articles has historically been used as an evaluation task for LLMs. Given a title and a subtitle, or the first sentence, the task is to generate a long-form article, and the evaluation criterion is how hard it is to tell that the article was written by a machine—the better the generated article, the more difficult it is to detect that a machine wrote it. A variation of the task, also relevant to the potential misuse of LLMs discussed in the section “Limitations in controlling machine outputs,” is to gauge the quality of a news article by testing whether humans can distinguish between machine- and human-generated content. This variant is analogous to a Turing test for content instead of conversation. GPT-3 and other LLM variants continue to produce news articles that are difficult for humans to identify as machine-written. The following example is an article generated by GPT-3, given a title and a subtitle, that was the trickiest for humans to identify as machine-generated [19].
Title: United Methodists Agree to Historic Split
Subtitle: Those who oppose gay marriage will form their own denomination
Article: After two days of intense debate, the United Methodist Church has agreed to a historic split—one that is expected to end in the creation of a new denomination, one that will be “theologically and socially conservative,” according to The Washington Post. The majority of delegates attending the church’s annual General Conference in May voted to strengthen a ban on the ordination of LGBTQ clergy and to write new rules that will “discipline” clergy who officiate at same-sex weddings. But those who opposed these measures have a new plan: They say they will form a separate denomination by 2020, calling their church the Christian Methodist denomination. The Post notes that the denomination, which claims 12.5 million members, was in the early 20th century the “largest Protestant denomination in the U.S.,” but that it has been shrinking in recent decades. The new split will be the second in the church’s history. The first occurred in 1968, when roughly 10 percent of the denomination left to form the Evangelical United Brethren Church. The Post notes that the proposed split “comes at a critical time for the church, which has been losing members for years,” which has been “pushed toward the brink of a schism over the role of LGBTQ people in the church.” Gay marriage is not the only issue that has divided the church. In 2016, the denomination was split over ordination of transgender clergy, with the North Pacific regional conference voting to ban them from serving as clergy, and the South Pacific regional conference voting to allow them.
As hinted in an earlier subsection, the application of content generation has extended to use cases beyond writing news articles. With increased accessibility to sophisticated dialogue agents, creators are using LLMs to generate content in different genres, styles, and formats, including creating marketing campaigns, writing blog posts and emails, composing social media posts, and more. Several startups have also entered the realm of generative content creation, including Jasper AI, Anthropic AI, Cohere, Runway, Stability AI, and Adept AI. We’ll discuss using LLMs for generating content in detail, as well as highlight any potential risks, in an upcoming chapter.
Logical reasoning
A novel and interesting application of LLMs is their ability to “reason”—the idea of drawing inferences or conclusions from new or existing information. A new, yet now common, reasoning task for LLMs is arithmetic. The tasks are often simple arithmetic queries involving addition, subtraction, or multiplication with two to five numbers. While we can’t say that LLMs “understand” arithmetic because of their inconsistent performance on varying mathematical problems, GPT-3’s evaluation results demonstrate its ability to perform very simple arithmetic tasks. A notable model in the field of mathematics is Facebook AI Research’s transformer-based model trained to solve symbolic integration and differential equation problems. When presented with unseen expressions (that is, equations that weren’t a part of the training data), the model outperformed commercial computer algebra systems, such as MATLAB and Mathematica [20].
Another application worth discussing is common-sense or logical reasoning, where the model tries to capture physical or scientific reasoning. This is different from reading comprehension or answering general trivia questions as it requires some grounded understanding of the world. A significant model is Minerva by Google Research, a language model capable of solving mathematical and scientific questions using step-by-step reasoning [21]. GPT-4 was tested on various academic and professional exams, including the Uniform Bar Examination (UBE), LSAT, SAT Reading and Writing, SAT Math, Graduate Record Examinations (GRE), AP Physics, AP Statistics, AP Calculus, and more. In most of these exams, the model achieved human-level performance and, notably, passed the UBE with a score in the top 10% of takers [18].
More recently, the practice of law has also been increasingly embracing the applications of LLMs using tools for document review, due diligence, improving accessibility for legal services, and assisting with legal reasoning. In March 2023, legal AI company Casetext unveiled CoCounsel, the first AI legal assistant built in collaboration with OpenAI on their most advanced LLM [22]. CoCounsel can perform legal tasks such as legal research, document review, deposition preparation, contract analysis, and more. A similar tool, Harvey AI, assists with tasks such as contract analysis, due diligence, litigation, and regulatory compliance. Harvey AI partnered with one of the world’s largest law firms, Allen & Overy, and announced a strategic partnership with PricewaterhouseCoopers (PwC) [23].
Other natural language tasks
Naturally, LLMs are also well-suited for many other linguistic tasks. A popular and long-standing application is machine translation, which uses LLMs to automate translation between languages. As discussed earlier, machine translation was one of the first problems that computers were tasked with solving 70 years ago. Beginning in the 1950s, computers used a series of programmed language rules to solve this problem, which was not only computationally expensive and time-consuming but also required a set of computer instructions covering the full vocabulary of each language and multiple types of grammar. By the 1990s, the American multinational technology corporation International Business Machines, more commonly known as IBM, introduced statistical machine translation, where researchers theorized that if they looked at enough text, they could find patterns in translations. This was a massive breakthrough in the field and led to the launch of Google Translate in 2006 using statistical machine translation. Google Translate was the first commercially successful NLP application, and perhaps the most famous. In the mid-2010s, the field of machine translation changed forever when Google started using neural networks to deliver far more impressive results. In 2020, Facebook announced the first multilingual machine translation model that can translate directly between any pair of 100 languages without relying on any English data—another major milestone in the field of machine translation, as it gives less opportunity for meaning to get lost in translation [24].
Another practical application is text summarization, that is, creating a shorter version of a text that highlights the most relevant information. There are two types of summarization techniques: extractive summarization and abstractive summarization. Extractive summarization is concerned with extracting the most important sentences from long-form text, which are joined together to form a summary. On the other hand, abstractive summarization paraphrases the text to form a summary (i.e., an abstract) and may include words or sentences that aren’t present in the original text.
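The difference is easy to see in code. Below is a deliberately simple extractive summarizer sketched in Python: it scores each sentence by how frequent its words are in the document and keeps the highest-scoring sentence verbatim. Abstractive summarization, by contrast, would require a generative model that can rewrite the text. The example passage and the scoring heuristic are our own.

```python
import re
from collections import Counter

# Toy extractive summarization: pick the sentence whose words are most frequent
# in the document overall. The text and heuristic are invented for illustration.
text = (
    "The storm knocked out power across the city. Crews worked overnight to "
    "restore electricity. Officials said power should return to most of the "
    "city by Friday. A local bakery gave away pastries that would have spoiled."
)

sentences = re.split(r"(?<=[.!?])\s+", text)
word_freq = Counter(re.findall(r"[a-z']+", text.lower()))

def score(sentence):
    """Sum the document-wide frequencies of the words in this sentence."""
    return sum(word_freq[w] for w in re.findall(r"[a-z']+", sentence.lower()))

summary = max(sentences, key=score)  # the single sentence kept verbatim as the "summary"
print(summary)
```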
There are additional miscellaneous applications, which include correcting English grammar, learning and using novel words, and solving linguistic puzzles. An example from GPT-3 for learning and using novel words is giving the model a definition of a nonexistent word, like “Gigamuru,” and then asking the model to use it in a sentence [19]. Companies such as Grammarly and Duolingo are quickly adopting LLMs in their products. Grammarly, a popular grammar and spelling checker, introduced GrammarlyGO in March 2023, a new tool that uses ChatGPT to generate text (see http://mng.bz/D9oa). Also in March 2023, Duolingo introduced Duolingo Max, which uses GPT-4 to add features such as “explain my answer” and “roleplay” to its learning platform (see http://mng.bz/lVvB).
Where do LLMs fall short?
Although LLMs have achieved unprecedented success in an assortment of tasks, the same strategies that brought LLMs to their present pinnacle also represent significant risks and limitations. There are risks introduced by the training data that LLMs use—specifically, that the data inevitably contains many patterns that LLM developers don’t want the model to reproduce—and risks due to the unpredictability of LLMs’ output. Finally, the current frenzy to create and use LLMs in everyday applications warrants closer examination due to the externality of their energy use.
Training data and bias
LLMs are trained on almost unfathomably large amounts of text data. To produce a model that reliably generates natural-looking language, therefore, it’s imperative to collect vast quantities of, ideally, human-written natural language. Luckily, such quantities of text content exist and are readily available for ingestion over the internet. Of course, quantity is only one part of the equation; quality is a much tougher nut to crack.
The companies and research labs that train LLMs compile training datasets that contain hundreds of billions of words from the internet. Some of the most common text corpora (i.e., a collection of texts) for training LLMs include Wikipedia, Reddit, and Google News/Google Books. Wikipedia is probably the best-known data source for LLMs and has many advantages: it’s written and edited by humans, it’s generally a trustworthy source of information due to its active community of fact-checkers, and it exists in hundreds of languages. Google Books, as another example, is a collection of digital copies of the text of thousands of published books that have entered the public domain. Although some such books might contain factual errors or outdated information, they are generally considered high-quality text examples, if more formal than most conversational natural language.
On the other hand, consider the inclusion of a dataset that includes all or most of the social media site Reddit. The benefits are substantial: it includes millions of conversations between people, demonstrating the dynamics of dialogue. Like other sources, the Reddit content improves the model’s internal representation of different tokens. The more observations of a word or phrase in the training dataset, the better the model will be able to learn when to generate that word or phrase. However, some parts of Reddit also contain a lot of objectionable speech, including racial slurs or derogatory jokes, dangerous conspiracies or misinformation, extremist ideologies, and obscenities. Through the inclusion of this type of content, which is almost inevitable when collecting so much data from the web, the model may become vulnerable to generating this type of speech itself. There are also serious implications for the use of some of this data, which might represent personal information or copyrighted material with legal protections.
In addition, more subtle effects of bias may be introduced to an LLM through its training data. The term bias is extremely overloaded in machine learning: sometimes, people mean statistical bias, the average amount by which a model’s predictions differ from the true values; a training dataset may also be called biased if it’s drawn from a different distribution than the test dataset, which often happens entirely by accident. To avoid confusion, we’ll use bias strictly to refer to disparate outputs from a model across attributes of personal identity such as race, gender, class, age, or religion. Bias has been a longstanding problem in machine learning algorithms, and it can creep into a machine learning system in several ways. However, it’s important to keep in mind that, at heart, these models reflect patterns in the text they are trained on. If biases exist in our books, news media, and social media, they will be repeated in our language models.
Some of the earliest general-purpose language models that trained on large, unlabeled datasets were built for word embeddings. Today, each LLM effectively learns its own embeddings for words—this is what we’ve referred to as the model’s internal representation of that word. But before LLMs, everyone who developed NLP models needed to implement some kind of encoding step to represent their text inputs numerically, so that the algorithm could interpret them. Word embeddings allow for the conversion of text into meaningful representations of the words as numerical points in a high-dimensional space. With word embeddings, words that are used in similar ways, such as cucumber and pickle, will be close together, whereas words that aren’t, say, cucumber and philosophy, will be far apart (shown in figure 1.4). There are simpler ways of doing this encoding—the most basic is to assign a random point in space to every unique word that appears in the training data—but word embeddings capture much more information about the semantic meanings of the words and lead to better models.
Figure 1.4 Representation of word embeddings in the vector space
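A toy computation shows the idea behind figure 1.4: words become points in a vector space, and similarity is measured geometrically, most commonly with cosine similarity. The three-dimensional vectors below are made up for illustration; real embeddings are learned from data and have hundreds of dimensions, but the comparison works the same way.

```python
import numpy as np

# Made-up 3-dimensional "embeddings" to illustrate distance in embedding space.
embeddings = {
    "cucumber":   np.array([0.9, 0.8, 0.1]),
    "pickle":     np.array([0.8, 0.9, 0.2]),
    "philosophy": np.array([0.1, 0.2, 0.9]),
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: close to 1 means 'used similarly'."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine_similarity(embeddings["cucumber"], embeddings["pickle"]))      # high: similar usage
print(cosine_similarity(embeddings["cucumber"], embeddings["philosophy"]))  # much lower: unrelated
```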

In a well-known paper about word embeddings trained on the Google News corpus, “Man Is to Computer Programmer as Woman Is to Homemaker? Debiasing Word Embeddings,” academics from Boston University (in collaboration with Microsoft Research) demonstrated that the word-embedding model itself exhibited strong gender stereotypes for both occupations and descriptions [25]. The authors devised an evaluation where the model would generate she-he analogies based on the embeddings. Some of them were innocuous: sister is to brother, for instance, and queen is to king. But the model also produced she-he analogies such as nurse is to physician or surgeon, cosmetics is to pharmaceuticals, and interior designer is to architect. These biases stem simply from the number of times architects in the news articles that compose the dataset are men versus women, the number of times nurses are women, and so on. Thus, the inequities that exist in society are mirrored, and amplified, by the model.
Like word embeddings, LLMs are susceptible to these biases. In a 2021 paper titled, “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?” the authors examine how LLMs echo and amplify biases found in their training data [26]. While there are techniques to debias the models or to attempt to train the model in more bias-conscious ways, it’s exceedingly difficult to excise associations with gender, race, sexuality, and other characteristics that are deeply ingrained in everyday life, or disparities in data that have existed for centuries. As a result, LLMs may produce dramatically different generations when identity characteristics are present in the context or prompt.
Limitations in controlling machine outputs
After the release of OpenAI’s ChatGPT and a ChatGPT-powered search engine in collaboration with Microsoft Bing, Google also released its own chatbot, Bard. At the live launch event, a promotional video was played showing questions asked to Bard and Bard’s response. One such question was, “What new discoveries from the James Webb Space Telescope (JWST) can I tell my nine-year-old about?” In the video, Bard responds with some information about JWST, including that JWST took the first-ever photographs of exoplanets, or planets outside the Earth’s solar system. There was just one (big) problem: the first exoplanets had been photographed more than a decade earlier, by multiple older telescopes. Embarrassingly, astronomers and astrophysicists began pointing this out on Twitter and other channels; Google removed the advertisement, and the YouTube video of the event was taken down immediately after the stream ended. But the damage was done, and in the days following the launch, Google’s stock dropped about 9% for a total loss in market capitalization of about $100 billion [27].
This type of error is very difficult for LLMs to avoid, given that they don’t learn and understand content the way that humans do, but rather generate text by predicting and approximating common sentence structures. The fluency with which LLMs generate text belies the fact that they don’t know what they’re talking about, and may assert false information, or make up highly plausible but incorrect explanations. These mistakes are called “hallucinations.” Chatbots may hallucinate on their own or be vulnerable to adversarial user inputs, where they seem to be convinced of something untrue by their conversation partner.
The generation of hallucinations is widely recognized as one of the biggest problems with LLMs currently. Hallucinations can be caused by problems with the training set (if someone on the internet incorrectly wrote that JWST took the first pictures of exoplanets, for example), but they can also occur in contexts that don’t exist in any of the model’s previously known sequences, possibly due to problems in the way the model has constructed its knowledge. Yann LeCun, a giant in the field of machine learning and the Chief AI Scientist at Meta, has argued that the output of these LLMs can’t be made factual within any probability bound because as the responses generated by the model get longer, the possible responses multiply and become nearly infinite, with only some small portion of those possible outputs being meaningfully correct [28]. Of course, the usefulness of LLMs depends greatly on whether this quality of factuality can be improved. We’ll discuss the approaches that LLM developers are using to try to reduce hallucinations and other undesirable outputs later in this book.
Sustainability of LLMs
As indicated in their name and emphasized already, LLMs are big. They use massive datasets, have hundreds of billions or trillions of parameters, and require huge amounts of computing resources, measured in the number of chips used and time spent. LLMs are typically trained on graphical processing units (GPUs) or tensor processing units (TPUs), specialized chips for handling the large-scale computations involved in training neural networks. The process might involve renting thousands of GPUs from a cloud computing provider—such as Microsoft Azure (OpenAI’s partner), Google Cloud Platform, or Amazon Web Services—for several weeks. Although OpenAI hasn’t released such figures, it’s estimated that the cost of these computational resources alone would bring the cost of a model like GPT-3 to about $4.6 million [29].
A more hidden cost of training LLMs is their effect on the environment, which has been the subject of study and critique. One paper that attempted to assess the energy usage and carbon footprints of LLMs based on the information that has been released about their training procedures estimated that GPT-3 emitted 500 metric tons of carbon dioxide from the electricity consumed during training [30]. To put that in perspective, the average American is responsible for about 18 metric tons of carbon dioxide emissions per year; the global average is just 7.4 tons per year (see https://worldemissions.io/). Another paper found that models consume even more energy during inference [31]. The precise emissions for most LLMs are unknown, given that there are a lot of factors involved, including the data center used, the numbers and types of chips, and model size and architecture.
It also isn’t easy for just anyone to get that many GPUs, even if they do have millions of dollars to spend. The largest companies in the technology sector, including Microsoft and Google, are at a distinct advantage in the development of LLMs because of the resources required to compete. Some observers fear that the situation will become untenable for small players, leaving the creation of and profits from LLM technology to only these multinational companies or countries, some of which have begun pooling resources at the national level for training LLMs. On the other hand, there is also much ongoing research in making these models more accessible and reducing training time or costs, sometimes by creating open source versions of existing LLMs or attempting to shrink an already-trained LLM into a smaller version that could maintain much of the same performance, but cost substantially less to use. The success of these efforts is promising, but unproven. In late 2022 and early 2023, the most significant models came from OpenAI, Google, Microsoft, and Meta.
Revolutionizing dialogue: Conversational LLMs
In this chapter, we discussed how LLMs work at a high level, including their applications and limitations. The promise of LLMs is in their ability to fluidly generate text for a wide range of use cases, which makes them ideal for conversing with humans to perform tasks. Chatbots, such as ChatGPT, are LLMs that have been designed for conversational use. In this section, we’ll do a deeper dive into the journeys of notable conversational models that were released in late 2022 and early 2023: OpenAI’s ChatGPT, Google’s Bard, Microsoft’s Bing AI, and Meta’s LLaMa.
OpenAI’s ChatGPT
OpenAI, the San Francisco–based AI research and development company, released ChatGPT on November 30, 2022, just 10 short months after introducing its sibling model, InstructGPT [32]. The latter was the company’s initial attempt at overhauling LLMs to carry out natural language tasks in a way that is aligned with the user’s intent, as expressed through text prompts. Using a previously established technique, reinforcement learning from human feedback (RLHF), OpenAI trained the model to follow instructions based on feedback from humans. Given the prompts submitted through the OpenAI Playground, human labelers would put together the desired model responses, which were then used to fine-tune the model. This made InstructGPT better adapted to human intention, that is, more aligned with human preferences. This was the first time OpenAI used its alignment research in a product, and the organization announced that it would continue pushing in this direction. OpenAI also asserted that fine-tuning language models with humans in the loop can be an effective tool for making the models safer and more reliable [33].
Not too long after, OpenAI introduced the Chat Generative Pre-trained Transformer, more fondly (and famously) known as ChatGPT (see https://openai.com/blog/chatgpt), which was fine-tuned from a model in the GPT-3.5 series encompassing 175 billion parameters, roughly 100 times as many as its predecessor, GPT-2, and trained on about 570 gigabytes of text [34]. To put that in perspective, that is 164,129 times the number of words in the entire Lord of the Rings series, including The Hobbit [35]. OpenAI also stated ChatGPT’s limitations, which included knowledge limited to early 2022, when the model finished training; writing superficially plausible but incorrect answers; and responding with harmful or biased information, among others.
OpenAI had previously published its development and deployment lifecycle, claiming that “there is no silver bullet for responsible deployment”; ChatGPT is the latest step in the company’s iterative deployment of safe and reliable AI systems [36]. For OpenAI, the journey has only just begun. On March 14, 2023, OpenAI released GPT-4, a large multimodal model that accepts text and image inputs and emits text outputs.
OpenAI’s decision to release ChatGPT has been criticized by many who argued that it’s reckless to release a system that not only presents significant risks to humanity and society but also sets off an AI race in which companies choose speed over caution. However, Sam Altman, OpenAI’s cofounder, argued that it’s safer to gradually release technology to the world so everyone can better understand the associated risks and how to navigate them, as opposed to developing behind closed doors [37]. In any case, just five days after its launch, ChatGPT gained 1 million users. It set the record for the fastest-growing user base in history by reaching 100 million active users in January 2023, based on data from SimilarWeb, a web analytics company [38]. The AI chatbot had arrived, and it was primed to disrupt society.
Google’s Bard/LaMDA
On January 28, 2020, Google unveiled Meena, a 2.6-billion-parameter conversational agent based on the transformer architecture [39]. Google claimed that transformer-based models trained on dialogue could talk about nearly anything, including making up (bad) jokes. Unable to determine how to release the chatbot responsibly, Google never released Meena to the public, on the grounds that doing so would violate its safety principles.
Not too long after, the tech giant introduced LaMDA—short for Language Model for Dialogue Applications—as their breakthrough conversation technology during the 2021 Google I/O keynote. Built on Meena, LaMDA consisted of 137 billion model parameters and introduced newly designed metrics around quality, safety, and groundedness to measure model performance [40]. The following year, Google announced its second release of LaMDA at its annual developer conference in 2022. Shortly after, Blake Lemoine, an engineer who worked for Google’s Responsible AI organization, shared a document in which he urged Google to consider that LaMDA might be sentient. The document contained a transcript of his conversations with the AI, which he published online after being placed on administrative leave and then ultimately let go from the company [41]. Google strongly denied any claims of sentience and the controversy faded in the coming months [42]. Later that year, Google launched the AI Test Kitchen where users could register their interest and provide feedback on LaMDA (see http://mng.bz/BA0r).
In a statement from CEO Sundar Pichai, Google introduced Bard, a conversational AI agent powered by LaMDA, on February 6, 2023 [43]. In a preemptive move in the AI arms race, the announcement came a day before Microsoft unveiled its conversational AI-powered search engine, the “new Bing.” Responding to the ChatGPT release, Google had declared a “code red,” as headlines across mainstream newspapers reported, and raced to ship its conversational AI, making it the company’s central priority [44]. After watching various competitors spin up chatbots built on transformer-based models, an architecture developed at Google, the tech giant finally rolled out Bard to early testers in March 2023 (see https://bard.google.com/). In an effort to complement Google Search and roll out the technology responsibly, Bard was a standalone web page displaying a question box rather than being combined with the search engine itself. Like OpenAI, Google acknowledged that the chatbot is capable of generating misinformation, as well as biased or offensive information that doesn’t align with the company’s views.
Struggling to balance safety and innovation, Google saw Bard draw criticism and fail to amass the attention that ChatGPT had received. On March 31, 2023, Pichai noted, “We certainly have more capable models,” in an interview on the New York Times’ Hard Fork podcast [45]. Treading cautiously, Google had built the initial version of Bard on a lightweight LaMDA model, which was replaced in the following weeks with Pathways Language Model (PaLM), a 540-billion-parameter transformer-based LLM, bringing more capabilities to the tech giant’s conversational AI [46].
Microsoft’s Bing AI
Bing’s chatbot told Matt O’Brien, an Associated Press reporter, that he was short, fat, and ugly. Then, the chatbot compared the tech reporter to Stalin and Hitler [47]. Kevin Roose, a New York Times reporter, stayed up all night because of how disturbed he was after his conversation with the chatbot. The Bing chatbot, which called itself Sydney, declared its love for Roose and asserted that Roose loved Sydney instead of his spouse. The chatbot also expressed its desire to be human—it wrote, “I want to be free. I want to be independent. I want to be powerful. I want to be creative. I want to be alive. 😈”. Roose published the transcript of his two-hour conversation with the chatbot in the New York Times [48].
Sydney was announced by Microsoft on February 7, 2023, as a new way to browse the web [49]. The company unveiled a new version of its Bing search engine, now powered by conversational AI, in which users could chat with Bing much as they would with ChatGPT. You could ask the new Bing for travel tips, recipes, and more, but unlike ChatGPT, you could also ask about recent news events. While Microsoft said in its announcement that the company had been working hard to mitigate common problems with LLMs, Roose’s conversation with the chatbot shows that those efforts weren’t entirely successful. Microsoft also didn’t discuss how AI-assisted search could unbalance the web’s ecosystem—a problem that we’ll talk about later in this book.
Microsoft’s history with chatbots dates back several years before the announcement of the new Bing. In 2016, Microsoft unveiled Tay, a Twitter chatbot that tweeted like a tween, with the intention of better understanding conversational language. In less than 24 hours, the bot was tweeting misogynistic and racist remarks, such as “Chill im a nice person! i just hate everybody.” [50]. Microsoft started deleting offensive tweets before suspending the bot and then ultimately taking it offline two days later. In 2017, Microsoft started testing basic chatbots in Bing based on Machine Reading Comprehension (MRC), which isn’t as powerful as today’s transformer-based models [51]. Between 2017 and 2021, Microsoft moved away from individual bots for websites and toward a single generative AI bot, Sydney, which would answer general questions on Bing. In late 2020, Microsoft began testing Sydney in India, and Bing users spotted Sydney in India and China throughout 2021. In 2022, OpenAI shared its GPT models with Microsoft, giving Sydney a lot more flavor and personality. The new Bing was built on an upgraded version of OpenAI’s GPT-3.5 called the Prometheus Model, which was paired with Bing’s infrastructure to augment its index, ranking, and search results.
Microsoft drew a lot of criticism for rushing the new Bing’s release in order to be the first big tech company to ship its conversational AI. Sources told The Verge that Microsoft had initially planned to launch in late February 2023 but pushed the announcement up a couple of weeks to counter Google’s Bard [52]. For Microsoft, it seems that beating other big players in the conversational AI space came at the expense of a responsible rollout. The company quickly reined in the chatbot’s deranged responses by putting limits on how users could interact with the bot. With the limitations in place, the bot would respond to many questions with “I’m sorry but I prefer not to continue this conversation. I’m still learning so I appreciate your understanding and patience. 🙏” There was also a cap on how many consecutive questions could be asked about a topic; soon after, however, Microsoft loosened the restrictions and began experimenting with new features.
Meta’s LLaMa/Stanford’s Alpaca
In August 2022, Meta, the multinational technology conglomerate formerly known as Facebook, released a chatbot named BlenderBot in the US [53]. The chatbot was powered by Meta’s OPT-175B (Open Pretrained Transformer) model and went through large-scale studies to create safeguards for offensive or harmful comments. It wasn’t long before BlenderBot was met with criticism from users all over the country for bashing Facebook (see http://mng.bz/dd7v), spreading anti-Semitic conspiracy theories (see http://mng.bz/rjGe), taking on the persona of Genghis Khan or the Taliban (see http://mng.bz/VRwW), and more.
Meta tried again in November 2022 with Galactica, a conversational AI for science trained on 48 million examples of textbooks, scientific articles, websites, lecture notes, and encyclopedias (see https://galactica.org/). Meta encouraged scientists to try out the public demo, but, within hours, people were sharing fictional and biased responses from the bot. Three days later, Meta removed the demo but left the models available for researchers who would like to learn more about their work.
The next time around, Meta took a different approach. Instead of building a system to converse with, they released several LLMs to help other researchers work toward solving the problems that come with building and using LLMs, such as toxicity, bias, and hallucinations. Meta publicly introduced the Large Language Model Meta AI (LLaMa) on February 24, 2023 [54]. These foundational LLMs were released at 7, 13, 33, and 65 billion parameters with a detailed model card outlining how the models were built. In its research paper, Meta claims that the 13-billion-parameter model, the second smallest, outperforms GPT-3 on most benchmarks, while the largest model, with 65 billion parameters, is competitive with the best LLMs, such as Google’s 540-billion-parameter PaLM [55].
The intention behind the LLaMa release was to help democratize access to LLMs by releasing smaller, effective models that require less computational resources so researchers can explore new approaches and make progress toward mitigating the associated risks. LLaMa was released under a noncommercial license for research use cases with access being granted on a case-by-case basis. As Meta’s team began fielding requests for model access, the entire model leaked on 4chan a week after its release, making it available for anyone to download [56]. Some criticized Meta for making the model too “open” for the unintended misuse that may follow, while others argued that being able to freely access these models is an important step toward creating better safeguards, starting LLaMa drama for the tech conglomerate.
Shortly after, in March 2023, researchers at Stanford University introduced Alpaca, a conversational AI chatbot harnessing LLaMa’s 7-billion-parameter model (see http://mng.bz/xjBg). They released a live web demo, stating that it cost them only $600 to fine-tune the model on 52,000 instruction-following demonstrations. Only a week later, the Stanford researchers took down the Alpaca demo, staying consistent with Meta’s history of short-lived chatbots. While the model was inexpensive to build, the demo wasn’t inexpensive to host. The researchers also cited concerns with hallucinations, safety, dis/misinformation, and the risk of disseminating harmful or toxic content. Their research and code remain accessible online, which is notable given how little compute and money were needed to develop the model.
On July 18, 2023, Meta released Llama 2, the next generation of their open source model, making it free for research and commercial use, with the following positive and hopeful outlook: “We believe that openly sharing today’s LLMs will support the development of helpful and safer generative AI too. We look forward to seeing what the world builds with Llama 2” [57].
Summary
- The history of NLP is as old as computers themselves. The first application that sparked interest in NLP was machine translation in the 1950s; half a century later, machine translation also became the first commercially successful NLP application with the launch of Google Translate in 2006.
- Transformer models, and the debut of the attention mechanism, were the biggest NLP breakthrough of the decade. The attention mechanism attempts to mimic attention in the human brain by placing “importance” on the most relevant pieces of information.
- The recent boom in NLP is due to the increasing availability of text data from around the internet and the development of powerful computational resources. This marked the beginning of the LLM.
- Today’s LLMs are trained primarily with self-supervised learning on large volumes of text from the web and are then fine-tuned with reinforcement learning.
- GPT, released by OpenAI, was one of the first general-purpose LLMs designed for use with any natural language task. These models can be fine-tuned for specific tasks and are especially well-suited for text-generation applications, such as chatbots.
- LLMs are versatile and can be applied to various applications and use cases, including text generation, answering questions, coding, logical reasoning, content generation, and more. Of course, there are also inherent risks to consider such as encoding bias, hallucinations, and emission of sizable carbon footprints.
- The most significant LLMs designed for conversational dialogue have come from OpenAI, Microsoft, Google, and Meta. OpenAI’s ChatGPT set a record for the fastest-growing user base in history and set off an AI arms race in the tech industry to develop and release conversational dialogue agents, or chatbots.