1 Large language models: The foundation of generative AI

 

This chapter covers

  • Introducing large language models
  • Understanding the intuition behind transformers
  • Exploring the applications, limitations, and risks of large language models
  • Surveying the major players in generative AI

On November 30, 2022, San Francisco–based company OpenAI tweeted, “Try talking with ChatGPT, our new AI system which is optimized for dialogue. Your feedback will help us improve it” [1]. ChatGPT, a chatbot that interacts with users through a web interface, was described as a minor update to the models that OpenAI had already released and made available through application programming interfaces (APIs). But with the release of the web app, anyone could hold a conversation with ChatGPT and ask it to write poetry or code, recommend movies or workout plans, or summarize or explain pieces of text. Many of the responses felt like magic. ChatGPT set the tech world on fire, reaching 1 million users in a matter of days and 100 million users within two months of launch. By some measures, it’s the fastest-growing internet service ever [2].

The evolution of natural language processing

The birth of LLMs

The explosion of LLMs

What are LLMs used for?

Language modeling

Question answering

Coding

Content generation

Logical reasoning

Other natural language tasks

Where do LLMs fall short?

Training data and bias

Limitations in controlling machine outputs

Sustainability of LLMs

Major players in generative AI

OpenAI

Google

Meta

Microsoft

Anthropic

Other notable players

Conclusion

Summary