chapter one

1 Introduction to Large Language Models (LLMs)

This chapter covers

Introducing large language models (LLMs) and how they rose to prominence in the field of natural language processing (NLP).
Understanding the intuition behind Transformers, a novel neural network architecture for natural language understanding.
Exploring the attributes that make LLMs ideal candidates for dialogue agents, along with their applications, limitations, and risks.
Surveying breakthrough LLMs for dialogue, including OpenAI’s ChatGPT, Google’s Bard, Microsoft’s Bing AI, and Meta’s LLaMa.

On November 30, 2022, the San Francisco-based company OpenAI tweeted, "Try talking with ChatGPT, our new AI system which is optimized for dialogue. Your feedback will help us improve it." [1] ChatGPT, a chatbot that can interact with users through a web interface, was described as a minor update to the existing models that OpenAI had already been releasing and making available through APIs. But with the release of the web app, suddenly, anyone could have conversations with ChatGPT, ask it to write poetry or code, recommend movies or workout plans, and summarize or explain pieces of text. Many of the responses felt like magic. ChatGPT set the tech world on fire, reaching 1 million users in a matter of days and 100 million users two months after launch. By some measures, it is the fastest-growing internet service ever [2].

1.1 Introduction to Modern Natural Language Processing

1.2 The Birth of Large Language Models: Attention is All You Need

1.3 Explosion of Large Language Models

1.4 Applications of Large Language Models

1.4.1 Language Modeling

1.4.2 Question Answering

1.4.3 Coding

1.4.4 Content Generation

1.4.5 Logical Reasoning

1.4.6 Other Natural Language Tasks

1.5 Limitations and Risks of Large Language Models

1.5.1 Limitations in Training Data and Bias Issues

1.5.2 Limitations in Controlling Machine Outputs

1.5.3 Sustainability of Large Language Models

1.6 Overview of Current Language Models for Dialogue

1.6.1 OpenAI’s ChatGPT

1.6.2 Google’s Bard / LaMDA

1.6.3 Microsoft’s Bing AI

1.6.4 Meta’s LLaMa / Stanford’s Alpaca

1.7 Summary