2 Introduction to large language models

 

This chapter covers

  • An overview of LLMs
  • Key use cases powered by LLMs
  • Foundational models and their effect on AI development
  • Key concepts of LLMs, such as prompts, prompt engineering, tokens, embeddings, model parameters, context windows, and emergent behavior
  • An overview of small language models
  • Comparison of open source and commercial LLMs

Large language models (LLMs) are generative AI models that can understand and generate human-like text based on a given input. LLMs are the foundation of many natural language processing (NLP) tasks, such as search, speech-to-text, sentiment analysis, and text summarization. They are general-purpose language models that are pretrained and can then be fine-tuned for specific tasks and purposes.
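One detail worth noting up front is that LLMs do not process raw characters; they split text into tokens, a concept covered later in this chapter (sections 2.8.2 and 2.8.3). As a rough illustration only, the sketch below uses a common rule of thumb for English text (roughly four characters per token); real tokenizers such as BPE are learned from data, so actual counts will differ.

```python
# Illustration: LLMs operate on tokens, not characters.
# The ~4 characters-per-token figure is a widely cited heuristic
# for English text, not an exact tokenizer -- real counts vary.

def estimate_tokens(text: str) -> int:
    """Estimate a token count using the ~4 chars/token heuristic."""
    return max(1, round(len(text) / 4))

prompt = "Large language models are generative AI models."
print(estimate_tokens(prompt))  # ~12 tokens for this 47-character prompt
```

Estimates like this are useful for budgeting against a model's context window, another concept introduced in section 2.8.6; for billing or precise limits, always use the model's own tokenizer.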

This chapter explores the fascinating world of LLMs and their transformative effect on artificial intelligence (AI). As a significant advancement in AI, LLMs have demonstrated remarkable capabilities in understanding and generating human-like text, enabling numerous applications across industries. Here, we dive into the key use cases of LLMs, the different types of LLMs, and the concept of foundational models, which has revolutionized AI development.

2.1 Overview of foundational models

2.2 Overview of LLMs

2.3 Transformer architecture

2.4 Training cutoff

2.5 Types of LLMs

2.6 Small language models

2.7 Open source vs. commercial LLMs

2.7.1 Commercial LLMs

2.7.2 Open source LLMs

2.8 Key concepts of LLMs

2.8.1 Prompts

2.8.2 Tokens

2.8.3 Counting tokens

2.8.4 Embeddings

2.8.5 Model configuration

2.8.6 Context window

2.8.7 Prompt engineering

2.8.8 Model adaptation

2.8.9 Emergent behavior