5 Exploring and evaluating language models
This chapter covers
- Understanding the capabilities of LMs
- Selecting suitable LMs
- Customizing LMs for specific tasks
- Integrating LMs into the wider application context
- Evaluating LMs
In this chapter, we will explore language models (LMs): their capabilities, limitations, and practical applications. When building generative AI products, you need a solid understanding of these models to make informed decisions about model selection, deployment, customization, and risk management. You also need to support your engineers as they make design decisions about the integration, adaptation, and evaluation of LMs within the larger AI system you are building.
A word on terminology: while giant LLMs were the main driver of the generative AI boom, the current trend is towards downscaling and using smaller, more efficient models. In this and the subsequent chapters, I will generally refer to “language models” (LMs), which can be either large (LLMs; 2B+ parameters) or small (SLMs; <2B parameters). We will focus on the text modality, touching on multimodal models only for those use cases where other modalities are combined with text.