5 Exploring and evaluating language models
This chapter covers
- Understanding the capabilities of LMs
- Selecting suitable LMs
- Customizing LMs for specific tasks
- LMs in the wider application context
- Evaluating LMs
In this chapter, we will dive into the world of language models (LMs), which can be used for a wide variety of tasks, ranging from content creation to text summarization, translation, and more complex problem-solving. The chapter will provide a solid understanding of LMs to help you make informed decisions about model selection, deployment, customization, and risk management. You will also need this understanding to support your engineers in making design decisions about the integration, adaptation, and evaluation of LMs within the larger AI system you are building.
Terminology
While giant LLMs were the main “culprit” behind the generative AI boom, there is also a trend toward downscaling and using smaller, more efficient models. In what follows, I will use “language model” (LM) as a general term encompassing both large language models (LLMs; 2B+ parameters) and small language models (SLMs; <2B parameters).
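To make the size convention above concrete, here is a minimal Python sketch that labels a model by its parameter count. The 2-billion cutoff comes from the definition above; the function name `size_class`, the constant `LLM_MIN_PARAMETERS`, and the sample parameter counts are hypothetical and serve only as illustration. Note that under this convention, a model with exactly 2 billion parameters counts as an LLM.

```python
# Illustrative sketch only: the names below are hypothetical, not part of any library.
# The cutoff follows this chapter's convention: 2B+ parameters = LLM, fewer = SLM.
LLM_MIN_PARAMETERS = 2_000_000_000


def size_class(num_parameters: int) -> str:
    """Return "LLM" or "SLM" according to the chapter's 2B-parameter convention."""
    if num_parameters < 0:
        raise ValueError("num_parameters must be non-negative")
    return "LLM" if num_parameters >= LLM_MIN_PARAMETERS else "SLM"


if __name__ == "__main__":
    # Made-up parameter counts chosen to sit on both sides of the cutoff.
    for count in (350_000_000, 2_000_000_000, 70_000_000_000):
        print(f"{count:>14,} parameters -> {size_class(count)}")
```

The cutoff is this chapter's working convention, so keeping it in a single named constant makes it easy to adjust if you prefer a different boundary.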