7 Efficient and specialized small language models
This chapter covers
- Why small language models matter
- Sentiment classification with ModernBERT
- Adapting Gemma 3 270M for empathy and prosocial tone
- Adapting Gemma 3 270M for translation
- Broader use cases
As powerful as today’s largest language models are, they are not always the right tool for the job. Deploying a multi-hundred-billion-parameter model for every task can be inefficient, expensive, and often unnecessary. Many applications demand something lighter, faster, or more focused.
This is where small language models (SLMs) come in. Instead of forcing one giant model to do everything, we can rely on compact models designed to strike the right balance between capability, cost, and speed. In some cases, these smaller models are optimized for edge devices. In others, they serve as cost-efficient cloud deployments or as highly specialized classifiers, detectors, or routers in larger pipelines. In this chapter, we’ll look at the role small transformer-based models can play in an AI system and get an overview of how to evaluate and tune them for specialized use cases.
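As a taste of what’s ahead, the following minimal sketch (assuming PyTorch and a recent version of the Hugging Face transformers library with ModernBERT support are installed) shows how little code it takes to set up a compact, specialized classifier: ModernBERT-base, at roughly 150 million parameters, plus a small sequence-classification head that would be fine-tuned for sentiment, as we do later in this chapter. The label count here is illustrative.

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# ModernBERT-base (~150M parameters) with a freshly initialized
# 2-class head -- small enough to fine-tune and serve cheaply.
model_id = "answerdotai/ModernBERT-base"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(
    model_id,
    num_labels=2,  # e.g., negative / positive (illustrative)
)

inputs = tokenizer(
    "The new release is fast and easy to set up.",
    return_tensors="pt",
)
logits = model(**inputs).logits  # head is untrained: fine-tune before use
```

A model this size can run on a single modest GPU or even a CPU, which is exactly the deployment profile that makes SLMs attractive as dedicated components in a larger pipeline.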