
7 Efficient and specialized small language models

 

This chapter covers

  • Why small language models matter
  • Sentiment classification with ModernBERT
  • Adapting Gemma 3 270M for empathy and prosocial tone
  • Adapting Gemma 3 270M for translation
  • Broader use cases

As powerful as today’s largest language models are, they are not always the right tool for the job. Deploying a model with hundreds of billions of parameters for every task can be inefficient, expensive, and often unnecessary. Many applications demand something lighter, faster, or more focused.

This is where small language models (SLMs) come in. Instead of forcing one giant model to do everything, we can use compact models designed to strike the right balance between capability, cost, and speed. In some cases, these smaller models are optimized for edge devices. In others, they serve as cost-efficient cloud deployments or as highly specialized classifiers, detectors, or routers within larger pipelines. In this chapter, we’ll look at the role small transformer-based models can play in an AI system and get an overview of how to evaluate and tune them for specialized use cases.
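To make the pattern concrete before we dive in, the following sketch shows what it looks like to use a compact model as a dedicated classifier through the Hugging Face transformers pipeline API. The checkpoint named here is a generic, publicly available sentiment model used purely as a stand-in; later in this chapter we fine-tune ModernBERT on the Financial PhraseBank dataset instead.

from transformers import pipeline

# Load a compact encoder-only model as a specialized sentiment classifier.
# The checkpoint below is a stand-in for illustration only; the chapter's
# worked example fine-tunes ModernBERT on Financial PhraseBank instead.
classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

result = classifier("The company's quarterly revenue exceeded expectations.")
print(result)  # e.g., [{'label': 'POSITIVE', 'score': 0.99}]

A small model like this runs comfortably on a CPU or an edge device and can sit in front of, or alongside, a much larger model in a pipeline, which is exactly the division of labor we explore throughout the chapter.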

7.1 The power of small

7.2 Small models as agents in a system of specialists

7.3 Classification with SLMs

7.3.1 Evaluating classification performance

7.3.2 Accuracy and the F1-score

7.3.3 Fine-tuning SLMs on the Financial PhraseBank dataset

7.4 Adapting Gemma 3 270M for empathy and prosocial tone

7.5 Adapting Gemma 3 270M for English–Spanish translation

7.6 Broader use cases and complementary models

Summary