about this book
Transformers in Action is a comprehensive guide to understanding and applying transformer models in the language and multimodal space. These models are foundational to modern AI systems such as ChatGPT and Gemini. The book aims to provide you with a solid foundation to use these models for your own projects, starting with the core concepts of transformers and then moving to practical and more advanced applications such as multimodal retrieval systems.
You will learn why transformers are designed the way they are and how they work, giving you both the theoretical understanding and the hands-on skills to use them effectively. Along the way, you’ll see when small language models (SLMs) are the right choice and when an encoder-only or a decoder-only architecture is the better fit.
Who should read this book
This book is for data scientists and machine learning engineers who want to learn how to build and apply transformer-based models for language and multimodal tasks. The goal is to give you a strong foundation in the essentials, so you can confidently move on to advanced models and approaches.
How this book is organized: A road map
The book is divided into three parts covering 10 chapters. Part 1 explains the foundations of transformer models: