chapter one

1 Introduction to DeepSeek

This chapter covers

  • Why DeepSeek represents a turning point in open-source AI
  • A high-level roadmap of the key innovations we will build throughout this book
  • The book's structure, scope, and prerequisites

Large Language Models (LLMs) have transformed the technology landscape in recent years. We now live in a world where AI systems can carry on conversations, write code, draft essays, and even solve complex problems in ways that feel almost human. But what if you, a technically curious reader, could build one of these powerful AI models from scratch?

What if you could understand the inner workings of a state-of-the-art LLM by constructing it step by step, with code and theory hand in hand? That is what this book sets out to teach you.

We will peel back the layers of a cutting-edge open-source LLM named DeepSeek, recreating its key innovations from the ground up. By the end, you will understand not only what makes DeepSeek unique but also how to implement those innovations yourself, and you will gain practical insight into modern AI development along the way.

1.1 Why DeepSeek? A turning point in open-source AI

1.2 The key innovations we will build

1.2.1 Architecture

1.2.2 Training

1.2.3 Post-training

1.3 Book structure and scope

1.4 What this book will teach you and what it won’t

1.5 What you will need to follow along

1.6 Summary