
2 Now we are speaking


This chapter covers

  • Introducing the first computational approaches to language.
  • Breaking text into machine-readable units.
  • Mapping words into spaces where meaning takes geometric form.
  • Exposing why language remains a demanding frontier for AI.

Humans rely on language to coordinate actions, share knowledge, and explain ideas to one another. Through speech and writing, information can travel far beyond the moment in which it was created, allowing knowledge to accumulate across generations and become part of a shared intellectual inheritance. For machines, however, language has always posed a difficult challenge, because words rarely carry meaning on their own and instead depend on context, intent, and the surrounding conversation. Teaching a computer to process text therefore means confronting the full complexity of how people express ideas, from simple descriptions to abstract arguments that unfold across entire paragraphs.

This chapter follows the long arc from early computational attempts to model language to the representations that make modern systems possible. It explores how rules gave way to statistics, how words were broken into machine-readable units, and how those units were embedded into numerical representations that capture patterns of meaning. Along the way, it examines why language poses challenges that go beyond syntax or vocabulary: ambiguity, long-range dependencies, pragmatic inference, and the sheer diversity of human expression.
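As a preview of that arc, the pipeline from raw text to numerical representations can be sketched in a few lines. This is only an illustration: the whitespace tokenizer and the hand-picked two-dimensional vectors below are toy stand-ins for the real techniques (subword tokenization, learned embeddings) developed later in this chapter.

```python
import math

text = "the cat sat on the mat"

# 1. Tokenize: break raw text into machine-readable units.
#    (Real tokenizers use subword schemes such as BPE; this is a toy split.)
tokens = text.split()

# 2. Map each distinct token to an integer ID (a vocabulary).
vocab = {tok: i for i, tok in enumerate(dict.fromkeys(tokens))}
ids = [vocab[tok] for tok in tokens]

# 3. Embed: associate each ID with a vector.
#    Real embeddings are learned from data; these values are illustrative.
embeddings = {
    "cat": [0.9, 0.1],
    "mat": [0.8, 0.2],
    "the": [0.1, 0.9],
}

def cosine(u, v):
    """Cosine similarity: nearby directions stand in for related meanings."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

print(cosine(embeddings["cat"], embeddings["mat"]))  # high: related words
print(cosine(embeddings["cat"], embeddings["the"]))  # low: unrelated words
```

Even in this toy form, the key idea is visible: once words become vectors, relationships between meanings become measurable as geometry, which is the theme section 2.5 develops in detail.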

2.1 Language equals intelligence?

2.1.1 When words stand in for minds

2.1.2 What machine language does reveal

2.1.3 Thought beyond language

2.1.4 Why language still matters

2.2 Machines’ first words

2.2.1 When language was logic

2.2.2 Patterns without understanding

2.2.3 From rules to evidence

2.3 Counting words

2.3.1 From intuition to probability

2.3.2 What is an n-gram?

2.3.3 Building language from statistics

2.3.4 The problem of sparsity

2.3.5 Balancing context and complexity

2.4 A token full of meaning

2.4.1 From raw text to tokens

2.4.2 Words, subwords, and characters

2.4.3 Compressing language

2.4.4 From BPE to modern tokenizers

2.4.5 Handling the unknown

2.4.6 Tokens as information boundaries

2.4.7 Why tokens matter

2.5 Embedding reality

2.5.1 From indexes to vectors

2.5.2 Learning embeddings

2.5.3 Embeddings go global

2.5.4 The geometry of meaning

2.5.5 Meaning in context