2 Large language models: A deep dive into language modeling

This chapter covers

  • The linguistic background for understanding meaning and interpretation
  • A comparative study of language modeling techniques
  • Attention and the transformer architecture
  • How large language models both fit into and build upon these histories

If you know the enemy and know yourself, you need not fear the result of a hundred battles.
—Sun Tzu

This chapter explores linguistics as it relates to the development of LLMs, covering the foundations of semiotics, linguistic features, and the progression of language modeling techniques that have shaped the field of natural language processing (NLP). We begin with the basics of linguistics and their relevance to LLMs, highlighting key concepts such as syntax, semantics, and pragmatics that form the basis of natural language and play a crucial role in how LLMs function. We then turn to semiotics, the study of signs and symbols, and examine how its principles have informed the design and interpretation of LLMs.

2.1 Language modeling

2.1.1 Linguistic features

2.1.2 Semiotics

2.1.3 Multilingual NLP

2.2 Language modeling techniques

2.2.1 N-gram and corpus-based techniques

2.2.2 Bayesian techniques

2.2.3 Markov chains

2.2.4 Continuous language modeling

2.2.5 Embeddings

2.2.6 Multilayer perceptrons

2.2.7 Recurrent neural networks and long short-term memory networks

2.2.8 Attention

2.3 Attention is all you need

2.3.1 Encoders

2.3.2 Decoders

2.3.3 Transformers

2.4 Really big transformers

Summary