Appendix B. References and Further Reading


Chapter 1

Custom-built LLMs can outperform general-purpose LLMs, as a team at Bloomberg showed with a version of GPT pretrained from scratch on finance data. The custom LLM outperformed ChatGPT on financial tasks while maintaining good performance on general LLM benchmarks:

Existing LLMs can also be adapted and finetuned to outperform general LLMs, as teams from Google Research and Google DeepMind showed in a medical context:

The paper that proposed the original transformer architecture:

The original encoder-style transformer, called BERT:

The paper describing the decoder-style GPT-3 model, which inspired modern LLMs and serves as a template for implementing an LLM from scratch in this book:

The original vision transformer for classifying images, which illustrates that transformer architectures are not restricted to text inputs:

Chapter 2

Chapter 3

Chapter 4

Chapter 5

Chapter 6

Chapter 7