preface
A lot has changed in the world of NLP since the first edition. You probably couldn't have missed the release of BERT, GPT-3, Llama 3, and the wave of enthusiasm for ever-larger large language models such as ChatGPT.
More subtly, while reviewing the first edition of this book at the San Diego Machine Learning book club (https://github.com/SanDiegoMachineLearning/bookclub), we watched as PyTorch (https://github.com/pytorch/pytorch) and spaCy (https://spacy.io/) rose to prominence as the workhorses of NLP, even at the biggest of big tech corporations.
And the past few years have seen the rise of Phind, You.com, Papers With Code (http://paperswithcode.com), a repository of machine learning papers, code, datasets, and leaderboards maintained by Meta AI Research; the Wayback Machine (http://web.archive.org), maintained by the Internet Archive, which houses petabytes of cached natural language content from web pages you wouldn't otherwise have access to; arXiv.org (http://arxiv.org), maintained by Cornell University so that independent researchers can release prepublication academic research; and many smaller search engines powered by prosocial NLP algorithms. In addition, vector search databases were a niche product when we wrote the first edition; now they are the cornerstone of most NLP applications.