7 Finding kernels of knowledge in text with CNNs


This chapter covers

  • Understanding neural networks for NLP
  • Finding patterns in sequences
  • Building a convolutional neural network with PyTorch
  • Training a convolutional neural network
  • Training embeddings
  • Classifying text

In this chapter, you will unlock the often-misunderstood superpowers of convolution for natural language processing (NLP). Convolution helps your machine understand text by detecting patterns in sequences of words, capturing how each word relates to its neighbors.

Convolutional neural networks (CNNs) are all the rage for computer vision (image processing), but few businesses appreciate the power of CNNs for NLP. This creates an opportunity both for you as an NLP learner and for entrepreneurs who understand what CNNs can do. For example, in 2022, Cole Howard and Hannes Hapke (coauthors of the first edition of this book) used their NLP CNN expertise to help their startup automate business and accounting decisions.1 And deep learning deep thinkers in academia, like Christopher Manning and Geoffrey Hinton, use CNNs to crush the competition in NLP. You can too.
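To give you a taste of what is ahead, the core idea of convolution is simply sliding a small pattern detector (a kernel) along a sequence and scoring how well each window matches. Here is a minimal NumPy sketch with a made-up numeric sequence standing in for word features; the sequence and kernel values are toy examples, not anything from a real dataset:

```python
import numpy as np

# Toy "signal": numbers standing in for features of a word sequence
sequence = np.array([0, 1, 2, 3, 1, 0, 1, 2, 3, 1], dtype=float)

# A small kernel that responds strongly to the rising pattern (1, 2, 3)
kernel = np.array([1, 2, 3], dtype=float)

# Slide the kernel along the sequence, taking a dot product at each position
# (strictly speaking this is cross-correlation, as section 7.2.3 will explain)
scores = np.array([
    sequence[i:i + len(kernel)] @ kernel
    for i in range(len(sequence) - len(kernel) + 1)
])
print(scores)  # → [ 8. 14. 11.  5.  4.  8. 14. 11.]
```

The score peaks (14) at exactly the two positions where the pattern (1, 2, 3) begins. Swap the numbers for word embedding vectors and you have the essence of a CNN layer for text, which the rest of this chapter builds up in PyTorch.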

7.1 Patterns in sequences of words

7.2 Convolution

7.2.1 Stencils for natural language text

7.2.2 A bit more stenciling

7.2.3 Correlation vs. convolution

7.2.4 Convolution as a mapping function

7.2.5 Python convolution example

7.2.6 PyTorch 1D CNN on 4D embedding vectors

7.2.7 Natural examples

7.3 Morse code

7.3.1 Decoding Morse with convolution

7.4 Building a CNN with PyTorch

7.4.1 Clipping and padding

7.4.2 Better representation with word embeddings

7.4.3 Transfer learning

7.4.4 Robustifying your CNN with dropout

7.5 PyTorch CNN to process disaster toots

7.5.1 Network architecture

7.5.2 Pooling

7.5.3 Linear layers