Chapter 4: Neural networks that understand language: king – man + woman == ?


Chapter 11 from Grokking Deep Learning by Andrew Trask.

In this chapter

  • Natural language processing (NLP)
  • Supervised NLP
  • Capturing word correlation in input data
  • Intro to an embedding layer
  • Neural architecture
  • Comparing word embeddings
  • Filling in the blank
  • Meaning is derived from loss
  • Word analogies

“Man is a slow, sloppy, and brilliant thinker; computers are fast, accurate, and stupid.”

John Pfeiffer, in Fortune, 1961

What does it mean to understand language?

What kinds of predictions do people make about language?

Up until now, we’ve been using neural networks to model image data. But neural networks can be used to understand a much wider variety of datasets. Exploring new datasets also teaches us a lot about neural networks in general, because different datasets often call for different styles of training, depending on the challenges hidden in the data.

We’ll begin this chapter by exploring a much older field that overlaps with deep learning: natural language processing (NLP). This field is dedicated exclusively to the automated understanding of human language, and until recently it did so without deep learning. We’ll discuss the basics of deep learning’s approach to this field.

Natural language processing (NLP)

NLP is divided into a collection of tasks or challenges

Supervised NLP

Words go in, and predictions come out

IMDB movie reviews dataset

You can predict whether people post positive or negative reviews
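
To make this concrete, here’s a minimal sketch of loading the data. It assumes the reviews and labels live in two plain-text files, one review or label per line; the file names reviews.txt and labels.txt are an assumption, not part of the original text:

    # Load one review per line and one label per line; the file names are
    # assumptions, so point them at wherever your copy of the data lives.
    with open('reviews.txt') as f:
        raw_reviews = f.readlines()
    with open('labels.txt') as f:
        raw_labels = f.readlines()

    # Tokenize crudely on spaces; a set keeps only which words occur.
    tokens = [set(review.lower().split(' ')) for review in raw_reviews]
    targets = [1 if label.strip() == 'positive' else 0 for label in raw_labels]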

Capturing word correlation in input data

Bag of words: Given a review’s vocabulary, predict the sentiment
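
One way to realize this encoding is a “multi-hot” vector: build a vocabulary over the whole dataset, then mark a 1 for every vocabulary word a review contains. A sketch, building on the tokens list above (the helper name to_bag_of_words is ours):

    import numpy as np

    # Collect every word that appears anywhere in the corpus.
    vocab = list(set(word for review in tokens for word in review))
    word2index = {word: i for i, word in enumerate(vocab)}

    def to_bag_of_words(review):
        # One slot per vocabulary word; 1 marks presence, not count.
        x = np.zeros(len(vocab))
        for word in review:
            x[word2index[word]] = 1
        return x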

Predicting movie reviews

With the encoding strategy and the previous network, you can predict sentiment
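
A sketch of what such a network can look like: two layers mapping the bag-of-words vector to a single sigmoid output, trained one review at a time. The layer sizes, learning rate, and use of sigmoid on both layers are our assumptions, not necessarily the book’s exact code:

    alpha, hidden_size = 0.01, 100
    np.random.seed(1)
    weights_0_1 = 0.2 * np.random.random((len(vocab), hidden_size)) - 0.1
    weights_1_2 = 0.2 * np.random.random((hidden_size, 1)) - 0.1

    def sigmoid(x):
        return 1 / (1 + np.exp(-x))

    for review, target in zip(tokens, targets):
        layer_0 = to_bag_of_words(review)
        layer_1 = sigmoid(layer_0.dot(weights_0_1))      # hidden activity
        layer_2 = sigmoid(layer_1.dot(weights_1_2))      # P(positive)
        delta_2 = layer_2 - target                       # output error
        delta_1 = delta_2.dot(weights_1_2.T) * layer_1 * (1 - layer_1)
        weights_1_2 -= alpha * np.outer(layer_1, delta_2)
        weights_0_1 -= alpha * np.outer(layer_0, delta_1)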

Intro to an embedding layer

Here’s one more trick to make the network faster

After running the previous code, run this code
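
The book’s exact code isn’t reproduced here, but the idea behind the trick can be sketched. A bag-of-words vector is almost all zeros, so multiplying it by weights_0_1 wastes time on zero entries; summing just the weight rows for the words that are present gives the same hidden layer far faster. This is exactly what makes weights_0_1 behave like an embedding layer. A helper built on the earlier sketches:

    def hidden_layer_fast(review):
        # Equivalent to sigmoid(to_bag_of_words(review).dot(weights_0_1)):
        # select the rows for the words that are present and sum them.
        indices = [word2index[word] for word in review]
        return sigmoid(np.sum(weights_0_1[indices], axis=0))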

Interpreting the output

What did the neural network learn along the way?
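
One way to peek at what it learned: words the network treats alike end up with nearby rows in weights_0_1, so ranking the vocabulary by distance to a chosen word’s row surfaces its neighbors. A sketch (the function name and the choice of Euclidean distance are ours):

    from collections import Counter

    def similar(target='beautiful'):
        target_row = weights_0_1[word2index[target]]
        scores = Counter()
        for word, index in word2index.items():
            diff = weights_0_1[index] - target_row
            scores[word] = -np.sqrt(diff.dot(diff))  # closer = higher score
        return scores.most_common(10)

After training on sentiment alone, similar('beautiful') tends to return other positive words: the meaning the rows capture is whatever helps minimize the loss.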

Neural architecture