13 Scaling up (optimization, parallelization, and batch processing)


This chapter covers

  • Scaling up an NLP pipeline
  • Speeding up search with indexing
  • Batch processing to reduce your memory footprint
  • Parallelization to speed up NLP
  • Running NLP model training on a GPU

In chapter 12, you learned how to use all the tools in your NLP toolbox to build an NLP pipeline capable of carrying on a conversation. We demonstrated crude examples of this chatbot dialog capability on small datasets. The humanness, or IQ, of your dialog system seems to be limited by the data you train it with. Most of the NLP approaches you’ve learned give better and better results if you can scale them up to handle larger datasets.

You may have noticed that your computer bogs down, or even crashes, when you run some of the examples we gave you on large datasets. Some of the datasets available through nlpia.data.loaders.get_data() will exceed the memory (RAM) available on most PCs or laptops.
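
A rough check like the following can save you a crash: measure a dataset's footprint before committing to an algorithm that needs all of it in RAM at once. This is a minimal sketch; the dataset name 'cats_and_dogs' is only an illustration, and the first branch assumes get_data() returned a pandas DataFrame for the dataset you chose.

    # Rough sketch: estimate the in-memory footprint of a dataset before
    # handing it to an algorithm that needs all of it in RAM at once.
    import sys
    import pandas as pd
    from nlpia.data.loaders import get_data

    data = get_data('cats_and_dogs')  # hypothetical dataset name
    if isinstance(data, pd.DataFrame):
        nbytes = data.memory_usage(deep=True).sum()
    else:
        nbytes = sys.getsizeof(data)  # crude lower bound for other containers
    print('{:.1f} MB'.format(nbytes / 1e6))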

Besides RAM, another bottleneck in your natural language processing pipelines is the processor. Even if you had unlimited RAM, larger corpora would take days to process with some of the more complex algorithms you’ve learned.

So you need algorithms that minimize the resources they require (a minimal sketch of the idea follows this list):

  • Volatile storage (RAM)
  • Processing (CPU cycles)
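
The rest of this chapter shows several ways to do that. To make the idea concrete up front, here is a minimal sketch of constant-RAM, batch-by-batch processing; the file name and the whitespace tokenizer are illustrative assumptions rather than part of any particular library:

    # Minimal sketch of constant-RAM processing: stream documents one at a
    # time rather than loading the whole corpus into memory.
    # 'huge_corpus.txt' (one document per line) is an illustrative file name.
    def stream_docs(path):
        """Yield one tokenized document at a time, keeping RAM usage flat."""
        with open(path, encoding='utf-8') as fin:
            for line in fin:                # only one line in memory at a time
                yield line.lower().split()  # crude whitespace tokenizer

    # Example: count tokens in a corpus far bigger than your RAM
    total_tokens = sum(len(doc) for doc in stream_docs('huge_corpus.txt'))

Because stream_docs() is a generator, each document can be discarded as soon as it has been processed, so memory use stays flat no matter how large the corpus grows.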

13.1 Too much of a good thing (data)

13.2 Optimizing NLP algorithms

13.2.1 Indexing

13.2.2 Advanced indexing

13.2.3 Advanced indexing with Annoy

13.2.4 Why use approximate indexes at all?

13.2.5 An indexing workaround: discretizing

13.3 Constant RAM algorithms

13.3.1 Gensim

13.3.2 Graph computing

13.4 Parallelizing your NLP computations