4 Deep Transfer Learning for NLP with Recurrent Neural Networks
This chapter covers:
- Preprocessing and modeling tabular text data
- An exposition of three representative modeling architectures for transfer learning in natural language processing (NLP) that rely on recurrent neural networks (RNNs) for key functions: SIMOn, ELMo, and ULMFiT
- Analyzing a new pair of representative example NLP problems
- Transferring knowledge obtained from training on simulated data to a smaller set of real labeled data
- An introduction to more sophisticated model adaptation strategies for modifying a pretrained model to suit your problem during fine-tuning, via the ULMFiT method
In the previous chapter, we looked in some detail at several important shallow neural network architectures for transfer learning in NLP, including word2vec and sent2vec. Recall that the vectors produced by these methods are static and noncontextual: they produce the same vector for a given word or sentence regardless of the surrounding context. This means these methods are unable to disambiguate, that is, to distinguish between the different possible meanings of a word or sentence.
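To make this limitation concrete, the short sketch below (an illustrative example of our own, not code from the chapter's companion repository) trains a tiny gensim Word2Vec model on a toy two-sentence corpus and looks up the vector for the word "bank." Because a static embedding is just a per-token lookup table, the "river bank" and "financial bank" occurrences receive exactly the same vector; the corpus and hyperparameters here are assumptions chosen only for illustration.

```python
# A minimal sketch illustrating why static embeddings such as word2vec
# cannot disambiguate word senses. The toy corpus and hyperparameters
# are illustrative assumptions, not the chapter's actual experiments.
from gensim.models import Word2Vec
import numpy as np

# "bank" appears in two different senses: a river bank and a financial bank.
corpus = [
    ["the", "boat", "drifted", "to", "the", "river", "bank"],
    ["she", "deposited", "cash", "at", "the", "bank"],
]

model = Word2Vec(sentences=corpus, vector_size=50, window=3, min_count=1, seed=1)

# A static embedding lookup depends only on the token, not on its sentence,
# so both occurrences of "bank" map to exactly the same vector.
vec_river_sense = model.wv["bank"]
vec_money_sense = model.wv["bank"]
print(np.allclose(vec_river_sense, vec_money_sense))  # True: no disambiguation
```

Contextual models such as ELMo, covered later in this chapter, address this by computing a different representation for each occurrence of a word based on the sentence it appears in.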