
4 Deep Transfer Learning for NLP with Recurrent Neural Networks

This chapter covers:

  • Preprocessing and modeling tabular text data
  • An exposition of three representative modeling architectures for transfer learning in natural language processing (NLP) that rely on recurrent neural networks (RNNs) for key functions: SIMOn, ELMo, and ULMFiT
  • Analyzing a pair of new representative NLP example problems
  • Transferring knowledge obtained from training on simulated data to a smaller set of real labeled data
  • An introduction to more sophisticated adaptation strategies for modifying a pretrained model to fit your problem during fine-tuning, via the ULMFiT method

In the previous chapter, we looked in some detail at several important shallow neural network architectures for transfer learning in NLP, including word2vec and sent2vec. Recall that the vectors produced by these methods are static and non-contextual: they produce the same vector for a given word or sentence regardless of the surrounding context. This means these methods cannot disambiguate, that is, distinguish between the different possible meanings of a word or sentence.
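
To make this limitation concrete, the following minimal sketch, assuming gensim 4.x is installed, trains a tiny word2vec model and looks up the embedding for the word "bank." The toy sentences and variable names are purely illustrative and are not drawn from this chapter's datasets. Because word2vec stores exactly one vector per vocabulary entry, the lookup returns the same vector whether "bank" appeared in a river context or a financial one.

# A minimal sketch, assuming gensim 4.x is installed; the sentences below
# are toy data for illustration, not part of this chapter's datasets.
from gensim.models import Word2Vec
import numpy as np

sentences = [
    ["she", "sat", "on", "the", "river", "bank"],       # "bank" = riverbank
    ["he", "deposited", "cash", "at", "the", "bank"],   # "bank" = financial institution
]

# Train a tiny word2vec model on the two sentences.
model = Word2Vec(sentences, vector_size=50, window=3, min_count=1, seed=42)

# word2vec keeps a single lookup-table row per word type, so the embedding
# of "bank" is identical no matter which sentence it came from.
vec = model.wv["bank"]
print(vec.shape)                            # (50,)
print(np.allclose(vec, model.wv["bank"]))   # True: one static vector per word

A contextual model such as ELMo, covered in section 4.4, would instead produce two different vectors for "bank" in these two sentences, because its representation of each word is a function of the entire input sequence.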

4.1   Preprocessing Tabular Column Type Classification Data

4.1.1   Obtaining and Visualizing Tabular Data

4.1.2   Preprocessing Tabular Data

4.1.3   Encoding Preprocessed Data as Numbers

4.2   Preprocessing Fact-Checking Example Data

4.2.1   Special Problem Considerations

4.2.2   Loading and Visualizing Fact-Checking Data

4.3   Semantic Inference for the Modeling of Ontologies (SIMOn)

4.3.1   General Neural Architecture Overview

4.3.2   Modeling Tabular Data

4.3.3   Application of SIMOn to Tabular Column Type Classification Data

4.4   Embeddings from Language Models (ELMo)

4.4.1   ELMo Bidirectional Language Modeling

4.4.2   Application to Fake News Detection

4.5   Universal Language Model Fine-Tuning (ULMFiT)

4.5.1   Target Task Language Model Fine-Tuning

4.5.2   Target Task Classifier Fine-Tuning

4.6   Summary