6 Deep transfer learning for NLP with recurrent neural networks


This chapter covers

  • Three representative modeling architectures for transfer learning in NLP that rely on RNNs
  • Applying these methods to the two problems introduced in the previous chapter
  • Transferring knowledge obtained from training on simulated data to real labeled data
  • An introduction to some more sophisticated model adaptation strategies via ULMFiT

In the previous chapter, we introduced the two example problems for the experiments we will conduct in this chapter: column-type classification and fake news detection. Recall that the goal of these experiments is to study deep transfer learning methods for NLP that rely on recurrent neural networks (RNNs) for key functions. In particular, we will focus on three such methods, SIMOn, ELMo, and ULMFiT, which were introduced briefly in the previous chapter. In this chapter, we apply them to the example problems, starting with SIMOn in the next section.

6.1 Semantic Inference for the Modeling of Ontologies (SIMOn)

As we discussed briefly in the previous chapter, SIMOn was designed as a component of an automated machine learning (AutoML) pipeline for DARPA's Data-Driven Discovery of Models (D3M) program. It was developed as a tool for classifying the types of columns in tabular datasets, but it can also be viewed as a more general text classification framework. We will present the model in the context of arbitrary text input first and then specialize it to the tabular case, as sketched in the listing below.
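To give a rough feel for this framing before we dive into the details, the following listing shows how column-type classification can be cast as text classification: each cell of a column is encoded character by character, a per-cell encoder summarizes each cell into a vector, and a bidirectional LSTM aggregates the sequence of cell vectors into a column-type prediction. This is a minimal sketch, not the published SIMOn architecture; it assumes a Keras environment, substitutes an LSTM for SIMOn's convolutional character-level encoder, and all layer sizes, the number of sampled cells per column (max_cells), the characters kept per cell (max_chars), and the illustrative type labels are assumptions made for this example.

```python
# Minimal sketch (NOT the published SIMOn architecture) of framing
# column-type classification as text classification. All dimensions
# and hyperparameters below are illustrative assumptions.
import numpy as np
from tensorflow.keras import layers, models

max_cells = 20    # cells sampled per column (assumption)
max_chars = 40    # characters kept per cell (assumption)
vocab_size = 128  # ASCII character vocabulary (assumption)
num_types = 5     # e.g., integer, float, text, date, category (assumption)

# Per-cell encoder: embed characters, then summarize the cell with an
# LSTM (SIMOn itself uses a convolutional character-level encoder).
cell_input = layers.Input(shape=(max_chars,), dtype="int32")
x = layers.Embedding(vocab_size, 32)(cell_input)
x = layers.LSTM(64)(x)
cell_encoder = models.Model(cell_input, x)

# Column-level model: apply the cell encoder to every cell, then run a
# bidirectional LSTM over the resulting sequence of cell vectors.
column_input = layers.Input(shape=(max_cells, max_chars), dtype="int32")
cells = layers.TimeDistributed(cell_encoder)(column_input)
summary = layers.Bidirectional(layers.LSTM(64))(cells)
# Sigmoid outputs allow multilabel column types (e.g., both "integer"
# and "category"), matching the multilabel framing of the problem.
output = layers.Dense(num_types, activation="sigmoid")(summary)
model = models.Model(column_input, output)
model.compile(optimizer="adam", loss="binary_crossentropy")

# Toy forward pass on one randomly encoded "column" of cells.
dummy_column = np.random.randint(0, vocab_size, size=(1, max_cells, max_chars))
print(model.predict(dummy_column).shape)  # (1, num_types)
```

The two-level structure, a per-cell encoder wrapped in TimeDistributed followed by a column-level bidirectional LSTM, mirrors how the general text classifier is specialized to tabular columns in the subsections that follow.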

6.1.1 General neural architecture overview

6.1.2 Modeling tabular data

6.1.3 Application of SIMOn to tabular column-type classification data

6.2 Embeddings from Language Models (ELMo)

6.2.1 ELMo bidirectional language modeling

6.2.2 Application to fake news detection

6.3 Universal Language Model Fine-Tuning (ULMFiT)

6.3.1 Target task language model fine-tuning

6.3.2 Target task classifier fine-tuning

Summary