In the previous chapter, we looked in some detail at a few important shallow neural network architectures for transfer learning in NLP, including word2vec and sent2vec. Recall that the vectors produced by these methods are static and noncontextual: they produce the same vector for a given word or sentence regardless of the surrounding context. This means these methods are unable to disambiguate, or distinguish between, different possible meanings of a word or sentence.
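To make this concrete, the following minimal Python sketch illustrates the static lookup behavior. It assumes the gensim library and its downloadable "glove-wiki-gigaword-50" pretrained vectors, which stand in here for any word2vec-style static embedding; the specific model name and library are assumptions for illustration, not part of the methods discussed above.

import gensim.downloader as api

# Load a small set of pretrained static word vectors
# (model name assumed to be available in gensim's download catalog).
wv = api.load("glove-wiki-gigaword-50")

# A static embedding is a pure table lookup: the same token always maps to
# the same vector, no matter which sentence it occurs in.
river_sentence = "he sat on the bank of the river".split()
money_sentence = "she deposited cash at the bank".split()

vec_in_river_context = wv["bank"]
vec_in_money_context = wv["bank"]
print((vec_in_river_context == vec_in_money_context).all())  # True: context is ignored

Because the lookup depends only on the token itself, the two occurrences of "bank" receive identical vectors, even though one refers to a riverbank and the other to a financial institution.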
In this and the next chapter, we will cover some representative deep transfer learning modeling architectures for NLP that rely on recurrent neural networks (RNNs) for key functions. Specifically, we will look at the modeling frameworks SIMOn,1 ELMo,2 and ULMFiT.3 The deeper neural networks employed by these methods allow the resulting embeddings to be contextual, that is, to produce word representations that are functions of the surrounding context and therefore permit disambiguation. Recall that we first encountered ELMo in chapter 3. In the next chapter, we will take a closer look at its architecture.
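By contrast, a contextual model such as ELMo computes a different vector for the same word depending on the sentence it appears in. The sketch below shows one way to observe this; it assumes the ElmoEmbedder class from the allennlp library (the 0.9.x API), which downloads the default pretrained ELMo weights on first use. The API and version are assumptions for illustration only.

import numpy as np
from allennlp.commands.elmo import ElmoEmbedder

# Downloads the default pretrained ELMo options and weights on first use.
elmo = ElmoEmbedder()

river_sentence = "he sat on the bank of the river".split()
money_sentence = "she deposited cash at the bank".split()

# embed_sentence returns an array of shape (3 layers, number of tokens, 1024 dimensions);
# index [2] selects the top LSTM layer.
bank_river = elmo.embed_sentence(river_sentence)[2][river_sentence.index("bank")]
bank_money = elmo.embed_sentence(money_sentence)[2][money_sentence.index("bank")]

# Cosine similarity is noticeably below 1.0: the two "bank" vectors differ with context.
cosine = np.dot(bank_river, bank_money) / (np.linalg.norm(bank_river) * np.linalg.norm(bank_money))
print(cosine)

Here the embedding of "bank" is a function of the whole sentence, so the two occurrences receive different vectors, which is exactly the disambiguation ability that static embeddings lack.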