8 Multitask learning

 

This chapter covers

  • Deep multitask learning for NLP: learning several NLP tasks jointly, with the aim of improving performance on each individual task
  • Implementing hard parameter sharing for multitask learning
  • Implementing soft parameter sharing for multitask learning
  • Combining hard and soft parameter sharing into mixed parameter sharing

You will apply different approaches to multitask learning to practical NLP problems. In particular, we will work with three datasets:

  • Two sentiment datasets, consisting of consumer product reviews and restaurant reviews
  • The Reuters topic dataset
  • A part-of-speech and named entity tagging dataset

8.1 Introduction

Figure 8.1. An introduction to multitask learning: improving classifier performance by learning several tasks in one go.

Multitask learning is concerned with learning several tasks at the same time. An example would be learning both part-of-speech tagging and sentiment analysis simultaneously, or learning two topic taggers in one go. Why would that be a good idea? Ample research has demonstrated that learning related tasks jointly can improve performance on each task compared with learning them in isolation. This observation motivates the applications we develop in this chapter.
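To make the idea concrete before we dive into the implementations: in the simplest setup (hard parameter sharing, covered in the sections below), two tasks share the lower layers of a network, and only the output heads are task-specific. The following NumPy sketch shows a single forward pass through such a model; the layer sizes, task names, and two-task setup are illustrative assumptions, not the chapter's actual code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Shared encoder weights: one hidden layer used by BOTH tasks
# (this is what "hard parameter sharing" means).
W_shared = rng.normal(size=(300, 64))     # e.g. 300-dim input vectors -> 64 hidden units

# Task-specific heads (illustrative): sentiment (2 classes), topic (5 classes).
W_sentiment = rng.normal(size=(64, 2))
W_topic = rng.normal(size=(64, 5))

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def forward(x):
    # The shared representation h is computed once...
    h = np.tanh(x @ W_shared)
    # ...and each task head makes its own prediction from it.
    return softmax(h @ W_sentiment), softmax(h @ W_topic)

x = rng.normal(size=(4, 300))             # a mini-batch of 4 inputs
p_sentiment, p_topic = forward(x)
print(p_sentiment.shape, p_topic.shape)   # (4, 2) (4, 5)
```

During training, gradients from both task losses flow back into `W_shared`, so each task acts as a regularizer for the other; that shared gradient signal is the source of the performance gains discussed above.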

8.2 Data

8.3 Consumer reviews: Yelp and Amazon

8.3.1 Data handling

8.3.2 Hard parameter sharing

8.3.3 Soft parameter sharing

8.3.4 Mixed parameter sharing

8.4 Reuters topic classification

8.4.1 Data handling

8.4.2 Hard parameter sharing

8.4.3 Soft parameter sharing

8.4.4 Mixed parameter sharing

8.5 Part-of-speech and named entity recognition data

8.5.1 Data handling

8.5.2 Hard parameter sharing

8.5.3 Soft parameter sharing

8.5.4 Mixed parameter sharing

8.6 Summary

8.7 Further reading