This is an excerpt from Manning's book Transfer Learning for Natural Language Processing MEAP V04.

In all of these language-model-based methods – ELMo, ULMFiT, the OpenAI Transformer, and BERT – it was shown that the embeddings they generate could be fine-tuned for specific downstream NLP tasks with relatively few labeled data points. The focus on language models was deliberate: it was hypothesized that the hypothesis set induced by them would be generally useful, and the massive amounts of data required to train them were known to be readily available.

Figure 2.1. The different types of supervised models to be explored in the content classification examples in this chapter. The abbreviation ELMo stands for “Embeddings from Language Models” while BERT stands for “Bidirectional Encoder Representations from Transformers”.

BERT, which stands for “Bidirectional Encoder Representations from Transformers”, is a transformer-based model that we already encountered briefly in Chapter 2. It was trained with the masked language modeling objective, i.e., to “fill in the blanks”, and additionally with the next sentence prediction task, i.e., to determine whether a given sentence plausibly follows a target sentence. While not suited to text generation, this model performs very well on other general language tasks such as classification and question answering. Since we have already explored classification at some length, we will use the question answering task to explore this model architecture in more detail than we did in Chapter 2.
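To make these two ideas concrete, the following is a minimal sketch of querying a pretrained BERT model for masked-token prediction and for extractive question answering. It assumes the Hugging Face transformers library is installed; the pipeline usage and model names are illustrative assumptions, not the book's own listings, which may use different tooling.

# Minimal sketch (assumes the Hugging Face transformers library; model names are illustrative).
from transformers import pipeline

# Masked language modeling: BERT "fills in the blank" marked by [MASK].
fill_mask = pipeline("fill-mask", model="bert-base-uncased")
for prediction in fill_mask("The goal of transfer learning is to [MASK] knowledge."):
    # Each prediction contains the candidate token and its probability score.
    print(prediction["token_str"], round(prediction["score"], 3))

# Extractive question answering with a BERT model fine-tuned on SQuAD.
qa = pipeline(
    "question-answering",
    model="bert-large-uncased-whole-word-masking-finetuned-squad",
)
result = qa(
    question="What objectives was BERT trained with?",
    context=(
        "BERT was trained with the masked language modeling objective and "
        "the next sentence prediction task."
    ),
)
# The answer is a span copied out of the supplied context, with a confidence score.
print(result["answer"], round(result["score"], 3))

Note that the question-answering pipeline returns a span extracted from the supplied context rather than newly generated text, which reflects the point above that BERT is not suited to text generation.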
