12 Constructing a graph using natural language processing techniques

 

This chapter covers

  • The information extraction pipeline
  • Coreference resolution
  • Named entity recognition and linking
  • Relation extraction
  • Developing an information extraction pipeline

The amount of text-based information available on the internet is astounding. It is hard to imagine the number of social media posts, blogs, and news articles published daily. However, despite this wealth of information, much of it remains unstructured, which makes it difficult to extract valuable insights from. This is where natural language processing (NLP) comes into play. NLP is a rapidly growing field that has attracted significantly more attention in recent years, especially since the introduction of transformer models (Vaswani et al., 2017) and, more recently, the GPT-3 (Brown et al., 2020) and GPT-4 (OpenAI, 2023) models. One particularly important area of NLP is information extraction, which focuses on extracting structured information from unstructured text.

12.1 Coreference resolution

 
 
 

12.2 Named entity recognition

 
 
 
 

12.2.1 Entity linking

 
 

12.3 Relation extraction

 
 
 

12.4 Implementation of the information extraction pipeline

 
 
 

12.4.1 SpaCy

 
 
 

12.4.2 Coreference resolution

 
 

12.4.3 End-to-end relation extraction

 

12.4.4 Entity linking

 
 
 

12.4.5 External data enrichment

 

12.5 Solutions to exercises

 
 
 

Summary

 
 
 