11 Information extraction (named entity extraction and question answering)

 

This chapter covers

  • Sentence segmentation
  • Named entity recognition (NER)
  • Numerical information extraction
  • Part-of-speech (POS) tagging and dependency tree parsing
  • Logical relation extraction and knowledge bases

One last skill you need before you can build a full-featured chatbot is extracting information or knowledge from natural language text.

11.1 Named entities and relations

You’d like your machine to extract pieces of information and facts from text so it can know a little bit about what a user is saying. For example, imagine a user says “Remind me to read aiindex.org on Monday.” You’d like that statement to trigger a calendar entry or alarm for the next Monday after the current date.
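As a rough illustration of that reminder example, the sketch below pulls a task and a weekday name out of the utterance with a regular expression and computes the date of the next matching weekday. The names `REMINDER_PATTERN`, `next_weekday`, and `extract_reminder` are hypothetical helpers invented for this sketch, and the pattern handles only the "Remind me to ... on <weekday>" phrasing. Treat it as a minimal sketch, not a general-purpose extractor.

```python
import re
from datetime import date, timedelta

# Weekday names in the same order as date.weekday() (Monday == 0).
WEEKDAYS = ['monday', 'tuesday', 'wednesday', 'thursday',
            'friday', 'saturday', 'sunday']

# Hypothetical pattern: covers only "Remind me to <task> on <weekday>".
REMINDER_PATTERN = re.compile(
    r'remind me to (?P<task>.+?) on (?P<day>' + '|'.join(WEEKDAYS) + r')\b',
    re.IGNORECASE)


def next_weekday(today, weekday_name):
    """Return the first date strictly after `today` that falls on `weekday_name`."""
    target = WEEKDAYS.index(weekday_name.lower())
    # Yields 1..7, so asking for Monday on a Monday returns next week's Monday.
    days_ahead = (target - today.weekday() - 1) % 7 + 1
    return today + timedelta(days=days_ahead)


def extract_reminder(text, today=None):
    """Extract a task and a due date from a reminder request, or return None."""
    today = today or date.today()
    match = REMINDER_PATTERN.search(text)
    if match is None:
        return None
    return {
        'task': match.group('task'),
        'date': next_weekday(today, match.group('day')),
    }


if __name__ == '__main__':
    # 2024-01-03 is a Wednesday, so the next Monday is 2024-01-08.
    print(extract_reminder('Remind me to read aiindex.org on Monday.',
                           today=date(2024, 1, 3)))
```

Passing `today` explicitly keeps the function deterministic, which makes it easier to test. A real chatbot would also need to handle phrasings this pattern does not cover.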

11.1.1 A knowledge base

11.1.2 Information extraction

11.2 Regular patterns

11.2.1 Regular expressions

11.2.2 Information extraction as ML feature extraction

11.3 Information worth extracting

11.3.1 Extracting GPS locations

11.3.2 Extracting dates

11.4 Extracting relationships (relations)

11.4.1 Part-of-speech (POS) tagging

11.4.2 Entity name normalization

11.4.3 Relation normalization and extraction

11.4.4 Word patterns

11.4.5 Segmentation

11.4.6 Why won’t split('.!?') work?

11.4.7 Sentence segmentation with regular expressions