4 Information Extraction
This chapter covers:
- How to extract information from raw text
- A number of useful NLP concepts, including part-of-speech tagging, lemmatization, and dependency parsing
- How to build a language processing pipeline with spaCy, an industrial-strength Natural Language Processing library
In the previous chapter, you looked at ways of finding texts that discuss particular concepts or facts. You built an information retrieval system that can search for texts answering particular questions. For example, if you wanted to know what information science is or what methods information retrieval systems use, you gave the system queries such as “What is information science?” or “What methods do information retrieval systems use?”, and it returned the relevant texts that cover these topics.
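As a quick reminder of what query-based retrieval looks like in practice, here is a minimal sketch. It is not the system from the previous chapter: it assumes a simple word-overlap score as a stand-in for proper term weighting, and the function names (`tokenize`, `overlap_score`, `search`) and the three toy documents are hypothetical, invented purely for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    # Lowercase the text and keep alphabetic tokens only (a deliberate simplification).
    return re.findall(r"[a-z]+", text.lower())


def overlap_score(query, document):
    # Count the words shared by the query and the document.
    # Assumption: plain overlap stands in for a real weighting scheme.
    query_counts = Counter(tokenize(query))
    document_counts = Counter(tokenize(document))
    return sum((query_counts & document_counts).values())


def search(query, documents, top_n=1):
    # Rank the documents by their overlap with the query, best match first.
    ranked = sorted(documents,
                    key=lambda doc: overlap_score(query, doc),
                    reverse=True)
    return ranked[:top_n]


# Hypothetical toy collection, not taken from the chapter's data.
documents = [
    "Information science studies how information is collected, stored, and retrieved.",
    "Information retrieval systems use methods such as term weighting and ranking.",
    "Baking bread requires flour, water, yeast, and patience.",
]

best = search("What is information science?", documents)
print(best[0])
```

Note that the result is a whole text that is likely to contain the answer, not the answer itself: the system tells you where to look, and you still have to read the text to find the fact you need.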