chapter eleven

11 Taming text with graphs

This chapter covers

A simple approach to decomposing text and storing it in a graph
How to extract the hidden structure of unstructured data via natural language processing
An advanced graph model for taming text

Text, text, and more text! We are surrounded by textual data. Much of the world’s knowledge is stored and shared as natural language text. This has been true since the beginning of human history, when we started sharing knowledge in different languages: first only by voice and later, to make it permanent, in writing.

Natural language is the way in which we interact with other humans. We begin learning it as infants, yet understanding language is one of the most complex tasks a machine can perform. Nevertheless, computer scientists, data scientists, and machine learning practitioners have worked hard to make machines capable of processing textual data, building sophisticated solutions that use text to offer advanced features to end users. Let’s consider the most common solutions that you likely use every day, probably without noticing how complex and useful they are.

11.1 A basic approach: Store and access a sequence of words

11.1.1 Advantages of the graph approach

11.2 NLP and graphs

11.2.1 Advantages of the graph approach

11.3 Summary

11.4 References