chapter five

5 Knowledge graphs (KGs) and natural language processing (NLP)

 

This chapter covers

  • Why NLP is indispensable for building comprehensive KGs
  • Building your first KG from news articles
  • Enriching KGs by incorporating external knowledge bases
  • Expanding knowledge with unsupervised graph-based ML

So far, we have discussed knowledge graphs (KGs) built from structured data such as tables and knowledge bases, but what about unstructured data? Think of documentation, emails, chats, laws, research papers, guidelines, news articles, social media posts, and so on. The world is overflowing with information and knowledge locked in unstructured form. Tapping these sources can yield observations, facts, and insights that matter to your business.

According to Gartner [1], 80-90% of enterprise data today is unstructured. Moreover, most enterprises have limited visibility into what these documents contain. Those who learn how to process, manage, and analyze unstructured data will gain a competitive advantage over those who don’t. Transforming unstructured data into knowledge is a complex, multistep process. Figure 5.1 depicts this process as a mental model.

5.1 What is natural language processing (NLP)?

5.1.1 Basics of natural language processing

5.1.2 Named entity recognition (NER)

5.1.3 Use NLP for building a first KG

5.2 Knowledge enrichment

5.3 NLP-based machine learning

5.3.1 Keyword extraction

5.3.2 Graph-based topic modeling

5.4 Summary

5.5 References