7 Named entity disambiguation

 

This chapter covers

  • The key ideas of named entity disambiguation combined with knowledge graph technologies
  • Building a knowledge graph from multiple sources
  • Performing advanced analysis

Natural Language Processing (NLP) techniques play a critical role in the automatic construction of knowledge graphs (KGs) from unstructured data. A key task in this process is Named Entity Recognition (NER), which identifies mentions of relevant named entities in raw text and assigns them to predefined categories such as people, organizations, locations, or diseases. NER is an important component in building KGs, but on its own it is necessary rather than sufficient for a precise understanding of text within a specific application domain.
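To make the NER step concrete, here is a minimal sketch in Python. It assumes a tiny hand-written dictionary (a gazetteer) instead of a trained model; the `GAZETTEER` entries, the `Mention` type, the `recognize_entities` function, and the sample sentence are all hypothetical and serve only as illustration.

```python
import re
from typing import List, NamedTuple


class Mention(NamedTuple):
    """A span of text recognized as a named entity."""
    text: str
    start: int
    end: int
    label: str


# Hypothetical toy gazetteer: surface form -> predefined category.
# A real NER component would typically rely on a trained model instead.
GAZETTEER = {
    "World Health Organization": "ORGANIZATION",
    "Geneva": "LOCATION",
    "malaria": "DISEASE",
}


def recognize_entities(text: str) -> List[Mention]:
    """Return every gazetteer match in `text`, ordered by position."""
    mentions = []
    for surface, label in GAZETTEER.items():
        # Match whole words only, ignoring case.
        pattern = r"\b" + re.escape(surface) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            mentions.append(
                Mention(match.group(0), match.start(), match.end(), label)
            )
    return sorted(mentions, key=lambda m: m.start)


# Made-up sample sentence, used only for illustration.
sample = "The World Health Organization in Geneva published new guidance on malaria."
mentions = recognize_entities(sample)
for mention in mentions:
    print(f"{mention.text!r} -> {mention.label}")
```

The output of this sketch is a list of labeled spans and nothing more: each mention receives a category, but nothing yet ties it to a specific entry in a knowledge source. That gap is why recognition alone does not give a precise understanding of the text.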

Imagine developing an Intelligent Advisory System (IAS) to support the diverse activities of stakeholders in the healthcare field. One of the critical attributes of such an IAS is interactivity: the practical ability to exchange information effectively with humans over multiple interactions.

Essential features for enabling this exchange include:

  1. the capacity to detect meaningful entities in natural language;
  2. the ability to retrieve information on these entities from different knowledge sources;

7.1 From recognition to disambiguation

7.2 Domain-based NED and LLMs

7.3 Business and domain understanding

7.3.1 Context

7.3.2 Use cases definition

7.4 Data understanding

7.4.1 Unstructured data

7.4.2 Domain ontologies

7.5 SoHO knowledge graph building

7.5.1 Schema definition

7.5.2 Documents processing and ingestion

7.5.3 Medical entities disambiguation and ingestion

7.5.4 Ontologies processing, loading, and mapping

7.5.5 Entities co-occurrence generation

7.6 Knowledge graph-based use cases

7.6.1 Conceptual search

7.6.2 Structured knowledge-based search

7.6.3 KG-based interpretability and discovery

7.6.4 New knowledge uncovering

7.7 Summary

7.8 References