3 Create your first knowledge graph from ontologies

This chapter covers

  • Selecting the best KG technology based on use cases
  • Constructing a KG to support clinicians’ activities
  • Performing analysis and ontology-based reasoning on top of a KG

Knowledge graph (KG) construction is complex because it requires extracting and integrating information from data sources that differ in format (XML, CSV, JSON), storage technology (relational or document-oriented), value syntax (e.g., 2022-08-09 or 9 August 2022), and, above all, the meaning of the data. In healthcare, for instance, three obstacles to data integration are common: different expressions that identify the same concept (type 2 diabetes versus ketosis-resistant diabetes), identical acronyms that denote distinct concepts (PE as physical examination or pulmonary embolism), and differences in granularity (necrosis versus lobular necrosis).
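The syntactic part of this problem is the easiest to illustrate. The following is a minimal sketch, not a production parser: it assumes that only the two date layouts mentioned above occur, and the names `KNOWN_FORMATS` and `normalize_date` are hypothetical helpers introduced for this illustration.

```python
from datetime import date, datetime

# Assumption: only these two layouts occur in the incoming data.
# Note that %B matches month names in the current locale (English by default).
KNOWN_FORMATS = ("%Y-%m-%d", "%d %B %Y")


def normalize_date(raw: str) -> date:
    """Parse a date string written in any of the known layouts."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date()
        except ValueError:
            continue  # try the next layout
    raise ValueError(f"Unrecognized date format: {raw!r}")


# Both spellings resolve to the same calendar date.
same_day = normalize_date("2022-08-09") == normalize_date("9 August 2022")
```

Differences in syntax can be handled with rules like these; differences in meaning cannot, which is why the rest of the chapter relies on ontologies.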

When constructing a KG, we aim for a unified, well-grounded, and meaningful representation of data from various sources, where individual pieces of information are integrated into a coherent view. Issues related to the meaning of data can be addressed using semantic integration. A common strategy is to adopt one or more ontologies as a reference schema and vocabulary for incoming data. An ontology lets you model data using a standard vocabulary that includes elements such as formal names, properties, categories, and relationships between entities described within the data.
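To make the idea of an ontology as a reference vocabulary concrete, here is a toy sketch of how the three healthcare obstacles mentioned earlier might be handled. It is an illustration under stated assumptions, not the approach developed later in the chapter: the `EX:` identifiers are invented, the `Concept` class and the `resolve` and `ancestors` helpers are hypothetical, and a real ontology would supply far richer structure.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Concept:
    concept_id: str                     # hypothetical identifier, not a real code
    label: str                          # preferred (formal) name
    synonyms: Tuple[str, ...] = ()      # alternative expressions
    parent_id: Optional[str] = None     # broader concept, if any


# Toy vocabulary; the identifiers are invented for this illustration.
CONCEPTS = [
    Concept("EX:0001", "diabetes mellitus"),
    Concept("EX:0002", "type 2 diabetes", ("ketosis-resistant diabetes",), "EX:0001"),
    Concept("EX:0003", "necrosis"),
    Concept("EX:0004", "lobular necrosis", (), "EX:0003"),
    Concept("EX:0005", "physical examination", ("PE",)),
    Concept("EX:0006", "pulmonary embolism", ("PE",)),
]

BY_ID: Dict[str, Concept] = {c.concept_id: c for c in CONCEPTS}

# Map every label and synonym (lowercased) to the concepts it may denote.
INDEX: Dict[str, Set[str]] = {}
for concept in CONCEPTS:
    for term in (concept.label, *concept.synonyms):
        INDEX.setdefault(term.lower(), set()).add(concept.concept_id)


def resolve(term: str) -> List[str]:
    """Return the sorted IDs of all concepts a term may refer to."""
    return sorted(INDEX.get(term.strip().lower(), set()))


def ancestors(concept_id: str) -> List[str]:
    """Walk parent links upward to collect all broader concepts."""
    result = []
    parent = BY_ID[concept_id].parent_id
    while parent is not None:
        result.append(parent)
        parent = BY_ID[parent].parent_id
    return result
```

In this sketch, `resolve("ketosis-resistant diabetes")` and `resolve("type 2 diabetes")` return the same single identifier, `resolve("PE")` returns two candidates that must be disambiguated from context, and `ancestors("EX:0004")` links lobular necrosis to the broader concept of necrosis.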

3.1 Knowledge graph building: Warmup

3.1.1 Business and domain understanding

3.1.2 Data understanding

3.2 Understanding knowledge graph technologies

3.2.1 RDF or LPG? A goal-driven discussion

3.2.2 Representing edge properties with RDF and LPG

3.3 Building a knowledge graph

3.3.1 Ontology ingestion and processing with neosemantics

3.3.2 Annotation ingestion and processing

3.4 Querying the data

3.5 Reasoning over the KG

Summary