5 Analyzing structured data
This chapter covers
- Translating questions to queries
- Building natural language interfaces
- Analyzing data tables
- Analyzing graph data
A significant percentage of the world’s information is stored as structured data, that is, data stored in a standardized format. Two popular types of structured data are data tables (think of the data you would find in an Excel spreadsheet) and graphs, which describe entities and their relationships (such as a data set describing a social network).
Tools for processing structured data have been available for many decades. After all, structured data has a standardized format optimized to make it easy for computers to process. So why do we need large language models for that? The problem with existing tools for processing structured data is their interface. Typically, each tool (or, at the very least, each category of tools for specific types of structured data) supports its own formal query language.
Using such a query language, users can perform a wide range of analysis operations on structured data. But learning query languages takes time! Wouldn’t it be nice if all those systems could be queried using a single language, ideally a natural language such as plain English?
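To make the gap between a question and a formal query concrete, here is a minimal sketch that pairs a plain-English question with a hand-written SQL query over a small data table. The table name `games`, its columns, and the rows are all made up for illustration, and SQL is just one example of a formal query language. The sketch uses Python's built-in `sqlite3` module.

```python
import sqlite3

# Hypothetical example data: the table name, columns, and rows are
# invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE games (title TEXT, genre TEXT, sales REAL)")
conn.executemany(
    "INSERT INTO games VALUES (?, ?, ?)",
    [
        ("A", "Puzzle", 1.5),
        ("B", "Racing", 4.0),
        ("C", "Puzzle", 3.0),
        ("D", "Sports", 2.5),
    ],
)

# What a user would like to ask ...
question = "Which genre has the highest total sales?"

# ... and what the database system expects instead: a formal query.
query = """
    SELECT genre, SUM(sales) AS total_sales
    FROM games
    GROUP BY genre
    ORDER BY total_sales DESC
    LIMIT 1
"""

top_genre, total = conn.execute(query).fetchone()
print(f"{question} -> {top_genre} ({total})")
```

Here the translation from question to query is done by hand. Writing that query requires knowing SQL and the table layout, which is exactly the hurdle described above.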