2 The basics of feature engineering

 

This chapter covers

  • Understanding the differences between structured and unstructured data
  • Discovering the four levels of data and how they describe the data’s properties
  • Looking at the five types of feature engineering and when we want to apply each one
  • Differentiating between the ways to evaluate feature engineering pipelines

This chapter introduces the basic concepts of feature engineering. We will explore the types of data we will encounter and the types of feature engineering techniques we will see throughout this book. Rather than jumping right into case studies, we first set up the necessary underpinnings of feature engineering and data understanding. Before we can import a package in Python, we need to know what we are looking for and what the data want to convey to us.

Getting started with data is often difficult. Data can be messy, unorganized, large, or in an odd format. As we work through the terms, definitions, and examples in this chapter, we will set ourselves up to hit the ground running with our first case study.

2.1 Types of data

 
 
 

2.1.1 Structured data

 
 

2.1.2 Unstructured data

 
 
 
 

2.2 The four levels of data

 
 
 
 

2.2.1 Qualitative data vs. quantitative data

 
 
 

2.2.2 The nominal level

 
 

2.2.3 The ordinal level

 
 

2.2.4 The interval level

 
 

2.2.5 The ratio level

 
 
 

2.3 The types of feature engineering

 
 
 

2.3.1 Feature improvement

 
 
 

2.3.2 Feature construction

 
 
 

Summary

 
 
 