Wow. We did it. We have been through a lot together, from trying to distinguish between COVID-19 and the flu to trying to predict the stock market, with plenty in between. In each of our case studies, we saw ways of manipulating data for the explicit purposes of maximizing ML metrics, minimizing bias in data, and simplifying how we view data. This chapter aims to tie up everything we’ve talked about with a neat bow and give you the confidence and power to use feature engineering to enhance your ML pipelines.
We’ve spent a long time in the weeds, engineering features for all kinds of data and use cases. If we zoom back out and look at the feature engineering pipeline from our first chapter, we can see our overall goal: transforming data into features that provide a signal to ML pipelines.
In this book, we have mainly looked at feature engineering as a way to enhance predictive ML pipelines, but that is not the only use of feature engineering. We can also rely on these techniques to do the following:
- Clean data for business intelligence dashboards and analytics.
- Perform unsupervised ML, like topic modeling and clustering.