This chapter covers
- Gathering and preparing data from the internet, using generative AI to help
- Building a baseline and a first tentative model to optimize
- Inspecting the model to understand how it works
This chapter concludes our overview of classical machine learning for tabular data. To wrap things up, we’ll work through a complete example from the field of data journalism. Along the way, we’ll summarize the concepts and techniques we’ve used so far. We’ll also use a generative AI tool, ChatGPT, to help get the job done and to demonstrate a few use cases where a large language model (LLM) can improve your work with tabular data.
Finally, we’ll build a model to predict prices, this time using a regression-based approach. Examining how the model works and why it performs the way it does will give us further insight into the pricing dynamics of Airbnb listings and let us challenge our initial hypothesis about how short-term rentals are priced.