This chapter covers
- An introduction to Scikit-learn
- Exploring and processing features of the Airbnb NYC dataset
- Some classic machine learning techniques
Depending on the problem, classic machine learning algorithms are often the most practical approach to working with tabular data. With decades of research and practice behind these tools and algorithms, there is a rich palette of solutions to choose from.
In this chapter, we’ll cover essential algorithms in classical machine learning for making predictions using tabular data. We have focused on the linear models because they are still the most common solutions for both a challenging baseline and a solid and robust model in production. In addition, discussing linear models helps us build concepts and ideas that we can find in deep learning architectures and in more advanced machine learning algorithms, such as gradient-boosting decision trees (which will be one of the topics of the next chapter).
We’ll also give you a quick introduction to Scikit-learn, a powerful and versatile machine learning library that we’ll use to continue exploring the Airbnb NYC dataset. We’ll stay away from lengthy mathematical definitions and textbook details in favor of examples and practical recommendations for applying these models to tabular data problems.