3 Drawing a line close to our points: Linear regression

 
This chapter covers

  • What is linear regression?
  • How to predict the price of a house based on known prices of other houses.
  • How to fit a line through a set of data points.
  • How to code the linear regression algorithm in Python.
  • How to use Turi Create to build a linear regression model to predict housing prices in a real dataset.
  • What is polynomial regression?
  • How to fit a more complex curve to our data when the data is nonlinear.
  • Examples of linear regression in the real world, such as medical applications and recommender systems.

In this chapter we learn about linear regression, a powerful and widely used method for estimating values such as the price of a house, the value of a certain stock, the life expectancy of an individual, or the amount of time a user will watch a video or spend on a website. If you have seen linear regression before, you may have seen it presented as a plethora of complicated formulas involving derivatives, systems of equations, and determinants. However, I like to see linear regression in a more graphical and less formulaic way. To understand linear regression in this chapter, all you need is the ability to visualize points and lines moving around.

The mental picture I have of linear regression is the following. Let's say we have some points that roughly look like they form a line, as in figure 3.1.
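To make this picture concrete, here is a minimal sketch that generates a handful of points scattered around a line. The function name `make_roughly_linear_points`, the slope, the intercept, and the noise level are all assumptions chosen for illustration. They are synthetic values, not data from this chapter.

```python
import random


def make_roughly_linear_points(n, slope, intercept, noise, seed=0):
    """Return n (x, y) pairs scattered around the line y = slope * x + intercept.

    Hypothetical helper for illustration only; the values are synthetic.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    points = []
    for x in range(1, n + 1):
        # Each y sits on the line, nudged up or down by at most `noise`.
        y = slope * x + intercept + rng.uniform(-noise, noise)
        points.append((x, y))
    return points


# Assumed values: six points near the line y = 2x + 1.
points = make_roughly_linear_points(n=6, slope=2.0, intercept=1.0, noise=0.5)
for x, y in points:
    print(f"x={x}, y={y:.2f}")
```

No point lies exactly on the line, yet together they suggest one. That is the kind of data linear regression is meant for.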

3.1      The problem: We need to predict the price of a house

3.2      The solution: Building a regression model for housing prices

3.2.1   The remember step: looking at the prices of existing houses

3.2.2   The formulate step: formulating a rule that estimates the price of the house

3.2.3   The predict step: what do we do when a new house comes on the market?

3.2.4   What if we have more variables? Multivariate linear regression

3.2.5   Some questions that arise and some quick answers

3.3      How to get the computer to draw this line: the linear regression algorithm

3.3.1   Crash course on slope and y-intercept

3.3.2   A simple trick to move a line closer to a set of points, one point at a time

3.3.3   The square trick: A much more clever way of moving our line closer to one of the points

3.3.4   The absolute trick: another useful trick to move the line closer to the points

3.3.5   The linear regression algorithm: Repeating the absolute or square trick many times to move the line closer to the points

3.3.6   Loading our data and plotting it
