chapter thirteen

13 Where to go next

In this chapter

You get a brief overview of 10 algorithms that haven’t been covered in this book and why they’re useful.
You get pointers on what to read next, depending on your interests.

Linear regression

Suppose you need to sell your house. It is 3,000 ft². You look at the homes recently sold in your neighborhood.

Based on this information, how would you price your house? Here’s one way you could do it. Plot all the points.

Then eyeball a line through these points.

Now you can see where 3,000 ft² lands on that line, and that would be a pretty good starting price for your home.

This is how linear regression works. Given a bunch of points, it tries to fit a line to them, and then you can use that line to make predictions.

Linear regression has been used in statistics for a long time, and it is now widely used in machine learning because it is an easy first technique to try. It is useful when the value you are predicting is continuous, such as a price, rather than a category. If you are trying to predict something, linear regression might be a good place to start.
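Instead of eyeballing the line, you can compute it. Here is a minimal sketch in Python that fits a line using ordinary least squares, one common way of choosing the line. The function name fit_line and the sales figures are made up for illustration; they are not the numbers from the neighborhood example.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How much x varies on its own, and how much x and y vary together
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical recent sales: size in square feet, sale price in dollars
sizes = [1500, 2200, 2800, 3600]
prices = [310_000, 420_000, 560_000, 690_000]

slope, intercept = fit_line(sizes, prices)

# Read the fitted line at 3,000 square feet to get a starting price
estimate = slope * 3000 + intercept
print(f"Estimated price: ${estimate:,.0f}")
```

The fitted line is the one that makes the squared vertical distances between the points and the line as small as possible. In practice you would probably reach for a library instead of writing this by hand, but the idea is the same.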

Inverted indexes

Here’s a very simplified version of how a search engine works. Suppose you have three web pages with this simple content.

Let’s build a hash table from this content.

The keys of the hash table are the words, and the values tell you what pages each word appears on. Now suppose a user searches for hi. Let’s see what pages hi shows up on.
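Here is a rough sketch of that hash table in Python. The page names and their contents are hypothetical stand-ins for the three web pages, and build_inverted_index and search are names made up for this example.

```python
from collections import defaultdict


def build_inverted_index(pages):
    """Map each word to the set of pages it appears on."""
    index = defaultdict(set)
    for page_name, text in pages.items():
        for word in text.lower().split():
            index[word].add(page_name)
    return index


def search(index, word):
    """Return the pages containing word, or an empty set if there are none."""
    # .get avoids adding an empty entry to the defaultdict for unknown words
    return index.get(word.lower(), set())


# Hypothetical contents for three pages
pages = {
    "page_a": "hi there",
    "page_b": "hi adit",
    "page_c": "there we go",
}

index = build_inverted_index(pages)
print(sorted(search(index, "hi")))     # ['page_a', 'page_b']
print(sorted(search(index, "there")))  # ['page_a', 'page_c']
```

Building the index takes one pass over all the content, and after that each search is a single hash table lookup. A real search engine does far more than this, but the lookup structure is the same basic idea.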
