Chapter 14. Simplifying data with the singular value decomposition

 

This chapter covers

  • The singular value decomposition matrix factorization
  • Recommendation engines
  • Using the singular value decomposition to improve recommendation engines

Restaurants get rolled into a handful of categories: American, Chinese, Japanese, steak house, vegan, and so on. Have you ever thought that these categories weren’t enough? Perhaps you like a hybrid of these categories or a subcategory like Chinese vegetarian. How can we find out how many categories there are? Maybe we could ask some human experts? What if one expert tells us we should divide the restaurants by sauces, and another expert tells us we should divide restaurants by the ingredients? Instead of asking an expert, let’s ask the data. We can take data that records people’s opinions of restaurants and distill it down into underlying factors.

These factors may line up with our restaurant categories, with a specific ingredient used in cooking, or with something else entirely. We can then use these factors to estimate what people will think of a restaurant they haven’t yet visited.

The method for distilling this information is known as the singular value decomposition (SVD). It’s a powerful tool used in applications ranging from bioinformatics to finance.
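As a rough preview of the idea, the sketch below runs NumPy's `np.linalg.svd` on a small ratings matrix and keeps only the strongest factors. The ratings are made up for illustration, and the 90% energy cutoff is an assumed rule of thumb, not something this overview prescribes.

```python
import numpy as np

# Hypothetical ratings (made-up data): rows are people, columns are restaurants.
# A 0 means the person hasn't rated that restaurant.
ratings = np.array([
    [5, 5, 0, 0],
    [4, 5, 0, 0],
    [5, 4, 0, 0],
    [0, 0, 4, 5],
    [0, 0, 5, 4],
], dtype=float)

# Factor the matrix: ratings = U @ diag(sigma) @ Vt.
# sigma holds the singular values, sorted from largest to smallest.
U, sigma, Vt = np.linalg.svd(ratings, full_matrices=False)

# Keep the smallest number of factors whose squared singular values
# account for at least 90% of the total (an assumed threshold).
energy = sigma ** 2
share = np.cumsum(energy) / energy.sum()
k = int(np.searchsorted(share, 0.90) + 1)

# Rebuild an approximation of the ratings from only the top k factors.
approx = U[:, :k] @ np.diag(sigma[:k]) @ Vt[:k, :]

print("singular values:", np.round(sigma, 2))
print("factors kept:", k)
print("largest reconstruction error:", np.round(np.abs(ratings - approx).max(), 2))
```

With this toy data, the two factors kept correspond to the two groups of people who rate disjoint sets of restaurants, which is the kind of underlying structure the rest of the chapter looks for in real data.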

14.1. Applications of the SVD

14.2. Matrix factorization

14.3. SVD in Python

14.4. Collaborative filtering–based recommendation engines

14.5. Example: a restaurant dish recommendation engine

14.6. Example: image compression with the SVD

14.7. Summary