Chapter 8. Collaborative filtering in the neighborhood

 

Collaborating makes things easier, so let’s collaborate our way through this chapter.

  • You’ll start by revisiting the rating matrix.
  • You’ll look at the theory behind collaborative filtering.
  • Collaborative filtering is done in several steps; you’ll look at each one and learn about the choices you need to make along the way.
  • You’ll learn how collaborative filtering is implemented in MovieGEEKs.

This chapter introduces collaborative filtering and goes into detail about the branch of it called neighborhood-based filtering. Collaborative filtering is an umbrella term for a family of methods. What unites them is the data they use: these methods rely only on ratings (implicit or explicit) as the source for creating recommendations.
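To make the "ratings only" idea concrete, here is a minimal sketch of the kind of input a collaborative filter works from. The user names, item names, and the `build_rating_matrix` helper are hypothetical and chosen only for illustration; they are not taken from MovieGEEKs.

```python
from collections import defaultdict

# Hypothetical ratings log: (user_id, item_id, rating). Illustrative data only.
ratings = [
    ("ann", "movie_a", 5.0),
    ("ann", "movie_b", 3.0),
    ("bob", "movie_a", 4.0),
    ("bob", "movie_c", 2.0),
]


def build_rating_matrix(ratings):
    """Return a sparse user-item matrix as {user: {item: rating}}.

    An item missing from a user's row means "not rated", not "rated zero".
    """
    matrix = defaultdict(dict)
    for user, item, rating in ratings:
        matrix[user][item] = rating
    return dict(matrix)


rating_matrix = build_rating_matrix(ratings)
```

Note that nothing about the items themselves (genre, title, description) appears in this structure. Under the assumption above, everything a collaborative filter knows comes from who rated what, and how.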

I dedicate two chapters, this one and chapter 10, to collaborative filtering. Chapter 10 covers learned models that use matrix factorization to find hidden features, also known as latent features. Chapter 9 covers content-based filtering.[1]

1 I also use collaborative filtering as one of the feature recommenders in chapter 12, which covers hybrid recommenders, and again in chapter 13, which covers ranking algorithms. But that isn’t the focus of those chapters.

8.1. Collaborative filtering: A history lesson

8.1.1. When information became collaboratively filtered

8.1.2. Helping each other

8.1.3. The rating matrix

8.1.4. The collaborative filtering pipeline

8.1.5. Should you use user-user or item-item collaborative filtering?

8.1.6. Data requirements

8.2. Calculating recommendations

8.3. Calculating similarities

8.4. Amazon’s algorithm to precalculate item similarity

Beware of the 1 or 2 items in common problem

8.5. Ways to select the neighborhood

Clustering