Chapter 2. User behavior and how to collect it

 

This chapter invites you to delve into the interesting subject of data collection:

  • You’ll start by returning to the Netflix site to identify events, which can provide evidence to build a case for what a user likes.
  • You’ll learn how to build a collector to gather these events.
  • You’ll learn how to integrate a collector into a site such as MovieGEEKs to capture events similar to the ones identified on the Netflix site.
  • With a general overview in place and an implementation, you’ll step back and analyze general consumer behavior.

Evidence is the data that reveals a user’s tastes. When we talk about collecting evidence, we’re collecting events and behavior that provide an indication of the user’s tastes.
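To make the idea of an event concrete, here is a minimal sketch of how one piece of evidence might be represented. The `Event` class, its field names, the event-type labels, and the `events_for_user` helper are all assumptions made for illustration; they aren't the schema used by Netflix or by the MovieGEEKs collector.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Event:
    """One piece of evidence: who did what to which item, and when."""
    user_id: str
    content_id: str
    event_type: str  # hypothetical labels such as "details", "play", "rate"
    created: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


def events_for_user(log, user_id):
    """Return the events in `log` that belong to a single user."""
    return [event for event in log if event.user_id == user_id]


# A tiny in-memory log standing in for whatever store a collector writes to.
log = [
    Event("user-1", "movie-42", "details"),
    Event("user-1", "movie-42", "play"),
    Event("user-2", "movie-7", "details"),
]

user_1_events = events_for_user(log, "user-1")
```

Each record says only that something happened. Deciding how much a "details" view or a "play" reveals about a user’s tastes is a separate step.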

Most books on recommender systems describe algorithms and ways of optimizing them. They start at a point where you already have a large data set to feed your algorithms. You’ll use one such data set in the MovieGEEKs site; it contains a catalog of movies and ratings from real users. But a data set doesn’t magically appear. Gathering the right evidence takes work and consideration, and it will make or break your system. “Garbage in, garbage out,” the famous programming saying, holds for recommenders too.

2.1. How (I think) Netflix gathers evidence while you browse

Often the purpose isn’t what it seems

2.1.1. The evidence Netflix collects

2.2. Finding useful user behavior

Content affiliation to provider

2.2.1. Capturing visitor impressions

2.2.2. What you can learn from a shop browser

2.2.3. Act of buying

2.2.4. Consuming products

2.2.5. Visitor ratings

2.2.6. Getting to know your customers the (old) Netflix way

2.3. Identifying users

2.4. Getting visitor data from other sources

2.5. The collector

2.5.1. Building the project files
