23 Case study 5 solution

This section covers

  • Cleaning data
  • Exploring networks
  • Feature engineering
  • Optimizing machine learning models

FriendHook is a popular social networking app designed for college campuses. Students can connect as friends in the FriendHook network. A recommendation engine emails users weekly with new friend suggestions based on their existing connections; students can either ignore these recommendations or send out friend requests. We have been given one week’s worth of data on friend recommendations and student responses, stored in the friendhook/Observations.csv file. We have also been given two additional files, friendhook/Profiles.csv and friendhook/Friendships.csv, which contain user profile information and the friendship graph, respectively. The user profiles have been encrypted to protect student privacy. Our goal is to build a model that predicts how users respond to the friend recommendations. We will do so by working through the following sections:

23.1 Exploring the data

23.1.1 Examining the profiles

23.1.2 Exploring the experimental observations

23.1.3 Exploring the Friendships linkage table

23.2 Training a predictive model using network features

23.3 Adding profile features to the model

23.4 Optimizing performance across a steady set of features

23.5 Interpreting the trained model

23.5.1 Why are generalizable models so important?

Summary