Chapter 2. Real-world Data
2.1. Getting started: data collection
2.1.1. Which features should be included?
2.1.2. How can we obtain ground truth for the target variable?
2.1.3. How much training data is required?
2.1.4. Is the training set representative enough?
2.2. Preprocessing the data for modeling
2.2.1. Categorical features
2.2.2. Dealing with missing data
2.2.3. Simple feature engineering
2.2.4. Data normalization
2.3. Using data visualization
2.5. Terms from this chapter
What's inside