This chapter covers
- What nonlinearity is and how nonlinearity in the hidden layers of a neural network enhances the network’s capacity and improves prediction accuracy
- What hyperparameters are and methods for tuning them
- Binary classification through nonlinearity at the output layer, introduced with the phishing-website-detection example
- Multiclass classification and how it differs from binary classification, introduced with the iris-flower example
In this chapter, you’ll build on the groundwork laid in chapter 2 to allow your neural networks to learn more complicated mappings from features to labels. The primary enhancement we will introduce is nonlinearity: a mapping between input and output that isn’t a simple weighted sum of the input’s elements. Nonlinearity enhances the representational power of neural networks and, when used correctly, improves prediction accuracy on many problems. We will illustrate this point by continuing to use the Boston-housing dataset. In addition, this chapter takes a deeper look at over- and underfitting, to help you train models that not only perform well on the training data but also achieve good accuracy on data the models haven’t seen during training, which is what ultimately counts when judging a model’s quality.
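To make that definition concrete before we dive in, the sketch below contrasts a purely linear stack of dense layers with one whose hidden layer has a nonlinear activation. This is a minimal illustration assuming the TensorFlow.js layers API; the layer sizes and the 12-element input shape are placeholders for this sketch, not the chapter’s actual models.

```js
import * as tf from '@tensorflow/tfjs';

// Without an activation, stacked dense layers still compute a single
// weighted sum of the inputs: the composition of linear maps is linear.
const linearModel = tf.sequential();
linearModel.add(tf.layers.dense({units: 8, inputShape: [12]}));
linearModel.add(tf.layers.dense({units: 1}));

// A nonlinear activation (sigmoid here) in the hidden layer breaks that
// collapse, so the input-to-output mapping is no longer a weighted sum.
const nonlinearModel = tf.sequential();
nonlinearModel.add(
    tf.layers.dense({units: 8, activation: 'sigmoid', inputShape: [12]}));
nonlinearModel.add(tf.layers.dense({units: 1}));
```

Why the first model gains nothing from its extra layer: its output is y = W2(W1x + b1) + b2 = (W2W1)x + (W2b1 + b2), which is still just one weighted sum of x. Inserting a nonlinearity between the layers is what prevents this algebraic collapse, and that is the capacity boost this chapter explores.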