3 Principles of curve fitting
This chapter covers:
- How to fit a parametric model
- What a loss function is and how to use it
- Linear regression, the mother of all neural networks
- Gradient descent as a tool to optimize a loss function
- Implementing gradient descent with different frameworks
DL models became famous because they outperformed traditional machine learning methods on a broad variety of relevant tasks, such as computer vision and natural language processing. From the previous chapter, you already know that a critical success factor of DL models is their deep hierarchical architecture. DL models have millions of tunable parameters, and you might wonder how to tune these parameters so that the model behaves optimally. The solution is astonishingly simple and is already used in many traditional machine learning methods: you first define a loss function that describes how badly a model performs on the training data, and then you tune the parameters of the model to minimize that loss. This procedure is called fitting.
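To make the idea concrete, here is a minimal sketch (not the book's code) of the fitting procedure on the simplest possible model, a line y = w*x + b: a mean-squared-error loss measures how badly the current parameters fit some toy data, and a plain gradient-descent loop tunes w and b to shrink that loss. The function name mse_loss, the learning rate, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

# Toy training data generated from y = 2x + 1 plus noise (assumed for illustration).
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=100)
y = 2.0 * x + 1.0 + rng.normal(scale=0.1, size=100)

def mse_loss(w, b):
    """Mean squared error: how badly the model y_hat = w*x + b fits the data."""
    y_hat = w * x + b
    return np.mean((y_hat - y) ** 2)

# Gradient descent: repeatedly nudge the parameters against the gradient of the loss.
w, b = 0.0, 0.0
lr = 0.1  # learning rate (assumed value)
for step in range(200):
    y_hat = w * x + b
    grad_w = 2 * np.mean((y_hat - y) * x)  # d(loss)/dw
    grad_b = 2 * np.mean(y_hat - y)        # d(loss)/db
    w -= lr * grad_w
    b -= lr * grad_b

print(f"fitted w={w:.2f}, b={b:.2f}, loss={mse_loss(w, b):.4f}")
```

The same recipe carries over to deep networks: only the model and the number of parameters change, while the loss-then-minimize structure stays the same. The rest of the chapter develops each ingredient in turn.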