chapter three

3 Introduction to Vector Calculus from a Machine Learning point of view

The core concept of machine learning is simple enough. We took a first look at it in section 1.3, and in section 2.8.2 we studied classifiers as a special case. Let us revisit both with a different example. In section 1.3 we also skipped the topic of error minimization; this time, armed with our knowledge of gradients, we will study it.

The Python NumPy/PyTorch code for this section, in the form of fully functional and executable Jupyter notebooks, can be found at http://mng.bz/4Zya.

Suppose we want to create a classifier machine that decides whether an image contains a car or a giraffe. Such classifiers, with only two classes, are known as binary classifiers. We identify a set of input signals, which are collected together in an input vector. In the case of convolutional neural networks (CNNs), the inputs are the pixel values of the image. The image is usually scaled to a fixed size, say 224 × 224. Thus the image is representable as a matrix

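$$
X = \begin{bmatrix}
X_{0,0} & X_{0,1} & \cdots & X_{0,223} \\
X_{1,0} & X_{1,1} & \cdots & X_{1,223} \\
\vdots & \vdots & \ddots & \vdots \\
X_{223,0} & X_{223,1} & \cdots & X_{223,223}
\end{bmatrix}
$$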

Each element of the matrix, $X_{i,j}$, is a pixel color value in the range [0, 255].
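To make the matrix representation concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions rather than the code from the accompanying notebooks: the image is assumed to have a single channel (one value per pixel), the pixel values are random stand-ins for a real image, and flattening the matrix row by row is just one common way to obtain an input vector. The names `HEIGHT`, `WIDTH`, `X`, and `x` are chosen here for illustration.

```python
import numpy as np

HEIGHT, WIDTH = 224, 224  # fixed image size used in the text

# Hypothetical stand-in for a real image: random pixel values in [0, 255].
# Assumption: a single channel, so each pixel is one number.
rng = np.random.default_rng(seed=0)
X = rng.integers(0, 256, size=(HEIGHT, WIDTH)).astype(np.uint8)

# X[i, j] is the pixel value at row i, column j (0-based, as in the matrix above).
top_left = X[0, 0]
bottom_right = X[HEIGHT - 1, WIDTH - 1]

# Assumption: one way to form an input vector is to flatten the matrix row by row.
x = X.reshape(-1)

print(X.shape)  # shape of the image matrix
print(x.shape)  # shape of the flattened input vector
```

With row-by-row flattening, the pixel at row `i` and column `j` lands at index `i * WIDTH + j` of `x`, so the vector carries exactly the same information as the matrix.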

3.1 Significance of the sign of the separating surface in binary classification

3.2 Estimating Model Parameters: Training

3.3 Minimizing Error during Training a Machine Learning Model: Gradient Vectors

3.3.1 Derivatives, Partial Derivatives, Change in function value and Tangents, Gradients

3.3.2 Level Surface representation and Loss Minimization

3.4 Python numpy and PyTorch code for Gradient Descent, Error Minimization and Model Training

3.4.1 Numpy and PyTorch code for Linear Models

3.4.2 Non-linear Models in PyTorch

3.4.3 A Linear Model for the cat-brain in PyTorch

3.5 Convex, Non-convex functions; Global and Local Minima

3.6 Multi-dimensional Taylor series and Hessian Matrix

3.6.1 1D Taylor Series recap

3.6.2 Multi-dimensional Taylor series and Hessian Matrix

3.7 Convex sets and functions

Chapter Summary