
This is an excerpt from Manning's book Grokking Machine Learning MEAP V09.
Table 5.5. The full table with the four points, their labels, predictions, absolute errors, square errors, and log losses.

| Point | True label | Predicted label | Absolute error | Square error | Log loss |
|-------|------------|-----------------|----------------|--------------|----------|
| 1     | 1 (Happy)  | 0.95            | 0.05           | 0.0025       | 0.051    |
| 2     | 0 (Sad)    | 0.8             | 0.8            | 0.64         | 1.609    |
| 3     | 1 (Happy)  | 0.3             | 0.7            | 0.49         | 1.204    |
| 4     | 0 (Sad)    | 0.1             | 0.1            | 0.01         | 0.105    |
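The columns of Table 5.5 can be recomputed in a few lines of Python. The `log_loss` helper below is just an illustrative sketch of the rule used in the table: when the true label is 1, the log loss is the negative natural logarithm of the prediction; when it is 0, the negative natural logarithm of one minus the prediction.

```python
import math

def log_loss(y, p):
    """Log loss for one point: -ln(p) if y is 1, -ln(1 - p) if y is 0."""
    return -math.log(p) if y == 1 else -math.log(1 - p)

# The four points from Table 5.5: (true label, predicted probability).
points = [(1, 0.95), (0, 0.8), (1, 0.3), (0, 0.1)]

for i, (y, p) in enumerate(points, start=1):
    print(f"Point {i}: abs={abs(y - p):.2f}, "
          f"square={(y - p) ** 2:.4f}, log loss={log_loss(y, p):.3f}")
```

Running this reproduces the three error columns of the table, one row per point.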
If I haven’t convinced you of the power of the log loss error function, let’s look at an extreme point. Say we have a point with label 1 (happy), for which the classifier makes a prediction of 0.00001. This point is very poorly classified. The absolute error is 0.99999, and the square error is 0.9999800001. The log loss, however, is the negative natural logarithm of the prediction, namely −ln(0.00001) ≈ 11.51. This value is much larger than the absolute or square errors, which means the log loss error is a better alarm for poorly classified points.
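You can check the three error values for this extreme point directly; the snippet below is a quick sketch of that calculation.

```python
import math

# An extremely poorly classified point: true label 1, prediction 0.00001.
p = 0.00001
print(abs(1 - p))     # absolute error: 0.99999
print((1 - p) ** 2)   # square error: roughly 0.99998
print(-math.log(p))   # log loss: about 11.51, far larger than the other two
```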
Notice that the classifier on the left, which is bad, has a log loss of 2.988. The classifier on the right, which is good, has a smaller log loss of 1.735. Thus, the log loss does its job, which is to assign a large error value to bad classifiers and a smaller one to good classifiers.
Figure 5.9. We now calculate the log loss by taking the probabilities calculated in Figure 5.8. Notice that the good classifier (right) has a smaller log loss than the bad classifier (left).
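The log loss of a whole classifier is simply the sum of the log losses of its points. As a minimal sketch, the `total_log_loss` function below compares two hypothetical classifiers; the labels and predictions are illustrative only, not the exact numbers behind Figure 5.8.

```python
import math

def total_log_loss(labels, predictions):
    """Sum of per-point log losses; a smaller total means a better classifier."""
    return sum(-math.log(p) if y == 1 else -math.log(1 - p)
               for y, p in zip(labels, predictions))

labels = [1, 0, 1, 0]
bad_predictions = [0.6, 0.6, 0.4, 0.4]   # barely better than a coin flip
good_predictions = [0.9, 0.1, 0.8, 0.2]  # confident and mostly right

print(total_log_loss(labels, bad_predictions))   # larger total log loss
print(total_log_loss(labels, good_predictions))  # smaller total log loss
```

As in Figure 5.9, the better classifier earns the smaller total log loss.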