Convolutional neural network (CNN) architectures are useful tools for analyzing images and for differentiating their features. Lines or curves may indicate your favorite automobile, or the indicator might be a particular higher-level feature, such as the green coloring present in most frog pictures. More complex indicators might be a freckle near your left nostril or the curvature of your chin passed down through generations of your family.
Humans have become adept through the years at picking out these identifying features, and it’s fine to wonder why. Humans have grown accustomed to looking at billions of example images shown to them since birth and then receiving feedback about what they are seeing in those images. Remember your mom repeating the word ball while showing you a ball? There’s a good chance that you remember some time she said it. What about the time you saw another ball of a slightly different shape or color and said, “Ball”?