5 Detecting patterns with unsupervised learning
This chapter covers
- Getting the virtual assistant to find patterns in the data
- Proactively detecting anomalies in application logs
- Removing noise from the data to reduce its size
In the previous chapter, we built a basic virtual assistant application, trained with supervised shallow learning techniques, that performs simple but useful tasks. In this chapter, we will expand its capabilities by adding new features powered by unsupervised learning. Along the way, we will showcase the unsupervised learning capabilities of ML.NET.
One scenario where unsupervised learning is appropriate is detecting patterns in unstructured data. This is done with a technique known as clustering, which we briefly covered in chapter 3. In clustering, records are assigned to groups (clusters) based on their similarities, without any labels telling the algorithm what those groups should be.
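To make the idea of "assigning records to clusters based on their similarities" concrete, here is a minimal sketch of k-means, one common clustering algorithm. It is written in plain Python with no libraries so that the mechanics are visible; it is not the ML.NET implementation. The function names, the sample records, and the choice of starting centroids are all assumptions made for illustration.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def assign_to_clusters(records, centroids):
    """For each record, return the index of the nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda i: dist(record, centroids[i]))
        for record in records
    ]


def update_centroids(records, labels, centroids):
    """Move each centroid to the mean of the records assigned to it."""
    updated = []
    for index, old_centroid in enumerate(centroids):
        members = [r for r, label in zip(records, labels) if label == index]
        if not members:
            # Keep an empty cluster's centroid where it was.
            updated.append(old_centroid)
        else:
            updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
    return updated


def k_means(records, centroids, max_iterations=10):
    """Alternate assignment and update steps until the labels stop changing."""
    labels = assign_to_clusters(records, centroids)
    for _ in range(max_iterations):
        centroids = update_centroids(records, labels, centroids)
        new_labels = assign_to_clusters(records, centroids)
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centroids


# Hypothetical two-dimensional records forming two visibly separate groups.
records = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]

# Starting centroids are picked by hand here to keep the run deterministic.
labels, centroids = k_means(records, centroids=[records[0], records[3]])
print(labels)
```

Note that the algorithm is never told which group a record belongs to. It only measures distances, which is why the way records are turned into numbers matters so much for the quality of the clusters.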
For a virtual software development assistant, one use of clustering is finding similarities between software errors based on their stack traces or error messages. This lets us quickly determine whether the error we are investigating is likely related to errors we have already solved. Another good use is preprocessing training data for a machine learning task. For example, we can group similar records and turn the result into a labeled training dataset for multiclass classification. In this chapter, we will cover scenarios similar to these.