This chapter covers
- Processing the output produced by outlier detection systems
- Improving outlier detection systems over time
- Taking advantage of labeled data to create more effective ensembles
Once outlier detection systems are put in production, they will begin to identify outliers. If run on a large body of data or over a long period of time, they may flag a very large number of outliers, even if these are only a small fraction of the total data examined. It’s usually necessary to examine the output, not only to investigate the outliers found, but also to ensure that the system is working well. This is particularly true for new systems, but it applies to established systems as well, since they may degrade in effectiveness over time. It’s important that both tasks can be done efficiently.