This chapter covers
- Additional Python libraries that support outlier detection
- Additional algorithms not found in libraries
- Three algorithms that support categorical data
- An interpretable outlier detection method, association rules
- Examples and techniques you’ll need to develop your own outlier detection code where necessary
Although scikit-learn and PyOD cover a large number of outlier detection algorithms, many algorithms not covered by these libraries may be equally useful for outlier detection projects. In addition, the detectors provided in these libraries support only numeric data, not categorical data, and are not always as interpretable as a project requires. Of the algorithms we’ve looked at so far, only frequent pattern outlier factor (FPOF) and histogram-based outlier score (HBOS) provide good support for categorical data, and most have low interpretability. In this chapter, we’ll introduce some other detectors better suited to categorical and mixed data: entropy, association rules, and a clustering method for categorical data. Association rules detectors also have the benefit of being quite interpretable.