chapter sixteen

16 Deep learning-based outlier detection

 

This chapter covers

  • Exploring the general concepts of deep learning-based outlier detection
  • Demonstrating some of the options available in standard libraries
  • Introducing outlier detection for image data
  • Discussing the state of the art in outlier detection today

Deep learning-based outlier detection techniques can be very powerful for many types of problems. For tabular data, they are typically still not as useful as the methods we’ve looked at so far in this book. As a data scientist, though, you may often work with time series, text, image, video, audio, network, or other types of data, and for many of these, deep learning-based methods can be very effective. In fact, for several of these, including image, video, and audio, there really are no other viable options available today.

Deep learning-based outlier detection can work in a variety of ways, but all approaches use deep neural networks in one way or another. Deep neural networks have some significant advantages as models and have proven able to handle many types of problems that are unsolvable by other means. At the same time, they come with costs: they tend to require a very large amount of data to train, are slower to work with, and are more difficult to tune. Still, they’ve made phenomenal progress in many fields over the last several years, even with tabular data, and we will certainly see them become increasingly powerful in the years to come.

16.1 Introduction to neural networks

16.2 PyOD

16.2.1 Autoencoders (AEs)

16.2.2 Variational autoencoders (VAEs)

16.2.3 Generative adversarial networks (GANs)

16.2.4 SO_GAAL (Single-objective Generative Adversarial Learning)

16.2.5 MO_GAAL (Multi-objective Generative Adversarial Learning)

16.2.6 DeepSVDD

16.2.7 DIF (Deep Isolation Forest)

16.3 Image data

16.3.1 Techniques for outlier detection with image data

16.3.2 Astronomaly

16.4 Alibi-detect

16.5 Self-supervised learning for outlier detection with tabular data

16.5.1 Introduction to embeddings

16.5.2 Embeddings for outlier detection

16.5.3 Transfer learning

16.5.4 Self-supervised methods for tabular data

16.5.5 DeepOD

16.6 Summary