Chapter 14. Deep learning on Spark with H2O

This chapter covers

  • Introduction to H2O
  • Introduction to deep learning
  • Starting an H2O cluster on Spark
  • Building and evaluating a regression deep-learning model using Sparkling Water
  • Building and evaluating a classification deep-learning model using Sparkling Water

Deep learning is a hot topic in the machine-learning world today. We could say that there’s a deep-learning revolution going on. Deep learning is a general term denoting a family of machine-learning methods characterized by the use of multiple processing layers of nonlinear transformations. These layers are almost universally implemented as neural networks.
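To make "multiple processing layers of nonlinear transformations" concrete, here is a minimal sketch of a forward pass through a tiny feed-forward network, written in plain Python. It isn't H2O code; the function names (dense_layer, forward), the tanh activation, and the hand-picked weights are all assumptions chosen purely for illustration.

```python
import math


def dense_layer(inputs, weights, biases):
    """One processing layer: a weighted sum per unit, then a tanh nonlinearity."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Feed the inputs through each (weights, biases) layer in turn."""
    activations = inputs
    for weights, biases in layers:
        activations = dense_layer(activations, weights, biases)
    return activations


# Hypothetical parameters: 2 inputs -> 3 hidden units -> 1 output unit.
layers = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1]),
    ([[0.7, -0.5, 0.2]], [0.05]),
]

output = forward([1.0, 2.0], layers)
print(output)
```

Each layer's output becomes the next layer's input, and that stacking is what the word "deep" refers to. In a real model, the weights are learned from data rather than set by hand.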

Although the core principles aren’t new, a lack of computing power and efficient algorithms kept them from being developed further in previous decades. This has changed in recent years, with many advances in deep-learning algorithms and their successful applications. One of the many recent breakthroughs is the DeepID system for learning high-level features,[1] which is capable of recognizing tens of thousands of faces with a close-to-human accuracy of 97.45% (unlike its accuracy, its capacity is obviously superhuman).

[1] Yi Sun et al., “Deep Learning Face Representation from Predicting 10,000 Classes,” http://mng.bz/W01w.

14.1. What is deep learning?

14.2. Using H2O with Spark

14.3. Performing regression with H2O’s deep learning

14.4. Performing classification with H2O’s deep learning

14.5. Summary
