Chapter 14. Deep learning on Spark with H2O
This chapter covers
- Introduction to H2O
- Introduction to deep learning
- Starting an H2O cluster on Spark
- Building and evaluating a regression deep-learning model using Sparkling Water
- Building and evaluating a classification deep-learning model using Sparkling Water
Deep learning is a hot topic in the machine-learning world today. We could say that there’s a deep-learning revolution going on. Deep learning is a general term denoting a family of machine-learning methods characterized by the use of multiple processing layers of nonlinear transformations. These layers are almost universally implemented as neural networks.
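To make "multiple processing layers of nonlinear transformations" concrete, here is a minimal sketch of a tiny feed-forward neural network in plain Python. It is only an illustration and has nothing to do with H2O's implementation: the function names, the choice of tanh as the nonlinearity, the layer sizes, and the hand-picked weights are all assumptions made for this example.

```python
import math

def dense(inputs, weights, biases):
    """Affine step: one weighted sum plus bias per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def layer(inputs, weights, biases):
    """One processing layer: the affine step followed by a nonlinear tanh."""
    return [math.tanh(z) for z in dense(inputs, weights, biases)]

def forward(inputs, layers):
    """Feed the input through each layer in turn."""
    activations = inputs
    for weights, biases in layers:
        activations = layer(activations, weights, biases)
    return activations

# Hypothetical, hand-picked parameters: 2 inputs -> 3 hidden units -> 1 output.
network = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1]),
    ([[1.0, -1.0, 0.5]], [0.2]),
]

output = forward([1.0, 2.0], network)
print(output)
```

Each layer's output becomes the next layer's input, and the nonlinearity is what makes stacking worthwhile: without it, any number of stacked layers would collapse into a single linear transformation. A real deep-learning model learns its weights from data rather than having them written by hand.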
Although the core principles aren’t new, a lack of computing power and efficient algorithms held back their development in previous decades. This has changed in recent years, with many advances in deep-learning algorithms and their successful applications. One of the many recent breakthroughs is the DeepID system for learning high-level features,[1] which is capable of recognizing tens of thousands of faces with a close-to-human accuracy of 97.45% (unlike its accuracy, its capacity is obviously superhuman).
[1] Yi Sun et al., “Deep Learning Face Representation from Predicting 10,000 Classes,” http://mng.bz/W01w.