Part 4. Bringing it together

In chapter 13, you’ll bring it all together and explore a Spark streaming application that analyzes log files and displays the results on a real-time dashboard. You can use that application as a basis for your own future applications.

Chapter 14 introduces H2O, a fast, scalable machine-learning framework that implements many machine-learning algorithms, most notably deep learning, which Spark lacks. It also introduces Sparkling Water, H2O’s package that enables you to start and use an H2O cluster from Spark. Through Sparkling Water, you can use Spark’s Core, SQL, Streaming, and GraphX components to ingest, prepare, and analyze data, and transfer it to H2O for use in H2O’s deep-learning algorithms. You can then transfer the results back to Spark and use them in subsequent computations.