6. Deploying your simple app

 

This chapter covers

  • Deploying a Spark application
  • Defining the roles of the critical components in a Spark cluster environment
  • Running an application on a cluster
  • Calculating an approximation of π (pi) using Spark
  • Analyzing the execution logs

In the previous chapters, you discovered what Apache Spark is and how to build simple applications, and, hopefully, understood key concepts like the dataframe and laziness. This chapter is linked with the preceding one: you built an application in chapter 5 and will deploy it in this chapter. Reading chapter 5 before this one is not required but is highly recommended.

In this chapter, you will set coding aside to discover how to interact with Spark as you move toward deployment and production. You could ask, “Why are we talking about deployment so early in the book? Deployment is at the end, no?”

A little over 20 years ago, when I was building applications using Visual Basic 3 (VB3), I would run the Visual Basic Setup wizard toward the end of a project to help build the 3.5-inch floppy disks used for installation. In those days, my bible was the 25-chapter Microsoft Visual Basic 3.0 Programmer’s Guide, and deployment was covered in chapter 25.

6.1 Beyond the example: The role of the components

6.1.1 Quick overview of the components and their interactions

6.1.2 Troubleshooting tips for the Spark architecture

6.1.3 Going further

6.2 Building a cluster

6.2.1 Building a cluster that works for you

6.2.2 Setting up the environment

6.3 Building your application to run on the cluster

6.3.1 Building your application’s uber JAR

6.3.2 Building your application by using Git and Maven

6.4 Running your application on the cluster

6.4.1 Submitting the uber JAR

6.4.2 Running the application

6.4.3 The Spark user interface

Summary