As with any technology, you need to understand a bit of the “boring” theory before you can dive deep into using it. I have managed to contain this part to six chapters, which will give you a good overview of the concepts, explained through examples.
Chapter 1 is an overall introduction with a simple example. You will learn why Spark is not just a set of tools, but a real distributed analytics operating system. After this first chapter, you will be able to run a simple data ingestion in Spark.
Chapter 2 will show you how Spark works, at a high level. Step by step, you’ll build a mental model of Spark’s components (a representation of your own thought process). This chapter’s lab shows you how to export data to a database. This chapter contains a lot of illustrations, which should make your learning process easier than learning from words and code alone!