Copyright
Brief Table of Contents
Table of Contents
Preface
Acknowledgments
About this book
About the author
About the cover illustration
Chapter 1. Introduction
1.1. What you’ll learn in this book
1.2. Why large datasets?
1.3. What is parallel computing?
1.3.1. Understanding parallel computing
1.3.2. Scalable computing with the map and reduce style
1.3.3. When to program in a map and reduce style
1.4. The map and reduce style
1.4.1. The map function for transforming data
1.4.2. The reduce function for advanced transformations
1.4.3. Map and reduce for data transformation pipelines
1.5. Distributed computing for speed and scale
1.6. Hadoop: A distributed framework for map and reduce
1.7. Spark for high-powered map, reduce, and more
1.8. AWS Elastic MapReduce—Large datasets in the cloud
Summary
Chapter 2. Accelerating large dataset work: Map and parallel computing