Chapter 1. Introducing Hadoop

This chapter covers

The basics of writing a scalable, distributed data-intensive program
Understanding Hadoop and MapReduce
Writing and running a basic MapReduce program

Today, we’re surrounded by data. People upload videos, take pictures on their cell phones, text friends, update their Facebook status, leave comments around the web, click on ads, and so forth. Machines, too, are generating and keeping more and more data. You may even be reading this book as digital data on your computer screen, and certainly your purchase of this book is recorded as data with some retailer.¹

¹ Of course, you’re reading a legitimate copy of this, right?

1.1. Why “Hadoop in Action”?

1.2. What is Hadoop?

1.3. Understanding distributed systems and Hadoop

1.4. Comparing SQL databases and Hadoop

1.5. Understanding MapReduce

1.6. Counting words with Hadoop—running your first program

1.7. History of Hadoop

1.8. Summary

1.9. Resources