Chapter 1. Introducing Hadoop

 

This chapter covers

  • The basics of writing a scalable, distributed data-intensive program
  • Understanding Hadoop and MapReduce
  • Writing and running a basic MapReduce program

Today, we’re surrounded by data. People upload videos, take pictures on their cell phones, text friends, update their Facebook status, leave comments around the web, click on ads, and so forth. Machines, too, are generating and keeping more and more data. You may even be reading this book as digital data on your computer screen, and certainly your purchase of this book is recorded as data with some retailer.[1]

[1] Of course, you’re reading a legitimate copy of this, right?

1.1. Why “Hadoop in Action”?

1.2. What is Hadoop?

1.3. Understanding distributed systems and Hadoop

1.4. Comparing SQL databases and Hadoop

1.5. Understanding MapReduce

1.6. Counting words with Hadoop—running your first program

1.7. History of Hadoop

1.8. Summary

1.9. Resources
