Chapter 3. Components of Hadoop

 

This chapter covers

  • Managing files in HDFS
  • Analyzing components of the MapReduce framework
  • Reading and writing input and output data

In the last chapter we looked at installing and setting up Hadoop. We covered what the different nodes do and how to configure them to work with each other. Now that you have Hadoop running, let’s look at the Hadoop framework from a programmer’s perspective. If the previous chapter taught you how to connect your turntable, mixer, amplifier, and speakers, this chapter is about the techniques of mixing music.

We first cover HDFS, where you’ll store the data that your Hadoop applications process. Next we explain the MapReduce framework in more detail. Chapter 1 showed a MapReduce program, but discussed its logic only at the conceptual level. In this chapter we get to know the Java classes and methods involved, as well as the underlying processing steps. We also learn how to read and write data in different formats.

3.1. Working with files in HDFS

 
 
 
 

3.2. Anatomy of a MapReduce program

 
 
 

3.3. Reading and writing

 

3.4. Summary

 
 
 
 