Chapter 8. Managing Hadoop

 

This chapter covers

  • Configuring for a production system
  • Maintaining the HDFS filesystem
  • Setting up a job scheduler

The installation instructions in chapter 2 produced a running Hadoop cluster fairly quickly. That configuration was relatively simple, but it isn’t suitable for a production cluster, which will be under heavy, sustained use. A production cluster needs several configuration parameters tuned, and section 8.1 covers them.

In addition, like any system, a Hadoop cluster changes over time, and you (or an administrator) will need to know how to maintain it and keep it running well. This is particularly true for the HDFS filesystem. Sections 8.2 through 8.5 cover standard filesystem maintenance tasks: checking its health, setting permissions, managing quotas, and recovering deleted files (trash). Sections 8.6 through 8.10 cover larger but less frequent administrative tasks that are more specific to HDFS, including adding and removing nodes (capacity) and recovering from a NameNode failure. We end the chapter with a section on setting up a scheduler to manage multiple running jobs.

8.1. Setting up parameter values for practical use

8.2. Checking system’s health

8.3. Setting permissions

8.4. Managing quotas

8.5. Enabling trash

8.6. Removing DataNodes

8.7. Adding DataNodes

8.8. Managing NameNode and Secondary NameNode

8.9. Recovering from a failed NameNode

8.10. Designing network layout and rack awareness

8.11. Scheduling jobs from multiple users

8.12. Summary