15 Batch schedulers: Bringing order to chaos

 

This chapter covers

  • The role of batch schedulers in high performance computing
  • Submitting a job to a batch scheduler
  • Linking job submissions for long runs or more complex workflows

Most high performance computing systems use batch schedulers to schedule the running of applications. We’ll give you a brief idea why in the first section of this chapter. Because schedulers are ubiquitous on high-end systems, you need at least a basic understanding of them to run jobs at high performance computing centers and even on smaller clusters. We’ll cover the purpose and usage of batch schedulers. We won’t go into how to set up and manage them (that’s a whole other beast). Setup and management is a topic for system administrators, and we are just lowly system users.

What if you don’t have access to a system with a batch scheduler? We don’t recommend installing a batch scheduler just to try out these examples. Rather, count your blessings and keep the information in this chapter handy for when the need arises. If your demand for computational resources grows and you begin using a larger multi-user cluster, you can come back to this chapter.

15.1 The chaos of an unmanaged system

15.2 How not to be a nuisance when working on a busy cluster

15.2.1 Layout of a batch system for busy clusters

15.2.2 How to be courteous on busy clusters and HPC sites: Common HPC pet peeves

15.3 Submitting your first batch script

15.4 Automatic restarts for long-running jobs

15.5 Specifying dependencies in batch scripts

15.6 Further explorations

15.6.1 Additional reading

15.6.2 Exercises

Summary