Chapter 14. Achieving high availability: availability zones, auto-scaling, and CloudWatch

This chapter covers

  • Using a CloudWatch alarm to recover a failed virtual machine
  • Understanding availability zones in an AWS region
  • Using auto-scaling to guarantee your VMs keep running
  • Analyzing disaster-recovery requirements

Imagine you run a web shop. During the night, the hardware running your virtual machine fails. Your users can’t access the shop until you get to work the next morning. During those 8 hours of downtime, your users look for an alternative and stop buying from you. That’s a disaster for any business. Now imagine a highly available web shop. Just a few minutes after the hardware fails, the system recovers, restarts itself on new hardware, and your web shop is back online without any human intervention. Your users can continue to shop on your site. In this chapter, we’ll teach you how to build a high-availability architecture based on EC2 instances.

Virtual machines aren’t highly available by default, and a number of scenarios can cause an outage of your virtual machine.

14.1. Recovering from EC2 instance failure with CloudWatch

14.2. Recovering from a data center outage

14.3. Analyzing disaster-recovery requirements

Summary