Chapter 6. Achieving high reliability at cloud scale
This chapter covers
- SOA as a precursor to the cloud
- How loose coupling improves reliability
- Distributed high-performance cloud reliability, including MapReduce
The cloud is well suited to dealing with scale because public Infrastructure as a Service (IaaS) and Platform as a Service (PaaS) clouds are large collections of thousands of virtualized servers, with tools that let you expand and contract the number of instances of your application according to demand. But what happens when you have literally thousands of commodity (cheap) computers all working in parallel? Some of them will fail. Every machine has a finite mean time to failure (MTTF), so in a large enough fleet hardware failure stops being a rare event and becomes a routine one.
You learned about designing and architecting for scalability in chapter 5. If you create a popular application (there’ll be another Google and another Facebook, have no fear), you need to be prepared to deal with those hardware failures, which means designing and architecting the application for reliability. Reliability is important for any application, no matter where it resides, if it’s going to be put into production and become mission critical in any way. But the cloud presents interesting challenges as well as opportunities with respect to application reliability.
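To get a feel for why failures become routine at this scale, the following is a minimal back-of-the-envelope sketch in Python. It assumes that servers fail independently and at a constant rate (an exponential failure model), which is a simplification. The function names and the MTTF figure in the usage lines are hypothetical, chosen for illustration and not taken from any real data center.

```python
import math


def expected_failures_per_day(num_servers: int, mttf_days: float) -> float:
    """Average number of server failures per day across the fleet.

    Assumes each server fails independently at a constant rate of
    1 / mttf_days failures per day.
    """
    return num_servers / mttf_days


def prob_at_least_one_failure(num_servers: int, mttf_days: float,
                              window_days: float = 1.0) -> float:
    """Probability that at least one server fails within the window.

    Under the exponential model, a single server survives the window
    with probability exp(-window_days / mttf_days); the whole fleet
    survives only if every server does.
    """
    fleet_survival = math.exp(-num_servers * window_days / mttf_days)
    return 1.0 - fleet_survival


if __name__ == "__main__":
    # Hypothetical figure: an MTTF of 1,000 days per commodity server.
    mttf = 1000.0
    for fleet_size in (1, 100, 1000, 10000):
        per_day = expected_failures_per_day(fleet_size, mttf)
        p_any = prob_at_least_one_failure(fleet_size, mttf)
        print(f"{fleet_size:>6} servers: {per_day:.3f} failures/day, "
              f"P(at least one failure today) = {p_any:.3f}")
```

With the assumed MTTF of 1,000 days, a single server fails on average once every few years, but a fleet of 1,000 such servers averages one failure per day. The exact numbers depend entirely on the assumed MTTF and failure model; the point is only that the failure rate of the fleet grows in proportion to its size.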