Chapter 6. Writing code that survives failure

This chapter covers

  • Understanding load balancing
  • Installing and configuring HAProxy to load balance Rabbit
  • Writing code that reconnects and intelligently survives failure

Building a RabbitMQ cluster for availability and performance is only half the battle in creating a resilient messaging infrastructure. The other half is writing applications that expect node failure and know how to reconnect to the cluster when it happens. There are a number of strategies for handling reconnection, but the one we’ll focus on is using a load balancer to handle node selection. A load balancer not only reduces the complexity of the failure-handling code in your apps, it also helps spread connections evenly across your cluster. But even with a load balancer, there’s more to handling node failure than establishing a new connection. Your apps also need to be prepared to re-create any exchanges and queues that didn’t survive the failure of the original node. This is particularly true when using two standalone Rabbit nodes in an active/standby configuration (which we’ll cover in chapter 7). Before you start writing failure-handling code in your apps, we’ll look at what it takes to use a load balancer with RabbitMQ.
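To make the two halves of failure handling concrete (reconnecting, then re-creating exchanges and queues), here’s a minimal sketch of the overall loop. It isn’t tied to any particular client library: connect, declare_topology, and consume are hypothetical callables your app would supply, and the name consume_with_reconnect and its parameters are our own assumptions. We also assume that failures surface as exceptions listed in retry_on; a real AMQP client raises its own exception types, which you’d pass in instead.

```python
import time


def consume_with_reconnect(connect, declare_topology, consume,
                           retry_on=(ConnectionError,),
                           retry_delay=1.0, max_attempts=5,
                           sleep=time.sleep):
    """Keep consuming, reconnecting whenever the connection is lost.

    All three callables are hypothetical and supplied by the app:
      connect()                 -> opens a connection (for example, to the
                                   load balancer's address) and returns a
                                   channel-like object
      declare_topology(channel) -> declares exchanges, queues, and bindings
      consume(channel)          -> blocks while consuming; returns when done
    """
    failures = 0  # consecutive failures since the last successful setup
    while True:
        try:
            channel = connect()
            # The node we land on may not have our exchanges and queues,
            # so declare them again after every (re)connection.
            declare_topology(channel)
            failures = 0  # connected and set up, so reset the counter
            consume(channel)
            return  # consume() finished without losing the connection
        except retry_on:
            failures += 1
            if failures >= max_attempts:
                raise  # give up and let the caller decide what to do
            sleep(retry_delay)  # brief pause before trying again
```

Note that the declarations are repeated on every pass through the loop rather than only at startup. That’s the piece that’s easy to forget: a fresh connection doesn’t guarantee that the exchanges and queues your app depends on are still there.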

6.1. Load balancing your Rabbits

6.2. Lost connections and failing clients between servers

6.3. Summary