Chapter 3. Patterns for performance, scalability, and availability
In this chapter
- The base for performance-related patterns
- Multimodal biometrics
- Scaling inside and outside of the service
When you design a software architecture for a complete system, you need to make sure it accommodates sets of requirements beyond the basic functionality. You need to take care of maintainability, security, and reliability. One very important quality attribute, or class of requirements, is performance. Performance involves several concerns, such as throughput and latency, which sometimes complement and sometimes contradict each other.
SOA principles and guidelines don’t always help to solve performance problems. In fact, SOA is almost inherently bad for performance: by making the components distributed, it tends to increase latency and add layers of indirection. This chapter will present patterns to help mitigate these performance, scalability, and availability challenges. Availability and scalability are bundled with performance because a solution to one of these problems often helps to resolve the others.
One strategy for increasing performance is load balancing (see the Service Instance pattern in section 3.4). If implemented properly, it can also help increase service availability, as each load-balanced server provides redundancy for the others.
Another aspect of temporal coupling is apparent in the Request/Reply pattern (discussed in chapter 5), the most common communication pattern used in SOA implementations. With Request/Reply, you typically expect the service to return a result immediately, and this couples the consumer to the service in time, potentially resulting in a performance bottleneck: the maximum load is the maximum number of requests the service can handle concurrently.
The Decoupled Invocation pattern helps solve the potential performance bottleneck outlined in the problem section. It does this with a queue between the caller and the message handler components. Placing a message on the queue is an efficient operation, which means the service will be free to accept new requests sooner. If you keep the handler simple, you can employ the Virtual Endpoint pattern (see section 3.5) to resolve availability problems when faults occur.
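To make the mechanics concrete, here is a minimal sketch of the queue-based decoupling in Java, using an in-memory BlockingQueue as a stand-in for a real message queue; the accept() and process() names are illustrative assumptions, not part of the pattern's definition.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Minimal sketch of Decoupled Invocation: the endpoint only enqueues,
// so it can acknowledge the consumer immediately; a separate handler
// thread drains the queue at its own pace. In production the queue
// would be a durable message broker rather than an in-memory structure.
public class DecoupledInvocation {
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();

    // Called by the service endpoint: cheap, returns quickly.
    public void accept(String request) throws InterruptedException {
        queue.put(request);  // enqueue and return; no processing here
    }

    // Runs on a dedicated handler thread, decoupled from callers.
    public void startHandler() {
        Thread handler = new Thread(() -> {
            try {
                while (true) {
                    String request = queue.take();  // blocks until work arrives
                    process(request);               // the expensive part
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        handler.setDaemon(true);
        handler.start();
    }

    private void process(String request) {
        System.out.println("Handling " + request);
    }
}
```

A consumer-facing endpoint would call accept() and return an acknowledgment immediately; with a durable broker such as ActiveMQ or RabbitMQ in place of the in-memory queue, requests would also survive service restarts.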
The Parallel Pipelines pattern works well in combination with the other performance and scalability patterns we’ll discuss in this chapter. You can use Parallel Pipelines with the Gridable Service pattern (see section 3.3) to solve a performance problem within one of the subtask components.
Implementing the Parallel Pipelines pattern isn’t too complicated; the hard part is deciding which operations should be grouped into which subcomponents. One option is Akka, a Scala framework (also usable from Java) that lets you implement remote message passing between components. Akka’s components are called actors, and each actor can serve as a separate pipeline, as sketched below. Another option is to base a solution on JavaSpaces technology, which has commercial implementations such as GigaSpaces (usable from both Java and .NET). The nice feature of both Akka and JavaSpaces is that, though they’re different, they both let you make components local or remote by configuration, so you can partition the logic into pipelines according to your needs and performance requirements.
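As a concrete illustration, here is a minimal sketch of a three-stage pipeline using Akka’s classic Java API; the stage names and the string-based messages are illustrative assumptions, and a real service would define proper message types per stage.

```java
import akka.actor.AbstractActor;
import akka.actor.ActorRef;
import akka.actor.ActorSystem;
import akka.actor.Props;

// Each pipeline stage is an actor: it does its piece of the work and
// forwards the result to the next stage (or emits it if it's the last).
class Stage extends AbstractActor {
    private final String name;
    private final ActorRef next;  // null for the last stage

    Stage(String name, ActorRef next) { this.name = name; this.next = next; }

    static Props props(String name, ActorRef next) {
        return Props.create(Stage.class, () -> new Stage(name, next));
    }

    @Override
    public Receive createReceive() {
        return receiveBuilder()
            .match(String.class, msg -> {
                String result = name + "(" + msg + ")";  // this stage's work
                if (next != null) next.tell(result, getSelf());
                else System.out.println("Pipeline output: " + result);
            })
            .build();
    }
}

public class PipelineDemo {
    public static void main(String[] args) {
        ActorSystem system = ActorSystem.create("pipelines");
        // Wire the stages back to front so each one knows its successor.
        ActorRef store   = system.actorOf(Stage.props("store", null));
        ActorRef enrich  = system.actorOf(Stage.props("enrich", store));
        ActorRef receive = system.actorOf(Stage.props("receive", enrich));
        receive.tell("order-42", ActorRef.noSender());
        // Give the asynchronous pipeline a moment, then shut down.
        try { Thread.sleep(1000); } catch (InterruptedException ignored) {}
        system.terminate();
    }
}
```

Because actor references are location-transparent, the same three stages could be moved to separate machines through configuration alone, which is exactly the partitioning flexibility mentioned above.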
Figure 3.7 illustrates the solution. The Gridable Service pattern is based on a computation grid, and possibly a data grid, as part of the internal structure of a service. When the service business logic needs to handle a task that’s computationally intense, the business logic creates a job on the grid root. A job is made of one or more tasks that can be queued and executed on the grid. The scheduler distributes the tasks to one or more nodes, depending on the job type, and the grid agent then executes them.
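The following sketch models these mechanics locally in Java, with a thread pool standing in for the grid scheduler and its worker threads standing in for grid nodes; expensiveComputation() and the chunking scheme are illustrative assumptions, and a real grid product would distribute the tasks across machines.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Local stand-in for a compute grid: the thread pool plays the role of
// the scheduler, and its worker threads play the role of grid nodes.
public class GridableJobDemo {
    public static void main(String[] args) throws Exception {
        ExecutorService scheduler = Executors.newFixedThreadPool(4);

        // The "job": one computationally intense request split into tasks.
        List<Callable<Long>> tasks = new ArrayList<>();
        for (int i = 0; i < 8; i++) {
            final int chunk = i;
            tasks.add(() -> expensiveComputation(chunk));  // one grid task
        }

        // The scheduler queues the tasks and hands them to available nodes;
        // the business logic then aggregates the task results.
        long total = 0;
        for (Future<Long> f : scheduler.invokeAll(tasks)) {
            total += f.get();
        }
        System.out.println("Job result: " + total);
        scheduler.shutdown();
    }

    private static long expensiveComputation(int chunk) {
        long sum = 0;
        for (long i = 0; i < 1_000_000; i++) sum += (i % (chunk + 2));
        return sum;
    }
}
```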
The grid infrastructure components (the agent, root node, and so on) constantly monitor resource availability. Adding hardware configured with the grid components enlarges the pool of available resources. The grid maximizes resource usage by allocating work according to the load on each machine. This “smart” resource allocation helps meet both scalability and load-balancing requirements. Additionally, the grid implements redundancy and failover, and it can pass tasks to new nodes when a node fails. The Gridable Service pattern can be combined with the Workflodize pattern (see chapter 2) by making the job’s tasks into workflow instances or by having a workflow drive the jobs.
It’s better to maintain a single endpoint and divide the request load among the service instances. If you do need multiple endpoints, you can build on the Virtual Endpoint pattern (discussed in section 3.5). The important point is that consumers of the service are unaware of, and unaffected by, the scaling that occurs inside the service (see the sidebar for more information).
Implementing the Service Instance pattern doesn’t require a particular technology. Instead, you implement a dispatcher in the language of your choice and distribute requests to the farm of servers running your service. This is especially straightforward if you build this pattern on top of the Decoupled Invocation pattern, because the queue itself can act as the dispatcher: each service instance simply pulls the next request from the shared queue when it’s free.
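Here is a minimal sketch of such a standalone dispatcher in Java, using simple round-robin selection; the instance addresses and the forward() method are illustrative stand-ins for real service endpoints and transport code.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Round-robin dispatcher for the Service Instance pattern: the consumer
// sees one endpoint (the dispatcher), which spreads requests across the
// instances behind it.
public class RoundRobinDispatcher {
    private final List<String> instances;       // addresses of the instances
    private final AtomicInteger next = new AtomicInteger();

    public RoundRobinDispatcher(List<String> instances) {
        this.instances = instances;
    }

    public void dispatch(String request) {
        // floorMod keeps the index valid even after the counter overflows.
        int index = Math.floorMod(next.getAndIncrement(), instances.size());
        forward(instances.get(index), request);
    }

    private void forward(String instance, String request) {
        // In a real implementation this would send over HTTP, a queue, etc.
        System.out.println("Forwarding " + request + " to " + instance);
    }
}
```

Usage is a one-liner: `new RoundRobinDispatcher(List.of("node-a", "node-b")).dispatch("req-1");`. A more sophisticated dispatcher could weight instances by load or health, which ties in with the Service Watchdog pattern discussed later in this chapter.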
For one thing, you need to take care of restarting the failed service and resuming request processing. You can look at the Service Monitor pattern (see chapter 4), the Service Watchdog pattern (see section 3.6), and the Transactional Service pattern (see chapter 2) for ways to monitor services and recover from failures.
If your service is truly stateless, you can scale the service using the Service Instance pattern described earlier. But this may not provide a completely seamless solution to the service consumer. The fact that there are multiple instances of the service may be exposed to the client.
The first is the watchdog agent concept, where the service implements the Active Service pattern (discussed in chapter 2) and contains a component in charge of monitoring the service’s state. This component publishes the service’s state periodically, and also when something meaningful occurs (see the Inversion of Communications pattern in chapter 5). Note that just because the service actively publishes its state doesn’t mean it can’t also respond to inquiries regarding its health (akin to leaving a comment on a blog and getting a response from the author).
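A minimal sketch of such an agent in Java follows, using a ScheduledExecutorService for the periodic publication; publish() is an illustrative stand-in for whatever event channel or monitoring topic the service actually uses.

```java
import java.time.Instant;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Sketch of a watchdog agent living inside the service: it publishes the
// service's health periodically, publishes immediately on meaningful
// events, and can still answer direct inquiries about its state.
public class WatchdogAgent {
    private final ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor();
    private volatile String state = "OK";

    public void start() {
        // Periodic publication of the service's state.
        timer.scheduleAtFixedRate(
                () -> publish("heartbeat: " + state + " at " + Instant.now()),
                0, 10, TimeUnit.SECONDS);
    }

    // Also publish immediately when something meaningful occurs.
    public void reportEvent(String newState) {
        state = newState;
        publish("event: " + newState);
    }

    // And answer direct health inquiries on request.
    public String currentState() {
        return state;
    }

    private void publish(String message) {
        System.out.println(message);  // stand-in for a real event channel
    }
}
```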
Let’s consider the advantages of the Service Watchdog pattern over the other options presented earlier. The Service Watchdog pattern combines the benefits of an agent that actively monitors the service’s health with the internal knowledge of how to maintain service continuity. For instance, a service is best equipped to know if its processing is running slower than usual. If there are many instances of the service, the service should know how many copies are really needed and how many are just for redundancy. And so on.
Performance, scalability, and availability are related attributes of any software system. Often the best way to solve a performance problem is to scale the solution, and once you do, you may find that the same approach also increases the solution’s availability. This is especially true when you combine patterns so that their individual quality-attribute benefits reinforce one another.
In this chapter, we examined structural patterns to help increase performance, scalability, and availability of services in an SOA. We covered the following patterns:
- Decoupled Invocation
- Parallel Pipelines
- Gridable Service
- Service Instance
- Virtual Endpoint
- Service Watchdog
Martin Fowler, “The LMAX Architecture,” http://martinfowler.com/articles/lmax.html. The Disruptor pattern discussed in this article creates a low-latency, lock-free queue between writers and readers.
Frank Buschmann, Regine Meunier, Hans Rohnert, Peter Sommerlad, and Michael Stal, Pattern-Oriented Software Architecture: A System of Patterns, vol. 1 (John Wiley & Sons, 1996). The Parallel Pipelines pattern is an SOA application of the Pipes and Filters pattern described in Pattern-Oriented Software Architecture.