Chapter 12. Why event-driven?


This chapter covers

  • Using event-driven architectures in front-end and back-end systems
  • Relating event-driven architectures to reactive programming
  • Using an event-driven approach to implement microservices
  • Managing scalability, availability, and resilience
  • Estimating costs and using that information to design a business model

In the previous chapter, you completed building a media-sharing application integrated with an authentication service to recognize users. In this chapter, we’ll dive more deeply into the implications of what event-driven means and how to use multiple functions together to build an application.

Different architectural styles are also covered in this chapter. We’ll compare the solution we’re building using AWS Lambda with patterns that have evolved over the years to improve the scalability, security, and manageability of distributed applications, such as reactive programming and microservices.

Before the internet forced us to think about what scalability really means, it was common practice to recommend avoiding distributed systems: management was complex, and more expensive servers were the usual answer to any scalability issue. Nowadays, however, running applications concurrently on thousands of servers is relatively common for internet-facing companies, and designing your application for scalability should be among your main tenets.

You'll get more theory and less practice than in the previous chapters: the focus here is on the tools to design an application to be event-driven, and you'll put all this theory into practice soon after.


Tip

Even though the things you'll learn are focused on AWS Lambda, most of them apply to distributed systems in general and will be useful whenever you design a scalable and reliable application, regardless of the technology stack you use.


The functions that make up an application interact in different ways. Certain functions are called directly by end users from a browser or a mobile application. Other functions subscribe to receive events from resources used by the application, such as a file store or a database. They’re activated by changes in those resources; for example, when a file has been uploaded or a database has been updated. If you look at the overall flow of the application, however, all the logic is driven by events.


12.1. Overview of event-driven architectures

Event-driven applications react to internal and external events, without a centralized workflow to coordinate processing of the resources. Those events are signals that can come from any source: human input, sensors, other applications, timers, or any activity on a resource used by the application. Those signals can bring data with them; for example, a selection made by a user or a change to a resource.

An important aspect of being event-driven is that the application doesn’t control or enforce the sequence of events that are processed. Instead, the overall execution flow follows the events that are received and triggers activities, eventually generating other events that can trigger other activities. This is in contrast to normal procedural programming, where a main plan schedules different activities to fulfill a final goal; for example, via a centralized workflow.

The event-driven approach gives certain design advantages that you can see immediately:

  • It decouples the sender of the event from the receiver.
  • You can have multiple receivers for a single event, and you can add or remove a receiver without affecting the others.
  • The flow of the application can be changed by modifying how activities react to events—for example, enabling or disabling a specific subscription, without touching the code inside the activities (functions, in the case of AWS Lambda).
  • Data is shared among activities via events or external repositories (such as databases) and doesn’t have a requirement for multiple activities sharing the same execution environment. This allows you to distribute the execution of those activities, and hence the application, in multiple physical servers for resilience and scalability.
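The decoupling described in this list can be sketched with a minimal in-memory event bus. This is a toy stand-in for the event routing that AWS Lambda and its event sources provide, and the names (`EventBus`, `picture-uploaded`) are illustrative assumptions, not part of any AWS API:

```python
from collections import defaultdict

class EventBus:
    """Toy event bus: senders publish events by name, receivers subscribe."""
    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, event_name, handler):
        self.subscribers[event_name].append(handler)

    def publish(self, event_name, payload):
        # The sender doesn't know who (if anyone) receives the event.
        for handler in self.subscribers[event_name]:
            handler(payload)

bus = EventBus()
thumbnails = []
index = []

# Two independent receivers for the same event: adding or removing
# one doesn't affect the other, and the sender is unchanged.
bus.subscribe("picture-uploaded", lambda e: thumbnails.append(e["key"]))
bus.subscribe("picture-uploaded", lambda e: index.append(e["key"]))

bus.publish("picture-uploaded", {"key": "photos/sunset.jpg"})
```

Note how the flow of the application can be changed by adding or removing a subscription, without touching the code of the publisher or of the other receivers.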

Think of an application that solves a large-scale problem. It may be an e-commerce website, an online game, or an application that analyzes genetic data; it doesn't matter. With an event-driven architecture, you implement an application whose software components have simple, local visibility into

  • What they need to know (the events they can receive)
  • What they need to do (for example, work on resources, update files, or write to a database)
  • What new events they need to publish

This approach forces you to decompose a large-scale application into smaller components, each of which works on a smaller problem. Event-driven architectural patterns have been used for years by the telecom industry to build highly available and possibly self-healing systems to power communications networks. You can find the legacy of those patterns in programming languages like Erlang, originally developed by Ericsson, and toolkits and runtimes like Akka, via the actor model.


The actor model

The actor model, first discussed in computer science in 1973, uses actors as the main entities of computation: everything is an actor, and in response to a message that it receives, an actor can (concurrently) take local decisions, create other actors, send messages to other actors, and decide how to deal with future messages. For more information, see the following resources:

“A Universal Modular ACTOR Formalism for Artificial Intelligence,” by Carl Hewitt, Peter Bishop, and Richard Steiger (1973), http://dl.acm.org/citation.cfm?id=1624775.1624804.

“Foundations of Actor Semantics,” Mathematics Doctoral Dissertation by William Clinger (1981), http://hdl.handle.net/1721.1/6935.
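The message-passing core of the actor model can be sketched in a few lines. Real actor runtimes such as Erlang or Akka run each actor concurrently and add supervision and fault tolerance; this single-threaded sketch only shows how an actor reacts to one message at a time from its mailbox:

```python
import queue

class Actor:
    """Minimal actor: a mailbox plus a behavior that reacts to messages."""
    def __init__(self, behavior):
        self.mailbox = queue.Queue()
        self.behavior = behavior

    def send(self, message):
        self.mailbox.put(message)

    def process_one(self):
        # In response to a message, an actor can take local decisions,
        # send messages to other actors, or create new actors.
        self.behavior(self, self.mailbox.get())

log = []
logger = Actor(lambda self, msg: log.append(msg))
doubler = Actor(lambda self, msg: logger.send(msg * 2))

doubler.send(21)
doubler.process_one()   # the doubler sends the result to the logger...
logger.process_one()    # ...which records it
```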



12.2. Starting from the front end

When I think of event-driven programming, the first thing that comes to mind is usually a user interface (UI) where you click a button and something happens. To do that using the programming language of your choice, you link a function (or method) to the event you want to catch; for example, “Do this sequence of actions when the user clicks the button.”

This makes sense because you don’t know when a user will interact with a UI or what the user will do. You need a way to react and do something when that interaction occurs.

Let’s imagine you want to track the different people interacting with your website, giving them the ability to create new users within your application. A basic UI to create a new user would be similar to figure 12.1.

Figure 12.1. Sample UI to create new users: when users interact with UI components, they trigger actions. For example, the syntax used to write “Name” and “Email” is checked every time a character is written or changed in the text boxes, disabling the “Submit” button if the syntax is not valid, and a new user is created when the “Submit” button is enabled and pressed.

To implement the UI, you link actions to possible interactions with the elements that build up the UI. For example,

  • Whenever a character is written or changed in the Name text box, you check if the syntax is a valid name according to your syntax (only letters and spaces, no other characters). Additionally, you may capitalize the names as they’re written.
  • Whenever a character is written or changed in the Email text box, you check if the syntax is a valid email address (“something@some.domain”). Furthermore, you can check whether the domain used in the email address is a valid one.
  • If one of the previous checks fails, the Submit button is disabled and a warning is displayed to help the user fix what is not right; for example, “Names can contain only letters and spaces.”
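The three checks above can be sketched as event handlers linked to the text boxes. The regular expressions and the warning message are illustrative assumptions chosen to match the rules in the list, not something prescribed by the chapter:

```python
import re

# Illustrative validation rules: names are letters and spaces only;
# emails look like "something@some.domain".
NAME_RE = re.compile(r"^[A-Za-z ]+$")
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

state = {"name": "", "email": "", "submit_enabled": False, "warning": ""}

def refresh_submit():
    # If one of the checks fails, the Submit button is disabled
    # and a warning helps the user fix what is not right.
    ok = bool(NAME_RE.match(state["name"])) and bool(EMAIL_RE.match(state["email"]))
    state["submit_enabled"] = ok
    state["warning"] = "" if ok else "Please check the Name and Email fields."

# Handlers triggered whenever a character is written or changed in a text box
def on_name_changed(text):
    state["name"] = text.title()   # capitalize names as they're written
    refresh_submit()

def on_email_changed(text):
    state["email"] = text
    refresh_submit()

on_name_changed("john doe")
on_email_changed("john@example.com")
```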

In object-oriented languages such as C++ or Java, a UI is usually implemented using the observer pattern. The observer is an object where you register the target object to observe and the action (a method) to execute when something happens, such as a user clicking a button or selecting an option from a drop-down menu (figure 12.2).

Figure 12.2. In object-oriented languages, the observer pattern is commonly used in a user interface to decouple target UI elements from actions triggered by specific interactions (for example, the characters inside of a text box have changed or a button is pressed).
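A minimal sketch of the observer pattern in figure 12.2 looks like this (the class and method names are illustrative; real UI toolkits each have their own registration API):

```python
class Observable:
    """Target object to observe; for example, a button in the UI."""
    def __init__(self):
        self.observers = []

    def register(self, action):
        # Register the action (a callable) to execute when something happens.
        self.observers.append(action)

    def notify(self, event):
        # Decouples the UI element from the actions it triggers:
        # the button doesn't know what its observers do.
        for action in self.observers:
            action(event)

clicks = []
button = Observable()
button.register(lambda event: clicks.append(event))

button.notify("clicked")   # for example, the user clicks the button
```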

Note

In a practical implementation of the observer pattern, an event loop is used to process observer events. The event loop is typically single-threaded and should only be used to trigger actions running in other threads, because if the event loop is too busy, newer events need to be queued, slowing down the speed of user interactions with the UI. That should always be avoided. The good news is that with AWS Lambda, events are managed by the platform itself in a scalable way, so you don’t need to think about the event loop, as you’ll see in the next section.



12.3. What about the back end?

In the back end of an application, you put all the logic that can’t be safely implemented on the client, either because part of the data must be shared with other clients or for security reasons (because the client can’t be trusted to make certain decisions).

Keep in mind these important considerations when developing a back end. First, if you design an application with a procedural approach, when a request from a client arrives in the back end you have to implement a detailed workflow of activities that should be executed to follow the required logic, doing all the necessary data manipulations and checks. This workflow grows in complexity every time the application must be updated to add features, or sometimes even to solve a bug.

Second, as the number of users or interactions grows, you’ll need to scale the back end of the application, and you can’t assume that it will always run on a single server; eventually you’ll need to distribute it on multiple systems.

And third, it’s common in a back end to have transactions involving multiple data sources that should be changed synchronously (commit) or not changed at all (roll back). If data isn’t locally centralized, but distributed in different repositories, things get far more complex: distributed transactions are slow and complex to manage.

Distributed systems should be designed in a way that doesn’t require synchronous access to data but uses eventual consistency: we shouldn’t expect data to always be in the same state if it’s stored in different repositories. Assuming that your application has synchronous access to data or strong consistency seems safe and practical at first, when you need to design a solution. But such a solution is difficult to implement, manage, and scale in practice.


The CAP theorem

To better understand the complexity of architecting distributed systems, I suggest that you have a look at the CAP theorem, also known as Brewer’s theorem. According to this computer science theorem, it’s impossible for a distributed computer system to simultaneously provide all three of the following guarantees (whose initials give the theorem its name):

1.  Consistency of data across different nodes.

2.  Availability of the system to incoming requests, which should always get a response.

3.  Partition tolerance (if nodes get disconnected from each other—for example, because of network issues—the system should continue to work).

More information on the CAP theorem and its implications can be found in “Brewer’s Conjecture and the Feasibility of Consistent Available Partition-Tolerant Web Services,” by Seth Gilbert and Nancy Lynch (2002), http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.20.1495.


An ideal approach is to have an application distributed in both space (in different environments) and time (data is sent and updated asynchronously and converges only at a point in time, with eventual consistency).[1] That means that all elements of the architecture should communicate only asynchronously with a predefined interface (a “contract”).

1 See the talk by Jonas Bonér, creator of the Akka toolkit and runtime, titled “Without Resilience, Nothing Else Matters,” for more on this idea.

It’s easy to see that the event-driven architecture I previously introduced takes that approach:

  • The execution of each action is independent from other actions, and actions can be executed on different systems.
  • Data is exchanged via events, and if multiple actions are accessing the same data, it’s the resource containing the data that triggers (via events) all relevant actions when that data is changed.
  • Each action knows (only) its input events, the resources it can change, and which events it should eventually trigger. (I’m not considering events triggered by resources manipulated by the action here.)

Let’s see in more detail about how to implement such an architecture using AWS Lambda and the possible interactions that you can use.

One way of getting events is from a UI, or in a more general case, from a client application. Let’s call those custom events to distinguish them from events coming from subscriptions to other resources. Those direct invocations may expect a response (bringing a value back) and are synchronous calls (figure 12.3).

Figure 12.3. Interaction model for a synchronous call from a client application. The function can read or write in some resources (files, databases) and returns a response. This model will be extended further in this chapter to cover other interactions.

You may also have custom events that trigger asynchronous calls that don’t return a value but change something in the state of the system; for example, in the resources (files, databases) used by the application. The main difference here is that the client application doesn’t need to know when the actions implied by the call are completed; it only needs to know that the request to initiate those actions has been received correctly and the back end will do whatever it can to respect that (figure 12.4).

Figure 12.4. Adding asynchronous calls to the previous interaction model. Asynchronous calls don’t return a response and the client doesn’t need to wait for the function to end.
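With AWS Lambda, this distinction maps to the `InvocationType` parameter of the Invoke API: `RequestResponse` for a synchronous call that returns a value, `Event` for an asynchronous call that only acknowledges the request. The semantics can be simulated locally with a sketch like the following (the function names and payloads are hypothetical):

```python
import threading
import time

results = {"thumbnail_done": False}

def create_thumbnail(event):
    """A slow back-end action: the client doesn't need to wait for it."""
    time.sleep(0.05)              # pretend to do heavy image work
    results["thumbnail_done"] = True

def check_user(event):
    """A fast action invoked synchronously: the client needs the answer."""
    return {"valid": "@" in event["email"]}

def invoke_sync(function, event):
    # Synchronous call: the client blocks and gets the return value back.
    return function(event)

def invoke_async(function, event):
    # Asynchronous call: the client only learns that the request was
    # accepted; the work completes later, in the background.
    worker = threading.Thread(target=function, args=(event,))
    worker.start()
    return {"status": "accepted"}, worker

response = invoke_sync(check_user, {"email": "danilo@example.com"})
ack, worker = invoke_async(create_thumbnail, {"key": "photos/sunset.jpg"})
worker.join()   # in real life the client wouldn't wait; we join only to test
```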

The resources themselves, when changed, can trigger their own events. If a new picture has been uploaded, you may want to build a thumbnail to render the picture in an index page, or index the picture metadata in a database. If a new user has been created in a database, you may want to send an email to verify that the provided email address is correct and the user can receive emails at that address (figure 12.5).

Figure 12.5. Adding events generated by resources to the previous interaction model. If a function created a new file or updated a database, you can subscribe other functions to that event. Those functions will be called asynchronously with an event describing what happened to the resource as input.

Inside the back end, functions can also call other functions, but with AWS Lambda you’d probably avoid calling functions synchronously from another function, because you’d pay for the elapsed time twice: once for the function doing the synchronous call (which is blocked, waiting until the call returns) and once for the function that has been called. With few exceptions, functions are called asynchronously by other functions (figure 12.6).

Figure 12.6. Adding functions called asynchronously by other functions to the previous interaction model. In this way you can reuse a function multiple times and for different purposes. The same function can be called directly by a client or by another function.

Tip

A common pattern is to use a first function as a router to call multiple functions asynchronously and have them work in parallel to fulfill a task. If your workload can be split into chunks, you can then have multiple functions running concurrently, each function focusing on a single chunk and writing the output to a centralized repository where results can be collected.
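The router pattern from the tip can be sketched as follows. Here threads stand in for concurrent Lambda executions, and a plain list stands in for the centralized repository where results are collected; in a real deployment each chunk would be a separate asynchronous invocation:

```python
from concurrent.futures import ThreadPoolExecutor

def worker(chunk):
    """Each concurrent function focuses on a single chunk of the workload."""
    return sum(chunk)

def router(workload, chunk_size):
    # The first function splits the workload into chunks and invokes the
    # workers, which run in parallel; their outputs land in a centralized
    # repository (here, the returned list) where results can be collected.
    chunks = [workload[i:i + chunk_size]
              for i in range(0, len(workload), chunk_size)]
    with ThreadPoolExecutor() as pool:
        return list(pool.map(worker, chunks))

partial_sums = router(list(range(10)), chunk_size=5)
total = sum(partial_sums)
```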


For certain resources, it’s possible to have direct interactions from the client (figure 12.7). For example, a file repository such as Amazon S3, a NoSQL database such as Amazon DynamoDB, or a streaming service such as Amazon Kinesis[2] can be securely used by a client similarly to an AWS Lambda invocation: they’re all using the same security framework implemented by AWS Identity and Access Management (IAM) and can be protected by temporary credentials distributed by Amazon Cognito.

2 I don’t cover Amazon Kinesis in this book, but if you’re interested in real-time analytics, or a platform to load and analyze streaming data, using it can save you time. For more information, see https://aws.amazon.com/kinesis.

Figure 12.7. Clients can directly access a resource, completing the previous interaction model. For example, a client can upload a file (such as a picture) or write something in a database. This event can trigger functions that can analyze what has happened and do something with the new or updated content; for example, render a thumbnail when a high-resolution picture is uploaded or update a file based on the new content of a database.

The interaction diagram in figure 12.7 shows how an event-driven application can receive events from different sources and how those interactions relate to each other.

Using those interactions, you can follow best practices for architecting and developing distributed systems, such as reactive programming and microservices. For example, you can design a media-sharing application using an event-driven architecture with AWS Lambda, as described in figure 12.8. Remember that the client application can run on any device, such as a smartphone or a web browser.

Figure 12.8. Sample media-sharing application with an event-driven design built using AWS Lambda. Certain functions are directly called by the client; other functions are subscribed to events from back-end resources, such as file shares or databases.

12.4. Reactive programming

If you don’t want to put a hard limit on the number of users or interactions your application will be capable of handling in production, you have to design your application to be distributed in multiple environments. Distributed applications are inevitable if you need scalability, but they’re still difficult to design, manage, and scale.

Sometimes, to speed up development, small teams and startups create a quick prototype of an application that isn’t designed for scaling but is still shared with users across the internet. That may seem like the right approach while the prototype is under heavy development and many updates must be implemented rapidly, but the paradox is that if the idea being tested with the prototype works, many users could come all at once to try the new application, perhaps following a review on a popular website or a positive comment shared virally on a social network. When users try a new application, it’s a unique chance to be appreciated and gain their trust. If the application can’t scale to support so many users and slows down or stops working, you’ll probably lose those users for good. Wouldn’t it be better if the prototype were already capable of scaling?


Tip

My advice is to always consider scalability when you develop an application, even if at the beginning it seems out of context. There can be exceptions, such as management applications that are designed to have a few users, but often your user base can be difficult to estimate or can change often due to daily/weekly/monthly cycles.


You can follow different architectural approaches to design an application that can be scaled easily. One of the more interesting approaches is reactive programming. With reactive programming, you program your system in a way similar to a spreadsheet: logic is built around data and the propagation of changes in the data flow.

If you think it through carefully, the same syntax has different meanings, depending on whether you interpret it procedurally or in a reactive (event-driven) context. Consider, for example, the formula for the area of a rectangle:

area = length x width

In procedural programming (including functional nonreactive programming), this syntax represents a function that takes inputs (length, width) and returns a value (the area) synchronously. In reactive programming, this syntax represents a rule that binds data values together: if one of the input values changes (length or width), then the dependent data (the area, in this case) is automatically updated without an explicit request for a new area. Can you see the difference between the procedural and reactive approaches, even if the syntax seems to be the same?
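The reactive reading of the formula can be sketched with a spreadsheet-style "cell" that propagates changes to its dependents. The `Cell` class is an illustrative assumption, not an API from any reactive library:

```python
class Cell:
    """A value that notifies its dependents when it changes."""
    def __init__(self, value):
        self._value = value
        self.dependents = []

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, new_value):
        self._value = new_value
        for recompute in self.dependents:
            recompute()   # propagate the change through the data flow

length = Cell(4)
width = Cell(5)
area = Cell(length.value * width.value)

def update_area():
    # The rule binding the data values together: area = length x width
    area.value = length.value * width.value

length.dependents.append(update_area)
width.dependents.append(update_area)

length.value = 10   # area is updated without an explicit request
```

After the last line, `area` holds the new value even though no code asked for it to be recomputed, which is exactly the difference from the procedural reading of the same formula.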

The reactive approach is similar to event-driven programming, where you use subscriptions to events to trigger actions that force the update of dependent data—for example, computing the new area of a rectangle if the length or width is updated in a repository. The main difference is that with reactive programming you bind values together, usually through functions, while with event-driven programming you focus on the messages that are exchanged (events) and the actions that those messages trigger (subscriptions).


Tip

To simplify the analysis required to design an event-driven application, I suggest you start with a reactive approach (in a manner similar to data binding for a UI) and then map the result into events and actions.


You have different ways to implement a similar approach and build software that’s robust, resilient, flexible, and ready to handle an “unpredictable” workload. I found a good formalization in the Reactive Manifesto, which you can find online at http://www.reactivemanifesto.org.

According to the Reactive Manifesto, a reactive system is a distributed, loosely coupled, and scalable solution that is tolerant of internal failures (figure 12.9). That is, a reactive system meets the following criteria:

  • Responsive: The system provides a response within an acceptable and consistent time, which makes it more usable and maintainable.
  • Resilient: In case of a failure, the system can still provide a response.
  • Elastic: The system can grow or shrink the resources it uses depending on the actual workload, avoiding bottlenecks that could compromise this capacity.
  • Message-driven: Components of the system interact via asynchronous, nonblocking communications.
Figure 12.9. The four main characteristics of reactive systems, courtesy of the Reactive Manifesto

Let’s look again at the CAP theorem. According to the CAP theorem, it’s impossible for a distributed computer system to simultaneously provide all three of the following guarantees:

1.  Consistency of data across different nodes.

2.  Availability of the system to incoming requests, which should always get a response.

3.  Partition tolerance (if nodes get disconnected from each other—for example, because of network issues—the system should continue to work).

How do you think the four characteristics in the Reactive Manifesto affect the components of a distributed architecture following the CAP theorem? What would you need to change in a traditional server-based implementation?

I think the most important takeaway from the Reactive Manifesto is the message-driven approach in communication, which implicitly removes the need for data consistency (the “C” in the CAP theorem): if interactions are asynchronous, then multiple interactions don’t happen at the same time, and the data doesn’t have to be consistent at a point in time. The right way to look at this is that you’re sharing immutable data, avoiding the risk of contention among different interactions.


Note

I suggest that you read the whole Reactive Manifesto online (http://www.reactivemanifesto.org) and try to evaluate how those characteristics can be applied to your application and the concepts you’re learning in this book.



12.5. The path to microservices

Microservices have no official definition, but the general consensus is that they follow an architectural style where applications are decomposed into small, independently deployable services, with several common characteristics:

  • Each service should be built around a business domain and not a technical one; this ensures, among other things, that the boundary around services will last if technologies evolve or are changed.
  • Services should be loosely coupled, so that changes within one service shouldn’t affect others.
  • A service should work within a “bounded context” as part of the entire business domain to simplify modeling of communications among services.

Note

Introducing and using the concept of a “bounded context” is part of domain-driven design and goes beyond the scope of this book. I suggest you start with Martin Fowler’s description and suggested readings, which can be found at http://martinfowler.com/bliki/BoundedContext.html.


DevOps and microservices

I find it interesting that the core characteristic for microservices of being independently deployable is an operational requirement. This is clear feedback provided by operations to development, thanks to the adoption of a DevOps culture within companies that pioneered microservice architectures.

Similar to microservices, DevOps has no official definition, but generally speaking, the goal of DevOps is to foster communication and collaboration between development, operations, and other IT-related roles within a company.


How small is “small” for microservices? No specific metric exists, but a good starting point is that you can build or rebuild a service in less than two weeks. In general, I’d say that it should be small enough that you can easily manage to completely rewrite a microservice within your deployment schedule.

This capacity to rebuild a service has important consequences: if a new requirement arises and it’s too hard to add to the current implementation of a service, you can create a new service that implements the new requirement along with all the old ones. In creating this new service, you can decide to use a different technology; for example, migrating from Java to Scala or from Ruby to Python.


Tip

Besides helping developers always use the best technology for a specific purpose, the freedom to choose a new or different technology stack also improves morale and hiring: developers know they can choose a new programming environment when it makes sense, and won’t be forced to work on the same old stack for the rest of their (working) lives.


A possible downside of this freedom is that developers may be tempted to follow trends and choose technologies only because they’re “cool.” A service can have a long life span, and using a stable technology helps. If the technology used for a service loses traction and support, you can still rebuild the service using another technology in less than two weeks (according to our definition of microservices), but if you’ve used that technology in more than one service, you’ll have to spend more time rebuilding multiple services, which doesn’t add value for the end users of your application.


Note

For a broader description of what microservices are and how to use them for your particular use case, I suggest you start with Martin Fowler’s resource guide at http://martinfowler.com/microservices/.


If you recall how to design an event-driven application, and how AWS Lambda works by decomposing your application into small functions that can interact only via events, you’ll see that the approach described in this book puts you on the right path to implement microservices—but you still have great responsibility in the implementation.

AWS Lambda provides a framework to build small, mostly asynchronous services with a clean interface and covers some of the main complexities of managing microservice architectures in production, such as

  • Centralized logging, via Amazon CloudWatch Logs
  • Service discovery, via the AWS Lambda API

It’s your responsibility to use those features to your advantage while you build the overall application. For example:

  • To simplify debugging microservices, centralized logging needs a traceable identifier (a correlation ID) that follows a single request across all the interacting services. That isn’t part of what AWS Lambda and Amazon CloudWatch Logs provide, so you need to design for it yourself.
  • To automate service discovery, you should use a standard syntax in the function descriptions that you get via the AWS Lambda API.
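For example, a traceable identification can be implemented by assigning an ID at the edge and passing it along inside every event, so that all services log it. This is a local sketch under those assumptions; the list stands in for a centralized log store such as CloudWatch Logs, and the field names are hypothetical:

```python
import json
import uuid

log_lines = []   # stand-in for a centralized log store

def log(correlation_id, service, message):
    # Every service logs the same correlation ID, so a single request
    # can later be traced across all the interacting services.
    log_lines.append(json.dumps(
        {"correlation_id": correlation_id, "service": service, "msg": message}))

def front_function(event):
    # Assign an ID at the edge (or reuse one passed in by the caller).
    correlation_id = event.get("correlation_id") or str(uuid.uuid4())
    log(correlation_id, "front", "request received")
    back_function({"correlation_id": correlation_id, "data": event["data"]})

def back_function(event):
    # Downstream services propagate the ID instead of creating a new one.
    log(event["correlation_id"], "back", "processing " + event["data"])

front_function({"data": "photo.jpg"})

ids = {json.loads(line)["correlation_id"] for line in log_lines}
```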

Another main point that I think is relevant in understanding how AWS Lambda supports a microservice architecture—and one that has been a source of endless discussion in distributed architectures—is whether it’s better to favor choreography or orchestration of services.

To better clarify that, let’s continue the parallel with the artistic scenario that the two terms involve, using definitions from the Merriam-Webster English dictionary.


Definitions from the Merriam-Webster dictionary

Choreography: The art or job of deciding how dancers will move in a performance; also, the movements that are done by dancers in a performance.

Orchestration: The arrangement of a musical composition for performance by an orchestra.


Let’s try to adapt those definitions to IT architectures. With orchestration, you have an automated execution of a workflow, and an orchestration engine to execute that workflow and manage all interactions. With choreography, you describe the coordinated interactions between a small subset (usually two) of the elements that are interacting.

If you’re familiar with the enterprise deployment of a service-oriented architecture (SOA), it’s easy to see the similarity between the orchestration engine and the extended role that’s given to the enterprise message bus, for example in routing, filtering, or translating messages depending on a centralized logic. With microservices, messaging platforms shouldn’t have an active role and the logic should be kept within the services’ boundary.

With an event-driven architecture you’re describing the choreography among services, without a centralized workflow that needs to be aware of all aspects of the interactions and that scales in complexity as the number of services (and interactions) increases. As each service probably has more than one interaction with other services, the growth in complexity of a centralized workflow is far more than linear and is difficult to manage in a large-scale deployment.


12.6. Scalability of the platform

Scalability is one of the core aspects of IT architectures. Let’s start with a definition of scalability in the context of IT systems.


Definition

Scalability is the capability of a system, network, or process to handle a growing amount of work, or its potential to be enlarged to accommodate that growth. From “Characteristics of Scalability and Their Impact on Performance,” by André Benjamin Bondi (2000), http://dl.acm.org/citation.cfm?doid=350391.350432.


In an event-driven application, scalability is driven by the total concurrent executions across all functions. The number of concurrent executions depends on the number of events coming in and the duration of the functions triggered by those events, according to the following formula:

concurrent executions = (number of events per second) x (average duration of the triggered functions)

For example, let’s consider a scenario with multiple interactions with AWS Lambda, some coming directly from users (custom events) and some coming from subscriptions to resources:

  • One thousand users are interacting every second via a client application to check if there’s a relevant picture close to where they are, using a Lambda function that on average takes 0.2 seconds to execute; this brings 1,000 events x 0.2s = 200 concurrent executions.
  • Ten users per second upload a picture to Amazon S3. An AWS Lambda function subscribed to that event is triggered on upload; it builds a thumbnail of the picture and extracts the metadata, inserting the metadata into an Amazon DynamoDB table. This function takes on average 2 seconds to complete (let’s imagine those are high-resolution pictures); this brings 10 events x 2s = 20 concurrent executions.
  • Another Lambda function is subscribed to the DynamoDB table; it receives all those events and updates an index of the user pictures. This function takes on average 3 seconds to complete (maybe you can tune it, but for the sake of simplicity let’s use this duration as an average); this brings 10 events x 3s = 30 concurrent executions.
  • In total for this scenario, 200 + 20 + 30 = 250 concurrent executions.
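The arithmetic in the scenario above can be sketched as a short calculation, using the formula given earlier (the rates and durations are the ones from the example):

```python
def concurrent_executions(events_per_second, avg_duration_s):
    # concurrency = event rate x average function duration
    return events_per_second * avg_duration_s

total = (concurrent_executions(1000, 0.2)  # location checks
         + concurrent_executions(10, 2)    # thumbnail builders
         + concurrent_executions(10, 3))   # index updates

print(total)  # 250.0
```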

With AWS Lambda you don’t need to manage scalability and concurrency because the service is designed to run many instances of your functions in parallel. Of course, you have to take care of the scalability of the resources used by Lambda functions; for example, if you have multiple concurrent executions of a function reading or writing to a database, you need to be sure that the database is capable of sustaining the workload.

However, a default safety throttle of 100 concurrent executions per account per region limits the impact of errors or recursive functions. If this becomes a scalability limit for your application, you can request a higher limit at no cost by opening a case for a service limit increase in the AWS Support Center.


Note

The safety throttle on concurrent executions is a cumulative limit across all the AWS Lambda functions you have within an account and region.


When an account goes beyond the safety throttle, further function executions are throttled. You can monitor this behavior in the corresponding Amazon CloudWatch metric, available in the Monitoring tab of the AWS Lambda web console.

When throttled, Lambda functions that are invoked synchronously return an HTTP error code 429, for “Too Many Requests.” Error code 429 is automatically managed by AWS SDKs, which will retry multiple times with an exponential back-off.
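The SDK behavior can be approximated with a simple sketch. Here `ThrottledError` is a hypothetical stand-in for the SDK exception raised on HTTP 429, not a real AWS SDK class:

```python
import random
import time

class ThrottledError(Exception):
    """Hypothetical stand-in for the SDK's 429 'Too Many Requests' error."""

def invoke_with_backoff(invoke, max_attempts=5, base_delay=0.01):
    """Retry a throttled invocation with exponential back-off and jitter,
    roughly as the AWS SDKs do automatically."""
    for attempt in range(max_attempts):
        try:
            return invoke()
        except ThrottledError:
            # Wait base_delay * 2^attempt, plus a little random jitter
            time.sleep(base_delay * 2 ** attempt + random.random() * base_delay)
    raise RuntimeError("still throttled after {} attempts".format(max_attempts))
```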

Lambda functions that are invoked asynchronously, when throttled, are automatically retried for 15–30 minutes. If you had a spike of traffic coming to your back end, that period should be enough to absorb the burst and execute the functions. After 15–30 minutes this retry period ends and all incoming events are rejected as throttled.

If the Lambda function is subscribed to events generated by other AWS services, those events are retained and retried. The retry period is usually 24 hours, but you should check the documentation of AWS Lambda and the relevant AWS service for further details.


12.7. Availability and resilience

Together with scalability, availability defines how and when an IT system can be used in production. A definition of availability will help you to understand what will be discussed in this section.


Definition

Availability is the proportion of time a system is in a functioning condition.


Finding a definition of resilience to use in an IT context is not as easy, because the term is usually discussed in the context of biology or psychology, but the following captures the general consensus.


Definition

Resilience is the capacity of adapting to adversity.


According to those definitions, availability is the metric to measure the probability of finding a specific system available and responding. Resilience is the capacity of such a system to automatically recover (self-heal) from issues that could compromise its ability to respond. In large-scale deployments, where hardware and software components are put into place, failures will happen. We want systems that are resilient to improve availability.

AWS Lambda is designed to use multiple features, such as replication and redundancy at the hardware and software level, to provide high availability for both the service itself and the functions it manages. AWS Lambda has no maintenance windows or scheduled downtimes.

Still, a Lambda function can fail because the internal logic terminates with an error; for example, using context.fail() on the Node.js runtime, or raising an exception on the Python runtime.

On failure, synchronous functions respond with an exception. Asynchronous functions are retried at least three times, after which the event may be rejected. Events from AWS services, such as Amazon Kinesis streams and Amazon DynamoDB streams, are retried until the Lambda function succeeds or the data expires, usually after 24 hours.

As discussed in section 12.4, asynchronous message passing is a better way to communicate among different components of your back end (functions, in the case of AWS Lambda) and should be your preferred choice whenever possible. Sometimes you have to change part of the internal logic of your application to accommodate asynchronous communications.

12.8. Estimating costs

Costs are an important part of cloud computing services. Costs, together with the technical specification of the service, define when and how a service can be used, and what the possible use cases are. A lower cost enables new use cases that wouldn’t make sense if they were more expensive to build.

With AWS Lambda you pay monthly for

  • Requests across all functions, including test invocations from the web console
  • Duration, with each function execution rounded up to the nearest 100 ms and priced depending on how much memory you configured for the function

The duration costs depend linearly on the memory configured for the function. If you double (or halve) the memory you configure, but keep the same duration, you also double (or halve) the duration costs.

When you give more (or less) memory to a function, you also allocate proportional CPU power and other resources that the function can use during the execution. Hence, giving more memory can also (depending on the function’s CPU and I/O usage) speed up the execution of a function.


Note

Cost information provided hereafter in this section is up to date at the time of writing of this book. Even if certain numbers have now changed, understanding the cost model and how that applies to your application will be useful when you plan to use AWS Lambda. For updated information on AWS Lambda pricing and the AWS Free Tier, see https://aws.amazon.com/lambda/pricing/ and http://aws.amazon.com/free/.


You start paying only after you exceed the AWS Free Tier, available for all AWS accounts. Unlike that of other AWS services, the Lambda free tier doesn’t expire after 12 months and is available to all AWS customers indefinitely.

The Lambda free tier allows you to learn, test, and scale a prototype at no charge for

  • The first 1 million requests per month
  • The first 400,000 GB-seconds of compute time per month

In the AWS Lambda free tier, 400,000 GB-seconds is to be interpreted as the sum of the duration (rounded up to the nearest 100 ms) of all function executions within an account, if the functions are configured with 1 GB of memory. If you configure less memory, you get more execution time at no charge. For example, if you configure 128 MB of memory (that is, 1/8 of 1 GB), you get 8 x 400,000 seconds = 3.2 million seconds of execution time.
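The relationship between configured memory and free execution time can be expressed directly, using the free-tier figure quoted above:

```python
FREE_TIER_GB_SECONDS = 400_000

def free_execution_seconds(memory_mb):
    # Less memory per invocation means more free seconds of execution
    return FREE_TIER_GB_SECONDS / (memory_mb / 1024)

print(free_execution_seconds(1024))  # 400000.0 seconds at 1 GB
print(free_execution_seconds(128))   # 3200000.0 seconds at 128 MB
```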

Because every execution time is rounded up to the nearest 100 ms, functions that execute quickly (for example, in about 20 ms) can have a greater impact on costs than you might expect, compared with functions whose execution time is closer to 100 ms (for example, about 90 ms).


Tip

For cost optimization, it can sometimes be useful to group more functions into one if the duration of those functions is far less than the minimum of 100 ms. Conversely, having smaller functions can make development and updates easier. You should find your own balance between costs in production (including AWS Lambda) and in development (where time to market can be critical).


When you invoke a Lambda function from another function, there can be two use cases:

1.  The second function is invoked asynchronously, so the first function can end while the second is still executing, and the two costs are completely independent.

2.  The second function is invoked synchronously, so the first function is blocked and waits for the second function to terminate before continuing its own execution. In this case you pay twice for the duration of the second function: once for its own execution and once for the time the first function spends waiting. For this reason, it’s usually better to avoid invoking functions synchronously from other functions.
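To see why the synchronous case costs more, compare the billed duration in the two cases (a toy calculation, not an API call; the times are arbitrary examples):

```python
a_own_work_s = 0.5   # time function A spends on its own logic
b_duration_s = 1.0   # duration of function B

# Asynchronous: A ends right after triggering B, so each pays for itself
async_billed_s = a_own_work_s + b_duration_s                  # 1.5 s billed

# Synchronous: A is also billed while it waits for B to return
sync_billed_s = (a_own_work_s + b_duration_s) + b_duration_s  # 2.5 s billed

print(sync_billed_s - async_billed_s)  # 1.0 -- B's duration is paid twice
```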

To estimate the costs of an application, you have to estimate all events—both custom events due to direct invocations and those coming from subscriptions to resources—and the duration of the functions triggered by those events. You can use (multiple) test events in the web console to estimate the duration, or you can look at the duration metric recorded by Amazon CloudWatch in the Monitoring tab of the web console.

Estimating cost (and consumption) per user is the best way to understand your cost model, how many users you need to exceed the free tier, and how your costs are growing with your user base.

For example, consider the media-sharing application I mentioned in chapter 1. You’re going to build a similar application in the following chapters of this book. Suppose that after analyzing your first trial users, you measure that each user, on average, is doing 100 function invocations (requests) per month, directly from a mobile app or via subscriptions to the picture store and the database tables. On average, half of those functions are quick and take 30 ms with 128 MB of memory. The other half are slower (for example, when you need to build a thumbnail of a high-resolution picture) and last on average 1 second with 512 MB of memory.

Let’s compute how many GB-seconds each user is contributing:

  • For the “quick” functions, 50 x 100 ms (because you need to round up from 30 ms) x 128 MB = 5 s x 1/8 GB = 0.625 GB-seconds.
  • For the “slow” functions, 50 x 1 s x 512 MB = 50 s x 1/2 GB = 25 GB-seconds.

The overall contribution to duration costs for each user is 25.625 GB-seconds; as you’d expect, the “quick” functions are contributing far less than the “slow” ones.
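The per-user computation above can be written as a small function. This is a sketch that reproduces the 25.625 GB-seconds figure; memory is converted to GB by dividing by 1,024:

```python
import math

def gb_seconds(invocations, duration_ms, memory_mb):
    # Billed duration is rounded up to the nearest 100 ms
    billed_s = math.ceil(duration_ms / 100) * 0.1
    return invocations * billed_s * (memory_mb / 1024)

per_user = (gb_seconds(50, 30, 128)     # "quick" functions
            + gb_seconds(50, 1000, 512))  # "slow" functions

print(per_user)  # 25.625
```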

You can now build a simple cost model that tells you

  • When you’re going to exceed the free tier
  • How much you’d pay for AWS Lambda for 10, 100, 1,000, and so on, users

Warning

I’m not considering storage and database costs for now, but they’re easier to estimate, and Amazon S3 and Amazon DynamoDB both have a free tier.


You can see an example of that in table 12.1, based on costs at the time of writing of this book.

Table 12.1. AWS Lambda cost model for an application. Thanks to the free tier, you start incurring costs only when approaching 100,000 users. With this table, you can also estimate the average cost per user, a useful metric in defining and validating your business model.

Users       Requests      Duration     Requests    Duration    Request    Duration   Total
            (per month)   (GB-s)       to pay      to pay      costs ($)  costs ($)  costs ($)
1           100           25.63        0           0           0          0          0
10          1,000         256.25       0           0           0          0          0
100         10,000        2,562.50     0           0           0          0          0
1,000       100,000       25,625       0           0           0          0          0
10,000      1,000,000     256,250      0           0           0          0          0
100,000     10,000,000    2,562,500    9,000,000   2,162,500   1.80       36.05      37.85
1,000,000   100,000,000   25,625,000   99,000,000  25,225,000  19.80      420.50     440.30

As you can see from the table, thanks to the free tier there are no charges until you approach 100,000 users; after that, costs grow almost linearly with the user base.
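Table 12.1 can be reproduced with a short script. The pricing constants below ($0.20 per million requests and $0.00001667 per GB-second) are the published AWS Lambda prices at the time of writing and may since have changed:

```python
FREE_REQUESTS = 1_000_000
FREE_GB_SECONDS = 400_000
PRICE_PER_REQUEST = 0.20 / 1_000_000   # $0.20 per million requests
PRICE_PER_GB_SECOND = 0.00001667

def monthly_cost(users, requests_per_user=100, gbs_per_user=25.625):
    requests = users * requests_per_user
    duration = users * gbs_per_user
    # Only usage beyond the free tier is billed
    requests_to_pay = max(0, requests - FREE_REQUESTS)
    duration_to_pay = max(0, duration - FREE_GB_SECONDS)
    return round(requests_to_pay * PRICE_PER_REQUEST
                 + duration_to_pay * PRICE_PER_GB_SECOND, 2)

print(monthly_cost(10_000))     # 0.0 -- still inside the free tier
print(monthly_cost(100_000))    # 37.85
print(monthly_cost(1_000_000))  # 440.3
```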

From table 12.1 you can also estimate the average cost per user for your application. If different kinds of users (basic or advanced, for example) interact in different ways, bringing different costs to the platform, you may need to estimate their costs separately. This can be useful in designing your own business model and validating if (or when) it’s sustainable. Knowing the cost per user, you can find out, for example,

  • If a “freemium” pricing strategy, a popular approach for startups, would work for your application
  • If you should have different tiers for your users, with different pricing, depending on what they can do
  • If and when advertisements could pay for a significant part of your bill

Definition

Freemium is a business model in which a core product or service is provided free of charge to a large group of users, but money (premium) is charged to a smaller fraction of the user base for advanced features or virtual goods. For more information, see Freemium Economics by Eric Benjamin Seufert (Savvy Manager’s Guides, 2013).


Summary

In this chapter you learned the following:

  • How event-driven architectures work
  • How they are commonly used in the front end
  • The advantages of using the same approach in the back end of your application
  • How that relates to architectural best practices for distributed systems, such as reactive programming and microservices
  • What the advantages are for two fundamental characteristics of IT architectures, scalability, and availability
  • How to estimate AWS Lambda costs for an event-driven application and use that information to design your business model

In the next chapter you’ll move into the third part of this book, focusing on the tools and best practices that support the use of AWS Lambda from development to production.

Exercise

To test what you’ve learned in this chapter, try to answer these multiple-choice questions:

1. According to the Reactive Manifesto, it’s better for components of the system to interact

  a. Via synchronous communications, because that guarantees you get strong consistency in the answer
  b. Via asynchronous communication, so that components are loosely coupled and interactions are nonblocking
  c. The communication used by interactions isn’t important as long as the system remains responsive


2. Implementing an event-driven architecture, you’re favoring

  a. Choreography vs. orchestration, because you describe the relationship among resources
  b. Orchestration vs. choreography, because you have the automated execution of a workflow
  c. It depends on how you design the centralized workflow


3. To manage the scalability of functions executed by AWS Lambda

  a. You need to keep the number of events per second below the safety throttle of your account
  b. You need to keep the number of concurrent executions below the safety throttle for your account
  c. You need to keep the number of invocations per second below the safety throttle of your account


4. To estimate the AWS Lambda costs for your application

  a. You need to know how many functions you’re using and if they’re called synchronously or asynchronously
  b. You need to understand how many requests are made and the overall duration of function executions. The free tier can be neglected because it has no noticeable impact on the bill
  c. You need to understand how many requests are made and the overall duration of function executions, taking into consideration the free tier


Solution

1. b
2. a
3. b
4. c
