If you ask software developers what software architecture is you might get answers ranging from “it’s a blueprint or a plan” to “a conceptual model” to “the big picture.” This book is about an emerging architectural approach that has been adopted by developers and companies around the world to build their modern applications—serverless architectures.
Serverless architectures have been described as something of a “nirvana” among application architectural approaches. They promise developers the ability to iterate as fast as possible while maintaining business-critical latency, availability, security, and performance guarantees, with minimal effort on the developers’ part.
This book teaches you how to think about serverless systems that can scale and handle demanding computational requirements without having to provision or manage a single server. Importantly, this book describes techniques that can help developers quickly deliver products to market while maintaining a high level of quality and performance by using services and architectures offered by today’s cloud platforms.

Before going any further, we think it’s important to come to terms with the word serverless. There are various attempts at this already, including an official one from AWS (https://aws.amazon.com/serverless/) and a community favorite from Martin Fowler (https://martinfowler.com/articles/serverless.html). Here’s how we define it:
Definition
Serverless is the qualifier for software that is consumed as a utility service and incurs cost only when used.
Simple enough, right? But there’s a lot to unpack in that simple definition. Let’s dive into each of the two required criteria for calling something serverless:
- Consumed as a utility service—The “software as a service” consumption model is well understood. It means that anyone using the software uses a prescribed application programming interface (API) or web interface to use and customize the software, while staying within any published constraints for the software and usage policies for the API. Salesforce, Office 365, and Google Maps are well-known software packages delivered as a service. What’s key here is that the actual infrastructure (servers, networking, storage, etc.) hosting the software and powering the API is completely abstracted from you as the consumer; all that is visible (and all that matters) is what the API permits. A service also typically comes with accompanying availability, reliability, and performance guarantees from the service provider. A utility service, further, has the billing characteristics we’d expect from any utility computing offering; that is, you pay for usage, not for reservations, subscriptions, or provisioning. All existing public cloud offerings have some form of utility billing associated with them. For example, Amazon Elastic Compute Cloud (EC2) allows you to pay by the second to rent virtual machines.
- Incurs cost only when used—This means there’s zero cost for having the software deployed and ready to use. Think of this as the same cost model we expect from our public utilities, like electricity and water. You, as the consumer, pay per granular unit of usage if you use any, but you pay zero if you use nothing. This pure usage-based pricing is the criterion that distinguishes serverless offerings from the utility services that came before them.
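To make the pay-for-usage distinction concrete, here is a minimal cost sketch. The prices are made-up placeholders, not real provider rates: a utility-billed function costs nothing at zero usage, while an always-on server bills for every hour it exists.

```python
# Illustrative cost model: pay-per-use vs. always-on pricing.
# All prices here are invented placeholders, not real provider rates.

def usage_based_cost(invocations: int, gb_seconds_each: float,
                     price_per_gb_second: float = 0.0000167) -> float:
    """Cost is zero when usage is zero; it scales linearly with usage."""
    return invocations * gb_seconds_each * price_per_gb_second

def always_on_cost(hours: int, price_per_hour: float = 0.05) -> float:
    """Cost accrues for every hour the server exists, used or not."""
    return hours * price_per_hour

# With zero traffic, the serverless component costs nothing, while the
# provisioned server still bills for the full month (~730 hours).
idle_serverless = usage_based_cost(invocations=0, gb_seconds_each=0.25)
idle_server = always_on_cost(hours=730)
```

The asymmetry at zero usage is the whole point of the second criterion: the serverless line item disappears entirely when nothing runs.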
In the rest of the book, we will use the “serverless” qualifier only for software that fits these criteria. For example, software that requires you to provide a server to host a website (like the Apache web server) would not qualify because it does not meet the first criterion. Software that is available as a service but charges by subscription (like Salesforce) would not qualify either because it does not meet the second criterion. A serverless architecture, by extension, is one composed entirely of serverless components. But which components of an architecture need to be serverless for it to be called that? Let’s look at this next with an example.

Let’s take the example of a typical data-driven web application, not unlike the systems powering most of today’s web-enabled software. These typically consist of a backend (server) that accepts requests from a client and then processes the requests.
The backend server performs various forms of computation, and the frontend client provides an interface for users to operate via their browser, mobile, or desktop device. Data might travel through numerous application layers before being saved to a database. The backend then generates a response that could be in the form of JSON or in fully rendered markup, which is sent back to the client (figure 1.1). These kinds of applications are conventionally architected as tiers (a presentation tier that controls how the information is captured and provided to the user, an application tier that controls the business logic of the application, and a data tier with the database and corresponding access controls).
Figure 1.1 A basic request-response (client-server) message-exchange pattern that most developers are familiar with. There’s only one web server and one database in this figure. Most systems are much more complex.

Software architectures have evolved from the days of code running on a mainframe to a multitier architecture where the presentation, data, and application/logic tiers are traditionally separated. Within each tier, there may be multiple logical layers that deal with the particular aspects of functionality or domain. There are also cross-cutting components such as logging or exception handling systems that can span numerous layers. The preference for layering is understandable. Layering allows developers to decouple concerns and have more maintainable applications. Figure 1.2 shows an example of a tiered architecture with multiple layers including the API, the business logic, the user authentication component, and the database.
Figure 1.2 A typical three-tier application is usually made up of presentation, application, and data tiers. A tier can have multiple layers with specific responsibilities.

One blunt approach would be to combine all the layers (the API, the business logic, the user authentication) into one single, monolithic code base. This may sound like an antipattern today, but that was indeed the approach we adopted in the early days of cloud-based development. Most modern approaches, however, dictate that you architect with reusability, autonomy, composability, and discoverability in mind.
Among the veterans of our industry, service-oriented architecture (SOA) is a well-known buzzword. SOA encourages an architectural approach in which developers create autonomous services that communicate via message passing and often have a schema or a contract that defines how messages are created or exchanged.
The modern incarnation of the service-oriented approach is often referred to as microservices architecture. Modern application architectures are composed of services communicating through events and APIs with business logic inserted as appropriate. We define microservices as small, standalone, fully independent services built around a particular business purpose or capability. Ideally, microservices should be easy to replace, with each service written in an appropriate framework and language.
The mere fact that microservices can be written in a different general-purpose language or a domain-specific language (DSL) is a drawing card for many developers. Benefits can be gained from using the right language or a specialized set of libraries for the job. Each microservice can maintain state and store data. And if microservices are correctly decoupled, development teams can work and deploy microservices independently from one another. This approach of building and deploying applications as a collection of loosely coupled services is considered the default approach to development in the cloud today (the “cloud native” approach, if you will).
Once you have decided how your application is going to be architected, and all the software required for each of the layers is ready to go, you would think the hardest part is done. The truth is, that’s when some of the more complex tasks begin. Developing your desired services traditionally requires servers running in data centers or in the cloud that need to be managed, maintained, patched, and backed up. Today, you would pick from a few options:
- Directly build on VMs—The physical deployment of each service requires you to have a set of instances with additional tasks to address required activities such as load balancing, transactions, clustering, caching, messaging, and data redundancy. Provisioning, managing, and patching of these servers is a time-consuming task that often requires dedicated operations people. A non-trivial environment is hard to set up and operate effectively. Infrastructure and hardware are necessary components of any IT system, but they’re often also a distraction from what should be the core focus—solving the business problem. In our simple web application example, you would have to become an expert in building distributed systems and cloud infrastructure management. In a cloud environment, this form of computing is often referred to as infrastructure as a service (IaaS).
- Use a PaaS—Over the past few years, technologies such as platform as a service (PaaS) and containers have appeared as potential solutions to the headache of inconsistent infrastructure environments, conflicts, and server management overhead. PaaS is a form of cloud computing that provides a platform for users to run their software while hiding some of the underlying infrastructure. To make effective use of PaaS, developers need to write software that targets the features and capabilities of the platform. Moving a legacy application designed to run on a standalone server to a PaaS service often leads to additional development effort because of the ephemeral nature of most PaaS implementations. Still, given a choice, many developers would understandably choose to use PaaS rather than more traditional, manual solutions thanks to reduced maintenance and platform support requirements.
- Use containers—Containerization is considered ideal for microservices architectures because it is a way of isolating an application with its own environment. It’s a lightweight alternative to full-blown virtualization that traditional cloud servers use. Containers are an excellent deployment and packaging solution especially when dependencies are in play (although they can come with their own housekeeping challenges and complexities). Containers are isolated and lightweight, but they need to be deployed to a server, whether in a public or private cloud or on site.
While each of these models is perfectly valid and offers varying degrees of simplicity and speed of development for your service, your costs are still driven by the lifecycle of the infrastructure or servers you own, not by your application usage. If you purchase a rack at the data center, you pay for it 24/7. If you purchase a cloud instance (wrapped in a PaaS, running containers, or otherwise), you pay for it whenever it runs, whether or not it is serving traffic for your web app.
This leads to an entire discipline of engineers investing in improving server efficiency or trying to match infrastructure lifecycle to application usage and server sizes to traffic patterns. This also means that all the effort spent on these tasks is time taken away from improving the functionality and differentiating aspects of your application. This is equivalent to asking for a place to plug in your appliance and having to pay for a share of the power generators at your utility company, as well as configuring the generator to deliver the power in the phase, frequency, and wattage you desire, no matter how much you use. The actual outcome (plugging in your appliance) is dwarfed by the effort and cost of the infrastructure required (the generators). This is where the serverless approach comes in. It aims for the moral equivalent of the utility approach we know and love today—there when you need it, complexity abstracted away, and you pay only when you use it.
A serverless architecture for our sample application could be composed of different layers. For example, to build the API, we would use a service that does not cost us anything if there are no API calls. To build the authentication service, we would use a service that does not cost us anything if there are no authentication calls. To build the storage service, we would use . . . you get the picture.
Much like the public cloud approach that offered virtual infrastructure Lego to assemble our cloud stack in the early days, a serverless architecture uses existing services from cloud providers like AWS to implement its architectural components. As an example, AWS offers services to build our application primitives like APIs (Amazon API Gateway), workflows (AWS Step Functions), queues (Amazon Simple Queue Service), databases (Amazon DynamoDB and Amazon Aurora), and more.
The idea of using off-the-shelf services to implement parts of our architecture is not new; indeed, it’s been a best practice since the days of SOA. What’s changed in the last few years is the capability to also implement the custom aspects of our applications (like the business logic) in a serverless manner. This ability to run arbitrary code without having to provision infrastructure to run it as a service or to pay for the infrastructure is referred to as functions as a service (FaaS).
FaaS allows you to provide custom code, associated dependencies, and some configuration to dictate your desired performance and access control characteristics. FaaS then executes this unit (referred to as a function) on an invisible compute fleet with each execution of your code receiving an isolated environment with its own disk, memory, and CPU allocation. You pay only for the time your code runs. A function is not a lightweight instance; instead, think of it as akin to processes in an OS, where you can spawn as many as needed by your application and then spin them down when your application isn’t running.
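As a sketch of what such a unit of custom code looks like, here is a minimal Lambda-style handler in Python. The event shape and function name are assumptions for illustration; the point is that a function is just code with a well-defined entry point, which you can exercise locally without any infrastructure.

```python
import json

def handler(event, context=None):
    """A minimal FaaS-style function: it receives an event and returns a
    response. Each invocation runs in its own isolated environment with
    its own memory and CPU allocation; there is no server to manage."""
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }

# Invoked locally the same way the platform would invoke it remotely.
response = handler({"name": "serverless"})
```

Because the unit of deployment is the function itself, spinning up many concurrent copies (or none at all) is the platform’s job, not yours.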
Serverless architectures are really the culmination of shifts that have been going on for a long time: from monoliths to services, and from managing infrastructure to increasingly delegating undifferentiated responsibilities. Serverless architectures can help with the problem of layering and having to update too many things. There’s room for developers to remove or minimize layering by breaking the system into functions and allowing the frontend to securely communicate with services and even the database directly. A well-planned serverless architecture can make future changes easier, which is an important factor for any long-term application.
To recap, a serverless architecture leverages a serverless implementation for each of its components, using FaaS (like AWS Lambda) for custom logic. This means each component is built as a service, with utility pricing that incurs cost only when used. Each component is a service and exposes no configuration or cost related to the infrastructure it is running on, which means these architectures don’t rely on direct access to a server to work. By making use of various powerful single-purpose APIs and web services, developers can build loosely coupled, scalable, and efficient architectures quickly. Moving away from servers and infrastructure concerns, as well as allowing the developer to primarily focus on code, is the ultimate goal behind serverless.

The web application example we went through is one of the simplest demonstrations of what can be achieved with serverless architectures. A serverless approach can also work exceptionally well for organizations that want to innovate and move quickly.
Functions and serverless architectures, in general, are versatile. You can use them to build backends for CRUD applications, e-commerce, back-office systems, complex web apps, and all kinds of mobile and desktop software. Tasks that used to take weeks can be done in days or hours, as long as we choose the right combination of technologies. Lambda functions are stateless and scalable, which makes them perfect for implementing any logic that benefits from parallel processing.
The most flexible and powerful serverless designs are event-driven, which means each component in the architecture reacts to a state change or notification of some kind rather than responding to a request or polling for information. In chapter 2, for example, you’ll build an event-driven, push-based pipeline to see how quickly you can put together a system to encode video to different bit rates and formats.
Note
You will find the use of events as a communication mechanism between components to be a recurring theme in serverless architectures; indeed, AWS Lambda’s initial launch was as an event-driven computing service. Building event-driven, push-based systems will often reduce cost and complexity (you won’t need to run extra code to poll for changes) and, potentially, make the overall user experience smoother. It goes without saying that although event-driven, push-based models are a good goal, they might not be appropriate or achievable in all circumstances.
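To illustrate the push-based style, here is a sketch of a function that reacts to a storage notification instead of polling for changes. The event shape loosely mimics an S3-style notification but is a simplified assumption for illustration, as are the object keys.

```python
def on_object_created(event):
    """React to a pushed storage notification: extract the keys to
    process. The platform delivers the event; no polling loop runs."""
    return [
        record["s3"]["object"]["key"]
        for record in event.get("Records", [])
    ]

# A simplified, hypothetical notification payload for two uploaded videos.
sample_event = {
    "Records": [
        {"s3": {"object": {"key": "videos/input/cat.mp4"}}},
        {"s3": {"object": {"key": "videos/input/dog.mp4"}}},
    ]
}
keys = on_object_created(sample_event)
```

In a pipeline like the one in chapter 2, each extracted key would then trigger downstream work (such as encoding) without any component ever asking “is there anything new yet?”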
Serverless architecture allows developers to focus on software design and code rather than infrastructure. Scalability and high availability are easier to achieve, and the pricing is often more fair because you pay only for what you use. More importantly, you have the potential to reduce some of the complexity of the system by minimizing the number of layers and amount of code needed.
Adopting a serverless approach to application development comes with significant agility, elasticity, and cost efficiency gains. However, it is easy to fall into the trap of trying to adopt a serverless approach for all applications. We recommend keeping a few principles in mind as you start your serverless journey:
- Avoid lift-and-shift—In practice, serverless architectures are more suited for new applications rather than porting existing applications over. This is because existing application code bases have a lot of code that is made redundant by the serverless services. For example, porting a Java Spring app into Lambda brings a heavy framework into a function, most of which exists to interact with a web server (which doesn’t exist inside Lambda).
- Adopt a serverless-first approach, not a serverless-only approach—While there are companies like A Cloud Guru that have adopted a serverless-only approach, where 100% of their application runs as a serverless implementation, the more widespread approach that companies like Expedia and T-Mobile have adopted is to go serverless first. What this means is that their developers attempt to build any new application in the following priority order: build as much as possible using third-party services; fall back to custom services built using AWS serverless primitives like AWS Lambda; and finally, fall back to custom services built using custom software running on infrastructure like EC2. We talk about the reasons why you may have to fall back beyond custom serverless services in the next section.
- It doesn’t have to be all or nothing—One advantage of the serverless approach is that existing applications can be gradually converted to serverless architecture. If a developer is faced with a monolithic code base, they can gradually tease it apart and convert individual components into a serverless implementation (the strangler pattern). The best approach is to initially create a prototype to test developer assumptions about how the system would function if it is going to be partly or fully serverless. Legacy systems tend to have interesting constraints that require creative solutions, and as with any architectural refactors at a large scale, compromises are inevitably going to be made. The system may end up being a hybrid (as in figure 1.3), but it may be better to have some of its components use Lambda and third-party services rather than remain with an unchanged legacy architecture that no longer scales or that requires expensive infrastructure to run.
Figure 1.3 Serverless architecture is not an all-or-nothing proposition. If you currently have a monolithic application, you can begin to gradually extract components and run them in isolated services or compute functions. You can decouple a monolithic application into an assortment of IaaS, PaaS, containers, functions, and third-party services if it helps.

The transition from a legacy, server-based application to a scalable serverless architecture may take time to get right. It needs to be approached carefully and slowly, and developers need to have a good test plan and a great DevOps strategy in place before they begin.
- Pick applications suited for a service-oriented architecture—Serverless architectures are a natural extension of ideas raised in SOAs. In a serverless architecture, all custom code is written and executed as isolated, independent, and often granular functions that are run in a compute service such as AWS Lambda. Because every component is a service, serverless architectures share many advantages and complexities with event-driven microservices architectures. This also means applications likely need to be architected to meet the requirements of these approaches (like making the individual services stateless, for example). Keep in mind that the serverless approach is all about reducing the amount of code you have to own and maintain, so you can iterate and innovate faster. This means you should strive to minimize the number of components required to build your application. For example, you may architect your web application with a rich frontend (in lieu of a complex backend) that can talk to third-party services directly. That kind of architecture can be conducive to a better user experience. Fewer hops between online resources and reduced latency will result in a better perception of performance and usability of the application. In other words, you don’t have to route everything through a FaaS; your frontend may be able to communicate directly with a search provider, a database, or another useful API. Also keep in mind that moving from a monolithic approach to a more decentralized serverless approach doesn’t automatically reduce the complexity of the underlying system. The distributed nature of the solution can introduce its own challenges because of the need to make remote rather than in-process calls and the need to handle failures and latency across a network, to which your application will need to be resilient.
- Minimize custom code—The rise of serverless means many standard application components like APIs, workflows, queues, and databases are available as serverless offerings from cloud providers and third parties. It’s far more useful for developers to spend time solving a problem unique to their domain rather than recreating functionality already implemented by someone else. Don’t build for the sake of building if viable third-party services and APIs are available. Stand on the shoulders of giants to reach new heights. Appendix A has a short list of Amazon Web Services and non-Amazon Web Services that we’ve found useful. We’ll look at most of those services in more detail as we move through the book. However, it goes without saying that when a third-party service is considered, factors such as price, capability, availability, documentation, and support must be carefully assessed. If you have to build a piece of custom functionality, our advice is simple: try to solve your problem using functions first, and if that doesn’t work explore containers and more traditional server-based architectures second. Developers can write functions to carry out almost any common task, such as reading and writing to a data source, calling out to other functions, and performing calculations. In more complex cases, developers can set up more elaborate pipelines and orchestrate invocations of multiple functions.
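One way to picture the strangler pattern described earlier is a thin routing layer that sends extracted paths to new serverless components while everything else still reaches the legacy monolith. The route prefixes and target names below are hypothetical.

```python
# Hypothetical routing table for a gradual (strangler-style) migration.
# Paths already peeled off the monolith go to new serverless handlers;
# everything else continues to hit the unchanged legacy backend.
MIGRATED_PREFIXES = {
    "/auth": "auth-function",      # extracted to a serverless function
    "/search": "search-service",   # replaced by a third-party service
}

def route(path: str) -> str:
    """Return the component that should handle a request path."""
    for prefix, target in MIGRATED_PREFIXES.items():
        if path.startswith(prefix):
            return target
    return "legacy-monolith"       # default: untouched legacy code path
```

As more components are extracted, entries are added to the table until the legacy default handles nothing and can be retired.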

The serverless approach of building applications by quickly assembling services provides two significant advantages: less code to write and maintain per application, and per-activity pricing for our applications. This translates into a disruptive gain in agility and developer productivity and a much more streamlined alignment between development and finance (because any application inefficiencies or optimizations show a direct, tangible financial impact). Here are a few of the specific benefits you will realize by adopting serverless architecture:
- High scale and reliability without server management—Building large-scale, distributed systems is hard. Tasks such as server configuration and management, patching, and maintenance are taken care of by the vendor, as is managing the infrastructure architecture for high scale and reliability, which saves time and money. For example, Amazon looks after the health of the fleet of servers that powers AWS Lambda. If you don’t have specific requirements to manage or modify server resources, then having Amazon or another vendor look after them is a great solution. You’re responsible only for your own code, leaving operational and administrative tasks to a different set of capable hands.
- Competitive pricing—Traditional server-based architecture requires servers that don’t necessarily run at full capacity all of the time. Scaling, even with automated systems, involves provisioning a new server whose capacity is largely wasted outside temporary upsurges in traffic or new data. Serverless systems are much more granular with regard to scaling and are cost-effective, especially when peak loads are uneven or unexpected. Because of their utility pay-per-use billing, serverless services can be extremely cost-effective; however, they’re not cheaper than traditional (server and container) technologies in all circumstances. The best thing is to do some modeling before embarking on a big project.
- Less code—We mentioned at the start of the chapter that serverless architecture provides an opportunity to reduce some of the complexity and code, in comparison to more traditional systems. Adopting a serverless approach eliminates undifferentiated code such as that required for orchestrating server fleets or routing requests and events between components, which forms a surprisingly large part of modern code bases.
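The advice to do some modeling before embarking on a big project can start as a simple break-even calculation. The prices below are placeholders, not real rates; the shape of the comparison is what matters.

```python
def break_even_requests(server_monthly_cost: float,
                        cost_per_request: float) -> float:
    """Monthly request volume at which pay-per-use spending matches the
    cost of a fixed, always-on server."""
    return server_monthly_cost / cost_per_request

# Placeholder prices: a $36.50/month server vs. $0.0000052 per request.
threshold = break_even_requests(36.50, 0.0000052)
# Below this volume, pay-per-use is cheaper in this simple model; above
# it, the fixed server wins on raw compute cost (ignoring the ops effort
# of running it, which tends to favor serverless).
```

Real modeling would also account for memory sizing, request duration, free tiers, and the operational staff time that never appears on the compute bill.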
Serverless is not a silver bullet in all circumstances, however. Here are some situations in which you would want to avoid serverless architectures:
- You are not comfortable with public cloud-based architectures. Serverless development is a natural extension of the move to cloud-based development, where more and more of the undifferentiated heavy lifting is moved to the providers. There are applications and business scenarios where you need to maintain your own data center; in such cases, you cannot build a serverless architecture (though you are welcome to host your own primitives on your infrastructure and use those to build applications).
- The services don’t meet the availability, performance, compliance, or scale needs of your customers. AWS serverless services offer an availability SLA, but their threshold may be below what you need for your business. They also have a variety of compliance certifications, but you must validate that they cover what your business needs. Services like AWS Lambda also do not offer a performance SLA, which means you may need to evaluate their performance against your desired levels. Non-AWS, third-party services are in the same boat. Some may have strong SLAs, whereas others may not have one at all.
- Your application and business need more control, or you need to customize the infrastructure. When it comes to Lambda, the efficiencies gained from having Amazon look after the platform and scale functions come at the expense of being able to customize the operating system or tweak the underlying instance. You can modify the amount of RAM allocated to a function and change timeouts, but that’s about it. Similarly, different third-party services will have varying levels of customization and flexibility.
- Your application and business needs require you to stay vendor agnostic. If a developer decides to use third-party APIs and services, including AWS, there’s a chance that architecture could become strongly coupled to the platform being used. The implications of vendor lock-in and the risk of using third-party services—including company viability, data sovereignty and privacy, cost, support, documentation, and available feature set—need to be thoroughly considered.
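As an illustration of how small the tunable surface of a Lambda function is, the sketch below validates a configuration against the limits as we understand them at the time of writing (memory from 128 MB to 10,240 MB, timeout up to 900 seconds); consult the current AWS documentation for authoritative values.

```python
def validate_lambda_config(memory_mb: int, timeout_s: int) -> bool:
    """Check a function configuration against the (assumed) Lambda
    limits. Note how little there is to tune: no OS choice, no instance
    type, no fleet size — just memory and timeout."""
    return 128 <= memory_mb <= 10240 and 1 <= timeout_s <= 900

ok = validate_lambda_config(memory_mb=512, timeout_s=30)
too_big = validate_lambda_config(memory_mb=16384, timeout_s=30)
```

If your workload needs knobs outside this narrow range (kernel settings, GPUs, multi-hour jobs), that is exactly the signal to fall back to containers or instances, as discussed above.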
In this chapter, you learned what serverless architecture is, looked at its principles, and saw how it compares to traditional architectures. In the next chapter, we’ll get our hands dirty by creating a small serverless, event-driven application. This will help you get a good taste of serverless if this is your first time trying this approach. From there, we’ll explore important architectures and patterns and discuss use cases where serverless architectures solve real problems.

For all intents and purposes, this is a completely different book from the first edition of Serverless Architectures on AWS. Most of the chapters have been written from the ground up to provide a very different experience from the first edition.
When the first edition of this book came out in 2017, serverless was still new, and many of us were learning about it for the first time. As such, the first edition gave a gentle introduction to serverless and walked the reader through the build of a serverless application. Since then, a lot of new educational content has crossed our desks, including numerous books and video courses to help us get started with serverless technologies on AWS.
If you’re looking for an introduction to serverless architectures on AWS, we have included some introductory content in chapter 2 and in appendices A and B. You can also find the first edition of this book on the Manning website (https://www.manning.com/books/serverless-architectures-on-aws). Most of the content from the first edition is still relevant today, and with that book, you will learn to build a serverless application from scratch.
But, just as serverless technologies allow us to focus on the things that differentiate our business, we want this second edition to focus on the things that differentiate this book. Instead of yet another getting-started guide to serverless, this book focuses on serverless use cases and interesting architectures. It is aimed at developers with some experience of serverless technologies already and answers the questions that many of you have been asking us. Given the switch in focus, this book does not have many actual code samples. Instead, we hope to challenge the way you think about serverless architecture and help you get the most out of serverless technologies on AWS.
- The cloud has been and continues to be a game changer for IT infrastructure and software development.
- Software developers need to think about the ways they can maximize the use of cloud platforms to gain a competitive advantage.
- Serverless architectures are the latest step forward for developers and organizations to think about, study, and adopt. This exciting shift in implementing application architectures will grow quickly as software developers embrace compute services such as AWS Lambda.
- In many cases, serverless applications will be cheaper to run and faster to implement. There’s also a need to reduce complexity and costs associated with running infrastructure and carrying out development of traditional software systems.
- The reduction in cost and time spent on infrastructure maintenance and the benefits of scalability are good reasons for organizations and developers to consider serverless architectures.