10 Blackbelt Lambda


This chapter covers

  • Monitoring latency, requests per second, and concurrency for serverless applications
  • Techniques for optimizing latency

Performance (how fast your application responds) and availability (whether your application returns a valid response) are critical to your end users' experience. In a serverless architecture, performance also has a direct impact on cost; for example, AWS Lambda bills you for the time your function runs, weighted by the memory you allocate to it. At the same time, serverless architectures remove many of the familiar levers for performance tuning, such as scaling servers or tweaking server configurations, which can leave new users unsure of where to start.
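To make the cost relationship concrete, the following is a minimal sketch of duration-weighted billing: compute cost scales with billed duration multiplied by allocated memory (GB-seconds). The per-GB-second price and the sample durations and memory sizes are illustrative assumptions, not current AWS pricing.

# Sketch of Lambda's duration-weighted compute billing (GB-seconds).
# PRICE_PER_GB_SECOND is an illustrative rate, not authoritative pricing.
PRICE_PER_GB_SECOND = 0.0000166667

def invocation_cost(duration_ms: float, memory_mb: int) -> float:
    """Approximate compute cost of one invocation: duration times allocated memory in GB."""
    gb_seconds = (duration_ms / 1000) * (memory_mb / 1024)
    return gb_seconds * PRICE_PER_GB_SECOND

# Halving the duration (for example, by optimizing the function) halves the compute
# cost, even when the memory allocation stays the same.
print(invocation_cost(duration_ms=800, memory_mb=512))  # 0.4 GB-seconds
print(invocation_cost(duration_ms=400, memory_mb=512))  # 0.2 GB-seconds

This is why the latency optimizations in this chapter pay off twice: a faster function improves the user experience and, because billing is duration-based, reduces your bill as well.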

This chapter introduces the key tools and approaches you can use to improve performance across the services that make up your serverless application. We'll use relevant examples to demonstrate how these techniques work.

10.1 Where to optimize?

10.2 Before we get started

10.2.1 How a Lambda function handles requests

10.2.2 Latency: Cold vs. warm

10.2.3 Load generation on your function and application

10.2.4 Tracking performance and availability

10.3 Optimizing latency

10.3.1 Minimize deployment artifact size

10.3.2 Allocate sufficient resources to your execution environment

10.3.3 Optimize function logic

10.4 Concurrency

10.4.1 Correlation between requests, latency, and concurrency

10.4.2 Managing concurrency

Summary