1 Crafting Experiences for Cloud Native Development

MEAP v1

This chapter covers

  • What developer experience is and why it matters
  • Paving the path from idea to production through the inner loop and the outer loop
  • Using continuous delivery to build higher-quality software faster and safer
  • Main challenges and friction points impacting the developer experience
  • Introducing the cloud native project used in the book

As organizations adopt cloud native technologies and Kubernetes, they are fundamentally transforming how they build, deploy, and manage software. This shift is driven by the need to deliver software faster, more reliably, and at scale. At the core of this transformation is the developer experience—the daily reality of how developers interact with tools, platforms, and processes to deliver value to customers through software.

While these technologies offer powerful capabilities, they often come with increased cognitive load and reduced developer productivity. Understanding and optimizing the developer experience has become crucial for organizations aiming to succeed in their cloud native journey.

This chapter lays the foundation for understanding developer experience in modern software engineering. We will explore why developer experience is important, examine the fundamental challenges that teams face when building applications for Kubernetes, and introduce key concepts such as the inner and outer development loops.


1.1 Why does developer experience matter?

In today's technology-driven landscape, where automated tools and solutions prevail, why is it essential to focus on developer experience? Even with the advancements in artificial intelligence (AI), application developers remain indispensable in the software delivery process.

This section will analyze the role of application developers and the challenges of building cloud native applications. We will look into why it's vital for organizations to help developers thrive despite complex tools, a lack of standardization, and the difficulty of implementing best practices in an environment where the only constant is change. Finally, it will provide a definition of developer experience that we'll use for the rest of the book.

1.1.1 The Role of Application Developers

Application developers make today's world move forward. Regardless of industry, every company likely relies on software to operate and thrive. The work of developers creates tangible value by addressing and solving customers' problems through software solutions. However, if software cannot swiftly adapt to meet customers' evolving needs, those customers will inevitably turn to more efficient providers.

The journey from idea to value begins with a clear problem statement and a well-defined requirement for solving it. Developers are tasked with translating these requirements into functional software. That involves building new applications, implementing more features, and fixing bugs (figure 1.1). The speed and quality of this transformation, from the initial requirement to the final software delivered to customers (what we call the path to production), are critical to the organization's success.

Figure 1.1 The path from idea to production, where software delivers value to the organization and its customers.

In this book, when we refer to customers, we mean any user of the applications developers build and continuously enhance. Customers can be private end-users, organizations, or even internal teams within the same company that develops the software. These applications span a wide range of use cases: from your home banking app to your favorite streaming platform, from services controlling offshore wind farms to your organization's internal portal, to the software solutions hospitals use to manage patient records and appointments. In short, we call all users of these diverse applications customers.

The journey from idea to value is crucial for delivering software that meets customers' needs effectively. Cloud native technologies are an essential enabler for building modern software solutions. However, they introduce a unique set of challenges that application developers must address. That's the topic of the next section.

1.1.2 The Challenges of Cloud Native Development

In the past 15 years, companies have drastically changed how they design, build, and ship software to their customers. New technologies and practices brought along some challenges for application developers in three main areas:

  • Infrastructure and platforms
  • Architectures and design
  • Organizations and practices

This section provides a generic overview of the main challenges. We'll analyze and address them throughout the book, so don't worry if something is unclear.

Infrastructure and Platforms

The cloud made it possible to consume infrastructure as a service, turning computing, network, and storage resources into commodities. Containers entered the stage and quickly became one of the most used methods for packaging and running applications while ensuring portability across environments, from development to production. Kubernetes raised the abstraction level on top of infrastructure and containers, laying the foundation for building platforms that application developers can consume via APIs.

We have witnessed an exponential increase in new tools in the cloud and Kubernetes ecosystems, and their sheer number and complexity can easily overwhelm developers. Too often, adopting these technologies means developers are expected to know all the details of Kubernetes and related tools, resulting in a substantial increase in cognitive load and reduced productivity.

The recent rediscovery of platform engineering has brought renewed focus on separation of concerns and the value of abstractions. Still, it's common for organizations to require developers to interact with low-level details of Kubernetes and cloud infrastructure, reducing the time they can spend producing value for their customers. Does that happen in your organization?
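
To make this concrete, here is a minimal sketch (in Go, assuming the Kubernetes client-go library) of listing Deployments directly through the Kubernetes API. The namespace and kubeconfig path are placeholders. This is the kind of low-level plumbing developers are often exposed to, and the kind of detail a well-designed platform should abstract away.

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Load the local kubeconfig (the path is a placeholder for this sketch).
	config, err := clientcmd.BuildConfigFromFlags("", "/home/dev/.kube/config")
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}
	// List Deployments in a namespace via the Kubernetes API.
	deployments, err := clientset.AppsV1().Deployments("minsalus").List(context.Background(), metav1.ListOptions{})
	if err != nil {
		panic(err)
	}
	for _, d := range deployments.Items {
		fmt.Printf("%s: %d/%d replicas ready\n", d.Name, d.Status.ReadyReplicas, d.Status.Replicas)
	}
}
```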

Note

Platform engineering is a specialized branch of software engineering dedicated to creating platforms that empower development teams throughout their daily, iterative journey from idea to value. In the context of cloud native technologies, platforms are often built on top of Kubernetes and designed to offer on-demand services to developers while hiding the complexity of internal infrastructure. In later chapters, we'll dive deeper into cloud native platforms and their importance for application developers. To learn more about platform engineering, check the book "Platform Engineering on Kubernetes" by Mauricio Salatino (Manning, 2023).

Architectures and Design

Software architectures have also evolved. We've been building increasingly distributed systems with increasingly demanding requirements for scalability and resilience. New use cases were unlocked thanks to new architectural styles and infrastructures, but the complexity of the systems and related development environments increased. That's especially true when those architectural styles are implemented incorrectly. We've seen countless transitions to microservices go wrong, resulting in unmanageable distributed monoliths. As a consequence, many organizations are now considering adopting modular monoliths.

However, the problem we've been trying to solve hasn't changed in decades. Software decomposition is hard yet necessary whether you're building a monolithic or microservice-based system. Distributed systems are complex. When implemented on top of cloud infrastructures, they can substantially impact daily application development workflows, creating friction for developers when building against cloud services in their development environments or needing to run all the necessary dependencies to work on new features.

Note

Software decomposition and modularization have been challenging since the early days of software development. In 1972, D.L. Parnas published a paper titled "On the criteria to be used in decomposing systems into modules". Still, today, we continue to struggle with designing loosely coupled, maintainable solutions. Whether you're building a monolithic application or microservices, correctly decomposing a system into modules is essential to the success of a software product.

Organizations and Practices

Organizations themselves underwent radical changes, trying to catch up with the rise and mainstream adoption of cloud computing. However, organizational transformations don't always succeed. The DevOps movement, which attempted to break the silos and friction between Development and Operations, has often been misunderstood and resulted in simply renaming the old Operations team to a DevOps team, creating a new silo named DevOps[1], or even pushing all operational responsibilities to the Development team. None of those changes effectively solve the underlying problem: streamlining the software delivery process.

Note

There is no universally accepted definition of DevOps. We find the one proposed by Ken Mugrage (principal technologist at ThoughtWorks) particularly interesting: "A culture where people, regardless of title or background, work together to imagine, develop, deploy, and operate a system."[2]

One of the key tenets of the cloud computing model is its on-demand, self-service nature. That convenience vanishes when organizations adopt heavy processes and require development teams to submit a ticket to an Infrastructure team whenever they need to provision a new database or a virtual machine. Does that sound familiar to you? Getting a change from code to production often goes through expensive and slow processes, requiring many handovers and manual approvals.

All these factors impact application developers' productivity and their experience trying to get a code change into the hands of their customers. In 2010, Jez Humble and David Farley formalized the concept of continuous delivery[3], a holistic approach to developing and delivering higher-quality software faster, safer, and in a repeatable way. Continuous delivery practices give us well-tested solutions to improve the path from idea to production, but the adoption might be challenging. Unfortunately, many companies still struggle to implement these ideas.

Cloud Native

All the challenges mentioned so far are common when discussing cloud native development. But what does cloud native mean? The Cloud Native Computing Foundation (CNCF) answers that question in its cloud native definition[4]:

“Cloud native practices empower organizations to develop, build, and deploy workloads in computing environments (public, private, hybrid cloud) to meet their organizational needs at scale in a programmatic and repeatable manner. It is characterized by loosely coupled systems that interoperate in a manner that is secure, resilient, manageable, sustainable, and observable.

Cloud native technologies and architectures typically consist of some combination of containers, service meshes, multi-tenancy, microservices, immutable infrastructure, serverless, and declarative APIs — this list is non-exhaustive.”

The definition continues by highlighting the benefits of cloud native:

“Combined with robust automation, cloud native practices allow organizations to make high-impact changes frequently, predictably, with minimal toil and clear separation of concerns.”

Application developers' productivity has become a serious issue for organizations that want to adapt faster while taking advantage of new approaches and tools constantly introduced in the cloud native ecosystem. This book focuses on the main pain points that developers face when working with cloud native applications and trying to adopt all these new tools and practices. How can we combine them to achieve a great developer experience? That's what this book is all about!

Note

If you'd like to learn more about the definition and properties of cloud native applications from a developer perspective, you can refer to Chapter 1 of the book "Cloud Native Spring in Action" (Manning, 2022) by Thomas Vitale.

1.1.3 Defining Developer Experience

What do we mean by developer experience? It sure sounds like a buzzword. Like other buzzwords in our field, it can be confusing because it means different things to different people.

We rely on the insightful work by F. Fagerholm and J. Münch, who suggested a comprehensive definition in their paper “Developer Experience: Concept and Definition”[5]:

“...developer experience could be defined as a means for capturing how developers think and feel about their activities within their working environments, with the assumption that an improvement of the developer experience has positive impacts on characteristics such as sustained team and project performance.”

This definition captures the essence of why developer experience matters: improving it has a positive impact on development teams and increases their productivity. That means the better the developer experience, the higher the value produced.

Many factors influence developers' activities within software engineering projects. The paper suggests dividing these factors into three groups:

  1. Development Infrastructure Factors: How developers perceive the development infrastructure. That includes interactions with tools, frameworks, platforms, and organizational processes.
  2. Work Feelings Factors: How developers feel about their work. That includes social aspects such as respect and a sense of belonging within their team and organization.
  3. Value Contribution Factors: How developers perceive the value of their contributions. That includes aligning personal goals with project objectives and their sense of purpose within the team and organization.

This book focuses on the first dimension, which is all about tools and software practices and how developers perceive them while translating requirements into running software. As a developer, you might be overwhelmed by the number of tools you need to learn and master to complete your daily tasks. Or you might get frustrated due to slow and suboptimal tools. Perhaps your organization has added unnecessary hurdles and constraints that impact the performance of your team, making it more challenging to go from idea to value.

We can now suggest a more specific definition focused on the dimension of experiences that this book will cover.

“Developer Experience captures how developers interact with, and are empowered by, their technical environment to deliver customer value through software. This includes their ability to maintain flow and productivity while using development tools, frameworks, platforms and organizational processes. The assumption is that a well-designed, low-friction development infrastructure enables developers to focus on problem-solving rather than wrestling with tooling complexity or inefficient processes—positively impacting software delivery outcomes and team sustainability.”

Good developer experiences don't emerge spontaneously—they must be deliberately designed to align with the software being developed and the toolchain in use. Yet most teams work with an inherited patchwork of tools, each designed with different intentions and constraints. That creates a "Frankenstein Experience": a cobbled-together environment that undermines developer productivity rather than enhancing it.

The first step towards better experiences is understanding, at a deeper level, the main development activities, their relations with the overall software delivery cycle, and what can go wrong. We'll explore this in the next section while learning about the inner loop and outer loop.


1.2 The Inner and Outer Loops

Let's consider the path from idea to production. It all starts with a clear problem statement and the definition of a requirement for solving it.

A requirement could involve creating a new application, adding a new feature to an existing distributed system, fixing a bug in a microservice, or refactoring existing code in a modular monolith. It would typically include acceptance criteria, which are the conditions that must be met for the requirement to be considered complete. Requirements can come from various sources, such as product managers, business analysts, customers, or developers.

Given a requirement, many activities must be performed to deliver a working software solution. These activities can be grouped into two main categories: the inner loop (also called inner dev loop) and the outer loop (also called outer dev loop). The requirement is the starting point for the development process and the input to the inner loop. The transition from the inner loop to the outer loop happens whenever a developer pushes a change into the version control system.

Finally, the outer loop can feed back learnings and insights from production, leading to new requirements and closing the software development lifecycle (figure 1.2).

Figure 1.2 The key parts of the software development lifecycle, triggered by a requirement and ending with value delivered to customers

1.2.1 Inner Loop

The inner loop is where developers spend most of their time. It's the cycle of activities that a developer performs to write, test, run, and debug code (figure 1.3). It is triggered by a requirement to implement a new feature, fix a bug, design a new system, or refactor existing code. Developers would take on a requirement from the backlog and start working on it. The focus of this loop is on rapid feedback and fast iteration.

Note

Other terms used to refer to the inner loop are development workflow, pre-commit workflow, or local development.

In the inner loop, developers go through these activities:

  • Code. Given a requirement, developers write code to implement the feature or fix the bug.
  • Test. For each code change, developers write tests to verify that the code works as expected.
  • Run. Developers build and run the code to see the change in action and validate its behavior.
  • Debug. If the code doesn't work as expected, developers debug the code to identify and fix the issue.

The activities in the inner loop take place in the development environment, which includes all the tools and services the developers need to make a change. Some might follow a Test-Driven Development (TDD) approach, where they write a failing test first, then write the code to make the test pass, and finally tidy up[6]. Others prefer to write the code first and then write the tests. When building a web application, developers might also establish an automated workflow to run the application and see the changes in real-time as they make them.
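
To make the test-first rhythm concrete, here is a hedged sketch in Go; the CanBook function and its booking rule are invented for this example. The test is written first and fails, then drives the minimal implementation that makes it pass.

```go
package booking

import (
	"testing"
	"time"
)

// Written first: these tests fail until CanBook is implemented,
// then drive the minimal implementation below.
func TestCannotBookInThePast(t *testing.T) {
	yesterday := time.Now().Add(-24 * time.Hour)
	if CanBook(yesterday) {
		t.Error("expected booking in the past to be rejected")
	}
}

func TestCanBookInTheFuture(t *testing.T) {
	nextWeek := time.Now().Add(7 * 24 * time.Hour)
	if !CanBook(nextWeek) {
		t.Error("expected booking in the future to be allowed")
	}
}

// Minimal implementation that makes the tests pass.
func CanBook(slot time.Time) bool {
	return slot.After(time.Now())
}
```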

Figure 1.3 The inner loop is triggered by a requirement; it consists of all the activities carried out in a development environment to make a change, and it ends with the change committed to the mainline.

Once satisfied with the change, developers push it into the remote version control system. This action triggers the outer loop. It can take several iterations in the inner loop to complete a requirement and meet the acceptance criteria. However, that shouldn't delay the process of frequently pushing changes. Following the practice of continuous integration[7], developers should make small, incremental changes and push them frequently into the remote version control system. The goal is to get fast feedback and avoid integration issues.

At a minimum, developers should push their changes at least once per day. Less than that, and it's hard to call it continuous. You might think: "But I'm working on a feature that will take me a week to complete. Should I push every day?" The answer is yes. It will help you avoid conflicts with other developers and integration issues. It will give you the confidence that your changes are not breaking the build or the tests. It will also enable your peers to give you feedback early in the process. "But the feature is not complete yet. Wouldn't that cause issues for users?" That's fine. You can use techniques such as keystone interfaces[8] and feature flags to hide incomplete features from users (we'll explore that later in the book) while retaining all the benefits of continuous integration.
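
As a preview of how incomplete work can stay hidden while still being integrated daily, here is a minimal feature-flag sketch in Go; the flag name, the environment-variable mechanism, and the handler are assumptions for illustration, and real projects often rely on a dedicated feature-flag service instead.

```go
package portal

import (
	"net/http"
	"os"
)

// featureEnabled reads a flag from the environment; a real system would
// typically use a feature-flag service or configuration platform instead.
func featureEnabled(name string) bool {
	return os.Getenv("FEATURE_"+name) == "true"
}

func rescheduleHandler(w http.ResponseWriter, r *http.Request) {
	// The half-finished rescheduling flow is integrated into the mainline
	// every day, but stays hidden from users until the flag is flipped.
	if !featureEnabled("RESCHEDULE_APPOINTMENTS") {
		http.NotFound(w, r)
		return
	}
	// ... new rescheduling logic under development goes here.
	w.WriteHeader(http.StatusOK)
}
```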

The action triggering the outer loop is the push of a change into the remote version control system. There are a few different strategies for pushing changes[9]. Since the inner loop aims for fast feedback, developers should push their changes frequently. When practicing continuous integration, all developers push their changes to the mainline in the remote version control system. In Git, that would typically be the main branch. This practice helps avoid long-lived branches, which can lead to integration issues and slow feedback loops. Developers might push their changes directly to the main branch when doing pair programming. In other contexts, a pull request might be required to review the changes before merging them (pre-integration reviews). In that case, developers would create a short-lived branch, make the changes, open a pull request, wait for the review, and then merge the changes to the mainline.

The trigger for the outer loop is the push to the mainline. When using pre-integration reviews or long-lived feature branches, we consider the additional activities required to merge the changes back to the mainline as part of the inner loop. If pre-integration reviews take too long, developers will start pushing changes less frequently, slowing down the feedback loop and increasing the cycle time (the time it takes to deliver a change to production). The feedback loop gets even slower when using feature branches instead of continuous integration, as developers would integrate their changes only at the end of the feature development process. Furthermore, low-frequency integration discourages code refactoring, since refactoring would lead to expensive conflicts when merging the changes back to the mainline.

Our primary focus in this book is on the experience of development teams working on software products full-time, adopting the discipline of continuous delivery and all its foundational practices. However, it's essential to understand that these practices might not apply to all contexts. For example, continuous integration would not work for open-source projects, where developers may not be part of the same team and might not have the same level of trust or time commitment to the project. In that case, feature branches would be a better fit. Developers would fork the repository, make the changes in their fork, and then open a pull request to merge the changes back to the original repository only after they have been reviewed and approved.

1.2.2 Outer Loop

The outer loop is the cycle of activities after a developer commits a change to the mainline in a version control system until the change is deployed to production and operational. The central part of the outer loop is the deployment pipeline, which is the key pattern in continuous delivery and represents the only path to production.

Based on the concepts described by Jez Humble and Dave Farley in their books[12], we can group the activities in the outer loop into three main stages: commit, acceptance, and production (figure 1.4).

Figure 1.4 The outer loop is triggered every time a new change is committed to the mainline; it goes through multiple stages until the change is released to customers and delivers value.

Commit Stage

This stage is triggered every time a developer pushes a change to the mainline. It includes activities such as compiling the code, running the tests (mostly unit and component tests), performing static code analysis, and creating a build artifact. The goal of this stage is to provide fast feedback to the developer about the quality of the change.

If the change doesn't pass the tests, the developer should fix the issue immediately by committing a new change or reverting to the previous state. Ideally, this stage should take less than 5 minutes to complete. If it takes longer, developers would have to wait too long for feedback, increasing the cycle time and causing friction.

The activities performed in this stage are run against a build environment (or continuous integration environment) and can be supported by a wide array of tools, including build services (or continuous integration services), such as Jenkins, GitLab CI, or GitHub Actions, among others. Developers use such tools but are not responsible for their configuration or maintenance, which is typically the responsibility of a platform team.

At the end of this stage, a build artifact is produced, representing a release candidate. Depending on the technology stack, it could be a binary executable, a container image, or something else. The release candidate is then promoted to the acceptance stage.

Acceptance Stage

This stage is triggered when a new release candidate is available. It includes activities such as running functional acceptance tests (validating the original acceptance criteria of the implemented requirement) and non-functional acceptance tests (assessing security, performance, capacity, and so on). The goal of this stage is to provide confidence that the change is ready to be released to production.

If the change doesn't pass the tests, the release candidate is discarded, and developers should fix the issue as soon as possible. Ideally, this stage should take less than 60 minutes to complete. In cases of high-frequency integrations, the commit stage would produce multiple release candidates while the acceptance stage is still running. That's why only the latest output from the commit stage triggers the acceptance stage at any given time.

The tests performed in this stage are run against production-like environments, which are as similar as possible to the production environment.
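
As an example, a functional acceptance test for the hypothetical Appointment Service might exercise its public API in that production-like environment. The endpoint, payload, and environment variable in this Go sketch are assumptions for illustration.

```go
package acceptance

import (
	"net/http"
	"os"
	"strings"
	"testing"
)

func TestBookAppointment(t *testing.T) {
	// Base URL of the production-like environment, injected by the pipeline.
	baseURL := os.Getenv("APPOINTMENT_SERVICE_URL")
	if baseURL == "" {
		t.Skip("APPOINTMENT_SERVICE_URL not set; skipping acceptance test")
	}

	body := strings.NewReader(`{"patientId":"p-123","doctorId":"d-456","slot":"2026-03-01T10:00:00Z"}`)
	resp, err := http.Post(baseURL+"/appointments", "application/json", body)
	if err != nil {
		t.Fatalf("request failed: %v", err)
	}
	defer resp.Body.Close()

	// Acceptance criterion: a valid booking request creates an appointment.
	if resp.StatusCode != http.StatusCreated {
		t.Fatalf("expected 201 Created, got %d", resp.StatusCode)
	}
}
```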

Many tools are available to support the activities in this stage. Developers use such tools but should not be responsible for configuring or maintaining them. Platform teams can provide the necessary capabilities for automating deployments and operations via self-service platforms.

Most activities in this stage are automated, but manual testing activities, such as exploratory or usability testing, might be helpful. In that case, the testers would install the release candidate on a production-like environment using the same deployment automation adopted in the automated tests. Such activities might be part of this stage or run independently from the outer loop to not affect the overall cycle time.

At the end of this stage, the release candidate is proven to be releasable to production and is promoted to the next stage. Overall, the deployment pipeline is only as good as the quality of the tests in both the commit and acceptance stages. The pipeline will not provide the expected feedback and confidence if the tests cannot reliably detect issues.

Production Stage

When a release candidate is promoted to this stage, it is ready to be released to production. This stage includes activities such as deploying the release candidate to the production environment, running smoke tests (validating the system's basic functionality), and monitoring the system for any issues. The goal of this stage is to provide confidence that the change is working as expected in production.
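
A smoke test can be as simple as verifying that the newly deployed service answers on its health endpoint. The following Go sketch assumes a hypothetical health endpoint and an environment variable injected by the pipeline.

```go
package smoke

import (
	"net/http"
	"os"
	"testing"
	"time"
)

func TestServiceIsUp(t *testing.T) {
	// e.g. https://minsalus.example.com/healthz (placeholder URL).
	url := os.Getenv("PRODUCTION_HEALTH_URL")
	if url == "" {
		t.Skip("PRODUCTION_HEALTH_URL not set; skipping smoke test")
	}

	client := &http.Client{Timeout: 5 * time.Second}
	resp, err := client.Get(url)
	if err != nil {
		t.Fatalf("service unreachable: %v", err)
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		t.Fatalf("expected 200 OK from health endpoint, got %d", resp.StatusCode)
	}
}
```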

An essential benefit of continuous delivery is ensuring our software is always in a releasable state. When to release a specific version then becomes a business decision (not technical), and it can be triggered manually or automatically. If we decide to release automatically to production whenever a release candidate is promoted to this stage, then we would be practicing continuous deployment[13].

This stage is the last one in the deployment pipeline, providing the most value to the business since it's where customers use the software. The feedback from production can inform new requirements, thus closing the overarching loop from idea to production and back.


1.3 The 10 Friction Points of Developer Experience

The daily workflow of an application developer can be full of challenges, creating friction throughout the software development lifecycle. We identify ten main areas that affect the developer experience. We call them "The Ten Friction Points of Developer Experience" (figure 1.5). Let's see if these sound familiar to you or your development teams.

If you're a developer, each friction point might cause you pain. Throughout the book, we'll explore several tools and techniques to help make each point as frictionless as possible.

If you're a platform engineer working in a team that enables developers, consider how to reduce friction in each of these points. After all, developers are the users of your platform. If the experience you provide is not good, they will not use it.

If you're management, why would you want a better experience for your developers? Removing friction from each of these points not only benefits developers but also increases productivity, leads to more compliant and higher-quality results, and creates a working environment where developers want to be, resulting in higher employee retention.

Figure 1.5 Along the path to production, there are 10 main points that can cause friction and toil for developers.

Using these ten friction points, you can evaluate how your development teams rank their development experience. Appendix A includes a checklist to assess your current development experience, discover the most significant pain points, and, for each friction point, find references to the chapters in this book that help you mitigate it.

We know that the cloud native space is constantly evolving, and new projects are popping up every day. Hence, we designed this book and the checklist to focus on friction points and how to mitigate them rather than focusing too much on specific tools. While we will mention concrete tools and solutions to mitigate some of these challenges, we aim to address the issues in a flexible way, where multiple tools can be applied depending on the context and the skills of the teams adopting them.

1.3.1 Kicking Off a New Project

Starting a new project can be a daunting task. Whether it's a new service in an existing system or a brand-new (greenfield) project, you need to consider how to set up a Git repository, which architecture to adopt, which programming language and framework to use, and which conventions to follow. If the project is part of an existing, larger system, there may already be documentation or guidelines to follow, but they are not necessarily up to date or easy to find. If it's a new project, you need to make all these decisions from scratch or follow guidelines from other projects in the organization.

The initial setup of a project and the decisions that need to be made can cause friction for developers, leading to delays in starting the project or making the wrong decisions.

As a developer, you might wonder:

  • I need to build a new service. Where do I begin?
  • Who should I talk to about the architecture?
  • How can I collaborate with my teammates on the new project?
  • How do I set up the new project to comply with the organization's policies?
  • Which programming languages and frameworks should I use?

1.3.2 Setting Up a Development Environment

Whether you've just started a new project or are ready to implement a new feature on an existing one, you need to set up your development environment. That includes installing the necessary tools, configuring your IDE, and setting up the project to run locally. Too often, such configurations are carried out manually based on instructions that might be outdated or incomplete. That can cause friction among developers, leading to inconsistencies between their environments and making it hard to collaborate and reproduce issues.

As a developer, you might wonder:

  • Do I have all the necessary tools installed?
  • Am I using the correct version of the tools?
  • Why am I getting different results than my teammates?
  • I would like to introduce a new tool to my workflow. How can I share it with my team?
  • I work on macOS, but my teammates use Linux/Windows. How can we ensure consistency between our environments?

1.3.3 Making a Change

Given a requirement and a set of acceptance criteria, developers enter the inner loop and start making a change to fulfill the initial requirement. This point can be a source of friction due to many factors. First, the goal is to iterate quickly and get feedback as soon as possible, but that's hard to achieve if it takes a long time to compile and validate the change. Second, the change could benefit from adopting a new library or framework, but that's hard to do if there's no straightforward process to introduce and maintain new technologies. Third, the change might require altering the architecture/infrastructure or introducing a new external integration, such as a database or a machine learning inference service. However, that's challenging if the architecture is not flexible or integration is complex to set up.

As a developer, you might wonder:

  • How long does it take to compile the project the first time?
  • How can I quickly validate my change?
  • How can I introduce a new library or framework to my project?
  • What's the process for introducing a new external integration?
  • Is there a guideline to follow when changing the architecture?

1.3.4 Testing a Change

Writing automated tests for your change is crucial to ensure the code's quality and prevent regressions. Furthermore, it's an essential prerequisite to enable continuous delivery. Whether you're doing test-driven development or writing tests after the implementation, this can be a source of friction caused by several factors.

The goal of the inner loop is to iterate quickly and get feedback as soon as possible, but that's hard to achieve if it takes a long time to run the tests to validate the change. Depending on the project's history, it might be hard to write tests for the change, or the existing tests might be flaky or slow. If the change depends on external services (whether existing or newly introduced), you must also decide whether to mock them or use real services in the tests, and whether those services can run in the development environment.
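
One way to use real dependencies without relying on shared environments is to start them as throwaway containers from the tests themselves. The following sketch assumes Go and the testcontainers-go library; the image, credentials, and database name are placeholders.

```go
package repository

import (
	"context"
	"testing"

	"github.com/testcontainers/testcontainers-go"
	"github.com/testcontainers/testcontainers-go/wait"
)

func TestAppointmentRepository(t *testing.T) {
	ctx := context.Background()

	// Start a disposable PostgreSQL container for this test run.
	pg, err := testcontainers.GenericContainer(ctx, testcontainers.GenericContainerRequest{
		ContainerRequest: testcontainers.ContainerRequest{
			Image:        "postgres:16-alpine",
			ExposedPorts: []string{"5432/tcp"},
			Env: map[string]string{
				"POSTGRES_PASSWORD": "secret",
				"POSTGRES_DB":       "appointments",
			},
			WaitingFor: wait.ForListeningPort("5432/tcp"),
		},
		Started: true,
	})
	if err != nil {
		t.Fatalf("failed to start database container: %v", err)
	}
	t.Cleanup(func() { _ = pg.Terminate(ctx) })

	host, _ := pg.Host(ctx)
	port, _ := pg.MappedPort(ctx, "5432")

	// ... connect to host:port.Port() and exercise the repository under test.
	_ = host
	_ = port
}
```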

As a developer, you might wonder:

  • How long does it take to run the tests?
  • How quick and simple is it to write tests for my change?
  • Should I mock external services to run the tests, or can I use containerized versions of the other services?
  • If I'm forced to use real external services for legacy reasons, how do I access them?
  • Is there any test data I should initialize before running the tests?

1.3.5 Running a Change

By now, you've made a change, written tests for it, and validated it. The next step is to run the change in the development environment to ensure it works as expected. That can be a source of friction due to several factors. Building an executable artifact for the application might be a slow process, especially if a container image needs to be built. In that case, the feedback loop can be long, reducing productivity and possibly increasing frustration.

Furthermore, the change might require an external service to be running, such as a database or a message broker, which might add to the slowness of the process. Even though it would be desirable to establish an automated workflow to run the application and reload it automatically whenever you make a change, the complexity of the setup might make it hard to achieve.

As a developer, you might wonder:

  • How long does it take to build the application?
  • How can I run the application with all the required external services?
  • Can I automate the process of running the application and reloading it when I make a change?
  • Why do I have to wait so long to see the results of my change?
  • Do I have to containerize the application to run it locally?

1.3.6 Debugging a Change

After running the change in the development environment, it may not work as expected. That is the moment when you need to debug the issue. Depending on the development environment setup and the technology stack, this can be a source of friction. If it's required to containerize the application to run it locally, debugging can be challenging.

You might have instrumented the code to produce traces or logs as part of the change. Still, if the logs are not easily accessible or if the traces are not well integrated with the development environment, they are not helpful for debugging.

Some dedicated tools might be needed to debug the application, such as enabling profiling or inspecting the network traffic. Still, they might not be easily accessible or documented if they are not part of the standard development environment.

As a developer, you might wonder:

  • Why is it so difficult to debug my application? And can I debug it if it runs in a container?
  • Are there any recommended tools I can use to debug the application?
  • Are the logs, metrics, and traces easily accessible?
  • How can I profile the application to identify performance bottlenecks?
  • How can I inspect the network traffic to identify communication issues?

1.3.7 Integrating a Change with the Mainline

Once satisfied with a change, you must integrate it with the mainline, closing the inner loop and triggering the outer loop. Several factors contribute to this being a source of friction. When using continuous integration and committing to the mainline at least once per day, integrating the change with the mainline should be as smooth as possible.

If pre-integration reviews are required, there's a risk of the change being blocked or delayed if reviewers are not available. If feature branches are used instead of continuous integration, you might get stuck at this point for several days or weeks, leading to a long feedback loop and slowing down the development process. You might be in a situation where you cannot close your current task but can't fully move on to a new one either because you're waiting for the integration to happen.

As a developer, you might wonder:

  • Can I integrate my change directly into the mainline?
  • What is preventing us from adopting continuous integration?
  • How long does it take to get a review for my change from my team?
  • How can I be productive working on a new task while waiting for the integration?
  • My feature takes two weeks to complete. How can I integrate daily with the mainline?

1.3.8 Validating a Change

The outer loop is triggered whenever a new change is integrated with the mainline. When practicing continuous delivery, the deployment pipeline validates the change and ensures that the mainline is always in a releasable state. That leads to new sources of friction. The commit stage of the pipeline might take a long time to run, leading to a long feedback loop.

If something goes wrong in the pipeline, it might be hard to understand what happened and how to fix it. Sometimes the test environment cannot be easily reproduced, for example when the tests require specific hardware (such as GPUs) or a considerable amount of resources (such as capacity tests). When external services are involved, the setup used in the development environment might not be the same as the one used in the pipeline, leading to inconsistencies, high maintenance costs, and additional complexity.

As a developer, you might wonder:

  • Can I get feedback on my change quickly after integration?
  • How long does it take to run the acceptance stage of the pipeline?
  • How can I understand what went wrong in the pipeline?
  • How can I reproduce the test environment locally?
  • What are the main differences between my development environment and the one used for tests?

1.3.9 Deploying a Change

When it comes to deploying a new release, there can be more sources of friction. The output of the commit stage is an executable artifact (a release candidate). In cloud native scenarios, this artifact will typically be a container image. Depending on the strategy for packaging the application, you might be responsible for the complete containerization process as a developer, or the underlying platform might offer that as a service. Once a new release candidate is built, the acceptance stage deploys the artifact in production-like environments to conduct functional and non-functional acceptance tests.

This deployment process might require developers to provide additional configuration and be exposed to the complexity of the underlying deployment platform. After the acceptance stage, the release candidate is promoted to production, and the deployment should follow the same strategy used in the acceptance stage. However, suppose the two environments are inconsistent or the deployments are performed differently. In that case, developers might not get early feedback about the deployability of their change, leading to delays and frustration.

As a developer, you might wonder:

  • How can I package an application as a container image?
  • Am I responsible for the full containerization process? If not, how do I know that my change doesn’t break the containerization process?
  • Am I supposed to change or provide additional configuration for the deployment?
  • How can I be sure my change doesn't break the deployability of the system?
  • Should I be an expert in containers and Kubernetes to understand how my change will be deployed?

1.3.10 Observing a Change

Once a change is deployed in production, the outer loop is closed, and the change is observed. That is when developers can get feedback on the change from real users and real traffic. More friction can arise at this point. If the change is not instrumented correctly, there will be insufficient telemetry to understand how the change is behaving in production.

Depending on the setup, it might be hard to access the logs, metrics, and traces or to correlate them to understand the impact of the change. That's especially true when a unified observability solution is lacking: incident resolution slows down while teams are busy correlating data from different sources. Developers might also be required to build dashboards and alerts to monitor the change if the platform doesn't offer that as a service, adding to their cognitive load and toil.
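
To give a flavor of what such instrumentation can look like, here is a minimal Go sketch combining structured logging (with the standard library's log/slog) and an OpenTelemetry span; the service name, attributes, and function are assumptions for illustration.

```go
package appointments

import (
	"context"
	"log/slog"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/attribute"
)

func BookAppointment(ctx context.Context, patientID, doctorID string) error {
	// Create a span so the booking shows up in distributed traces.
	ctx, span := otel.Tracer("appointment-service").Start(ctx, "BookAppointment")
	defer span.End()
	span.SetAttributes(
		attribute.String("patient.id", patientID),
		attribute.String("doctor.id", doctorID),
	)

	// Structured log entry correlated with the same operation.
	slog.InfoContext(ctx, "booking appointment",
		slog.String("patient_id", patientID),
		slog.String("doctor_id", doctorID),
	)

	// ... booking logic goes here.
	return nil
}
```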

As a developer, you might wonder:

  • How can I instrument the application to collect telemetry?
  • Are production logs, metrics, and traces easily accessible to developers?
  • How can I correlate data from different sources to understand the impact of the change?
  • Do I have to build dashboards and alerts to monitor the change?
  • Why is it hard to understand the impact of the change in production?

In the next section, we'll look at a concrete scenario that we'll use throughout the book to tackle these friction points using different tools and practices.


1.4 MinSalus: Cloud Native Project

Our goal for this book is to provide you with concrete strategies and tools to improve the developer experience throughout the software development lifecycle. We'll use the MinSalus project as a case study to illustrate the concepts and practices we discuss, giving you a real-world use case to follow along with and try out for yourself.

MinSalus is a fictitious project that simulates a healthcare system. In particular, we'll focus on the workflows for patients to book and manage appointments with doctors. We believe that most people can relate to these scenarios, making it easier to understand the underlying technical challenges and solutions. Even if you're not in the healthcare industry, the principles we cover in this book can be applied to any cloud native project.

Remember that all the code used in this book is available on the book's GitHub repository. You can clone the repository and follow along with the examples or use the code as a reference for your own projects. We provide examples in Go and Java, but you can adapt the concepts to any programming language you're comfortable with. We welcome contributions to the repository, so if you find any issues or want to include examples in other languages, feel free to open a pull request.

1.4.1 The Context

Let's start by identifying users and software systems that are part of MinSalus. A patient uses MinSalus to book appointments with doctors and receives updates about schedule changes or follow-up appointments. Figure 1.6 shows a system context diagram for MinSalus and illustrates the notation that we'll use for architecture diagrams throughout the book, based on the C4 model created by Simon Brown. The C4 Model is an approach for visualizing software architecture (https://c4model.com). In the diagram, we can identify two abstractions:

  • Person. It represents one of the human users of the software system. In our example, it's a patient.
  • Software System. It represents the overall system that delivers value to its users. In our example, it's the MinSalus system.

Figure 1.6 The system context diagram for MinSalus, following the C4 model.

A system context diagram is a high-level diagram that shows the interactions between a system and its users, as well as other systems that it interacts with. It targets a broader audience, including non-technical stakeholders. At this stage, we don't consider the internal details of the system or the technologies used to implement it.

If you're a developer about to work on a new feature for MinSalus, the system context diagram is insufficient to understand the system's architecture. Let's zoom into the system to see its internal components and how they interact with each other. We can do this by creating a container diagram (figure 1.7). A container diagram shows the high-level components of a system, their responsibilities, and how they interact. It relies on a new abstraction from the C4 model:

  • Container. It represents an application or data service. In our example, we have several containers: frontend applications, backend applications, databases, event brokers, and other services.

Note

The container abstraction in the C4 model should not be confused with the concept of containers in technologies such as Podman or Docker.

Figure 1.7 The container diagram for MinSalus, following the C4 model.

The Patient Portal serves as the front end of MinSalus, allowing patients to interact with the system. It relies on a session store to manage user sessions. The Appointment Service is the backend application that manages the booking and scheduling of appointments, storing the data in a database. The Patient Service is the backend application that manages patient information and stores the data in a database. The Notification Service updates patients about their appointments, reminding them of upcoming visits or changes in the schedule. An event broker is used to communicate between services, allowing them to be decoupled and scalable.
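
To ground the diagram, here is a hedged Go sketch of how the Appointment Service might publish an event that the Notification Service later consumes; all type names, the topic, and the broker interface are assumptions for illustration rather than the book's actual code.

```go
package appointments

import (
	"context"
	"time"
)

// AppointmentBooked is published when a patient successfully books a slot.
type AppointmentBooked struct {
	AppointmentID string    `json:"appointmentId"`
	PatientID     string    `json:"patientId"`
	DoctorID      string    `json:"doctorId"`
	Slot          time.Time `json:"slot"`
}

// EventPublisher abstracts the event broker so the service stays decoupled
// from the concrete technology (e.g. Kafka, RabbitMQ, NATS).
type EventPublisher interface {
	Publish(ctx context.Context, topic string, event any) error
}

// Service books appointments and notifies the rest of the system via events.
type Service struct {
	publisher EventPublisher
}

func (s *Service) Book(ctx context.Context, booked AppointmentBooked) error {
	// ... persist the appointment in the database, then publish the event so
	// the Notification Service can inform the patient asynchronously.
	return s.publisher.Publish(ctx, "appointments.booked", booked)
}
```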

Figure 1.7 shows a partial view of the MinSalus system. We'll expand on this diagram as we progress through the book, adding more details about the services and their interactions. For example, the system will need an access control service to manage user authentication and authorization and an observability service to collect and analyze telemetry data.

1.4.2 The Process

Imagine you're a developer who has just joined the MinSalus team. On your first day, you receive a task to implement a feature to fulfill a new requirement. What do you do?

In the rest of the book, we'll guide you through the entire process, from receiving a requirement to releasing the feature to production. Step by step, we'll go through inner loop and outer loop activities, moving from development to integration to production environments.

We'll demonstrate several tools and practices that you can use to address the challenges highlighted in the Ten Friction Points of Developer Experience. We'll present different alternatives that you can choose from based on your team's context and constraints. We'll also discuss the trade-offs of each approach, helping you make informed decisions.

Are you ready? Let's start setting up the development environment and improving the inner loop experience. See you in the next chapter!


1.5 Summary

  • The path from idea to value is crucial for delivering software that effectively meets customers' needs. Cloud native technologies are an essential enabler for building modern software solutions. However, they introduce a unique set of challenges that application developers must address.
  • Many organizations require developers to interact with low-level details in Kubernetes and cloud infrastructures, reducing their time to produce more value for their customers and resulting in a substantial cognitive load increase.
  • Distributed systems are complex. When implemented on top of cloud infrastructures, they can substantially impact daily application development workflows, creating friction for developers when building against cloud services in their development environments or needing to run all the necessary dependencies to work on new features.
  • In the past few years, organizations underwent radical changes, trying to catch up with the rise and mainstream adoption of cloud computing. However, organizational transformations don't always succeed.
  • Continuous delivery is a holistic approach to developing and delivering higher-quality software faster, safer, and in a repeatable way. Continuous delivery practices give us well-tested solutions to improve the path from idea to production, but the adoption might be challenging. Unfortunately, many companies still struggle to implement these ideas.
  • As a developer, you might be overwhelmed by the many tools you need to learn and master to complete your daily tasks. Or you might get frustrated due to slow and suboptimal tools. Perhaps your organization has added unnecessary hurdles and constraints that impact your team's performance, making it more challenging to go from idea to value.
  • Good developer experiences don't emerge spontaneously—they must be deliberately designed to align with the software being developed and the toolchain in use. Yet most teams work with an inherited patchwork of tools, each designed with different intentions and constraints. That creates a "Frankenstein Experience": a cobbled-together environment that undermines developer productivity rather than enhancing it.
  • The path from idea to production starts with a clear problem statement and the definition of a requirement for solving it. A requirement could involve creating a new application, adding a new feature to an existing distributed system, fixing a bug in a microservice, or refactoring existing code in a modular monolith.
  • Given a requirement, many activities must be performed to deliver a working software solution. These activities can be grouped into two main categories: the inner loop and the outer loop.
  • The inner loop is where developers spend most of their time. It's the cycle of activities that a developer performs to write, test, run, and debug code. It includes practices such as test-driven development and continuous integration.
  • The outer loop is the cycle of activities after a developer commits a change to the mainline in a version control system until the change is deployed to production and operational. The central part of the outer loop is the deployment pipeline, which is the key pattern in continuous delivery and represents the only path to production.
  • The daily workflow of an application developer can be full of challenges, creating friction throughout the software development lifecycle. The Ten Friction Points of Developer Experience highlight the main areas that affect the developer experience and cause pain.
  • In the C4 model, a system context diagram is a high-level diagram that shows the interactions between a system and its users and other systems that it interacts with. It targets a broader audience, including non-technical stakeholders. At this stage, we don't consider the internal details of the system or the technologies used to implement it.
  • A container diagram shows the high-level components of a system, their responsibilities, and how they interact. In this context, a container is an application or data service.

[1] J. Humble, “There's No Such Thing as a ‘DevOps Team’” October 19, 2012, https://continuousdelivery.com/2012/10/theres-no-such-thing-as-a-devops-team

[2] K. Mugrage, “My definition of DevOps,” December 8, 2020, https://dev.to/kmugrage/my-definition-of-devops-2baj

[3] J. Humble, D. Farley, “Continuous Delivery”, Addison-Wesley Professional, 2010

[4] Cloud Native Computing Foundation, “CNCF Cloud Native Definition v1.1”, https://github.com/cncf/toc/blob/main/DEFINITION.md

[5] F. Fagerholm and J. Münch, “Developer Experience: Concept and Definition”, University of Helsinki, https://researchportal.helsinki.fi/en/publications/developer-experience-concept-and-definition

[6] Test-Driven Development is one of the practices of Extreme Programming (XP), a software development methodology created by Kent Beck and described in his book "Extreme Programming Explained: Embrace Change" (2nd Edition, Addison-Wesley Professional, 2004). Too often, TDD is misunderstood, leading its author to publish an interesting article about what is "Canon TDD" (https://tidyfirst.substack.com/p/canon-tdd).

[7] Continuous Integration is one of the practices of Extreme Programming (XP), a software development methodology created by Kent Beck. You can learn more about it in the book "Continuous Integration: Improving Software Quality and Reducing Risk" by Paul M. Duvall, Steve Matyas, and Andrew Glover (Addison-Wesley Professional, 2007).

[8] Martin Fowler, “Keystone Interface”, https://martinfowler.com/bliki/KeystoneInterface.html.

[9] Mainline integration, pre-integration reviews, and feature branches are practices described by Martin Fowler in his article "Patterns for Managing Source Code Branches" (https://martinfowler.com/articles/branching-patterns.html).

[11] Martin Fowler, "Continuous Integration Certification" (https://martinfowler.com/bliki/ContinuousIntegrationCertification.html).

[12] J. Humble, D. Farley, “Continuous Delivery”, Addison-Wesley Professional, 2010. D. Farley, “Continuous Delivery Pipelines”, 2021.

[13] Continuous deployment is a practice originally suggested by Timothy Fitz in 2009 (http://timothyfitz.com/2009/02/08/continuous-deployment).
