6 Testing

This chapter covers

  • Identifying which type of tests to write for infrastructure systems
  • Writing tests to verify infrastructure configuration or modules
  • Understanding the cost of different types of tests

Recall from Chapter 1 that infrastructure as code involves an entire process to push a change to a system. You update scripts or configurations with infrastructure changes, push them to a version control system, and apply the changes in an automated way. However, you can use every module and dependency pattern from chapters 3 and 4 and still have failed changes! How do you catch a failed change before you apply it to production?

You can solve this problem by implementing tests for infrastructure as code. Testing is a process that evaluates whether or not a system works as expected. This chapter reviews some considerations and concepts related to testing infrastructure as code to reduce the rate of change failure and build confidence in infrastructure changes.

Definition

Testing infrastructure as code is a process that evaluates whether or not infrastructure works as expected.

Imagine you configure a network switch with a new network segment. You can manually test existing networks by pinging each server on each network and verifying their connectivity. To test if you set up the new network correctly, you create a new server and check if it responds when you connect to it. This manual test takes a few hours for two or three networks.

As you create more networks, you can spend days verifying your network connectivity. For every network segment update, you must manually verify the network connectivity and the servers, queues, databases, and other resources running on the network. You cannot test everything, so you only check a few resources. Unfortunately, this approach can leave hidden bugs or issues that only appear weeks, even months, later!

To reduce the burden of manual testing, you can instead automate your tests by scripting each command. Your script creates a server on the new network, checks its connectivity, and tests connections to existing networks. You invest some time and effort into writing the tests but save hours of manual verification by running an automated script for any subsequent changes to the network.

Figure 6.1 shows the amount of effort in hours compared to the number of infrastructure resources when you do manual and automated testing. When you run the network tests manually, you have to spend a lot of time on testing. The effort increases the more resources you add to your system. By comparison, writing automated tests takes an initial effort. However, the effort to maintain the test generally decreases as your system grows. You can even run automated tests in parallel to reduce the overall testing effort.

Figure 6.1. Manual testing may have lower effort initially, but as the number of infrastructure resources in your system increases, it increases in effort. Automated testing takes a high initial effort but decreases as you grow your system.

Of course, testing doesn’t catch every problem or eliminate all failures from your system. However, automated testing serves as documentation for what you should test in your system every time you make a change. If a hidden bug chooses to appear, you spend some time writing a new test to verify the bug doesn’t happen again! Tests lower the overall operational effort over time.

You can use testing frameworks for your infrastructure provider or tool or native testing libraries in programming languages. The code listings use a Python testing framework called pytest and Apache Libcloud, a Python library to connect to Google Cloud Platform. I tried to write the tests to focus on what the test verifies and not the syntax. You can apply the general approach to any tool or framework.

Do not write tests for every single bit of infrastructure as code in your system. Tests can become difficult to maintain and, on occasion, redundant. Instead, I’ll explain how to assess when to write a test and what type of test applies to the resource you’re changing. Infrastructure testing is a heuristic: you’re never going to be able to predict or simulate a change to production fully. A helpful test gives you insight into, and practice with, configuring infrastructure, or shows how a change will impact a system. I’ll also separate which tests apply to modules such as factories, prototypes, or builders versus general composite or singleton configuration for a live environment.

6.1 The infrastructure testing cycle

Testing helps you gain confidence and assess the impact of changes to infrastructure systems. However, how can you test a system without creating it first? Furthermore, how do you know that your system works after applying changes?

You can use the infrastructure testing cycle in figure 6.2 to structure your testing workflow. After you define an infrastructure configuration, you run some initial tests to check your configuration. If they pass, you can apply the changes to active infrastructure and test the system.

Figure 6.2. Infrastructure testing verifies whether or not you can apply changes to a system. After applying changes, you can use additional tests to confirm the changes succeeded.

In this workflow, you run two types of tests. One kind of test statically analyzes the configuration before you deploy the infrastructure changes, and the other dynamically analyzes the infrastructure resource to make sure it still works. Most of your tests follow this pattern by running before and after change deployment.

6.1.1 Static analysis

How would you apply the infrastructure testing cycle to our network example? Imagine you parse your network script to verify that the new network segment has the correct IP address range. You don’t need to deploy the changes to the network. Instead, you analyze the script, a static file.

In figure 6.3, you define the network script and run static analysis. If you find the wrong IP address, the tests fail. You can revert or fix your network changes and re-run the tests. If they pass, you can apply the correct network IP address to the active network.

Figure 6.3. You can either fix the configuration to pass the tests or revert to a previously successful configuration when static analysis fails.

Tests that evaluate infrastructure configuration before deploying changes to infrastructure resources perform static analysis.

Definition

Static analysis for infrastructure as code verifies plaintext infrastructure configuration before deploying changes to live infrastructure resources.

Tests for static analysis do not require infrastructure resources since they usually parse the configuration. They do not run the risk of impacting any active systems. If static analysis tests pass, we have more confidence that we can apply the change.
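
As a minimal sketch of the network example, assume the network script is a small JSON document with an ip_range field and that new segments must fall inside 10.0.0.0/16. Both the file format and the names NETWORK_SCRIPT, ALLOWED_RANGE, and segment_in_allowed_range are hypothetical. The check only parses text and never touches a live network.

```python
import ipaddress
import json

# Hypothetical network script content; a real test would read it from a file.
NETWORK_SCRIPT = '{"segment": "app", "ip_range": "10.0.3.0/24"}'

# Assumed parent range that every new segment must fall within.
ALLOWED_RANGE = ipaddress.ip_network("10.0.0.0/16")


def segment_in_allowed_range(script_text, allowed=ALLOWED_RANGE):
    """Parse the static configuration and check the segment's IP range."""
    config = json.loads(script_text)
    segment = ipaddress.ip_network(config["ip_range"])
    # subnet_of() is True when the segment sits entirely inside the allowed range.
    return segment.subnet_of(allowed)
```

A test runner such as pytest could wrap the function in a single assertion and fail the change before anything gets deployed.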

I often use static analysis tests to check for infrastructure naming standards and dependencies. They run before applying changes, and in a matter of seconds, they identify any inconsistent naming or configuration concerns. I can correct the changes, rerun the tests to pass, and apply the changes to infrastructure resources.
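
A naming check of this kind might look like the following sketch. The naming standard itself (lowercase words joined by hyphens, ending in a resource type) is an assumption for illustration, as are the names NAME_PATTERN and find_nonconforming_names.

```python
import re

# Assumed team standard: lowercase words joined by hyphens,
# ending in the resource type.
NAME_PATTERN = re.compile(
    r"^[a-z][a-z0-9]*(-[a-z0-9]+)*-(network|subnet|server)$")


def find_nonconforming_names(names):
    """Return the resource names that do not follow the naming standard."""
    return [name for name in names if not NAME_PATTERN.match(name)]
```

Returning the list of offending names, rather than a single boolean, makes the test failure message point directly at what to fix.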

Tests for static analysis do not apply changes to active infrastructure, making rollback more straightforward. If tests for static analysis fail, you can return to the infrastructure configuration, correct the problems, and commit the changes again. If you cannot fix the configuration to pass static analysis, you can revert your commit to a previous one that succeeds! You’ll learn more about reverting changes in chapter 11.

6.1.2 Dynamic analysis

If the static analysis passes, you can deploy changes to the network. However, you don’t know if the network segment actually works. After all, a server needs to connect to the network. To test connectivity, you create a server on the network and run a test script to check inbound and outbound connectivity.

Figure 6.4 shows the cycle of testing network functionality. Once you apply changes to the live infrastructure environment, you run some tests to check the functionality of the system. If the test script fails and shows the server cannot connect, you return to the configuration and fix it for the system.

Figure 6.4. When dynamic analysis fails, you can fix the testing environment by updating the configuration or reverting to a previously working configuration.

Note that your testing script needs a live network to create a server and test its connectivity. The tests that verify infrastructure functionality after applying changes to live infrastructure resources perform dynamic analysis.

Definition

Dynamic analysis for infrastructure as code verifies system functionality after applying changes to live infrastructure resources.

When these tests pass, we have more confidence that the update succeeded. If they fail, they identify a problem in the system, and you know that you need to debug, fix the configuration or scripts, and rerun the tests. They provide an early warning system for changes that might break infrastructure resources and system functionality.
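
The connectivity check in the network example could be sketched with Python's standard socket library. This is only an illustration of dynamic analysis: the function name can_connect is hypothetical, and a real test would point it at the address and port of the server created on the new network segment.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds in time."""
    try:
        # create_connection raises an OSError subclass on refusal or timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Unlike a static check, this function only gives a meaningful answer when something is actually running at the other end, which is why it belongs after you apply the change.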

You can only dynamically analyze a live environment. What if you don’t know if the update will work? Can you isolate these tests from a production environment? Rather than apply all changes to a production environment and test it, you can use an intermediate testing environment to separate your updates and test them.

6.1.3 Infrastructure testing environments

Some organizations duplicate entire networks in a separate environment so they can test larger network changes. Applying changes to a testing environment makes it easier to identify and fix the broken system, update configuration, and commit the new changes without affecting business-critical systems.

When you run your tests in a separate environment before promoting to the active one, you add to the infrastructure testing cycle. In figure 6.5, you keep the static analysis step. However, you apply your network change in a testing environment and run dynamic analysis. If it passes in the testing environment, you can apply the changes to production and run dynamic analysis in production.

Figure 6.5. You can run a static and dynamic analysis of infrastructure in development before applying the changes to production.

A testing environment isolates changes and tests from the production environment.

Definition

A testing environment is an environment that is separate from production and used for testing infrastructure changes.

A testing environment before production helps you practice and check changes before deploying them to production. You better understand how they affect existing systems. If you cannot fix the updates, you can revert the testing environment to a working configuration version.

You can use testing environments for the following:

  • Examine the effect of an infrastructure change before applying it to a production system
  • Isolate testing for infrastructure modules (refer to Chapter 5 for module sharing practices)

However, keep in mind that you have to maintain testing environments like production environments. When possible, an infrastructure testing environment should adhere to the following requirements:

  • Its configuration must be as similar to production as possible.
  • It must be a different environment from the application’s development environment.
  • It must be persistent (i.e., do not create and destroy it each time you test).

In previous chapters, I mentioned the importance of reducing drift across environments. If your infrastructure testing environment duplicates production, you will have more accurate testing behavior. You also want to test infrastructure changes in isolation, away from a development environment dedicated to applications. Once you’ve confirmed that your infrastructure changes have not broken anything, you can push them to the application’s development environment.

It helps to have a persistent infrastructure testing environment. This way, you can test whether or not updates to running infrastructure will potentially affect business-critical systems. Unfortunately, maintaining an infrastructure testing environment may not be practical from a cost or resources standpoint. I’ll outline some techniques for cost management of testing environments in Chapter 12.

In the remainder of the chapter, I’ll discuss the different types of tests that perform static and dynamic analysis and how they fit into your testing environment. Some tests will allow you to reduce your dependency on a testing environment. Others will be critical to assessing the functionality of a production system after changes. Later in this book, I will cover rollback techniques specific to production and incorporate testing into continuous infrastructure delivery.

6.2 Unit tests

I mentioned the importance of running static analysis on infrastructure as code. Static analysis evaluates the files for specific configurations. What kinds of tests can you write for static analysis?

Imagine you have a factory module to create a network named “hello-world-network” and three subnets with IP address ranges in 10.0.0.0/16. You want to verify their network names and IP ranges. You expect the subnets to divide the 10.0.0.0/16 range amongst themselves.

As a solution, you can write some tests to check the network name and subnet IP address ranges in your infrastructure as code without creating the network and subnet. This static analysis verifies the configuration parameters for expected values in a matter of seconds.
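
To see which values such a test would expect, you can compute the subnet ranges with Python's ipaddress module. This sketch assumes each subnet takes a /24 slice of the parent range, which matches the 10.0.{i}.0/24 pattern the listings in this chapter check for. The function name expected_subnet_ranges is hypothetical.

```python
import ipaddress
import itertools


def expected_subnet_ranges(ip_range, number_of_subnets, new_prefix=24):
    """Return the first N subnet CIDR ranges carved out of a parent range."""
    parent = ipaddress.ip_network(ip_range)
    # subnets() yields consecutive, non-overlapping slices of the parent.
    subnets = parent.subnets(new_prefix=new_prefix)
    return [str(s) for s in itertools.islice(subnets, number_of_subnets)]
```

For 10.0.0.0/16 and three subnets, the result is 10.0.0.0/24, 10.0.1.0/24, and 10.0.2.0/24.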

Figure 6.6 shows that your static analysis consists of several tests run simultaneously. You check the network name, number of subnets, and IP ranges for subnets.

Figure 6.6. Unit tests verify that a configuration parameter, such as network name, equals an expected value.

We just ran unit tests on the network infrastructure as code. A unit test runs in isolation and statically analyzes infrastructure configuration or state. These tests do not rely on active infrastructure resources or dependencies and check for the smallest subset of configuration.

Definition

Unit tests statically analyze plaintext infrastructure configuration or state. They do not rely on live infrastructure resources or dependencies.

Note that unit tests can analyze metadata in infrastructure configuration or state files. Some tools offer information directly in configuration, while others expose values through state. The next few sections provide examples to test both types of files. Depending on your infrastructure as code tool, testing framework, and preference, you may test one, the other, or both.

6.2.1 Testing infrastructure configuration

We’ll start by writing unit tests for modules that use templates to generate infrastructure configuration. Our network factory module uses a function to create an object with the network configuration. You need to know if the function “_network_configuration()” generates the correct configuration.

For the network factory module, you can write unit tests in pytest to check the functions that generate the JSON configuration for networks and subnets. The testing file includes three tests, one each for the network name, the number of subnets, and the subnet IP ranges.

Pytest will identify tests by looking for files and tests prefixed by “test_”. We named the testing file “test_network.py” so pytest can find it. The tests in the file each have the prefix “test_” and some descriptive information on what the test checks.

Listing 6.1 Use pytest to run unit tests in test_network.py
import pytest    #A
from main import NetworkFactoryModule    #B
 
NETWORK_PREFIX = 'hello-world'    #C
NETWORK_IP_RANGE = '10.0.0.0/16'    #C
 
 
@pytest.fixture(scope="module")    #D
def network():    #D
   return NetworkFactoryModule(    #D
       name=NETWORK_PREFIX,    #D
       ip_range=NETWORK_IP_RANGE,    #D
       number_of_subnets=3)    #D
 
 
@pytest.fixture    #E
def network_configuration(network):    #E
   return network._network_configuration()['google_compute_network'][0]    #E
 
 
@pytest.fixture    #F
def subnet_configuration(network):    #F
   return network._subnet_configuration()[    #F
       'google_compute_subnetwork']    #F
 
 
def test_configuration_for_network_name(network, network_configuration):    #G
   assert network_configuration[network._network_name][
       0]['name'] == f"{NETWORK_PREFIX}-network"
 
 
def test_configuration_for_three_subnets(subnet_configuration):     #H
   assert len(subnet_configuration) == 3
 
 
def test_configuration_for_subnet_ip_ranges(subnet_configuration):     #I
   for i, subnet in enumerate(subnet_configuration):
       assert subnet[next(iter(subnet))
                     ][0]['ip_cidr_range'] == f"10.0.{i}.0/24"

The testing file includes a static network object passed between tests. This test fixture creates a consistent network object that each test can reference. It reduces repetitive code used to build a test resource.

Definition

A test fixture is a known configuration used to run a test. It often reflects known or expected values for a given infrastructure resource.

Some of the fixtures separately parse the network and subnet information. Any time we add new tests, we don’t have to copy-and-paste the parsing. Instead, we reference the fixture for the configuration.

You can run pytest in your command line and pass an argument with a test file. Pytest runs a set of three tests and outputs their success.

$ pytest test_network.py
==================== test session starts ====================
collected 3 items
 
test_network.py ...                                    [100%]
 
===================== 3 passed in 0.06s =====================

In this example, we imported the network factory module, created a network object with configuration, and tested it. You don’t need to write any configuration to a file. Instead, you reference the function and test the object.

This example uses the same approach I take to unit testing application code. It often results in smaller, more modular functions that you can test more efficiently. The function that generates the network configuration needs to output the configuration for the test. Otherwise, the tests cannot parse and compare the values.
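
For reference, a module that satisfies the tests in listing 6.1 could look roughly like the sketch below. It is a hypothetical stand-in and not the module from the book's code: the dictionary shape is inferred from how the tests index into the configuration, and the subnet naming and the /24 split are assumptions.

```python
import ipaddress
import itertools


class NetworkFactoryModule:
    """Hypothetical stand-in for the module imported in listing 6.1."""

    def __init__(self, name, ip_range, number_of_subnets=3):
        self._network_name = f"{name}-network"
        self._subnet_prefix = f"{name}-subnet"  # assumed naming
        self._ip_range = ip_range
        self._number_of_subnets = number_of_subnets

    def _network_configuration(self):
        # Return a plain dictionary so a test can inspect it directly,
        # without writing any file.
        return {
            "google_compute_network": [
                {self._network_name: [{"name": self._network_name}]}
            ]
        }

    def _subnet_configuration(self):
        # Assumed: each subnet gets a /24 slice of the parent range.
        ranges = itertools.islice(
            ipaddress.ip_network(self._ip_range).subnets(new_prefix=24),
            self._number_of_subnets)
        return {
            "google_compute_subnetwork": [
                {f"{self._subnet_prefix}-{i}": [{
                    "name": f"{self._subnet_prefix}-{i}",
                    "network": self._network_name,
                    "ip_cidr_range": str(cidr),
                }]}
                for i, cidr in enumerate(ranges)
            ]
        }
```

Because both methods return data instead of writing it out, the tests can call them and compare values without any setup or cleanup.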

6.2.2 Testing domain-specific languages

How do you test your network and subnet configuration if you use a domain-specific language (DSL)? You don’t have functions that you can call in your test. Instead, your unit tests must parse values out of the configuration or dry run file. Both types of files store some kind of plaintext metadata about infrastructure resources.

Imagine you used a DSL instead of Python to create your network. The example creates a JSON file with Terraform-compatible configuration. The JSON file contains all three subnetworks, their IP address ranges, and names. In figure 6.7, you decide to run the unit tests against the network’s JSON configuration file. The tests run quickly because you do not deploy the networks.

Figure 6.7. Unit tests against dry runs require generating a preview of changes to infrastructure resources and checking it for valid parameters.

In general, you can always unit test the files you used to define infrastructure as code. If a tool uses a configuration file, like AWS CloudFormation, HashiCorp Terraform, Ansible, Puppet, Chef, and more, you can unit test any lines in the configuration.

For example, you can test the network name, number of subnets, and subnet IP address ranges for your network module without generating a dry run. I run similar tests with pytest to check the same parameters.

Listing 6.2 Use pytest to run unit tests in test_network_configuration.py
import json    #A
import pytest
 
NETWORK_CONFIGURATION_FILE = 'network.tf.json'    #B
 
expected_network_name = 'hello-world-network'    #C
 
 
@pytest.fixture(scope="module")    #D
def configuration():    #D
   with open(NETWORK_CONFIGURATION_FILE, 'r') as f:    #D
       return json.load(f)    #D
 
 
@pytest.fixture
def resource():    #E
   def _get_resource(configuration, resource_type):    #E
       for resource in configuration['resource']:    #E
           if resource_type in resource.keys():    #E
               return resource[resource_type]    #E
   return _get_resource    #E
 
 
@pytest.fixture    #F
def network(configuration, resource):    #F
   return resource(configuration, 'google_compute_network')[0]    #F
 
 
@pytest.fixture    #G
def subnets(configuration, resource):    #G
   return resource(configuration, 'google_compute_subnetwork')    #G
 
 
def test_configuration_for_network_name(network):    #H
   assert (network[expected_network_name][0]['name']    #H
           == expected_network_name)    #H
 
 
def test_configuration_for_three_subnets(subnets):    #I
   assert len(subnets) == 3    #I
 
 
def test_configuration_for_subnet_ip_ranges(subnets):    #J
   for i, subnet in enumerate(subnets):    #J
       assert subnet[next(iter(subnet))    #J
                     ][0]['ip_cidr_range'] == f"10.0.{i}.0/24"    #J

You might notice the unit tests for DSLs look similar to those of programming languages. They check the network name, number of subnets, and IP addresses. Some tools have specialized testing frameworks. They usually use the same workflow of generating a dry run or state file and parsing it for values.

However, your configuration file may not contain everything. For example, you won’t have certain configurations in HashiCorp Terraform or Ansible until after you do a dry run. A dry run previews infrastructure as code changes without deploying them and internally identifies and resolves potential problems.

Definition

A dry run previews infrastructure as code changes without deploying them. It internally identifies and resolves potential problems.

Dry runs come in different formats and standards. Most dry runs output to a terminal, and you can save that output to a file. Some tools will automatically generate the dry run to a file.

As a general practice, I prioritize tests that check configuration files. I write tests to parse dry runs when I cannot get the value from configuration files. A dry run typically needs network access to the infrastructure provider API and takes a bit of time to run. On occasion, the output or file contains some sensitive information or identifiers that I do not want a test to explicitly parse.

While dry run configuration may not adhere to the more traditional software development definition of unit tests, the parsing of dry runs does not require any changes to active infrastructure. It remains a form of static analysis. The dry run itself serves as a unit test to validate and output the expected change behavior before applying the change.
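
As a rough sketch of parsing a dry run, assume you saved the dry run as JSON in a shape similar to Terraform's machine-readable plan output, with a resource_changes list whose entries carry a type and a change containing actions and after. The sample below is made up and heavily trimmed, and the names SAMPLE_DRY_RUN and planned_resources are hypothetical.

```python
import json

# Made-up, heavily trimmed sample shaped like a JSON plan file.
SAMPLE_DRY_RUN = """
{
  "resource_changes": [
    {"type": "google_compute_network", "name": "hello-world-network",
     "change": {"actions": ["create"],
                "after": {"name": "hello-world-network"}}},
    {"type": "google_compute_subnetwork", "name": "hello-world-subnet-0",
     "change": {"actions": ["create"],
                "after": {"ip_cidr_range": "10.0.0.0/24"}}}
  ]
}
"""


def planned_resources(dry_run_text, resource_type):
    """Return planned attributes of resources the dry run would create."""
    plan = json.loads(dry_run_text)
    return [
        item["change"]["after"]
        for item in plan.get("resource_changes", [])
        if item["type"] == resource_type
        and "create" in item["change"]["actions"]
    ]
```

A unit test can then assert on the returned attributes, for example that every planned subnet has an IP range in the expected block.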

6.2.3 When should you write unit tests?

Unit tests help you verify that your logic generates the correct names, produces the correct number of infrastructure resources, and calculates the correct IP ranges or other attributes. Some unit tests may overlap with formatting and linting, concepts I mentioned in Chapter 2. I classify linting and formatting as part of unit testing because they help you understand how to name and organize your configuration.

Figure 6.8 summarizes some use cases for unit tests. You should write additional unit tests to verify any logic you used to generate infrastructure configuration, especially with loops or conditional (if-else) statements. Unit tests can also capture wrong or problematic configurations, such as the wrong operating system.

Figure 6.8. Write unit tests to verify the resource logic, highlight potential problems, or identify team standards.

Since unit tests check the configuration in isolation, they do not precisely reflect how a change will affect a system. As a result, you can’t expect a unit test to prevent a major failure during production changes. However, you should still write unit tests! While they won’t identify problems while running a change, unit tests can prevent problematic configurations before production.

For example, someone might accidentally type a configuration for 1000 servers instead of 10 servers. A test to verify the maximum number of servers in a configuration can prevent someone from overwhelming the infrastructure and help manage cost. Unit tests can also keep insecure or non-compliant infrastructure configuration out of a production environment. I will cover how to apply unit tests to secure and audit infrastructure configuration in Chapter 8.
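
A guardrail like the server limit could be sketched as follows. The limit of 10, the google_compute_instance resource type, and the function names are assumptions for illustration. The configuration shape follows the one parsed in listing 6.2, where resource is a list of dictionaries keyed by resource type.

```python
MAX_SERVERS = 10  # assumed limit agreed on by the team


def count_resources(configuration, resource_type):
    """Count resources of one type in a parsed JSON configuration."""
    total = 0
    for resource in configuration.get("resource", []):
        total += len(resource.get(resource_type, []))
    return total


def within_server_limit(configuration, limit=MAX_SERVERS):
    """Return True when the configuration defines at most `limit` servers."""
    return count_resources(configuration, "google_compute_instance") <= limit
```

The test fails in seconds on a typo like 1000 servers, long before the provider starts creating anything.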

In addition to early identification of wrong configuration values, unit tests help automate checking complex systems. When you have many infrastructure resources managed by different teams, you can no longer manually search through one resource list and check each configuration. Unit tests communicate the most critical or standard configurations to other teams. When you write unit tests for infrastructure modules, you verify that the internal logic of the module produces the expected resources.

Use cases for unit tests include checking that you’ve created the expected number of infrastructure resources, pinned specific versions of infrastructure, or used the correct naming standard. Unit tests run quickly and offer rapid feedback at virtually zero cost (after you’ve written them!). They run on the order of seconds because they do not post updates to infrastructure or require the creation of active infrastructure resources. If you write unit tests to check the output of a dry run, you add a bit of time because of the initial time spent generating the dry run.

6.3 Contract tests

Unit tests verify configuration or modules in isolation, but what about dependencies between modules? In Chapter 4, I mentioned the idea of a contract between dependencies. The output from a module must agree with the expected input to another. You can use tests to enforce that agreement.

For example, let’s create a server on a network. The server accesses the network name and IP address using a facade, which mirrors the name and IP address range of the network. How do you know that the network module outputs the network name and IP CIDR range and not another identifier or configuration?

You use a contract test in figure 6.9 to test that the network module outputs the facade correctly. The facade must contain the network name and IP address range. If the test fails, it shows that the server cannot create itself on the network.

Figure 6.9. Contract tests can quickly verify a configuration parameter equals an expected value, such as a network facade with proper outputs.

A contract test uses static analysis to check that module inputs and outputs match an expected value or format.

Definition

Contract tests statically analyze and compare module or resource inputs and outputs to match an expected value or format.

Contract tests help enable evolvability of individual modules while preserving the integration between the two. When you have many infrastructure dependencies, you cannot manually check all of their shared attributes. Instead, a contract test automates the verification of the type and value of attributes between modules.

You’ll find contract tests most useful for checking inputs and outputs of heavily parameterized modules (such as factory, prototype, or builder patterns). Writing and running contract tests helps detect wrong inputs and outputs and documents the module’s minimum resources. When you do not have contract tests for your modules, you won’t find out if you broke something in the system until the next time you apply the configuration to a live environment.

Let’s implement a contract test for the server and the network. Using pytest, you set up the test by creating a network with a factory module. Then, you verify the network’s output includes a facade object with the network name and IP address range. You add these tests to the server’s unit tests.

Listing 6.3 Contract test to compare the module outputs with inputs
from network import NetworkFactoryModule, NetworkFacade
import pytest
 
network_name = 'hello-world'    #E
network_cidr_range = '10.0.0.0/16'    #F
 
 
@pytest.fixture
def network_outputs():     #A
   network = NetworkFactoryModule(    #B
       name=network_name,    #B
       ip_range=network_cidr_range)    #B
   return network.outputs()    #C
 
 
def test_network_output_is_facade(network_outputs):    #D
   assert isinstance(network_outputs, NetworkFacade)    #D
 
 
def test_network_output_has_network_name(network_outputs):     #E
   assert network_outputs._network == f"{network_name}-subnet"     #E
 
 
def test_network_output_has_ip_cidr_range(network_outputs):     #F
   assert network_outputs._ip_cidr_range == network_cidr_range     #F

Imagine you update the network module to output the network ID instead of the name. That breaks the functionality of the upstream server module because the server expects the network name! Contract testing ensures that you do not break the contract (or interface) between two modules when you update either one. Use a contract test to verify your facades and adapters when expressing dependencies between resources.

Why should you add the example contract test to the server, a higher level resource? Your server expects specific outputs from the network. If the network module changes, you want to detect it from the high-level module first.

In general, a high-level module should defer to changes in the low-level module to preserve composability and evolvability. You want to avoid making significant changes to the interface of a low-level module because it may affect other modules that depend on it.
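
One way to picture the contract is as a list of attributes and types that the high-level module requires from the facade. The sketch below is hypothetical: this NetworkFacade is a simplified stand-in, not the class imported in listing 6.3, and REQUIRED_FACADE_ATTRIBUTES and contract_violations are illustrative names.

```python
class NetworkFacade:
    """Hypothetical facade exposing only what dependents need."""

    def __init__(self, network, ip_cidr_range):
        self._network = network
        self._ip_cidr_range = ip_cidr_range


# The contract from the server's point of view: attribute name -> type.
REQUIRED_FACADE_ATTRIBUTES = {"_network": str, "_ip_cidr_range": str}


def contract_violations(facade, required=REQUIRED_FACADE_ATTRIBUTES):
    """Return a list describing each way the facade breaks the contract."""
    problems = []
    for attribute, expected_type in required.items():
        if not hasattr(facade, attribute):
            problems.append(f"missing {attribute}")
        elif not isinstance(getattr(facade, attribute), expected_type):
            problems.append(f"{attribute} is not {expected_type.__name__}")
    return problems
```

If the network module started returning an object with a network ID in place of the name, the check would report the missing attribute without creating any resources.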

Infrastructure contract tests require some way to extract the expected inputs and outputs, which may involve API calls to infrastructure providers and verifying the responses against expected values for modules. Sometimes, this involves creating test resources to examine the parameters and understand how fields like ID should be structured. When you need to make API calls or create temporary resources, your contract tests can run longer than a unit test.

6.4 Integration tests

How do you know that you can apply your configuration or module changes to an infrastructure system? You need to apply the changes to a testing environment and dynamically analyze the running infrastructure. An integration test runs against test environments to verify successful changes to a module or configuration.

Definition

Integration tests run against testing environments and dynamically analyze infrastructure resources to verify if they are affected by module or configuration changes.

Integration tests require an isolated testing environment to verify the integration of modules and resources. In the next sections, you’ll learn about the different integration tests you can write for infrastructure modules and configurations.

6.4.1 Testing modules

Imagine a module that creates a Google Cloud Platform (GCP) server. You want to make sure you can create and update the server successfully. In figure 6.10, you write an integration test. First, you configure the server, name it “hello-world-test,” and apply the changes to a testing environment. Then, you run integration tests to check that your configuration update succeeds and creates the server. The test takes a few minutes to run because you need to wait for the server to provision.

Figure 6.10. Integration tests usually create and update infrastructure resources in a testing environment, test their configuration and status for correctness or availability, and remove them after the tests.

When you implement an integration test, you need to compare the active resource to your infrastructure as code. The active resource tells you whether or not your module deployed successfully. If someone cannot deploy the module, they potentially break their infrastructure.

An integration test must retrieve information about the active resource with the infrastructure provider’s API. For example, the integration test for your server module imports Apache Libcloud, a Python library, as a client SDK for the Google Cloud Platform (GCP) API.

The test builds the server’s configuration using the module, waits for the server to deploy, and checks the server’s state in the GCP API. If the server returns a “running” status, then the test passes. Otherwise, the test fails and identifies a problem with the module. Finally, the test tears down the test server it created.

Listing 6.4 Integration tests for server creation in test_integration.py
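from libcloud.compute.types import NodeState
from main import generate_json, SERVER_CONFIGURATION_FILE
import os
import pytest
import subprocess
import test_utils

TEST_SERVER_NAME = 'hello-world-test'


# Session-scoped fixture: applies the configuration once, creating the
# test server, and shares the result with every test in the session.
@pytest.fixture(scope='session')
def apply_changes():
    # Generate the configuration file that uses the server module.
    generate_json(TEST_SERVER_NAME)
    assert os.path.exists(SERVER_CONFIGURATION_FILE)
    # Initialize, then deploy the server from the generated configuration.
    assert test_utils.initialize() == 0
    yield test_utils.apply()
    # Teardown: delete the test server and remove the configuration file.
    assert test_utils.destroy() == 0
    os.remove(SERVER_CONFIGURATION_FILE)


# Check that applying the changes exited with a successful return code.
def test_changes_have_successful_return_code(apply_changes):
    return_code = apply_changes[0]
    assert return_code == 0


# Verify that applying the changes wrote nothing to the error output.
def test_changes_should_have_no_errors(apply_changes):
    errors = apply_changes[2]
    assert errors == b''


# Check that the apply output reports one added resource, the server.
def test_changes_should_add_1_resource(apply_changes):
    output = apply_changes[1].decode(encoding='utf-8').split('\n')
    assert 'Apply complete! Resources: 1 added, 0 changed, 0 destroyed' in output[-2]


# Use Libcloud to retrieve the server from the GCP API and check
# that it is running.
def test_server_is_in_running_state(apply_changes):
    gcp_server = test_utils.get_server(TEST_SERVER_NAME)
    assert gcp_server.state == NodeState.RUNNING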

When you run the tests in this file from your command line, you’ll notice that they take a few minutes because the test session creates the server and deletes it.

$ pytest test_integration.py
========================== test session starts =========================
collected 4 items
 
test_integration.py ....                                          [100%]
 
==================== 4 passed in 171.31s (0:02:51) =====================

The integration tests for the server apply two main practices. First, I wrote tests that follow this sequence:

  • Render configuration, if applicable
  • Deploy changes to infrastructure resources
  • Run tests, accessing the infrastructure provider’s API for comparison
  • Delete infrastructure resources, if applicable

The example implements the sequence using a fixture. You can use it to apply any arbitrary infrastructure configuration and remove it after testing.
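
The same render, deploy, test, and delete sequence can be sketched outside of pytest as a context manager. This is an illustrative sketch, not the listing's implementation: `temporary_infrastructure` and the `render`, `deploy`, and `destroy` stand-ins are hypothetical names that record calls instead of touching a provider.

```python
from contextlib import contextmanager


@contextmanager
def temporary_infrastructure(render, deploy, destroy):
    """Render configuration, deploy it, and always clean up afterward."""
    configuration = render()
    try:
        yield deploy(configuration)
    finally:
        # Runs even when the deployment or the test body raises.
        destroy(configuration)


# Hypothetical stand-ins that record calls instead of calling a provider.
calls = []


def render():
    calls.append('render')
    return {'name': 'hello-world-test'}


def deploy(configuration):
    calls.append('deploy')
    return {'name': configuration['name'], 'status': 'running'}


def destroy(configuration):
    calls.append('destroy')


with temporary_infrastructure(render, deploy, destroy) as server:
    calls.append('test')
    assert server['status'] == 'running'
```

Placing the cleanup in a `finally` block means a failed assertion does not leave test resources behind to accumulate cost.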

Second, I ran module integration tests in a separate module testing environment (such as a test account or project), away from the testing or production environments that support applications. To prevent conflicts with other module tests in the environment, I labeled and named the resources based on the specific module type, version, or commit hash.
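
One way to derive such names is a small helper that combines the module type, version, and commit hash. The sketch below is an assumption for illustration: `build_test_resource_name`, the seven-character hash prefix, and the 63-character limit are not taken from the example code, so check your provider's naming rules before reusing it.

```python
import re

MAX_NAME_LENGTH = 63  # Assumed limit; check your provider's naming rules.


def build_test_resource_name(module, version, commit_hash):
    """Build a lowercase, hyphenated name unique to a module change."""
    raw = f'{module}-{version}-{commit_hash[:7]}-test'
    # Replace anything other than lowercase letters and digits with a hyphen.
    name = re.sub(r'[^a-z0-9]+', '-', raw.lower()).strip('-')
    return name[:MAX_NAME_LENGTH].rstrip('-')


print(build_test_resource_name('server', 'v1.2.0', '9fceb02d0ae598e9'))
# prints "server-v1-2-0-9fceb02-test"
```

Two test runs for different commits of the same module then create differently named resources, so one run's cleanup does not remove the other's server.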

Definition

A module testing environment is an environment that is separate from production and used for testing module changes.

Testing modules in a different environment than a testing or production environment helps isolate failed modules from an active environment with applications. You can also measure and control your infrastructure cost from testing modules. I’ll cover the cost of cloud computing in greater detail in Chapter 12.

6.4.2 Testing configuration for environments

Integration tests for infrastructure modules can create and delete resources in a testing environment, but integration tests for environment configurations cannot. Imagine you need to add an A record to your current domain name configured by a composite or singleton configuration. How do you write some integration tests to check if you added the record correctly?

You encounter two problems. First, you cannot simply create and then destroy DNS records as part of your integration tests because it may affect applications. Second, the A record depends on a server IP address to exist before you can configure the domain.

Instead of creating and destroying the server and A record in a testing environment, you run the integration tests against a persistent testing environment that matches production. In figure 6.11, you update the DNS record in infrastructure as code for the testing environment. Your integration tests check that the DNS in the testing environment matches the expected DNS record. After the tests pass, you can update the DNS record for production.

Figure 6.11. You can run integration tests against a testing environment with long-lived resources to isolate the changes from production and reduce the dependencies you need to create for the test.

Why run the DNS test in a persistent testing environment? First, it can take a long time to create a testing environment. As a high-level resource, DNS depends on many low-level ones. Second, you want an accurate representation of how the change behaves before you update production.

The testing environment captures a subset of dependencies and complexities of the production system so you can check that your configuration works as expected. Keeping similar testing and production environments means that a change in testing provides an accurate perspective of its behavior in production. You want to aim for early detection of problems in the testing environment.
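
The DNS check in this section can be sketched with Python's standard `socket` module. The hostname and IP address below are hypothetical, and the test injects a fake resolver so the sketch runs offline; against a persistent testing environment, you would rely on the default resolver instead.

```python
import socket


def a_record_matches(hostname, expected_ip, resolve=socket.gethostbyname):
    """Return True if the hostname resolves to the expected IPv4 address."""
    try:
        return resolve(hostname) == expected_ip
    except socket.gaierror:
        # The record does not exist or cannot be resolved.
        return False


def fake_resolve(hostname):
    # Hypothetical records standing in for the testing environment's DNS.
    records = {'app.test.example.com': '203.0.113.10'}
    if hostname not in records:
        raise socket.gaierror(f'{hostname} not found')
    return records[hostname]


def test_a_record_points_to_test_server():
    assert a_record_matches('app.test.example.com', '203.0.113.10',
                            resolve=fake_resolve)


def test_missing_record_fails_the_check():
    assert not a_record_matches('missing.test.example.com', '203.0.113.10',
                                resolve=fake_resolve)
```

The test never creates or destroys the record. It only reads the state of the long-lived environment after you apply the change.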

6.4.3 Testing challenges

Without the integration tests, you would not know if a server module or DNS record updates successfully until you manually check it. They expedite the process of verifying that your infrastructure as code works. However, you will encounter a few challenges with integration testing.

You might have difficulty determining which configuration parameters to test. Should you write integration tests to verify every configuration parameter you’ve configured in infrastructure as code matches the live resource? Not necessarily!

Most tools already have acceptance tests that create a resource, update its configuration, and destroy the resource. Acceptance tests certify that the tool can release new code changes. These tests must pass in order for the tool to support changes to infrastructure.

You don’t want to spend additional time or effort writing tests that match the acceptance tests. As a result, your integration tests should cover whether or not multiple resources have the correct configuration and dependencies. If you write custom automation, you will need to write integration tests to create, update, and delete resources.

Another challenge involves deciding whether or not you should create or delete resources during each test or run a persistent testing environment. Figure 6.12 shows a decision tree for whether or not you should create, delete, or use a persistent testing environment for an integration test.

In general, if a configuration or module does not have too many dependencies, you can create, test, and delete it. However, if your configuration or module takes time to create or requires the existence of many other resources, you will need to use a persistent testing environment.

Figure 6.12. Your integration test should create and delete resources based on module or configuration type and dependencies.

Not all modules benefit from a create-and-delete approach in integration testing. For low-level modules, such as networks or DNS, I recommend running integration tests without removing the resources. These modules usually require in-place updates in environments and have a minimal financial cost. I often find it more realistic to test the update instead of creating and deleting the resource.

Resources created by integration tests for mid-level modules, such as workload orchestrators, may be persistent or temporary depending on the size of the module and resource. The larger the module, the more likely it will need to be long-lived. You can run integration tests for high-level modules, such as application deployments or SaaS, and create and delete the resources each time.
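
The guidance above can be condensed into a small helper. This is only a sketch of the reasoning, not a reproduction of figure 6.12: the function name, the level labels, and the thresholds of ten minutes and three dependencies are assumptions you should tune for your own system.

```python
def choose_test_lifecycle(module_level, minutes_to_create, dependency_count,
                          max_minutes=10, max_dependencies=3):
    """Return 'persistent' or 'temporary' for an integration test."""
    if module_level == 'low':
        # Networks or DNS: test in-place updates on long-lived resources.
        return 'persistent'
    if minutes_to_create > max_minutes or dependency_count > max_dependencies:
        # Slow or dependency-heavy resources cost too much to recreate.
        return 'persistent'
    # Small, quick resources can be created and deleted on each run.
    return 'temporary'


print(choose_test_lifecycle('low', 1, 0))    # prints "persistent"
print(choose_test_lifecycle('mid', 30, 1))   # prints "persistent"
print(choose_test_lifecycle('high', 2, 1))   # prints "temporary"
```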

A persistent testing environment does have its limits. Integration tests tend to take a long time to run because it takes time to create or update resources. As a rule, keep modules smaller with fewer resources. This practice reduces the amount of time you need for a module integration test.

Even if you keep configurations and modules small with few resources, integration tests often become the main culprit behind an increasing infrastructure provider bill. A number of tests need long-lived resources like networks, gateways, and more. Weigh the cost of running an integration test and catching problems against the cost of misconfiguration or a broken infrastructure resource.

You may consider using infrastructure mocks to lower the cost of running an integration test (or any test). Some frameworks replicate an infrastructure provider’s APIs for local testing. I do not recommend relying heavily on mocks. Infrastructure providers change APIs frequently and often have complex errors and behaviors, which mocks do not often capture. In chapter 12, I discuss some techniques to manage the cost of testing environments and avoid mocks.

6.5 End-to-end tests

While integration tests dynamically analyze configuration and catch errors during resource creation or update, they do not indicate whether or not an infrastructure resource is usable. Usability requires that you or a team member use the resource as intended.

For example, you might use a module to create an application, called a service, on Google Cloud Platform (GCP) Cloud Run. GCP Cloud Run deploys any service in a container and returns a URL endpoint. Your integration tests pass, indicating that your module correctly creates the service resource and permissions to access the service.

How do you know if someone can access the application URL? Figure 6.13 shows how to check if the service endpoint works. First, you write a test to retrieve the application URL as an output from your infrastructure configuration. Then, you make an HTTP request to the URL. The test takes a few minutes to run, most of it spent creating the service.

Figure 6.13. End-to-end tests verify the end user’s workflow by accessing the webpage at the application’s URL.

You created an end-to-end test, a dynamic analysis test that differs from an integration test. It verifies the end-user functionality of the infrastructure.

Definition

End-to-end tests dynamically analyze infrastructure resources and end-to-end system functionality to verify if they are affected by infrastructure as code changes.

The example end-to-end test verifies the end-to-end workflow of the end user accessing the page. It does not check for the successful configuration of infrastructure.

End-to-end tests become vital for ensuring that your changes don’t break upstream functionality. For example, I might accidentally update a configuration so that only authenticated users can access the GCP Cloud Run service URL. My end-to-end test fails after applying the change, indicating that someone may no longer be able to access the service.

Let’s implement an end-to-end test for the application URL in Python. The test for this example needs to make an API request to the service’s public URL. It uses a pytest fixture to create the GCP Cloud Run service, test the URL for the running page, and delete the service from a testing environment.

Listing 6.5 End-to-end test for GCP Cloud Run service
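from main import generate_json, SERVICE_CONFIGURATION_FILE
import os
import pytest
import requests
import test_utils

TEST_SERVICE_NAME = 'hello-world-test'


# Session-scoped fixture: creates the GCP Cloud Run service once for
# the test session and deletes it afterward.
@pytest.fixture(scope='session')
def apply_changes():
    # Generate the configuration file for the test service.
    generate_json(TEST_SERVICE_NAME)
    assert os.path.exists(SERVICE_CONFIGURATION_FILE)
    # Initialize, then deploy the service from the generated configuration.
    assert test_utils.initialize() == 0
    yield test_utils.apply()
    # Teardown: delete the test service and remove the configuration file.
    assert test_utils.destroy() == 0
    os.remove(SERVICE_CONFIGURATION_FILE)


# Retrieve the service URL from the outputs of the configuration.
@pytest.fixture
def url():
    output, error = test_utils.output('url')
    assert error == b''
    service_url = output.decode(encoding='utf-8').split('\n')[0]
    return service_url


# Make an HTTP request to the service URL and check that the response
# contains the text of the running page.
def test_url_for_service_returns_running_page(apply_changes, url):
    response = requests.get(url)
    assert "It's running!" in response.text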

Note that if you want to run an end-to-end test in production, you do not want to delete the service. You usually run end-to-end tests against existing environments without creating new or test resources. You apply changes to the existing system and run the tests against the active infrastructure resources.

More complex infrastructure systems benefit from end-to-end tests because they become the primary indicator of whether or not a change has affected critical business functionality. As a result, they help test composite or singleton configurations. You do not usually run end-to-end tests on modules unless they have a large number of resources and dependencies.

I write most of my end-to-end tests for network or compute resources. For example, you can write a few tests to check network peering. The tests provision a server on each network and check if the servers can connect.
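
A connectivity check like the one described here can be sketched with the standard `socket` module. The helper name `can_connect` is hypothetical, and the demonstration connects to a local listener rather than real peered networks. In an actual peering test, you would run the check from a server on one network against a server on the other.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Demonstrate against a local listener instead of real peered networks.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(('127.0.0.1', 0))  # Port 0 asks the OS for a free port.
listener.listen(1)
port = listener.getsockname()[1]
print(can_connect('127.0.0.1', port))
listener.close()
```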

Another use case for end-to-end tests involves submitting a job to a workload orchestrator and completing it. This test determines whether or not the workload orchestrator functions properly for application deployment. I once included end-to-end tests that issued HTTP requests with varying payloads to ensure upstream services could call each other without disruption, no matter the payload size or protocol.

Outside of network or compute use cases, end-to-end tests can verify the expected behavior of any system. If you use configuration management with a provisioning tool, your end-to-end tests verify that you can connect to the server and run the expected functionality. For monitoring and alerts, you can run end-to-end tests to simulate the expected system behavior, verify that metrics have been collected, and test the triggering of the alert.

However, end-to-end tests are the most expensive tests to execute in terms of time and resources. Most end-to-end tests need every infrastructure resource available to fully evaluate the system. As a result, you might run end-to-end tests only against production infrastructure and skip them in a testing environment, because it often costs too much money to procure enough resources for the test.

6.6 Other tests

You may encounter other types of tests outside of unit, contract, integration, and end-to-end tests. For example, you want to roll out a configuration change to a production server that reduces memory. However, you don’t know if the memory reduction will affect the overall system.

Figure 6.14 shows that you can check if your change affected the system using system monitoring. Monitoring continuously aggregates metrics on the server’s memory. If you receive an alert that the server’s memory has reached a certain percentage of its capacity, you know that your change may affect the overall system.

Figure 6.14. Continuous tests run at short intervals to verify that a set of metrics do not exceed a threshold.

Monitoring implements continuous testing. It runs “tests” at regular, frequent intervals to check that metrics do not exceed thresholds.

Definition

Continuous tests (such as monitoring) run at regular, frequent intervals to check that the current value matches an expected value.

Continuous testing includes monitoring system metrics and security events (such as the root user logging into a server). These tests offer dynamic analysis on an active infrastructure environment. Most continuous tests take the form of alerts, which notify you of any problems.
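
To make the idea concrete, the sketch below checks a metric at a fixed interval and collects an alert whenever a reading exceeds a threshold. It is a simplified illustration, not a monitoring system: `run_continuous_test`, the injected `read_metric` function, and the sample readings are all hypothetical.

```python
import time


def run_continuous_test(read_metric, threshold, interval_seconds, runs,
                        sleep=time.sleep):
    """Check a metric at a fixed interval and collect alerts."""
    alerts = []
    for _ in range(runs):
        value = read_metric()
        if value > threshold:
            alerts.append(f'ALERT: {value}% exceeds threshold of {threshold}%')
        sleep(interval_seconds)
    return alerts


# Hypothetical memory readings; a real check would query a metrics system.
readings = iter([62, 71, 93])
print(run_continuous_test(lambda: next(readings), threshold=90,
                          interval_seconds=0, runs=3))
# prints ['ALERT: 93% exceeds threshold of 90%']
```

In practice, a monitoring tool owns the schedule and the alert delivery. The sketch only shows that each evaluation is a small test with an expected value.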

You may encounter another type of test called a regression test. For example, you may run a test over a period of time to check if your server configuration conforms to your organization’s expectations. Regression tests run regularly but do not have the frequency of monitoring or other forms of continuous testing. You may choose to run them every few weeks or months to check for out-of-band, manual changes.

Definition

Regression tests run periodically over an extended period of time to check if infrastructure configuration conforms to the expected state or functionality. They can help mitigate configuration drift.
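
A regression test for drift can be as simple as comparing the expected configuration with the values read from the live resource. The sketch below assumes both are available as flat dictionaries. The function name `find_drift` and the sample attributes are hypothetical.

```python
def find_drift(expected, actual):
    """Return {key: (expected, actual)} for every value that differs."""
    keys = expected.keys() | actual.keys()
    return {key: (expected.get(key), actual.get(key))
            for key in sorted(keys)
            if expected.get(key) != actual.get(key)}


# Hypothetical attributes: infrastructure as code versus the live server.
expected = {'machine_type': 'e2-micro', 'zone': 'us-central1-a'}
actual = {'machine_type': 'e2-medium', 'zone': 'us-central1-a'}
print(find_drift(expected, actual))
# prints {'machine_type': ('e2-micro', 'e2-medium')}
```

An empty result means the live resource still matches the configuration. Anything else points to an out-of-band change worth investigating.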

Continuous and regression tests often require special software or systems to run. They ensure that running infrastructure behaves with expected functionality and performance. These tests also set a foundation for automating a system to respond to anomalies.

For example, systems configured with infrastructure as code and continuous tests can use autoscaling to adjust resources based on the metrics such as CPU or memory. These systems can also implement other self-healing mechanisms, such as diverting traffic to an older version of an application upon errors.

6.7 Choosing tests

I explained some of the most common tests in infrastructure, from unit tests to end-to-end tests. However, do you need to write all of them? Where should you spend your time and effort in writing them? Your infrastructure testing strategy will evolve, depending on the complexity and growth of your system. As a result, you will constantly be assessing which tests will help you catch configuration issues before production.

I use a pyramid shape as a guideline for infrastructure testing strategy. In figure 6.15, the widest part of the pyramid indicates you should have more of that type of test, while the narrowest part indicates that you should have fewer. At the top of the pyramid are end-to-end tests, which may cost more time and money because they require active infrastructure systems. At the bottom of the pyramid are unit tests, which run in seconds and do not require entire infrastructure systems.

Figure 6.15. Based on the test pyramid, you should have more unit tests than end-to-end tests because it costs less time, money, and resources to run them.

This guideline, called the test pyramid, provides a framework for different types of tests, their scope, and frequency. I adapted the test pyramid from software testing to infrastructure, modifying it to infrastructure tools and constraints.

Definition

The test pyramid serves as a guideline for your overall testing strategy. As you go up the pyramid, the type of test will cost more time and money.

In reality, your test pyramid may be shaped more like a rectangle or a pear, sometimes with missing levels. You will not and should not write every type of test for every infrastructure configuration. At some point, the tests become redundant and impractical to maintain.

Depending on the system you want to test, it may not be practical to adhere to the test pyramid in its ideal. However, avoid what I jokingly call the “test signpost”. A signpost favors many manual tests and not much of anything else.

6.7.1 Module testing strategy

I alluded to the practice of testing modules before releasing them in Chapter 5. Let’s return to that example, where you updated a database module to PostgreSQL 12. Rather than manually creating the module and testing to see if it works, you add a series of automated tests. They check for the module’s formatting and create a database in an isolated module testing environment.

Figure 6.16 updates the module release workflow with the unit, contract, and integration tests you can add to check that your module works. After the contract tests pass, you run an integration test that sets up the database module on a network and checks if the database runs. After completing the integration test, you delete the test database created by the module and release the module.

Figure 6.16. You can break down the testing stage of your module release workflow to include unit, contract, and integration tests.

A combination of unit, contract, and integration tests adequately represents whether or not a module will work correctly. Unit tests check for module formatting and your team’s standard configurations. You run them first, so you get fast feedback on any violations in formatting or configurations.

Next, you run a few contract tests. In the case of the database module, you check if the network ID input to the database module matches the output of the network ID from the network module. Catching these mistakes will identify problems between dependencies earlier in your deployment process.

Focus on unit or contract testing to enforce proper configuration, correct module logic, and specific inputs and outputs. The testing workflow outlined in Figure 6.16 works best for modules that use the factory, builder, or prototype patterns. These patterns isolate the smallest subset of infrastructure components and provide a flexible set of variables for your teammates to customize.

Depending on the cost of your development environment, you can write a few integration tests to run against temporary infrastructure resources, which you delete at the end of the test. By investing some time and effort into writing tests for modules with many inputs and outputs, you ensure that changes do not affect upstream configuration and that the module can run successfully on its own.

6.7.2 Configuration testing strategy

Infrastructure configurations for active environments use more complex patterns like singleton or composite. A singleton or composite configuration has many infrastructure dependencies and often references other modules. Adding end-to-end tests to your testing workflow can help identify issues between infrastructure and modules.

Imagine you have a singleton configuration with an application server on a network. Figure 6.17 outlines each step after you update the size of the server. After pushing the change to version control, you deploy the change to a testing environment. Your testing workflow begins with unit tests to verify formatting and configuration quickly.

Next, you run integration tests to apply changes and verify that the server still runs and has a new size. You complete your verification by testing the entire system using an end-to-end test. The end-to-end test issues an HTTP GET to the application endpoint. Figure 6.17 repeats the process in production to ensure that the system did not break.

Figure 6.17. Infrastructure as code using the singleton and composite patterns should run unit, integration, and end-to-end tests in a testing environment before deploying the changes to production.

Just because you created or updated a server successfully does not mean the application it hosts can serve requests! With a complex infrastructure system, you need additional tests to verify dependencies or communication between infrastructure. End-to-end tests can help preserve the functionality of the system.

Repeating the same tests between testing and production environments offers quality control. If you have any configuration drift between testing and production environments, your tests may reflect those differences. You can enable or disable specific tests depending on the environment.

6.7.3 Identifying useful tests

The testing strategies for modules and configurations can help guide your initial approach to writing valuable tests. Figure 6.18 summarizes the types of tests you might consider for modules and configurations. Modules rely on unit, contract, and integration tests while configurations rely on unit, integration, and end-to-end tests.

Figure 6.18. Your testing approach will differ depending on whether or not you write a module or environment configuration.

How do you know when to write a test? Imagine your teammate knows that a database password needs to have alphanumeric characters with a 16-character limit. However, you might not know this fact until you update the configuration with a 24-character password, deploy the change, and wait five minutes for the change to fail.
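
A unit test can capture the password rule so that the failure appears in seconds instead of after a five-minute deployment. The following is a minimal sketch, assuming the password is available to the test as a string. The function name and sample passwords are hypothetical.

```python
import re

# The rule from the example: alphanumeric, at most 16 characters.
PASSWORD_PATTERN = re.compile(r'[A-Za-z0-9]{1,16}')


def is_valid_database_password(password):
    """Return True if the password satisfies the database's rule."""
    return PASSWORD_PATTERN.fullmatch(password) is not None


def test_16_character_password_is_valid():
    assert is_valid_database_password('abcdefgh12345678')


def test_24_character_password_is_rejected():
    assert not is_valid_database_password('abcdefgh1234567890abcdef')


def test_special_characters_are_rejected():
    assert not is_valid_database_password('pass-word!')
```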

I consider the practice of updating your tests a matter of turning unknown knowns into known knowns in your system. After all, you use observability to debug unknown unknowns and monitoring to track the known unknowns. In figure 6.19, you convert siloed knowledge (unknown knowns) that someone else knows into tests (known knowns) for team knowledge. New tests often reflect siloed knowledge that the team should know and acknowledge.

Figure 6.19. Infrastructure testing converts siloed knowledge, something someone else might know, into a test to reflect the team’s knowledge.

A good test shares knowledge with the rest of the team. You don’t always need to build a new test. Instead, you might find an existing test that doesn’t check for everything and extend it. Use a test to prevent your team from repeating problems.

Besides adding tests, you’ll remove tests. You might write a test and discover that it fails half the time. It does not provide helpful information or increase your confidence in the system because of its unreliability. Removing the test cleans up your testing suite and helps eliminate false positives from flakiness.

Furthermore, you’ll remove tests because you don’t need them. For example, you might not need contract tests for every module or integration tests for every environment configuration. Always ask yourself if the tests provide value and if they run reliably enough for you to get sufficient information about the system!

The next chapter will show how to add tests to a delivery pipeline for your infrastructure as code. Even if you do not choose to automate the testing workflow, you have an opportunity to examine how changes could potentially affect your infrastructure.

6.8 Exercises and Solutions

6.9 Summary

  • The test pyramid outlines an approach to testing. The higher the test level in the pyramid, the more costly the test.
  • Unit tests verify static parameters in modules or configurations.
  • Contract tests verify that the inputs and outputs of a module match expected values and formats.
  • Integration tests create test resources, verify their configuration and creation, and delete them.
  • End-to-end tests verify that the end user of an infrastructure system can run the expected functionality.
  • Modules using factory, builder, or prototype patterns benefit from unit, contract, and integration tests.
  • Configurations using composite or singleton patterns applied to environments benefit from unit, integration, and end-to-end tests.
  • Other tests include monitoring for continuously testing system metrics, regression tests for out-of-band manual changes, or security tests for misconfigurations.