10 Security: Container isolation

 

This chapter covers

  • All Linux security features used to keep containers isolated from each other
  • Read-only access to the kernel filesystems that processes within a container need to read but must be blocked from writing to
  • Masking of kernel filesystems to hide information about the host system from container processes
  • Linux capabilities limiting the power of root
  • The PID, IPC and network namespaces, which hide most of the operating system from processes within containers
  • The mount namespace, which along with SELinux limits the container processes’ access to only the designated image and volumes
  • The user namespace, which allows you to run root processes inside of a container that are not root outside of the container

In this chapter and chapter 11, I review and demonstrate some additional security considerations when using Podman to run containers. Some of the content was covered in other chapters, but I think it is useful to concentrate on these features from a security perspective.

One of the most frequent problems I see with people running containers is that when the container process is denied some access, the user’s first reaction is to run the container in --privileged mode, which turns off all security separation for your container. Understanding how to deal with the security features discussed in this chapter helps you avoid needing to do this.
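As a rough sketch of the alternative, suppose the denied access turns out to be a single missing Linux capability. The capability name (NET_ADMIN) and the image name below are assumptions chosen only for illustration, not something a particular workload requires. Rather than using --privileged, you can add back just that one capability with the --cap-add option and leave the rest of the container's isolation in place:

```shell
#!/bin/sh
# Sketch only: IMAGE and NEEDED_CAP are illustrative assumptions.
IMAGE="registry.access.redhat.com/ubi8/ubi"
NEEDED_CAP="NET_ADMIN"

# Narrow alternative to --privileged: add back only the one capability.
NARROW_OPTS="--rm --cap-add=$NEEDED_CAP"

# Print the command first so it can be reviewed before anything runs.
echo "podman run $NARROW_OPTS $IMAGE grep CapEff /proc/self/status"

# Run it only where Podman is installed; CapEff is the effective capability mask.
if command -v podman >/dev/null 2>&1; then
    # $NARROW_OPTS is intentionally unquoted so it splits into separate flags.
    podman run $NARROW_OPTS "$IMAGE" grep CapEff /proc/self/status \
        || echo "podman run failed (image pull or permissions?)"
fi
```

Running the same grep in a container started without --cap-add gives you a baseline mask to compare against, which makes it easy to confirm that only the intended capability was added.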

10.1 Read-only Linux kernel pseudo filesystems

10.1.1 Unmasking the masked paths

10.1.2 Masking additional paths

10.2 Linux capabilities

10.2.1 Dropped Linux capabilities

10.2.2 Dropped CAP_SYS_ADMIN

10.2.3 Dropping capabilities

10.2.4 Adding capabilities

10.2.5 No new privileges

10.2.6 Root with no capabilities is still dangerous

10.3 UID isolation: User namespace

10.3.1 Isolating containers using the --userns=auto flag

Summary