8 Troubleshooting Kubernetes

 

This chapter covers

  • Monitoring and viewing logs
  • Determining high CPU or RAM usage
  • Resolving common cluster problems
  • Analyzing network traffic to identify communication problems

Troubleshooting is the biggest topic on the CKA exam (30%), so we’re going to cover it in detail in this chapter. Troubleshooting means fixing problems with applications, control plane components, worker nodes, and the underlying network. When you run applications in Kubernetes, problems will arise with Pods, Services, and Deployments.

This chapter will help you interpret the logs that a container outputs so that you can debug the application and return it to a healthy state. If the problem is not the application, it may be the underlying node, the underlying operating system, or a communication problem on the network. On the exam, you’ll be expected to know the differences between an application failure, a cluster-level problem, and a network problem, and to troubleshoot each one and reach a resolution as quickly as possible.

8.1 Understanding application logs

8.1.1 Container log detail

8.1.2 Troubleshooting from inside the container

8.2 Cluster component failure

8.2.1 Troubleshooting cluster events

8.2.2 Worker node failure

8.2.3 Did you specify the right host or port?

8.2.4 Troubleshooting kubeconfig

8.3 Network troubleshooting

8.3.1 Troubleshooting the config