7 Collecting and storing logs

 

This chapter covers

  • Building the five layers of a modern logging pipeline
  • Collecting logs from systems, applications, infrastructure, and third parties
  • Using a message broker to pass logs from producers to consumers
  • Understanding techniques to analyze logs through task-specific modules
  • Learning how to store logs effectively and implement a retention policy
  • Evaluating tools to access and visualize both raw logs and metrics

You probably already know that you should be collecting logs from all applications and systems, but it’s easy to wonder why, what kind, and exactly how much logging is needed. We’ll spend this chapter discussing what a modern logging pipeline looks like and what logs should be sent to it. Before we get started, allow me to illustrate the purpose of logging through the eyes of a security engineer.

I once worked a security incident where access to a privileged user account had been compromised, leading to secret information being disclosed to attackers. The incident was serious enough that dozens of people were mobilized to investigate the impact of the disclosure. Everyone was running around trying to answer the obvious questions: How did this happen? How much data has been disclosed? How far back does the compromise go? What should we tell our users? And the press? Are we going to be OK?

7.1 Collecting logs from systems and applications

7.1.1 Collecting logs from systems

7.1.2 Collecting application logs

7.1.3 Infrastructure logging

7.1.4 Collecting logs from GitHub

7.2 Streaming log events through message brokers

7.3 Processing events in log consumers

7.4 Storing and archiving logs

7.5 Accessing logs

Summary