7 Performance and scaling

 

This chapter covers

  • Tuning Fluentd to maximize resources using workers
  • Deploying Fluentd with fan-in and fan-out patterns
  • Using deployment patterns for scaling
  • Implementing high-availability deployments
  • Using Fluentd with microservice patterns

In previous chapters, we worked with just a single Fluentd instance. But we live in a world of distribution, virtualization, and containerization, which typically requires more than a single instance. Beyond distribution, we need to support elasticity: scaling up (adding CPUs or memory to a server to support more processes and threads) and scaling out (deploying additional server instances and distributing the workload via load balancing) to meet fluctuating demand, along with the reverse operations of scaling down and in. Enterprises also demand resilience to handle failure and disaster scenarios. To provide good availability, we should at least deploy an active server and a standby server, with both servers using configuration files that are kept synchronized. Configuration synchronization makes it possible to start the standby server at short notice if the first instance fails (active-passive). In more demanding cases, active-active deployments are needed, with servers spread across multiple data centers; this is a very common deployment pattern. A single-server solution is a rarity in the enterprise space.
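As a preview of the kind of configuration the rest of this chapter builds toward, the active-passive arrangement described above can be sketched using Fluentd's forward output plugin, which supports marking a server as a standby. The hostnames and ports here are illustrative, not part of any real deployment:

```
# Forward all events to an aggregator pair: one active, one standby.
# The standby server only receives traffic if the active one fails.
<match **>
  @type forward
  <server>
    name active
    host aggregator-primary.example.com   # illustrative hostname
    port 24224
  </server>
  <server>
    name backup
    host aggregator-standby.example.com   # illustrative hostname
    port 24224
    standby                               # used only when the active server is down
  </server>
</match>
```

Keeping the two aggregators' own configuration files synchronized (for example, via your configuration management tooling) is what makes this failover seamless; we return to this pattern in section 7.2.3.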

7.1 Threading and processes to scale with workers

7.1.1 Seeing workers in action

7.1.2 Worker constraints

7.1.3 Controlling output plugin threads

7.1.4 Memory management optimization

7.2 Scaling and moving workloads

7.2.1 Fan-in/log aggregation and consolidation

7.2.2 Fan-out and workload distribution

7.2.3 High availability

7.2.4 Putting a high-availability comparison into action

7.3 Fluentd scaling in containers vs. native and virtual environments

7.3.1 Kubernetes worker node configuration

7.3.2 Per-cluster configuration

7.3.3 Container as virtualization