concept `percentile` in category `microservice`

appears as: percentiles, percentile

The Tao of Microservices

This is an excerpt from Manning's book The Tao of Microservices.

¹⁴A percentile tells you what percentage of responses came in at or under the given time. For example, a 500 ms response time at the 90th percentile means 90% of responses took 500 ms or less.

to see more go to 2.5. The microservice dependency tree

⁷To calculate a percentile, take all of your data points, sort them in ascending order, and then take the value at the zero-based index (n × p / 100) − 1, where n is the number of values and p is the percentile. For example, the 90th percentile of {11, 22, 33, 44, 55, 66, 77, 88, 99, 111} is 99, because the index is (10 × 90 / 100) − 1 = 8. Intuitively, 90% of values are at or below 99.
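The sort-and-index rule from the footnote can be sketched in a few lines of Python. This is a minimal illustration, not the book's code: the function name `percentile` is hypothetical, and rounding a fractional position up (the nearest-rank convention) is an assumption, since the text only covers the case where n × p / 100 is a whole number.

```python
import math

def percentile(values, p):
    """Return the p-th percentile of values using the sort-and-index rule."""
    if not values:
        raise ValueError("values must be non-empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(values)
    # Rank n * p / 100, rounded up when fractional (assumption: nearest-rank).
    rank = math.ceil(len(ordered) * p / 100)
    # Convert the one-based rank to the zero-based index (n * p / 100) - 1.
    return ordered[max(rank, 1) - 1]

data = [11, 22, 33, 44, 55, 66, 77, 88, 99, 111]
print(percentile(data, 90))  # → 99
```

Statistics libraries often interpolate between neighboring values instead, so their results can differ slightly from this rule on small data sets.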

Percentiles are useful because they align better with business needs. You want most customers to have a good experience. By charting the percentile, rather than the average, you can directly measure this, and you can do so in a way that’s independent of the distribution of the data. This handles the caching scenario (where you had two user experience clusters) and still provides a useful summary statistic.
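To make the contrast with the average concrete, here is a small sketch using made-up numbers for the caching scenario: 80 cached responses at 50 ms and 20 uncached responses at 900 ms. The data and the helper name `percentile` are illustrative assumptions, not measurements from the book.

```python
import math
from statistics import mean

def percentile(values, p):
    """Nearest-rank percentile: value at zero-based index ceil(n * p / 100) - 1."""
    ordered = sorted(values)
    rank = math.ceil(len(ordered) * p / 100)
    return ordered[max(rank, 1) - 1]

# Hypothetical response times in ms: two clusters, cached and uncached.
cached = [50] * 80
uncached = [900] * 20
response_times = cached + uncached

average = mean(response_times)
p90 = percentile(response_times, 90)

print(average)  # → 220
print(p90)      # → 900
```

No request in this data set actually took 220 ms, so the average describes neither cluster. The 90th percentile reports a value that real users experienced and shows that one in ten requests, or more, waited at least that long.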

to see more go to 6.1.3. Using percentiles

Let’s consider a failure scenario in the context of a monolith and see how percentiles can help. Suppose you have a system with tens of servers at most. One of these servers is experiencing problems with a specific API endpoint. In your daily review of the system metrics, the response-time chart of this API endpoint for the struggling server looks like figure 6.6.

Figure 6.6. Time series chart of average and 90th percentile response times

This chart shows response time over time. For each time unit, the chart calculates the average and the 90th percentile. To help you understand how the underlying data is distributed, each individual response time is also shown as a gray dot (this isn’t something normally shown by analytics solutions). By comparing historical performance to current performance, you can see that there was a change for the worse. By using percentiles, you avoid missing the problem, because the average alone doesn’t show it unambiguously. This approach works when you’re reviewing a small number of servers, but it clearly won’t scale to microservices.
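The per-time-unit calculation behind such a chart might look like the following sketch. The `summarize` function, the 60-second bucket size, and the sample data are all assumptions made for illustration; a real analytics tool would read timestamps and durations from its own metrics store.

```python
import math
from collections import defaultdict
from statistics import mean

def summarize(samples, bucket_seconds=60):
    """Group (timestamp, response_ms) pairs into time buckets and
    return (bucket_start, average, 90th percentile) for each bucket."""
    buckets = defaultdict(list)
    for timestamp, response_ms in samples:
        bucket_start = timestamp // bucket_seconds * bucket_seconds
        buckets[bucket_start].append(response_ms)
    rows = []
    for bucket_start in sorted(buckets):
        ordered = sorted(buckets[bucket_start])
        # Nearest-rank 90th percentile (assumption: round the rank up).
        rank = math.ceil(len(ordered) * 90 / 100)
        rows.append((bucket_start, mean(ordered), ordered[rank - 1]))
    return rows

# Hypothetical data: a healthy minute, then a minute where 2 of 10 calls are slow.
healthy = [(t, 100) for t in range(0, 10)]
degraded = [(60 + t, 100) for t in range(0, 8)] + [(68, 1500), (69, 1500)]

for row in summarize(healthy + degraded):
    print(row)
# → (0, 100, 100)
# → (60, 380, 1500)
```

In this made-up data the average rises from 100 ms to 380 ms, which could be mistaken for noise, while the 90th percentile jumps from 100 ms to 1500 ms and makes the degradation hard to miss.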

to see more go to 6.1.3. Using percentiles
