8 Approximate quantiles on data streams

 

This chapter covers:

  • Reviewing the concept of exact quantiles and understanding constraints imposed by streaming data context
  • Understanding different types of errors of approximate quantiles
  • Learning about T-digest and Q-digest algorithm and how to apply them on a data stream
  • Comparison of T-digest and Q-digest on a realistic web-site-time-spent data

8.1 Exact quantiles

8.2 Approximate Quantiles

8.2.1 Additive error

8.2.2 Relative error

8.2.3   Relative error in data domain

8.3 T-digest - how it works

8.3.1  Digest

8.3.2   Scale functions

8.3.3   Merging T-Digests

8.3.4   Space bounds for T-digest

8.4 Q-Digest

8.4.1  Constructing a Q-digest from scratch

8.4.2   Merging Q-digests

8.4.3   Error and space considerations in Q-digest

8.4.4   Quantile queries with Q-digest

8.5 Simulation results

8.6 Summary