8 Approximate quantiles on data streams

 

This chapter covers:

  • Reviewing the concept of exact quantiles and understanding constraints imposed by streaming data context
  • Understanding different types of errors of approximate quantiles
  • Learning about T-digest and Q-digest algorithm and how it is applied to a data stream
  • Comparison of T-digest and Q-digest on a realistic web-site-time-spent data

8.1       Exact quantiles

8.2       Approximate Quantiles

8.2.1   Additive error

8.2.2   Relative error

8.2.3   Relative error in data domain

8.3       T-digest - how it works

8.3.1    Digest

8.3.2   Scale functions

8.3.3   Merging T-Digests

8.3.4   Space bounds for T-digest

8.3.5   Relative error 

8.4       Q-Digest

8.4.1    Constructing a Q-digest from scratch

8.4.2   Merging Q-digests

8.4.3   Error and space considerations in Q-digest

8.4.4   Quantile queries with Q-digest

8.5       Simulation results

8.6       Summary