3 Approximate Membership and Bloom Filter

 

This chapter covers:

  • Learning what Bloom filters are, why and when they are useful
  • Understanding how Bloom filters work
  • Configuring a Bloom filter in a practical setting
  • Exploring the interplay between Bloom filter parameters
  • Learning about quotient filter as a Bloom filter replacement
  • Understanding how quotient filter works, and its comparison to the Bloom filter

Bloom filters seem to be all the rage these days. Most self-respecting industry blogs have articles fleshing out how Bloom filters enhance the performance in their infrastructure, and there are dozens of Bloom filter implementations floating around in various programming languages, each touting its own benefits. Bloom filters are also interesting to computer science researchers, who have, in past decade, designed many modifications and alternatives to the basic data structure, enhancing its various aspects. A skeptic and a curmudgeon in you might ask himself: What’s all the hype?

The large part of the reason behind Bloom filter popularity is that they have that combination of being a fairly simple data structure to design and implement, yet very useful in many contexts. They were invented in 1970s by Burton Bloom[28],[29] but they only really “bloomed” in the last few decades with the onslaught of large amount of data in various domains, and the need to tame and compress such huge datasets.

3.1   How It Works

3.1.1   Insert

3.1.2   Lookup

3.2   Use Cases

3.2.1   Bloom Filters in Networks: Squid

3.2.2   Bitcoin mobile app

3.3   Configuring a Bloom filter for your application

3.3.1   Examples

3.4   A bit of theory

3.4.1   Can we do better?

3.5   Further reading: Bloom filter adaptations and alternatives

3.6   Quotient filter

3.6.1   Quotienting

3.6.2   Resizing

3.7   Summary

sitemap