3 Approximate Membership: Bloom filter and Quotient filter
This chapter covers
- Learning what Bloom filters are, why and when they are useful
- Understanding how Bloom filters work
- Configuring a Bloom filter in a practical setting
- Exploring the interplay between Bloom filter parameters
- Learning about quotient filter as a Bloom filter replacement
- Understanding how quotienting works, as well as merging and resizing of a quotient filter
- Comparing the performance of Bloom filter and quotient filter
Bloom filters have become a standard in systems that process large datasets. Their widespread use, especially in networks and distributed databases, comes from the effectiveness they exhibit in situations where we need a hash table functionality, but do not have the luxury of space. They were invented in 1970s by Burton Bloom[1],[2] but they only really “bloomed” in the last few decades due to an increasing need to tame and compress big datasets. Bloom filters have also piqued the interest of the computer science research community who developed many variants on top of the basic data structure, in order to address some of its shortcomings and adapt it to different contexts.