5 Disjoint sets: Sub-linear time processing

 

This chapter covers

  • Solving the problem of keeping a set partitioned into disjoint sets and merging partitions dynamically
  • Describing an API for a data structure for disjoint sets
  • Providing a simple linear-time solution for all methods
  • Improving the running time by using the right underlying data structure
  • Adding easy-to-implement heuristics to get quasi-constant running time
  • Recognizing use cases where the best solution is needed for performance

In this chapter, we introduce a problem that seems quite trivial, so trivial that many developers wouldn’t consider it worth a performance analysis and would simply implement the obvious solution. Nevertheless, if the expression “wolf in sheep’s clothing” were applied to data structures, it would be the best heading for this chapter.
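To make the goals listed above more concrete, here is a minimal sketch of what a disjoint-set structure with the two heuristics might look like. It is not the chapter's own implementation: the class name `DisjointSet`, the method names `add`, `find_partition`, `merge`, and `are_disjoint`, and the dictionary-based storage are all assumptions chosen for illustration, and union by rank is used here as one common form of the balancing heuristic.

```python
class DisjointSet:
    """Illustrative sketch only; all names here are hypothetical."""

    def __init__(self, elements=()):
        self.parent = {}  # element -> its parent in the tree (roots point to themselves)
        self.rank = {}    # root -> upper bound on the height of its tree
        for elem in elements:
            self.add(elem)

    def add(self, elem):
        """Add elem as its own singleton partition; return False if already present."""
        if elem in self.parent:
            return False
        self.parent[elem] = elem
        self.rank[elem] = 0
        return True

    def find_partition(self, elem):
        """Return the root that identifies elem's partition."""
        if elem not in self.parent:
            raise KeyError(elem)
        # First pass: walk up to the root.
        root = elem
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass (path compression): point every node on the path at the root.
        while self.parent[elem] != root:
            next_elem = self.parent[elem]
            self.parent[elem] = root
            elem = next_elem
        return root

    def merge(self, a, b):
        """Merge the partitions of a and b; return False if they were already one."""
        root_a = self.find_partition(a)
        root_b = self.find_partition(b)
        if root_a == root_b:
            return False
        # Balancing (union by rank): attach the shallower tree under the deeper one.
        if self.rank[root_a] < self.rank[root_b]:
            root_a, root_b = root_b, root_a
        self.parent[root_b] = root_a
        if self.rank[root_a] == self.rank[root_b]:
            self.rank[root_a] += 1
        return True

    def are_disjoint(self, a, b):
        """Return True if a and b belong to different partitions."""
        return self.find_partition(a) != self.find_partition(b)


# Usage: start with six singleton partitions and merge a few of them.
ds = DisjointSet(range(6))
ds.merge(0, 1)
ds.merge(1, 2)
ds.merge(4, 5)
```

After these calls, elements 0, 1, and 2 share one partition, 4 and 5 share another, and 3 remains on its own. The two passes in `find_partition` keep the trees shallow over time, which is what allows the quasi-constant amortized running time mentioned in the list above.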

5.1 The distinct subsets problem

5.2 Reasoning on solutions

5.3 Describing the data structure API: Disjoint set

5.4 Naïve solution

5.4.1 Implementing naïve solution

5.5 Using a tree-like structure

5.5.1 From list to trees

5.5.2 Implementing the tree version

5.6 Heuristics to improve the running time

5.6.1 Path compression

5.6.2 Implementing balancing and path compression

5.7 Applications

5.7.1 Graphs: Connected components

5.7.2 Graphs: Kruskal’s algorithm for minimum spanning tree

5.7.3 Clustering

5.7.4 Unification

Summary
