chapter thirteen

13 Sets

 

A set is a data structure that can store items without a particoular ordering. Sets are widely used in computer science and have a strong connection with mathematical set theory. Clojure offers two types of sets with their corresponding initializer functions:

  • hash-set is the most used, offering fast lookup, a special syntax literal #{} and a transient version. hash-set is implemented on top clojure.lang.PersistentHashSet which in turn is a thin layer on top of clojure.lang.PersistentHashMap, the same of hash-map. Both hash-set and hash-map are instances of Hash Array Mapped Trie [212]. hash-set offers near constant time lookup (at O(log32N)), addition and removal of items.
  • sorted-set also guarantees uniqueness of items but additionally maintains ordering based on a comparator. Their Implementation is based on Red Black trees (see subseq call-out section) and offers a well balanced logarithmic access (at O(log2N)).

Both set types can be used as functions (especially as predicates) to verify the presence of items with a concise syntax:

((sorted-set 5 3 1) 1) ; #1
;; 1

(some #{1 2 3} [0 4 6 8 1]) ; #2
;; 1

The clojure.core namespace contains some basic functions to create a set (hash-set, set, sorted-set) or remove elements from a set (disj). conj and into also works on sets (along with many other collection types). Functions like union, intersection or difference are available from the clojure.set namespace.

13.1  hash-set

13.2  set

13.3  sorted-set and sorted-set-by

13.4  disj

13.5  union, difference and intersection

13.6  subset? and superset?

13.7  select, index, rename, join and project