13 Sets

 

A set is a data structure that can store unique items without a particular ordering. Sets are widely used in computer science and have a strong connection with mathematical set theory. Clojure offers two types of sets with their corresponding initializer functions:

  • hash-set is the most used, offering fast lookup, a special syntax literal #{} and a transient version. hash-set is implemented on top clojure.lang.PersistentHashSet which in turn is a thin layer on top of clojure.lang.PersistentHashMap, the same of hash-map. Both hash-set and hash-map are instances of Hash Array Mapped Trie [213]. hash-set offers near constant time lookup (at O(log32N)), addition and removal of items.
  • sorted-set also guarantees uniqueness of items but additionally maintains ordering based on a comparator. Their implementation is based on Red Black trees (see subseq call-out section) and offers a well balanced logarithmic access (at O(log2N)).

Both set types can be used as functions (especially as predicates) to verify the presence of items with a concise syntax:

((sorted-set 5 3 1) 1) ; #1
;; 1

(some #{1 2 3} [0 4 6 8 1]) ; #2
;; 1

The clojure.core namespace contains some basic functions to create a set (hash-set, Set, sorted-set) or remove elements from a set (disj). conj and into also works on sets (along with many other collection types). Functions like union, intersection or difference are available from the clojure.set namespace.

13.1 hash-set

13.2 set

13.3 sorted-set and sorted-set-by

13.4 disj

13.5 union, difference and intersection

13.6 subset? and superset?