6 Trie, radix trie: Efficient string search

 

This chapter covers

  • Understanding why working with strings is different
  • Introducing trie1 for efficient string search and indexing
  • Introducing radix tree2 as a memory-efficient evolution of trie
  • Using prefix trees to solve string-related problems
  • Leveraging tries to implement an efficient spell-checker

1. Also known as prefix tree.

2. Also known as radix trie or compact prefix tree.

How many times have you sent a text or an email or tweeted in a hurry, only to realize a second later that you made a typo? For me, it was too many times! Lately, however, we’ve had a precious ally on our side in email clients and browsers in general: spell-checkers! If you’d like to know more about how they work and how to implement them efficiently, this chapter is the right place to start.

In chapter 3 we described balanced trees, which offer the best compromise when it comes to containers and are ideal for efficiently storing dynamically changing data on which we need to perform frequent searches. In appendix C we discussed and compared the options we have for containers providing fast lookup, fast insertion, or fast removal, and trees offer the best tradeoff between all the operations.

Balanced trees, in particular, guarantee logarithmic running time in the worst-case for all the main operations. In the general case, when we don’t know anything about the data we need to store and (later) search, this is really the best we can hope.

6.1 Spell-check

6.1.1 A prncess, a Damon, and an elf walkz into a bar

6.1.2 Compression is the key

6.1.3 Description and API

6.2 Trie

6.2.1 Why is it better again?

6.2.2 Search

6.2.3 Insert

6.2.4 Remove

6.2.5 Longest prefix

6.2.6 Keys matching a prefix

6.2.7 When should we use tries?

6.3 Radix tries

6.3.1 Nodes and edges

sitemap