11 External memory sorting

 

This chapter covers

  • Understanding the importance of efficient sorting on disk
  • Revising the two most classical in-RAM sorting algorithms: merge-sort and quick-sort
  • Learning how external memory merge-sort works
  • Understanding how external memory quick-sort works
  • Understanding the relationship between searching and sorting in internal versus external memory

In the previous chapter, we learned about different ways to design indices in databases. Indices embody the fundamental problem of searching in computer science. Another fundamental problem that crops up in databases—and pretty much everywhere else—is sorting. Just think of how many times you’ve used the function sort() in your code to order a set of data.

11.1 Sorting use cases

11.1.1 Robot motion planning

11.1.2 Cancer genomics

11.2 Challenges of sorting in external memory: An example

11.2.1 Two-way merge-sort in external memory

11.3 External memory merge-sort (M/B-way merge-sort)

11.3.1 Searching and sorting in RAM vs. external memory

11.4 What about external quick-sort?

11.4.1 External memory two-way quick-sort

11.4.2 Toward external memory multiway quick-sort

11.4.3 Finding enough pivots

11.4.4 Finding good enough pivots

11.4.5 Putting it all back together

11.5 Math bit: Why is external memory merge-sort optimal?

11.6 Wrapping up