front matter
A few years ago, a Python-based pipeline that my team was working on suddenly ground to a halt. A process just kept using CPU and was not finalizing. This function was critical to the company and we needed to solve the problem sooner rather than later. We looked at the algorithm and it seemed OK—in fact, it was quite a simple implementation. After many hours with several engineers looking at the problem, we found that it all boiled down to searching on a list—a very big list. The problem was trivially solved after converting the list into a set. We ended up with a much smaller data structure with search times in milliseconds, not hours.