Preface
It was 2021, and I had a week off before starting a new job at Discord. They told me I’d be working with the distributed database Apache Cassandra to start, but they were in the midst of switching to ScyllaDB—a more performant rewrite of Cassandra. That week, I went hunting for resources to learn about ScyllaDB, but resources outside of the official docs were few and far between. I ended up mostly studying Cassandra and pretending that every time I saw the word Cassandra, it actually said ScyllaDB. This approach wasn’t the worst option, but it left some definite gaps in my knowledge that I had to work to fill in later.
Because we were running both databases together when I started, I was able to compare their behaviors. I was immediately a big fan of how they provide scalability and fault tolerance by distributing their data. Coming from a relational database background, I’d seen how a single database node going offline due to a cloud-provider problem could wreck an application’s availability. ScyllaDB’s and Cassandra’s more gradual degradation paradigm brings immediate benefits. Where the two differ is in their comparative performance. The Cassandra database felt like it was always alerting, paging someone to fix a failure or mitigate an overwhelmed cluster. But the ScyllaDB databases were quiet; they rarely paged, and they exhibited better performance. We finished the ScyllaDB migration a few months later, and the barrage of Cassandra alerts ceased.