chapter seven

7 Files & Storage

 

This chapter covers:

  • How data is represented on physical storage devices
  • Writing your own data structures to your preferred file format
  • Building a tool to read from files and inspect their contents
  • Creating a working key-value store that’s immune to corruption

Storing data permanently on digital media is trickier than it looks. This chapter takes you through some of that detail. Transferring information held by ephemeral electrical charges in RAM to (semi-)permanent storage media, and then retrieving it again later, takes several layers of software indirection.

The chapter introduces some new concepts for Rust developers, such as how to structure projects into library crates. This is needed because one of the projects is ambitious. By the end of the chapter, you will have built a working key-value store that’s guaranteed to be durable to hardware failure at any stage.

During the chapter, we’ll work through a small number of side quests. For example, we implement parity bit checking and explore what it means "to hash" a value. To start with, however, let’s see if we can create patterns from the raw byte sequence within files.
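As a small preview of the parity bit side quest, here is a minimal sketch of the idea, not the chapter's own implementation. The function names `parity_bit` and `is_intact` are hypothetical and chosen only for illustration. The sketch assumes the common convention in which the parity bit is 1 when the input contains an odd number of set bits and 0 otherwise.

```rust
/// Computes a parity bit for `bytes`.
/// Returns 1 if the total number of 1 bits is odd, 0 if it is even.
/// (Hypothetical helper; the convention used is an assumption.)
fn parity_bit(bytes: &[u8]) -> u8 {
    let mut n_ones: u32 = 0;
    for byte in bytes {
        // count_ones() returns the number of set bits in a single byte
        n_ones += byte.count_ones();
    }
    (n_ones % 2 == 1) as u8
}

/// Checks `bytes` against a previously stored parity bit.
/// A mismatch means an odd number of bits changed since it was computed.
fn is_intact(bytes: &[u8], stored_parity: u8) -> bool {
    parity_bit(bytes) == stored_parity
}
```

For the input `b"abc"`, the three bytes contain 3, 3, and 4 set bits respectively, so `parity_bit` returns 0. A single parity bit can only detect an odd number of flipped bits; two flips cancel each other out, which is one reason stronger checksums are often preferred.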

7.1  What is a file format?

7.2  Creating your own file formats for data storage with serde

7.2.1  Writing data to disk with serde & the bincode format

7.3  Implementing a hexdump clone

7.4  File operations in Rust

7.4.1  Opening a file in Rust and controlling its file mode

7.4.2  Interacting with the file system in a type-safe manner with std::path::Path

7.5  Implementing a key-value store with a log-structured, append-only storage architecture

7.5.1  The key-value model

7.5.2  Introducing actionkv v0.1: an in-memory key-value store with a command line interface

7.6  actionkv v0.1 front-end code

7.6.1  Tailoring what is compiled with conditional compilation

7.7  Understanding the core of actionkv: the libactionkv crate

7.7.1  Initializing the ActionKV struct

7.7.2  Processing an individual record

7.7.3  Writing multi-byte binary data to disk in a guaranteed byte order

7.7.4  Validating I/O errors with checksums