2 Hashing
This chapter covers
- Defining hash functions
- Introducing security archetypes
- Verifying data integrity with hashing
- Choosing a cryptographic hash function
- Cryptographic hashing with the hashlib module
In this chapter, you’ll learn to use hash functions to ensure data integrity, a fundamental building block of secure system design. You’ll also learn how to distinguish safe and unsafe hash functions. Along the way, I’ll introduce you to Alice, Bob, and a few other archetypal characters. I use these characters to illustrate security concepts throughout the book. Finally, you’ll learn how to hash data with the hashlib module.
2.1 What is a hash function?
Every hash function has an input and an output. The input to a hash function is called a message. A message can be any form of data. The Gettysburg Address, an image of a cat, and a Python package are examples of potential messages. The output of a hash function is a very large number. This number goes by many names: hash value, hash, hash code, digest, and message digest. In this book, I use the term hash value. Hash values are typically represented as alphanumeric strings. A hash function maps a set of messages to a set of hash values. Figure 2.1 illustrates the relationship between a message, a hash function, and a hash value.
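To make the message-to-hash-value relationship concrete, here is a minimal sketch using the hashlib module. The choice of SHA-256 and the sample message are assumptions made purely for illustration; this section has not yet recommended any particular hash function.

```python
import hashlib

# The message: any sequence of bytes. This sample text is an
# illustrative assumption, not an example taken from the chapter.
message = b"Four score and seven years ago"

# SHA-256 is assumed here only to have a concrete hash function to call.
hash_function = hashlib.sha256(message)

# The hash value, represented as an alphanumeric (hexadecimal) string.
hash_value_hex = hash_function.hexdigest()

# The same hash value, viewed as the very large number it encodes.
hash_value_int = int(hash_value_hex, 16)

print(hash_value_hex)
print(hash_value_int)
```

Both printed lines describe the same hash value: the first is the string form you will usually see, and the second is the underlying number. Hashing the same message again yields the same hash value.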