chapter two

2 Hashing

This chapter covers

Defining hash functions
Introducing security archetypes
Verifying data integrity with hashing
Choosing a cryptographic hash function
Cryptographic hashing with the hashlib module

In this chapter, you’ll learn to use hash functions to ensure data integrity, a fundamental building block of secure system design. You’ll also learn how to distinguish safe hash functions from unsafe ones. Along the way, I’ll introduce you to Alice, Bob, and a few other archetypal characters. I use these characters to illustrate security concepts throughout the book. Finally, you’ll learn how to hash data with the hashlib module.

2.1 What is a hash function?

Every hash function has an input and an output. The input to a hash function is called a message. A message can be any form of data. The Gettysburg Address, an image of a cat, and a Python package are examples of potential messages. The output of a hash function is a very large number. This number goes by many names: hash value, hash, hash code, digest, and message digest. In this book, I use the term hash value. Hash values are typically represented as alphanumeric strings. A hash function maps a set of messages to a set of hash values. Figure 2.1 illustrates the relationship between a message, a hash function, and a hash value.

Figure 2.1 A hash function maps an input known as a message to an output known as a hash value
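The relationship between a message, a hash function, and a hash value can be sketched in a few lines with the hashlib module. This is a minimal illustration rather than a recommendation: the choice of SHA-256 and the sample message are assumptions made for the example, and the variable names are hypothetical.

```python
import hashlib

# Any bytes can serve as a message; this one is the opening of the Gettysburg Address.
message = b"Four score and seven years ago"

# SHA-256 is an illustrative choice, not one prescribed by this section.
hash_object = hashlib.sha256(message)

# The hash value represented as an alphanumeric (hexadecimal) string.
hash_value = hash_object.hexdigest()

# The same hash value viewed as the very large number it encodes.
hash_number = int(hash_value, 16)

print(hash_value)
print(hash_number)

# Hashing the same message again produces the same hash value.
print(hashlib.sha256(message).hexdigest() == hash_value)
```

With SHA-256, the string form is always 64 hexadecimal characters, regardless of how long the message is.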
