Hashing is not encryption – redux

Hashing and encryption are two different things, even if they're built upon the same cryptographic foundation.

Earlier this month, we touched briefly on hashing vs encryption in an attempt to disambiguate the terms. This worked well for many people, but there were a few folks who still missed the point.

I want to take time to address some comments and questions that article brought up. My hope is that this helps further explain how these two concepts are related yet completely distinct.

A cryptographic hash function H can be used for encryption

Yes, but also no.

A cryptographic hash function can be used along with a given key to generate a cryptographically secure, pseudorandom string of bytes. In plain English: a hash can turn a key into something sufficiently random that it looks like garbage. This “garbage” can then be mixed together with a string of sensitive data to produce an encrypted message.

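Since the hash will always turn the same key (and the same seed, chosen at random once and then kept fixed for that message) into the same “garbage” each time, we can easily extract the original message from its corresponding ciphertext. This is fundamentally how a stream cipher works: a keystream derived from a key is combined with the plaintext, and regenerating the same keystream undoes the operation.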
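To make that concrete, here is a minimal sketch in Python of a keystream built from a hash. It assumes HMAC-SHA-256 run over a counter as the way to stretch the key, and the function names (keystream, xor_with_keystream) are made up for this illustration. It is a toy to show the idea, not a vetted cipher: there is no authentication, and reusing a key and nonce pair would leak data.

```python
import hashlib
import hmac
import os


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Derive `length` pseudorandom bytes from a key and nonce.

    Assumption for this sketch: HMAC-SHA-256 over (nonce || counter).
    """
    out = bytearray()
    counter = 0
    while len(out) < length:
        block = hmac.new(
            key, nonce + counter.to_bytes(8, "big"), hashlib.sha256
        ).digest()
        out.extend(block)
        counter += 1
    return bytes(out[:length])


def xor_with_keystream(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR data with the keystream. Applying it twice restores the input."""
    stream = keystream(key, nonce, len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


key = os.urandom(32)    # secret key
nonce = os.urandom(16)  # the random seed, stored alongside the ciphertext
message = b"attack at dawn"

ciphertext = xor_with_keystream(key, nonce, message)
recovered = xor_with_keystream(key, nonce, ciphertext)
```

Notice that the hash only ever produces the keystream. The XOR step and the key and nonce handling around it are what turn that output into an encryption scheme.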

But the hash function itself is not used for encryption. It’s used along with other functionality to encrypt data. This is a bit of a pedantic point, but it’s important. While a hash function is a critical part of an encryption operation, a hash function does not itself encrypt anything.

Hashing is also not secure hashing; cryptographic hash functions are a small subset of all hash functions.

This is a very valid critique of my previous post. In general, a hash function is any operation that converts data from one domain into another, usually smaller, domain in a deterministic way. There are many hash functions that we use in day-to-day computing, but which are not secure.

MD5 is a hashing algorithm that was once used for security, but practical collision attacks mean it is no longer cryptographically secure. Please don’t use it for anything important. CRC32 is a checksum algorithm that converts any arbitrarily long value to a 32-bit digest. The Luhn algorithm is used to generate a single checksum digit for credit card validation.
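For a sense of how simple these non-cryptographic functions are, here is a short Python sketch. CRC32 comes from the standard library's zlib module; luhn_check_digit is a helper written for this illustration, and the sample inputs are common test values, not real card numbers.

```python
import zlib


def luhn_check_digit(payload: str) -> int:
    """Compute the Luhn check digit for a string of decimal digits."""
    total = 0
    # Walk the payload right to left, doubling every second digit,
    # starting with the rightmost one.
    for i, ch in enumerate(reversed(payload)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9  # same as adding the two digits of the product
        total += d
    return (10 - total % 10) % 10


checksum = zlib.crc32(b"123456789")     # 0xCBF43926, a 32-bit value
digit = luhn_check_digit("7992739871")  # 3
```

Both are deterministic and both shrink their input, so both are hash functions in the general sense. Neither is designed to resist someone deliberately crafting an input that matches a given output.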

When I talk about hashing and encryption together, I am explicitly referring to cryptographically secure hash algorithms like SHA-2 and Argon2. When I criticize engineers who mistake hashing for encryption, I am critiquing those who misuse a one-way hashing algorithm as if it’s an encryption scheme.

Of course you can un-hash data. With a rainbow table.

A rainbow table is a precomputed lookup structure that maps hash outputs back to the plaintext values that produce them. Given a specific hashing algorithm, an attacker can build one by brute force: hash every candidate input ahead of time and store the results.

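Using a lookup table to map from a hashed value to its original value is not decryption. It’s not un-hashing the data. The table can only return inputs the attacker already guessed and hashed; nothing is computed backwards from the digest. Confuse these concepts at your own peril.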
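A small Python sketch shows the difference. It uses a plain dictionary, which is simpler than a real rainbow table (those use hash chains to save space), and the candidate list and helper names are invented for this example.

```python
import hashlib
from typing import Optional


def sha256_hex(value: str) -> str:
    """Return the SHA-256 digest of a string as hex."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


# The attacker's guesses, hashed ahead of time.
candidates = ["password", "letmein", "hunter2", "qwerty"]
lookup = {sha256_hex(c): c for c in candidates}


def reverse_lookup(digest: str) -> Optional[str]:
    """Return the guessed input that produced this digest, if any."""
    return lookup.get(digest)


found = reverse_lookup(sha256_hex("hunter2"))  # "hunter2"
missed = reverse_lookup(sha256_hex("correct horse battery staple"))  # None
```

The lookup succeeds only for values that were guessed in advance. For anything else the digest reveals nothing, which is exactly what separates this from decrypting with a key.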

Referring to hashing as encryption is like calling a fingerprint a lock.

Yes. Exactly my point.