# Cryptographic Hash Functions Verify Information

Cryptographic hash functions are one-way functions designed to take arbitrary data as input (e.g., a number, a short message, an image, or the collected works of Shakespeare) and generate fixed-length output (e.g., a 128-bit or 256-bit number). The output is called a *hash* or *hash value.* Hash values can act like a fingerprint—a unique identifier—for files or texts. They're designed in a way that makes it extremely unlikely that any two non-identical inputs would output the same hash value (when this does happen, it is called a *collision).* In particular, even a small change to the input data, such as changing one letter in the entire collected works of Shakespeare, would result in a new hash value that is completely unrelated to the original hash.

A commonly used cryptographic hash function is MD5 (message digest algorithm, iteration five), which takes any data as input and outputs a 128bit hash value, like this:

As you can see, changing just one character in *mysecretpasswordisCATS* results in a completely different hash output:

This property enables hash functions to be used to verify that certain information is correct or to match what someone else claims is correct, without needing to scrutinize the actual information. For example, MD5 is used to check whether a downloaded file is safe to use and is free of errors that might have occurred during data transmission. If the MD5 hash value of the downloaded data matches the hash value provided by a reputable source, you can be certain that data does not contain any hidden viruses and was not corrupted during the file transfer. The slightest alteration to the file would cause a noticeable change to the hash output. A hash is like a tamperproof seal: If it's broken, don't buy the product.

Another, more exciting, use of cryptographic hash functions is proving that you know a secret password without giving it away. Imagine you are a spy behind enemy lines. After many days of traveling under the cloak of darkness, you finally reach a guarded warehouse to meet with fellow agents. A guard at the front door asks you for the secret code, but you aren't sure whether the guard is on your side. You need to prove that you know the code without risking the mission by letting your secret code fall into the wrong hands! What do you do? You give him a hash of the secret code. If he knows the code, he can calculate the hash and verify that you also know it. If he doesn't and isn't supposed to know the code, you haven't revealed it.

This dramatic example describes current standard procedure whenever you create a new account with a username and password for a website. The password is never stored on the website's servers; instead, the hash of the password is stored, and the server checks whether the hash of what you typed matches what is on record. As a result, if the server is ever stolen or hacked into, no passwords will be revealed.

# Public Key Cryptography

The invention of public key cryptography in the 1970s was a significant breakthrough, allowing for much of the technology we take for granted today. Until then, all methods of encryption required the sender and receiver to know the same secret encryption key to encrypt and decrypt the message (also known as *symmetric key* cryptography). However, this method presented a problem because it assumed that, at some earlier time, the sender and receiver had a safe way of communicating to decide on an encryption key without the fear of anyone eavesdropping.

In public key cryptography (also known as *asymmetric key* cryptography), two different keys are created: a public key that is shared to *encrypt* the message and a private key that is confidential to *decrypt* the message (yes, the same private key that is used to spend bitcoins). With asymmetric key cryptography, communicating securely with anyone using an unsafe channel, like the radio or Internet, is easy: You share your public key with others who want to communicate with you, and then anyone can send you encrypted messages that only you can read using your private key. Because the public key cannot be used to decrypt messages, no danger occurs if it falls into the wrong hands. If others want you to send *them* encrypted messages, they give you *their* public key, and so on.

How does asymmetric key cryptography work? The original method, called *RSA encryption* after its developers—Ron Rivest, Adi Shamir, and Leonard Adleman, is based on integer factorization.^{[1]} Let's imagine that

Crowley wants to communicate with others using the RSA method. Crowley first needs to create a public and a private key (see Table 7-1). He can do this at any time before he starts his communications.

**Table 7-1: Sending an Encrypted Message Using the RSA Method**

Once the public and private keys are generated, Crowley can distribute the public key widely (along with the prime product, n). Then anyone can use this key to encrypt a message meant only for Crowley.

Of course, to encrypt and decrypt messages, you need some way of converting text into a mathematical form, which is called *encoding.* Converting a number back to text is called *decoding.* Encoding and decoding should not be confused with encrypting or decrypting because you are not scrambling the information. Many different encoding schemes exist, and it doesn't matter which one you use, but all parties involved need to use the same one.

Let's assume that the letter *a* becomes a 0, the letter *b* becomes a 1, the letter *c* becomes a 2, and so on through the alphabet. In Table 7-2 we've encoded the message "fade" using this technique.

**Table 7-2: Encoding the Word "Fade" Using a Simple Scheme**

Now that we have an encoded message we want to send to Crowley, as well as Crowley's public key, we can encrypt his message so that only Crowley can decrypt it, as shown in Table 7-3.

**Table 7-3: Encrypting the Word "Fade" Using Crowley's Public Key**

Almost an identical scheme can be used to prove one's identity, using what is referred to as a *digital signature.*

- [1] Ron Rivest, Adi Shamir, and Leonard Adleman, "A Method for Obtaining Digital Signatures and Public-Key Cryptosystems,"
*Communications of the ACM*21, no. 2 (1978): 120-126.