Introduction to Strong Cryptography – p1.1 – Keyed Hashing, Hash Attacks, MACs, Shakespeare

Posted: June 20, 2011 in cryptography
Tags: , , ,

This article is one of many in my Strong Cryptography series. Today we’ll dive into some deeper to some Hashing topics. Also I’m taking suggestions for subjects in my posts… Last time it was US Founding Fathers, this week it’s tragedies.

We learned in our last article what a hash function is and how for every input, there is exactly one corresponding output. We’re going to apply that concept to two commons situations where confidentiality is needed.

Common Situation #1: A situation in Verona

Lets say you have Romeo who needs to email the following message to Mercutio: “I dreamt a dream tonight.” Unfortunately for Romeo, all the ISPs in Verona are owned by the Montague family (Not much better than Comcast). Romeo doesn’t mind if the Montagues can read his message, but he certainly has an issue of they change his message. How could Romeo be sure his message was delivered without being modified by the Montague family?

Well you could run the message through a hash function and attached the hash to the message. The Montagues however, could simply modify the message, and attached the new hash value. To get around that, I suppose Romeo could calculate the hash, then tell Mercutio in private what the hash value was. However, this would defeat the purpose of emailing him, since in private he could just tell Mercutio the contents of the message.

Any ideas?

Introducing the new MAC (not what you think)

Fortunately, there is an easy solution: Message Authentication Codes or MAC. A MAC function is a ‘keyed hash function’, so you could call it a cousin of a normal hash function. In fact, MAC functions and hash functions are nearly interchangeable! In the last article, we covered the properties of a hash function. So what makes a MAC function different from a hash function?

  • You must posses a secret key can produce a hash value.
  • A MAC function shouldn’t reveal ‘hints’ about the secret key, even if an attacker chooses the input.
Because you must posses a secret key to produce the hash value, they are often called “Keyed Hash Functions.” So lets run the quote “The lady doth protest too much, methinks” through a MAC function:
	private static final byte[] keybytes = new byte[] { (byte) 0xfc, (byte) 0x4f, (byte) 0xbe, (byte) 0x23,
		(byte) 0x59, (byte) 0xf2, (byte) 0x42, (byte) 0x37, (byte) 0x4c, (byte) 0x80, (byte) 0x44, (byte) 0x31,
		(byte) 0x20, (byte) 0xda, (byte) 0x20, (byte) 0x0c };

	public static void main(String[] args) throws Exception {
		Mac mac = Mac.getInstance("HMACSHA1");
		Key key = new SecretKeySpec(keybytes, "HMACSHA1");
		mac.init(key);
		byte[] hashValue = mac.doFinal("The lady doth protest too much, methinks".getBytes());
		String encodedHashValue = Base64.encodeBase64String(hashValue);
		System.out.println(encodedHashValue);
	}

The above program produces this output:

oOUKVR5hDRh4n0H+beVO4JvMw64=

If you use a different key (lets change the last byte to 0x0d) you’ll get a completely different hash:

cDdJwEBm9qIni9A7QfIQ1e9G8qo=

So how does apply to Romeo and Mercutio? First, they meet in private and create a secret key. Both Romeo and Mercutio will know the secret key. Next, when they want to send an email, they run the message through the MAC function and produce a signature. Finally, the recipient runs the message through the same MAC function, using the same key. It the calculated signature matches the signature on the message, they can be assured the message was not tampered with during transmission.

A MAC is actually just a wrapper

Take note of the following line in the above implementation:

Mac mac = Mac.getInstance("HMACSHA1");

You may remember from the last article that SHA1 is a fairly standard hash algorithm. Well here’s the whole story: a MAC function is just a wrapper around a standard hash function. The HMAC part of HMACSHA1 is the method of wrapping. There are other MAC implementations, such as VMAC, but they are less commonly used. VMAC performs a little better than HMAC. Since HMAC and VMAC are just wrappers, you can take almost any hash function and turn them into a keyed hash function. Using SHA1’s older brothers, we can run the same quote through the program:

  • HMACSHA256: POP9cefgoEC9pUaWXqQ8lBbW9CdMi1k3t7LXAGYl87s=
  • HMACSHA512: ALFIpPMbphQJ7KZQHacGIFH3T2qI5AKUqaD8lDilNnjajGL29cZdp68sLeQTjDKD+cIAfZN86udfRdecGeTm0A==

Common Situation #2: Revisiting storing passwords in a database

If you recall in the last article, we decided on the following table structure:

USERNAME PASSWORD_HASH SALT
tjefferson43 HiE0AZhRWAs6Mmd7dVqppM1WtJc= WTg68cVxPI
paulrevere1776 aktJ/0cn69Y41vssNfZDHY1CsdE= sWVUaTGa6e

Lets add one additional column:

USERNAME PASSWORD_HASH SALT MAC_KEY
tjefferson43 RFu51fVI/0y8gmPXkjo0Op9FRHU= WTg68cVxPI KvzpFe690vsVcW1jLqQ1vg==
paulrevere1776 SmZedewg1JD7kRhndEW2cRmcyGw= sWVUaTGa6e FgIpIDFi9kJgXvkp44ua4Q==

So what did we do here? Instead of using straight sha-1 function to store a representation of the user’s password, we now use HMACSHA1. What does this do for you? Not much. The better question is what does it to for hackers… It makes their life extremely difficult!

A MAC defends you from hackers (again, not what you’re thinking)

In our original table setup, a hacker could use two forms of attacks to crack our passwords. He could use a rainbow attack, (yes it’s really called that… No I don’t know why) or a brute force attack (not to be confused with a Bruce force attack).

A rainbow attack uses a multi-terrabyte database to do a reverse lookup on our hash algorithm. The database is generated ahead of time, possibly by cooperating parties and thousands of separate computers to divide up the workload. Even with the combined effort, this still takes months to generate useful tables. However, with the finished rainbow table, an attacker could lookup a hash function output to get an arbitrary input that produces the same hash as the user’s real password. Yikes! The nice thing is, using HMACSHA1 makes this infeasible for a hacker. Because we have keyed our hash algorithm differently for every user’s password, they would have to collectively regenerate the rainbow table for every individual entry, making the rainbow very very very difficult in practice!

A brute force attack is related to the rainbow attack. A brute force attack tries random combinations of letters and numbers, or even common english words until it finds an input that produces the correct output. Unfortunately, HMACSHA1 provides little additional security against brute force attacks. However, because they require powerful computing systems to run, they are less common. There are only two ways to defend against brute force attacks:

  • Enforce good polices around passwords:
    • 12 characters
    • No english words
    • at least 2 numbers
    • at least 2 symbols
    • at least 2 uppercase letters
  • Use SHA-512 or SHA-256 instead of SHA-1.

If you’ve enjoyed this article, all that I ask if you vote me up on DZone and HackerNews. Until next time, thank you for reading!

About these ads
Comments
  1. Yuri Niyazov says:

    In the “defending from hackers” part of the post, could you clarify how the salt and the hmac interact?

  2. Yuri Niyazov says:

    What is unclear is this: “However, with the finished rainbow table, an attacker could lookup a hash function output to get an arbitrary input that produces the same hash as the user’s real password. ”

    Given that we are using salts, it seems to me that they would need to find arbitrary input that produces the same hash given the use of same salt. If that’s the case, why do we need the mac_key? Or are you implying that the rainbow tables include hashes for all passwords and all salts? That seems infeasible.

    • Jonathan says:

      >> Or are you implying that the rainbow tables include hashes for all passwords and all salts? That seems
      >>infeasible.

      It it feasible, but not for those reasons. You don’t have to have the entire keyspace of the hash function covered. Remember that there are a fixed number of outcomes in a hash function meaning that all hash functions are periodic. They also contain outcomes that are unreachable (dead space in the key space). Finally, and most importantly, the attacker doesn’t have to have recovery the original plaintext, he just needs a particular plaintext that will produce the same hash value as the real plaintext (a collision).

      A salt adds entropy to a users password. It’s designed to keep an attacker from guessing other passwords from a single password. It *does* offer some security against _simple_ Rainbow tables, but the advanced rainbow tables available these days mitigate those benefits.

      HMAC makes an attackers job one step harder by changing what the values mean on his rainbow tables. Even if he had the entire keyspace of the sha-1 function covered in his rainbow table (which is infeasible with today’s technology), he wouldn’t be able to produce a collision.

      If the attacker has the HMAC key, using HMAC only slight, if any, improvement against brute force attacks. Brute force attacks without the HMAC key fairly impossible, but with the database structure above, they would always have the HMAC key.

      • Yuri Niyazov says:

        “Finally, and most importantly, the attacker doesn’t have to have recovery the original plaintext, he just needs a particular plaintext that will produce the same hash value as the real plaintext (a collision).”

        It is this part that confused me in the original discussion, and I am still a bit confused. When a user registers an account, the following occurs:

        salt = securerandom();
        crypted_password = sha1(plaintext + salt)

        When a normal user attempts to login, what happens is:
        plaintext = params[:password]
        if (crypted_password == sha1(plaintext + salt)) {
        allow_login()
        }

        Doesn’t that mean that the attacker needs the salt during the rainbow table computation? If so, then what makes an HMAC different from a salt in this situation?

  3. Jonathan says:

    >>Doesn’t that mean that the attacker needs the salt during the rainbow table computation?
    No, he doesn’t…

    If you have:
    hash1 = sha1(Salt + password)
    it’s possible for an attacker to find using a rainbow table:
    hash2 = sha1(someRandomString)
    so
    hash1 == hash2

    So even though someRandomString isn’t the original password, it doesn’t matter because the sha1 function create the same hash for two different strings. This is called finding a collision.

    Your original password + salt might have been: “s3Cu3eP4aASJDFPAOWJEF”

    but the attacker found that: “!@FA(VM>+!@$_()JCMF!PA)!_#_(%” produce the same hash using his rainbow table. I think the piece you might be misunderstanding is the attackers input is _arbitrary_. It will not match the password+salt.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s