Storing Passwords Securely - Willie Wheeler

Understanding hash functions

Hash functions are a tool from cryptography, and they are functions in the mathematical sense: for any given "acceptable" input, a hash function produces one specific output. In this case, the domain of acceptable inputs is plaintext strings, and the outputs are garbled-up versions of those strings, called hashes. (Hashing is not encryption in the usual sense: a hash is meant to be one-way, with no corresponding decryption step.)

Let's play around with some examples before we go on, just so you can see what I'm talking about. Go to your favorite search engine and type "online hash function calculator." (Obviously you can use command line tools and programming APIs to compute hash functions as well.) You should see some links that allow you to calculate hashes using specific hash functions with names like MD5, SHA-1, RIPEMD-160 and others.

Select an MD5 hash calculator, and enter some plaintext. When you hit submit, you should see a hash value. Note that whitespace in your plaintext is significant, so if you hash 'friend' and 'friend[CR]', the output will be different.
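If you'd rather use a programming API than an online calculator, here is a minimal sketch using Python's standard hashlib module. The helper names md5_hex and sha1_hex are my own, and I'm assuming the plaintext is encoded as UTF-8 before hashing (a different encoding can give a different hash for non-ASCII input).

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the MD5 hash of text as 32 hex digits (UTF-8 encoding assumed)."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def sha1_hex(text: str) -> str:
    """Return the SHA-1 hash of text as 40 hex digits (UTF-8 encoding assumed)."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


print(md5_hex("password"))   # 5f4dcc3b5aa765d61d8327deb882cf99
print(sha1_hex("password"))  # 5baa61e4c9b93f3f0682250b6cf8331b7ee68fd8

# Whitespace is significant: a trailing carriage return changes the hash.
print(md5_hex("friend") == md5_hex("friend\r"))  # False
```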

Here are some MD5 hashes:

Plaintext input                    MD5 hash (hex)
friend                             3af00c6cad11f7ab5db4467b66ce503e
friends                            28f20a02bf8a021fab4fcec48afb584e
password                           5f4dcc3b5aa765d61d8327deb882cf99
I pledge allegiance to the flag    bb00ea10b4ee04c4319c0c05bf9c29fc

Table 1. Passwords and their MD5 hashes

Just for kicks let's get some SHA-1 hashes for the same plaintext inputs:

Plaintext input                    SHA-1 hash (hex)
friend                             e69867ca7d5a7b0ab60a2a61e7b791c106f7bf64
friends                            3d9209c4598bfbc38b3c096081bee3a09697e939
password                           5baa61e4c9b93f3f0682250b6cf8331b7ee68fd8
I pledge allegiance to the flag    fc1d13f4e3b942d6c31185ff033c5b7acfe22751

Table 2. Passwords and their SHA-1 hashes

There's a lot of good stuff going on here. The main point is that MD5 and SHA-1 both do essentially the same thing: they convert plaintext into garbled text with certain properties (which we'll discuss momentarily). So they're both hash functions, albeit different hash functions.

I just mentioned "certain properties." Here are some worth noting:

  • All of the MD5 hashes are the same length as each other, and similarly, all of the SHA-1 hashes are the same length. MD5 hashes use 128 bits (usually expressed as 32 hex digits) and SHA-1 hashes use 160 bits (usually expressed as 40 hex digits). In general hash functions (or more exactly, cryptographic hash functions) return fixed-size hashes.
  • It was really easy to compute the hashes from the inputs. If you used one of the online calculators, for example, your hash value was calculated more or less instantly after you provided the plaintext.
  • Looking at the hashes, there's no obvious way to map them back to the plaintext. Ideally it is computationally infeasible to determine the plaintext input from the hash.
  • Similar inputs, such as 'friend' and 'friends', produce dramatically different hashes.
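The fixed-length and "similar inputs, very different hashes" properties are easy to check for yourself. The following is only a rough sketch using Python's hashlib; the differing_bits helper is a name I made up for illustration, and it simply counts how many bit positions differ between two equal-length digests.

```python
import hashlib

inputs = ["friend", "friends", "password", "I pledge allegiance to the flag"]

# Every MD5 hash is 32 hex digits and every SHA-1 hash is 40 hex digits,
# no matter how long the input is.
for text in inputs:
    data = text.encode("utf-8")
    md5_len = len(hashlib.md5(data).hexdigest())
    sha1_len = len(hashlib.sha1(data).hexdigest())
    print(md5_len, sha1_len, repr(text))


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions at which two equal-length digests differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# 'friend' and 'friends' differ by one character, but their 128-bit MD5
# digests differ in many bit positions.
changed = differing_bits(hashlib.md5(b"friend").digest(),
                         hashlib.md5(b"friends").digest())
print(changed, "of 128 bits differ")
```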

The properties just given are characteristic of hash functions. For a hash function to be "cryptographically secure," we typically want two more properties to be true:

  • Given a hash, it is computationally infeasible to find any input that produces that hash (this is known as preimage resistance).
  • Given an input, it is computationally infeasible to find a different input that produces the same hash (this is known as second-preimage resistance).
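To get a feel for what "computationally infeasible" means here, consider a deliberately weakened toy: keep only the first two bytes (16 bits) of an MD5 hash and brute-force a second input with the same truncated value. This is a sketch for intuition only, not an attack on MD5 itself; the function names and the "guess-N" candidate format are assumptions of mine.

```python
import hashlib
from itertools import count


def truncated_md5(data: bytes, num_bytes: int) -> bytes:
    """Return only the first num_bytes bytes of the MD5 digest (a toy hash)."""
    return hashlib.md5(data).digest()[:num_bytes]


def find_second_preimage(original: bytes, num_bytes: int):
    """Brute-force a different input with the same truncated hash.

    Returns the colliding input and the number of candidates tried.
    """
    target = truncated_md5(original, num_bytes)
    for attempts in count(1):
        candidate = b"guess-%d" % attempts
        if candidate != original and truncated_md5(candidate, num_bytes) == target:
            return candidate, attempts


# With 16 bits we expect on the order of 2**16 attempts, which is quick.
candidate, attempts = find_second_preimage(b"friend", 2)
print(candidate, "found after", attempts, "attempts")
```

Each extra bit of hash roughly doubles the expected work, so the same brute-force search against all 128 bits of MD5 would take on the order of 2^128 attempts. That gap is what the two properties above are relying on.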

The MD5 and SHA-1 hash functions are generally considered to be secure, though researchers have found weaknesses in each. For many practical purposes it is, at the time of this writing (late 2008), safe to use either of the two. If, however, you are doing something that requires exceptionally strong security (for example, financial applications), please consult a security expert.

That's what we needed to know about hash functions. Now let's learn how we can use them to store passwords securely.


Comments (12)

Great article, very timely given the continuous stream of news about security exploits and problems. Another item for consideration might be to use per-user random salts. If you continue down this line of tutorials, may I suggest a topic: how to address API security, and also browser-to-server security.

Thanks
By Aaron Raddon on Sep 1, 2008 at 9:13 PM PDT
Hi Aaron, really nice to see you here. :-D Thanks for the nice words about the article. I agree that security articles are timely and I'm afraid they probably always will be.

Per-user random salts are a great idea, especially in the case where you are somehow able to keep the random salts separate from the passwords. If the passwords are compromised but the salts are not, having random salts will be much more effective than using a sequence-based PK, because the attacker won't know what needs to be hashed, and a brute-force approach would probably be infeasible if the number of salt bits is sufficiently high. I don't myself know best practices around keeping passwords and per-user salts separate--it seems to me that if you store the salts and the passwords in the same table then the attacker will typically have either both the password and the salt or neither--but I'd be interested to hear if somebody can suggest a best practice in this area.

Having said that, non-random salts (such as a numeric PK) are still a lot better than just a straight hash. This at least forces the attacker to create a new rainbow table for each user he wants to attack instead of being able to rely on a single generic rainbow table.
By Willie Wheeler on Sep 1, 2008 at 10:02 PM PDT
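A minimal sketch of the per-user random salt idea from the exchange above, in Python. The function names, the 16-byte salt length, and the salt-then-password ordering are assumptions for illustration; SHA-1 is used only because it is one of the functions the article discusses.

```python
import hashlib
import hmac
import os


def new_salt(num_bytes: int = 16) -> bytes:
    """Generate a random per-user salt (16 bytes is an arbitrary choice here)."""
    return os.urandom(num_bytes)


def hash_password(password: str, salt: bytes) -> str:
    """Hash the salt followed by the UTF-8 password; return hex digits."""
    return hashlib.sha1(salt + password.encode("utf-8")).hexdigest()


def verify_password(password: str, salt: bytes, stored_hash: str) -> bool:
    """Recompute the salted hash and compare it with the stored one."""
    return hmac.compare_digest(hash_password(password, salt), stored_hash)


# Two users who pick the same password end up with different stored hashes.
alice_salt, bob_salt = new_salt(), new_salt()
alice_hash = hash_password("password", alice_salt)
bob_hash = hash_password("password", bob_salt)
print(alice_hash != bob_hash)                                # True
print(verify_password("password", alice_salt, alice_hash))   # True
print(verify_password("passw0rd", alice_salt, alice_hash))   # False
```

The salt has to be stored somewhere the application can read it at login time, which is exactly the "same table or separate?" question raised above.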
I did enjoy this article but perhaps it would better be entitled Storing Password Hashes Securely as you are in fact storing hashes and not passwords.

There are some situations which require storing passwords themselves. This is a very difficult problem I would love to see more written about.
By Richard Minerich on Sep 2, 2008 at 7:02 AM PDT
Hi Richard. I can see what you are saying. I agree that in some cases you want to store passwords themselves (or at least encrypted passwords that you can decrypt). That can happen for example if you need to use the password to authenticate into some other system. In that case you might consider using an actual cipher (two-way) to encrypt the password before storage instead of a hash function (one-way).
By Willie Wheeler on Sep 2, 2008 at 7:26 AM PDT
Nice article Willie!!! Can you share some best practices to implement "remember me" functionality? Do you think using a "salted hash" is a good enough solution to store passwords in cookies?
By Venugopal on Sep 2, 2008 at 8:48 AM PDT
Excellent article describing a subject that can be difficult for people to get their heads around.
Do you have any more advice about choosing a salt? Would you create a random string and store it with the user?
By Ben on Sep 2, 2008 at 9:33 AM PDT
Great article Willie. You've opened my eyes to the random salt concept. Thanks, Collin
By Collin on Sep 2, 2008 at 10:19 AM PDT
@Venu: Regarding remember-me, one suggestion would be to recognize the reality that it trades security for usability. It's really authenticating the browser (using a persistent cookie) instead of the user, which is sometimes OK and sometimes not. (For example, several users in a household can share a machine; several users may share a public machine.) So I would suggest that when someone remember-me's into your app, the safest assumption is that the user is very possibly *not* who they appear to be. This assumption forces you to distinguish high- and low-risk functionality, and if the user attempts something high-risk, you force an actual login. Amazon does exactly this. A remember-me user can see product recommendations, but if he wants to buy something he has to log in.

For the actual mechanics (for instance, what tokens to use), there are multiple approaches. Here are two you might take a look at:

http://static.springframework.org/spring-security/site/apidocs/org/springframework/security/ui/rememberme/TokenBasedRememberMeServices.html

and

http://static.springframework.org/spring-security/site/apidocs/org/springframework/security/ui/rememberme/PersistentTokenBasedRememberMeServices.html

The first doesn't require a database, but it does involve storing the username in the cookie, which may be unacceptable in certain cases. (For example, if you are doing remember-me for a credit repair website, you may not want to expose usernames in cookies.) The second does require a database but it avoids the username problem. It is based on the article

http://jaspan.com/improved_persistent_login_cookie_best_practice

Hope that helps.
By Willie Wheeler on Sep 2, 2008 at 9:16 PM PDT
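For readers who want a feel for the second, database-backed approach, here is a rough Python sketch of a persistent-token scheme: the cookie holds a random series identifier and a random token (no username), and the token is replaced on every successful cookie login. This is not Spring Security's code or API; the in-memory dict stands in for the database table, and all names are hypothetical.

```python
import secrets

# Hypothetical stand-in for a database table: series -> (username, token).
_tokens = {}


def issue_cookie(username: str):
    """Create a new (series, token) pair after a real username/password login."""
    series = secrets.token_urlsafe(16)
    token = secrets.token_urlsafe(16)
    _tokens[series] = (username, token)
    return series, token


def login_with_cookie(series: str, token: str):
    """Return (username, new_cookie) on success, or None if the cookie is rejected."""
    record = _tokens.get(series)
    if record is None:
        return None
    username, expected = record
    if not secrets.compare_digest(token, expected):
        # Known series but stale token: the cookie may have been copied,
        # so drop the whole series and force a real login.
        del _tokens[series]
        return None
    # Rotate the token so each cookie value can be used only once.
    new_token = secrets.token_urlsafe(16)
    _tokens[series] = (username, new_token)
    return username, (series, new_token)


series, token = issue_cookie("alice")
first = login_with_cookie(series, token)
print(first[0])                          # alice
print(login_with_cookie(series, token))  # None: the old token was already used
```

Note that in this sketch a replayed token invalidates the series entirely, so the freshly rotated cookie stops working too and the user has to log in again.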
@Ben: See my response to Aaron, but I'll elaborate.

One approach is to use randomization to create secret salts. That makes brute forcing the hash orders of magnitude harder because your salt search space is now orders of magnitude larger. But this only works if the salts can be kept secret. If the attacker has the per-user salts (which seems likely if he already has the password database) then brute-forcing a hash with a random salt is no different than brute-forcing a hash with a nonrandom salt--once the salt is revealed, it makes no difference whether the salt is random or not. The salt is known; the search is over. :-)

If you can keep the salt(s) secret (either global salt or per-user, however you decide to do it), then your passwords are much more secure, since it's essentially like having two passwords, with at least one of them being very strong. But now you just have to figure out how you're going to manage to keep the salts secret. That is a challenge in its own right.

The other approach is to treat the salts as known rather than secrets. Though it's easier to crack this scheme, this approach is still useful because it foils attempts to use a precomputed rainbow table against your passwords. The attacker can still get at your passwords by building new rainbow tables, but now he has to work at it. :-) You're essentially creating a deterrent that will cause many attackers to look elsewhere, and such deterrents are very much a tool you can use to enhance security.
By Willie Wheeler on Sep 2, 2008 at 9:57 PM PDT
Just one more thought regarding the deterrent thing I just mentioned. I was at a security talk the other day and the presenter noted that attackers are very much ROI-driven. Deterrence adds security because it drives down the ROI.
By Willie Wheeler on Sep 2, 2008 at 10:12 PM PDT
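The "known salt as deterrent" point in this thread can be sketched in a few lines of Python. For simplicity this uses a plain precomputed lookup table (a dict) rather than a true rainbow table, which is a more space-efficient structure serving the same purpose; the salt value "42" is a made-up stand-in for a user's numeric PK.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    """Return the SHA-1 hash of data as hex digits."""
    return hashlib.sha1(data).hexdigest()


# The attacker's generic table, computed once and reused against every site.
common_passwords = ["friend", "friends", "password"]
precomputed = {sha1_hex(p.encode("utf-8")): p for p in common_passwords}

# An unsalted stored hash falls to a single lookup.
unsalted_hash = sha1_hex(b"password")
print(precomputed.get(unsalted_hash))  # password

# With a known salt (here a made-up numeric PK), the generic table misses.
salt = b"42"
salted_hash = sha1_hex(salt + b"password")
print(precomputed.get(salted_hash))    # None

# The attacker can still win, but only by rebuilding the table for this salt.
rebuilt = {sha1_hex(salt + p.encode("utf-8")): p for p in common_passwords}
print(rebuilt.get(salted_hash))        # password
```

The extra table-building step per user is the cost that drives the attacker's ROI down.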
When will your book Spring in Practice come to the Indian market?
By punit singh on Nov 15, 2008 at 2:41 AM PST
@punit: Thanks for asking. I'm not sure how Manning handles international sales, so I can't speak to the Indian market. In terms of the book itself being available, the original estimate was June 2009, but I suspect that we will push that back a bit since the original schedule didn't account for Spring 3, and we obviously want to cover Spring 3.
By Willie Wheeler on Nov 17, 2008 at 1:10 AM PST

Copyright © 2008 Wheeler Software, LLC.