Why bad grammar makes good passwords
Out of 1,434 real passwords that were at least 16 characters long, researchers' software was able to crack more than 20 percent in a few hours.
Wed, Jan 23, 2013 at 02:40 PM
Lousy grammar and nonsense may help boost password strength, a new research paper argues.
Because the minimum recommended length of passwords has increased over the past few years, it's often suggested that users create passphrases, or groups of words, that can be strung together to form a long password.
For example, "the big boy runs fast" becomes "thebigboyrunsfast." Run through a hashing algorithm, the one-way scrambling that websites typically use to store passwords, it becomes a long string of random-looking gibberish.
(Because "thebigboyrunsfast" is all lower-case letters, it's still a lousy password; "th3b!6b0YruN5f@5t" would be much better.)
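To make the "random-looking gibberish" concrete, here is a minimal Python sketch that scrambles both example passphrases. SHA-256 is an assumption chosen purely for illustration; the article does not name an algorithm, and real password storage would normally add a salt and a deliberately slow function. The helper name hash_passphrase is hypothetical.

```python
import hashlib


def hash_passphrase(passphrase: str) -> str:
    """Return the SHA-256 digest of a passphrase as 64 hex characters.

    SHA-256 is an illustrative choice, not the algorithm from the article.
    """
    return hashlib.sha256(passphrase.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # The two example passphrases from the text above.
    for candidate in ("thebigboyrunsfast", "th3b!6b0YruN5f@5t"):
        print(candidate, "->", hash_passphrase(candidate))
```

The same input always produces the same output, which is what lets a cracker test guesses: scramble a guess and see whether the result matches.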
But phrase-based passwords have an Achilles' heel: most English speakers will craft phrases that make sense.
Ashwini Rao and Gananand Kini at Carnegie Mellon and Birendra Jha at MIT have developed proof-of-concept password-cracking software that takes advantage of that weakness. It cracks long passwords, and beats existing cracking software, simply by following rules of English grammar.
"Using an analytical model based on parts-of-speech tagging, we show that the decrease in search space due to the presence of grammatical structures can be as high as 50 percent," the researchers write in their paper.
Figures of speech
The concept is simple and a bit like doing a crossword puzzle.
The researchers' software assumes that there will be regular parts of speech in most long passwords. It separates possible component words into categories such as nouns, verbs and adjectives and uses grammar to predict which words might appear.
For example, there are tens of thousands of common nouns, adjectives and adverbs in English, but only three articles: "a," "an" and "the."
If a phrase-based password uses proper grammar, it's likely to include articles. That shrinks the number of words that can plausibly fill each position and makes the phrase easier for a password-cracking computer to guess.
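A rough back-of-the-envelope calculation shows how much a grammatical pattern can shrink the guessing job. The Python sketch below is not the researchers' model: apart from the three articles, the vocabulary sizes are made-up round numbers, and the names VOCAB, unstructured_space and template_space are hypothetical.

```python
import math

# Vocabulary sizes per part of speech. Only the article count (3) comes
# from the text; the other figures are invented for illustration.
VOCAB = {
    "article": 3,
    "adjective": 5000,
    "noun": 20000,
    "verb": 8000,
    "adverb": 2000,
}


def unstructured_space(num_words: int) -> int:
    """Number of phrases if any word may appear in any position."""
    return sum(VOCAB.values()) ** num_words


def template_space(template) -> int:
    """Number of phrases if each position must match a part of speech."""
    return math.prod(VOCAB[tag] for tag in template)


# The pattern behind "the big boy runs fast".
TEMPLATE = ("article", "adjective", "noun", "verb", "adverb")

if __name__ == "__main__":
    free = unstructured_space(len(TEMPLATE))
    fixed = template_space(TEMPLATE)
    print(f"any word anywhere: about {math.log2(free):.1f} bits")
    print(f"grammar template:  about {math.log2(fixed):.1f} bits")
```

A real cracker would try many grammatical patterns rather than one, so the true saving depends on how many patterns people actually use.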
When trying to crack the hashed (scrambled) version of "thebigboyrunsfast," a grammar-cracking computer would guess that the first word might be an article.
For "the," the software would also try variants such as "thE" and "th3," because "3" is a common numerical substitute for "e." That's a lot easier than running through all 94 possible characters on a standard keyboard at every position.
With a short list of likely first words in hand, the next step would be to guess the next word.
Instead of running through all the possible choices, the algorithm starts by limiting its choices to word lists of nouns and adjectives and their most common numerical-substitution variants.
Knowing that adjectives usually precede nouns in English, it would pair adjectives such as "big" with common nouns such as "boy" and try those combinations first.
The software cannot confirm a guess one letter or one word at a time, though. It has to scramble each complete candidate phrase the same way and compare the result with the stored gibberish; only an exact match reveals the password.
This grammar-based process greatly cuts down on computing time, making it possible to crack a good percentage of relatively long passphrases.
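The grammar-guided guessing described above can be sketched in a few lines of Python. This is a toy illustration, not the researchers' software: the word lists, the substitution table, the single article-adjective-noun-verb-adverb pattern and the use of SHA-256 are all assumptions, and the names WORDS, SUBSTITUTIONS, variants and crack are hypothetical.

```python
import hashlib
from itertools import product

# Tiny made-up word lists; a real cracker would load large dictionaries.
WORDS = {
    "article": ["a", "an", "the"],
    "adjective": ["big", "red", "old"],
    "noun": ["boy", "dog", "car"],
    "verb": ["runs", "eats", "goes"],
    "adverb": ["fast", "slow", "well"],
}

# Assumed substitution table, covering only the "e" example from the text.
SUBSTITUTIONS = {"e": ["e", "E", "3"]}

TEMPLATE = ("article", "adjective", "noun", "verb", "adverb")


def variants(word):
    """Yield every spelling of a word allowed by the substitution table."""
    options = [SUBSTITUTIONS.get(ch, [ch]) for ch in word]
    for combo in product(*options):
        yield "".join(combo)


def sha256_hex(text):
    """Scramble a candidate the same way the target was scrambled."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def crack(target_hash, template):
    """Try every phrase that fits the template.

    Returns (phrase, guesses) on a match, or (None, guesses) if the
    template is exhausted without one.
    """
    slots = [[v for w in WORDS[tag] for v in variants(w)] for tag in template]
    guesses = 0
    for parts in product(*slots):
        guesses += 1
        candidate = "".join(parts)
        # Only a complete candidate can be checked against the target.
        if sha256_hex(candidate) == target_hash:
            return candidate, guesses
    return None, guesses


if __name__ == "__main__":
    target = sha256_hex("thebigboyrunsfast")
    print(crack(target, TEMPLATE))
```

With these toy lists the whole pattern contains only a few thousand candidates, so a grammatical phrase falls almost instantly, while a phrase that ignores the pattern is never found.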
Three times as good
In the researchers' data set of 1,434 real passwords that were at least 16 characters long (drawn from a pool of nearly 40 million known passwords), their software was able to crack more than 20 percent in a few hours. Commonly used cracking software was able to crack only 6 percent.
Even better, 10 percent of the data set was cracked only by the researchers' software, and not by any of the three pieces of common cracking software.
"Long passwords [are] a promising user authentication mechanism," Rao and her colleagues conclude. "However, to achieve the level of security and usability envisioned with long passwords, we have to understand the effect of structures present in them. Further, we have to make policies and enforcement tools cognizant of the effect of structures."
Rao and her colleagues don't put forward any examples, but the upshot of their research is that it's best not to craft long passwords from phrases that both use proper grammar and make sense.
Instead, get creative. Try poor grammar and spelling, as in "de whippoorsnapper sashay sideway," or get completely silly, as in "flipper flopper fliddle fladdle."
It doesn't matter how correct it is, as long as you can easily remember it.