Here we discuss some of the critical factors in creating password dictionaries, and use various methods to create some examples.
Please create good karma by using these techniques for good reasons: to protect rather than attack, and to show people the risks and how and why to choose better passwords.
There can be big differences in the approach for online or offline attacks, but basically we are trying to obtain or create a dictionary that contains the password. The dictionary needs to be big enough to contain the password, but small enough to be processed in a reasonable time-frame.
Offline hash cracking
If you have obtained some password hashes, you have all the time in the world to attack them offline, that is, until the passwords change of course.
Having a good dictionary that contains the password could save a large amount of time, making it unnecessary to use a brute force attack (trying every possible combination).
Due to the offline nature of hash cracking, very large dictionaries can be used, and will prove to be effective in many cases, as users generally choose poor passwords.
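To put some rough numbers on that trade-off, here is a back-of-the-envelope sketch. The 8-character password drawn from a-z and 0-9, and the rate of 1,000,000 guesses per second, are assumptions I picked purely for illustration; real rates vary enormously with the hash type and hardware.

```shell
# Assumptions (illustrative only): 8-character passwords drawn from
# a-z and 0-9, and a cracking rate of 1,000,000 guesses per second.
charset=36
length=8
rate=1000000
dict_size=1500000   # e.g. a 1.5 million word dictionary

# Keyspace = charset ^ length, computed with a simple loop.
keyspace=1
i=0
while [ "$i" -lt "$length" ]; do
  keyspace=$(( keyspace * charset ))
  i=$(( i + 1 ))
done

brute_days=$(( keyspace / rate / 86400 ))   # worst case, whole days
dict_secs=$(( dict_size / rate ))           # worst case, whole seconds

echo "Brute force keyspace: $keyspace candidates"
echo "Brute force worst case: about $brute_days days"
echo "Dictionary worst case: about $dict_secs second(s)"
```

Under these assumptions the dictionary run finishes almost immediately, while the brute force run takes weeks, which is why it pays to try the dictionary first.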
Online attacks
Online attacks can be much more difficult, as sending thousands of attempts against an online system can be very time-consuming.
In addition, many online systems and protocols have protection mechanisms built in. These mechanisms limit the number of attempts per IP address, or within a given session or time-frame, to protect the system from brute force attacks. If the system is properly configured by the administrator, failed attempts will be logged so that attacks can be reviewed. Some systems can automatically disable usernames that are being attacked. Although this opens the door to DoS attacks, where an attacker deliberately locks out users, it does offer great protection against unauthorized access.
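From the defender's side, reviewing those logged failures might look something like the sketch below. It assumes OpenSSH-style "Failed password" log lines; the sample entries, the temporary log file and the threshold of 3 are all made up for illustration, so adjust them to whatever your own system actually logs.

```shell
# Build a small sample log (made-up entries in OpenSSH's "Failed password" style).
log=$(mktemp)
cat > "$log" <<'EOF'
Jan 10 10:00:01 host sshd[101]: Failed password for root from 10.0.0.5 port 4001 ssh2
Jan 10 10:00:02 host sshd[102]: Failed password for admin from 10.0.0.5 port 4002 ssh2
Jan 10 10:00:03 host sshd[103]: Failed password for invalid user test from 10.0.0.5 port 4003 ssh2
Jan 10 10:00:09 host sshd[104]: Accepted password for alice from 10.0.0.7 port 4004 ssh2
Jan 10 10:01:15 host sshd[105]: Failed password for alice from 10.0.0.9 port 4005 ssh2
EOF

# Count failed attempts per source IP (the field that follows "from")
# and report any address at or above the assumed threshold.
threshold=3
suspects=$(awk -v t="$threshold" '
  /Failed password/ {
    for (i = 1; i < NF; i++) if ($i == "from") count[$(i + 1)]++
  }
  END { for (ip in count) if (count[ip] >= t) print ip, count[ip] }
' "$log")

echo "$suspects"
```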
In online attacks, large dictionaries and brute force attacks are generally impractical.
The key to success is developing or choosing a relatively small dictionary that is still likely to contain the password.
Userlists
For online attacks you are most likely going to need a list of usernames as well as passwords. It is important that this list is focused (there is no point in attacking usernames that don't exist), so these usernames would mostly be based on a list of common names (such as root, admin, Administrator etc.) or gained from previous reconnaissance.
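As a minimal sketch, a focused user list can be put together by merging the common names with whatever reconnaissance turned up. The recon names (jsmith, mjones) and the file names here are hypothetical.

```shell
work=$(mktemp -d)

# Common default account names.
printf '%s\n' root admin Administrator > "$work/common.txt"

# Hypothetical names gathered during reconnaissance.
printf '%s\n' jsmith admin mjones > "$work/recon.txt"

# Merge the two lists and remove duplicates.
sort -u "$work/common.txt" "$work/recon.txt" > "$work/users.txt"

user_count=$(wc -l < "$work/users.txt" | tr -d ' ')
echo "$user_count usernames in $work/users.txt"
```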
Simple tests
Once you have your username list, online attack tools such as Hydra can be used to run basic tests (such as trying a blank password, or a password the same as the username) before moving on to more exhaustive tests. These quick tests sometimes yield results, so they are well worth trying first. Here is an example using SSH:
hydra 192.168.1.35 ssh2 -s 22 -L users.txt -p password -e ns -t 10
-s is the port to connect to
-L is the user list
-p is a password to try
-e ns tries an empty password (n) and a password the same as the username (s)
-t is the number of parallel connections
Even just these basic tests can produce some level of success. Surprising? Also, because of the small number of combinations the test is very fast.
Then more exhaustive attacks can be performed using word-lists. For example:
hydra 192.168.1.35 ssh2 -s 22 -L users.txt -P wordlist.txt -t 10
So where do you get the wordlist from?
There is already a substantial password dictionary built into Backtrack. You can find it at the following location:
/pentest/windows-binaries/misc/wordlist.txt
This is a list of 1.5 million potential passwords, all uppercase (I guess with a view to enumerating LM hashes) but you can quickly turn this into the lowercase equivalent and/or combine the two as follows:
tr '[:upper:]' '[:lower:]' < /pentest/windows-binaries/misc/wordlist.txt > lowercase.txt
cat /pentest/windows-binaries/misc/wordlist.txt lowercase.txt | sort -u > combined.txt
This doesn't give mixed case of course, but can potentially double your chances if you are cracking NTLM hashes, i.e. it gives you 3 million to try.
Note: This will be way too big for online attacks, and you will likely need something more focused in those cases.
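On the mixed-case point, one cheap variant worth considering is capitalising only the first letter, since that is a common way for users to satisfy an "uppercase required" rule. This is just a sketch on a three-word stand-in for the uppercase list; the sample words and file names are made up.

```shell
work=$(mktemp -d)

# Stand-in for the all-uppercase wordlist.
printf '%s\n' PASSWORD LETMEIN DRAGON > "$work/upper_sample.txt"

# Uppercase the first character and lowercase the rest of each line.
awk '{ print toupper(substr($0, 1, 1)) tolower(substr($0, 2)) }' \
  "$work/upper_sample.txt" > "$work/capitalised.txt"

cat "$work/capitalised.txt"
```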
Online dictionary downloads
There are many dictionaries available online, such as the Milw0rm dictionary, which is a downloadable file containing passwords from years' worth of submissions to an online hash-cracker used by hackers and pentesters. This dictionary has many of the most common passwords among its 83,000+ entries.
The Milw0rm website is no longer online, but a Google search such as the following should find something:
inurl:milw0rm.txt
Is your password in this dictionary? Take a look.
Are you thinking about changing your password right now?
Google is such a helpful chap. Try searching Google with something like:
dictionary filetype:txt
See what you can find. There are lots of languages and resources to choose from, but they may need to be trimmed, edited, joined etc.
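The trimming and joining can be sketched roughly as follows. The two "downloaded" lists are made up (one with Windows line endings, one with stray spaces and a blank line). Note the assumption that leading and trailing spaces are junk; if you expect passwords that genuinely start or end with a space, leave the sed step out.

```shell
work=$(mktemp -d)

# Two made-up "downloaded" lists with typical formatting problems.
printf 'password\r\nletmein\r\n' > "$work/list1.txt"
printf '  dragon \nletmein\n\n' > "$work/list2.txt"

# Join the lists, strip carriage returns, trim leading and trailing
# whitespace, drop empty lines, and remove duplicates.
cat "$work/list1.txt" "$work/list2.txt" \
  | tr -d '\r' \
  | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' \
  | grep -v '^$' \
  | sort -u > "$work/cleaned.txt"

cat "$work/cleaned.txt"
```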
Creating your own dictionaries
As I showed in one of my previous blogs, it is possible to take a text file and turn it into a dictionary. Here is an example to create a simple word-list dictionary from a text file:
cat sourcefile.txt | tr '[:punct:]' ' ' | xargs -n 1 -P0 echo >> wordlist.txt
sort -u wordlist.txt > dict.txt
This expression takes the source-file, removes the punctuation, processes the words one per line into a file, and sorts the words to remove duplicates.
You may need quite a big file to create a reasonable dictionary. As an example, I chose one of my blog entries, which was 6K, and ended up with 413 words, including capitalization. We can remove capitalization and count what we have left with the following:
cat dict.txt | tr '[:upper:]' '[:lower:]' | sort -u | wc -l
Only 386 unique words out of 2,282 - maybe I need to improve my vocabulary ;o) ?
Anyway, the point is that we can choose the files, thus targeting our dictionary based on the content of the files. For example, if you were pentesting a company, you could gather some of their online literature, and use that as a source for your dictionary.
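If the resulting list needs to stay small (for an online attack, say), one option is to rank the words by how often they appear in the source and keep only the top of the list. A minimal sketch, using a made-up sentence in place of real company literature:

```shell
work=$(mktemp -d)

# Made-up sample text standing in for the gathered source material.
printf 'Acme widgets are great. Acme sells widgets, and Acme ships fast!\n' \
  > "$work/source.txt"

# One lowercase word per line, counted, most frequent first.
tr '[:punct:]' ' ' < "$work/source.txt" \
  | tr '[:upper:]' '[:lower:]' \
  | tr -s '[:space:]' '\n' \
  | grep -v '^$' \
  | sort | uniq -c | sort -rn \
  | awk '{ print $2 }' > "$work/ranked.txt"

# Show the two most frequent words.
head -n 2 "$work/ranked.txt"
```

Words with the same count come out in no particular order, so only the relative ranking between different counts is meaningful.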
Sequences can easily be created
Taking the example above, we can add some numbers on the end of each word (something users often do to get around password policies on reuse and expiry):
while read -r word; do for num in $(seq 1 13); do echo "$word$num"; done; done < dict.txt > dictplusnum.txt
This adds the numbers 1 to 13 to the end of each word, producing a much larger dictionary, but one that may produce better results.
Nouns and Names
There are lots of resources online that can be used to create dictionaries.
One example I have used is using online census data. Take the US census data as an example:
http://www.census.gov/genealogy/names/names_files.html
Another advantage of using this type of resource is that the names in a census are often sorted by popularity, so for example you can create a dictionary of the top 500 male first names, female first names, and surnames with something like the following:
wget http://www.census.gov/genealogy/names/dist.all.last
wget http://www.census.gov/genealogy/names/dist.female.first
wget http://www.census.gov/genealogy/names/dist.male.first
head -n 500 dist.* | cut -d" " -f1 | grep -v == | tr '[:upper:]' '[:lower:]' | sort -u > topnamesdict.txt
The expression takes the first 500 lines in each file, removes extra data, converts the content to lowercase, and sorts to remove duplicates.
In my test, this yields 1388 names (due to duplicates in the lists). Obviously you could use the whole files for a more exhaustive attack, or alternatively use these resources to create username lists.
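For the username idea, here is a rough sketch that combines first names and surnames. The "first initial + surname" format is only an assumed convention (check what the target actually uses), and the two tiny name lists are made-up stand-ins for the census-derived files.

```shell
work=$(mktemp -d)

# Made-up samples standing in for lowercase first-name and surname lists.
printf '%s\n' james mary > "$work/first.txt"
printf '%s\n' smith jones > "$work/last.txt"

# Build "first initial + surname" usernames, e.g. jsmith.
while read -r first; do
  initial=$(printf '%s' "$first" | cut -c1)
  while read -r last; do
    printf '%s%s\n' "$initial" "$last"
  done < "$work/last.txt"
done < "$work/first.txt" | sort -u > "$work/usernames.txt"

cat "$work/usernames.txt"
```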
Spidering websites
There are various tools for spidering websites and creating dictionaries from their content. One such tool is called cewl, which can be found here:
http://www.digininja.org/projects/cewl.php
Backtrack 4 users can find some install corrections and troubleshooting here:
http://www.backtrack-linux.org/forums/backtrack-howtos/27991-installing-cewl-backtrack4.html
Usage is of the form:
./cewl.rb -d 1 -v www.bbc.co.uk > bbc.txt
So, if you know a person's interests, be it celebrities or football, cars or gardening, you can find some sites related to them, spider those sites, and create a focused dictionary based on that user's interests.
If you know the password policy, so much the better, as you can trim out words which don't meet the criteria to reduce the size of the list.
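Trimming to a policy can be as simple as a couple of grep filters. The policy below (at least 8 characters, including at least one digit) and the sample words are assumptions for illustration only.

```shell
work=$(mktemp -d)

# Made-up sample of spidered words.
printf '%s\n' cat gardening Arsenal2010 football1 Roses99 > "$work/words.txt"

# Assumed policy: at least 8 characters and at least one digit.
# Keep only the words that could be valid passwords under that policy.
grep -E '^.{8,}$' "$work/words.txt" | grep '[0-9]' > "$work/policy.txt"

cat "$work/policy.txt"
```

Here only Arsenal2010 and football1 survive: the others are either too short or contain no digit.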
Using the password dictionary
A couple of quick examples in addition to Hydra (shown above):
With aircrack-ng to crack a WPA pre-shared key from an authentication handshake in a capture dump (capture.cap here is a placeholder for your own capture file):
aircrack-ng -e mywifi -w wordlist.txt capture.cap
With John the Ripper (JtR) to crack previously captured hashes:
./john --wordlist=wordlist.txt hashes.txt
Summary
Any or all of the above (and more) could be combined to create either exhaustive or finely tuned dictionaries for use with various online and offline password attack tools. Dictionaries can be used in combination with brute force attacks, and they make password attacks much more efficient whenever the password is found in the dictionary.
Mitigations for these attacks include:
- Training users to choose better passwords, especially ones that are not likely to appear in a dictionary
- Longer passwords (passphrases rather than passwords)
- Discouraging password sequences; password1, password2, password3 etc
- Implementing and enforcing stronger password policies
- Implementing defenses such as password lockouts and expiring passwords
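On the "choose better passwords" point, a quick way to show someone the risk is to check whether their candidate password already appears in a dictionary. A minimal sketch follows; the helper name in_dictionary and the three-word dictionary are made up for illustration.

```shell
work=$(mktemp -d)

# Tiny stand-in for a real password dictionary.
printf '%s\n' password letmein dragon > "$work/dict.txt"

# Succeed if the candidate matches a whole line of the dictionary,
# ignoring case. -F treats the candidate as a literal string, not a regex,
# and -e protects candidates that begin with a dash.
in_dictionary() {
  grep -Fxqi -e "$1" "$2"
}

if in_dictionary "LetMeIn" "$work/dict.txt"; then
  echo "Found in dictionary: choose something else"
fi
```

A miss only means the password is not in this particular list, so treat it as a weak signal rather than proof that the password is good.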