Thursday, October 29, 2015

Passwords, Password Cracking, and Pass Phrases

Yesterday I was contacted by a journalist who had questions about passwords.  I tried to convey some concepts to her, but when her response was "Wow.  You must really like math!"  I knew I had failed to communicate.  The story may have accomplished some purpose, but not one that would please a technical audience.  Below, I try again.

The story was partly prompted by a new password policy at UAB, where I work.  The basic policy is that you should have a 15-character password, but the quid pro quo for that is that you will only have to change your password once per year.

How Often to Change Passwords

We'll talk about the 15-characters below, but quickly about the one year.   The original "wisdom" about changing your passwords every thirty days was based on the fact that the average computer hacker using an average computer would need about thirty days to crack a password.  By changing your company's passwords every thirty days, if a hacker had managed to grab your /etc/passwd file or to dump all of your Windows hashes, by the time they had cracked the passwords, they would all be obsolete.  Now many Windows passwords can be cracked in seconds and most in less than a day.

There are still times to change the passwords more frequently.  Specifically:
  • any time you feel that someone may have observed you enter your password 
  • any time you have been exposed to malware or phishing
  • any time you have a change in administrative/trusted computing personnel (people who may know 'shared passwords' or passwords to routers/switches/servers)
  • whenever you are changing hardware or lose control of your devices (lost/stolen/sold computer/laptop/phone)
 Other than those times, there is really no reason to change your passwords, but an annual refresh still seems reasonable. 

Classes of Password Problems

Password re-use

One of the biggest problems that we face today with passwords is that people use the same passwords everywhere! Some studies have suggested that as many as 55% of adults use a single password on all websites! (See, for example, this 2013 UK study, or this June 2015 study by Harris Interactive, showing that 59% of Americans re-use passwords because it is too hard to remember them!)
Why is password re-use such a big deal? Because of the common problem of even the largest websites getting hacked and losing passwords!

  • 000Webhost - Just this week a major provider of free webhosting services had 13 million userids and passwords stolen (See story in Forbes or from Troy Hunt).
  • Ashley Madison - 11 million passwords have been cracked! (See the CNN story, Ashley Madison passwords cracked.) The most popular passwords included 123456, password, 12345, 12345678, and qwerty. Other common passwords were "helpme", "midnight", and "yamaha".
  • Adobe - in 2013 150 million Adobe software users (that is YOU if you have ever downloaded Adobe's PDF Reader or Flash Player) had their userids, password hashes, and password "hints" leaked. Crackers soon made short work of millions of those passwords by matching hashes of leaked passwords and combining multiple hints to determine the underlying password.
  • LinkedIn - in 2012, hackers revealed that they had stolen 6.5 million userids and passwords from LinkedIn!

It is now generally accepted that every time one of these "major password dumps" hits the Internet, criminals use automated programs to test these userid and password combinations at all of the other bank, credit card, and merchant shops where you may have used the same userid and password on another account.  Many people make the error of treating their Email password as an "unimportant" account, failing to recognize that if I have your email password I now know where you bank (if you receive electronic statements), who you communicate with (and with your password, I *AM* you), and when you will be traveling!

Overly simple passwords
Many people who think they are being clever actually choose common passwords used by other people who thought they were being clever. A study in 2008 listed the 500 most common passwords at that time, and many of them continue to be widely used, including "clever" passwords such as "ncc1701" (the number of the Starship Enterprise), "bond007", and "qwertyui".

One of my first exposures to the password problem came from the notorious "Morris Worm" which crashed the entire Internet back in 1988 by using a simple password guessing list to break in to servers on the Internet. After each server was compromised, it would then try to break in to every other server it could find, starting by testing the 432 hard-coded passwords against every account it could find, and moving on to more complex cracking techniques. Robert Morris the hacker was the son of Robert Morris the Unix pioneer at Bell Labs. The senior Morris had published a paper in 1979 called Password Security: A Case Study. After his death, a slashdotter revealed that he had discovered the senior Morris capturing other Bell Labs employees' passwords -- which may actually have been the source of the password list the younger Morris ended up using in his worm!

When I was a young Systems Programmer working at Samford (in 1989) I used the Morris Password list to require users at Samford to change their password if they were using any of those words. We added a few other common passwords to the list that we found our local users liked, including: bulldogs, bulldog!, ROLLTIDE, samford, and aubie1.


Password Cracking


 Let's talk about cracking alphabets:

If you have a one character password, and you restrict your password to only using the 26 lowercase alphabetic characters, guessing your password will take 26 guesses. abcdefghijklmnopqrstuvwxyz. Done! We've guessed your password!

If you have a TWO character password, how many guesses will it take? 26 SQUARED or 26^2 = 676 guesses, from aa, ab, ac to zx, zy, zz.

By raising the LENGTH of the password, we change the exponent. A 3 character password is 26^3, 4 characters = 26^4, 5 characters = 26^5, etc.

By raising the SIZE of the alphabet, we change the BASE.
Lowercase = 26
Uppercase = 26
Numeric = 10
Special characters = 33
`~!@#$%^&*() -_=+[{]}\|;:'",<.>/?
(including the "space")

If we combine all of these, 26 + 26 + 10 + 33 = 95, we have a strong "alphabet" that resists crackers who have only been guessing "all lowercase" or "all lowercase plus numbers".
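The base-and-exponent arithmetic above can be sketched in a few lines of Python. This is only an illustration; the names ALPHABETS and keyspace are made up for this example.

```python
# Sizes of the character classes listed above.
ALPHABETS = {
    "lower": 26,
    "upper": 26,
    "numeric": 10,
    "special": 33,  # includes the space character
}


def keyspace(length, *classes):
    """Number of possible passwords of exactly `length` characters
    drawn from the combined character classes (base ** exponent)."""
    base = sum(ALPHABETS[name] for name in classes)
    return base ** length


print(keyspace(1, "lower"))  # 26
print(keyspace(2, "lower"))  # 676
print(keyspace(2, "lower", "upper", "numeric", "special"))  # 95 ** 2 = 9025
```

Adding a character class raises the base; adding a character raises the exponent, which grows the total much faster.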

All the way back in the 1979 paper, Robert Morris warned about the dangers of password cracking and how simple passwords could be easily guessed by computers. In 1979, he calculated the time to crack various passwords, based on a combination of the length of the password and the size of the alphabet.

Now let's look at 1979 cracking times from the paper by Mr. Morris Senior:
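 n       | 26 lower | 36 lower + numbers | 62 alpha + numbers | 95 printable chars | all 128 ASCII chars
---------+----------+--------------------+--------------------+--------------------+---------------------
 1 char  | 30 msec  | 40 msec            | 80 msec            | 120 msec           | 160 msec
 2 chars | 800 msec | 2 sec              | 5 sec              | 11 sec             | 20 sec
 3 chars | 22 sec   | 58 sec             | 5 min              | 17 min             | 44 min
 4 chars | 10 min   | 35 min             | 5 hrs              | 28 hrs             | 93 hrs
 5 chars | 4 hrs    | 21 hrs             | 318 hrs            | 112 days           | 500 days
 6 chars | 107 hrs  | 760 hrs            | 2.2 years          | 29 yrs             | 174 yrs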
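
The figures in the table are consistent with a machine that needs about 1.25 milliseconds per guess (26 guesses in roughly 30 msec). That per-guess cost is an assumption inferred from the numbers above, not a figure quoted in this post. Under that assumption, a rough sketch that reproduces two of the six-character entries:

```python
SECONDS_PER_GUESS = 0.00125  # assumed: inferred from "26 guesses in about 30 msec"


def exhaust_seconds(alphabet_size, length):
    """Seconds to try every password of exactly `length` characters."""
    return (alphabet_size ** length) * SECONDS_PER_GUESS


hours = exhaust_seconds(26, 6) / 3600
years = exhaust_seconds(95, 6) / (365 * 24 * 3600)
print(f"6 lowercase chars: about {hours:.0f} hours")
print(f"6 printable chars: about {years:.0f} years")
```

The results line up with the 107 hrs and 29 yrs entries in the bottom row.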

In 1979, a six character password with upper+lower+numeric+symbol would protect us from cracking for 29 years!  But today's computers are FAR faster than that! How does that compare to today's password cracking speeds?

To guess all 7-character lowercase passwords would be 26^7 guesses, or 8,031,810,176 (8 billion guesses!)

A secret about Windows passwords comes into play here. In environments that still use Windows XP, Windows defaults to a password storage mechanism called "LanMan Compatibility." That means that if your password is LONGER than 7 characters, Windows actually splits the password into two parts and hashes the first 7 characters as one hash, and the remaining 1-7 characters as a second hash. So, instead of a 14-character Windows XP password having a complexity:

26^14 = 64,509,974,703,297,150,976 (64 QUINTILLION guesses!)

It actually is stored as:

26^7 + 26^7 = 8 billion + 8 billion = 16 billion
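
To make the difference concrete, here is a small sketch comparing the two search spaces for a 14-character, lowercase-only password. The function names are made up for illustration; the split into independent 7-character pieces is as described above.

```python
def whole_password_guesses(alphabet_size, length):
    """Worst-case guesses when the password is hashed as one unit."""
    return alphabet_size ** length


def split_password_guesses(alphabet_size, length, chunk=7):
    """Worst-case guesses when each `chunk`-character piece is hashed
    separately, so every piece can be attacked on its own."""
    full_chunks, remainder = divmod(length, chunk)
    total = full_chunks * alphabet_size ** chunk
    if remainder:
        total += alphabet_size ** remainder
    return total


print(f"{whole_password_guesses(26, 14):,}")  # 64,509,974,703,297,150,976
print(f"{split_password_guesses(26, 14):,}")  # 16,063,620,352
```

Splitting turns a multiplication of the two halves' search spaces into an addition, which is why the total collapses from quintillions to billions.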

Of course no one in their right mind is still running Windows XP! (right?)

Still, 16 billion guesses sounds like a lot, right? Unfortunately, not anymore.  How long would it take to crack a password that required 16 billion guesses?  If you have the right computer, LESS THAN ONE SECOND.

In December 2012, Ars Technica ran a story called "25 GPU Cluster Cracks Every Standard Windows Password in 6 hours". The story is about a 5-server setup built with 25 Graphics Processing Unit cards (the video cards that the gamers love) that can guess 350 BILLION PASSWORDS PER SECOND!

So what do we do?

Even in Windows XP though, if we went to FIFTEEN characters, LanMan compatibility was broken, and we no longer divided the password, meaning that we now have:

26^15 if we use only lower case characters, or 95^15 if we use UPPER+lower+numeric+special characters!

95^15 = 463,291,230,159,753,366,058,349,609,375 (463 OCTILLION guesses!!!!)

463 OCTILLION divided by 350 Billion Passwords per second means . . .

1,323,689,229,027,866,760 seconds or
22,061,487,150,464,446 minutes or
367,691,452,507,740 hours or
15,320,477,187,822 days or
41,973,910,103 years
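
The division chain above is easy to reproduce. A minimal sketch, using the 350 billion guesses per second figure from the Ars Technica story and the same 365-day year; the helper name is made up for this example.

```python
GUESSES_PER_SECOND = 350_000_000_000  # the 25-GPU cluster's reported rate


def worst_case_crack_time(keyspace, rate=GUESSES_PER_SECOND):
    """Return the time to try every password in `keyspace`, in several units."""
    seconds = keyspace / rate
    return {
        "seconds": seconds,
        "minutes": seconds / 60,
        "hours": seconds / 3600,
        "days": seconds / 86400,
        "years": seconds / (86400 * 365),
    }


lanman = worst_case_crack_time(2 * 26 ** 7)  # two 7-character lowercase halves
strong = worst_case_crack_time(95 ** 15)     # 15 characters, full 95-character alphabet

print(f"LanMan halves: {lanman['seconds']:.3f} seconds")
print(f"95^15: about {strong['years'] / 1e9:.1f} billion years")
```

The same function covers both extremes: the split LanMan hashes fall in a fraction of a second, while the 15-character case lands in the tens of billions of years.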


At UAB, we've decided that anyone who can wait 41 BILLION YEARS to crack your password is welcome to have all your data.


Of course we have to remember Moore's Law.

Moore's Law suggests that computers double in speed every 18 months. While that doesn't sound like much, that means in 18 months it would only take 20.5 billion years. 18 months after that it would take 10.25 billion years. So in thirty-six 18-month periods, or 54 years, we would be able to crack that password in less than a year. That doesn't even take into consideration the fact that we will be able to harness additional computers together to use larger networks of computers to do the guessing.
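
The halving argument can be checked with a short loop. This is a sketch under the same simplifying assumption used above, that cracking speed doubles exactly once every 18 months; the function name is made up for illustration.

```python
def doublings_until_under(start_years, target_years=1.0, months_per_doubling=18):
    """Count how many speed doublings it takes before a job that takes
    `start_years` today finishes in under `target_years`."""
    periods = 0
    remaining = start_years
    while remaining >= target_years:
        remaining /= 2
        periods += 1
    return periods, periods * months_per_doubling / 12


periods, calendar_years = doublings_until_under(41_000_000_000)
print(periods, calendar_years)  # 36 54.0
```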

Pass Phrases = 15 characters? How will I remember!?!?!

Remember that we not only need a LONG password, with a COMPLEX character set, we also need to make sure we don't re-use passwords across multiple sites!

There are two theories on how to do that.

One is to use password management software -- something like "LastPass" or "LogMeOnce". I'm not going to address those packages here, other than to link to one review at PC Magazine, The Best Password Managers for 2015, and to caution that MANY of the mobile phone apps that claim to be password managers are RIDICULOUSLY insecure! (See the article: ElcomSoft analyzes 17 Smartphones’ Secure Password Managers, Finds No Security).


The other theory, the one I like and use, is to use Pass phrases.


A pass phrase is a combination of words that is memorable TO YOU but that would not be something anyone else would know or use. Remember that the main trick criminals use to try to get your password is guessing commonly used passwords from a password list or dictionary BEFORE they start "brute-forcing" or guessing every combination of letters, numbers, and symbols. Password crackers come with dictionary files such as "10,000 most common passwords" and "100,000 most common passwords" and "English language names and places" and "Oxford English Dictionary Word List". We need to make sure OUR pass phrase is not on any of those lists.

Think of a memorable event. Or something you are unlikely to forget. Or a favorite book or movie. I'll give you an example of each of those.

Memorable Events
When my son got married we had an interesting situation. He hates cake. Always has. And yet WEDDING CAKES and GROOM'S CAKES are a major part of a wedding. My son did cookies instead of a groom's cake. So a password I used at about that time was:

theGROOMprefer$c00kies -- 22 characters. upper, lower, symbols, and numbers.

A common mistake people make with the numbers and symbols is to just put a "1!" at the end of their chosen word. Hackers know this, and cracking programs automatically check for that! I use common symbol and number substitutions, such as replacing the letter "o" with the number zero (0), or replacing an "s" with a "$". E = 3, S = 5, A = @ are also some common substitutions that are still easy to remember.
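
To see why a trailing "1!" adds so little, here is a rough sketch of the kind of check a cracking tool (or a password policy) can make: lowercase the password, peel off trailing digits and symbols one at a time, and look each result up in a word list. The tiny COMMON_WORDS set is a stand-in built from passwords mentioned earlier in this post; real lists are far larger, and real tools apply many more rules than this.

```python
import string

# Stand-in word list (hypothetical); real cracking dictionaries hold millions of entries.
COMMON_WORDS = {
    "123456", "password", "qwerty", "helpme", "midnight", "yamaha",
    "ncc1701", "bond007", "qwertyui", "bulldogs", "samford",
}

_SUFFIX_CHARS = string.digits + string.punctuation


def is_easily_guessed(password, words=COMMON_WORDS):
    """True if the password is a listed word, optionally followed by
    digits/symbols such as "1!" (case is ignored)."""
    base = password.lower()
    if base in words:
        return True
    # Peel trailing digits/symbols off one at a time and re-check.
    while base and base[-1] in _SUFFIX_CHARS:
        base = base[:-1]
        if base in words:
            return True
    return False


print(is_easily_guessed("Password1!"))              # True
print(is_easily_guessed("theGROOMprefer$c00kies"))  # False
```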

Unlikely to Forget
As many Christians do, I like to memorize scripture. I will often choose a password that relates to the site I'm visiting and invokes a Bible verse. For example, "Ancestry.com" is a family tree website. One of my favorite Psalms, Psalms 1, says that people who meditate on God's word are "like a tree planted by rivers of water" so a good pass phrase for Ancestry for me might be:

th@tTR33fromPsalms1 -- 19 characters (That tree from Psalms 1). Upper, lower, symbols, number.

I also use passwords to remind myself or motivate myself. When my brother was adopting two sons from the Ukraine I had a password:

Pr@ying4Dima&Vladik!

One of my Computer Forensics graduate students, Ran Sun, shared a presentation on passwords that included a link to this great article How a Password Changed My Life, where the author uses his passwords to remind himself to forgive someone, to encourage himself to stop smoking, and many other 'self-improvement' motivators.

Movies, Books, and Other tricks
One of my earliest password tricks was using a favorite book or movie title as a password. I remember telling one class about pass phrases and saying that one of my early passwords was "Robert Heinlein says the Moon is a Harsh Mistress". A bright student said "Oh! I see, take the first letter of every word to make your password -- RHstmiahm!" No. My password was actually: "RobertHeinleinsaysthemoonisaharshmistress". That is 41 upper and lower case letters, or 52^41 possible combinations. I don't care that it didn't use numbers or symbols.

Maybe your password is something related to an action by your favorite character: "Darth$@y$LukeIAMyourfather!"

or a combination of the author and his title "Hemingway&the0ldman&thesea"

or the year you first saw the movie: "1977.isawStarWarswithChad"

There are tons of ways to make a memorable pass phrase that will be memorable ONLY TO YOU!
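
For anyone who wants to check their own pass phrase against the length and character-set goals above, here is a minimal sketch. It reports the length, the character classes used, and the size of the brute-force search space in bits. The names are made up for this example, and the estimate only covers brute force; it says nothing about how guessable the words themselves are.

```python
import math
import string

CLASS_SIZES = {"lower": 26, "upper": 26, "numeric": 10, "special": 33}


def describe(passphrase):
    """Report length, character classes used, and the brute-force
    keyspace (in bits) implied by those classes."""
    classes = set()
    for ch in passphrase:
        if ch in string.ascii_lowercase:
            classes.add("lower")
        elif ch in string.ascii_uppercase:
            classes.add("upper")
        elif ch in string.digits:
            classes.add("numeric")
        else:
            classes.add("special")  # space and punctuation
    alphabet = sum(CLASS_SIZES[c] for c in classes)
    bits = len(passphrase) * math.log2(alphabet) if alphabet else 0.0
    return {
        "length": len(passphrase),
        "classes": sorted(classes),
        "alphabet": alphabet,
        "bits": round(bits, 1),
    }


print(describe("theGROOMprefer$c00kies"))
print(describe("th@tTR33fromPsalms1"))
```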


The future of Password anti-cracking

The next technological trick to countering password cracking is to store the password hashes in a way that is more computationally complex. If an array of GPUs can guess 350 billion passwords per second, what is necessary is to make the process of guessing a SINGLE password require more computation time. Because a "real" user is only going to enter the password once, if it were to take even a full second for the password to be checked, that would be acceptable in most cases -- and yet it would make it much harder to "brute force" the account. bcrypt, an algorithm by Niels Provos and David Mazieres, is one such algorithm. Depending on the settings, it can reduce the number of password guesses per second down to under 20 even with a very fast computer! 20 vs. 350,000,000,000 will give the attacker a distinct disadvantage!
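
As a rough illustration of the idea, the sketch below uses PBKDF2 from Python's standard library as a stand-in for bcrypt (bcrypt itself requires a third-party package). The principle is the same: a tunable work factor makes every single guess cost more. The iteration count and function names are assumptions chosen for this example, not recommendations.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor; raise it to make each guess slower


def hash_password(password, salt=None):
    """Return (salt, digest) using a deliberately slow key-derivation function."""
    if salt is None:
        salt = os.urandom(16)  # random per-user salt
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Recompute the slow hash and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)


salt, stored = hash_password("theGROOMprefer$c00kies")
print(verify_password("theGROOMprefer$c00kies", salt, stored))  # True
print(verify_password("password1!", salt, stored))              # False
```

A legitimate login pays the cost once; an attacker pays it for every guess.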

Last year at the Password 2014 Conference in Norway, Thorsten Kranz presented a paper called On Password Guessing with GPUs and FPGAs (a video of his presentation is available). This annual academic conference on passwords includes the "Password Hashing Competition" that discusses why bcrypt and scrypt are the best ways to store passwords.  Uber-geeks will enjoy watching that!



Saturday, October 03, 2015

Hillary's Email Server and the New York City malware

Wednesday night (September 30th) I had a strange Tweet in my notifications from a journalist at ForeignPolicy:

https://twitter.com/EliasGroll/status/649385038694510592
Elias explained that he wanted some quotes in response to a "hyperbolic AP story" by Bradley Klapper, Jack Gillum and Stephen Braun that had posted on the AP wire. (The same story has been posted in the Washington Post, US News & World Report and other top news sources.)
The story begins with the opening paragraph:

Russia-linked hackers tried at least five times in August 2011 to trick Hillary Rodham Clinton into infecting her computer systems while she was secretary of state, according to newly released emails from the State Department.
The New York Times version of the story is far more sensational (and far more incorrect) in their telling of the story. Given the victim of all this attention, you would have thought these stories were from Fox News! Here's NYT making up scary security-sounding stuff:

Still, the evidence that Mrs. Clinton's personal account had been on the receiving end of a "spear phishing" attempt, revealed in a batch of her emails released by the State Department on Wednesday, raises the same question the F.B.I. is trying to answer as it combs through the forensic evidence from the server that was once in Mrs. Clinton's basement.
In fact, a disclaimer on the bottom of the NYT news story now reads:
A headline on Friday with an article about Hillary Rodham Clinton's email server overstated what is known about an investigation into the server's security. As the article correctly noted, Mrs. Clinton received spam email that was intended to place malware on her computer network; the investigation has not yet determined that the malware effort was successful.

What Elias did that apparently the AP reporters and the NYT reporters did NOT do was a simple Google search. If they had, they would have seen the story on this blog, dated August 17, 2011, with the headline New York City "Uniform Traffic Ticket" tops spammed malware. The image that accompanied that story, shown below, reveals why the email was turned over to the government:



 As Politico suggests in their story Most Clinton spam messages likely deleted, the workers tasked with finding "work-related" emails to turn over probably started with a few simple rules like "turn over all the emails that are from .gov addresses" -- which would include this spam, which claimed to be from @nyc.gov.

The point of that CyberCrime & Doing Time blog post was to share that this was one of the highest volume spam campaigns we had seen that summer!  Just in the UAB Spam Data Mine, we had received 11,000 copies of this email!  Spear-Phishing, which the New York Times wrongly suggests happened here, is when an email message is personalized to target a particular high-wealth or high-value target.  If Hillary Clinton was targeted, so were about 11,000 mostly fictitious people whose spam goes into the UAB Spam Data Mine, as well as a few hundred people who chose to share their emails with us!

What is ChepVil?

It isn't a mystery at all; we documented it in that blog post as well.  It was part of a "pay-per-install" malware ring that was very popular at that time.  When my lab at UAB reported the malware to VirusTotal, it was detected by 18 of 43 anti-virus programs, with both Microsoft and Sophos detecting the malware and calling it "Chepvil" (Microsoft called it "TrojanDownloader: Win32/Chepvil.N" while Sophos called it "Mal/ChepVil-A" - we were using the name "FraudLoad" for this malware at that time).  You can see that August 17, 2011 VirusTotal report as it looked the day we reported it.  (And you can see in the comment there, also from that day, that we explained the source of the malware and gave a link back to our blog post.)

ChepVil is a type of malware that was heavily based on the BredoLab malware, although by August 2011, the original BredoLab author was already in jail.  Armenian programmer Georgy Avanesov was arrested in October of 2010 when the Dutch High Tech Crime Team police seized 143 servers located at LeaseWeb in the Netherlands that he used to control his world-wide spamming operations.  At the time of his arrest, BredoLab was infecting 3 million computers per month and being used to send approximately 3.6 billion spam messages per day.  Despite this massive seizure, because his source code was already known by other malware criminals, the attacks quickly resumed following his arrest.

The August 17, 2011 version of this malware made a connection back to the Russian domain name sfkdhjnsfjg.ru (associated with BredoLab, according to Sophos; see, for example, this Sophos report from August 4, 2011).

We reported malware communicating to that server to the Microsoft Malware Protection Center on August 11, 2011 -- pointing out that it was hosted on the IP address 195.189.226.103, one of several IP addresses on that same netblock that took turns hosting sfkdhjnsfjg.ru during August 2011, all  hosted in Mykolayiv, Ukraine.   The first time we saw this family of malware communicating with that server was in a big campaign imitating the FBI on May 5, 2011.  The same malware family pretended to be the United Parcel Service on June 9, 2011, sending my lab at UAB more than 54,000 copies of the malware.  We produced a map of the computers that sent us both the May 5 FBI spam and the June 9 UPS spam and shared it with law enforcement at that time:


The point is - it wasn't "targeted" and it wasn't "spear-phishing" and it isn't a "mystery" about how it  came to be sent to Mrs. Clinton.   This wasn't a clever Russian master mind sitting in his evil lair dreaming of taking over the State Department.  One of the millions of spam bots that were part of this network (or actually probably FIVE of them) asked the Command & Control server "Who shall I spam next?" and happened to draw Mrs. Clinton's email address.

But What COULD the Malware Do? 

In August of 2011, the primary thing that Chepvil did was deliver "Fake Anti-Virus" software.  That's it.  The malware would connect to the server and ask "What additional malware would you like to infect me with?"  The server would then see who was currently paying the highest commission to have their malware installed, and whether the daily quota for installing that additional malware had already been fulfilled, and install whatever it was told to install.

In August of 2011 - the only thing we saw Chepvil install was Fake Anti-Virus, and a near cousin "Fake System Alert".  So, *IF* Mrs. Clinton had actually been infected by this malware, it would have caused a pop-up animation to play, claiming she was infected with dozens of nasty viruses, and that she needed to pay the criminals $59 to get rid of the malware.  None of that is true -- the malware is actually just "ScareWare" -- intended to irritate you with pop-up warnings about being infected until you finally give up and pay the "license fee" or have the malware professionally removed from your PC.

The Daily Malware Report

Olivia Foust Vining (now at PhishLabs, Hi Olivia!) was the student malware analyst in my lab who brought this malware to my attention that day in her "Daily Malware Report" (a research project sponsored by UPS!)  By the end of her shift, we had actually seen 45,377 copies of the malware!  Her report gave 15-minute breakdowns of how many copies we received during the morning hours.


count |        mbox         
-------+---------------------
   326 | 2011-08-17 03:30:00
   264 | 2011-08-17 03:45:00
  1880 | 2011-08-17 04:00:00
   756 | 2011-08-17 04:15:00
  1930 | 2011-08-17 04:30:00
  2608 | 2011-08-17 04:45:00
  5982 | 2011-08-17 05:00:00
  4364 | 2011-08-17 05:15:00
  3544 | 2011-08-17 05:30:00
  2418 | 2011-08-17 05:45:00
  2262 | 2011-08-17 06:00:00
   999 | 2011-08-17 06:15:00
   870 | 2011-08-17 06:30:00
   972 | 2011-08-17 06:45:00
   643 | 2011-08-17 07:00:00
   277 | 2011-08-17 07:15:00
   354 | 2011-08-17 07:30:00
   200 | 2011-08-17 07:45:00
  4571 | 2011-08-17 08:00:00
  3974 | 2011-08-17 08:15:00
  3109 | 2011-08-17 08:30:00
  2047 | 2011-08-17 08:45:00
  1617 | 2011-08-17 09:00:00
(23 rows)
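
A breakdown like this comes from rounding each message's arrival time down to the start of its 15-minute window and counting, assuming the timestamp shown marks the start of each window. A minimal Python sketch of that bucketing follows; the sample timestamps are invented for illustration and are not taken from the UAB Spam Data Mine.

```python
from collections import Counter
from datetime import datetime


def quarter_hour(ts):
    """Round a datetime down to the start of its 15-minute window."""
    return ts.replace(minute=ts.minute - ts.minute % 15, second=0, microsecond=0)


def counts_per_quarter_hour(timestamps):
    """Map each 15-minute window to the number of messages received in it."""
    return Counter(quarter_hour(ts) for ts in timestamps)


# Invented sample arrival times (hypothetical data).
arrivals = [
    datetime(2011, 8, 17, 3, 31, 5),
    datetime(2011, 8, 17, 3, 44, 59),
    datetime(2011, 8, 17, 3, 45, 0),
]
for window, count in sorted(counts_per_quarter_hour(arrivals).items()):
    print(f"{count:6d} | {window}")
```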

For comparison, here is the count of the other high malware volumes for that day:

count |             md5_hex              
-------+----------------------------------
 45377 | 1c2b06a9fbbea641ae09529e52f29b96 <= the "Uniform traffic ticket" malware
  3484 | e7b48c4421a68740dfd321dade6fd5e6 <= "End of July Statement" malware
  2627 | c1f67a7542359397544bd0af0b546166 <= "Your credit card has been blocked" malware
  1021 | d22eadfda41fcbeb692c600c97d10ff5 <= "Money Transfer Information" malware

But how did Spammers learn Mrs. Clinton's email address?

There are four primary ways that spammers gather email addresses.

The first is specialized software programs that scour the web looking for email addresses on websites.  One of the richest sources of these is actually "archives" of large email lists.  When email lists provide web access to their history, many do so publicly, allowing these scraping tools to learn the email addresses of every person mentioned on the mailing list.  Spammers also JOIN tons of mailing lists to be able to gather the email addresses posted there.

Data dumps are another rich source of email addresses.  Do you recall, for example, the Adobe breach in 2013 when 38 million people who had ever used an email address to register for the free download of Adobe reader or any other Adobe product had their email addresses publicly revealed?  Such events are great days for the spammer community!

Next, we have malware on other people's computers. Many malware programs have as one "module" code that will scan a computer for email addresses.  If even ONE of Hillary's regular correspondents became infected with malware, her email address would have been discovered that way.

Lastly, we have SMTP harvesters.  These programs scan for mail servers, enumerate the domains served by each server, and then begin asking "do you deliver email for al@yourdomain.com? amos@? ann@? ... zach@?" The more aggressive of these programs will try every single letter and number combination, until they have a complete list of the "known" email addresses for the given domain.

So . . . it isn't surprising at all that even "secret" email addresses receive spam.

Thanks, Foreign Policy, for getting it right! 

I was pleasantly surprised by how well Elias Groll handled the details on this story.  He quickly identified the scare-mongering going on over at the AP, and reached out for the facts.  Obviously what I shared above is far too much technical detail for the readers of FP, but I do want to commend the level-headed reporting in their story:

Clinton's Private Emails Show Aides Worried About the Security of Her Correspondence