Friday 2 December 2016

The secret life of Passwords

I am writing on a math-related topic yet again. We will talk about passwords.
Promise me that you will not inform the police, CBI, RAW, FBI, CIA, KGB, Mossad etc. about this and I will share a secret with you! Well, here is the thing. I know passwords that other people use, at least 1 lakh of them, and just for kicks I collected some 7 lakh+ credit/debit card PINs as well. I am not lying. And I am not in touch with Ajit Doval or Tiger or any other ex-spy. How on earth can I make such lofty claims then?
Every once in a while hackers steal passwords and post them publicly. There was the news about the LinkedIn data breach, and there was the hullabaloo about the Dropbox, Yahoo Mail and Adobe data breaches. People have analysed such publicly available data. These analyses have time and again proved right the line popularly attributed to Einstein: "Two things are infinite: the universe and human stupidity; and I'm not sure about the universe."
If your password is "123456" or "password" or "12345678" or "welcome" or "abc123", congratulations to you! You have made Einstein proud. These are the top candidates in the pile of the most popular ones and, as you might have guessed, these passwords are very easy to guess. Those geniuses with "iloveyou", "monkey", "123123", "princess" and "qwerty" can stop smiling now. There are probably more than 2 lakh people who use one of these! The analysis of credit card PINs was even more surprising. Of the 35 lakh PINs analysed, nearly 11% were "1234"! "1111" is the 2nd most popular one, with "0000" taking the bronze medal. If you know these 3 numbers you know the PINs for at least 7 lakh credit cards! Many among these 11% were probably thinking that 1234 is one of those incredibly difficult numbers to predict. Human beings are so very predictable when it comes to numbers and maths.
All that leads us to other interesting questions. How exactly are these passwords stored, and how do hackers crack them? You type your password every time you log in to Facebook, and it lets you in if the password is correct. So, Facebook probably knows what your password is; how else can it decide if your password is correct, right? Wrong! Facebook or Gmail or any sophisticated site does not store your password. Otherwise those software engineers sitting inside Facebook could randomly pick profiles, read their passwords and play pranks, and those working in the banks could go shopping more often. Worse still, imagine a data breach. Hackers could happily walk away with all the passwords in one shot. If the password is not stored, how the heck does Facebook know if the password that you entered is correct? To answer that, let us dig a little deeper.
What if they create a file called facebookrocks.txt and have entries like this in it: user: Sharath, password: genius. There will be 1 billion entries like this in it; any self-respecting small-time thief can steal this file and he will have the entire Facebook under his control. It should be obvious that this is not how things are done. What if we get clever and do a little more of the spy stuff? What if we say a = 1, b = 2, c = 3, d = 4 and so on and take the numbers for the letters in my password? If the password is "cab" then the numbers are 3, 1 and 2. We can do some high school level math shit here. We can multiply them together and add 4, and we get 3*1*2 + 4 = 10. Now in the file we can store numbers like this instead of the original passwords. This type of thing is called encryption. This way, even if there is a breach, the thief will only see numbers. The problem with this approach is that we are underestimating the thieves. All the thieves need is your recipe; they just have to do the reverse. Hackers are good at maths and they will figure out the passwords by looking at the numbers that you produced. You added 4, they can subtract 4 and get to 6, and from 6 they can get to 3, 1 and 2 and, after trying a handful of arrangements, arrive at your password. So, what you need is something that cannot be reversed.
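Here is a rough Python sketch of that a = 1, b = 2 scheme, just to see how flimsy it is. The function names (`toy_encode`, `candidates`) are my own inventions for illustration, and I am assuming the thief knows the recipe and the length of the password.

```python
import itertools
import string
from math import prod

def letter_numbers(word):
    # a = 1, b = 2, ..., z = 26
    return [ord(ch) - ord("a") + 1 for ch in word.lower()]

def toy_encode(word):
    # Multiply the letter numbers together and add 4.
    return prod(letter_numbers(word)) + 4

def candidates(stored_number, length):
    # The thief's side: try every word of the given length and keep
    # the ones that produce the stored number.
    matches = []
    for combo in itertools.product(string.ascii_lowercase, repeat=length):
        word = "".join(combo)
        if toy_encode(word) == stored_number:
            matches.append(word)
    return matches

stored = toy_encode("cab")      # 3*1*2 + 4
print(stored)
print(candidates(stored, 3))    # a short list that includes "cab"
```

The thief does not even need clever maths here. A lazy loop over all three-letter words narrows things down to a handful of guesses in no time.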
Enter hashing. Imagine hashing entering like Kichcha Sudeepa in Kotigobba 2. Hashing is a bit like making apple milkshake. You take apple, sugar and milk and prepare the milkshake, but you cannot produce the apple, sugar and milk back from the milkshake. If your password is the apple, hashing converts it into apple milkshake, something that cannot be reversed. There are mathematical ways of doing this.
Let us take the password "cab". What we can do is take the numbers corresponding to the letters (a = 1, b = 2, g = 7, e = 5 etc.); say we get the number 312. We can take the cube of it, multiply that by the 2nd prime number after 312 (yes, mathematicians love prime numbers!), subtract 567 from it, divide that by 681, convert the resulting number to binary etc. In binary we only have 1s and 0s. The result might look like this: 110010010010001010. Now just for fun we can flip every second bit of this number (change 0 to 1 and 1 to 0), and we can show off our mathematical prowess further by converting this number to hexadecimal (in hexadecimal we count till 9 and then, instead of 10, we start with the letters A, B, C and so on). After all this circus our password might look like this: 2ab96390c7dbe3439de74d0c9b0b1767.
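For the curious, the circus above can be sketched in Python. This is a toy under my own assumptions: the letters of "cab" are glued together as 3, 1, 2 to give 312, the division is a whole-number division, and the `abs()` is my addition so that tiny inputs do not go negative. It will not produce the exact hex string shown above, and it is nowhere near a real hash function; it only shows what "a pile of arithmetic steps" looks like.

```python
def is_prime(n):
    # Plain trial division; fine for small toy numbers.
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def second_prime_after(n):
    # Walk upwards from n and stop at the second prime we meet.
    found = 0
    candidate = n
    while found < 2:
        candidate += 1
        if is_prime(candidate):
            found += 1
    return candidate

def toy_mangle(word):
    # "cab" -> 3, 1, 2 -> 312 (letters only, a = 1 ... z = 26)
    number = int("".join(str(ord(ch) - ord("a") + 1) for ch in word.lower()))
    value = number ** 3 * second_prime_after(number)   # cube, times a prime
    value = abs(value - 567) // 681                    # subtract, then divide
    bits = bin(value)[2:]                              # to binary
    flipped = "".join(
        ("1" if b == "0" else "0") if i % 2 == 1 else b  # flip every second bit
        for i, b in enumerate(bits)
    )
    return format(int(flipped, 2), "x")                # to hexadecimal

print(toy_mangle("cab"))
print(toy_mangle("cat"))
```

Real hash functions are designed by people who do this for a living and are far harder to undo than my seven-step recipe.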
In short, what we did was to take the password and do the mathematical equivalent of twisting, turning, squishing, crushing and garbling it to such an extent that it started looking really strange. Hashing is a one-way mangling process that is practically impossible to work backwards. If our password "cab" is the apple, 2ab96390c7dbe3439de74d0c9b0b1767 is the apple milkshake. This is the mathematical version of "You can get milkshake from the apple but you cannot get the apple back from the milkshake". Facebook stores this weird-looking number (the milkshake) instead of your actual password. Now every time you try to log in, Facebook will twist, turn, crush and garble the password you typed. This garbled version is compared to the milkshake version that is already stored. The trick is that, given the same password, the process will always spit out the same garbled version. Hence there will be a match.
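The store-the-milkshake-and-compare idea can be sketched with Python's standard `hashlib`. I am using SHA-256 purely as a stand-in; I do not know which hash any particular site uses, and the `stored` table and function names are made up for this example.

```python
import hashlib
import hmac

def make_hash(password):
    # Turn the apple into milkshake: a fixed-length hex string.
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

# Hypothetical user table: only the hash is kept, never the password.
stored = {"sharath": make_hash("cab")}

def check_login(user, attempt):
    expected = stored.get(user)
    if expected is None:
        return False
    # Hash the attempt the same way and compare the two milkshakes.
    return hmac.compare_digest(expected, make_hash(attempt))

print(check_login("sharath", "cab"))   # True
print(check_login("sharath", "cat"))   # False
```

Notice that the word "cab" appears nowhere in `stored`; the site can check your password without ever keeping it.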
Now let us say that our thieves hacked the database and they got this ugly-looking number 2ab96390c7dbe3439de74d0c9b0b1767. It cannot be reversed. There is no way to get our password "cab" back from this weird number. What do they do now? What they can do is keep a ready-made lookup table (its cleverer, space-saving cousin is called a rainbow table). They can simply take the Oxford dictionary, take all the words in it starting from the 1st page and produce the hashes (milkshakes) for these words. Probably the 927th word in the Oxford dictionary is "cab", so their 927th entry will be a hash that looks exactly like the weird number (milkshake) that Facebook stored. This is why you should try not to have meaningful English words as your password. If the Oxford dictionary has 1.7 lakh words, they can get your password within 1.7 lakh attempts. They simply have to create hashes for all the words in the dictionary and compare them with the leaked list. This is called a dictionary attack. And passwords like "123456" and "abc123" and "password1" will be the easiest victims of such an attack. If you can produce a hash for "cab", you can always produce another for passwords like "cab12". They are as predictable as the next Kejriwal tweet for a given issue.
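A dictionary attack is short enough to sketch in a few lines of Python. Everything here is an assumption for illustration: the tiny `wordlist` stands in for the Oxford dictionary, SHA-256 stands in for whatever hash the site used, and the "leaked" hashes are ones I generate myself.

```python
import hashlib

def sha256_hex(text):
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def build_lookup(words):
    # The ready-made table: hash -> word, computed once, reused forever.
    return {sha256_hex(word): word for word in words}

# A toy "dictionary"; a real attacker would load lakhs of words.
wordlist = ["apple", "cab", "monkey", "password", "princess", "qwerty", "123456"]
lookup = build_lookup(wordlist)

# Pretend these three hashes came out of a breach.
leaked = [sha256_hex("cab"), sha256_hex("qwerty"), sha256_hex("MUCD8COCE")]

cracked = [lookup.get(h) for h in leaked]
print(cracked)   # ['cab', 'qwerty', None]
```

The thief never reverses anything. He just makes his own milkshakes and looks for ones that taste the same.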
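To solve this problem, what sites do is add something random to your password before hashing/garbling it. This is called salting. If your password is "cab", they will first make it "cab7dbe3439" and then hash it. This is like adding some random fruit to the apple before you turn it into milkshake. The random thing (the salt) is different for every user and is stored next to the hash, so it is not really a secret; its job is to make ready-made tables useless. There is no word called "cab7dbe3439" in the dictionary, so the thieves' ready-made table will have no entry for it, and they have to redo the whole dictionary separately for every single user instead of cracking everyone in one shot. Hence the ready-made dictionary attack will fail. All sophisticated sites use salted hashing for this reason.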
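Salting can be sketched like this in Python. It is a minimal illustration, assuming SHA-256 as the hash and the salt glued on after the password as in the "cab7dbe3439" example; the function names are my own.

```python
import hashlib
import secrets

def hash_with_salt(password, salt=None):
    # Pick a fresh random salt for each user unless one is supplied.
    if salt is None:
        salt = secrets.token_hex(8)
    digest = hashlib.sha256((password + salt).encode("utf-8")).hexdigest()
    return salt, digest   # both are stored; only the password is thrown away

def verify(password, salt, digest):
    # Re-hash the attempt with the stored salt and compare.
    return hash_with_salt(password, salt)[1] == digest

salt_a, digest_a = hash_with_salt("cab")
salt_b, digest_b = hash_with_salt("cab")

print(digest_a == digest_b)             # same password, different milkshakes
print(verify("cab", salt_a, digest_a))  # True
```

Two people with the same password now end up with different stored hashes, so one ready-made table no longer cracks both.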
Attackers can do one more thing called brute forcing. A brute force attack tries every possible combination of characters. They can try aaa, aab, aac, aa1, aa2, aba, abb and so on. It is so goddamn expensive computationally, but they can still do it. Sites can make their life difficult by making the hashing slow. I just did 7 or 8 things to the password to garble it; sites can do many more crazy things, for instance there can be 252 steps of hashing. This will make the hacker's life very difficult.
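Both halves of this can be sketched in Python: the brute force loop, and a deliberately slow hash. The slow hash below uses PBKDF2 from the standard library, which is my choice of example and not necessarily what any given site uses; the round count and function names are also my assumptions.

```python
import hashlib
import itertools
import string

def fast_hash(text):
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def brute_force(target_hash, max_length, alphabet=string.ascii_lowercase):
    # Try a, b, ..., z, aa, ab, ... up to max_length characters.
    attempts = 0
    for length in range(1, max_length + 1):
        for combo in itertools.product(alphabet, repeat=length):
            attempts += 1
            guess = "".join(combo)
            if fast_hash(guess) == target_hash:
                return guess, attempts
    return None, attempts

def slow_hash(password, salt, rounds=200_000):
    # PBKDF2 repeats the hashing `rounds` times, so every guess costs more.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, rounds).hex()

guess, attempts = brute_force(fast_hash("cab"), max_length=3)
print(guess, attempts)
print(slow_hash("cab", b"7dbe3439"))
```

With a fast hash, a three-letter password falls in a couple of thousand guesses. Make each guess a few lakh times more expensive and the same loop starts to hurt.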
All said, what can you do? If your password is "12345678" or "welcome" or "princess" or anything in the popular top-25 list, change it. There are many ways of creating good passwords. Indians can use a simple trick: use some words or phrases from your mother tongue. Another popular technique is to think of a phrase that is easy to remember and take the first letters of its words. For instance, your phrase can be: My Uncle Can Drink 8 Cups Of Coffee Everyday. This will become "MUCD8COCE", a strong password that is easy for you to remember and very difficult for the programs to guess. If the password is "IrhBBaoRS", it will really tax the hacker's software. (The phrase I used was "I Really Hate Big Boss And Other Reality Shows", by the way!) One more thing you can do is to lie while answering the security questions. For instance, if the question is "Who is your favourite singer?", your answer can be Himesh Reshammiya.
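The first-letters trick is simple enough to write down as a tiny Python helper. The function name is my own, and it capitalises every letter; mixing upper and lower case the way "IrhBBaoRS" does is something you would decide by hand.

```python
def phrase_to_password(phrase):
    # Take the first character of each word and capitalise it.
    return "".join(word[0].upper() for word in phrase.split())

print(phrase_to_password("My Uncle Can Drink 8 Cups Of Coffee Everyday"))
# MUCD8COCE
```

Of course, do not use a phrase that you have just read on somebody's blog.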
