Thursday 15 September 2016

Lying with Mathematics!!

I am posting on maths once again, giving you one more chance to send me death threats. Okay, today we will talk about partial differential equations and 3-dimensional Euclidean space. Just joking!! This post is actually about how to lie with mathematics. Evolution made sure that the human brain can appreciate crappy reality shows and post bullshit on Facebook. If there is one thing the human brain was not built for, it is grasping maths. Think about it. You are a hunter-gatherer roaming in a jungle. You hear a rustle in the bushes. The last thing you want to do is sit with pen and paper, solving a fancy equation to compute the probability that it might be a tiger. The maths was as simple as this: Heard a sound? Ready. One, two, three. Run! It is not surprising, then, that anybody can fool people with numbers and maths.

My imaginary friend recently conducted a survey and concluded that 80% of the people living in Karnataka do not know Kannada. Looks suspicious, doesn't it? Turns out that the survey was done near the Maratha Halli bridge and he spoke to only 20 people. The problem is that the Maratha Halli bridge does not represent the entire state. A much better survey would include at least 1000 people picked from at least 8-10 different districts. It should include Hindus, Muslims and Christians; rich, poor and middle-class people; north and south Karnataka; people of different ages; and so on. I made this one up, but real life is full of phony surveys like this. Take a look at this survey conducted by CNN-IBN, The Hindu and CSDS: http://www.thehindu.com/…/0…/The_Popularity_Stak_746936a.pdf
They claimed that 39000 people were interviewed and that Rahul Gandhi was the most popular choice for PM!! Modi was 4th on the list. Either they drank one bottle of Maaza and randomly cooked up the numbers, or they interviewed people like Rajdeep, Barkha and nutcases like Arundhati Roy. In the US, a survey was conducted on people's sex lives. The results were surprising: the average man had slept with 7 women and the average woman with 4 men. How on earth can that be? Every such encounter involves one man and one woman, so with roughly equal numbers of men and women the two averages should match. The explanation is simple. People lie. Men tend to exaggerate their sexual encounters and women tend to downplay them.

When a kid dies before the age of 1 and the cause of death cannot be explained, it is called Sudden Infant Death Syndrome (SIDS). Sally Clark was an unfortunate soul who lost 2 of her babies like that. Trupti Patel was another, who lost 3 of her babies like that. People thought that something was wrong in Sally's and Trupti's cases, and they were made to face trial. Medically, no evidence could have convicted them, as nothing was conclusive. So maths was used!

Say you toss a coin thrice. What is the probability that all 3 will be heads? That is easy. There is a 50-50 chance of a head every time. One head? 50%. 2 heads? 50% of 50%, or half of half, or 1/2 X 1/2. 3 heads? Half of half of half, or 1/2 X 1/2 X 1/2. The chance is 12.5% or 1/8. You multiply the individual probabilities of independent events and you get the total probability. This mathematical idea was used. The expert witness said something like this: What is the chance that one kid will die of SIDS in a non-smoking family? It is 1 out of 8500, or 1/8500. So, what is the chance of 2 SIDS deaths in the same family? You multiply, remember? 1/8500 X 1/8500 gives something like 1 out of 7 crore. So the chance of 2 kids in the same family dying of SIDS is 1 out of 7 crore. It is such a rare occurrence, 1 out of 7 crore! So it must have been murder! "One death is a tragedy, 2 is suspicious, 3 is murder," he said. Sally even went to jail.
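The multiplication that the expert witness relied on can be sketched in a few lines of Python. This is only an illustration of the arithmetic quoted above; the helper name all_happen is made up for this post, and the result is meaningful only if the events really are independent.

```python
from fractions import Fraction

def all_happen(*probabilities):
    """Multiply probabilities together.

    This gives the chance that every event happens ONLY IF the
    events are independent of each other.
    """
    total = Fraction(1)
    for p in probabilities:
        total *= p
    return total

# Three coin tosses, each with a 1/2 chance of heads.
three_heads = all_happen(Fraction(1, 2), Fraction(1, 2), Fraction(1, 2))

# The expert's figure: two SIDS deaths at 1/8500 each.
two_sids = all_happen(Fraction(1, 8500), Fraction(1, 8500))

print(three_heads)         # 1/8
print(float(three_heads))  # 0.125, i.e. 12.5%
print(two_sids)            # 1/72250000, roughly 1 out of 7 crore
```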

Assume that you are CSP (<Insert your favorite lawyer from your favorite TV serial here>). The maths is clear: for independent events, you can multiply the probabilities to get the total probability. How would you save Sally and Trupti? What is wrong with this argument? Think.
Read the maths statement again. Did you notice the words "independent events"? That is the answer. The formula works only if the 2 deaths are independent. What if some unknown mosquito bite caused the death? What if the same mosquito bit both the kids? What if some unknown genetic factor caused it? Then both deaths would have the same cause; they are related, not independent events at all. Hence the 1 out of 7 crore figure might be rubbish. There is another problem. Even if the probability is 1 out of 7 crore, that does not mean that the chance of her innocence is 1 out of 7 crore. To understand this, let us take an imaginary case.
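To see how much a shared cause can matter, here is a rough sketch with made-up numbers. Only the 1 out of 8500 figure comes from the case; the share of families with a hidden risk factor and the risk inside those families are pure assumptions picked for illustration, and the sketch also assumes that deaths are independent once you know which kind of family it is.

```python
from fractions import Fraction

# From the post: overall chance of one SIDS death is 1 out of 8500.
overall_risk = Fraction(1, 8500)

# HYPOTHETICAL numbers, invented only to illustrate the point:
risky_share = Fraction(1, 100)    # share of families with a hidden factor
risk_if_risky = Fraction(1, 100)  # chance per baby in such a family

# Risk in all other families, set so the overall risk stays 1 out of 8500.
risk_if_normal = (overall_risk - risky_share * risk_if_risky) / (1 - risky_share)

# Wrong way: pretend the two deaths are independent.
naive_two_deaths = overall_risk ** 2

# With the shared factor: average over the two kinds of family.
actual_two_deaths = (risky_share * risk_if_risky ** 2
                     + (1 - risky_share) * risk_if_normal ** 2)

print(risk_if_normal)                               # 1/56100
print(float(actual_two_deaths))                     # close to 1 in 10 lakh
print(float(actual_two_deaths / naive_two_deaths))  # roughly 72 times higher
```

With these invented inputs, every single baby still faces the same 1 out of 8500 risk on average, yet two deaths in one family come out around 72 times more likely than the multiplied figure suggests.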

You are walking on a street with no lights somewhere in Bengaluru. A red i10 car stops close to you. A fella gets out and tries to stab you; you punch back and he escapes. You didn't see his face properly, but you noticed that he shouted something like "Jai Hrithik" and that he was six foot tall. You tell this to the police. And guess what? Next morning the police actually spot a 6-foot-tall man driving a red i10, and he happens to be a Hrithik fan! Let us say that the chance of a person driving a red i10 is 1 out of 2000, 1 out of 10 people is 6 foot tall and 1 out of 100 is a Hrithik fan. You multiply the probabilities: 1/2000 X 1/10 X 1/100. That is 1 out of 20 lakh. So the chance that a random person drives a red i10, is 6 foot tall and is a Hrithik fan is 1 out of 20 lakh. Makes sense so far. Now tell me, Mr Sherlock Holmes, what is the probability that he is guilty?

What is the chance of his innocence? It is not 1 out of 20 lakh! For that we need other numbers. How many people are there in Bengaluru? 1 crore. When there are 1 crore people and the chance is 1 per 20 lakh, you can expect about 5 people who fit the description just by coincidence. Even if you are 1 in 1 crore, there will still be 125 people exactly like you, since our population is 125 crore. So there might be 5 people like our dude, and he is just one of the 5. With no other evidence, the chance of his guilt is 1 out of 5, meaning the probability of his innocence is 80%.

Sally was eventually released, but she died later. Trupti was acquitted. You can google the case of Lucia de Berk, a nurse who was accused of 13 murders because of junk maths and faulty reasoning. Her life was also destroyed before better sense prevailed. There is a film called Lucia de B. based on her life.
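The red i10 arithmetic can be checked with a short Python sketch. It uses only the numbers from the imaginary case and assumes, as the story does, that the three traits are independent, that the attacker lives in Bengaluru, and that nothing else is known about the suspect. The variable names are made up for this post.

```python
from fractions import Fraction

# Numbers from the imaginary case.
p_red_i10 = Fraction(1, 2000)
p_six_foot = Fraction(1, 10)
p_hrithik_fan = Fraction(1, 100)

# Chance that a random person matches all three (assumes independence).
p_match = p_red_i10 * p_six_foot * p_hrithik_fan

population = 10_000_000  # 1 crore people in Bengaluru

# How many people in the city match purely by coincidence.
expected_matches = population * p_match

# If the attacker is one of them and each is equally likely:
p_guilty = 1 / expected_matches
p_innocent = 1 - p_guilty

print(p_match)           # 1/2000000, i.e. 1 out of 20 lakh
print(expected_matches)  # 5
print(p_guilty)          # 1/5
print(p_innocent)        # 4/5, i.e. 80%
```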

Let us say that 2 coaching classes are offering IAS and MBA classes. Class A - 65% success rate, Class B - 55%. Which one is better? A you said? Wait!
Number of aspirants in Class A: MBA - 80 (60 clear it), IAS - 20 (5 crack it)
What about B? It has 90 IAS aspirants and 10 MBA. 45 have cracked IAS and all 10 have cracked MBA.
After splitting, success rate looks like this:
Class A - MBA - 75%, IAS - 25%
Class B - MBA - 100%, IAS - 50%
Total: Class A - 65%, Class B - 55%.
What??!! Individually, B has done better than A in both IAS and MBA, but in total Class A is better! Crazy? Yes. Reason? IAS is difficult to crack, MBA is easier. Class B had lots of IAS guys, Class A had lots of MBA dudes, and the failure rate is higher in IAS. Another example will make it clear.
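The coaching class numbers can be recomputed with a small Python sketch. The figures are the ones given above; the function names are made up for this post.

```python
# (cleared, appeared) for each exam, taken from the numbers above.
results = {
    "Class A": {"MBA": (60, 80), "IAS": (5, 20)},
    "Class B": {"MBA": (10, 10), "IAS": (45, 90)},
}

def per_exam_rates(exams):
    """Success rate in percent for each exam separately."""
    return {exam: 100 * cleared / appeared
            for exam, (cleared, appeared) in exams.items()}

def overall_rate(exams):
    """Success rate in percent with all students pooled together."""
    cleared = sum(c for c, _ in exams.values())
    appeared = sum(n for _, n in exams.values())
    return 100 * cleared / appeared

for name, exams in results.items():
    print(name, per_exam_rates(exams), "overall:", overall_rate(exams))

# Class A {'MBA': 75.0, 'IAS': 25.0} overall: 65.0
# Class B {'MBA': 100.0, 'IAS': 50.0} overall: 55.0
```

Class B wins every exam-wise comparison and still loses overall, because 90 of its 100 students took the harder exam.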

Say, after watching Gopichand, Saina Nehwal and Sindhu, you take a sudden interest in badminton. You offer a sum of Rs 1 crore for one upcoming talent's training. You have to select the best player. 100 people apply and finally 2 are shortlisted. Now you ask your secretary to select a winner. Here are some statistics about Soumya and Radhika:
Monday: Soumya - 100% wins, Radhika - 80% wins
Tuesday: Soumya - 30% wins, Radhika - 0% wins
Who do you think is the better player? Soumya? Not so fast! What if I told you that on Monday Soumya played 1 match and Radhika 10? Soumya won her one match that day, and Radhika won 8 out of 10. On Tuesday it was the reverse: Soumya played 10 matches and won 3, while Radhika played only 1 match and lost. In total, Radhika won 8 out of 11 and Soumya 4 out of 11. See! Numbers can be manipulated.
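One way to see what is going on with Soumya and Radhika: the overall win rate is a weighted average of the daily win rates, where each day's weight is its share of the matches played. Here is a minimal Python sketch of that idea using the numbers above; the function name is made up for this post.

```python
from fractions import Fraction

# (wins, matches) per day, taken from the numbers above.
record = {
    "Soumya": {"Monday": (1, 1), "Tuesday": (3, 10)},
    "Radhika": {"Monday": (8, 10), "Tuesday": (0, 1)},
}

def weighted_overall(days):
    """Overall win rate as a weighted average of the daily win rates."""
    total_matches = sum(matches for _, matches in days.values())
    overall = Fraction(0)
    for wins, matches in days.values():
        daily_rate = Fraction(wins, matches)        # e.g. 3/10 on Tuesday
        weight = Fraction(matches, total_matches)   # share of all matches
        overall += weight * daily_rate
    return overall

for player, days in record.items():
    print(player, weighted_overall(days))

# Soumya 4/11
# Radhika 8/11
```

Soumya's 100% day counts for only 1 of her 11 matches, while her 30% day counts for 10 of them. For Radhika it is the other way round.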

In an imaginary survey they found that 73% of people in Karnataka spend their afternoons watching TV serials. Well, this survey was done on weekdays, so most men and working women were not available; only housewives were interviewed, which distorted the numbers. If The Hindu published a study claiming that, as per the Indian Statistical Institute, 59% of people in India don't like sex, most people would believe it. A number does not mean anything unless you put it in context.

Another study showed that 67% of the people who take frequent breaks in office get cancer and 71% of those who like mint candy get cancer. So think twice before you take your next break or eat mint candies! Turns out that many who take frequent breaks use that time for smoking, and most of them eat mint candies after smoking. It is the tobacco that is causing the cancer, not the breaks or the mint candies. Moral of the story? Correlation does not always imply causation. Just because A and B occur together, you cannot say that A causes B. But people still say things like "whenever I wear a red shirt the Indian cricket team wins" or "whenever X acts in a film it fails, so X is an iron leg" and such. Check some hilarious correlation graphs here: http://www.tylervigen.com/spurious-correlations
As someone rightly put it: There are three kinds of lies: lies, damned lies, and statistics.