Or what I learned about Analysis from reading detective novels and watching TV shows
Nowadays, it seems that hardly a day goes by when we don’t come across some article or other about Big Data or Advanced Analytics. Even NPR got into the mix recently, when Morning Edition interviewed Kenneth Cukier, co-author of the book Big Data: A Revolution That Will Transform How We Live, Work, and Think. In general, the promise is that by applying scientific rigor and data mining techniques to Big Data, we will discover correlations that we didn’t know about before, and this will revolutionize the business world. Needless to say, I’ve seen similar promises made before. As someone who is very passionate about the effective use of Information, I remain cautiously optimistic. It’s great that there’s now critical mass around this drive to rely more on hypotheses and predictive models to make decisions. If we are indeed looking to get serious about Business Intelligence (in its broadest interpretation), I’d say why not learn some techniques from the true Intelligence professionals?
What could the CIA, FBI and CSI possibly have in common with Business Intelligence? For one thing, it seems that charismatic and photogenic individuals from these organizations really have a lock on prime time TV shows and books on the bestseller list. The second but more relevant point is that these agencies actually do a lot of investigation and data analysis. Long before unstructured data became an excuse to sell more Big Data software solutions, these professionals were sifting through witness testimonies and phone conversations, trying hard either to prove guilt or to find the needle in the proverbial haystack. And unlike in the business world, their analysis is frequently a matter of life or death: it can result in several years in jail, an incorrect drone strike or a terrorist plot that goes unnoticed until it is too late.
I read my share of detective novels and mysteries growing up and continue to enjoy everything in the genre. I loved Sherlock Holmes, but felt cheated by Agatha Christie’s novels. Unlike the traditional detective story, where clues at the crime scene usually led to more clues and ultimately to the perpetrator, her plot lines usually revolved around a crime with multiple suspects, all of whom had possible motives and the means to commit the crime. I was never satisfied by the ending, which typically revealed the most unlikely suspect as the culprit. I bring this up because I learned recently from CSI that there’s a technical term for this: Analysis of Competing Hypotheses, or ACH for short. Most elementary school students know that a science project starts with a Hypothesis. As analysts in the business world, we may not always start with a hypothesis, but data analysis involves some evaluation of possible causes. So when Kenneth Cukier claims that “Big Data doesn’t care about causes, just correlation,” I understand he’s talking about specific use cases, perhaps in Marketing, but it is an approach that dismays me. Correlations certainly wouldn’t fly in a courtroom, although they may be used to understand patterns in crime and fraud.
In any case, the CSI I’m referring to is actually not the Crime Scene Investigation unit (which does excellent work with physical evidence), but the Center for the Study of Intelligence, which is a division of the CIA. The CSI site has a wonderful book called Psychology of Intelligence Analysis, written by Richards Heuer, that describes ACH in some detail in Chapter 8:
Analysis of competing hypotheses (ACH) is an eight-step procedure grounded in basic insights from cognitive psychology, decision analysis, and the scientific method. It is a surprisingly effective, proven process that helps analysts avoid common analytic pitfalls. Psychological research into how people go about generating hypotheses shows that people are actually rather poor at thinking of all the possibilities.
Step-by-Step Outline of Analysis of Competing Hypotheses
1. Identify the possible hypotheses to be considered. Use a group of analysts with different perspectives to brainstorm the possibilities.
2. Make a list of significant evidence and arguments for and against each hypothesis.
3. Prepare a matrix with hypotheses across the top and evidence down the side. Analyze the “diagnosticity” of the evidence and arguments–that is, identify which items are most helpful in judging the relative likelihood of the hypotheses.
4. Refine the matrix. Reconsider the hypotheses and delete evidence and arguments that have no diagnostic value.
5. Draw tentative conclusions about the relative likelihood of each hypothesis. Proceed by trying to disprove the hypotheses rather than prove them.
6. Analyze how sensitive your conclusion is to a few critical items of evidence. Consider the consequences for your analysis if that evidence were wrong, misleading, or subject to a different interpretation.
7. Report conclusions. Discuss the relative likelihood of all the hypotheses, not just the most likely one.
8. Identify milestones for future observation that may indicate events are taking a different course than expected.
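To make the matrix in steps 3 through 6 concrete, here is a minimal Python sketch. Everything in it is my own illustration rather than anything from Heuer's book: the business scenario (a drop in sales), the three hypotheses, the evidence rows and their ratings are all invented, and the function names are hypothetical. It also assumes one simple scoring rule, counting the evidence that is inconsistent with each hypothesis, in keeping with step 5's advice to try to disprove rather than prove.

```python
# Ratings: "C" = consistent, "I" = inconsistent, "N" = neutral or not applicable.
# Hypotheses run across the top of the matrix (step 3). All invented.
HYPOTHESES = ["price increase", "new competitor", "seasonal dip"]

# Evidence runs down the side, with one rating per hypothesis. All invented.
MATRIX = {
    "drop began the week prices changed": ["C", "I", "I"],
    "drop is limited to one region":      ["N", "C", "I"],
    "same month was flat last year":      ["C", "C", "I"],
    "customer surveys mention price":     ["C", "N", "N"],
    "web traffic is unchanged":           ["C", "C", "C"],
}


def diagnostic_rows(matrix):
    """Step 4: keep only evidence whose ratings differ across hypotheses."""
    return {item: row for item, row in matrix.items() if len(set(row)) > 1}


def inconsistency_counts(matrix, hypotheses):
    """Step 5: count the evidence items that contradict each hypothesis."""
    counts = {h: 0 for h in hypotheses}
    for row in matrix.values():
        for hypothesis, rating in zip(hypotheses, row):
            if rating == "I":
                counts[hypothesis] += 1
    return counts


def least_contradicted(matrix, hypotheses):
    """Return the hypotheses with the fewest inconsistent evidence items."""
    counts = inconsistency_counts(matrix, hypotheses)
    fewest = min(counts.values())
    return [h for h in hypotheses if counts[h] == fewest]


def critical_evidence(matrix, hypotheses):
    """Step 6: evidence whose removal would change the tentative conclusion."""
    baseline = least_contradicted(matrix, hypotheses)
    critical = []
    for item in matrix:
        reduced = {e: row for e, row in matrix.items() if e != item}
        if least_contradicted(reduced, hypotheses) != baseline:
            critical.append(item)
    return critical


if __name__ == "__main__":
    refined = diagnostic_rows(MATRIX)
    # Step 7: report every hypothesis, not just the front-runner.
    for hypothesis, count in inconsistency_counts(refined, HYPOTHESES).items():
        print(f"{hypothesis}: {count} inconsistent item(s)")
    print("Least contradicted:", least_contradicted(refined, HYPOTHESES))
    print("Conclusion hinges on:", critical_evidence(refined, HYPOTHESES))
```

The row that is consistent with every hypothesis is dropped in the refinement step because it cannot help tell the hypotheses apart, which is what "no diagnostic value" means in step 4. Counting inconsistencies is only one possible scoring rule; a real analysis would also weigh how credible and how relevant each item of evidence is.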
On TV, we’ve seen Dr. House and his team of diagnosticians leverage the techniques above to treat patients. As most Law and Order aficionados will attest, the Defense lawyer does his best to prove that the primary hypothesis (guilt) is false by showing that it rests on shaky ground. An alternative hypothesis may be presented to sow doubt in the minds of the jury. In the business world, the costs of making a bad judgment are rarely that high. Important findings are reviewed in meetings, but it’s easy to put too much faith in the data and act on it if no one plays the role of the Defense Lawyer. If you believe, as I do, that true Business Intelligence is Intelligence that guides the right decision and action, then as Analysts and decision makers, we need to get serious about the Hypotheses we evaluate. Otherwise, all that data analysis and Analytics does is further confirm our biases.