Data is Not Reality: Part 2 of a 3-Part Series

Originally published 19 April 2011

In Part 1 of this series – "It's An Analytical World" – I took some issue with trends such as fact-based management and competing on analytics. In fact, I showed that there is not much fact to work with in management. Everything we observe comes to us through our limited senses, and everything we measure is measured by instruments that register only what they were designed to register – with limited precision. From a theoretical point of view, there really isn't much we can claim we truly know. After this somewhat depressing conclusion, what we need is some practical advice.

Perhaps there can be no such thing as a truth of fact, but measurement is at least the closest thing to truth we have. Surprisingly, help comes from another era – the Middle Ages. William of Ockham (c. 1288 - c. 1348) was an English Franciscan friar who today is best known for a principle called Ockham's razor (although the term was not his). Ockham's razor is the idea that "entities must not be multiplied beyond necessity," or "plurality should not be posited without necessity." In other words, if there are several hypotheses that explain a certain phenomenon, you should favor the one that contains the fewest assumptions and the fewest elements. Go for succinctness. The term "razor" refers to shaving away all unnecessary complexity when formulating a hypothesis. Ockham's razor is usually paraphrased as "the simplest explanation is often the best," although that paraphrase may itself be a bit too simplistic.

Measurement is One Step Off from Reality; Calculations are Two Steps Off or More

What follows from this is that the further removed you are from an unambiguous measurement (a truth of fact), and the more assumptions (truths of reasoning) you stack on top of it, the more likely you are to be wrong. This all seems rather theoretical, but it has many practical implications. As Part 1 argued, even a direct observation or measurement is one step removed from reality; for practical purposes, though, let's treat it as our baseline and call it zero steps off. The moment we do something with that information, such as combining it with other information into a ratio, an aggregation or any other kind of calculation, we are one step off, and the probability that the result is imprecise increases. A calculation based on a calculation would then be two steps off. The more steps off you are, the more you rely on truths of reasoning that may be internally consistent but need not match reality.
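To make the steps-off idea concrete, here is a minimal sketch of textbook error propagation, in Python, with invented figures, so a sketch rather than anyone's actual method: two quantities each measured to within 2% can yield a derived profit figure that is more than 50% uncertain when margins are thin, and every further calculation carries that uncertainty along.

```python
import math

# A toy sketch of how uncertainty compounds with every step off reality.
# All figures are invented for illustration.

revenue, cost = 10_000_000.0, 9_500_000.0  # zero steps off: direct measurements
rel_err = 0.02                             # assume each is measured to within 2%

# One step off: profit is a calculation on two measurements. For a
# difference of independent quantities, absolute errors add in quadrature.
profit = revenue - cost
profit_err = math.hypot(revenue * rel_err, cost * rel_err)

# Two steps off: margin is a calculation on a calculation. For a
# quotient, relative errors add in quadrature.
margin = profit / revenue
margin_rel_err = math.hypot(profit_err / profit, rel_err)

print(f"measurements: {rel_err:.0%} uncertain")
print(f"profit:       {profit_err / profit:.0%} uncertain")  # roughly 55% here
print(f"margin:       {margin_rel_err:.0%} uncertain")
```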

Seen this way, profit is a measure of success far removed from the atomic measurements of costs, revenues and the other components that ultimately establish it. Many assumptions are involved in determining actual profit, even within highly standardized accounting rules. As a result, profit alone is too unreliable a gauge of the health of a company. Cash flow sits much closer to direct, unambiguous measurement and is therefore much more reliable. Or consider an equally prominent performance indicator that is even less well defined: customer satisfaction. It doesn't tell you much by itself, as it is usually a composite of many factors. Things that can be measured directly and that drive customer satisfaction – on-time deliveries, the number of products returned, complaints, and repeat orders or referrals – tell you much more.

The problems associated with truths of reasoning can have widespread societal consequences.

Remember the 2010 eruption of the Eyjafjallajökull volcano in Iceland that led to the shutdown of airports all over Europe? Airplanes could be in danger when flying through a cloud of volcanic ash. The problem was that there were no actual measurements of ash particles; the decision to close the airports was based on meteorological models that predicted how the ash would spread. Meanwhile, the closures were costing airlines $200 million per day. Truths of reasoning were used where truths of fact would have been appropriate.

The same can be said of the 2008 credit crunch. Banks would buy packages of collateralized debt with a certain risk profile, use amazingly complex financial models to chop them up and repackage them, and sell them off again. At no point in that process did the transactions touch the real business of banking. It was pure speculation. Once reality changed – subprime mortgages were no longer being repaid – the models broke down and a chain reaction followed. Again, truths of reasoning were used where truths of fact were needed.

Predicting the Future

"It's hard to make predictions, especially about the future." This quote has been attributed to many, ranging from author Mark Twain to baseball legend Yogi Berra to Danish physicist Niels Bohr. Not a very good start when turning to predictive analytics. If we can't predict the future, predictive analytics is a misnomer. The easiest way to explain this is by using an example we all know. When we rehearse a conversation in our heads ("... if they say this, I will say that"), that's predictive. When after the conversation we tell ourselves, "Darn, I should've said that," that's analytics. You see, the term is problematic.

However, predictive analytics is the hottest analytical topic in the market today. (Admittedly, "predictive analytics" has a nice ring to it; it is something we'd all like to have.) Actually, there is not much new under the hood: the underlying statistics, data mining techniques, operations research and game theory have existed for many years. Predictive analytics is just another label. Within it, we can distinguish predictive models, descriptive models and decision models. Predictive models look for relationships and patterns that typically precede a certain behavior: they point to fraud, predict system failures, assess creditworthiness, and so forth. By determining the explanatory variables, you can predict outcomes in the dependent variables. Descriptive models aim at creating segmentations, most often used to classify customers based on sociodemographic characteristics, lifecycle, profitability, product preferences and so forth. While predictive models focus on a specific event or behavior, descriptive models identify as many different relationships as possible. Lastly, decision models use optimization techniques to predict the results of decisions; this is sometimes also referred to as "what-if" analysis.
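The distinction between the three is easy to sketch in code. What follows is a toy illustration with invented data, assuming numpy and scikit-learn are available; it is not any particular product's implementation. The predictive model ties explanatory variables to one outcome, the descriptive model segments the same data without any target variable, and the decision model simply tries candidate decisions against an assumed demand curve and picks the best modeled result.

```python
import numpy as np
from sklearn.cluster import KMeans                    # descriptive
from sklearn.linear_model import LogisticRegression   # predictive

rng = np.random.default_rng(0)

# Invented customer data: three explanatory variables per customer.
X = rng.normal(size=(200, 3))
churned = (X[:, 0] + X[:, 1] > 0).astype(int)  # invented outcome to predict

# Predictive model: explanatory variables -> one specific behavior.
churn_model = LogisticRegression().fit(X, churned)
print("churn probability, first customer:",
      churn_model.predict_proba(X[:1])[0, 1])

# Descriptive model: no target variable, just segments in the data.
segments = KMeans(n_clusters=3, n_init=10).fit_predict(X)
print("customers per segment:", np.bincount(segments))

# Decision ("what-if") model: evaluate candidate decisions, pick the best.
def modeled_profit(price):          # invented demand curve and unit cost
    demand = 1000 - 40 * price
    return (price - 5) * max(demand, 0)

prices = np.linspace(5, 25, 81)
best = prices[np.argmax([modeled_profit(p) for p in prices])]
print("optimal price under this model:", best)
```

Note that all three models only ever look at data, or assumptions, that already exist; none of them touches the future, which is exactly the problem discussed next.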

Although predictive analytics sounds like a very scientific field, its Newtonian approach is problematic. The Anglo-Irish philosopher and member of Parliament Edmund Burke (1729-1797) remarked that our society is so big and so complicated that a single mind cannot possibly understand how it works in its full complexity – let alone make predictions about it. Furthermore, society changes continuously and organically; it is not like a machine at all.

Let's reconsider Chris Anderson's article (discussed in Part 1 of this series) about the end of theory in this light. If we collect all the data in the world and construct the perfect model, we can indeed run every correlation we can think of or – more importantly – that the system can think of. These correlations then describe relationships (not to be confused with causality) in the data. By definition, data comes from the past; we can only measure what is already there. This means the correlations are retrospective in nature as well. In the end, there is only one thing we can say about the future: there is a high probability that it will be different from today. You could even argue that running analytics invalidates the predictive value of the data. The only thing predictive analytics can reliably predict is a situation in which nothing changes. That's not much of a prediction.
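A toy demonstration of that retrospective nature, again with invented numbers (numpy assumed): a model fit on a hundred periods of stable history describes the past almost perfectly, and becomes useless the moment the underlying pattern changes.

```python
import numpy as np

rng = np.random.default_rng(1)

# "All the data in the world" up to today: a hundred periods of a stable,
# clean linear trend (invented numbers).
t_past = np.arange(100)
y_past = 2.0 * t_past + rng.normal(scale=5.0, size=100)

# The best retrospective model: a least-squares fit on everything we have.
slope, intercept = np.polyfit(t_past, y_past, 1)
resid = y_past - (slope * t_past + intercept)

# Tomorrow the regime changes: the trend reverses. The fitted model keeps
# extrapolating the past regardless.
t_future = np.arange(100, 120)
y_future = y_past[-1] - 5.0 * (t_future - 99)
pred = slope * t_future + intercept

print(f"typical error while nothing changed: {np.std(resid):.1f}")
print(f"average error after the change:      {np.abs(pred - y_future).mean():.1f}")
```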

Even rock-solid science shows many examples of sudden, radically changing patterns. Take, for instance, fractals, a branch of mathematics built on recursive equations (formulas that feed their own output back in... again, and again, and again). Starting out fairly predictably, their complexity soon becomes endless. Two very similar fractal expressions can lead to extremely different results in just a few iterations. Fractals are used to describe shapes in nature, such as clouds, mountains and rivers; they are also used for data compression and even for art.
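That sensitivity is easy to see in a few lines of code. The sketch below is an illustration of my choosing: it uses the logistic map, a recursive equation whose bifurcation diagram is a classic fractal, and shows two starting values that differ by one part in a billion agreeing for a while and then bearing no resemblance to each other.

```python
# The logistic map x -> r*x*(1 - x); at r = 4 it is fully chaotic, and tiny
# differences in the starting value are roughly doubled at every iteration.

def orbit(x, r=4.0, steps=50):
    """Iterate the logistic map, recording every value."""
    xs = [x]
    for _ in range(steps):
        x = r * x * (1.0 - x)
        xs.append(x)
    return xs

a = orbit(0.400000000)   # two starting values differing by
b = orbit(0.400000001)   # one part in a billion

for n in (0, 10, 30, 50):
    print(f"step {n:2d}: {a[n]:.6f} vs {b[n]:.6f}  (gap {abs(a[n] - b[n]):.1e})")
```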

Nassim Nicholas Taleb, a professor of risk engineering, former Wall Street trader and author of The Black Swan, uses this example to describe reality in society as well. Taleb argues that change doesn't happen gradually, which is the assumption most of us work from, but in “jumps,” controlled by “the tyranny of the singular, the accidental, the unseen and the unpredicted.” Gradual change is our paradigm, yet actual change is “almost always outlandish.” Taleb calls these events "black swans.” The expression was common in 16th-century London to describe an impossible thing, as swans were obviously white. But in 1697, Willem de Vlamingh led an expedition up what is now the Swan River in Australia and actually found black swans. The term then came to mean something deemed impossible but found to exist. Taleb uses it for hard-to-predict, rare, high-impact events beyond what we would normally expect. And these black swans appear all the time. Think of the H1N1 and SARS epidemics in global health, the collapse of Enron in business, the credit crunch of 2008, or positive surprises such as the invention of text messaging. And who knows, next year we may learn about a cure for cancer, a way to store massive amounts of energy in a small pill, or the discovery of alien life. Our world will never be the same again.

Taleb argues we do not live in “Mediocristan,” the world we understand. We live in “Extremistan,” where sudden developments in many different areas can have immensely disruptive consequences. Given the diversity of black swans and the fact that they happen all the time, it doesn't make much sense to trust our models of reality, even the perfect ones. Disruption is always around the corner. We keep falling for it because of a few fallacies. First, we learn by induction: we draw general conclusions from observation and experience, and since we cannot know what we don't know, we are never ready for the exceptions to the rule. Second, we believe that history repeats itself, so we look only for change we already know. Third, we seek meaning in events and invent explanations after the fact, which is much more comforting than staring at sheer randomness. Taleb specifically warns against experts who claim to understand their area of expertise and wield massive amounts of statistics to support their conclusions; experts in particular tend to underestimate the uncertainty of events.

So if we can’t even trust the experts, who (or what) can we trust? More on that in Part 3 of this series.

  • Frank Buytendijk

    Frank's professional background in strategy, performance management and organizational behavior gives him a strong perspective across many domains in business and IT. He is an entertaining speaker at conferences all over the world, and was recently called an “intellectual provocateur” and described as “having an unusual warm tone of voice.” His work is frequently labeled provocative, deep, truly original, and out of the box. More down to earth, his daughter once described it as “My daddy sits in airplanes, stands on stages, and tells jokes.” Frank is a former Gartner Research VP and a seasoned IT executive. He is also a visiting fellow at Cranfield University School of Management and the author of several books, including Performance Leadership (McGraw-Hill, 2008), Dealing with Dilemmas (Wiley & Sons, 2010), and, most recently, Socrates Reloaded.
