Impact | September 2015
“We Can't Rely on Machines” Michael L. Brodie, Research Scientist at the Massachusetts Institute of Technology (MIT), is convinced that Big Data has more potential than the hype suggests, but also more risks. An interview about opportunities and threats. Everybody is talking about Big Data. Is it more than a hype? Michael L. Brodie: It is a little bit like at the beginning of the internet. The hype was large ly based on people who tried to make money selling their pro ducts. But the internet has chan ged our world in ways the hype couldn’t conceive. So at the moment the hype for Big Data comes from IBM, SAS, SAP – the large vendors of these solutions. Forecasts of the Big Data Mar ket show a huge market growth from 7.6 billion dollars in 2011 to 84.6 billion dollars in 2026. So yes, there is a lot of hype? But I actually think it is far more profound and powerful than most people are conceiving it at the moment. It has already changed a very large number of operating processes in health care, manufacturing, market ing and stock markets. How ever, it is not as widely used as one might think. Big Data and Big Data Analytics are in their infancy with respect to opera tional deployment and our un derstanding of it. So Big Data isn't promoted beyond its value? No, these methods are actually seen as the fourth scientific par adigm, meaning that they have the potential of a completely different and faster way of solv ing many challenging problems
of humanity, of health care and poverty. Gartner, one of the world's leading information technology research compa nies, estimates that 80 percent of all business processes world wide will change within the next five to ten years, all based on Big Data Analytics. Gartner also predicts that 85 percent of the Fortune 500 will be unable to exploit Big Data in 2015. The much bigger impact will be over the next decades. Sometimes it seems that there is a blind faith in Big Data. Right. My experience shows that people who are very excit ed about Big Data may not be very familiar with forecasts or statistically based prognosis. In order to use statistical tech niques to analyse data you have to understand the power and the limits of statistics. And al most every statistical outcome is probabilistic. That means it may happen only within the
predicted error bounds and confidence level. When you get an answer you still can’t say what SAP's stock price will be next Wednesday. You say with a probability of 0.75 the stock price will be this or that amount of money. So it is always quali fied by some sort of error bars. Error bars give a general idea of how precise a measurement is. But 80 percent of the people I interact with and who want to consume Big Data results don't know what an error bar or pro babilistic answer is. What does this mean from a customer's point of view? Most customers of data anal ysis want to know something like: “Will I sell strawberry Pep si more than vanilla? Because I have to tell my manufacturing line how to switch.” And you say to them: “Well, with this or that probability you are likely to sell more strawberry in New Jersey over these weeks, and in
Michael L. Brodie is a Research Scientist in the Computer Science and Artificial Intelligence Laboratory at the Massachusetts Institute of Technology (MIT). With over 40 years of experience he advises startups and is a member of Advisory Boards of national and international research organizations. He is an adjunct professor at the National University of Ireland. For more than 20 years he was Chief Scientist of IT at the US telecom company Verizon, a Fortune 20 company. There he was responsible for advanced technologies, architectures, and methodologies for Information Technology strategies and for guiding industrial scale deployments of emergent technologies. He is concerned with the Big Picture aspects of information ecosystems including business, economic, social and technical applications. His current research and applied interests include Big Data and Data Science. Brodie holds a PhD in Databases from the University of Toronto and a Doctor of Science from the National University of Ireland. Recently Michael L. Brodie was invited as keynote speaker at the 2nd «Swiss Conference on Data Science» organized by the ZHAW Datalab. The topic of his talk: “The Emerging Discipline of Data Science: Principles and Techniques for Data-Intensive Analysis”.
California you are more like ly to sell vanilla over the same period.” So they have to under stand that the answers can only be probabilistic. They never get a complete certified answer. Al most every measurement one makes in the world is probabil istic. That means every compu tational answer made based on data is also probabilistic. So ob viously education is going to be critical. Not only in understand ing it from a customer's point of view but also in expressing it from a Data Scientist's who pro duces these answers. Some people are warning that Big Data threats humanity. What is your opinion? There is an organization called "Future of Life" (http://future oflife.org). It has just recently been created by very famous scientists and entrepreneurs with a strong commitment to technology like Stephen Hawking, the famous physicist from Cambridge, Bill Gates from Microsoft and Elon Musk, the CEO of Tesla. Their vision is to limit the risks of automation and Big Data so that they won't negatively impact humani ty. The media have dramatized that by stating that these peo ple were saying “Artificial Intel ligence may end life as we know it on the planet”. But of course they didn’t say that. Their objec tive is to safeguard life and de velop optimistic visions of the future in order to mitigate exis tential risks facing humanity from Artifcial Intelligence. The nature of the threat is rather that we in artificial intelligence don't really understand what it does. We see the outcomes and they seem in many cases to be quite positive. But the most ad vanced research institutes like those at MIT, when they talk