Focus: Researching with AI
Augmented Science

Artificial intelligence is becoming an ever more natural part of the research process. But before it can trigger a scientific revolution, researchers first have to gain a better understanding of just what kind of assistant they’ve invited into their labs.

By Roland Fischer
Intelligent machines and self-learning systems have been keeping researchers busy for decades now. Initial reports of attempts at machine learning in the identification of genetic patterns were published over 20 years ago. And in particle physics, researchers have been experimenting with artificial intelligence (AI) for so long that some reviews from the year 2000 even reported a sagging interest and called for a rapid revival. “Neural networks were actually studied and employed in various experiments at CERN back in the 1990s”, recalls Sigve Haug from the Laboratory for High Energy Physics at the University of Bern. They simply didn’t call it ‘machine learning’ at the time.

AI everywhere

Today, the use of such AI methods in large particle-physics experiments is almost the norm, whether in data reconstruction or data analysis. They are also often used in distributed computing, where programs have to learn when and how computing processes can be distributed most efficiently. But AI isn’t just omnipresent at CERN: suddenly, the situation is very similar everywhere. Artificial intelligence is the current credo in research. Physical chemistry, molecular biology, medical genetics, astrophysics and even the digital humanities: wherever large amounts of data are to be found, AI isn’t far away. Is the development towards AI as laboratory assistant – in other words, towards a mixed research team of human and machine – the next, necessary step? “Absolutely”, says Karsten Borgwardt, a professor at the Machine Learning and Computational Biology Lab at ETH Zurich. “In many fields in the life sciences where we work with high-throughput technologies, we simply can’t do without it any more”. The amounts of data are simply too big if you want to link half a million medical histories with the corresponding genetic data. “No human being can recognise any meaningful, hitherto unrecognised patterns with the naked eye any more”. Such data volumes can only be handled with efficient statistical procedures, such as those currently being developed by specialists like Borgwardt. In any case, the border between statistics and machine learning is fluid today, he says.

Science on steroids

Artificial intelligence as a natural partner in the research process: this vision is reminiscent of Garry Kasparov’s ‘Advanced Chess’ idea, which he came up with shortly after his defeat against Deep Blue, almost exactly 20 years ago. In future, he said, humans should no longer play against each other or against machines; instead, joint human/machine teams should compete. This would raise the game to a whole new level, Kasparov believed: a game of chess beyond the bounds of human strategic possibilities.
“A system can truly ‘over-learn’: the more you train it, the worse it gets.”

“Machine learning is the scientific method on steroids”, writes the AI expert Pedro Domingos of the University of Washington in his book ‘The Master Algorithm’. In it, he postulates something along the lines of a super-machine-learning method. Through intensive use of AI, research would become quicker, more efficient and more profound. This would free researchers from their statistical routine and let them concentrate wholly on the creative aspects of their work. Domingos promises nothing less than a new, golden age of science.

Not all researchers engaged with AI are keen to sing from the same happy song sheet. Neven Caplar of the Institute for Astronomy at ETH Zurich is a data nerd through and through: he runs the data blog
astrodataiscool.com and has recently used machine learning to quantify the gender bias in astronomical publications. For a few years now, Caplar has noticed a definite upswing in publications that involve AI. But he doubts whether the methods in his field will allow for any big breakthrough. Astronomy is “a science of biases”, he says, and it’s also about controlling the instruments as well as possible. For this reason, AI shouldn’t be treated as a ‘black box’ – in other words, it should not be a practical tool that delivers good results but whose precise workings remain incomprehensible. When it comes to handling observation data, their interpretation by a human researcher is still the crucial aspect, says Caplar.

The black-box problem

“Oh, this black box!”, cries his colleague Kevin Schawinski (see also: ‘The physics of everything’, p. 30). Everyone is talking about AI being a ‘black box’, claiming we aren’t able to scrutinise the logic and arguments of a machine. Schawinski, an astronomer, doesn’t see AI like that. From his perspective, it’s simply a new research method that has to be calibrated and tested for us to understand it properly. That’s no different from any other method that science has appropriated, he says. After all, no one can comprehend every single aspect of complex experimental assemblies such as the Large Hadron Collider at CERN or the Hubble Telescope. Here, Schawinski trusts the research community just as much: they know how to ensure that the scientific process functions robustly. Together with colleagues from the computer sciences, Schawinski has launched the platform space.ml, a collection of easy-to-use tools for interpreting astronomical data. He has himself developed a method that uses a neural network to improve images of galaxies; more information can thereby be extracted, without the computer needing further specifications. With other applications, so-called
Swiss National Science Foundation – Swiss Academies: Horizons No. 113