Theory & struggle 2015


Leo Impett ‘Artificial intelligence, perhaps more than any other technical or mathematical endeavour, is influenced by a host of underlying ideologies and values.’

Vygotsky and Marxist artificial intelligence

This is an introductory review article on the increasing use of Vygotskian psychology in artificial intelligence, intended for a non-technical audience. It seeks to explain the role of Vygotskian Activity Theory in debates on artificial intelligence by examining the cultural, historical and technical thinking behind artificial intelligence research.

The Soviet psychologist Lev Vygotsky sought to establish a specifically Marxist approach to human thought and language, one that related the development of consciousness to tool-using social production and to the role of language in enabling such collective, practical interaction. He argued that instrumental and abstract thought stemmed materially from the way humans collectively use tools, from the simple to the complex – both over time as a species and, immediately, in the intellectual development of each child. Vygotsky died of tuberculosis in 1934, but his cultural-historical psychology was actively carried forward in the Soviet Union by, among others, A. R. Luria and A. N. Leontiev. In the West his work remained relatively unknown, or at least unestablished, until advances in educational and developmental psychology in the 1970s.

After the fall of the Soviet Union, his work – particularly Activity Theory (AT) – has become central to the fields of artificial intelligence (AI) and human-computer interaction (HCI), which had to a large extent been based on the mechanistic notions of cybernetics and cognitive science. In the context of artificial intelligence, Activity Theory has passed through two opaque reappropriations: that of Western psychologists during the Cold War, and that of computer scientists in the last decade. The Marxist elements of his cultural-historical psychology, of which Activity Theory forms a central part, have been largely ignored 1. At the same time, critiques of artificial intelligence tend to focus on cybernetic or cognitive-science models of computation, both of which are philosophically and technically outdated. Very recent developments in deep learning have changed the technical field considerably, and might lead to a new era of self-critical technical research.


Cybernetics, GOFAI, Deep Learning

Unlike Vygotskian theories, which emerged from his work on child development and the psychology of art, cybernetics grew largely out of American research into anti-aircraft gun control systems in the Second World War 2. Cybernetics is largely concerned with the control of industrial systems, feedback loops and analogue control. Philosophically it was, for the French cognitive scientist Jean-Pierre Dupuy, 'not the anthropomorphization of the machine but rather the mechanization of the human' 3. With the advent of digital computing, artificial intelligence and cybernetics split into separate fields only at a special conference in 1956, with artificial intelligence moving into the digital realm of symbolic representation and rule-based thinking.

This rule-based paradigm dominated work in artificial intelligence until after the fall of the Soviet Union. It considered intelligence to be replicable by the rule-based manipulation of symbols, which was also the technical basis of digital computing at that time. Newell and Simon best summarised the major axiom of this view, naming it the Physical Symbol System Hypothesis: 'A physical symbol system has the necessary and sufficient means for general intelligent action' 4. This move coincided with a similar trend in psychology towards cognitive science, a symbolic and representational understanding of the human mind.

This form of artificial intelligence, now called Good Old-Fashioned Artificial Intelligence (GOFAI), was quite successful in solving symbolic problems for which a system had been especially designed – such as the famous chess match between Deep Blue and Garry Kasparov. However, these systems of symbolic representation found it difficult to work in a real, physical environment, where signals are noisy and situations largely unpredictable. The American roboticist Rodney Brooks became one of the strongest critics of GOFAI, most famously in his 1990 paper 'Elephants Don't Play Chess'; instead, he proposed a system where 'physical interaction with the environment [is] the primary source of constraint on the design of intelligent systems' 5.
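To make the symbolic paradigm concrete, the short sketch below (in Python, and not taken from the article) shows what 'rule-based manipulation of symbols' looks like in its simplest possible form: knowledge is stored as symbolic facts, and 'reasoning' is nothing more than the repeated application of hand-written if-then rules. The facts and the single rule here are invented for illustration.

# A toy GOFAI-style production system: knowledge is a set of symbolic facts,
# and reasoning is the repeated application of hand-written if-then rules.
# Illustrative sketch only; the facts and the rule are invented.

facts = {("socrates", "is_a", "human")}

# One rule: if ?x is_a human, then ?x is_a mortal.
rules = [
    (("?x", "is_a", "human"), ("?x", "is_a", "mortal")),
]

def forward_chain(facts, rules):
    """Apply every rule repeatedly until no new fact can be derived."""
    changed = True
    while changed:
        changed = False
        for (_, p, o), (_, new_p, new_o) in rules:
            for (subj, fact_p, fact_o) in list(facts):
                if fact_p == p and fact_o == o:      # the pattern matches
                    derived = (subj, new_p, new_o)   # bind ?x to the subject
                    if derived not in facts:
                        facts.add(derived)
                        changed = True
    return facts

print(forward_chain(facts, rules))
# {('socrates', 'is_a', 'human'), ('socrates', 'is_a', 'mortal')}

A system like Deep Blue was of course vastly more sophisticated, but the underlying commitment is the same: intelligence as the manipulation of explicitly encoded symbols.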

This emergent, situated understanding of intelligence has been the state of the art since the early 1990s – even in systems that have nothing to do with the physical world (such as stock-market trading). Intelligent systems are no longer encoded with specific rules, but are simply provided with large datasets and methods of pseudo-symbolic quantifiable encoding. Machine Learning 6, now a more common term than 'artificial intelligence', is a statistical process of linking feature vectors (a series of numbers that encode something about a situation) to labels (manual annotations about the situation, which we would later like to predict). In the simplest case, one might learn the relationship between wind speed (a feature vector) and rainfall, and so learn to predict the probability of future rain based on wind measurements. Similar systems are used for very complex tasks, with a considerable level of success in linking simple calculable properties to complex high-level representations. A classic computer-vision task, for instance, is learning to classify images (a cat, a human, a car) from very simple feature vectors based on colour or gradient (see Figure 1 above 7) – which state-of-the-art systems can do with near-perfect accuracy, an unthinkable task for the explicit symbolic manipulation of GOFAI. An important development in artificial intelligence, enabled by a rapid increase in the speed of computation available to researchers, is deep learning.
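The wind-and-rain example can be written out in a few lines of code. The sketch below is mine rather than the article's; the measurements are invented and the widely used scikit-learn library is just one possible choice. The point is only that the 'learning' consists of fitting a statistical link between feature vectors and labels.

# Minimal machine-learning sketch: predicting rain from wind speed.
# The data are invented; this is an illustration, not a weather model.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Feature vectors: wind speed in km/h, one number per past observation.
wind_speed = np.array([[5], [12], [20], [28], [35], [42]])
# Labels: did it rain afterwards? (0 = no, 1 = yes)
rained = np.array([0, 0, 0, 1, 1, 1])

model = LogisticRegression()
model.fit(wind_speed, rained)  # 'learning' = fitting the statistical link

# Predict the probability of rain for a new wind measurement of 30 km/h.
print(model.predict_proba([[30]])[0, 1])

Image classification works in exactly the same way, only with feature vectors of thousands of numbers (such as the gradient encoding of Figure 1) rather than one.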


Leo Impett is completing his Master's degree in machine learning at the Engineering Department of the University of Cambridge, and will shortly start a doctoral fellowship in computer science at the École Polytechnique Fédérale de Lausanne, Switzerland. His research interests lie at the intersection of machine learning and human-computer interaction, focusing particularly on musical interaction, performance and the digital humanities.

Figure 1: an image detected as a human (left) and a gradient-based feature vector (right); adapted from Dalal et al.

Lev Vygotsky

Deep learning systems are based on large networks of artificial neurons, designed to emulate the most basic functions of biological neural intelligence. They are closely related to classical machine learning, with one important distinction: under deep learning, it is frequently not necessary to encode a feature vector explicitly. High-level symbolic or representational structure is learned from raw data: in the example in Figure 1, we would learn to detect a human in an image based only on the raw pixel values (which, in the digital realm, are the image itself), without explicitly specifying a lower-dimensional representation of colour gradients. In late 2013, DeepMind (now owned by Google) demonstrated the generalisable power of this approach by teaching deep neural networks to play a range of Atari videogames based only on the pixel values of the computer screen – applying a single generic model design to seven different games, and outperforming all previous approaches on six of them 8.

The lack of an explicit feature representation poses an ethical problem in many applications of artificial intelligence. In industry, computer-vision techniques are frequently used for intelligent security and surveillance camera systems. Where classical machine learning would rely on quite abstract, hand-crafted feature encodings (such as Figure 1, right), deep learning systems take into account the totality of the information in an image. Indeed, it is technically very difficult – in some cases impossible – to have a deep learning system actively ignore any information it is fed. Why does this pose an ethical problem? Because there is information present, in a statistical sense, which we are ethically bound not to consider. In the US, a black male is 4.9 times more likely to be arrested than the population average. We might consider a situation where we feed such a deep learning surveillance system a 'training dataset': a few hundred CCTV videos of arrested people, and a few thousand of non-arrested people. Such a deep learning system, designed to flag suspicious activity, would – by simple Bayesian inference – sound a false alarm 4.9 times more often for black men than for the average person.
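The arithmetic behind this last claim can be spelled out. The sketch below is an illustration under deliberately crude assumptions – a system that has learned nothing from its skewed training data except the base rates themselves, with an invented base rate of 1 per cent – and not a description of any real surveillance product; only the 4.9 ratio comes from the text above.

# Illustrative sketch of the base-rate problem described above.
# Assumption: the trained system's alarm probability simply tracks the
# arrest rate it inferred from its skewed training data.
# All numbers are invented except the 4.9 ratio quoted in the article.

p_arrest_average = 0.01                        # hypothetical population base rate
p_arrest_black_men = 4.9 * p_arrest_average    # ratio quoted in the text

# A naive system alarms in proportion to the inferred arrest probability
# of the group it is looking at:
p_alarm_average = p_arrest_average
p_alarm_black_men = p_arrest_black_men

print(p_alarm_black_men / p_alarm_average)     # -> 4.9: the bias in the data
                                               #    is reproduced as biased alarms

The point is not the particular numbers, but that nothing in the learning procedure itself distinguishes a statistically predictive signal from an ethically inadmissible one.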


