
Spanish version published in Cuadernos de Logística n. 41 December 2017

Artificial vision. Concept. Technology. Applications. Benefits.

A collaboration between Javier Martínez García, CEO and founder of Odyssey Robotics, which develops and manufactures guidance systems for autonomous vehicles, Mechanical Engineer from UMH (www.odysseyrobotics.com), and Cristina Peña Andrés, Senior Director at Coxgomyl, business school teacher, international trade and supply chain expert, Senior Industrial Engineer from UPM (www.cristinapenaandres.com). Translation into English by Irene Robles, translator and science fiction author (www.ireneroblesscifi.com).

When?

There is no doubt that the last two decades have brought great technological development, although most of the advances have been linked to the world of communications. The way human beings keep in contact regardless of distance and location has changed for good, and the ways we socialize, work, learn and enjoy leisure have been revolutionized. However, there has not yet been a truly global and consistent change in industry, transportation or daily life.


Fortunately, there are pilot projects that anticipate how the world will definitively change over the next five years. The science fiction that excited us decades ago is possible today. Those futuristic stories of new societies, self-guided precision machines and new ways of transporting people and goods are already on the runway. Artificial vision is going to be one of the most impressive technological changes we will live through, with an impact on industry, transport, construction and home automation.

Precedents

Even before the industrialization of handicrafts (a transformation that nowadays takes the form of mass production), a great many hours were invested in inventing methods to reduce the complexity of the processes that result in a product. This was accomplished by applying simplifications, parameterizing operations and creating indicators which ensured that the ability to repeat a result did not depend on something as variable as human skill. Throughout this time, which covers most of human history, the direction has always been the same: to make it easier to achieve something difficult. As an example:

A craftsman is a person who, over a period that amounts to a whole life, brings together knowledge of various disciplines and who, through his own deductions and years of practice, is able to create something closer to art than to a product. The need to produce more, faster and cheaper only accelerates this process of evolution. The end of this road is total automation. Total automation, not only of production processes but of any process that directly or indirectly affects daily life, is something humanity is heading towards at speed. Among all the technologies that make up this path, artificial vision holds a dominant position.

Why artificial vision?

If the most versatile morphology for a robot is the anthropomorphic one, because everything around us has been designed and built for humans, then the most powerful general-purpose sensor is eyesight. And, as with humanoid robots, artificial vision is a technique that involves a great deal of complexity.


What is artificial vision?

Artificial vision is a process in which, starting from an input (which can be a number, a mark or a symbol), a previously programmed action is executed by a computer as the output.

INPUT: DATA -> Computer processing -> OUTPUT: ACTION

The input is therefore multidimensional data (an image, for example, is a set of numerical data) which, with the help of disciplines such as geometry or physics, makes it possible to process the information captured by vision sensors much as the human eye would.
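As a minimal sketch of this chain from input data to output action, the Python example below treats a tiny grayscale image as a grid of numbers and maps it to an action. The function names, the 0 to 255 brightness scale, the threshold of 128 and the action labels are illustrative assumptions, not part of any specific vision system.

```python
# Hypothetical example: an "image" is just a grid of numbers
# (assumed scale: 0 = black, 255 = white).
IMAGE = [
    [10,  20,  15, 12],
    [18, 200, 210, 14],
    [16, 205, 220, 11],
    [13,  17,  19, 10],
]


def mean_brightness(image):
    """Processing step: reduce the grid of numbers to a single value."""
    values = [pixel for row in image for pixel in row]
    return sum(values) / len(values)


def decide_action(image, threshold=128):
    """Output step: turn the processed data into a pre-programmed action.

    The threshold is an assumed value chosen only for illustration.
    """
    if mean_brightness(image) >= threshold:
        return "LIGHT_OFF"  # scene is already bright
    return "LIGHT_ON"       # scene is dark, so switch the light on


print(mean_brightness(IMAGE))  # 63.125
print(decide_action(IMAGE))    # LIGHT_ON
```

The same three stages (data in, computation, action out) apply whatever the processing step is; here it is deliberately reduced to an average so that the structure stays visible.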

What is a sensor?

A sensor is a device capable of detecting physical or chemical magnitudes such as light intensity, temperature, distance, acceleration, inclination, pressure, force or humidity, and transforming them into electrical variables such as current or voltage. The sensor can be connected to a computer with access to a database.

Vision sensors

Although they have been under development since before the 1980s, today's vision sensors still present great challenges. A simple way to understand how a vision sensor works is to start from the proximity sensor. A proximity sensor can offer two or three kinds of information: near, far and, in some sensors, somewhere in between. This information is processed directly. The result is all or nothing or, in some sensors, a percentage. For example:

It can be determined whether a liquid is below a set level, which would automatically mean that the tank is emptying, so that action can be taken quickly.

Imagine a matrix of 40 x 40 sensors forming a surface on which an open hand is placed. The output of this set of sensors would be something like this image:

[Image: output of the 40 x 40 sensor matrix with an open hand placed on it]


The dark area corresponds to the activated sensors, on which the hand rests. All other sensors are inactive (white zone). It is not difficult for our brain to classify the image and give us a detailed description: it is a human hand, it has five fingers, it is open... However, all we have after measuring with our sensors is a soup of numbers with activation states, from which, through algorithms and mathematical methods, we have to work out whether it is actually a hand or something else. For example, suppose the objective is to light a bulb every time a hand is placed on the sensor surface, with the restriction that it must not light up in the presence of other objects. This is a non-trivial problem, because the data only reflect activation and position; we have to go further and create a specific association between a hand and the map of activated and deactivated sensors that represents it.

Although it depends on the specific goal, even a small image, such as 320 x 240 pixels, is equivalent to 76,800 proximity sensors. A video is nothing more than a sequence of images representing motion, so we work with 76,800 sensors per frame, and at a frame rate of 25 frames per second the system must read, relate and draw conclusions from the status of nearly two million sensors (1,920,000) every second. This explains the heavy computational requirement in terms of data volume, and why only in recent years have we seen advances in this field, thanks to the increased capacity of the equipment.
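The two points above can be sketched in a few lines of Python. The shapes below are crude, made-up rectangles standing in for a hand and for some other object, and the "enough active sensors" rule with its threshold of 100 is an assumed naive strategy, included only to show why activation data alone does not tell a hand from anything else. The last function reproduces the data-volume arithmetic from the text.

```python
GRID_SIZE = 40  # the 40 x 40 sensor matrix from the text


def empty_grid(size=GRID_SIZE):
    return [[0] * size for _ in range(size)]


def fill_rect(grid, top, left, height, width):
    """Mark a rectangular block of sensors as activated (1)."""
    for r in range(top, top + height):
        for c in range(left, left + width):
            grid[r][c] = 1


def hand_like_grid():
    """Crude, made-up 'open hand': a palm plus five finger strips."""
    grid = empty_grid()
    fill_rect(grid, top=20, left=10, height=15, width=20)  # palm
    for left in (10, 14, 18, 22, 26):                      # five fingers
        fill_rect(grid, top=10, left=left, height=10, width=3)
    return grid


def box_grid():
    """Made-up 'other object': a plain rectangle."""
    grid = empty_grid()
    fill_rect(grid, top=10, left=5, height=18, width=25)
    return grid


def active_count(grid):
    return sum(sum(row) for row in grid)


def naive_trigger(grid, min_active=100):
    """Assumed naive rule: light the bulb if enough sensors are active."""
    return active_count(grid) >= min_active


def sensor_readings_per_second(width, height, fps):
    """Each pixel of each frame counts as one sensor reading."""
    return width * height * fps


hand, box = hand_like_grid(), box_grid()
print(active_count(hand), active_count(box))     # 450 450
print(naive_trigger(hand), naive_trigger(box))   # True True
print(320 * 240)                                 # 76800
print(sensor_readings_per_second(320, 240, 25))  # 1920000
```

Both shapes activate the same number of sensors, so the naive rule lights the bulb for the box as well as for the hand. Telling them apart requires looking at the spatial pattern of activations, which is exactly the association the text describes.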

The data

In short, the data are important, but so is defining the minimum quantity and quality of data whose processing gives the expected response. Would a blurry image be a valid input? What happens then in an artificial vision process?


If rainy weather prevents a driver from seeing the traffic signs, he receives insufficient data through his eyes to produce correct driving as an output. With artificial vision we face conceptually the same situation, but the difference lies in the mapping or reconstruction of a scene. A computer's ability to complete scenes goes beyond human capacity, which is usually affected by emotion or nerves, or by plain sensory deficiencies that prevent an optimal input.


Current image processing and applications

Today, the algorithms and techniques used in vision analysis can detect hands and other limbs with good precision and, with the latest advances, can recognize faces and emotions and even estimate the age of the person they are looking at. Other functions include the 3D scanning of objects and the guidance of vehicles, applications in which vision, despite still being in full development, already clearly surpasses traditional sensors. With a more industrial approach, artificial vision can recognize objects, control times, detect anomalies or defects in parts, and inspect and control industrial processes such as assembly, painting or the finished batches themselves, whether packed or boxed. In addition, it can help to predict future events, which enables preventive maintenance, and to track past events, hence its value for logistics traceability.

Technology behind artificial vision

The key to the most impressive capabilities of artificial vision in recent times is machine learning. Machine learning is exactly what the words say: machines that learn. Determining conceptually which spatial combination of triggered sensors is generated by an open hand is such an extraordinarily complex task that the problem is tackled from another point of view: an algorithm is shown an enormous number of open hands so that the concept "open hand" emerges as a group of numbers in a table. These numbers, meaningless to us, contain the best abstraction of the concept "open hand" that the machine has been able to generate. In this way, the machine can observe an image, compare it with the abstraction and tell us whether it is a hand, what gesture it is making or even whether it belongs to a child or an adult.

Applied to any field that involves processing large volumes of data under certain conditions, this technique means that in many applications machines achieve superhuman capabilities.
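The idea of a concept emerging "as a group of numbers in a table" can be illustrated with a deliberately simple technique, a nearest-prototype classifier, which is far more basic than the methods behind current vision systems. In the Python sketch below, the shapes, the 5% sensor noise, the 50 training examples per class and the labels are all assumptions made for illustration.

```python
import random

SIZE = 40  # the 40 x 40 sensor matrix from the text


def shape_grid(rects, size=SIZE):
    """Flat list of 0/1 activations built from (top, left, height, width) boxes."""
    grid = [0] * (size * size)
    for top, left, height, width in rects:
        for r in range(top, top + height):
            for c in range(left, left + width):
                grid[r * size + c] = 1
    return grid


# Made-up shapes: a palm with five finger strips, and a plain rectangle.
HAND_RECTS = [(20, 10, 15, 20)] + [(10, left, 10, 3) for left in (10, 14, 18, 22, 26)]
BOX_RECTS = [(10, 5, 18, 25)]


def noisy_copy(grid, rng, flip_prob=0.05):
    """Simulate an imperfect reading: each sensor flips with a small probability."""
    return [1 - v if rng.random() < flip_prob else v for v in grid]


def learn_prototype(examples):
    """Average many examples cell by cell: the 'table of numbers' for a concept."""
    n = len(examples)
    return [sum(cells) / n for cells in zip(*examples)]


def distance(sample, prototype):
    """Squared Euclidean distance between a reading and a prototype."""
    return sum((s - p) ** 2 for s, p in zip(sample, prototype))


def classify(sample, prototypes):
    """Return the label whose prototype is closest to the sample."""
    return min(prototypes, key=lambda label: distance(sample, prototypes[label]))


rng = random.Random(0)  # fixed seed so the sketch is repeatable
hand, box = shape_grid(HAND_RECTS), shape_grid(BOX_RECTS)
prototypes = {
    "open hand": learn_prototype([noisy_copy(hand, rng) for _ in range(50)]),
    "other object": learn_prototype([noisy_copy(box, rng) for _ in range(50)]),
}

new_reading = noisy_copy(hand, rng)  # a reading the algorithm has not seen
print(classify(new_reading, prototypes))
```

Each prototype is a list of 1,600 numbers between 0 and 1 that means little to a human reader, yet comparing a new reading against it is enough to separate these two toy shapes. Real systems learn far richer abstractions from far more varied examples, but the principle of learning from examples rather than hand-writing rules is the same.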


Benefits of artificial vision

The benefits of artificial vision that are within society's reach can be divided into 7 blocks:









1. SAFETY: Some tasks put a person's physical integrity at risk and could be carried out by machines, reducing the danger to which people are exposed and lowering the accident rate in certain activities.

2. QUALITY: In general, a machine is more precise than a human and can perform finer, more careful work with a qualitatively better result.

3. DISALIENATION: Repetitive and heavy tasks that do nothing to develop the human being as a social and intelligent being can be taken over by machines, allowing people to focus on their creative and managerial side.

4. EFFICIENCY: Some tasks are performed much more efficiently by machines and incur lower overall costs, which can make them more sustainable and more respectful of energy sources and the environment.

5. FLEXIBILITY: Product customization, traditionally linked to handmade or small-scale production, entailed higher costs than the mass production brought about by the third industrial revolution. Artificial vision, however, can enable individualization at reduced cost, making it possible to adapt supply to demand.

6. CONTROL: Some checks escape human detection because of the degree of precision required, because a malfunction is not evident at first sight, or because wear is hidden. In these cases, machines take as input data containing far more precise measurements than a person can collect, so processing this higher-quality data gives a much more rigorous and consistent output that reflects the actual situation.

7. ADJUSTMENT: Machines are connected via the internet and, besides checking whether an activity is correct, they can now contribute information about their surroundings, so there is "return communication in real time". Based on the returned data, this environment can be regulated. An example of this is Ethernet communication.

Conclusions

I. Are we ready for the coming social change? Do we understand how, for example, 3D cameras collect data that are processed and how, based on them, decisions are made in real time, acting on newly controlled scenarios, regulating them and configuring them according to a previously established schedule? Are we aware that the transport, manufacture, construction, treatment and handling of objects, and the way we understand nature and its morphology, color or position, will change in a matter of a few years? Do we know that artificial vision does not require sensors that tag predetermined positions, which makes it much more versatile, and that behind it there are algorithms programmed to self-guide in the same way that humans do now? Surely we are beginning to accept it as a society, thanks to bar codes and readers, autonomous vehicles (AGVs) and other machines that coexist with us day to day, almost without our noticing. In any case, it will be the degree of well-being achieved (comfort, ergonomics, health), together with the way we connect to machines (the sensations they produce in us and the experience they give us), that sets the guidelines for the accepted limit of coexistence and the degree to which decision making is delegated to pre-programmed algorithms.

II. Machine learning is already a subject of discussion and controversy, since the idea of "God in a Box" is more than a passing hype (see Singularity University). The development of this type of technique and the continuous search for new applications are starting to raise fear among those who see an unanswerable argument: it is not rational to invest human time in a task that a machine can do better in every respect that affects production. This is the future of artificial intelligence, which generates science fiction dreams, fears and ideas of rebellion. A future that many describe as apocalyptic, simplified to the concept "man versus machine". It must therefore be remembered that, regardless of their intelligence, machines will always be a work made in the image and likeness of the man or society that created them and that, whatever direction the future takes, they will only be a lever for our own purposes.

© Cuadernos de Logística 2017

