rather than the bench or the door, and you will spend more time scanning these locations based on your reasonable expectations and prior knowledge about birds. Sometimes, you can learn new information about the characteristics of a target item and use that information to find the target faster. Returning to the airport example, if your friend informed you that he would be wearing a blue shirt, you could focus on people wearing blue shirts and skip scanning those who are not in blue, facilitating the search and terminating it faster (Figure 5C).

Figure 5: Visual Search Application in a Natural Scene. An example of visual search in a naturalistic scene. Our knowledge and expectations about the world guide our eye movements (indicated by the colored circles on the image), which reflect our optimal strategy for effectively identifying a target.

In addition to prior knowledge and expectations, various other factors can effectively guide top-down processing during visual search; these are summarized in Figure 6. Compared with existing AI systems, the human brain appears to be much better at utilizing and combining factors associated with top-down processing, even before most of the objects and details within a scene are recognized and consciously processed [4]. Thus, top-down processing represents a domain in which machines, such as AI and computer vision systems, cannot yet match human performance. How can a computer utilize its past experience, maximize its motivation, and rely on its long-term memory of prior knowledge and expectations about the world? Addressing this question will help scientists and engineers bridge the gap between the human brain and machines during visual search in complex, natural scenes.

Over the past 40 years, a great deal of research on visual search has been conducted, giving us a good grasp of the mechanisms that underlie bottom-up and top-down processing and that allow human observers to locate a target item in complex scenes, whether it is your old friend at the airport, a bird in a forest, or Waldo in your book. The remarkable search abilities of the human brain are the result of attentional guidance mechanisms that rely on combinations of bottom-up and top-down factors [8].
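To make this combination concrete, here is a minimal sketch (our own illustration, not a model from the cited papers) of how a "priority map" might be computed as a weighted sum of a bottom-up saliency map and a top-down target-similarity map. The contrast-based saliency is a crude stand-in for the multi-scale center-surround filtering used in real saliency models such as Itti and Koch's [7], and the weights w_bottom_up and w_top_down are illustrative assumptions.

```python
import numpy as np

def bottom_up_saliency(image):
    """Crude bottom-up conspicuity: how much each pixel deviates from
    the mean intensity of the scene. A stand-in for the multi-scale
    center-surround filters of real saliency models [7]."""
    s = np.abs(image - image.mean())
    return s / (s.max() + 1e-9)  # normalize to [0, 1]

def top_down_guidance(image, target_value):
    """Top-down map: similarity of each pixel to a known target feature
    (a single intensity value standing in for 'a blue shirt')."""
    return np.clip(1.0 - np.abs(image - target_value), 0.0, 1.0)

def priority_map(image, target_value, w_bottom_up=0.4, w_top_down=0.6):
    """Weighted combination of bottom-up and top-down maps [8].
    How the brain sets these weights is an open question; the values
    here are purely illustrative."""
    return (w_bottom_up * bottom_up_saliency(image)
            + w_top_down * top_down_guidance(image, target_value))

# Attention is deployed first to the peak of the priority map.
rng = np.random.default_rng(seed=0)
scene = rng.random((64, 64))  # toy "image" of random intensities
pmap = priority_map(scene, target_value=0.9)
first_fixation = np.unravel_index(np.argmax(pmap), pmap.shape)
print("first fixation lands at pixel", first_fixation)
```

Raising w_top_down mimics a searcher who knows the target's features; setting it to zero leaves a purely stimulus-driven, bottom-up searcher.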

Figure 6: Top-down Examples and References. Some examples of top-down factors, along with relevant papers in which each example is discussed.

Although computer vision and AI systems have achieved, and even surpassed, the bottom-up processing abilities of humans [7], top-down factors, such as semantic knowledge, prior information based on past experiences, and common sense about the external environment, together with motivational and emotional factors, are what allow humans to outperform machines at this important task. The visual search field is only beginning to understand how bottom-up and top-down processing interact in real time and how this interaction can be effectively implemented to improve search performance in computer vision and AI systems and other practical applications. Future challenges for the field include understanding how the principles of attentional guidance driven by bottom-up and top-down processing can be extended from two-dimensional images to immersive, dynamic, three-dimensional environments, and how these processes can be applied in settings that demand outstanding visual search abilities, such as surveillance, security cameras, self-driving cars, and robots.
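As a toy illustration of the performance gap that top-down knowledge creates (a sketch of our own, not an implementation from the literature), the following simulation revisits the blue-shirt example: a searcher who knows the friend's shirt color can reject non-matching people without fully inspecting them, shrinking the effective set size and ending the search sooner.

```python
import random

def search(crowd, target, known_color=None):
    """Scan people one by one; return how many were fully inspected.
    If the target's shirt color is known (top-down knowledge), anyone
    whose shirt does not match is rejected without inspection."""
    inspected = 0
    for person in crowd:
        if known_color is not None and person["shirt"] != known_color:
            continue  # top-down guidance: skipped without inspection
        inspected += 1
        if person is target:
            return inspected
    return inspected

random.seed(42)
colors = ["blue", "red", "green", "white"]
crowd = [{"shirt": random.choice(colors)} for _ in range(200)]
friend = {"shirt": "blue"}
crowd.insert(random.randrange(len(crowd)), friend)

print("people inspected, no guidance:  ", search(crowd, friend))
print("people inspected, 'blue shirt': ", search(crowd, friend, known_color="blue"))
```

With four equally likely shirt colors, the guided search inspects roughly a quarter as many people before finding the target, mirroring how knowing a target feature shortens search times in human observers.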

References

[1] Wolfe JM. What can 1 million trials tell us about visual search? Psychological Science. 1998;9(1):33-39. doi:10.1111/1467-9280.00006
[2] Schweizer K. Visual search, reaction time, and cognitive ability. Perceptual and Motor Skills. 1998;86(1):79-84.
[3] Rayner K. Eye movements and attention in reading, scene perception, and visual search. The Quarterly Journal of Experimental Psychology. 2009;62:1457-1506.
[4] Wu CC, Wolfe JM. Eye movements in medical image perception: A selective review of past, present and future. Vision. 2019;3:32.
[5] Luck SJ, Hillyard SA. Spatial filtering during visual search: Evidence from human electrophysiology. Journal of Experimental Psychology: Human Perception and Performance. 1994;20(5):1000-1014. doi:10.1037/0096-1523.20.5.1000
[6] Sobel KV, Gerrie MP, Poole BJ, Kane MJ. Individual differences in working memory capacity and visual search: The roles of top-down and bottom-up processing. Psychonomic Bulletin & Review. 2007;14(5):840-845.
[7] Itti L, Koch C. A saliency-based search mechanism for overt and covert shifts of visual attention. Vision Research. 2000;40:1489-1506.
[8] van Zoest W, Donk M. Bottom-up and top-down control in visual search. Perception. 2004;33(8):927-937.
