Visual Motion Perception


In ordinary cameras a shutter "freezes" the image; even in a television camera, which has no shutter, the scanning raster of an electron beam serves the same purpose. In all animals, however, the eye operates without a shutter. Why, then, is the world we see through our eyes not a blur?


works not like

sports testify to man's remarkable ability to visually determine the precise spatiotemporal position of a small fast-moving object. The traditional comparison of the eye and the camera serves the useful didactic purpose of explaining how light rays are focused to produce a two-dimensional image on the surface of the retina. Difficulties arise, however, when the photoreceptors embedded in the retina are likened to a photographic film. Unless one deliberately wants to get a blurred image on the film, it must be exposed to incident light rays for only a brief period, just enough for the photosensitive chemicals in the film to "capture" the image. Although it is true that the retinal receptors have a similar ability to capture photons, their real function is not to capture images but to mediate changes in light flux. The light impinging on the receptors (the rods and the cones) gives rise to a continuous change in the structure of photosensitive molecules. The change in structure releases a flow of ions in the receptor, culminating in a bioelectric signal that travels from the receptor into adjacent nerve cells. The strength of the signal varies with changes in the light flux. Within a few milliseconds the myriad changes in signal pattern over the entire retina are combined and

In many lower animals the efficient perception of moving objects see

y is moving. A motionle hin easy reach, goes quite Evidence for a similar dep anges in the visual stim periments where a special device holds

but mathematical

networks at relay stations in the brain and finally by the neural networks within a number of receiving terminals in the cerebral cortex. The result at the conscious level is the perception of motion in visual space. Thus the eye is basically an instrument for analyzing changes in light flux over time rather than an instrument for recording static patterns. Roughly speaking, without a change in the light striking the receptors there would be no change in ion flow and no neural response.

In studies of visual perception it is often important to distinguish between monocular and binocular vision. For the range of phenomena I shall take up here, however, the contribution made by binocular perception can be ignored. In our laboratory at the University of Uppsala my colleagues and I have contrived a variety of experiments to examine how the eye deals with moving visual stimuli. We include under this heading the motion of stationary objects perceived by a moving observer as well as the motion of moving objects perceived by a stationary observer. As an illustration, consider using a camera to make a picture of a friend. You look through the viewfinder and customarily take a few steps forward or backward until you have the subject properly framed and the image e x p d

wrists, hips, knees and ankles. This sequence, which proceeds in vertical columns starting the film can tell in a fraction of a second that they are seeing the movements of two people,


you move toward the subject every optical element in the viewfinder streams radially outward from a central point. Conversely, when you step backward, the image contracts radially toward the center. If you are a careful photographer, you probably also check the effects of moving the camera up and down and from side to side. Such movements generate optical flows considerably more complex than the radial flow produced by moving directly toward the subject. All such changes in the viewfinder, however, follow the laws of central perspective. They are continuous perspective transformations. The optical flow of images into the viewfinder of a camera (or into the camera itself when the lens is open) corresponds to the optical flow impinging on the retina during locomotion. From a geometrical point of view it does not matter whether it is the camera that moves or the object in front of it. It would be trivial to say that asking your friend to take a step toward you has the same effect on the size of his image as your moving a step toward him. It is significant, however, that in the first case the image of the surrounding environment is fixed and in the second the image of the environment expands from the optical center. Hence, when objects move in our environment, they give rise to local flow patterns; when we move around in the environment, there is an optical flow across the entire retinal surface. In everyday perception the optical flow represents a complex combination of patterns generated by the observer's own motion and patterns generated by the motion of moving objects. Even when the observer is simply standing still or sitting, the sway of his body or small movements of his head add a small "locomotion component" to the flow of the retinal images. Movements of the eye itself introduce a further component into the total flow; the movement can be smooth, as when an observer follows the flight of a ball, or jerky, as when your eye follows these words by a number of saccadic eye movements. The summation of all such optical flows over the retina determines the character of the incessant flow of nerve impulses from the retinal receptors. In order to study the visual information supplied by a light-reflec
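The radial character of the flow produced by stepping straight toward a subject can be checked numerically. What follows is only a minimal sketch in Python, not anything taken from the experiments: it assumes an idealized pinhole camera with focal length `f`, and the scene points, the step size and the function names `project` and `flow_for_forward_step` are all invented for the illustration.

```python
# Central (pinhole) projection: a 3-D point (X, Y, Z) lands at (f*X/Z, f*Y/Z).
def project(point, f=1.0):
    x, y, z = point
    return (f * x / z, f * y / z)


def flow_for_forward_step(points, step, f=1.0):
    """Return (image position, image displacement) for each point when the
    camera advances `step` along its optical axis (negative = step backward)."""
    flows = []
    for (x, y, z) in points:
        before = project((x, y, z), f)
        after = project((x, y, z - step), f)  # advancing shortens every depth
        flows.append((before, (after[0] - before[0], after[1] - before[1])))
    return flows


# Hypothetical scene points (X, Y, Z), all in front of the camera.
scene = [(1.0, 0.5, 4.0), (-2.0, 1.0, 6.0), (0.5, -1.5, 3.0)]
flows = flow_for_forward_step(scene, step=0.5)

for position, flow in flows:
    # A zero 2-D cross product means the displacement lies along the line
    # through the image center; a positive dot product means it points outward.
    cross = position[0] * flow[1] - position[1] * flow[0]
    dot = position[0] * flow[0] + position[1] * flow[1]
    print(position, flow, abs(cross) < 1e-12, dot > 0)
```

Passing a negative `step` reverses every displacement, which corresponds to the radial contraction seen when stepping backward.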

tion of visual s long ago as Bishop Berkeley). The theory was further developed by Hermann von Helmholtz in the 19th century and is still familiar today in a modified version known as cue theory. According to this theory, the two-dimensional image on the retina is visually interpreted as being three-dimensional by a number of cues, or signs. The cues are available not only in the image itself but also in the activity of the oculomotor apparatus. The cues include binocular disparity in the images received by the two eyes, convergence and accommodation ... image, interposition of figures, binocular perspective and so on. The theory also invokes visual-motor experience and learning as important factors.

Berkeley knew only Euclidean geometry (the discovery of other geometries was still in the future), and as a result he began his study of the relation between a stimulus and a percept by analyzing the retinal image as if it could be adequately measured with a ruler and a protractor. Even today many excellent theorists stay within the tradition of measuring optical projections in millimeters and degrees of arc. This approach has given rise to many artificial problems, such as trying to explain how retinal images of different sizes and forms can evoke perception of the same object.

New geometries that have come into existence since Berkeley's day are free of the Euclidean parallel axiom, which leads to the postulate that parallel lines do not meet. One of the geometries that is not fettered by the parallel axiom is projective geometry. That geometry is of special interest for the study of vision because it is the geometry of optical paths through pinholes and lenses and provides the theoretical basis for perspective drawing. It is characterized as being a nonmetric geometry because it deals exclusively with relations, not particular measurements. The first application of these principles to the theoretical analysis of visual space perception was made by J. J. Gibson of Cornell


sual stimulus. Math-

what Gibson termed "higher-order variables," is the effective stimulus. The gradients and variables are essentially consequences of central projection. Gibson also applied these principles to moving patterns, speaking of stimulus flow rather than stimulus images. My own thinking closely follows Gibson's. Experimental work over the past two decades has led me to break completely with the Euclidean model and to adopt projective relations as the theoretical foundation for investigations of visual space and motion. In retrospect it seems strange tha should have been hypothesized, as it wa

that projective geometry is a geometry dealing with certain relations that remain invariant under perspective transformation. These invariances serve as a counterpart in terms of figural equivalence for the Euclidean figural congruence under the conditions of rigid motion. Mathematicians have also developed a special system of coordinates (homogeneous coordinates) that are determined by distance relations rather than by absolute distances and that make



ble as the same object seen at different angles [see upper illustration on preceding page]. The forms in the pictures are equivalent because of certain invariant relations, although from a Euclidean point of view they are all different.
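One standard textbook example of a relation that survives perspective transformation, although it is not named in the text, is the cross-ratio of four points on a line. The sketch below is only an illustration under stated assumptions: the four positions and the coefficients of the projective map are arbitrary, and the function names `projective_map` and `cross_ratio` are invented here. It shows that the spacings between the points (the Euclidean measurements) change under the map while the cross-ratio does not.

```python
from fractions import Fraction


def projective_map(t, a, b, c, d):
    """Send the point with homogeneous coordinates (t, 1) to (a*t + b, c*t + d)
    and return the ordinary coordinate of the result."""
    return Fraction(a * t + b, c * t + d)


def cross_ratio(t1, t2, t3, t4):
    """Cross-ratio of four collinear points given by their coordinates."""
    return Fraction((t3 - t1) * (t4 - t2), (t3 - t2) * (t4 - t1))


points = [0, 1, 3, 7]  # arbitrary positions along a line
# Arbitrary map; it is a valid projective map because a*d - b*c = 7 is nonzero.
images = [projective_map(t, 2, 1, 1, 4) for t in points]

spacing_before = [q - p for p, q in zip(points, points[1:])]
spacing_after = [q - p for p, q in zip(images, images[1:])]

print("spacings:", spacing_before, "->", spacing_after)          # these differ
print("cross-ratio:", cross_ratio(*points), "->", cross_ratio(*images))  # these agree
```

Exact fractions are used instead of floating-point numbers so that the equality of the two cross-ratios is not blurred by rounding.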

From recent studies of motion per-

moving in three-dimensional space. Indeed, it has been found that continuous perspective transformations always evoke the perception of moving objects with a constant size and shape. This means that the particular projection chosen perceptually by the visual system is one that represents Euclidean invariance under the conditions of motion in

cordance with s from daily life is perhaps the best way to make this rather abstract statement easier to grasp. My little granddaughter runs across the floor of my study to show me a ladybug walking on her finger. The optical flow produced in my eyes by this scene includes the following components: the light reflected from the floor, the walls and the


THREE-DIMENSIONAL FIGURE seems to be traced by an imaginary rod that is created when two spots of light move at constant speed on the opposite sides of a rectangular path. The built-in tendency for the visual system to perceive the moving spots as being connected to each other and forming a rigid structure leads to the perception of a rod that is rotating around a central point in a jerky manner, executing a strange three-dimensional motion the observer quite probably has never seen before.



I quite clearly this way as having a common frame of reference. Perceptually I experience the room as being static, the child as running across the floor, the child's hand and arm as moving relative to her body, the child's finger as moving relative to her hand and the ladybug as moving relative to the child's finger. Thus my visual system abstracts a hierarchical series of moving frames of reference and motions relative to each of them. The perceptual analysis of the optical flow as a hierarchical series of component motions follows closely the principles of ordinary mathematical vector analysis; hence it has been termed perceptual vector analysis. In our laboratory at the University of Uppsala we have devoted much experimental effort to a search for the basic principles underlying this perceptual function.
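The arithmetic behind such a hierarchy is plain vector subtraction, which a short sketch can make concrete. Everything in it is assumed for the illustration: the velocity values are invented, the scene is flattened to two dimensions, and the function name `relative_motions` is hypothetical. It is not a model of how the visual system carries out the analysis.

```python
def relative_motions(absolute, order):
    """Express each velocity relative to the preceding frame in `order`.
    The first frame is measured against the static room."""
    parent = (0.0, 0.0)  # the room is taken as stationary
    relative = {}
    for name in order:
        vx, vy = absolute[name]
        relative[name] = (vx - parent[0], vy - parent[1])
        parent = (vx, vy)
    return relative


# Invented velocities, all measured in the room's frame of reference.
absolute = {
    "child": (3.0, 0.0),
    "hand": (3.0, 1.0),
    "finger": (3.0, 1.5),
    "ladybug": (3.25, 1.5),
}
order = ["child", "hand", "finger", "ladybug"]

components = relative_motions(absolute, order)
for name in order:
    print(name, components[name])

# Adding the components back up along the chain recovers the ladybug's
# velocity in the room's frame.
total = (sum(components[n][0] for n in order), sum(components[n][1] for n in order))
print(total == absolute["ladybug"])
```

In this made-up scene the ladybug's own component is small compared with the motion it shares with the child, which is the sense in which the common motion acts as a frame of reference for the residual one.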

I shall now briefly describe some typical experiments in my laboratory involving perceptual vector analysis and its geometric basis. In most of the experiments the visual stimuli consist of computer-controlled patterns displayed on a televisionlike screen and projected into the eyes of our subjects by means of a collimating device that removes parallax

demonstrated in one of my earliest experiments. The stimulus pattern consists simply of three bright spots, A, B, C, one above the other, moving back and forth along straight paths [see lower illustration on page 79]. When the top and bottom spots, A and C, are displayed alone, moving horizontally to the left and to the right, they seem to be rigidly connected. When the middle spot, B, is presented alone, it is "correctly" seen as moving in a sloping path. When the three elements are presented simultaneously, however, we get an example of perceptual vector analysis. The entire unit ABC seems to be moving horizontally as a unit, but the path of B does not appear to be sloping; instead B seems to be moving vertically up and down in a straight line. This result can be generalized: Equal vectors or vector components form a perceptual unit that acts as a moving frame of reference in relation to which secondary components seem to move.

A more recent series of experiments in which a few points trace an ellipse or some other conic section provides other striking insights into the geometry of perception. If we present on our display screen two spots opposite an imaginary


f a square onal path. The perception of bending may continue until touches the opposite corner. A given observer will initially perceive bending as being either e can reverse apparent direction of motion.

center point tracing an ellipse, observers always seem to see a rigid rod of which only the end points are visible [see upper illustration on page 80]. Even more surprising, the rod is seen as rotating in a m (or toward) a to the comcircle. Even

experiment in which the two spots of light follow a perfectly rectangular path [see lower illustration on page 80]. I must admit I was surprised to find that even in this case the two spots appear to be the lighted ends of a rigid rod rotating around a fixed central point. One might expect that one would simply see two spots (perhaps elastically connected) chasing each other around a rectangular track. Instead an imaginary rod is again seen; its length seems to be constant as the rod describes a curious path in which it rotates for part of the time in a nearly vertical plane and then slants rapidly away from the vertical and back again. So strong is the perceptual tendency toward abstract projective invariance that a highly complex and "unnatural" motion (one that may not have been seen before) is preferred to the simple

track traced by two moving spots. Evidently it is obligatory that the spatial relation between two isolated moving stimuli be perceived as the simplest motion that preserves a rigid connection between the stimuli. The general formula is sp motion.

In a related experiment the observer is shown the full outline of a simple geometric figure whose shape is systematically changed in a particular way. For example, the observer may be shown a square alternately contracting and expanding [see top illustration on preceding page]. What the observer perceives, however, is a square of fixed size alternately receding and approaching. He does not see the square as a stationary figure that is changing in size. The result again means that the visual system automatically prefers invariance of figure size, obtained by inferring motion in three-dimensional space.

The next experiment I shall describe is perceived two different ways by different observers. Some observers seem to see it only one way whereas for others the two types of percept alternate. In this presentation the top and bottom of a square alternately shrink and expand as in the preceding experiment while the sides of the square move in and out a smaller distance. Geometrically a large square collapses to form a somewhat smaller rectangle, then expands to its original shape [see bottom illustration on preceding page]. All observers have the impression that the figure is alternately advancing and retreating. For one group of observers, however, the figure seems to change during the motion from a square into a rectangle and back again. For a second group of observers the square seems to remain a square at all times, but a square that is rotating back and forth around its horizontal axis as it advances and retreats. Thus we encounter two variants of a vector analysis in the geometric framework of central projection. The first variant is particularly interesting because it represents perception of simultaneous motion and change of shape, such as one might see in a moving cloud or a ring of cigarette smoke.

A final example, taken from a set of experiments that Gunnar Jansson and I have recently published, involves a rather subtle change in the geometry of a square: one corner is made to move slowly toward the center of the square and back again [see illustration at left]. The result is interesting because it demonstrates a new type of perceptual invariance. The observer has the illusion that the square is a flexible surface with a corner that is bending toward him. This may seem surprising, but it is just what one would expect if the figural change is interpreted as being a continuous perspective projection.

It is a common characteristic of all the experiments I have described that the observer is evidently not free to choose between a Euclidean interpretation of the changing geometry of the figure in the display and a projective interpretation. For example, he cannot persuade himself that what he sees is simply a square growing larger and smaller in the same visual plane; his visual system insists on telling him that he is seeing a square of constant size approaching and receding. Hence he perceives rigid motion in depth, rotation in a specific slant, bending in depth and so on, paired with the highest possible degree of object constancy.
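The geometry behind the receding square can be stated in a few lines. Under central projection a square of side S at distance Z has an image of side f·S/Z, so if S is held constant every image size corresponds to exactly one distance. The sketch below only illustrates that relation; the focal length, the assumed side and the image sizes are made-up numbers, the function name `inferred_distance` is hypothetical, and nothing here is a claim about how the visual system actually computes depth.

```python
def inferred_distance(image_side, assumed_side, f=1.0):
    """Distance at which a square of side `assumed_side` would project to
    `image_side` under central projection: image_side = f * assumed_side / distance."""
    return f * assumed_side / image_side


# A displayed square that contracts and then expands again (made-up image sizes).
image_sides = [0.5, 0.25, 0.125, 0.25, 0.5]

# Holding the square's size constant turns the changing image into a path in depth.
distances = [inferred_distance(s, assumed_side=2.0) for s in image_sides]
print(distances)
```

Each halving of the image side doubles the inferred distance, so a contracting image reads as a square of fixed size moving away and an expanding one as the same square coming back.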
The theory of visual perception I have outlined here is based on studies with artificial and highly simplified stimulus patterns. Such experiments helped to demonstrate that the visual system uses the geometry of central perspective and enabled us to formulate the principles of perceptual vector analysis. It was natural for my colleagues and me to ask ourselves: Is there any way to show experimentally that the principles of perceptual analysis also hold true for the more complex patterns of motion encountered in everyday life? In an attempt to answer this question we began some years ago to study experimentally



ated by men and animals, that might be called biological motions. Consider, for example, all the intricate coordinations of frequencies, phase relations, amplitudes and acceleration patterns that are accomplished by one's skeletal structure when one merely walks across the floor. Even in such a simple act scores of articulated bones make precise rotations around dozens of joints. Our simple early experiments had demonstrated that the moving points of an otherwise invisible straight line carry enough information to evoke the impression of a rigid line moving in three-dimensional space. We hypothesized that if we presented the motions of the joints of a walking person in the form of a number of bright spots of light moving against a dark background, an observer might perceive that the spots represented someone walking. We attached small flashlight bulbs to the shoulders, elbows, wrists, hips, knees and ankles of one of our co-workers and made a motion-picture film of him as he moved around in a darkened room [see illustration below].

The results, when the motion picture

our expectations. During the opening scene, when the actor is sitting motionless in a chair, the observers are mystified because they see only a random collection of lights, not unlike a constellation. As soon as the actor rises and starts moving, however, the observers instantly perceive that the lights are attached to an otherwise invisible human being. They are able not only to differentiate without hesitation between walking and g movements but also to recognize anomalies in the actor's behavior, such as the simulation of a slight limp.

ity of the human e a dozen or two ghts as the motions of people led us to study the minimum exposure time required for the sensory organization of such patterns. The result, recently published by our group, is that a tenth of a second (the time needed to

often enough to enable a naive observer to identify a familiar biological motion. This finding, together with results not yet published, has led me to believe that the ability of the visual system to abstract invariant relations from the kind of patterns I have been describing is the product of "hard-wired," or fixed, visual pathways originating at the retina and terminating in the cortex. It is as if the hierarchies of relative invariances in the optical flow were filtered out and established before the visual signals reach the level of consciousness. And contrary to expectation the more complex a projectively coherent pattern is from the mathematical point of view, the more effective the sensory decoding is. (Witness the decipherment of the dancing lights.) Evidently as the degrees of freedom are reduced the stimuli become rich in redundant information.

From our investigations we now know that the component in the optical flow that is a consequence of locomotion generally represents a continuous perspective transformation. Generalizing further from our experiments, we conclude th

LIGHT TRACKS OF WALKING PERSON (left) are recorded by making a time exposure in a dark room of a subject fitted with 12 small lights at his principal joints, as is shown at the right. The continuous streaks generated in this way have no obvious interpretation. If, however, the moving-light patterns are recorded on motion-picture film, one can see instantly when the film is projected that it portrays a person walking. Motion-picture frames of similarly lighted subjects dancing in the dark appear on page



human visual environment are interpreted as rigid structures in relative motion. In this regard the theory and our experiences are in good correspondence; there can be no doubt that we perceive the environment as being rigid. The term relative motion can mean, however, that either the perceiver or the environment (or both) can be regarded as moving relative to the other. Both experiments and experience indicate that the environment forms the frame of reference for human locomotion. The world is perceived as being stationary and the observer as being in motion. From the point of view of theory we may nonetheless ask: Why is the eye itself not the ultimate reference? Why does one not perceive the ground to be moving instead of oneself? From the point of view of function, the answer is easy: The perceptions supplied by a "stationary" eye would be less informative. But let us ignore function, since we are considering the principles of decoding.

We recognize, of course, that visual information about locomotion does not stand alone; it interacts with signals from other sense organs that report bodily movements: organs in the joints, in the muscles, in the inner ear and so on. It is evident, however, that our consciousness of locomotion requires something Experiments have shown that the perception of locomotion is able to override conflicting spatial information from those other sensory channels. Thus it seems that the optical flow that covers the retina during locomotion takes precedence over all other sensory information.

The work I have reported, together with comparable studies from many other laboratories, provides the outlines of what one might characterize as a relativistic theory of vision. The central finding is that the geometry of the decoding of visual stimuli is a relational one similar to projective geometry. In accordance with this geometry, series of relative invariances, or perspective transformations, are abstracted from the total optical flow. This results in hierarchical systems of different components that are perceived both in common and in relative perspective transformations. As our experiments make clear, human beings tend to perceive objects as possessing constant Euclidean shapes in rigid motion in a three-dimensional world. In real life these principles of visual analysis taken together give rise to a satisfactorily close correspondence between the physical world and what we perceive

