
THE QUARTERLY JOURNAL OF EXPERIMENTAL PSYCHOLOGY, 2006, 59 (9), 1567–1580

A configural effect in visual short-term memory for features from different parts of an object

Jean-François Delvenne and Raymond Bruyer
Catholic University of Louvain, Louvain-la-Neuve, Belgium

Previous studies have shown that change detection performance is improved when the visual display holds features (e.g., a colour and an orientation) that are grouped into different parts of the same object compared to when they are all spatially separated (Xu, 2002a, 2002b). These findings indicate that visual short-term memory (VSTM) encoding can be “object based”. Recently, however, it has been demonstrated that changing the orientation of an item could affect the spatial configuration of the display (Jiang, Chun, & Olson, 2004), which may have an important influence on change detection. The perceptual grouping of features into an object obviously reduces the number of distinct spatial relations in a display and hence the complexity of the spatial configuration. In the present study, we ask whether the object-based encoding benefit observed in previous studies may reflect the use of configural coding rather than the outcome of a true object-based effect. The results show that when configural cues are removed, the object-based encoding benefit remains for features (i.e., colour and orientation) from different parts of an object, but is significantly reduced. These findings support the view that memory for features from different parts of an object can benefit from object-based encoding, but the use of configural coding significantly helps enlarge this effect.

Visual short-term memory (VSTM), the temporary storage of visual information (Phillips, 1974), has been a central topic among cognitive scientists for many years (Baddeley, 1986; Logie, 1995; Pashler, 1988; Phillips, 1974; Sperling, 1960). Nowadays it is commonly acknowledged that VSTM can be differentiated from iconic memory (Phillips, 1974; Sperling, 1960), the preattentive visual sensory memory having the capacity to store a large amount of visual information but only for a very short period of time (less than half a second). It can also be differentiated from verbal short-term memory (Baddeley, 1986; Baddeley & Hitch, 1974), which involves the temporary storage of verbal information, and from spatial short-term memory (e.g., Hecker & Mapperson, 1997; Mecklinger & Muller, 1996; Smith et al., 1995; Tresch, Sinnamon, & Seamon, 1993), which is concerned with spatial information.

Correspondence should be addressed to Jean-François Delvenne, Catholic University of Louvain (UCL), Faculty of Psychology, Cognitive Neuroscience Unit, 10, Place du Cardinal Mercier, B-1348 Louvain-la-Neuve, Belgium. Email: jean-francois.delvenne@psp.ucl.ac.be

The first author is supported by the Belgian National Fund for Scientific Research (FNRS). We are grateful to Glyn Humphreys, Yaoda Xu, and Yuhong Jiang for their helpful comments on this manuscript. We would particularly like to thank Steve Luck for suggesting the design of Experiment 4.

© 2006 The Experimental Psychology Society
http://www.psypress.com/qjep

DOI: 10.1080/17470210500256763



Obviously, VSTM is closely connected with perception. Indeed, visual information must first be perceived so as to be maintained in memory. In everyday life, the visual information that we perceive at a given time is usually rich in detail. A large number of objects are continuously present in the visual field, exceeding what can be handled by the brain and, therefore, what can be stored in VSTM. There is indeed much evidence that very little visual information can be consciously perceived and held simultaneously in memory from one moment to the next (Irwin, 1991; Luck & Vogel, 1997; Pashler, 1988; Phillips, 1974; Simons, 1996). A good illustration of this limited capacity in visual processing is the phenomenon known as “change blindness”. People often fail to notice large changes in a visual scene when these changes occur across brief perceptual disruptions such as eye movements, blank intervals, blinks, and so on (see the review by Simons & Levin, 1997).

Using simple stimuli (colours, letters, orientations, shapes, etc.), studies have shown that people can simultaneously encode and maintain only 3 to 4 items (Luck & Vogel, 1997; Pashler, 1988), which is comparable with estimates of attentional capacity (Pylyshyn & Storm, 1988; Scholl, 2001) and mental storage capacity (Cowan, 2001). The total number of simple features that can be retained in VSTM, however, can be significantly enhanced when the features are grouped into a few discrete objects (Irwin & Andrews, 1996; Luck & Vogel, 1997; see also Vogel, Woodman, & Luck, 2001, for a full report; Wheeler & Treisman, 2002). This finding has been used to support the view that the units of VSTM could be integrated objects. Moreover, this effect of integration (i.e., the object-based encoding benefit) in VSTM would be greatly dependent on how perceptual mechanisms parse the visual input. For example, Xu (2002a) has demonstrated that features are better registered in VSTM when they belong to the same part of an object than when they belong to different parts of an object (see also Duncan, 1993). Similar effects of integration were evidenced by Delvenne and Bruyer (2004), who have shown that, when features are located at the same spatial location, and thus in the same part of an object, they are not only better registered in VSTM than when they belong to different parts of an object, but they are also registered just as accurately as single features.


As soon as the features are from different parts of an object, location/proximity and connectedness become decisive in determining the degree of the object-based encoding benefit (Xu, 2005).

Just as features might be parts of an object, objects themselves are parts of a larger scene. When viewing a visual scene, the spatial configuration information about where objects are located with respect to their neighbours seems to be immediately encoded in VSTM (Chun & Jiang, 1998; Jiang, Olson, & Chun, 2000), prior to the objects themselves. For example, Jiang et al. (2000) have shown that a spatial configuration change between two displays containing coloured squares affected colour VSTM. In contrast, a colour change between two displays did not affect memory for the configuration. Spatial configuration also seems to be encoded preattentively (Aginsky & Tarr, 2000) and to be stored very well in VSTM, even better than the objects themselves (Simons, 1996).

Previous studies that have demonstrated an object-based encoding for features from different parts of an object (Xu, 2002a, 2002b) used the change detection paradigm, in which two arrays of multiple visual items are presented, separated by a brief interval. Participants had to decide whether the two arrays were the same in terms of feature identity. Because all the features are in the same spatial location from the first to the second array, the encoding of the spatial configuration seems to be of no use in such a task. However, these studies used the orientation feature as one of the relevant dimensions. In that case, the encoding of the spatial configuration of the display might be helpful in detecting a line orientation change from one array to another. Indeed, the spatial relations between items in a display of line orientations might be disrupted by a change of orientation because the precise location of the stimulus's pixels has changed too.


This particular characteristic of a line orientation change has previously been pointed out by Delvenne, Braithwaite, Riddoch, and Humphreys (2002) and by Jiang, Chun, and Olson (2004). Delvenne et al. (2002) used a change detection task in which each object in a display was surrounded by a circle. This experimental manipulation was intended to change the formation of the objects at a pre-VSTM level by making the objects of the display identical discs that could only be distinguished by their inner parts. While a location change of a stimulus in a display might be detected through a global spatial configuration change of that display (Jiang et al., 2000), a location change of a stimulus within a circle was assumed to decrease the perception of the spatial configuration change. Indeed, with circles around the objects, the spatial configuration might result from the spatial relations between the circles rather than between the objects that are inside the circles. Delvenne et al. (2002) found that when this manipulation was applied to an orientation and texture change detection task, orientation change detection, but not texture change detection, was selectively affected. These findings were taken as evidence that the detection of a line orientation change can be assisted by the detection of a spatial configuration change. In the same way, Jiang et al. (2004) found that detection of a location change of an item in a display was unaffected by changes in the shape or colour of contextual items. However, such location change detection was severely impaired by changes in the orientations of contextual items. The authors proposed that changing the orientations of the stimuli altered the perceptual organization of the display, which is consistent with Delvenne et al.'s (2002) proposal.

If a line orientation change in a display is detected through a spatial configuration change, it is therefore more than reasonable to assume that the efficiency of such change detection should be inversely correlated with the complexity of the spatial configuration. A good example of studies that may have simultaneously varied the complexity of the spatial configuration and used the orientation dimension is provided by those that have

demonstrated an object-based encoding benefit for features from different parts of an object (Xu, 2002a, 2002b). These studies used colour and orientation as the two relevant dimensions and showed that a feature change was better detected (indicative of a better memory) when the features were grouped into different parts of the same object than when the features were spatially separated. When features are all spatially separated in a display, the spatial configuration is most likely based on the spatial relations between each feature. In contrast, when the features are grouped into different parts of an object, the spatial configuration could be based on the spatial relations between the objects rather than between the features, leading to a reduced number of spatial relations. Therefore, the perceptual grouping of features into objects could make the spatial configuration of the display simpler.

In the present study, we asked the question: Does the object-based encoding benefit observed by Xu (2002a, 2002b) reflect more accurate configural change detection rather than a true object-based effect? Three competing hypotheses about this observed object-based effect were compared. The first was the strong object-based hypothesis, according to which a true object-based encoding exists for features of an object when they are from different parts of that object. The second was the strong configural hypothesis, according to which the perceptual grouping of features into different parts of an object improves change detection performance purely through the use of configural coding. The third possibility, which we call the weak configural hypothesis, is that an object-based encoding exists for features from different parts of an object, but the use of configural coding can help enlarge this effect.

EXPERIMENT 1

In this experiment, we attempted to replicate Xu's (2002b, Exp. 3) findings, in which an object-based encoding benefit was found when features (colour and line orientation) were grouped into different parts of an object.


Additionally, in order to suppress the use of configural coding in the change detection task, we surrounded each line orientation with a frame in some conditions (see Figure 1). Our objectives were (a) to see whether the object-based encoding benefit could be replicated with stimuli similar to those used by Xu (2002b), and (b) to examine whether we can observe a reduction or a suppression of the object-based encoding benefit when configural cues are removed (i.e., when the line orientations are each surrounded by a frame). According to the strong object-based hypothesis, the object-based encoding benefit should not be affected by the frames. In contrast, the strong configural hypothesis predicts that the frames should fully suppress the object-based effect. Finally, in accordance with the weak configural hypothesis, we should still observe an object-based encoding benefit with the frames, but with a significant drop in the magnitude of the effect.

Method

Participants
A total of 16 undergraduate students from the University of Louvain-la-Neuve participated in this 40-min study for course credit (12 females; mean age = 20 years, range 18–23 years). All had normal (self-reported) or corrected-to-normal visual acuity and colour vision. Participants were unaware of our hypothesis.

Apparatus
In every experiment of the present study, the displays were generated by a 733-MHz PC with a 17-inch screen. The scripts for the experiments were written with the E-Prime programming software.

Materials
Colour–orientation mushroom-like objects similar to those used by Xu (2002b, Exp. 3) were used. The mushroom caps carried the relevant colour feature, and the mushroom stems carried the relevant orientation feature. Caps were in one of six colours (red, blue, yellow, green, grey, and turquoise), selected to maximize their discriminability.


Figure 1. Samples of displays used in Experiment 1. In (A) and (B), the relevant colour and orientation features were located on separated objects; in (C) and (D), the relevant colour and orientation features were located on different parts of the same object. A frame surrounded each line orientation in (B) and (D). The different grey levels represent different colours. The results (A′ and correct response latencies) for monitoring colour orientation conjunction and disjunction displays in the unsurrounded and surrounded conditions are shown in (E) and (F). Error bars represent standard errors in the means.

Three orientations of line (45°, 90°, and 135° relative to horizontal) in a light-greenish colour were used for the stems.¹


At a viewing distance of 60 cm, the caps subtended 1.34° × 0.57° of visual angle, and the stems 0.86° × 1.05° (except for the vertical, 90° orientation, which subtended 0.38° × 1.05°). All items were displayed against a black background within a space subtending 7.2° × 7.2° of visual angle, with the objects in a given display separated by at least 2.8° (centre to centre). There were two main display types. In the disjunction displays, three detached pairs of caps and stems were shown (Figures 1A and 1B). In the conjunction displays, three attached pairs of caps and stems were shown (Figures 1C and 1D). For each display type, there were two conditions: In one condition, each stem was surrounded by a frame (Figures 1B and 1D), whereas in the other condition, no frame surrounded the stems (Figures 1A and 1C). The frame subtended 1.1° × 1.33° of visual angle and was displayed in white. Thus, a total of four display types were used in the present experiment: (A) disjunction, (B) surrounded disjunction, (C) conjunction, and (D) surrounded conjunction. In both (A) and (B), the relevant colour and orientation features were located on separate objects, whereas in (C) and (D), the relevant features were located on different parts of the same object.

Participants monitored both the cap colours and the stem orientations for a possible colour change in one of the caps or a possible orientation change in one of the stems. Trials were blocked by condition, and the order of conditions was counterbalanced across participants. In each trial, a same/different judgement had to be made for two successive displays. For each condition, there were a total of 96 trials, with 24 cap colour change trials, 24 stem orientation change trials, and 48 no-change trials, uniformly distributed into three blocks. At the beginning of each condition, a detailed description of the task was given to the participant, followed by 16 practice trials before the experimental trials.

The computer randomly chose three of the six colours. This procedure was reiterated to select the three orientations, with the constraint that the same orientation was used no more than twice in a given display. When a colour changed, the changed colour was one of the three colours not previously allocated to any cap. When a stem changed its orientation, the changed orientation was one of the two remaining values. There were nine (3 × 3) virtual sectors in the displays, but only the eight peripheral ones could carry an object. This procedure was used to avoid a possible encoding advantage for the central location in the visual field. The objects were distributed over the eight virtual sectors as follows: For (A) and (B), six of the eight locations were selected at random; for (C) and (D), three of the eight locations were selected at random, with the constraint that the three chosen positions were always separated by at least one blank sector. This procedure ensured that both the disjunction and conjunction displays occupied similar envelopes.
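For concreteness, the randomization constraints just described can be sketched in a few lines of Python. The colour names, orientation values, and grid layout come from the Materials section; the function name, the data structures, and the reading of "separated by at least one blank sector" as "not adjacent, including diagonally" are our own assumptions rather than details reported in the paper.

```python
import itertools
import random

# A minimal sketch of the display-generation constraints described above.
COLOURS = ["red", "blue", "yellow", "green", "grey", "turquoise"]
ORIENTATIONS = [45, 90, 135]  # stem orientations, in degrees from horizontal
# Nine (3 x 3) virtual sectors; only the eight peripheral ones may hold an object.
PERIPHERAL = [(r, c) for r in range(3) for c in range(3) if (r, c) != (1, 1)]


def sample_display(display_type):
    """Pick colours, orientations, and sector positions for one sample display."""
    colours = random.sample(COLOURS, 3)  # three of the six colours, no repeats

    # Orientations are drawn independently, but the same value may appear
    # at most twice in a given display.
    while True:
        orientations = [random.choice(ORIENTATIONS) for _ in range(3)]
        if max(orientations.count(o) for o in set(orientations)) <= 2:
            break

    if display_type == "disjunction":
        # Three caps and three stems shown as six separate items.
        sectors = random.sample(PERIPHERAL, 6)
    else:
        # Conjunction: three mushroom-like objects whose sectors are separated
        # by at least one blank sector (read here as "no two occupied sectors
        # adjacent, including diagonally"; the paper does not spell the rule out).
        while True:
            sectors = random.sample(PERIPHERAL, 3)
            if all(max(abs(a[0] - b[0]), abs(a[1] - b[1])) > 1
                   for a, b in itertools.combinations(sectors, 2)):
                break

    return colours, orientations, sectors


# Example: one conjunction sample display.
print(sample_display("conjunction"))
```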

Procedure
For each trial, the sequence of displays was as follows: A fixation cross flashed for 500 ms at the centre of the screen, followed by the sample display for 400 ms. The sample display was then replaced by a 1,000-ms blank, black-background interval, and, finally, the test display appeared and remained on the screen until a response was given. Participants had to indicate by a key-press, as accurately and as quickly as possible, whether the test display was the same as, or different from, the sample display. As soon as the participant made a key-press, the test display disappeared, and the next trial started 1,000 ms later. At the end of each block, participants were given their correct response rate and mean response latency and were kindly warned if their scores were lower than 70% of correct responses.

¹ As shown in Figure 1, the 45° and 135° orientations were oriented rectangles with a vertical upper part. We used these particular features in order to compare as accurately as possible the data from the present paper with those of Xu's (2002b) study.


There were breaks between each block, during which participants could rest for as long as they wished.
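As a compact illustration of the trial structure, the timeline above can be written out as a simple loop. The timing constants are taken from the Procedure; the drawing and response-collection functions are hypothetical placeholders (the experiments themselves were run in E-Prime), and time.sleep is used only to make the structure explicit, not to provide frame-accurate stimulus timing.

```python
import time

# Timing values (in seconds) taken from the Procedure.
FIXATION_S, SAMPLE_S, RETENTION_S, ITI_S = 0.500, 0.400, 1.000, 1.000


def run_trial(sample_display, test_display, draw, get_same_different_response):
    """Run one change detection trial and return the participant's response.

    `draw` and `get_same_different_response` are hypothetical hooks for
    whatever presentation software is used; only the timing structure here
    follows the paper.
    """
    draw("fixation")
    time.sleep(FIXATION_S)       # 500-ms fixation cross
    draw(sample_display)
    time.sleep(SAMPLE_S)         # 400-ms sample display
    draw("blank")
    time.sleep(RETENTION_S)      # 1,000-ms blank retention interval
    draw(test_display)           # test display stays up until a key-press
    response = get_same_different_response()
    draw("blank")
    time.sleep(ITI_S)            # 1,000-ms gap before the next trial
    return response
```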

Results and discussion
In the following analysis, and in all subsequent experiments of the present study, we used the nonparametric measure of sensitivity (A′).² The percentages of correct responses were 76.69%, 74.80%, 86.25%, and 79.26% for the disjunction, surrounded disjunction, conjunction, and surrounded conjunction conditions, respectively. The resulting means of A′ are shown in Figure 1E. The Display Type (disjunction vs. conjunction) × Condition (unsurrounded vs. surrounded) analysis of variance (ANOVA; repeated measures) on sensitivity revealed a significant main effect of display type, F(1, 15) = 12.53, MSE = 0.0021, p < .005, with better performance for conjunction than for disjunction displays, as well as a significant main effect of condition, F(1, 15) = 28.61, MSE = 0.0014, p < .001, with better performance for unsurrounded than for surrounded displays. More importantly, the interaction between display type and condition was significant, F(1, 15) = 5.58, MSE = 0.00075, p < .05, showing a stronger effect of display type when the stems were not surrounded by a frame. In other words, the advantage of the conjunction over the disjunction displays was significantly reduced for the surrounded displays. A post hoc analysis (Tukey HSD tests) confirmed the significant differences between (A) and (C), p < .001, and between (B) and (D), p < .05, and a significant condition effect for the conjunction displays (p < .001), but not for the disjunction displays (p = .100).

Concerning the correct response latencies (see Figure 1F), the Display Type × Condition ANOVA (repeated measures) revealed a significant effect of display type, F(1, 15) = 4.68,

MSE = 4,294, p < .05, with conjunction displays being performed faster than disjunction displays, but no effect of condition, despite a tendency towards faster responses for unsurrounded than for surrounded displays, and no interaction. Clearly, no speed–accuracy trade-off was at work.

The present experiment revealed a striking object-based encoding effect for colour–orientation mushroom-like objects, thereby replicating Xu's (2002b, Exp. 3) findings. Indeed, when the features were located on different parts of the same object, change detection performance was improved compared to when they were all spatially separated. However, by introducing the frames around the stems, we found a significant decrease in the object-based encoding advantage. The frames around the stems are assumed to suppress the detection of a configural change caused by a change of a line orientation from the sample to the test display. These findings provide evidence against the strong object-based hypothesis, as the magnitude of the object-based encoding benefit was reduced in the absence of configural cues. However, the strong configural hypothesis must also be rejected, as an object-based encoding advantage was still observed even when the configural cues were suppressed. Therefore, both object-based encoding and configural coding account for the better change detection performance when the features are grouped into different parts of an object than when they are all spatially distributed, supporting the weak configural hypothesis.

The next experiments were designed to rule out three alternative accounts, not involving spatial configuration, for the drop in the object-based encoding benefit when the stems were surrounded by a frame. First, the local contours of the frames may have served as lateral masks, or noise, that could make the correct perception of the lines much more difficult. Such lateral masking at a pre-VSTM level (e.g., the perception of the lines) could have led to the disruption of an object representation in VSTM.

² A′ increases from .5 for chance performance to 1.0 for perfect performance (see Macmillan & Creelman, 1991, for more detailed information regarding A′). A′ was calculated for each participant in each condition following the formula developed by Grier (1971), A′ = 0.5 + [(H − g)(1 + H − g)] / [4H(1 − g)], where H is the rate of correct detection of change (hit rate) and g the rate of incorrect detection of change (guessing rate). When g was greater than H, the following formula was used (Aaronson & Watts, 1987): A′ = 0.5 − [(g − H)(1 + g − H)] / [4g(1 − H)].
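For readers who wish to reproduce the sensitivity measure, the two formulas in Footnote 2 translate directly into a short function. This is only an illustrative sketch: the function name and the example rates are ours, not values reported in the paper.

```python
def a_prime(hit_rate, guess_rate):
    """Nonparametric sensitivity A' (Grier, 1971; Aaronson & Watts, 1987).

    hit_rate (H): proportion of change trials correctly reported as changed.
    guess_rate (g): proportion of no-change trials incorrectly reported as changed.
    """
    h, g = hit_rate, guess_rate
    if h == g:
        return 0.5  # chance performance
    if h > g:
        return 0.5 + ((h - g) * (1 + h - g)) / (4 * h * (1 - g))
    # Below-chance performance: Aaronson and Watts's (1987) extension.
    return 0.5 - ((g - h) * (1 + g - h)) / (4 * g * (1 - h))


# Example: 85% hits with 15% false alarms gives A' of about .91.
print(round(a_prime(0.85, 0.15), 3))
```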


Note that this objection may also be applied to the use of circles in Delvenne et al.'s (2002) study. Second, the visual complexity of the surrounded conditions could make feature perception and encoding much harder in the conjunction than in the disjunction condition, and that may explain why the object-based encoding benefit was significantly reduced compared to the unsurrounded conditions. Finally, proper integration of the cap with the stem in the surrounded conjunction condition might have been more difficult to compute because of the line of the frame that was placed between the cap and the stem (see Figure 1D). Participants might have perceived that delimitation between the two parts of the mushroom-like object and, consequently, might have experienced difficulties in perceiving these features as two parts of the same object.

EXPERIMENT 2

In this experiment, we used a simple visual search task to test whether the correct perception of the lines was disrupted with the introduction of the frames around them.

Method

Participants
A total of 8 volunteers from the University of Louvain-la-Neuve participated in this 20-min study (6 females; mean age = 25.5 years, range 22–28 years). All had normal (self-reported) or corrected-to-normal visual acuity and colour vision. Participants were obviously unaware of our hypothesis.

Materials
The three types of orientation (45°, 90°, 135°) from Experiment 1 were used. They were distributed randomly against a black background at 16 possible positions evenly located in an imaginary 4 × 4 matrix that subtended a visual angle of 9° × 9°. As is shown in Figures 2A and 2B, the stimuli were not perfectly aligned with each other and were separated by at least 2° (centre to centre).

Figure 2. Examples of search displays used in Experiment 2. In (A), the lines were not surrounded by a frame, whereas in (B), a frame was placed around each line. The results (correct response latencies) for unsurrounded and surrounded displays, and when the target was present and absent, are shown in (C). Error bars represent standard errors in the means.

Each display contained 8, 12, or 16 stimuli. There were two display types: In the unsurrounded displays, the lines were not surrounded by a frame (Figure 2A), whereas in the surrounded displays, a frame was placed around each line (Figure 2B). The target could be the 45°, 90°, or 135° orientation of line. The distractors were randomly distributed between the two remaining orientations. The target was present in 50% of the trials. For each display type, trials were distributed into three blocks according to the type of target. There were 60 trials in each block (20 trials at each set size). Consequently, a total of 360 trials were used in the present visual search experiment, uniformly distributed into six blocks. At the beginning of each block, participants were shown the target they had to search for, followed by 10 practice trials before the experimental trials.


Procedure
For each trial, a fixation dot flashed for 500 ms at the centre of the screen, followed by the visual search display, which remained on the screen until a response was given. The task was to indicate by a key-press, as accurately and as quickly as possible, whether the target was present or absent. As soon as the participants made their key-press, the display disappeared, and the next trial started 500 ms later. There were breaks between each block during which participants could rest as long as they wanted. The experiment lasted approximately 20 minutes.

Results and discussion
Concerning the correct response rate, the Condition × Display Type × Set Size ANOVA (repeated measures) revealed a condition effect, F(1, 7) = 30.8, MSE = 0.00134, p < .001, such that there were more misses than false detections (4.41% and 0.35% of the trials, respectively). However, the effects of display type and set size were not significant, and there was no interaction.

In the analysis of the response latencies, some of the data were removed because of response errors or because the latencies were longer than 3,000 ms (0.025%). The Condition (presence vs. absence of target) × Display Type (unsurrounded vs. surrounded) × Set Size ANOVA (repeated measures) was conducted on the remaining data, and the means are shown in Figure 2C. The effect of condition was highly significant, such that it was faster to detect the presence of the target than to detect its absence, F(1, 7) = 41.17, MSE = 137,825, p < .001. There was also a significant set size effect, F(2, 14) = 37.26, MSE = 19,220, p < .001, and a significant interaction between the condition and set size effects, F(2, 14) = 19.26, MSE = 12,678, p < .001. A post hoc analysis (Tukey HSD tests) revealed a condition effect at every set size (p < .001), but a set size effect only when the target was absent (p < .01 between set sizes 8 and 12; p < .05 between set sizes 12 and 16; p < .001 between set sizes 8 and 16), such that the response latencies increased with the number of distractors. However, search time in the surrounded and unsurrounded displays did not differ, F(1, 7) = 0.0716, MSE = 97,033, p = .7968, indicating that the perception of the lines was not impaired by the frames around them.


These data demonstrate that the perception of the lines is not impaired by the frames around them. Therefore, it is unlikely that the decrease in the object-based encoding benefit found in Experiment 1 when the lines were surrounded by the frames can be explained by an incorrect perception of the lines.

EXPERIMENT 3

The second alternative explanation for the findings of Experiment 1 is that the objects might have been visually more complex in the conjunction display than in the disjunction display when a frame surrounded the stems. As a result, the comparison between the unsurrounded and surrounded conjunction displays might have been biased against the surrounded display, erasing a potentially similar object-based advantage for that display. To address this, we used the two surrounded displays from Experiment 1 (see Figures 1B and 1D), and we asked participants to attend either to features from one dimension (colour or orientation) or to features from both dimensions. If the encoding of one dimension of the surrounded display is not affected by whether the features were attached or detached, visual complexity cannot account for the reduction in the object-based encoding advantage found in Experiment 1 for the surrounded displays.

Method

Participants
A total of 12 undergraduate students from the University of Louvain-la-Neuve participated in this 40-min study for course credit (11 females; mean age = 19.6 years, range 19–22 years). All had normal (self-reported) or corrected-to-normal visual acuity and colour vision. Participants were unaware of our hypothesis.


Materials and procedure
The procedure and all stimulus parameters were the same as those in Experiment 1, except that only the surrounded disjunction (Figure 1B) and the surrounded conjunction (Figure 1D) displays were used, in two different conditions. In one condition, participants had to memorize only one of the two dimensions (monitoring one dimension), whereas in the other condition, they had to memorize both dimensions (monitoring both dimensions). For the monitoring one dimension condition, there were a total of 96 trials, with 24 colour change trials and 24 no-change trials in the monitoring colour condition, and 24 orientation change trials and 24 no-change trials in the monitoring orientation condition. For the monitoring both dimensions condition, there were a total of 96 trials, with 24 colour change trials, 24 orientation change trials, and 48 no-change trials, uniformly distributed into four blocks. Trials were blocked by display type and condition, and the order was counterbalanced across participants. At the beginning of each block, a detailed description of the task was given to the participant, followed by 16 practice trials before the experimental trials.

Results and discussion
When one dimension had to be monitored, the percentages of correct responses were 93.14% and 90.97% in the disjunction and conjunction displays, respectively. When both dimensions had to be monitored, the percentages of correct responses were 76.13% and 81.94% in the disjunction and conjunction displays, respectively. The final means of A′ are shown in Figure 3A. The Display Type (disjunction vs. conjunction) × Condition (monitoring one vs. both dimensions) ANOVA (repeated measures) on sensitivity revealed a significant main effect of condition, F(1, 11) = 60.831, MSE = 0.0016, p < .001, with better performance for monitoring one dimension than for monitoring both dimensions, but no significant effect of display type. However, the interaction between display type and condition was highly significant, F(1, 11) = 19.706, MSE = 0.00056, p < .001.

Figure 3. Results (A′ and correct response latencies) of Experiment 3 for monitoring one versus both dimensions for colour orientation conjunction and disjunction displays in the surrounded conditions. Error bars represent standard errors in the means.

Post hoc analysis (Tukey HSD tests) revealed a significant difference between disjunction and conjunction displays when both dimensions had to be monitored (p < .001), but no difference between those two display types when one dimension had to be monitored (p = .589). Concerning the correct response latencies (Figure 3B), the Display Type × Condition ANOVA (repeated measures) revealed a significant effect of condition, F(1, 11) = 13.3, MSE = 12,608, p < .005, with monitoring one dimension being performed faster than monitoring both dimensions. There was no display type effect and no interaction. No speed–accuracy trade-off was observed.

In the present experiment, visual complexity of the surrounded displays was taken into account by measuring the differences between monitoring features from one versus both dimensions. By this measure, it was found that the performance for the disjunction and conjunction displays differed only when both dimensions had to be monitored. The lack of difference between these two display types when only one dimension had to be


monitored suggested that the objects were not visually more complex when they were presented as multiple-part objects (conjunction display) than when they were displayed as single-part objects (disjunction display). Consequently, visual complexity of the surrounded displays cannot account for the significant decrease in the object-based encoding advantage observed in Experiment 1 for those types of display compared to the unsurrounded displays.


EXPERIMENT 4

In this experiment, we examined whether the decrease in the object-based encoding benefit associated with the introduction of the frames around the stems could be explained by the presence of the line between the cap and the stem of a mushroom-like object. The occurrence of this line might indeed make the object-based encoding much more difficult. One way to rule out the possibility that the frame interferes with object integration is to place a frame around each entire mushroom rather than just around the stem. In that case, there is no reason to suspect that this should interfere with integrating the two parts of the mushroom object, since no line would separate the stem from the cap. However, according to the configural hypothesis of the frames, this should still reduce the perception of a spatial configuration change, and thus the use of configural coding.

Method

Participants
A total of 11 undergraduate students from the University of Louvain-la-Neuve participated in this 40-min study for course credit (10 females; mean age = 19.6 years, range 18–24 years). All had normal (self-reported) or corrected-to-normal visual acuity and colour vision. Participants were unaware of our hypothesis.

Materials and procedure
Two conjunction displays were used in the present experiment: the conjunction display from Experiment 1 and a surrounded conjunction display where the frames were located around the entire mushroom-like object (Figures 4A and 4B). Because both the cap and the stem were bounded by the same frame, it was difficult to have a related surrounded disjunction condition.


Figure 4. Samples of displays used in Experiment 4. In the conjunction condition (A), no frame surrounded the objects; in the surrounded conjunction condition (B), a frame surrounded each entire mushroom-like object. The different grey levels represent different colours. The results (A′ and correct response latencies) for monitoring one versus both dimensions for colour orientation conjunction displays in the unsurrounded and surrounded conditions are shown in (C) and (D). Error bars represent standard errors in the means.


As an alternative, we used the same procedure as that of Experiment 3 where participants had to memorize either one of the two dimensions (monitoring one dimension) or both dimensions (monitoring both dimensions).

Results and discussion
When one dimension had to be monitored, the percentages of correct responses were 91.76% and 92.33% in the conjunction and surrounded conjunction displays, respectively. When both dimensions had to be monitored, the percentages of correct responses were 87.97% and 83.71% in the conjunction and surrounded conjunction displays, respectively. The means of A′ are shown in Figure 4C. The Display Type (conjunction vs. surrounded conjunction) × Condition (monitoring one vs. both dimensions) ANOVA (repeated measures) on sensitivity revealed a significant main effect of condition, F(1, 10) = 19.399, MSE = 0.01951, p < .005, with better performance for monitoring one dimension than for monitoring both dimensions, but no significant effect of display type. Moreover, the interaction between display type and condition was significant, F(1, 10) = 6.315, MSE = 0.00266, p < .05. Post hoc analysis (Tukey HSD tests) revealed a significant difference between conjunction and surrounded conjunction displays when both dimensions had to be monitored (p < .05), but no difference between those two display types when one dimension had to be monitored (p = .999).

Concerning the correct response latencies (Figure 4D), the Display Type × Condition ANOVA (repeated measures) revealed a significant main effect of display type, F(1, 10) = 11.423, MSE = 38,173, p < .01, with faster responses in the conjunction display, a significant main effect of condition, F(1, 10) = 5.511, MSE = 37,353, p < .05, with faster responses when one dimension had to be monitored, but no interaction.

These data revealed that the encoding of both the cap and the stem was affected when a frame surrounded each entire mushroom-like object. Because no line separated the two features, it is unlikely that the frames interfered with integrating the two parts of the objects.

Moreover, the absence of a difference between the two display types when only one dimension had to be monitored indicated that the visual complexity of the displays was equivalent with or without the frames around the objects. Together, these findings reject the hypothesis that the frame effect is due to the presence of a line between the cap and the stem that could interfere with object integration. Rather, they provide additional support for the view that placing a frame around the orientated lines disrupts the detection of a spatial configuration change caused by a line orientation change.

GENERAL DISCUSSION

The present study addressed an important issue regarding whether the object-based encoding benefit in VSTM for features from different parts of an object could be accounted for by the use of configural coding rather than by the outcome of a true object-based effect. Previous change detection studies demonstrating this effect have used colour and orientation as the relevant dimensions (Xu, 2002a, 2002b). Although changing the colour of an item does not alter the spatial configuration of the display (Jiang et al., 2000), changing the orientation of an item disrupts the configuration, as the relative spatial locations are no longer preserved (Jiang et al., 2004). Such a configural alteration of the display may be easily and quickly detected by the participants (Jiang et al., 2000; Simons, 1996), especially when the features are grouped into a few discrete objects. Indeed, in that case, the number of spatial relations in the display is reduced compared to when all features are spatially separated, making the spatial configuration less complex. Can the disparity in the complexity of the spatial configuration between disjunction and conjunction displays explain the observed, so-called object-based encoding benefit?

In the current study, we compared three competing hypotheses about the object-based effect: the strong object-based hypothesis, the strong configural hypothesis, and the weak configural hypothesis.


In Experiment 1, we replicated the object-based encoding benefit initially observed by Xu (2002b) for mushroom-like objects defined by the juxtaposition of a colour and a line orientation. In order to prevent the participants from using configural coding in change detection, we surrounded each line orientation with a frame. The merit of using surrounding frames is that the spatial configuration would now result from the spatial relations between the frames rather than between the objects. Consequently, a line orientation change in a display of surrounding frames should no longer alter the spatial configuration. Note that the present paradigm, which consists of incorporating each object of a display into a fixed frame, may provide an interesting tool to control configural changes that may occur when an object is replaced, removed, or displaced in a visual display. The results revealed that the object-based encoding benefit was significantly reduced, but not suppressed, by the frames. The effect of the frames on the object-based encoding benefit could not be simply attributed to (a) an incorrect perception of the lines when a frame is placed around each of them (Experiment 2), (b) a greater visual complexity of the mushroom-like objects when the frames were added (Experiment 3), or (c) the presence of the line of the frame that is sited between the cap and the stem of a mushroom-like object, which could make the integration of the two parts difficult to compute (Experiment 4).

These findings support the weak configural hypothesis, according to which a true object-based encoding exists for features that are grouped into different parts of an object, but the use of configural coding helps enlarge this benefit. Perceptual grouping would affect how objects are represented in VSTM as well as how they are related to each other. At the object level, properties such as location/proximity and connectedness between object parts could be crucial in determining the object-based encoding benefit and, therefore, the total amount of information that can be simultaneously encoded into VSTM (Xu, 2005). At the configural level, perceptual grouping may simplify the configural representation of the display by reducing the number of spatial relations in the display.


Although the current study provides a compelling demonstration that the use of configural coding can help enlarge the object-based encoding effect, additional research is needed to fully understand the nature of the configural representations in VSTM and to further specify the relationship between configurations and the representations of individual objects.

Note that in the present study the frames dramatically affected the change detection of a line orientation (Experiments 1, 3, and 4), whereas they had no effect in the visual search task (Experiment 2). This may appear to be inconsistent with the recent study by Alvarez and Cavanagh (2004), in which a close relationship between visual search rate and memory capacity was observed. The authors found that stimuli that resulted in more efficient visual search also led to better change detection. Here, we show that stimuli resulting in more efficient change detection do not necessarily lead to better visual search. This may be explained by the fact that different types of visual properties were used in the two tasks. In a change detection task, display items had to be compared to a memory display. In the absence of frames, a change of line orientations affected the global configuration, and the participants could have taken advantage of these configural cues to detect the change. In contrast, in the visual search task, participants had to compare display items to one memory target. In that case, there were no configural cues that allowed participants to process the lines with and without frames differentially.

To sum up, the present study provides further support for the view that an object-based encoding benefit exists in VSTM for features that are grouped into different parts of an object (Xu, 2002a, 2002b) and demonstrates that the use of configural coding can considerably help extend the object-based benefit.

Original manuscript received 23 July 2004
Accepted revision received 16 June 2005
First published online 8 December 2005


REFERENCES

Aaronson, D., & Watts, B. (1987). Extensions of Grier's computational formulas for A′ and B″ to below-chance performance. Psychological Bulletin, 102, 439–442.
Aginsky, V., & Tarr, M. J. (2000). How are different visual properties of a scene encoded in visual memory? Visual Cognition, 7, 147–162.
Alvarez, G. A., & Cavanagh, P. (2004). The capacity of visual short-term memory is set both by visual information load and by number of objects. Psychological Science, 15, 106–111.
Baddeley, A. D. (1986). Working memory. Oxford, UK: Oxford University Press.
Baddeley, A. D., & Hitch, G. (1974). Working memory. In G. Bower (Ed.), Recent advances in learning and motivation. New York: Academic Press.
Chun, M. M., & Jiang, Y. (1998). Contextual cueing: Implicit learning and memory of visual context guides spatial attention. Cognitive Psychology, 36, 28–71.
Cowan, N. (2001). The magical number 4 in short-term memory: A reconsideration of mental storage capacity. Behavioral and Brain Sciences, 24, 87–185.
Delvenne, J. F., Braithwaite, J. J., Riddoch, M. J., & Humphreys, G. W. (2002). Capacity limits in visual short-term memory for local orientations. Current Psychology of Cognition, 21, 681–690.
Delvenne, J. F., & Bruyer, R. (2004). Does visual short-term memory store bound features? Visual Cognition, 11, 1–27.
Duncan, J. (1993). Similarity between concurrent visual discriminations: Dimensions and objects. Perception & Psychophysics, 54, 425–430.
Grier, J. B. (1971). Nonparametric indexes for sensitivity and bias: Computing formulas. Psychological Bulletin, 75, 424–429.
Hecker, R., & Mapperson, B. (1997). Dissociation of visual and spatial processing in working memory. Neuropsychologia, 35, 599–603.
Irwin, D. E. (1991). Information integration across saccadic eye movements. Cognitive Psychology, 23, 420–456.
Irwin, D. E., & Andrews, R. V. (1996). Integration and accumulation of information across saccadic eye movements. In T. Inui & J. L. McClelland (Eds.), Attention and performance: Information integration in perception and communication (Vol. XVI, pp. 125–155). Cambridge, MA: MIT Press.

Jiang, Y., Chun, M. M., & Olson, I. R. (2004). Perceptual grouping in change detection. Perception & Psychophysics, 66, 446–453.
Jiang, Y., Olson, I. R., & Chun, M. M. (2000). Organization of visual short-term memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 26, 683–702.
Logie, R. H. (1995). Visuo-spatial working memory. Hove, UK: Lawrence Erlbaum Associates Ltd.
Luck, S. J., & Vogel, E. K. (1997). The capacity of visual working memory for features and conjunctions. Nature, 390, 279–281.
Macmillan, N. A., & Creelman, C. D. (1991). Detection theory: A user's guide. New York: Cambridge University Press.
Mecklinger, A., & Muller, N. (1996). Dissociations in the processing of "what" and "where" information in working memory: An event-related potential analysis. Journal of Cognitive Neuroscience, 8, 453–473.
Pashler, H. (1988). Familiarity and visual change detection. Perception & Psychophysics, 44, 369–378.
Phillips, W. A. (1974). On the distinction between sensory storage and short-term visual memory. Perception & Psychophysics, 16, 283–290.
Pylyshyn, Z. W., & Storm, R. W. (1988). Tracking multiple independent targets: Evidence for a parallel tracking mechanism. Spatial Vision, 3, 179–197.
Scholl, B. J. (2001). Objects and attention: The state of the art. Cognition, 80, 1–46.
Simons, D. J. (1996). In sight, out of mind: When object representations fail. Psychological Science, 7, 301–305.
Simons, D. J., & Levin, D. T. (1997). Change blindness. Trends in Cognitive Sciences, 1, 261–267.
Smith, E. E., Jonides, J., Koeppe, R. A., Awh, E., Schumacher, E. H., & Minoshima, S. (1995). Spatial versus object working memory: PET investigations. Journal of Cognitive Neuroscience, 7, 337–356.
Sperling, G. (1960). The information available in brief visual presentations. Psychological Monographs: General and Applied, 74, 1–29.
Tresch, M. C., Sinnamon, H. M., & Seamon, J. G. (1993). Double dissociation of spatial and object visual memory: Evidence from selective interference in intact human subjects. Neuropsychologia, 31, 211–219.
Vogel, E. K., Woodman, G. F., & Luck, S. J. (2001). Storage of features, conjunctions, and objects in visual working memory. Journal of Experimental Psychology: Human Perception and Performance, 27, 92–114.


Wheeler, M. E., & Treisman, A. M. (2002). Binding in short-term visual memory. Journal of Experimental Psychology: General, 131, 48–64.
Xu, Y. (2002a). Encoding colour and shape from different parts of an object in visual short-term memory. Perception & Psychophysics, 64, 1260–1280.


Xu, Y. (2002b). Limitations of object-based feature encoding in visual short-term memory. Journal of Experimental Psychology: Human Perception and Performance, 28, 458–468.
Xu, Y. (2005). Encoding objects in visual short-term memory: The roles of location and connectedness. Manuscript submitted for publication.

