
Read Writ (2010) 23:803–834 DOI 10.1007/s11145-009-9190-x

Reading during sentence composing and error correction: A multilevel analysis of the influences of task complexity

Luuk Van Waes · Mariëlle Leijten · Thomas Quinlan

Published online: 10 June 2009
© Springer Science+Business Media B.V. 2009

Abstract In this study we investigated the role of reading in how writers coordinate editing with other writing processes. In particular, the experiment examined how the cognitive demands of sentence composing and the type of error influence reading and writing performance. We devised an experimental writing task in which participants corrected an embedded error (an orthographic near-neighbor or far-neighbor) and completed a sentence (using 1 or 3 context words), in either order. Data were collected by logging keystrokes and recording eye movements. The results revealed that both error and sentence complexity influenced the approach to error correcting. Participants generally completed the partial sentence first and then corrected the error (approximately 90% of the items). Task complexity reinforced this tendency. Moreover, in most of these cases, the error was fixated at least once prior to sentence completion. This suggests that the error was detected (at least partially), but the correction response was inhibited. The differences in cognitive load also affected the reading activity during planning. This investigation illustrates how the interplay of two task factors, error and sentence complexity, appears to influence how writers coordinate error correcting with sentence composing.

Keywords Writing processes · Working memory · Cognitive load · Eye-tracking · Text produced so far · Error correction

L. Van Waes (corresponding author) · M. Leijten
Department of Management, University of Antwerp, Prinsstraat 13, 2000 Antwerp, Belgium
e-mail: luuk.vanwaes@ua.ac.be
URL: http://www.ua.ac.be/luuk.vanwaes

T. Quinlan
Educational Testing Service, Corporate Headquarters, Rosedale Road, Princeton, NJ 08541, USA


Introduction

During writing we monitor the development of our text on the screen very frequently. Touch typists can continually monitor their text as it emerges, while less skilled typists switch more often between keyboard and screen monitoring (Johansson, Wengelin, Johansson, & Holmqvist, this issue). As Kaufer, Hayes, and Flower (1986) observed, writers visually interact with their own text for various purposes, such as ‘finding prompts to allow search and retrieval of more ideas’ and ‘evaluating the formal correctness or suitability of the text produced so far (TPSF)’. While considerable research has focused on the role of revising (for an overview, see Allal, Chanquoy, & Largy, 2004), relatively little work has examined the coordination of revising or, more particularly, editing with other writing processes. In the present study we mainly focus on the role of reading and editing during writing.

Hayes and Flower’s (1980) original conception of editing as a component of reviewing is still useful for considering online editing, as distinct from offline reviewing. Proofreading studies have almost exclusively focused on the latter, examining error detection as a stand-alone task. Several proofreading studies have found that participants read familiar texts more quickly (Levy, 1983; Levy & Begin, 1984) and corrected them more accurately (Daneman & Stainton, 1993; Levy, Di Persio, & Hollingshead, 1992). However, Pilotti and her colleagues (Pilotti, Chodorow, & Thornton, 2004; Pilotti, Maxwell, & Chodorow, 2006) question whether familiarity alone accounts for improved proofreading efficiency, and whether by-products of rereading, i.e., articulatory and auditory feedback, somehow mediate this often observed familiarity effect. To induce familiarity, participants read (and reread) texts under different feedback conditions, with and without articulatory and/or auditory feedback. Pilotti et al. (2006) observed that prior reading (familiarity) improved reading times in all conditions, but that rereading with auditory feedback also improved the accuracy of error correcting. Pilotti et al. concluded that auditory feedback improves detection accuracy by facilitating the processing of phonological information, whereas familiarity benefits proofreading fluency by improving overall reading efficiency. The authors suggest that for optimal proofreading performance writers should read aloud, something few writers do.

Although proofreading has been studied quite extensively, relatively little is known about how findings from this area of research may generalize to editing during composing. We know that writers, especially skilled writers, often reserve some time at the very end of the writing process to actually ‘proofread’ their final product. However, arguably, the overwhelming majority of detection and correction happens during composing. It is inherent to writing that several cognitive subprocesses (like planning, translating and reviewing/editing) occur nonlinearly in short cyclic episodes (Hayes & Flower, 1980; Severinson Eklundh, 1994). While composing a draft, writers will very frequently allow sentence composing to be interrupted by tasks like fixing typographical errors, changing punctuation, and rewording phrases. These actions suggest that writers are constantly reading their texts in order to evaluate them, a process central to more recent conceptions of


skilled writing (Hayes, 1996). Thus, writers are frequently juggling ‘reading for editing’ with other processes, such as reading to plan the next sentence. Theoretically, editing must compete with other writing subprocesses for limited cognitive resources, with the possibility of inducing interference and cognitive overload. Accordingly, the writer’s ability to multi-task is constrained by the availability of cognitive resources (McCutchen, 1996), or, to use Hayes and Flower’s words, writers are often on “full-time cognitive overload.” This competition for cognitive resources could force writers to adopt specific strategies (Hayes, Flower, Schriver, Statman, & Carey, 1987; Kellogg, 1996).

On the question of how writers coordinate editing with other writing processes, previous research has generally focused on the global level. Kellogg (1988) and Galbraith and Torrance (2004), for instance, studied the effect of different drafting strategies on the quality of the text as a whole. An outlining strategy, for example, has the function of separating idea generation and organization processes from text production processes. When adopting a dual-draft approach (Elbow, 1973, 1981), writers first create a rough draft in which they focus on expressing the content of their text, without worrying too much about how well expressed the content is. They (temporarily) ignore some low-level issues, like spelling and the correction of typographical errors, in order to get the content down on paper. However, one study found the outlining approach superior to a dual-draft approach in producing better quality texts (Kellogg, 1988). How best to coordinate editing with other writing processes remains a very open question. It would seem that at certain times writers ignore quite a few errors in the text produced so far (TPSF) in order to focus on formulating content, while on other occasions they find it very hard to ignore low-level errors in the text.
During writing, writers not only reread the text produced so far (TPSF) in search of errors; as depicted in the classical writing model by Hayes and Flower (1980), the TPSF is also a central component of the task environment. On the one hand, writers may use the TPSF as a kind of visual stimulus to plan and produce new text (Kaufer, Hayes, & Flower, 1986). On the other hand, the writer may reread the previously produced text in order to evaluate it for correctness or appropriateness. These different orientations of rereading often lead to a dilemma in the writer’s mind when confronted with an anomaly in the TPSF: Should I prioritize error correction, or delay it and continue text generation first (so as not to lose planned content)? Evaluating the TPSF adds a layer of complexity to writing, and writers must somehow coordinate it with other writing processes.

Revision during writing involves error analysis, detection, diagnosing and correction. This evaluating process has received much attention in cognitive science (see Rabbitt, 1978; Rabbitt, Cummings, & Vyas, 1978; Sternberg, 1969) and, more recently, in computer-based writing research and writing with speech technology (see, for instance, Flower, Hayes, Carey et al., 1986; Hacker, 1997; Hacker, Plumb, Butterfield et al., 1994; Larigauderie, Gaonac’h, & Lacroix, 1998; Leijten & Van Waes, 2005; Piolat, Roussey, Olive et al., 2004; Van Waes & Schellens, 2003). It is assumed that the error correction strategies of writers are affected by working memory resources and the complexity of the error (Blau, 1983; Kellogg, 1999, 2001, 2004; Leijten, 2007a).


In order to further investigate this, Leijten and her colleagues (Leijten, 2007a; Leijten, De Ridder, Ransdell, & Van Waes, 2007) developed an experimental paradigm presenting writers with a dilemma: writers were offered an incomplete sentence clause with the demand to complete the sentence by formulating new content based on an auditory prompt. Although this task does not exactly replicate ‘normal’ writing in a professional setting, it is not unlike the type of writing exercise a student might encounter in the younger grades. For the purposes of the present investigation, the task allowed us to isolate the dilemma writers are confronted with in this kind of situation and manipulate conditions related to (a) sentence composing and (b) error correcting (see Method section). Leijten’s findings show that in most cases writers choose to complete the sentence before correcting the error. However, she also found that writers’ performance was influenced by other factors, viz. writing mode and error type.

In a previous study, Leijten (2007b, see Chaps. 5 and 6) created different experimental conditions: on the one hand related to writing mode (keyboard-based word processing vs. speech recognition), and on the other hand related to error types. That is, texts were offered only visually or were also preceded by an auditory prime. Via this approach the researchers wanted to extend the work on error correction by describing the cognitive effort and error correction strategies related to different types of naturally occurring online errors in different writing modes. Next to writing mode, the span of the errors presented in the TPSF was manipulated: four types of errors were implemented, ranging from large errors to small typing errors. By comparing error types, the effect of the error span on working memory was determined.

The results showed that the experimental condition, in which the TPSF was either offered only visually or also read aloud before the visual prompt, influenced the writer’s strategies during error analysis. In the speech condition writers ‘delayed’ error correction more often and started writing sooner than in the non-speech condition. In other words, writers more often opted to prioritize text production when speech was present. In general, the addition of speech reduced the cognitive demands of writing. The error span (large versus small speech recognition errors) also had a quite consistent effect on strategy choice. This effect was especially powerful when interacting with the speech condition. On the one hand, large errors led to longer preparation and production times as well as to slower interference reaction times, indicating that these kinds of errors consume more working memory resources. On the other hand, they produced a higher rate of error correction success than small errors. As a follow-up, the present study focuses on the coordination of reading (for detecting errors) and writing (in this case, completing a sentence).

Aim

Leijten’s study raises several new questions. Her results suggest that the dynamics of composing are influenced by error complexity, an external factor related to the type of error. However, internal factors also influence cognitive load and might


affect the writers’ preference to correct the error first or not. Therefore, in this study we also manipulated an internal factor. We varied the task complexity by asking the participants to include either one or three words when completing the sentence. Furthermore, the previous study showed that, given the load associated with holding new information in working memory, participants tended to complete the sentence first, and then correct the error. The possibility of deferring error correction raises the question of when detection happens: Do participants detect an error but defer its correction until after completing the sentence? Or do they only read the TPSF very superficially during a first pass, without really detecting the error? To address these questions, we employed an eye-tracking system to record the reading behavior of the participants.

So far little is known about the role of reading when reviewing or revising a text. One of the main reasons is that until recently, the state of the art in observation methods and tools could only provide limited data on these subprocesses. For instance, keystroke logging provides valuable and fine-grained information about writers’ online pausing behavior, which reveals something of the mental activity associated with writing processes (Sullivan & Lindgren, 2006). Although pauses during writing appear to coincide with planning (Matsuhashi, 1981; Schilperoord, 2002), they may also reflect other writing processes, such as evaluating. Thanks to advances in the use of eye-tracking technology, it is now possible to collect relatively precise data about the visual behavior during these pauses. This technology has already been used in reading research for some time (for reviews, see Drieghe, Rayner, & Pollatsek, 2005; Rayner, 1998; Rayner, 2004; Rayner & Juhasz, 2004). In writing research, however, eye-tracking technology has only been applied quite recently (Andersson et al., 2006; Wengelin et al., 2009). This may be due to several factors, especially the relative complexities of studying real-time writing processes. While writing text on a computer, for instance, the text on the screen is continuously changing, which makes it hard to automatically create rigorous one-to-one relations between fixations and the emerging words on the screen.¹

¹ Recently, however, a software program has been developed, called EyeWrite (Simpson & Torrance, 2007). This system allows researchers to study writers composing by keyboarding. It makes it possible to collect precise timings for the creation of each character and to combine this logging information post hoc with information about the writers’ eye movements. In other words, the program creates a synchronized log of both keyboard and eye activity. Researchers in Poitiers developed a comparable research tool, called Eye & Pen, to study handwriting (Alamargot, Chesnet, Dansac, & Ros, 2006).

As mentioned above, eye-tracking systems make it possible to gather data that relate to reading processes while writing. Eye-tracking data certainly provide new forms of data that enable writing researchers to explore and examine aspects of the writing process that have otherwise been covert, and that have the potential of challenging our current conceptions of writing. However, interpretation of the data seemingly depends upon conceptual models that have yet to be developed. Because these models do not yet exist, we have opted in the present study to conduct a quasi-experimental investigation. Although observing writing in a naturalistic setting has obvious appeal, such writing may be influenced by any number of things. Therefore, we sought to relate reading behavior to specific writing subprocesses, as a basis for a better understanding of the role of error processing in sentence completion performance and other more global strategies. Consequently, we devised a more structured writing task, eliciting a specific type of reading for writing: the detection of spelling errors.

The objective of the present study is explorative. We want to analyze the cognitive load related to the detection of various error types, to describe error correction strategies, and to identify reading behavior that relates to these different strategies. The analysis of reading patterns should provide a more fine-grained analysis of the extent to which errors are processed prior to sentence completion. The central aim of this study is to investigate how the cognitive demands of sentence composing and the type of error influence reading and writing performance. Therefore, our main research questions are:

Writing behavior and complexity

• What effect does task complexity (either content complexity or error complexity) have on:
  – the coordination of writing processes?
  – the priority given to sentence production or error correction (correction strategy)?

Reading during writing

• How is reading attention distributed between the given partial sentence (TPSF), the error, and the point of inscription?
• In which stage of the writing process do writers ‘detect’ the error for the first time, and how is the correction strategy related to this detection?
• What effect do content complexity and error complexity have on the reading behavior?

Accuracy

• Is there a relation between the reading behavior and the accuracy of task completion?

The description of reading behavior during writing on the basis of logged eye fixations and saccades is still at an early stage. Therefore, we hope that by identifying different variables that relate to eye activity, this study will also contribute to a better understanding of eye-tracking data in the context of writing research in general.

Method

In a laboratory setting, the participants were required to complete a series of items that involved correcting an embedded error (if present) and completing a sentence, in either order (see also Quinlan, Loncke, Leijten, & Van Waes, 2009). Each item consisted of a partial sentence that was first presented to the participants auditorily, as a type of priming. Next, one or three context words were presented on the screen. The participants were instructed to use these words to complete the sentence. The partial sentence was then presented as a written text (TPSF) on the screen. Then, the participants were prompted to complete the sentence, using the context words, and correct the error (if present). By creating this controlled reading–writing context, we were able to observe the way in which writers shift between writing subprocesses: monitoring, error detection, error correcting and sentence composing. By varying the complexity of sentence completion and error correction, we were able to investigate the research questions mentioned above. We utilized an eye-tracking system to record participants’ (re)reading behavior while completing the tasks. Eye fixations provide potentially valuable information to refine the interpretation of keystroke-logging data.

Participants

Thirty-two undergraduate students participated in the experiment (13 male, 19 female). The students all had Dutch as their first language and were between 19 and 25 years old. The participants who volunteered to take part in the experiment received € 7.50 for participating.

Design

In most studies of (off-line) proofreading, researchers have presented participants with short texts with various types of errors embedded in them (e.g., Daneman & Stainton, 1993; Levy, 1983; Levy & Begin, 1984; Pilotti et al., 2006). In these experiments participants were typically directed to find and correct the errors in a static text. To the extent that proofreading happens during composing, how do writers coordinate it with other writing processes?
The present experiment was designed to emulate salient aspects of error detecting and correcting during (online) sentence composing. As a writer formulates his or her thoughts into sentences, errors can occur during inscribing, which the writer may or may not detect. When an error is detected, the writer may correct it immediately or later. If he or she corrects it immediately, then it will seemingly interrupt other writing processes. In order to simulate the conditions of error detecting/correcting, as a writer might encounter them during sentence composing, we opted for a cloze task.

We constructed six sets in which the same 42 experimental sentences were randomly ordered in subsets to avoid order effects. The participants were randomly assigned to one of the six sets. Participants completed 46 task items (including 4 practice items), with a short pause in the middle (which we used to recalibrate the eye-tracking system, if necessary). The kinds of errors within each script were equally varied: roughly 50% with correct words (n = 21), 25% with orthographic near-neighbor errors (n = 10), and 25% with orthographic far-neighbor errors (n = 11). The number of context words was also comparably distributed over the different scripts and randomly ordered within each script, with 50% (n = 21) one context word and 50% (n = 21) three context words. Participants encountered the partial sentences in the same order, while the condition in which the sentence was presented (number of context words and type of error) was counterbalanced across participants.

Materials

The main part of the experiment consisted of a reading–writing task. The participants had to read and complete 46 sentences (Table 1). All the materials were presented in Dutch.

Table 1 Overview of the characteristics of the sentences used in each reading–writing task

Characteristics                                     Number of sentences
Test sentences (3 correct, 1 incorrect)             4
Correct sentences                                   21
Incorrect sentences: orthographic near-neighbors    10
Incorrect sentences: orthographic far-neighbors     11
Total                                               46

By presenting participants with partial sentences (i.e., a cloze task), we aimed to facilitate basic text production. Accordingly, we wanted to minimize the influence of extraneous factors. Since domain knowledge is known to have a powerful influence on composing (McCutchen, 1986; Voss, Vesonder, & Spilich, 1980), we wanted to control the effects of planning. Similarly, since characteristics of the sentence could influence participants’ ability to complete the item, we attempted to control salient text characteristics. Thus, each partial sentence: (a) referred to things and events in the general domain, (b) provided sufficient local context to identify the error, (c) contained the first clause of a complex clause with a connector, and (d) included 9–12 words.

Our intention in introducing auditory priming was to induce familiarity prior to seeing the partial sentence. Under normal conditions, the writer is familiar with his own text, which is the product of his own planning and formulating. Upon hearing the partial sentence read aloud, participants could begin to build a mental representation of the sentence. To vary the complexity of sentence completion, participants were presented with either one or three words, and then asked to integrate them in completing the sentence. Fedorenko, Gibson, and Rohde (2006) found that, during reading comprehension, holding three words in working memory was more disruptive than holding one.
These context words were topically related to the partial sentence, providing content and minimizing the need for idea generating.

Besides varying the complexity of sentence completion, we also varied the nature of error correction. The results of Larigauderie, Gaonac’h, and Lacroix’s (1998) proofreading study suggest that different types of errors require different levels of processing, and that certain errors require more extensive processing than others. In the present study each partial sentence contained a single word that could appear in one of the following error conditions:

(1) correct word, e.g., the Dutch word “knuffel” [cuddly toy]
(2) orthographic near-neighbor, e.g., “knuppel”
(3) orthographic far-neighbor, e.g., “kronkel”
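The grapheme criterion used to construct the errors (near-neighbors deviating by one or two graphemes, far-neighbors by more than two) can be illustrated with the three example words. The following is a minimal sketch, not the authors' procedure: it treats single letters as a stand-in for graphemes, ignores the phoneme criterion and the matching on word class, syllable count and frequency, and the function names `letter_differences` and `classify_error` are hypothetical.

```python
def letter_differences(target: str, candidate: str) -> int:
    """Count the positions at which two equal-length words differ."""
    if len(target) != len(candidate):
        raise ValueError("this sketch only compares words of equal length")
    return sum(a != b for a, b in zip(target, candidate))


def classify_error(target: str, candidate: str) -> str:
    """Label a candidate word relative to the intended (correct) word.

    Assumption: letters approximate graphemes; one or two differing
    letters count as a near-neighbor, more than two as a far-neighbor.
    """
    diff = letter_differences(target, candidate)
    if diff == 0:
        return "correct"
    return "near-neighbor" if diff <= 2 else "far-neighbor"


if __name__ == "__main__":
    for word in ("knuffel", "knuppel", "kronkel"):
        print(word, "->", classify_error("knuffel", word))
```

Under these simplifying assumptions, “knuppel” differs from “knuffel” in two letters and “kronkel” in four, which matches the conditions in which the words are listed.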

The recent generation of writing tools, i.e., word processing and speech recognition, generally either automatically correct or mark non-word errors (e.g., “tpyos”). Consequently, we ensured that all errors were real words. Near- and far-neighbor errors were constructed in relation to the correct word, sharing the same word class, number of syllables (i.e., two or three), and comparable word frequency. Errors deviated from correct words in their internal graphemes (in the onset of the unstressed syllable), with (a) near-neighbors deviating by only one phoneme and one or two graphemes and (b) far-neighbors deviating by two phonemes and more than two graphemes. Within the partial sentence, errors were located internally, never as the first or last words, and equally distributed between nouns and verbs. Thus, errors filled the same syntactic role as the correct words, while semantically jarring with the context of the partial sentence.

Apparatus and procedure

Data were collected from each participant individually in a single session of approximately one hour. All materials were presented in Dutch. A custom-designed computer program in .NET administered the experimental tasks. The program controlled the design and stored the results of the tests, several time stamps, the text produced in the writing task and the text location. To log the linear development of the writing process during the completion task, Inputlog (Van Waes & Leijten, 2006) was used to capture the keyboard and mouse input and calculate the pausing time afterwards. During the writing task, we recorded eye fixation and keystroke information using the Eyelink II eye-tracking system (SR Research, Osgoode, Canada), managed by the Gazetracker program (EyeResponseTechnologies, 2002). The participants (n = 32) were fitted with the Eyelink II headgear, and the right-eye camera was adjusted for optimal orientation and focus. Next, the eye-tracking system was calibrated to the eye movements of the individual participant. As a dot appeared at various locations on the computer display, the participant was directed to watch it closely. The calibration procedure was repeated until a “good” (or in two cases “fair”) rating was established. After calibration, the participant began the experimental items, with a short pause midway for the purpose of recalibrating the eye-tracking system.

The session started with a general overview of the experiment. The participants could both read the information on the computer screen and listen simultaneously to the text through their headset, because everything was also read out loud. The experimental session consisted of two main parts:

(1) First part reading–writing test
(2) Second part reading–writing test

The experiment consisted of two blocks of reading–writing tasks. In a self-paced flow, the participants first heard an auditory prime in which the partial sentence was read out aloud twice. Next, one or three context words appeared on the screen for one or three seconds, respectively. The participants were informed that they should complete the partial sentences presented in written form on the next screen, using these context words. They were also told that they should focus both on accuracy and on speed: they had to complete the sentence as quickly as possible and, if necessary, correct the errors within the partial sentence (presented as TPSF). It was also explicitly mentioned that they should decide for themselves the order of task completion in each item. Before they started the task, the participants viewed a short video demonstrating an experimental item. After the video they were presented with complete (written and read out aloud) instructions, followed by four practice items.

To summarize the procedure technically, each item consisted of four screens, presented sequentially:

1. Opening screen with sound-icon, which the participant clicks to hear the auditory prime.
2. Blank screen during which the partial sentence is read aloud, then repeated.
3. One or three context words (in one line, separated by commas) at the bottom of the screen, for 1 and 3 s, respectively.
4. The partial sentence (as read out aloud in screen 2) is displayed. The partial sentence may include an error or not.

Example

Auditory prime                Het kleine meisje drukte haar knuffel hard tegen haar wang en… [The little girl pushed her cuddly toy firmly against her cheek and…]
Context words                 schreeuwen, moeder, winkel [cry, mother, shop]
Correct partial sentence      Het kleine meisje drukte haar knuffel hard tegen haar wang en…
Incorrect partial sentence    Het kleine meisje drukte haar knuppel hard tegen haar wang en…

Analysis Data consisted of participants’ (a) responses to the tasks, (b) typing behavior, and (c) reading behavior. This data was captured automatically by (a) the experimental stimuli computer program (in.Net), (b) Inputlog, and (c) the eyetracking system (which included the Eyelink II managed by Gazetracker), respectively. After processing and scoring, the data was analyzed from two perspectives: a quantitative multi-level analysis and a qualitative analysis to illustrate the reading–writing interaction in more detail. Product and process measures Both product and process measures were used. The variables were used to analyze cognitive effort and characterize error correction strategies in the different

123

Reading during sentence composing and error correction

813

conditions. In this paragraph we shortly define the dependent variables used in the different analyses. Preparation time (initial planning) The preparation time represented the time spent preparing to complete the item, before the participant’s first action. It was operationalized as the duration between: the close of the context words screen and the participant’s first mouseclick, the necessary first step to position the cursor, either for correcting the error or completing the sentence. Production time (writing) The production time represented the time spent completing an item. It was operationalized as the duration between: the participant’s first action (either mouseclick or keystroke) until her signal of finishing the item (a mouseclick on ‘‘okay’’). Accuracy (completion and correction) Each item was scored as ‘correct’ or ‘incorrect’. To be correct, the participant’s response had to include: (a) the sentence grammatically completed, (b) using all the context words, with (c) the error corrected (if present). About one half of the items in the experiment contained an error that needed to be corrected. The correction accuracy and completion accuracy is coded binary. Order of task completion (immediate versus delayed error correction) For every sentence we logged whether the cursor was initially positioned either within or after the partial sentence, which we used as a proxy for task order. A mouseclick in the middle of the partial sentence suggests error-correcting, while a mouseclick at the end of the partial sentence suggests sentence completion. Eye fixations The eye fixation data (from the Eyelink II) was processed by the Gazetracker software, which enabled us to identify the location of fixations within three zones of interest: a) the partial sentence, b) the error, and c) the point of inscription (i.e., production of new content). To capture aspects of reading behavior, the eye fixations were measured as follows: • •

• Fixation frequency: total number of fixations and number of fixations per zone (error, rest of the partial sentence, and production);
• Fixation duration: total duration of fixations (sum), duration per zone and duration of the first fixation in the error zone;


• Fixation process: number of fixations during prewriting;
• Fixation interval: duration between first and last fixation in the error zone, and between the first and the second fixation;
• Distance of saccades: total horizontal distance of saccades between the successive fixations (as a measure of rereading);
• Transitions: total number of zone crossings in an item, frequency of movements in or out of the error zone, and frequency of transitions to the error zone in the prewriting phase.
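As a minimal sketch of how several of the fixation measures listed above could be derived from one item's fixation record, consider the following. The `Fixation` record, the zone labels and the function name are hypothetical and do not reflect the Gazetracker export format; the interval between first and last error fixation is assumed here to be measured from onset to onset.

```python
from dataclasses import dataclass


@dataclass
class Fixation:
    zone: str         # hypothetical labels: "tpsf", "error" or "production"
    start_ms: int     # onset relative to the moment the TPSF appears
    duration_ms: int  # length of the fixation


def fixation_measures(fixations):
    """Summarise one item's fixations (assumed to be sorted by onset)."""
    zones = ("tpsf", "error", "production")
    count = {z: 0 for z in zones}
    duration = {z: 0 for z in zones}
    for f in fixations:
        count[f.zone] += 1
        duration[f.zone] += f.duration_ms

    error = [f for f in fixations if f.zone == "error"]
    # A zone crossing is a pair of consecutive fixations in different zones.
    pairs = list(zip(fixations, fixations[1:]))
    transitions = sum(a.zone != b.zone for a, b in pairs)
    into_error = sum(a.zone != "error" and b.zone == "error" for a, b in pairs)

    return {
        "count": count,
        "duration_ms": duration,
        "first_error_fixation_ms": error[0].duration_ms if error else None,
        "first_to_last_error_ms": (
            error[-1].start_ms - error[0].start_ms if error else None
        ),
        "transitions": transitions,
        "transitions_into_error": into_error,
    }


# Invented example item: two TPSF fixations, a double fixation on the error,
# one fixation in the production zone and a late return to the error.
item = [
    Fixation("tpsf", 0, 300),
    Fixation("tpsf", 350, 250),
    Fixation("error", 650, 520),
    Fixation("error", 1200, 400),
    Fixation("production", 1700, 600),
    Fixation("error", 2400, 300),
]
measures = fixation_measures(item)
```

Restricting the input list to the fixations that precede the first mouseclick would give the corresponding prewriting measures.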

Two views of the dynamics of composing

Figure 1 is a graphical representation of one item in which both keyboard and mouse actions (right y-axis) and eye fixations (left y-axis with Gaze Tracker look zones, viz. eye fixations in the production zone or the TPSF zone, error word or other words) are shown. The x-axis is the time line, with 0 ms as the starting time (the moment when the TPSF was made visible to the writer). It took the writer about 15 s to complete the sentence. The example shows an item in which a writer corrects the error in the TPSF before completing the partial sentence. The partial sentence presented as TPSF was: ‘Ze probeerde verwoed het brandende frituurvet te bluffen door …’ [She fiercely tried to put out the burning frying fat by…]. In this example the writer first selects part of the word with an error (‘luffen’)2 and then completes the word (‘blussen’) to correct the error. After having repositioned the cursor with the mouse, he completes the sentence: ‘… er water op te gooien.’ [… throwing water on it.]. The eye-tracking data show that the participant starts reading the TPSF shortly after the 5 s indication on the time line (left to right fixations with short saccades) until he fixates the error (5 fixations before the error is corrected, without leaving the error fixation zone). After the correction the fixations follow the cursor while he types the new text, interrupted by two short fixations in the TPSF zone before he types the last word. Then he rereads the TPSF once again and closes the item (without really rereading the newly produced text).

Figure 2 shows a different approach, in which the writer completes the sentence before correcting the error. He delays the error correction although the error is fixated several times in the prewriting phase (one transition). He first completes the sentence (‘riep haar moeder’ [called her mother]) and after a very short pause (851 ms)3 moves the mouse to the error to correct it.

2 The linear representation of the fragment generated by Inputlog looks like this (pauses in milliseconds between brackets): [Movement][LeftButton][Movement][LeftButton](1082)[Movement](531) lussen [Movement] [LeftButton][Movement]er•zater•op•te•(811)gooire[BS2]en:•(872)[Movement] [LeftButton].

3 The pause is shown in the linear output of Inputlog: [Movement][LeftButton](2364)ripe(731)[BS2] (741)r(1062)[BS](501)ep haar moeder.(851)[Movement] [LeftButton](1121)[Movement](701)nufe(811) [BS]fel(681)[Movement] [LeftButton].
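The linear output quoted in these footnotes can be split into pauses and logged actions with a small parser. The sketch below assumes only what the quoted fragments show, namely that parenthesised integers are pause durations in milliseconds and that square-bracketed tokens are mouse or key actions; it is not based on Inputlog's full output specification, and the function name is ours.

```python
import re

# Either a pause such as "(851)" or a bracketed action such as "[LeftButton]".
TOKEN = re.compile(r"\((\d+)\)|\[([^\]]+)\]")


def parse_linear_log(log):
    """Return (pauses in ms, action names) in the order they occur."""
    pauses, actions = [], []
    for match in TOKEN.finditer(log):
        if match.group(1) is not None:
            pauses.append(int(match.group(1)))
        else:
            actions.append(match.group(2))
    return pauses, actions


# Fragment quoted for the item shown in Figure 2.
log = (
    "[Movement][LeftButton](2364)ripe(731)[BS2] (741)r(1062)[BS](501)"
    "ep haar moeder.(851)[Movement] [LeftButton](1121)[Movement](701)"
    "nufe(811) [BS]fel(681)[Movement] [LeftButton]."
)
pauses, actions = parse_linear_log(log)
```

The 851 ms pause that separates sentence completion from the mouse movement towards the error is the sixth entry of `pauses`.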


Fig. 1 Example of sentence completion episode in which the error (embedded in the TPSF) is corrected first

Fig. 2 Example of a sentence completion episode in which the error correction is delayed

Multilevel analysis

The data from the present experiment were analyzed from a hierarchical perspective by applying a multilevel framework. The main advantage of the multilevel approach is that it minimizes the need for data aggregation, especially when compared with more conventional statistical analyses (like ANOVAs or t-tests). The application of multilevel analysis to writing research has mainly been promoted by Van den Bergh


and his colleagues (Quené & Van den Bergh, 2004; Van den Bergh & Rijlaarsdam, 1996).4 We opted for multilevel analysis because a unilevel approach can lose statistical power: the data are aggregated at the participant level, resulting in one mean score per condition and per error type. These aggregated data do not always adequately treat differences between writers and between sentences when analyzing their behavior during the interaction with the different error types in the TPSF. Not taking this nuance into account can lead to an aggregation bias in the interpretation of the analyses. So, the main advantage of multilevel modeling is that it accounts for the hierarchy within collected observations and the dependencies within a hierarchical structure (see also Goldstein, 1995). Moreover, the fact that the number of observations per person sometimes differs slightly5 does not affect the power of the analyses.

For each dependent variable described above, multilevel regression models are presented with each sentence of an experimental item (level 1) nested within participants (level 2). Each model consists of three parts: an estimated mean for that specific variable (reading/writing), a characterization of each writer (as a deviation from the mean), and a characterization of each partial sentence as it was presented to the writer. In other words, sentences are ‘nested’ within participants. This approach enabled us to analyze the effects of different conditions and error types on each dependent variable at the sentence level, taking into account person characteristics.

We conducted the multilevel analysis in three steps. In the first step we estimated the so-called ‘zero model’ to gain insight into the variance between participants and the variance between sentences. These variances provide information about the distribution of the variance of participants as opposed to the variance of sentences. 
By calculating the intra-class correlations (ICC) we can evaluate the relative amount of variance that can be attributed to the partial sentences themselves rather than the participants. In the next step we estimated the ‘netto zero model’, in which we integrated the variables that characterize the participants (e.g. sex and memory span). In this step we analyzed whether specific characteristics of the participants should be taken into account in the further analyses. Based on these models we gain insight into the unique variance between sentences, after correction for higher level variables. It is this unique variance that we want to explain in accordance with our hypotheses. In the final step we used interaction models. The first type of interaction model is related to the cognitive load condition (1 or 3 context words). The second type of interaction model compares the various types of partial sentences (no-error, near-neighbor error, or far-neighbor error) that the writers are confronted with.

4 For a comprehensive guide to multilevel analysis we would like to refer to a tutorial by Quené and Van den Bergh (2004). Another researcher, Barr (2008), applied the MLR framework in the context of a reading study in which eye-tracking data were also analyzed. This article also contains a mini-manual for researchers who want to apply multilevel analyses to their own data sets.

5 Although the number and type of sentences is strictly controlled in this experiment (as opposed to more ecologically valid observations of writing processes), a few sentences could not be added to the data set for technical reasons (e.g. because of a calibration problem). The results for the variables related to these sentences were coded as missing values, which resulted in a data set with slightly deviating total numbers of scores.
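As a rough illustration of the intra-class correlation used in the first step, the sketch below computes the share of the total variance located at the participant level. It assumes the usual two-level definition (participant-level variance divided by the sum of participant-level and sentence-level variance); the function name `icc` is ours. The inputs are the zero-model variance estimates for preparation and production time as listed in Table 2.

```python
def icc(participant_variance, sentence_variance):
    """Proportion of total variance located at the participant level."""
    return participant_variance / (participant_variance + sentence_variance)


# Zero-model variance estimates from Table 2 (in squared milliseconds).
icc_preparation = icc(82_627.11, 7_353_464.00)
icc_production = icc(6_012_974.00, 29_185_230.00)

print(round(icc_preparation, 2), round(icc_production, 2))
```

Under this definition the remaining share, one minus the ICC, is the proportion of variance located at the sentence level.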


Results

In this section, we report the results of three sets of MLM analyses. First, we report the analyses of the data related to writing behavior, focusing on the latency measures during item completion, i.e., the duration preceding the first keystroke (preparation time) and following the first keystroke (production time). Second, we report the results that relate to the reading activities during the item completion. These results are based on the analyses of the eye fixation data. Lastly, we report the analyses of relationships between accuracy scores and reading behavior.

Writing behavior and complexity

The first research question concerned the effect of sentence complexity and error complexity on the organization of the writing task. Table 2 shows the results of the multilevel analyses for preparation and production time. For both variables the zero model and the netto zero model are presented. In the random part of the model the intra-class correlation is estimated, showing that in the zero model participants accounted for about 1% (Preparation time) and 17% (Production time) of the variance. The netto zero model for Production time shows that the total time needed to complete the task was significantly influenced by content complexity (the requirement to process one versus three context words) and error complexity (the requirement to detect and correct near-neighbor versus far-neighbor errors). Participants spent significantly longer when integrating three context words relative to one context word. Also, the overall task time was significantly longer when it involved correcting orthographic far-neighbor errors than near-neighbor errors. These results are not surprising, and mainly serve to confirm that it takes longer to (a) string three words into a phrase (relative to one word), and (b) fix a far-neighbor error (relative to a near-neighbor error).

Table 2 also shows that Preparation time is not affected by the number of Context words or the Error type. However, Preparation time appeared longer when participants opted to correct the error first, M = 2,058 ms (n = 108; SD = 1,032), in contrast to completing the sentence first, M = 1,416 ms (n = 753; SD = 2,883). The former (error-first) approach likely reflects Preparation time being used to search for the error, which might otherwise be done afterwards, while the relatively high variability associated with Preparation time in the latter (sentence-first) approach suggests that participants may initially spend more or less time looking for an error, even though they will opt to complete the sentence first. This effect might also be explained by the fact that writers gave strategic choices more deliberate consideration. Notice also the high estimated standard error for the cases in which the error correction is delayed.

Figure 3 shows a complementary representation of the average preparation time at an aggregated group level. For this purpose, the writers were grouped on


–

Sex

Immediate/delayed error correction

Context words (one or three)

Error type (near or far neighbor error)

0.01

Sentence level variance

ICC

0.01

7,351,103.00 (360,774.60)

36,089.12 (77,194.85)

–

691.226 (199.552)

–

3,346.116 (217.463)

Effect for participants that correct at least 25% of the error corrections immediately

ICC Intra class correlation, SE Standard error

82,627.11 (89,248.45)

7,353,464.00 (360,934.20)

Participant level variance

Random part

1,494.828 (105.991)

Intercept

Fixed part

0.17

29,185,230.00 (1,434,410.00)

6,012,974.00 (1,779,961.00)

–

15,794.620 (473.334)

Zero model Estimated SE

Netto zero model Estimated SE

Production time

Preparation time

0.18

26,733,460.00 (1,622,741.00)

5,833,195.00 (1,866,730.00)

2,486.399 (433.911)

3,762.502 (436.431)

–

12,711.970 (707.793)

Netto zero model Estimated SE

Table 2 Parameter estimates of intercept, participants’ characteristics and the intra class correlations for preparation time and production time


Fig. 3 Relation between Preparation time (ms) and the participants’ preference to delay error correction. [Line graph. Y-axis: preparation time, 0 to 2,500 ms. X-axis: percentage of delayed error correction at the participant level, in four groups: 0–50%, 51–75%, 76–99%, 100%.]

their tendency to either delay error correction or not.6 The decreasing line shows the gradual drop in the preparation time needed when the error correction is delayed. However, this effect disappears in the overall analysis of the total production time (see Table 2), indicating a different distribution of reading and revision activities. In the next section we will relate these results to the eye fixation data in order to interpret them more explicitly.

Second, we were interested to know whether content complexity and error complexity had an effect on the correction strategy. As in the previous experiment (Leijten, 2007a), participants in the present experiment most often opted to finish the sentence first, and correct the error afterwards (approximately 90% of items). By completing the sentence first, participants probably freed themselves from the load of keeping the context words active in working memory. Moreover, the analysis of the data (Table 3) showed that the participants’ tendency to delay the error correction was affected by the number of Context words that they were required to include in the sentence (content complexity). For instance, the chance of a participant delaying the correction of a far neighbor error increased by more than 10% when she was required to integrate three context words, in comparison to integrating only one context word.7

6 We grouped the participants on the basis of their preference to either delay the error correction or not. For instance, when participants delayed error correction in less than half of the items, we grouped them in the first group (0–50%). Some of the participants were very consistent in their approach: they delayed the correction of all the errors till after they had completed the sentence. Those writers were categorized under group 4 (100%).

7 This percentage is calculated on the basis of the odds ratios derived from the binomial interaction model of Cursor Position preference related to Context words and Error type.
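To illustrate how an odds ratio from a binomial model translates into a change in probability, as in footnote 7, the following sketch applies an odds ratio to a baseline probability. The baseline probability and the odds ratio used here are hypothetical values chosen for illustration; they are not estimates from the model reported in this study, and the function name is ours.

```python
def apply_odds_ratio(baseline_probability, odds_ratio):
    """Return the probability obtained after multiplying the odds by odds_ratio."""
    odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = odds * odds_ratio
    return new_odds / (1.0 + new_odds)


# Hypothetical example: an 80% chance of delaying the correction with one
# context word, and an odds ratio of 2.0 for three context words.
p_one_word = 0.80
p_three_words = apply_odds_ratio(p_one_word, 2.0)
print(f"{p_one_word:.2f} -> {p_three_words:.2f}")
```

Because the odds scale is nonlinear, the same odds ratio produces a smaller change in probability the closer the baseline is to 0 or 1.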


Table 3 Parameter estimates of intercept, participants’ characteristics and intra class correlations for delayed error correction and accuracy (logits) Delayed error correction

Accuracy error correction Accuracy completion

Zero Netto zero model model Est. (SE) Est. (SE)

2.004

1.687

3.495

(0.228)

(0.322)

(0.258)

Error type (near/far) –

–

Context words (one/ – three)

0.510

–

-1.708

Zero Netto zero model model Est. (SE) Est. (SE)

Fixed part Intercept

–

2.945

4.103

(0.155)

(0.383)

(0.244)

(0.419)

Random part Participant level variance

1.285

2.176

0.171

(0.414)

(0.665)

(0.520)

–

0.000

0.032

(0.000)

(0.203)

Est. Parameter estimate, SE Standard error

Cognitive load apparently relates to the participants’ approach to coordinating error correction with sentence completion. As mentioned above, overall, participants rarely corrected the error first. However, when they did, it occurred significantly more often under the low-load (one context word) than the high-load (three context words) condition. This result suggests that holding one word in working memory leaves sufficient working memory resources to afford more flexibility in problem-solving.

The order of task completion does not seem to be related to the Accuracy of the error correction. We could not find a significant relation between correction strategy (delay or not) and the writers’ ability to correct the error in the TPSF. The overall high success rate (more than 96% of the errors were corrected) indicates something of a ceiling effect, suggesting that the error correction task was well within participants’ capabilities. Similarly, participants were generally able to use the context words to successfully complete the sentence cloze task. However, the analyses revealed that participants were significantly more successful in the integration of one context word, relative to three context words. On average they were more than 5% more accurate in the former case. This result supports the notion that participants found it more difficult to string three words into a phrase context, relative to one word.

Reading during writing

Analyses of the eye fixation data made it possible to relate the findings above to the associated reading behavior, and to complement them. Based on our research questions, we were especially interested in identifying reading patterns that could be associated with distinct approaches to correcting errors. We present the descriptive statistics, followed by the results of the multilevel analysis.


Table 4 Descriptive statistics for eye fixations in each look-zone

                                          TPSF excl. error zone   Error            Production       Total
                                          Mean     SD             Mean     SD      Mean     SD      Mean     SD
Total number of fixations                 7.07     4.51           2.44     2.16    13.41    8.02    22.91    9.51
Total duration of fixation (in seconds)   2.58     2.00           1.29     1.26    5.35     3.65    9.23     4.43
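The mean fixation durations and error-zone proportions quoted in the following paragraph can be recovered from the Table 4 means. The short check below is only an arithmetic sketch (the dictionary layout and variable names are ours): it divides the mean total fixation duration in each zone by the mean number of fixations in that zone, and expresses the error zone as a share of the totals.

```python
# Table 4 means per zone: (number of fixations, total duration in seconds).
table4_means = {
    "tpsf": (7.07, 2.58),
    "error": (2.44, 1.29),
    "production": (13.41, 5.35),
}
total_fixations, total_duration = 22.91, 9.23

# Average duration of a single fixation in each zone (seconds).
mean_fixation_duration = {
    zone: duration / number for zone, (number, duration) in table4_means.items()
}

# Share of the error zone in the number and in the duration of all fixations.
error_share_count = table4_means["error"][0] / total_fixations
error_share_duration = table4_means["error"][1] / total_duration

print({zone: round(value, 3) for zone, value in mean_fixation_duration.items()})
print(round(error_share_count, 2), round(error_share_duration, 2))
```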

In total we collected 19,728 eye fixations, which averaged to 616 fixations per participant, with 23 fixations per item. On average the duration of a fixation was 403 ms.8 About 20% of the fixations were outside the predefined look-zones (i.e., TPSF, error or production zone); these fixations were excluded from the analyses. About 10% of the total number of fixations was within the error zone, which accounted for 14% of the total duration of all fixations (Table 4). So, on average, a fixation on the error word lasts longer than a fixation on other words in the TPSF (TPSF M = 0.365 vs. Error M = 0.529 s), and, proportionally, the error is fixated about 3–4 times more often.9

Based on the analysis of eye fixations in the different look zones and taking into account their distribution in time, we distinguished three patterns: (a) immediate correction after one or more fixations on the error word in the preparation time; (b) delay of error correction till after the sentence production, with one or more fixations on the error during preparation time; and (c) delay of error correction till after the sentence production, with no fixation on the error during preparation time. For about 18% of the items in which the participants opted to delay the error correction, the error was not fixated before the writers decided to complete the sentence first. Moreover, about half of the participants show a consistent reading pattern in which the error is fixated at least once during the preparation time. In none of the trials does this group seem to have ‘overlooked’ the error; they decided to prioritize sentence completion over error correction anyway. On the other hand, the data show that some participants almost consistently prioritize sentence production and do not (or hardly) engage in reading the partial sentence. The difference in

8 In this experiment we used a temporal threshold of 200 ms in order to capture the majority of reading fixations and to exclude noise. Rayner (1978) found the mean fixation duration during reading to be 250 ms, which has since been replicated by others. Other studies have shown somewhat shorter fixations, between 200 and 250 ms, for other types of visual behavior (see also Manor & Gordon, 2003). Because reading during writing might also relate to these other types of visual behavior (e.g., identification of objects) we opted for a 200 ms threshold.

9 Of course, we have to take into account that some of the words in the TPSF were, for instance, articles. We know from reading research that such short, high-frequency words are often only parafoveally fixated (see, for instance, Drieghe, Rayner & Pollatsek, 2005; Inhoff, Eiter, Radach & Juhasz, 2003). The manipulated errors in the TPSF were all longer words (nouns) of at least two syllables.


behavior does not affect the accuracy of their quality scores for the final sentence, nor does this strategy affect the total time on task.

Of course, fixating on an error does not ensure that the participant successfully detected it. However, the average length of the first fixation on the error is significantly longer than fixations on the other words: 0.525 (SD = 0.441) versus 0.401 (SD = 0.071; t(682) = 7.587, p < .001), which might be an indication that most of the first fixations on the error word involve more extensive processing. The fact that the mean duration between the first and the second fixation on the error is only 386 ms (SD = 712) adds to this hypothesis: many fixations on the error zone seem to be followed quite quickly, after a relatively short saccade, by a second fixation within the same zone. A series of successive fixations on the error word suggests more extensive processing, perhaps for the purposes of verifying the error and prescribing the correction.

We used multilevel analyses to examine the influence of our two main factors (i.e., content complexity and error complexity) upon reading behavior (i.e., the frequency, duration, and location of eye fixations). See Table 5 for the most important netto zero models that were estimated in the multilevel analyses taking into account the effect of Context words and Error type (and Table 7 in the Appendix, representing the zero models).

The total number of fixations per item is significantly affected by both the number of Context Words and the Error Type. When three context words have to be integrated with the partial sentence (instead of one), cognitive resources may be drawn away from other processes, such as reading. By way of compensation, more fixations are needed for processing, whether to comprehend the partial sentence or detect the error.10 The fact that the writers interact more intensively with the text on the screen is also reflected in the increasing total distance of the saccades between fixations (calculated on the basis of the x-values that represent the pixel locations on the screen). This parameter is an indirect measure of the amount of (re)reading during writing. The number and the length of the fixations on the error word are not influenced by the number of Context Words.

All the general characteristics of eye fixations were significantly affected by the type of error. The number of fixations and the total distance of the saccades increased when the writers had to correct a far neighbor error as opposed to a near neighbor error in the TPSF. The frequency and duration of fixations in the error zone are also affected by the type of error.

We were also interested in the reading activities that preceded the first action (either sentence completion or error correction), i.e. as a distinct window onto error detection. As mentioned previously, in more than 80% of the sentences the error was fixated at least once during this preparation time. Neither the type of error, nor the number of context words presented, appears to relate significantly to the probability of participants fixating on the error. However, the total number of fixations and the total length of the fixations on the partial sentence decrease when

10 Note that when the participants had to include three context words, they needed more time to complete the task (Table 2). Therefore, the (re)reading behavior and the total number of fixations are to a certain extent confounded by time on task.


Duration between first and last fixation in error zone

Duration between first and second fixation in error zone

First fixations

Length of fixation in either TPSF or Error zone during preparation time

Number of fixation in TPSF zone during preparation time

Fixation in error zone before text completion (yes/no)

Preparation time

Total distance saccades (x-value, number of horizontal pixels)

Length of fixation in error zone

Number of fixations in error zone

Number of fixations

General

(527.187)

4,921.788

(28.140 –

–

(0.044)

384.980

-0.222

(0.065)

(0.155)

(0.262)

0.908

-0.354

4.835

(0.230)

–

(110.290)

1.017

874.897

(169.834)

–

3,505.646

(0.123)

1.358

(0.195)

–

(0.577)

2.427

5.917

19.215

(.140)

(384.070)

2,317.658

–

(09.599)

250.588

0.089)

0.720

(01.61)

1.297

(0.574)

2.321

(24,559.650) 20,930,540.000 (1,271,353.000)

6,526,350.000 (1,943,209.000)

500,340.100

(0.020)

0.404

(0.250)

5.098

–

(103,488.100)

1,703,743.000

(0.068)

1.120

(0.163)

3.311

(2.832)

46.628

Sentence level Est. (SE)

(6,288.466)

6,471.448

(0.030)

0.104

(0.500)

1.801

(0.230)

1.381

(183,325.700)

631,293.100

(0.106)

0.360

(0.137)

0.422

(9.100)

33.579

Participant level Est. (SE)

Error type Est. (SE)

Intercept Est. SE)

Context words Est. (SE)

Random part

Fixed part

11,364.170

13,751.850

1,728.661

3,920.161

–

9,929.046

1,754.412

3,521.278

3,917.533

Deviation

Loglikelihood

Table 5 Zero netto models: Parameter estimates of intercept and experimental sentence characteristics (context words and error type) for different characteristics of eye fixations (n = 861)


Transitions to error zone from production zone

Transitions to error zone within TPSF

Transitions to error zone

Error zone

Table 5 continued

(0.026)

0.310

(0.076)

0.870

(0.086)

1.208

–

(0.075)

0.265

(0.005)

0.009

(0.032)

0.105

(0.048)

0.145

Participant level Est. (SE)

Error type Est. (SE)

Intercept Est. SE)

Context words Est. (SE)

Random part

Fixed part

(0.017)

0.339

(0.027)

0.560

(0.049)

0.809

Sentence level Est. (SE)

1,527.444

2,000.300

1,552.694

Deviation

Loglikelihood


more than one Context word has to be included in the sentence. Although neither error type nor sentence complexity appeared to influence the probability of the error being fixated, the latter appeared negatively related to the frequency and duration of fixations on the TPSF. In other words, when the cognitive load increases, participants fixate less often on the partial sentence, although the total duration of the preparation time is not influenced by the number of context words (see Table 5). An increasing number of fixations outside the TPSF zone during the preparation time in the condition with three context words can probably be interpreted as a kind of ‘staring in order to plan’.

Next, we examined the durations between (a) the first and the second fixation and (b) the first and the last fixation. As mentioned above, the relatively short interval between the first two consecutive fixations on the error, in most cases during the preparation time, may be an indication that the error is detected during the first pass. The experimental factors apparently did not influence the detection, either positively or negatively. However, when we compare the duration between the first and the last fixation on the error in the two error conditions, the time span increases when a far neighbor error has to be corrected. So, the higher complexity of this type of error not only led to more fixations on the error, but was also associated with a larger spread in time of the fixations.11

The next analysis is also in line with this finding, i.e. the number of times a writer returned to the error zone after having fixated a word outside the error zone. We analyzed both the number of transitions to the error zone in general and, more specifically, the transitions between other words in the TPSF and the error zone. In either case the writers tend to return to the error zone more often. However, the number of times they return to the error zone during or after sentence completion is not influenced by the experimental conditions. So, neither rereading nor delayed error correction seems to be affected by the cognitive load or the error type.

Reading behavior and accuracy

Finally, we used logistic multilevel modeling to examine relationships between reading behavior and the accuracy of task completion. We used three quality measures: correct inclusion of context words in the sentence produced, appropriate correction of the error and overall correctness of the final sentence (spelling and grammar). In the model we focus on reading behavior during the preparation time: Can we identify characteristic reading patterns during the preparation period, before writers pressed the first keystroke? Table 6 shows the results for the zero models and the netto zero models for Accuracy. The probability (expressed in logits) is calculated for the condition with one and three context words. We also included an interaction factor (i.e. Length of fixation in TPSF and error zone during Preparation time for 3 context words).

11 Note that the total Production time was also affected by the number of Context words. Therefore, caution is recommended when interpreting this measure of duration between first and last fixation in the error zone.


Sentence level variance

Participant level variance

Random part

0.248 (0.301)

(0.302)

(1.484)

0.253

1.362

1.158

(1.474)

(0.000)

0.000

(0.961)

(0.000)

0.000

(1.019)

0.846

(1.023)

0.000

(0.451)

0.166

(0.000)

(0.436)

0.117

(0.000)

0.000

(0.571)

0.399

(0.358)

(0.477)

-0.024

(0.349)

2.795

(0.446)

3.573

Netto zero model Est. (SE)

0.337 0.758

(0.241)

3.028

(0.293)

3.551

Zero model Est. (SE)

Accuracy final sentence

(0.476)

(0.534)

0.654

0.537

(0.383)

3.122

(0.519)

3.009

Length of fixation in TPSF and Error zone during preparation time for 3CW

(0.280)

(0.179

3.703

(0.353)

3.323

-0.745* (0.348)

2.463

(0.698)

(0.434)

2.400

5.026

4.134

Length of fixation in TPSF and Error zone during preparation time

Fixations during preparation time

Intercept 3 context words

Intercept 1 context word

Fixed part

Netto zero model Est. (SE)

Zero model Est. (SE)

Netto zero model Est. (SE)

Accuracy error correction

Accuracy completion

Table 6 Parameter estimates of intercept and fixations during preparation time for accuracy measures (logits)


Fig. 4 Probability for completion Accuracy scores, i.e. all context words correctly included (one versus three context words)

The Accuracy measure related to context words (completion) is significantly influenced by the intensity of the interaction with the TPSF (and the error) during preparation time. When participants do more reading at the beginning (during Preparation), their attempt to complete the sentence using the context words is more likely to be unsuccessful. To visualize the change related to the total length of fixations, we calculated the expected probability on the basis of the model. The plot in Fig. 4 shows how the accuracy of sentence completion drops off the longer participants spend reading during preparation, presumably looking for errors. This drop-off in accuracy appears precipitous when participants are required to hold three words in working memory, but is negligible when holding one word. For instance, the chance that three words are included correctly decreases by more than 50% when we compare the situation in which the TPSF is fixated for about 1 s with the situation in which it is fixated for 4 s (a chance of about .80 versus .40, respectively). The other accuracy measures do not seem to be affected, either positively or negatively, by a more intense interaction with the TPSF in the initial stage. No significant interaction effects were found.
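The shape of such an expected-probability curve can be sketched by assuming a logistic model in which the logit of completion accuracy is linear in the fixation time during preparation. The block below is a back-of-the-envelope illustration, not the fitted model: the intercept and slope are solved from the two approximate probabilities quoted above for three context words (.80 at about 1 s and .40 at about 4 s), and all names are ours.

```python
import math


def logit(p):
    """Log-odds of a probability."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Probability corresponding to a log-odds value."""
    return 1.0 / (1.0 + math.exp(-x))


# Approximate probabilities quoted in the text (three context words).
t1, p1 = 1.0, 0.80  # about 1 s of fixation time during preparation
t2, p2 = 4.0, 0.40  # about 4 s of fixation time during preparation

# Straight line through the two points on the logit scale.
slope = (logit(p2) - logit(p1)) / (t2 - t1)
intercept = logit(p1) - slope * t1


def expected_accuracy(seconds):
    """Expected probability of a correct completion after `seconds` of fixation."""
    return inv_logit(intercept + slope * seconds)


for seconds in (1.0, 2.0, 3.0, 4.0):
    print(seconds, round(expected_accuracy(seconds), 2))
```

A negative slope on the logit scale yields the monotonically falling curve described for the three-word condition; a slope near zero would give the nearly flat line described for one context word.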

Conclusion and discussion

The purpose of this study was to examine the role of reading in writing, specifically the coordination of editing with sentence composing. Anyone who writes, or watches others write (e.g., teachers), knows that writing often proceeds in fits and starts. Typically, a writer may produce a few words of a sentence, then stop to fix an error. Our purpose was to investigate the cognitive processes underlying this seemingly trivial pattern. Participants in the present study completed a series of items, which required completing a sentence and, often, correcting an error in the partial sentence that was presented to them as ‘text produced so far’. Participants could attempt the two tasks in either order.

Our first research question focused on the effect of task complexity on the coordination of writing processes and, more particularly, on the priority given to sentence production or error correction (correction strategy). For about 90% of total items, the participants completed the sentence first and then corrected the error. Moreover, when the task complexity increased, the tendency to delay error correction also increased.

The second research question more specifically concerned reading during writing, for the purposes of proofreading. The results of the eye fixation data revealed that, for more than 80% of the items with an error, participants fixated on the error at least once during the preparation period, i.e., before a key was pressed. Fixations on errors tended to be of longer duration than fixations outside the error zone, in the production zone. These results suggest that the error was at least partially detected during the preparation period, with executive function inhibiting the correcting response. For the experienced writers in our sample, error correcting may be relatively automatized. However, automaticity does not preclude inhibition. Cohen and Poldrack (2008) found that “the ability to inhibit a motor response does not decrease with automaticity, suggesting that some aspects of automatic behavior are not ballistic” (p. 108).

Finally, we observed an interesting relation between reading behavior and the accuracy of task completion. As participants spent more time searching for the error, the probability of successfully completing the sentence declined (cf. last research question on Accuracy). This inverse relation was only observed when sentence composing was relatively more difficult (i.e., when participants were required to compose with three words, as opposed to one). This suggests that more complex writing problems might alter the dynamics of composing.
This leaves open the possibility that reading to detect errors may draw cognitive resources away from formulating. A possible implication could be that less skilled typists might be at a disadvantage. We expect that they are more likely to forget the focus of the new content they were trying to compose.

Although the dynamics of writing may appear somewhat chaotic, with one process interrupting another, the present results suggest a certain orderliness. The results suggest that editing can interfere with composing when the two processes, both of which involve active processing, happen concurrently. That is, if the writer is attempting to mentally arrange words into a phrase, but then stops to reread the text produced so far (TPSF) and detects an error, the lexical representations may begin to fade in working memory. However, the two tasks (sentence composing and error correcting) need not be processed concurrently. In fact, for about 90% of the items, participants opted to correct the error after completing the sentence, although in most cases they had already detected the error. Scheduling tasks in this way avoids direct conflicts for attentional resources, by creating a “lull” in processing. In terms of completing the sentence, the packet of information has moved from formulating to transcription, with no new sentence information in the pipeline. This lull in processing frees attentional resources for other purposes, e.g., error detecting. Although writing researchers have long characterized editing as “interrupting” composing (Hayes & Flower, 1980), a better term might be “opportunistic.” Because
editing must share attentional resources with other processes, it must “wait” until those resources become available.

While executive processing may bias the scheduling of tasks to minimize direct conflicts over limited attentional resources, giving rise to a certain orderliness, the dynamics of composing may necessarily entail a haphazard component. Proofreading or editing involves a search for errors, and searching is a probabilistic endeavor. Typographical errors and spelling errors arise from various causes during text production, such that the writer may be initially unaware of them. Editing involves searching for errors that may or may not be there, and which (if present) may be more or less salient. Relatively weak literacy skills may contribute to a writer making errors, while also making him or her less able to detect and correct them. Further, motivational factors may influence the writer’s ability and inclination to edit. Thus, the influence of any or all of these factors can make a particular error more or less easy to detect, i.e., require more or less extensive processing. The present results suggest that more extensive processing for editing can interfere with sentence composing when the two tasks occur concurrently. It would seem that executive processing mitigates this kind of interference by scheduling tasks to minimize direct conflicts over attentional resources.

Writing is often called a “complex process” and the present study examines one aspect of this complexity, the coordination of editing and sentence composing. Although the dynamics of composing, i.e., the way in which writers coordinate various types of problem-solving, may appear chaotic, they appear to reflect patterns of processes that arise from sharing limited attentional resources. Specifically, executive function appears to schedule tasks to minimize interference, so as to keep attentional resources focused upon sentence composing, at times perhaps by inhibiting editing.
These results tend to support the “process approach” to writing instruction, in which the stages of writing (i.e., planning, drafting, revising, and editing) are modeled as somewhat distinct stages, at least on the sentence level.

To conclude, we want to reflect critically on certain limitations of the results and give some suggestions for further research. The first issue relates to the validity of the writing context we created to operationalize the kind of conflict between formulating and error correction present in normal writing. In ‘normal’ writing, writers read and reread the text that they themselves have typed. In the present experiment, we attempted to simulate the writer’s familiarity with his or her own text via auditory priming (participants heard the spoken version of the partial sentence before reading the written version). So, although the writers were stimulated to build a mental representation of the partial sentence, the participants did not actually type the first part of the sentence themselves. As we all know, there is often something like a proprioceptive sense when we make a typing mistake. In those cases we often immediately reread the just-produced text to check whether there was a slip of the finger or not (see footnote 12). It is unclear how these kinds of differences might have affected our results. Therefore, comparing the data of controlled and more natural writing activities might be very useful.

12 This process of rereading triggered by the possibility of having made a typing mistake is probably also different for touch typists, because they follow the emerging text more constantly on the screen (see also Johansson et al., 2009, this issue).

Another issue we would like to raise in this discussion is related to the method we used in the data analysis. We used multilevel regression to analyze our data. The main advantage of this multilevel approach is that it minimizes the need for data aggregation and that it allows the researcher to analyze the data from a hierarchical perspective. The hierarchical models we presented consisted of two levels, viz. the participant level and the sentence or trial level. Recently, Barr (2008) published an article in which he presented a new framework to analyze eye-tracking data, also using multilevel logistic regression (see also Quené & Van den Bergh, 2008). In his approach he suggests analyzing the data at three levels. The lowest level in his model is the level of individual observations, i.e., individual data frames representing single fixations. Therefore, all individual data points logged by the eye-tracking software could be included in the analysis without aggregation, allowing a more fine-grained analysis of changes over time in the distribution of fixations. His approach certainly opens new perspectives, also for the analysis of the kind of data we collected in this experiment. The advantage would be that in such a model the variance between sentences as well as the variance between individuals and the residual variance (i.e., the interaction between sentence and subjects as well as the error variance) are estimated. The practical manual Barr presents in the second part of his article is certainly an interesting guideline for conducting this three-level hierarchical multilevel analysis.

Third, in this article we mainly focused on how writers (re)read partial sentences that contain errors.
Although half of the partial sentences in fact contained no error, we hesitate to formulate conclusions about the (re)reading behavior of these correct instances. The main reason is that we think that it might have been influenced by the experimental context. Post-hoc interviews and our own informal observations suggest that, sometimes, when reading a correct sentence, some kind of ‘searching’ behavior occurs to verify whether perhaps the error was overlooked. So, writers might sometimes be biased by the fact that they know that many partial sentences did contain an error. Because it is certainly worthwhile to better describe the reading interaction with a correct TPSF, we would recommend creating a ‘correct only’ condition in a follow-up experiment, so as to avoid experimental bias from incorrect sentences. This condition would also allow researchers to better compare the reading behavior in the correct and the incorrect condition.

The last element refers to the threshold we used in filtering the eye fixations. Based on the research on reading (Rayner, 1978, since replicated by many others) we opted for a fixation threshold of 200 ms. The key challenge in identifying fixations is to distinguish them reliably from saccades. In this respect, the minimum fixation duration is critical. While a temporal fixation threshold of 200 ms has become the de facto standard in many reading studies, other studies have shown the relevance of somewhat shorter fixations. Manor and Gordon (2003), for instance, argue that this threshold may not be optimal for quantifying eye movements in other types of visuocognitive tasks. They systematically explored temporal fixation thresholds below 200 ms, using biologically relevant (human face) and abstract (complex geometric) stimuli. Their results show that fixations (<200 ms)
significantly altered spatiotemporal patterns of fixation for both face and geometric stimuli. We think that a comparable study in the context of writing would be very worthwhile to answer questions like: What is the threshold for fixations when skimming the text produced so far to identify a typing mistake (taking into account that there is a kind of proprioceptive awareness that signals, for instance, a slip of the finger and urges the writer to skim this typing error on the screen)? What is a typical pattern for a touch typist who follows the emerging text on the screen? As mentioned in the introduction, a lot of these questions are still unanswered, but better insights into these aspects of reading and writing are necessary building blocks for a better understanding of the interaction between both cognitive processes.

Acknowledgments We would like to thank Dr. Sven De Maeyer, who patiently assisted us in building the multilevel models. Maaike Loncke cooperated with us in preparing the materials for the experiment and Bart Van de Velde did a wonderful job in programming the experiment. We would also like to thank David Galbraith and the other reviewers for their helpful comments on an earlier draft of this article. The experiment was conducted in the eye-tracking lab of the University of Tilburg, The Netherlands. We would like to thank Prof. A. Maes and his team for hosting us and giving us the opportunity to use their lab. The project was partly funded as a BOF/NOI (New Research Initiatives) project by the University of Antwerp (2005–2007).

Appendix

See Table 7.

Table 7 Zero models: Parameter estimates of intercept for different characteristics of eye fixations (n = 861)

| Characteristic | Fixed part: Intercept, Est. (SE) | Random part: Participant level, Est. (SE) | Random part: Sentence level, Est. (SE) | Loglikelihood: Deviance |
|---|---|---|---|---|
| General | | | | |
| Number of fixations | 22.807 (1.085) | 35.398 (9.406) | 56.363 (2.768) | 6,006.126 |
| Number of fixations in error zone / Length of fixation in error zone | 1.263 (0.082) | 0.161 (0.054) | 1.426 (0.070) | 2,792.851 |
| Total distance saccades (x-value, number of horizontal pixels) | 4,071.753 (147.697) | 622,881.500 (174,404.900) | 1,885,886.000 (92,628.630) | 14,957.230 |
| Preparation time | | | | |
| Fixation in error zone before text completion (yes/no) | 1.017 (0.230) | 1.381 (0.230) | | |
| Number of fixation in TPSF zone during preparation time | 4.835 (0.262) | 1.801 (0.500) | 5.098 (0.250) | 3,920.161 |
| Length of fixation in TPSF and Error zone during preparation time | 0.797 (0.062) | 0.107 (0.031) | 0.416 (0.020) | 1,754.106 |
| First fixations | | | | |
| Duration between first and second fixation in error zone | 384.980 (28.140) | 6,471.448 (6,288.466) | 500,340.100 (24,559.650) | 13,751.850 |
| Duration between first and last fixation in error zone | 4,511.389 (362.576) | 3,279,165.000 (1,049,103.000) | 23,557,560.000 (1,157,022.000) | 17,108.040 |
| Error zone | | | | |
| Transitions to error zone | 1.159 (0.067) | 0.110 (0.036) | 0.848 (0.042) | 2,348.414 |
| Transitions to error zone from production zone | 0.310 (0.026) | 0.009 (0.005) | 0.339 (0.017) | 1,527.444 |
| Transitions to error zone within TPSF | 0.851 (0.063) | 0.105 (0.032) | 0.560 (0.027) | 2,000.300 |
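To make the random part of a zero model easier to read, the two variance estimates can be expressed as shares of the total variance; the participant-level share is the usual intraclass correlation of a two-level random-intercept model. The sketch below is only an illustration under that assumption: the function name is hypothetical, and the input values are the estimates printed in Table 7 for the number of fixations (35.398 at the participant level, 56.363 at the sentence level).

```python
def variance_shares(participant_var: float, sentence_var: float) -> tuple[float, float]:
    """Return (participant share, sentence share) of the total variance."""
    total = participant_var + sentence_var
    if total <= 0:
        raise ValueError("variance components must sum to a positive value")
    return participant_var / total, sentence_var / total


# Zero-model variance estimates for "Number of fixations" as printed in Table 7.
participant_share, sentence_share = variance_shares(35.398, 56.363)

print(f"participant level: {participant_share:.3f}")  # prints participant level: 0.386
print(f"sentence level: {sentence_share:.3f}")        # prints sentence level: 0.614
```

Read this way, a little under 40% of the variance in the number of fixations lies between participants, and the remainder lies at the sentence (trial) level.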

References

Alamargot, D., Chesnet, D., Dansac, C., & Ros, C. (2006). Eye and pen: A new device to study reading during writing. Behavior Research Methods, Instruments and Computers, 38(2), 287–299. Allal, L., Chanquoy, L., & Largy, P. (Eds.) & G. Rijlaarsdam (Series Ed.) (2004). Revision: Cognitive and Instructional Processes: Studies in Writing 13: Springer. Andersson, B., Holmqvist, K., Holsanova, J., Johansson, V., Karlsson, H., Strömqvist, S., et al. (2006). Combining keystroke logging with eye-tracking. In L. Van Waes, M. Leijten, C. Neuwirth, et al. (Eds.), Writing and digital media (Vol. 17, pp. 166–172). Oxford: Elsevier. Barr, D. J. (2008). Analysing ‘visual world’ eyetracking data using multilevel regression. Journal of Memory and Language, 59(4), 457–474. Blau, S. (1983). Invisible writing: Investigating cognitive processes in writing. College, Composition and Communication, 34, 297–312. Cohen, J. R., & Poldrack, R. A. (2008). Automaticity in motor sequence learning does not impair response inhibition. Psychonomic Bulletin & Review, 15(1), 108–115. Daneman, M., & Stainton, M. (1993). The generation effect in reading and proofreading: Is it easier or harder to detect errors in one’s own writing? Reading and Writing, 5(3), 297–313. Drieghe, D., Rayner, K., & Pollatsek, A. (2005). Eye movements and word skipping during reading revisited. Journal of Experimental Psychology: Human Perception and Performance, 31(5), 954–959. Elbow, P. (1973). Writing without teachers. New York: Oxford University Press. Elbow, P. (1981). Writing with power. Oxford, MS: Oxford University Press. EyeResponseTechnologies. (2002). Gazetracker. Charlottesville, VA: Eye response technologies. Fedorenko, E., Gibson, E., & Rohde, D. (2006). The nature of working memory capacity in sentence comprehension: Evidence against domain-specific working memory resources. Journal of Memory and Language, 54(4), 541. Flower, L., Hayes, J. R., Carey, L., Schriver, K., & Stratman, J. (1986).
Detection, diagnosis and the strategies of revision. College, Composition and Communication, 37, 16–55.

Galbraith, D., & Torrance, M. (2004). Revision in the context of different drafting strategies. In L. Allal, L. Chanquoy, & P. Largy (Eds.), Revision: Cognitive and instructional processes (Vol. 13, pp. 63–85). Dordrecht: Kluwer. Goldstein, H. (1995). Multilevel statistical analysis. London: Edward Arnold. Hacker, D. J. (1997). Comprehension monitoring of written discourse across early-to-middle adolescence. Reading and Writing, 9(3), 207–240. Hacker, D. J., Plumb, C. S., Butterfield, E. C., Quathamer, D., & Heineken, E. (1994). Text revision: Detection and correction of errors. Journal of Educational Psychology, 86(1), 65–78. Hayes, J. R. (1996). A new framework for understanding cognition and affect in writing. In C. M. Levy & S. Ransdell (Eds.), The science of writing: Theories, methods, individual differences, and applications (pp. 1–27). Mahwah, NJ: Lawrence Erlbaum Associates. Hayes, J. R., & Flower, L. S. (1980). Identifying the organization of writing processes. In L. W. Gregg & E. R. Steinberg (Eds.), Cognitive processes in writing (pp. 3–30). Mahwah, New Jersey: Lawrence Erlbaum Associates. Hayes, J. R., Flower, L., Schriver, K., Stratman, J., & Carey, L. (1987). Cognitive processes in revision. In S. Rosenberg (Ed.), Reading, writing, and language processing (Vol. 2, pp. 176–240). Cambridge: Cambridge University Press. Hayes, J. R., & Hayes, L. S. (1980). Identifying the organization of writing processes. In L. W. Gregg & E. R. Steinberg (Eds.), Cognitive processes in writing (pp. 3–30). Mahwah, New Jersey: Lawrence Erlbaum Associates. Inhoff, A., Eiter, B., Radach, R., & Juhasz, B. (2003). Distinct subsystems for the parafoveal processing of spatial and linguistic information during eye fixations in reading. The Quarterly Journal of Experimental Psychology, 56(5), 803–827. Johansson, R., Wengelin, Å., Johansson, V., & Holmqvist, K. (2009). Looking at the keyboard or the monitor: Relationship with text production processes.
Reading and Writing: An Interdisciplinary Journal (this issue). Kaufer, D. S., Hayes, J. R., & Flower, L. (1986). Composing written sentences. Research in the Teaching of English, 20(2), 121–140. Kellogg, R. T. (1988). Attentional overload and writing performance: Effects of rough draft and outline strategies. Journal of Experimental Psychology. Learning, Memory, and Cognition, 14(2), 355–365. Kellogg, R. T. (1996). A model of working memory in writing. In C. M. Levy & S. E. Ransdell (Eds.), The science of writing: Theories, methods, individual differences and applications (pp. 57–71). Hillsdale, NJ: Lawrence Erlbaum. Kellogg, R. T. (1999). Components of working memory in text production. In M. Torrance & G. Jeffery (Eds.), The cognitive demands of writing: processing capacity and working memory effects in text production (Vol. 3, pp. 43–61). Amsterdam: Amsterdam University Press. Kellogg, R. T. (2001). Competition for working memory among writing processes. American Journal of Psychology, 114, 175–191. Kellogg, R. T. (2004). Working memory components in written sentence generation. American Journal of Psychology, 117, 341–361. Larigauderie, P., Gaonac’h, D., & Lacroix, N. (1998). Working memory and error detection in texts: What are the roles of the central executive and the phonological loop? Applied Cognitive Psychology, 12, 505–527. Leijten, M. (2007a). Writing and speech recognition: observing error correction strategies of professional writers (Vol. 160). Utrecht, The Netherlands: LOT. Leijten, M. (2007b). How do writers adapt to speech recognition software? The influence of learning styles on writing processes in speech technology environments. In M. Torrance, L. Van Waes, & D. Galbraith (Eds.), Writing and cognition: Research and applications (Vol. 20, pp. 279–292). Oxford: Elsevier. Leijten, M., De Ridder, I., Ransdell, S., & Van Waes, L. (2007). 
The effect of errors in the text produced so far strategy decisions based on error span, input mode, and lexicality. Research paper University of Antwerp, Faculty of Applied Economics, 9, 33. Leijten, M., & Van Waes, L. (2005). Writing with speech recognition: The adaptation process of professional writers with and without dictating experience. Interacting with Computers, 17(6), 736–772. Levy, B. A. (1983). Proofreading familiar text: Constraints on visual processing. Memory & Cognition, 11(1), 1–12. Levy, B. A., & Begin, J. (1984). Proofreading familiar text: Allocating resources to perceptual and conceptual processes. Memory & Cognition, 12(6), 621–632.

Levy, B. A., Di Persio, R., & Hollingshead, A. (1992). Fluent rereading: Repetition, automaticity, and discrepancy. Journal of Experimental Psychology. Learning, Memory, and Cognition, 18, 957–971. Manor, B. R., & Gordon, E. (2003). Defining the temporal threshold for ocular fixation in free-viewing visuocognitive tasks. Journal of Neuroscience Methods, 128, 85–93. Matsuhashi, A. (1981). Pausing and planning: The tempo of written discourse production. Research in the Teaching of English, 15(2), 113–134. McCutchen, D. (1986). Domain knowledge and linguistic knowledge in the development of writing ability. Journal of Memory and Language, 25(4), 431–444. McCutchen, D. (1996). A capacity theory of writing: Working memory in composition. Educational Psychology Review, 8(3), 299–325. Pilotti, M., Chodorow, M., & Thornton, K. C. (2004). Error detection in text: Do feedback and familiarity help? The Journal of General Psychology, 131(4), 242–266. Pilotti, M., Maxwell, K., & Chodorow, M. (2006). Does the effect of familiarity on proofreading change with encoding task and time? Journal of General Psychology, 133(3), 287–299. Piolat, A., Roussey, J. Y., Olive, T., & Amada, M. (2004). Processing time and cognitive effort in revision: effects of error type and of working memory capacity. In L. Allal, L. Chanquoy, P. Largy, & Y. Rouiller (Eds.), Revision: Cognitive and instructional processes (pp. 21–38). Dordrecht: Kluwer Academic Publishers. Quené, H., & Van den Bergh, H. (2004). On multi-level modeling of data from repeated measures designs: A tutorial. Speech Communication, 43(1–2), 103–121. Quené, H., & Van den Bergh, H. (2008). Examples of mixed-effects modeling with crossed random effects and with binomial data. Journal of Memory and Language, 59(4), 413–425. Quinlan, T., Loncke, M., Leijten, M., & Van Waes, L. (2009). Writers juggle problem-solving: the role of executive function in writing (submitted). Rabbitt, P. (1978). Detection of errors by skilled typists.
Ergonomics, 21, 945–958. Rabbitt, P., Cummings, P., & Vyas, S. (1978). Some errors of perceptual analysis in visual search can be detected and corrected. Quarterly Journal of Experimental Psychology, 30, 417–427. Rayner, K. (1978). Eye movements in reading and information processing. Psychological Bulletin, 85(3), 618–660. Rayner, K. (1998). Eye movements in reading and information processing: 20 years of research. Psychological Bulletin, 124(3), 372–422. Rayner, K. (2004). Eye movements, cognitive processes, and reading. Studies in Psychology and Behavior, 2, 482–488. Rayner, K., & Juhasz, B. J. (2004). Eye movements in reading: Old questions and new directions. European Journal of Cognitive Psychology, 16, 340–352. Schilperoord, J. (2002). On the cognitive status of pauses in discourse production. In T. Olive & C. M. Levy (Eds.), Contemporary tools and techniques for studying writing (pp. 61–90). Dordrecht/ Boston/London: Kluwer Academic Publishers. Severinson Eklundh, K. S. (1994). Linear and non-linear strategies in computer-based writing. Computers and Composition, 11, 203–216. Simpson, S., & Torrance, M. (2007). EyeWrite (Version 5.1). Sternberg, S. (1969). The discovery of processing stages: Extensions of Donders’s method. Acta Psychologica, 30, 235–276. Sullivan, K. P. H., & Lindgren, E. (2006). Computer key-stroke logging and writing. Oxford: Elsevier Science. Van den Bergh, H., & Rijlaarsdam, G. (1996). The dynamics of composing: Modelling writing process data. In C. M. Levy & S. E. Ransdell (Eds.), The science of writing: Theories, methods, individual differences, and applications (pp. 207–232). Mahwah, NJ: Lawrence Erlbaum Associates. Van Waes, L., & Leijten, M. (2006). Logging writing processes with Inputlog. In L. Van Waes, M. Leijten, & C. Neuwirth (Eds.), Writing and digital media (Vol. 17, pp. 158–166). Oxford, UK: Elsevier. Van Waes, L., & Schellens, P. J. (2003). 
Writing profiles: The effect of the writing mode on pausing and revision patterns of experienced writers. Journal of Pragmatics, 35(6), 829–853. Voss, J. F., Vesonder, G. T., & Spilich, G. J. (1980). Text generation and recall by high-knowledge and low-knowledge individuals. Journal of Verbal Learning & Verbal Behavior, 19(6), 651–667. Wengelin, A., Torrance, M., Holmqvist, K., Simpson, S., Galbraith, D., Johansson, V., et al. (2009). Combined eye-tracking and keystroke-logging methods for studying cognitive processes in text production. Behavior Research Methods, 41, 337–351.

Copyright of Reading & Writing is the property of Springer Science & Business Media B.V. and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use.