American Educational Research Journal http://aerj.aera.net

Assessing Barriers to the Reform of U.S. Mathematics Instruction From an International Perspective Laura M. Desimone, Thomas Smith, David Baker and Koji Ueno Am Educ Res J 2005; 42; 501 DOI: 10.3102/00028312042003501 The online version of this article can be found at: http://aer.sagepub.com/cgi/content/abstract/42/3/501

Published on behalf of

http://www.aera.net

By http://www.sagepublications.com

Downloaded from http://aerj.aera.net by Armando Loera on October 31, 2009

American Educational Research Journal Fall 2005, Vol. 42, No. 3, pp. 501–535

Assessing Barriers to the Reform of U.S. Mathematics Instruction From an International Perspective Laura M. Desimone and Thomas Smith Vanderbilt University David Baker Pennsylvania State University Koji Ueno Florida State University

The authors assessed five commonly perceived barriers to increased use of conceptual teaching in mathematics in the United States related to teacher autonomy, trade-offs with computational strategies, student achievement, class size, and teacher qualifications. These barriers were examined through the use of data from nationally representative samples of eighth-grade mathematics classrooms across 38 nations that took part in the Third International Mathematics and Science Study in 1999 and follow-up analyses involving data from the high-achieving nations of Japan and Singapore. Findings suggest that most of these perceived barriers are not impediments to the use of conceptual teaching strategies in other countries, and the comparative findings hold promise for alternative paradigms for organizing better mathematics instruction in the United States. KEYWORDS: international comparison, mathematics instruction, school reform, teaching quality, TIMSS

When the results of the 1994 Third International Mathematics and Science Study (TIMSS) were presented to the public by the U.S. government, what probably caught the attention of United States education policymakers and the public more than the mountain of statistics and charts was the videotape showing three mathematics teachers (one from the United States, one from Japan, and one from Germany), all teaching the same topic. The images on the screen drove home the fact that unlike the U.S. teacher, who relied on lecture, drill, and practice, the Japanese and German teachers used extensive conceptual challenges, encouraging students to explore, investigate, and solve problems with greater insight into mathematics principles.

As a result of this and more recent large-scale cross-national studies, such as the 1995 and 1999 versions of TIMSS (TIMSS-95 and TIMSS-99), international comparisons have figured prominently in debates about mathematics education in the United States (Linn, Lewis, Tsuchida, & Songer, 2000; Smith, 2004; Smith & Baker, 2002). Many United States mathematics reformers have imported “conceptual teaching” models from the high-achieving countries of Japan and Singapore, including “Singapore mathematics” (Federal Publications, 2003) and the “Japanese lesson-study” model (Fernandez & Yoshida, 2004). Even though there is debate about the exact characteristics of effective teaching and the appropriate balance between conceptual and computational teaching approaches (Carpenter & Fennema, 1991; Ernest, 1998, 1999; Loveless, 2001), a greater emphasis on conceptual instruction is now widely advised in the United States and is a core component of current state and national education policies (e.g., National Council of Teachers of Mathematics, 2000; No Child Left Behind Act of 2001).

From both U.S. and international research, it is clear that quality of instruction affects student academic achievement in mathematics (Fuller, Hua, & Snyder, 1994; Rivkin, Hanushek, & Kain, 1998; Sanders & Horn, 1998; Schmidt, McKnight, & Raizen, 1997). Furthermore, there is general agreement on the benefits of the use of conceptual teaching (also called “higher-order instruction” or “teaching for understanding” in mathematics) (e.g., Carpenter, Fennema, Peterson, Chiang, & Loef, 1989). Conceptual teaching emphasizes real-world problem solving and elicits student reflection. It involves working with problems with no obvious solution, discussing alternative hypotheses, and using investigation to solve problems (Hiebert et al., 1996). Conceptual teaching can result in enhanced problem-solving abilities among learners (Schoenfeld, 1985; Silver, 1985). By contrast, too much reliance on procedural or computational instructional practices does not allow students to develop a deeper understanding of the mathematical conceptions behind the procedures (Baroody & Benson, 2001; National Council of Teachers of Mathematics, 2000; Romberg, 2000). Although routine drill and practice may assist students in developing mathematical thinking in some circumstances (Li, 1999), such procedural learning is usually linked to short-term memorization of facts.

We do not intend to add to the debate on the relative effectiveness of conceptual teaching. Rather, we challenge current assumptions about how the teaching of mathematics in the United States is organized and how it compares with mathematics teaching in other nations. We undertook two different types of analysis. First, starting from the debate in the U.S. mathematics reform community regarding the relative benefits of different types of instruction (e.g., Loveless, 2001), we sought to estimate the extent to which conceptual and computational teaching are occurring across U.S. eighth-grade mathematics classrooms. Second, we examined whether several often-stated barriers to effective teaching really are impediments to the use of more conceptual approaches among teachers.

LAURA M. DESIMONE is an assistant professor in the Department of Public Policy and Education, Vanderbilt University, #514 Peabody College, Nashville, TN 37203-5721; e-mail: l.desimone@vanderbilt.edu. Her research focuses on policy effects on instruction and student achievement.

THOMAS SMITH is an assistant professor in the Department of Public Policy and Education, Vanderbilt University, #514 Peabody College, Nashville, TN 37203-5721; e-mail: thomas.smith@vanderbilt.edu. His areas of specialization are the organizational and policy contexts of teacher and teaching quality.

DAVID BAKER is a professor in the Sociology Department, Pennsylvania State University, 300 Rackley Building, University Park, PA 16802; e-mail: dpb4@psu.edu. He is interested in schooling as an institution and its transformative powers in modern society.

KOJI UENO is an assistant professor, Department of Sociology, Florida State University, 526 Bellamy Building, Tallahassee, FL 32306-2270; e-mail: kueno@fsu.edu. His research interests include research methods, adolescence, mental health, and social networks.

Is the Distribution of Teaching Quality Different in the United States Than in Other Countries?

Cross-national comparisons involving purposefully selected samples of teachers (not nationally representative samples) have documented that conceptual instruction is used more in high-performing countries such as Japan and Singapore than in mid-scoring countries such as the United States (Hess & Azuma, 1991; Jiang, 1995; Stigler & Hiebert, 1997; Tsuchida & Lewis, 1996). Although it has been assumed that these studies summarize the behavior of the nation’s mathematics teachers, the exact amount and distribution of these strategies have never been examined nationally or compared cross nationally.

In our study, we examined the distribution of teaching quality in the United States and other countries and tested several assumptions about why differences in this distribution may exist. We compared the amount and distribution of conceptual and computational mathematics teaching across multiple nations using nationally representative samples of eighth-grade mathematics classrooms. Specifically, we were interested in whether U.S. teachers as a whole use significantly less conceptual instruction, or significantly more computational approaches, than teachers in other nations, particularly those nations regarded as examples of high-achieving educational systems (e.g., Japan and Singapore).

We undertook a detailed analysis of an extensive teacher questionnaire administered as part of TIMSS-99 to develop indicators of teaching strategies that measure conceptual and computational instruction. Using these
indicators, we examined within- and between-country variations in mathematics teaching strategies across eighth-grade classrooms in 38 nations participating in TIMSS-99.

Perceived Barriers to Conceptual Teaching

From the available literature on U.S. mathematics reform, some might infer that there are clear barriers to widespread conceptual teaching of mathematics that are unique to U.S. public schools. These barriers cluster around at least five assumptions, derived from the literature, about why conceptual teaching is not more widespread in the United States.

The first assumption is that the individualism and autonomy that characterize teaching in the United States lead to wide variation among teachers in the use of conceptual and computational strategies, hence making it difficult to implement widespread change. This statement depicts the U.S. public school system as a very decentralized and technically decoupled system without strong administrative or professional instructional guidance; thus, for the most part, teachers must determine their own instructional approaches. Traditionally, teachers could simply “close the door” and teach using any method or content they chose (Cuban, 1993; Jackson, 1986). This is changing somewhat in the current accountability environment, in which teachers are being held accountable for student achievement scores, but the tradition of local control of curriculum and instruction is still a central feature of U.S. school systems (Cross, 1999; Rowan, 1990).

Accordingly, it is assumed that the qualities just described can lead to great variation in the use of instructional approaches. For example, studies have shown that teachers vary substantially in terms of the content they teach (Porter, 1989), that much of this variation in instruction is within rather than between schools (Newmann, King, & Youngs, 2000; Raudenbush, Rowan, & Cheong, 1993; Rowan, Chiang, & Miller, 1997), and that, in comparison with other countries, the United States may have less coherent and consistent curricula across all schools (Schmidt, Houang, & Cogan, 2002).1 In contrast, countries such as Japan and Singapore are thought to have a tightly controlled curriculum and clearer management of instructional approaches; teaching is more prescribed and thus believed to be more uniform. Furthermore, to the degree that U.S. teachers’ early training does not include more conceptual approaches (Wilson, Floden, & Ferrini-Mundy, 2001), this autonomy can lessen the use of conceptual strategies in their instruction. Moreover, the unique governance structure of U.S. schooling is thought to lead to difficulties in changing the teaching behavior of large numbers of teachers.

The second assumption is that if teachers use more conceptual strategies, they will have to reduce their use of computational strategies by an equal amount. Teachers may think that if they choose a more conceptual approach to their instruction, they will not provide instruction on how to
actually solve standard problems. U.S. teachers may perceive the choice as dichotomous, which may lead them to believe that they cannot teach procedural and conceptual understanding simultaneously. Such beliefs among teachers about the inherent trade-offs of conceptual and computational approaches may be derived from their experiences in teacher preparation programs. Many teachers are not prepared to implement practices based on higher-order thinking strategies in mathematics (Cohen, 1990; Elmore, Peterson, & McCarthy, 1996; Grant, Peterson, & Shojgreen-Downer, 1996). A substantial number of teachers learned to teach using a model of instruction that focuses heavily on memorization of facts without also emphasizing deeper understanding of subject knowledge (Ball, 1996; Cohen, McLaughlin, & Talbert, 1993; Ma, 1999; Wilson et al., 2001; Wilson, Floden, & Ferrini-Mundy, 2002). Thus, if teachers are not taught ways of integrating procedural and conceptual instruction, they may be likely to think that an increase in one type requires a decrease in another.

The third assumption is that U.S. teachers tend to think conceptual teaching is appropriate only for high-performing students, while computational instruction is appropriate for low-performing students. In the United States, lower-achieving students are more likely to experience computational teaching, and higher-achieving students are more likely to experience conceptual teaching (Barr, Wiratchai, & Dreeben, 1983; Clark & Peterson, 1986; Gamoran, 1986; Meier, 1995; Porter, Kirst, Osthoff, Smithson, & Schneider, 1993; Turnbull, Welsh, Heid, Davis, & Ratnofsky, 1999). Researchers have suggested that the emphasis on lower-level content is a central problem that explains the poor performance of disadvantaged students (Knapp, 1997; McKnight et al., 1987; Romberg, 1988). Studies have shown that teachers tend to match the level of their instruction to their perception of students’ prior learning experiences and learning needs (Clark & Peterson, 1986; McAninch, 1993; Raudenbush et al., 1993; Shavelson & Stern, 1981), resulting in an emphasis on procedural learning for low achievers. Furthermore, several reform advocates have taken the position that a focus on computation and procedural learning is appropriate (Geary, 2001; Hirsch, 2001).

There is debate regarding the appropriate mix of conceptual and computational strategies (see Loveless, 2001). For example, recent evidence suggests that low-achieving students can master more demanding intellectual content while simultaneously learning basic skills (Lo Cicero, De La Cruz, & Fuson, 1999; Mayer, 1999) and that demanding academic work is advantageous to underachievers (Knapp, 1995). Proponents of conceptual-oriented teaching suggest that students do not need to know computational procedures before understanding mathematics (Burrill, 2001), that it is difficult for students to understand mathematics once they have learned the rote procedures (Hiebert, 1999), and that there is better “transfer” when students learn through conceptual understanding rather than memorization (Bransford, Brown, & Cocking, 1999).
In contrast, another side of the debate contends that “reform” teaching is not consistent with how children learn and is doomed to fail, while direct and explicit instruction appears to be a more effective strategy (Geary, 1994, 2001; Slavin, Madden, Karweit, Livermon, & Dolan, 1990). Still others have reported mixed findings on the extent to which conceptual and computational practices differentially influence student achievement (Shouse, 2001).

The fourth assumption is that classes with large numbers of students prevent U.S. teachers from using conceptual instruction. Some research has shown that class size has an impact on the type of instruction teachers use (King, 1999); for example, teachers of smaller classes use more individualized instruction (Shkolnik, 1999). Smaller class size has also been associated with increased student achievement (Finn & Achilles, 1999); one explanation for this relationship is that smaller classes are a key part of the capacity that teachers need to implement conceptual instruction in the classroom (V. E. Lee, Bryk, & Smith, 1993; Newmann et al., 2000). Research focusing on the extent to which class size mediates instruction has produced mixed results, however. For example, the Tennessee Class Size Study provided high-quality experimental evidence on the effects of class-size reductions in the early elementary grades (Finn & Achilles, 1999; Mosteller, 1995; Mosteller, Light, & Sachs, 1996; Nye, Hedges, & Konstantopoulos, 2002) but offered considerably less information regarding the actual mechanisms through which these reductions led to improved student outcomes. One study involving TIMSS-95 data revealed little relationship between use of instructional strategies and class size (Pong & Pallas, 2001), and a small amount of evidence supports the notion that teachers who use more individualized instruction in smaller classes change neither the topics on which they focus (e.g., fractions or decimals) nor the strategies they use to teach these topics (e.g., conceptual vs. computational) (Betts & Shkolnik, 1999).

The fifth assumption is that, in the United States, only teachers with a strong knowledge of mathematics and how students learn mathematics have the necessary skills to use conceptual strategies, which require more in-depth knowledge than computational approaches. Teacher knowledge and skills have an important influence on teachers’ choice of instructional content (Ball, 1996; Cohen & Ball, 1990; Cohen et al., 1993; Gerges, 2001; Louis & Marks, 1998). Furthermore, a substantial number of reformers believe that the standards-based reform drive encouraging teachers to teach to higher standards requires much greater depth, sophistication, and grasp of the academic subject than most teachers in the United States possess (Cohen, 1990; Cohen & Spillane, 1992; Elmore & Burney, 1996; Elmore et al., 1996; Grant et al., 1996; Sizer, 1992). Thus, one common reason offered for the proclivity of U.S. teachers’ use of more procedural than conceptual teaching is that computational strategies require less in-depth knowledge of mathematics, and teachers in the United States generally do
not have the knowledge and skills required for conceptual teaching in mathematics (Ma, 1999).

Studying Perceived Barriers

The five assumptions just outlined about what inhibits the reform of U.S. mathematics teaching have never been put to a direct test. Do these qualities of classrooms and teachers really impede the use of conceptual strategies and increase computational ones? Furthermore, do these perceived barriers work uniquely to influence instructional strategies in U.S. classrooms as compared with classrooms in other nations? By examining how other countries organize teaching, we can challenge the conventional wisdom associated with how teaching practice could and should be organized in the United States.

Method

Data

We used data from the 38 countries that participated in TIMSS-99. This study, conducted by the International Association for the Evaluation of Educational Achievement, assessed the mathematics and science performance of U.S. students in comparison with their peers in other nations. TIMSS-99 also collected information on schools, curricula, instruction, lessons, and the lives of teachers and students in an attempt to provide an understanding of the educational context in which mathematics and science learning takes place (Gonzales et al., 2000). The target population for TIMSS-99 was students enrolled in the upper of the two grades containing the largest proportion of 13-year-olds, corresponding to eighth grade in most countries. Intact classrooms of eighth-grade mathematics students were sampled, resulting in about one randomly selected mathematics class from approximately 150 public and private schools in each country. Data were derived from student assessments, surveys, video studies, and case studies. In the present analysis, we used the surveys completed by students’ mathematics teachers in 1999, as well as student mathematics assessment and background questionnaire data aggregated to the teacher level. Our operational sample included only the 6,171 teachers in the 38 countries who answered all of the teaching items.

Measures

To develop valid and reliable measures for conceptual and computational instruction, we first created a definitional framework for these approaches. We are interested in the quality of the cognitive strategies teachers use in instruction. In addition to the actual topic of study (i.e., algebra or geometry), the cognitive strategy that a teacher chooses for the topic is also part of the content of instruction. One of the most important choices a teacher makes in teaching mathematics is the relative emphasis placed on the
cognitive strategies of calculating or computing mathematical problems and understanding at a conceptual level the mathematical principles underlying those problems (A. Thompson & Thompson, 1996).

For example, if a teacher takes a computational approach to teaching students to find the slope of a line passing through two points in a plane, the lesson might involve simply memorizing the formula for slope: $M = (y_2 - y_1)/(x_2 - x_1)$. With this approach, students are not required to demonstrate any understanding of the relationship in the ordered pairs that compose the line. Here the focus is on memorizing the rule and applying it blindly with faith that the result is actually the slope; the main challenge is for students to remember whether it is x or y that “goes on top.” A conceptual approach would have students engaged in a process of exploring the relationship between ordered pairs in, say, a T-chart or looking at the graph of the line and determining the change in ordered pairs on the line. In exploring the line and its points, the goal is for students to come to understand that the slope is actually a description of the “tilt” either increasing or decreasing. With this approach, it makes sense to take the relationship of the difference of the y coordinates over the difference of the x coordinates, which is the formula for slope. Students can still find the slope if they forget the formula, because they understand at a conceptual level what is meant by slope.

We chose to focus on teachers’ cognitive strategies in mathematics on the basis of research showing that these strategies are predictive of student achievement (Fennema et al., 1996; Gamoran, Porter, Smithson, & White, 1997; Pellegrino, Baxter, & Glaser, 1999; Porter et al., 1993; Putnam & Borko, 1997). In addition, use of such strategies has been at the forefront of the U.S. debate regarding the extent to which mathematics teaching in this country is competitive with that in other countries (Loveless, 2001; Porter & Gamoran, 2002; Schmidt et al., 1997). We used the literature on computational and conceptual teaching in mathematics to develop measures of teachers’ cognitive strategies (see Carpenter, Fennema, & Franke, 1996; Hiebert et al., 1996; Ma, 1999; Romberg, 2000; Schoenfeld, 1985; Silver, 1985).

A question included in TIMSS-99 asked about frequency of use of computation as a method of teaching mathematics: “How often do you ask students to practice computational skills?” Response options were never (1), some lessons (2), most lessons (3), and every lesson (4). Four items focused on conceptual teaching: (a) “How often do you ask students to explain reasoning behind an idea?” (b) “How often do you ask students to use tables, charts, or graphs?” (c) “How often do you ask students to work on problems with no obvious method of solution?” and (d) “How often do you ask students to write equations to represent relationships?” Response options were the same as those for computation.

Exploratory factor analyses of the conceptual teaching items were conducted with the combined sample of 38 TIMSS countries. Results showed that there was only one factor with an eigenvalue greater than 1 (the eigenvalue for this factor was 1.70) and that all items had factor
loadings exceeding .40 (factor loadings for the four items, in the order shown in the preceding paragraph, were .44, .63, .46, and .42), indicating a high degree of consistency among items (see B. Thompson, 2004). Conceptual instruction index reliabilities (Cronbach alpha coefficients) were .82 for the U.S.-only sample and .79 for the combined sample of 38 TIMSS-99 countries. In the case of several other countries, including Japan, the alpha coefficient was lower, but not by more than .1. In addition, we used the mean conceptual score across all four items in the composite and the mean computation score to calculate a variable representing the ratio of conceptual to computational instruction.

Because achievement and socioeconomic status (SES) are highly correlated (Bryk & Driscoll, 1988), we controlled for level and distribution of student SES and for class average achievement. We used number of books in students’ homes as a proxy for SES. Data on the TIMSS-99 question regarding parental education were not available for several countries (e.g., Japan), while information on the “books in the home” question was available for all participating countries and has been used in other studies as a proxy for SES (e.g., Baker, Goesling, & LeTendre, 2002). In the case of the classroom average mathematics achievement measure, we aggregated the first plausible values of students’ mathematics item response theory (IRT) scores from the student level to the teacher level.2 We used country-level aggregations for the descriptive correlation analyses. The mathematics assessment covered five content areas: (a) fractions and number sense, (b) measurement and data representation, (c) analysis and probability, (d) geometry, and (e) algebra.
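
As an illustration of the measures described above, the following minimal sketch computes a four-item conceptual index, its Cronbach alpha, and the ratio of conceptual to computational instruction. The responses are invented for illustration and are not TIMSS-99 data; the function names are hypothetical, and the use of sample variances in the alpha formula is an assumption, since the article does not report computational details.

```python
from statistics import mean, variance

# Hypothetical responses for five teachers (NOT TIMSS-99 data), coded
# 1 = never, 2 = some lessons, 3 = most lessons, 4 = every lesson.
# Each entry: ((explain reasoning, use tables/charts/graphs,
#               problems with no obvious method, write equations),
#              practice computational skills)
RESPONSES = [
    ((2, 2, 1, 2), 3),
    ((3, 3, 2, 3), 3),
    ((4, 3, 3, 4), 2),
    ((2, 1, 2, 2), 4),
    ((3, 2, 2, 3), 2),
]


def cronbach_alpha(item_rows):
    """Cronbach alpha for a list of per-teacher item tuples.

    Assumption: sample (n - 1) variances are used for items and totals.
    """
    k = len(item_rows[0])
    item_variances = [variance(column) for column in zip(*item_rows)]
    total_variance = variance([sum(row) for row in item_rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def conceptual_index(items):
    """Mean of the four conceptual items for one teacher."""
    return mean(items)


def conceptual_to_computational_ratio(items, computation):
    """Ratio of the conceptual index to the single computation item."""
    return conceptual_index(items) / computation


item_rows = [items for items, _ in RESPONSES]
alpha = cronbach_alpha(item_rows)
ratios = [conceptual_to_computational_ratio(items, comp)
          for items, comp in RESPONSES]

print(f"Cronbach alpha: {alpha:.3f}")
print("Conceptual/computational ratios:", [round(r, 3) for r in ratios])
```

A ratio above 1 in this sketch means a teacher reports the conceptual practices, on average, more often than computational practice; a ratio below 1 means the reverse.
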
In addition to the classroom average mathematics achievement measure, we used the standard deviation of classroom achievement to capture within-class achievement diversity, often considered a barrier to use of conceptual teaching strategies.

Class size was measured according to reports of the numbers of students enrolled in sampled classes.

We used two proxy measures of teacher content knowledge. First, we determined whether the teacher had a bachelor’s or master’s degree in mathematics or mathematics education. Possession of a degree in the field being taught is often used as an approximation of teachers’ content knowledge (e.g., Darling-Hammond, 2000; Goldhaber & Brewer, 1997, 2000; Monk & King, 1994). We included degrees in both mathematics and mathematics education because previous research has not differentiated the effects of these two types of degrees (Desimone, Smith, & Ueno, in press) and because, across the TIMSS country data, there are no consistent distinctions between mathematics and mathematics education degrees.

Our second measure of teacher knowledge was number of years of teaching experience. In addition to formal subject-specific training, as represented through possession of a degree in mathematics or mathematics education, teachers develop and extend their knowledge of content and pedagogical content (Shulman, 1987) via teaching in the classroom, talking with other teachers, developing lesson plans, observing their fellow teachers,
and engaging in other curriculum- and instruction-related activities that are part of the daily life of teachers (Cohen et al., 1993). Accordingly, research on teacher qualifications and credentials has revealed a link between increased student achievement and more experienced teachers (Darling-Hammond, 2000; Murnane & Phillips, 1981; Rivkin et al., 1998). To capture teachers’ content knowledge and pedagogical content knowledge above and beyond formal training, we included a measure of years of teaching experience.

Analysis

TIMSS-99 included national probability samples of students enrolled in eighth-grade mathematics classes but did not include national probability samples of eighth-grade mathematics teachers. Because the sampling procedures involved selection of students in intact classrooms, it was possible to calculate the probability that a mathematics teacher in a particular eighth-grade class would be sampled. We used the teacher weights3 (the numbers representing the inverse probabilities of teachers being selected into the sample) from the TIMSS-99 data file to adjust for differential selection probabilities of different classrooms within countries.

In our descriptive analyses, we examined country-level means and standard deviations for the variables that we hypothesized to be associated with achievement in all 38 TIMSS-99 countries (as described earlier): cognitive teaching strategies, number of books in the home, standard deviation of books in the home, class average math achievement, class standard deviation of math achievement, and teacher’s experience and degree. We calculated the within-nation coefficient of variation4 (standard deviation divided by mean, multiplied by 100) in use of computational and conceptual instruction.
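
The coefficient of variation defined above (standard deviation divided by mean, multiplied by 100) can be sketched as follows. This is only an illustration under stated assumptions: the scores and weights are invented, the function name is hypothetical, and the weighted variance uses a simple weight-normalized formula, which may differ from the exact weighting procedure applied to the TIMSS-99 teacher weights.

```python
from math import sqrt


def weighted_cv(values, weights):
    """Coefficient of variation (in percent) with sampling weights.

    Assumption: the variance is the weight-normalized average of squared
    deviations from the weighted mean (no small-sample correction).
    """
    total_weight = sum(weights)
    w_mean = sum(w * v for v, w in zip(values, weights)) / total_weight
    w_var = sum(w * (v - w_mean) ** 2
                for v, w in zip(values, weights)) / total_weight
    return 100 * sqrt(w_var) / w_mean


# Hypothetical teacher-level scores within one country (NOT TIMSS-99 data).
computation_scores = [3, 3, 2, 4, 2]            # single item, scale 1 to 4
conceptual_scores = [1.75, 2.75, 3.5, 1.75, 2.5]  # four-item mean, scale 1 to 4
teacher_weights = [1.2, 0.8, 1.0, 1.5, 0.5]     # hypothetical inverse probabilities

print(f"CV, computational: {weighted_cv(computation_scores, teacher_weights):.1f}")
print(f"CV, conceptual:    {weighted_cv(conceptual_scores, teacher_weights):.1f}")
```

Because the standard deviation is expressed as a percentage of the mean, rescaling all scores by a constant leaves the coefficient unchanged, which is the property that makes the two kinds of instruction comparable.
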
We used this coefficient to ensure that we could compare variations in computational and conceptual instruction; that is, standardizing by taking the standard deviation as a percentage of the mean allowed us to compare the two types of instruction, which are measured on different scales.

We used the coefficient of variation and mean comparisons to address Assumption 1, namely, to examine whether there was more within-country variation in instruction in the United States than in other countries. To address Assumption 2, regarding trade-offs between computation and conceptual instruction, we examined correlations within and between countries.

We used two types of analyses to address Assumptions 3, 4, and 5. First, we created a two-level model predicting teaching strategies on the basis of class-level student achievement, class size, and teacher qualifications. Second, we compared within-country correlations between teaching strategies and achievement, class size, and teacher qualifications for the United States, Japan, and Singapore. In addition, to further examine Assumption 3, we regressed teaching strategies on class average achievement and compared regression lines across the TIMSS countries.

Specifically, we applied a multilevel analytic strategy in order to examine the relationships between class- and country-level variables and teachers’
emphasis on computational and conceptual teaching strategies. We focused primarily on assessing the extent to which the class- and teacher-level variables described earlier explained within-country variation in teachers’ emphasis on conceptual teaching. A common technique for analyzing hierarchical data (in this case, teachers’ classrooms nested within countries) is hierarchical linear modeling (HLM), which allows simultaneous consideration of factors from two or three levels of analysis (Raudenbush & Bryk, 1986, 2002). We aggregated student-level information to the class level because our dependent variable was teachers’ instruction; we could not use the student as the level of analysis, given that there would be no within-class variation in the dependent variable for student-level data to explain (i.e., because the instructional variables would be the same for every student in the same class). Our analyses can be represented via two equations. The Level 1 equation is as follows:

$$
\begin{aligned}
(\text{Level of Conceptual Teaching})_{ij} = {} & \beta_{0j} + \beta_{1j}(\text{Class Size})_{ij} \\
& + \beta_{2j}(\text{Class-Level Standard Deviation of Number of Books in the Home})_{ij} \\
& + \beta_{3j}(\text{Class-Level Average Number of Books in the Home})_{ij} \\
& + \beta_{4j}(\text{Class-Level Average Math Achievement})_{ij} \\
& + \beta_{5j}(\text{Class-Level Standard Deviation of Math Achievement})_{ij} \\
& + \beta_{6j}(\text{Teacher's Experience})_{ij} \\
& + \beta_{7j}(\text{Teacher Has Mathematics Degree})_{ij} + r_{ij}.
\end{aligned}
\tag{1}
$$

At Level 2, each effect $k$ was modeled as random such that

$$
\beta_{kj} = \gamma_{k0} + u_{kj}, \tag{2}
$$

where $ij$ indexes teacher $i$ in country $j$ and $\beta_{kj}$ represents the coefficient for the class- and teacher-level predictor $k$. The term $r_{ij}$ is a measure of random error that includes unmeasured sources of variation in a particular teacher’s emphasis on conceptual teaching. Each teacher-level variable was centered around its grand mean for the sample, allowing the intercept, $\beta_{0j}$, to be interpreted as the adjusted mean of the conceptual teaching emphasis for each country (Raudenbush & Bryk, 2002). We conducted our analyses using the random coefficients model; that is, we allowed each of the relationships (slopes) estimated in each of the six models (Models 1 and 2 for each of our three dependent variables, as reported in Table 3) to vary randomly across countries. Thus, the fixed effects reported here were for models estimated with randomly varying slopes (using the HLM software [Raudenbush, Bryk, & Congdon, 2004]). We chose a random coefficients model because we had no specific a priori assumptions that the relationships between the independent variables and type of instruction would be the same in all countries; we suspected that there might be cross-national patterns, however, and we wanted to assess these relationships empirically (rather than assuming them a priori). We did test a random intercepts model, which is a more parsimonious model, but we found that in no case did it provide a better fit than the random coefficients model (see Appendix).
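Substituting Equation 2 into Equation 1 gives the combined form that a random coefficients specification implies. The sketch below is offered only as an illustration: $Y_{ij}$ (the conceptual teaching score) and $X_{kij}$ (the $k$th grand-mean-centered Level 1 predictor) are shorthand symbols assumed here, not notation used in the original equations.

```latex
\begin{aligned}
Y_{ij} &= \beta_{0j} + \sum_{k=1}^{7} \beta_{kj} X_{kij} + r_{ij} \\
       &= (\gamma_{00} + u_{0j}) + \sum_{k=1}^{7} (\gamma_{k0} + u_{kj}) X_{kij} + r_{ij} \\
       &= \underbrace{\gamma_{00} + \sum_{k=1}^{7} \gamma_{k0} X_{kij}}_{\text{fixed part}}
          + \underbrace{u_{0j} + \sum_{k=1}^{7} u_{kj} X_{kij} + r_{ij}}_{\text{random part}}
\end{aligned}
```

Read this way, each $\gamma_{k0}$ is the cross-national average relationship, and each $u_{kj}$ is country $j$'s deviation from that average.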


We added the Level 1 variables to Equation 1 in two substantive blocks, initially examining the relationships between conceptual teaching and class SES and class achievement (Model 1) and then adding teacher experience and whether or not teachers had a degree in mathematics (Model 2). Coefficients for the variables were allowed to vary randomly across countries. For comparison purposes, we conducted follow-up analyses of the United States and two high-achieving countries, Japan and Singapore, to examine within-country patterns in the relationships among school, teacher, and class characteristics and teaching strategies. We compared United States results with those of Japan and Singapore in an attempt to provide a more specific comparative context with high-achieving countries whose models of teaching and learning are driving many current U.S. mathematics reforms (e.g., Chabbott & Elliott, 2004; Lewis, 1995; Stigler & Hiebert, 1999; see www.singaporemath.com). We used this comparative context to assess whether the five common assumptions about U.S. teaching delineated earlier were in evidence in other countries, especially those whose students were successful on the TIMSS-99 mathematics assessment.
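The two data-preparation steps described in the analytic strategy, aggregating student records to the class level and grand-mean centering the predictors, can be sketched as follows. This is a minimal illustration rather than the authors' procedure: the records, the names `aggregate_by_class` and `grand_mean_center`, and the use of the sample standard deviation are all assumptions made for the example.

```python
from statistics import mean, stdev

# Hypothetical student records: (class_id, mathematics score).
students = [
    ("c1", 480.0), ("c1", 520.0), ("c1", 500.0),
    ("c2", 610.0), ("c2", 590.0),
]


def aggregate_by_class(records):
    """Collapse student scores into a class-level mean and standard deviation."""
    by_class = {}
    for class_id, score in records:
        by_class.setdefault(class_id, []).append(score)
    # Sample SD is an assumption; the article does not say which estimator was used.
    return {
        class_id: {"mean": mean(scores), "sd": stdev(scores)}
        for class_id, scores in by_class.items()
    }


def grand_mean_center(values):
    """Subtract the overall sample mean from every value."""
    grand_mean = mean(values)
    return [value - grand_mean for value in values]


class_stats = aggregate_by_class(students)
class_means = [class_stats[c]["mean"] for c in sorted(class_stats)]
centered_means = grand_mean_center(class_means)
```

With centered predictors, the intercept of a regression is the expected outcome for a class at the sample average on every predictor, which is the interpretation given to the intercept in the text.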

Results

As some mathematics reformers claim, there may be too much use of computational strategies and too little use of conceptual strategies in U.S. mathematics classrooms (Carpenter et al., 1996; Hiebert, 1999; Romberg, 1988; J. Smith, Lee, & Newmann, 2001; Spillane & Zeuli, 1999). If so, however, the United States is not by any means unique. Overall, U.S. teachers do use computational strategies in most of their lessons, but not in exceptionally large amounts in comparative terms. The U.S. average score of 3.0 (i.e., use of computation in “most lessons”) was equal to the international mean among the 38 TIMSS-99 countries. Furthermore, computational teaching is still widely practiced in many other nations. As can be seen in Figure 1, in virtually every TIMSS-99 country, teachers use computation somewhere in the range spanning “some lessons” to “every lesson.” (Average values across TIMSS countries ranged from 1.9 to 3.6 on the 4-point scale.) Figure 1 also shows plus/minus two standard deviations around mean values.

Assumption 1 is that the individualism and autonomy of U.S. teachers lead to variation in the use of conceptual and computational strategies, which makes implementing change difficult. If this is true, we would expect that variations in the use of cognitive strategies in mathematics classrooms would be larger in the United States than in other countries. Although there were a few national outliers, within-country rates of variation in teaching computation ranged from 20% to 32%, with an international average of 28%. The U.S. coefficient of variation for computation was 31%, very close to the international average. Thus, variations in use of computational strategies were not significantly different in the United States than in other countries.
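The coefficient of variation used to compare within-country variation (the standard deviation expressed as a percentage of the mean) can be sketched as below. The teacher scores are invented for illustration, and the choice of the sample standard deviation is an assumption; the article does not state which estimator was used.

```python
from statistics import mean, stdev


def coefficient_of_variation(scores):
    """Return the standard deviation as a percentage of the mean."""
    average = mean(scores)
    if average == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return 100.0 * stdev(scores) / average


# Hypothetical teacher scores on the 4-point frequency scale for two countries.
country_a = [2.0, 3.0, 3.0, 4.0]
country_b = [1.0, 2.0, 3.0, 4.0]

cv_a = coefficient_of_variation(country_a)
cv_b = coefficient_of_variation(country_b)
```

Because the statistic is unit-free, values computed for computational and conceptual scores can be compared even though the two measures have different means.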


Figure 1. Average computational teaching scores (and 95% confidence intervals), by country. [Plot not reproduced: mean scores for the 38 TIMSS-99 countries, shown against an axis running from 1.5 to 4.0 on the 4-point scale.]

Results indicated that, cross-nationally, teachers spent less time using conceptual teaching methods than computational methods. Figure 2 shows that teachers generally used conceptual strategies in the range of “some” to “most” lessons. The international average and the United States average were the same, 2.4, almost halfway between “some lessons” and “most lessons.”


Figure 2. Average conceptual teaching scores (and 95% confidence intervals), by country. [Plot not reproduced: mean scores for the 38 TIMSS-99 countries, shown against an axis running from 1.5 to 4.0 on the 4-point scale.]

As with computation, the amount of variation between teachers in terms of conceptual teaching was quite consistent across countries (results not shown). For example, coefficients of variation for “explain reasoning behind an idea” ranged from 17% to 34%; the range for “work on problems with no obvious method of solution” was 21% to 43%, with an international average of 34%. The average U.S. and international coefficients of variation in regard to conceptual teaching were the same: 19%. Because the United States did


not have a notably higher coefficient of variation, the first assumption, that U.S. teachers vary more in their instruction than teachers in other countries, was not supported.

To assess the second assumption, concerning whether there is a trade-off between computational and conceptual strategies, we examined the extent to which the two strategies were correlated both within and between countries. If this assumption is true, we would expect to find negative associations between computational and conceptual strategies, particularly in the case of the United States. In only one country, Belgium, did teachers seem to exhibit a trade-off of strategies. In the other countries, there was either no significant correlation between use of conceptual and computational instruction (16 countries) or a positive significant correlation (21 countries). Correlations between computational and conceptual strategies were nonsignificant for Singapore (r = .126, p = .135) and positive for Japan (r = .257, p = .002) and the United States (r = .138, p = .013). These data provide support for the idea that an increase in one of the two types of instruction does not necessarily imply a decrease in the other.

Assumption 3 suggests that U.S. teachers mostly favor high-performing students with conceptual strategies and relegate lower-performing students to computational strategies. To examine this assumption, we initially regressed teaching strategies on class average achievement and compared regression lines across countries. We then compared within-country correlations between teaching strategies and class average achievement for the United States, Japan, and Singapore.
Finally, with the HLM framework, we predicted use of teaching strategies based on class-level average achievement and within-class variation in achievement (i.e., the class-level standard deviation) while controlling for level and variation in class SES and teacher characteristics.

We found that the regression lines were similar across countries. For example, Figure 3 shows that, with the exception of a few outliers, there were similar (and somewhat flat) relationships between use of conceptual teaching and class average mathematics score. These results support the idea that the relationship between conceptual teaching and class average achievement is similar across countries.

Figure 3. Regressions (ordinary least squares) of teachers’ use of conceptual teaching strategies on class average math scores (regression lines for each of the 38 countries taking part in the Third International Mathematics and Science Study, 1999).

Next, we examined whether there was a higher correlation between cognitive strategies and student achievement in the United States than in other countries, which we would expect if Assumption 3 were true. Here we treated class average achievement as an indicator of the mean level of student achievement over the school year. That is, we did not treat mean student achievement as an outcome. Rather, we used it to examine whether teachers used different cognitive strategies with students of different achievement levels, as suggested in Assumption 3.⁵ We examined, at the country level, correlations between class average mathematics score and (a) eighth-grade teachers’ use of computation, (b) their use of conceptual teaching, and (c) the ratio of conceptual to computational teaching. We limited this analysis to the United States and the two high-performing nations of Japan and Singapore. As shown in Table 1, Japan had no significant correlations of class average math score with our three measures of instruction, and Singapore had only one: Conceptual teaching was correlated with mathematics achievement (β = .274, p = .05). This indicates that a 67-point increase in class average mathematics score translates into a 0.11 increase in conceptual teaching (e.g., moving from a rating of 3 to a rating of 3.11 on the 4-point scale).⁶ In the United States, math achievement was positively correlated (β = .360, p = .000) with conceptual teaching, negatively correlated with computation (β = −.159, p = .10), and positively correlated with ratio of conceptual to computational teaching (β = .384, p = .000). These coefficients indicate that a 67-point increase in mathematics average achievement is associated with a 0.17 increase in conceptual teaching, a 0.15 decrease in computation, and a 19.9 increase in ratio of conceptual teaching to computation.⁷ Thus, the United States distinguished itself by being the only one of the three countries to use significantly less computation and significantly more conceptual strategies for high achievers, although the strength of even these relationships was weak. These results indicate that, in the high-achieving countries of Singapore and Japan, teachers do not use fewer conceptual techniques with their lower-achieving students, but that U.S. teachers tend to use conceptual teaching less often with weaker students.

Table 1
Relationships of SES, Class, and Teacher Variables to Teaching: Country-Level Regression Results for the United States, Japan, and Singapore

Predictor             Dependent variable                             United States       Japan               Singapore
Achievement average   Computation                                    −.159 (p = .105)    —                   —
Achievement average   Conceptual teaching                            .360 (p = .000)     .184 (p = .088)     .274 (p = .050)
Achievement average   Ratio of conceptual teaching to computation    .384 (p = .000)     —                   —
Class size            Conceptual teaching                            .129 (p = .034)     —                   —
Class size            Ratio of conceptual teaching to computation    .124 (p = .039)     —                   —
Teacher experience    Computation                                    −.204 (p = .001)    —                   —
Teacher experience    Conceptual teaching                            —                   —                   —
Teacher experience    Ratio of conceptual teaching to computation    .142 (p = .015)     —                   —
Mathematics degree    Computation                                    —                   .219 (p = .019)     —
Mathematics degree    Conceptual teaching                            —                   —                   —
Mathematics degree    Ratio of conceptual teaching to computation    —                   −.256 (p = .007)    —

[Cell positions for the remaining entries (SES average, SES SD, and Achievement SD for each dependent variable, and Class size for computation) are not recoverable from the extracted layout. Those ten cells hold five U.S. coefficients: −.128 (p = .045), .139 (p = .030), .223 (p = .000), .131 (p = .038), and −.147 (p = .021). Every other entry in them, including all Japan and Singapore entries, is empty (—).]

Note. Only statistically significant coefficients (p < .15) are reported. SES = socioeconomic status.

We now move to the multilevel analysis, which examined the relationship between teaching strategies and class student achievement while controlling for average SES and within-class diversity of SES (i.e., standard deviation of SES). Here we found modest support for the idea that U.S. teaching strategies are more closely tied to the achievement levels of students than is the case in other countries. Table 2 presents descriptive statistics for all of the variables included in the HLM analysis. Table 3 (Model 1) shows the HLM results, and Table 4 shows the corresponding variance components. The large sample size (6,171 teachers in 38 countries) increased the chance that findings might be statistically significant but not substantively meaningful. In addition, the range of dependent variables was restricted (i.e., involving a 4-point scale). Thus, to help interpret the coefficients, we calculated the effect of a standard deviation change in independent variables on the dependent variables and examined the degree to which the strength of these “slopes” varied across countries. On average, across countries, the greater the class average achievement, the more use of conceptual teaching (β = .001, p = .000) and the greater the ratio of conceptual teaching to computation (β = .033, p = .10).
To calculate the amount of change in the dependent variables associated with these independent variables, we multiplied the coefficient (see Table 3) by the standard deviation of its respective variable (see Table 2). These calculations showed that a one-standard-deviation increase in class average achievement was associated with a 0.08 increase on the 4-point conceptual teaching scale (i.e., .001 × 88.81 = 0.08) and a 2.93 increase in the ratio of conceptual to computational teaching. Using the between-country variance estimate for the relationship between class average achievement and ratio of conceptual to computational teaching, we were able to examine the variability in the strength of this relationship across countries. Thirty-six of the 38 countries had class average achievement/ratio of conceptual to computational teaching slopes in the −17 to 0.239 range (95% plausible value range).⁸ This translates to between −5% and 38% of a standard deviation change in the ratio of conceptual to computational teaching (standard deviations are shown in Table 2), indicating that the strength of the relationship within countries ranged from weakly negative to moderately positive. That is, in some countries teachers appear to trade off computation for conceptual teaching when class average achievement rises, while in other countries there is no apparent trade-off or the trade-off is in the opposite direction.

Table 2
Descriptive Statistics for Variables Used in the Hierarchical Linear Modeling Analysis

Variable                                                             Minimum    Maximum    M         SD
Index of conceptual teaching (dependent variable)                    1          4          2.41      0.45
Computation (dependent variable)                                     1          4          2.99      0.84
Ratio of conceptual to computational teaching (dependent variable)   .25        1.00       .89       .40
Number of books in home
  Class-level SD                                                     5          170.31     89.91     25.10
  Class average                                                      5.00       300.00     107.19    56.72
Class average achievement                                            181.73     794.50     490.36    88.81
Average math achievement (SD)                                        18.30      151.05     65.23     17.21
Class size                                                           1          95         28.80     11.46
Teacher experience                                                   0          52         16.46     10.04
Teacher has math degree                                              0          1          0.81      0.39

Note. The sample was composed of 38 countries and 6,171 observations.

Table 3
Predicting Computation and Conceptual Teaching Based on Class-Level SES, Class Size, Average Class-Level Mathematics Achievement, and Teacher Characteristics: 2-Level HLM for 38 TIMSS-99 Countries

                                    Computation          Conceptual           Ratio of Conceptual to Computation
Model                               Coefficient   p      Coefficient   p      Coefficient   p
Model 1: class level SES, math achievement, class size
  Intercept                         3.03          .000   2.41          .000   87.31         .000
  Number of books in home
    Class-level SD                  .001          .018   −.000         .037   −.082         .008
    Class average                   −.001         .021   .000          .459   .034          .117
  Average math achievement          −.001         .670   .001          .000   .033          .109
  Average math achievement, SD      .001          .306   .001          .079   −.013         .789
  Class size                        .000          .469   .002          .037   .086          .062
Model 2: class-level SES, class size, class-level math
  Intercept                         2.9           .000   2.36          .000   88.05         .000
  Number of books in home
    Class-level SD                  .001          .016   −.008         .024   −.090         .005
    Class average                   −.001         .015   .001          .428   .038          .081
  Average math achievement, SD      .001          .255   .000          .247   −.010         .821
  Class size                        .000          .516   .002          .022   .087          .066
  Teacher years of experience       −.001         .461   .000          .360   −.009         .899
  Teacher mathematics degree        .059          .081   .053          .010   .756          .055

Note. SES = socioeconomic status; HLM = hierarchical linear model; TIMSS-99 = Third International Mathematics and Science Study, 1999.

Table 1, which presents country-specific regression results for the United States, Singapore, and Japan, shows that in each country teachers’ use of conceptual teaching strategies was positively associated with class average achievement, while only U.S. teachers appeared to be trading off computational strategies for conceptual ones as class average achievement increased. Thus, the findings relating to Assumption 3 were mixed. Cross-nationally, teachers tend to use more conceptual teaching strategies as class-level achievement increases. This is true in the United States as well as in Japan and Singapore.
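The two interpretive calculations described for the multilevel results can be sketched as follows. The first multiplies a coefficient by the standard deviation of its predictor, using the values reported in Tables 2 and 3. The second assumes the usual random coefficients form of a 95% plausible value range, the fixed effect plus or minus 1.96 times the square root of the between-country slope variance. That formula is an assumption about how the range was computed, the variance value below is purely hypothetical (the variance components belong to Table 4, which is not reproduced here), and the function names are illustrative.

```python
import math


def change_per_sd(coefficient, predictor_sd):
    """Change in the outcome for a one-standard-deviation change in a predictor."""
    return coefficient * predictor_sd


def plausible_value_range(fixed_effect, slope_variance, z=1.96):
    """Range expected to contain about 95% of country-specific slopes (assumed form)."""
    half_width = z * math.sqrt(slope_variance)
    return fixed_effect - half_width, fixed_effect + half_width


# Ratio outcome: Table 3 coefficient for average achievement times the Table 2 SD.
ratio_change = change_per_sd(0.033, 88.81)

# Hypothetical between-country slope variance, chosen only for illustration.
hypothetical_variance = 0.0043
low, high = plausible_value_range(0.033, hypothetical_variance)
```

A range that spans zero, as in this illustrative case, indicates that the relationship is negative in some countries and positive in others, which is the pattern the text describes for the achievement/ratio slope.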
The degree to which teachers use fewer conceptual strategies and more computational ones as class-level achievement declines varies to a greater degree across countries. In the United States the ratio of use of conceptual strategies to computational ones declines as class average achievement declines, while in Singapore and Japan this trade-off is less apparent in the data.

Next, as a test of Assumption 4, we examined whether conceptual instruction was less common in large classes. As can be seen in Table 3, there were no cross-national patterns in the relationship between class size and computation, but there was a weak positive relationship between increases in class size and increases in conceptual teaching (β = .002, p = .03). Thus, a one-standard-deviation increase in class size was associated with a 0.02 increase in conceptual teaching. To help judge the variation in the size of this effect, we examined the plausible value range around these slopes. The between-country variation in the class size/conceptual teaching slope was small, with 95% between −0.005 and 0.009, translating to about 5% of a standard deviation in conceptual teaching. Thus, the effect did not vary much across countries. Similar to the small cross-national relationship between increased class size and increased use of conceptual teaching, results of our three-country


analysis (reported in Table 1) showed that in the United States the larger the class size, the more likely a teacher is to use conceptual instruction (β = .129, p = .03) and to use more conceptual teaching than computation (β = .124, p = .039). These effects were quite small, however, and such a relationship was not significant for Japan and Singapore. Thus, the idea that larger class sizes prevent use of conceptual teaching strategies is not supported by these data.

The results of our cross-national analysis weakly support Assumption 5, which suggests that teachers with strong content and pedagogical content knowledge of mathematics are more likely to use conceptual than computational strategies. The strength of this relationship varied across countries, however. As can be seen in Table 3 (Model 2), we estimated the relationships between use of computational and conceptual instructional strategies and teacher years of experience and whether a teacher had a degree in mathematics or mathematics education. In our cross-national analysis, we found that having a degree in mathematics or mathematics education was associated with a 0.053 increase in the use of conceptual teaching strategies (p = .01), a very weak effect. The 95% confidence interval around the slope of the variance estimate for this relationship ranged from −0.07 to 0.18, translating to between 6% and 15% of a standard deviation in teachers’ use of conceptual instruction. Teachers with and without degrees in math were no more or less likely to use computation or to use more conceptual than computational strategies (see Table 3). In Japan, teachers with a degree in mathematics or mathematics education were less likely to use more conceptual than computational strategies and more likely to use computational strategies; there was no relationship between degree and teaching in the United States or Singapore (see Table 1).
Cross-nationally, there were no significant relationships between teaching experience and the cognitive strategies used in instruction (i.e., computation or conceptual). In our three-country comparison, shown in Table 1, we found that the more years of experience accrued by a teacher in the United States, the less likely she or he is to use computation and the more likely she or he is to use more conceptual than computational strategies. Years of experience were not related to type of teaching in either Singapore or Japan.⁹

Discussion

Considerations in Interpretation

While international comparisons can provide important insights, there are a few caveats that should be considered in interpreting our results. First, countries differ in many ways, so relationships among variables must be considered within national contexts (Porter & Gamoran, 2002; Raudenbush & Kim, 2002). Although there is increasingly a global culture of teaching (LeTendre, Baker, Akiba, Goesling, & Wiseman, 2001), important cultural, structural, and organizational differences between countries remain, and they may drive differences in teaching and achievement


(Ashwill, 1999; Kinney, 1998; Lewis & Tsuchida, 1998; H. Stevenson & Stigler, 1992; Stigler & Hiebert, 1998, 1999). To address these issues, our models focused on predictors of within-country variations in teaching practice, and we have discussed how the strength of these relationships varies across countries. Second, we recognize that there are many challenges to transplanting effective teaching practices from one country to another (Bempechat, Jimenez, & Boulay, 2002; Bennett, 1987; Berliner & Biddle, 1995; Bracey, 1997; Schmidt, 1996); our results suggest alternative configurations of instruction that could be a useful model for the United States, but we acknowledge that country-specific structural, cultural, and social differences must be considered in any adaptation of cross-national practices (Baker & LeTendre, 2005; Porter & Gamoran, 2002). Third, an emphasis on cognitive strategies is a critical component of teaching, and this is the distinction on which our study focused; however, we acknowledge that there are other important dimensions of “quality” teaching (Stigler & Hiebert, 1997). Fourth, we used mathematics degree and years of teaching experience as proxies for teachers’ content and pedagogical content knowledge. Actual tests in which teachers are required to demonstrate their mathematical ability and their practical knowledge of how students learn mathematics might serve as stronger proxies; these types of tests are being developed (Ball, 2002; Rowan, Schilling, Ball, & Miller, 2001) but are not available for large-scale studies. Although degree and experience might be only rough proxies for the underlying teacher knowledge construct we sought to measure, there is considerable evidence that both subject-specific degrees and teaching experience are related to teachers’ knowledge and skills, both nationally and internationally (Hopkins & Stern, 1996; Lockheed & Longford, 1991; Mullens, Murnane, & Willett, 1996).
Fifth, our analyses relied on teachers’ self-reports of their own instruction. Most forms of data collection are subject to social desirability bias (Burstein et al., 1995), but responses on anonymous surveys such as TIMSS are less susceptible to such bias than more public data collection forums such as interviews or focus groups (Aquilino, 1994, 1998; Dillman & Tarnai, 1991; Fowler, 1995; see Desimone & LeFloch, 2004, for a discussion of the uses and quality of survey data). Furthermore, when survey questions ask teachers to account for their behaviors (as the TIMSS measures do) rather than evaluate or make quality judgments, the validity and reliability levels of teacher self-report data can be high (Mullens & Gayler, 1999; Mullens & Kasprzyk, 1996). Although teacher self-reports are not useful in measuring certain dimensions of teaching practice such as teacher-student interaction and teacher engagement, several studies have shown that teachers’ self-reports of their teaching in anonymous sample surveys are highly correlated with classroom observations and teacher logs (Mullens & Kasprzyk, 1996, 1999; Smithson & Porter, 1994). Furthermore, studies have shown that one-time


surveys in which teachers are asked about the content and strategies they emphasize are reasonably valid and reliable in measuring teachers’ instruction (Mullens, 1995; Mullens & Gayler, 1999; Schmidt et al., 1997; Shavelson, Webb, & Burstein, 1986; Smithson & Porter, 1994) and are effective in describing and distinguishing among different types of teaching practices (Mayer, 1999). Sixth, the TIMSS data we used were collected in 1999. Patterns of national and international teaching and learning may have changed in significant ways in the ensuing years. This should be kept in mind when interpreting our results and applying them to current conditions. Finally, cross-national comparisons face the challenge of shared meanings in regard to terms used to describe teaching (Stigler & Hiebert, 1997). While the reliability and validity of the TIMSS-99 measures used in this study are reasonably high, the potential inexactness of the survey measures, along with the other caveats mentioned here, should be considered in interpreting the present results.

To What Extent Does Our Analysis Provide Support for Our Assumptions About Teaching in the United States?

Our results do not support the commonly held assumption that the autonomous nature of teaching in the United States accounts for much of the variation in teaching (Assumption 1). In fact, we found that the rate of within-country variation in teaching in the United States is remarkably similar to that in other countries. Such similarities run counter to the argument that differences in culture, history, curriculum, and the way in which educational systems are organized in different countries would necessarily result in cross-national differences in regard to degree of homogeneity of instruction. Such consistency in variation in the use of teaching strategies across countries also challenges beliefs regarding the role of state control in education.
State control of the curriculum is usually assumed to translate into teachers teaching the same content and using the same methods. This line of reasoning would suggest that countries with strong national control, such as Singapore, would exhibit low within-country variation and that countries with high local control, such as the United States, would exhibit more teacher-to-teacher variation (Stevenson & Baker, 1991, 1996). However, we found no such differences when we analyzed gross teaching patterns without controlling for any school, teacher, or student factors. Our findings are consistent with those of other studies that have revealed similar levels of variation in teaching in foreign countries (Anderson, Ryan, & Shapiro, 1989; Fuller et al., 1994; LeTendre et al., 2001) and that suggest there has been a global pattern of convergence on schooling norms (Benavot, Cha, Kamens, Meyer, & Wong, 1991; LeTendre et al., 2001) resulting in a trend toward cross-national consistency in instruction. Thus, teacher autonomy may not be as significant a barrier to increased use of conceptual


teaching in the United States as conventional wisdom might suggest. Such a finding is good news for mathematics reform.

We also did not find support for Assumption 2, that teachers’ increased use of conceptual instruction requires a reduction in the use of computation. Our findings reveal an alternative paradigm wherein conceptual and computational strategies are integrated, especially in high-achieving countries such as Japan and Singapore. This alternative paradigm holds promise as a model showing that instruction can be organized in a way in which trade-offs are not drastic and conceptual and computational approaches can be used in tandem.

We found support for Assumption 3, according to which U.S. teachers use conceptual instruction more with high- than with low-performing students, and such a pattern is a dominant one internationally. It is notable, however, that this pattern was not found in the two high-performing countries of Japan and Singapore. Furthermore, instead of using computational approaches with the same frequency for all students, U.S. teachers tend to use more computational instruction with low-achieving students. In contrast, other countries do not use computational strategies more with low- than high-achieving students. Our findings may at least partly explain why poor students in the United States do worse than poor students in other countries. U.S. teachers’ instructional differentiation is usually explained by within-school and classroom diversity in terms of achievement and income levels (Baker et al., 2002; Barr & Dreeben, 1988; Gamoran & Mare, 1989; J. Lee, 1999; U.S. Department of Education, 2003); one of the major responses to such challenges in the United States has been to target different types of instructional strategies to different types of students (Tomlinson, 1999). Our findings imply that there are other successful models for teaching low-achieving students.
Specifically, our findings from Singapore and Japan suggest the possible benefits of using conceptual strategies as often with low- as with high-achieving students. The debate in the United States regarding the use of computational and conceptual approaches among low achievers can be informed by this “existence proof” from other countries.

Although class size (see Assumption 4) is often cited as a barrier to conceptual teaching (Bourke, 1986; M. Smith & Glass, 1980), we found it to be positively associated with use of conceptual teaching strategies in the United States and across most other TIMSS-99 countries. Class size was also positively associated with ratio of conceptual to procedural teaching in most countries (see Note 10). Furthermore, our results show that large classes and use of conceptual teaching are common in a number of high-performing countries, raising questions about the validity of Assumption 4. It has been suggested that, in Japan, larger classes are more conducive to discussion and group investigation because students are generally well behaved, motivated, and prepared for class, whereas in the United States a teacher typically encounters students with differing behaviors, motivation levels, and preparedness levels (J. Lee, 1999). These social and behavioral differences may influence how U.S. teachers implement conceptual approaches, but we found that class size does not appear to be a structural barrier to increased use of conceptual teaching in the United States.

Finally, we found partial support for Assumption 5, that qualified and experienced teachers use conceptual strategies more than they use computational strategies. In the United States, increased use of conceptual instruction is associated with more years of experience but not with possession of a mathematics degree. In Japan and Singapore, the opposite is true; increased use of conceptual instruction is associated with possession of a mathematics degree but not with teacher experience. Further examination of the interrelationship among teacher preparation, experience, and instruction is needed, with alternative measures of both instruction and teacher credentials. Although not conclusive, our findings suggest that U.S. teachers may learn more of their content knowledge through the teaching experience they accrue than through formal degree preparation. In contrast, in other countries, formal teacher preparation (i.e., obtaining a degree in mathematics or mathematics education) may play a stronger role in teachers’ later use of conceptual teaching. These findings are consistent with research criticizing U.S. teacher preparation, especially in mathematics, for not requiring teachers to develop a conceptual understanding of mathematics and how students learn mathematics (Darling-Hammond, Chung, & Frelow, 2002; Darling-Hammond & Sykes, 1999; Ma, 1999).

Conclusion

We found little evidence for the existence of commonly perceived barriers to the use of conceptual teaching in the United States, which is good news for the mathematics reform community. We also found that U.S. teachers use conceptual teaching strategies at about the international average. Furthermore, our results show little variation in use of conceptual teaching strategies, as measured in this analysis, between teachers with and without a degree in mathematics. Our findings could be interpreted to indicate that inservice professional development closes the content knowledge gap between teachers with and without such a degree. However, previous research indicates that most teachers do not participate in content-focused professional development (Garet, Porter, Desimone, Birman, & Yoon, 2001). A more likely interpretation is that teacher preparation programs could do more to prepare U.S. teachers to use conceptual strategies in the classroom.

In our three-country analysis, we found that U.S. teachers distinguish themselves from teachers in other countries mainly by their tendency to use computational strategies with low-achieving students. In contrast, international trends, and trends in high-achieving countries such as Japan and Singapore, reflect a more consistent and equitable distribution of computational and conceptual teaching strategies across students.

The scientific community has not yet articulated a sound and convincing logic regarding how to link findings from cross-national surveys to school improvement issues (Rowan, 2002). Thus, we are hesitant to draw conclusions in regard to such issues on the basis of the insights gained from our analyses. Instead, we intend our cross-national findings to suggest ideas and models for alternative paradigms for organizing instruction and to question certain assumptions about teaching in the United States. We hope our results will encourage continued examination of alternative models of teaching and learning.

Notes

This material is based on work supported by the National Science Foundation (Grants 0231884 and 9815112). The views expressed herein are those of the researchers. No official endorsement by the granting agency is intended or should be inferred. Thanks to Kay McClain for contributing her mathematics expertise.

1. For a counterargument, see Baker and LeTendre (2005).
2. Matrix sampling was used in TIMSS; thus, each student did not complete the same test form. We used IRT scores so that every student’s score could be reported in the same metric.
3. To calculate teacher weights (the inverse probability of a teacher being selected into the sample), we aggregated the mathematics teacher weights provided in the TIMSS data file by taking, in each case, the maximum within-teacher weight. We then aggregated the teacher-level file to the country level and calculated the sum of TIMSS-provided weights and number of teachers in the sample for each country. Next, we disaggregated the country-level variables to the teacher questionnaire file. From this, we calculated a new mathematics teacher weight by dividing the teacher weight by the sum of the country’s teacher weights and multiplying by the number of teachers in the country.
4. Actual coefficients of variation are not reported here but are available from the authors.
5. Since there is no fall achievement test, we were unable to measure change in instruction, which would have been an appropriate dependent variable. Although student achievement levels may change during the year, class average achievement is a reasonable proxy for the level of student skills and knowledge encountered by the teacher in the classroom.
6. To calculate the effect size, we associated a one-standard-deviation increase in class average mathematics achievement in Singapore (67.9) with a 0.274 increase in conceptual teaching: 0.274 × 0.45 (the standard deviation of conceptual teaching) = 0.11.
7. The calculations were as follows: .360 × .47 (standard deviation of conceptual teaching) = .17; .159 × .94 (standard deviation of computation) = .15; and .384 × 52 (standard deviation of ratio of conceptual teaching to computation) = 19.9.
8. Confidence intervals are calculated via the following formula: coefficient ± 1.96 × standard deviation of the slope; in this case, .033 ± (1.96 × .10545). The standard deviations of the slopes are not shown in the tables.
9. A maturation-instrument interaction (Rossi, Freeman, & Lipsey, 1999) is possible in that respondents may, over time, become more cognizant of their policy environment and respond to survey questions in a different way than newer teachers. However, since the TIMSS survey was cross sectional, and thus teachers completed the questionnaire only once, we do not believe that a maturation-instrument interaction explains the findings revealed for experienced teachers.
10. Predictions based on U.S. class size research would postulate that smaller classes allow teachers to move away from procedural/lecture techniques and move toward more effective strategies that involve discussion and problem solving (Bourke, 1986; Nye et al., 2002; M. Smith & Glass, 1980). From this perspective, our results seem counterintuitive. We should note, though, that most of the positive effects of class size documented in U.S. schools have been observed in the lower grades (i.e., K–2). Furthermore, the class size research indicates that there is a threshold number of students at which teachers are able to use different strategies, so an analysis of this threshold may produce different results than an analysis of incremental increases in the number of students per class.
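The arithmetic described in Notes 3, 7, and 8 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the row layout `(country, teacher_id, weight)`, the function names, and the example rows are hypothetical, and "number of teachers in the country" is read here as the number of sampled teachers, following the wording of Note 3.

```python
from collections import defaultdict


def rescale_teacher_weights(rows):
    """Rescale teacher weights so they sum to the number of sampled teachers per country.

    rows: iterable of (country, teacher_id, weight) tuples; a teacher may appear
    on several rows (hypothetical layout, not the TIMSS file format).
    """
    # One weight per teacher: the maximum weight found among that teacher's rows.
    teacher_weight = {}
    for country, teacher_id, weight in rows:
        key = (country, teacher_id)
        teacher_weight[key] = max(weight, teacher_weight.get(key, weight))

    # Country-level sum of weights and count of sampled teachers.
    weight_sum = defaultdict(float)
    teacher_count = defaultdict(int)
    for (country, _), weight in teacher_weight.items():
        weight_sum[country] += weight
        teacher_count[country] += 1

    # New weight = weight / country sum of weights * number of sampled teachers.
    return {
        key: weight / weight_sum[key[0]] * teacher_count[key[0]]
        for key, weight in teacher_weight.items()
    }


def scaled_effect(coefficient, standard_deviation):
    """Product of a coefficient and a standard deviation, as written in Note 7."""
    return coefficient * standard_deviation


def confidence_interval(coefficient, slope_sd, z=1.96):
    """Interval coefficient +/- z * slope_sd, as written in Note 8."""
    margin = z * slope_sd
    return coefficient - margin, coefficient + margin


# Made-up rows for illustration only; they are not TIMSS data.
example_rows = [("A", "t1", 10.0), ("A", "t1", 12.0), ("A", "t2", 4.0), ("B", "t3", 5.0)]
rescaled = rescale_teacher_weights(example_rows)

# Values taken from Notes 7 and 8.
note7_effect = scaled_effect(0.360, 0.47)
note8_interval = confidence_interval(0.033, 0.10545)
```

With this rescaling, the weights within each country sum to the number of sampled teachers, so each country contributes in proportion to its teacher sample size rather than its population size.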

Appendix

Comparison of Random Coefficients Model, Model With Variance Components p > .10, and Random Intercept Model

| Outcome | Model number | Statistic | Random coefficients model | Model with variance components p > .10 fixed | Random intercept model (no random slopes) |
|---|---|---|---|---|---|
| Computation | Model 1 | Deviance statistic | 13,272.795351 | 13,285.542639 | 13,323.707779 |
| | | Number of parameters estimated | 22 | 7 | 5 |
| | | Chi-square (df) for comparison | | 12.74729 (15) | 38.16514 (5) |
| | | Fits as well as more complex model? | | Yes (p > .500) | No (p = .000) |
| Computation | Model 2 | Deviance statistic | 13,261.300041 | 13,307.938024 | 13,337.737752 |
| | | Number of parameters estimated | 37 | 11 | 2 |
| | | Chi-square (df) for comparison | | 46.63798 (26) | 29.79973 (9) |
| | | Fits as well as more complex model? | | No (p = .008) | No (p = .001) |
| Conceptual instruction | Model 1 | Deviance statistic | 6,520.273988 | 6,525.535515 | 6,588.140058 |
| | | Number of parameters estimated | 22 | 11 | 2 |
| | | Chi-square (df) for comparison | | 5.26153 (11) | 62.60454 (9) |
| | | Fits as well as more complex model? | | Yes (p > .500) | No (p = .000) |
| Conceptual instruction | Model 2 | Deviance statistic | 6,489.666724 | 6,525.535515 | 6,588.140058 |
| | | Number of parameters estimated | 11 | 26 | 2 |
| | | Chi-square (df) for comparison | | 35.86879 (26) | 62.60454 (9) |
| | | Fits as well as more complex model? | | Yes (p = .094) | No (p = .000) |
| Ratio of conceptual to computational instruction | Model 1 | Deviance statistic | 57,266.466404 | 57,297.546061 | 57,454.14753 |
| | | Number of parameters estimated | 22 | 7 | 2 |
| | | Chi-square (df) for comparison | | 31.07966 (15) | 156.60147 (5) |
| | | Fits as well as more complex model? | | No (p = .009) | No (p = .009) |
| Ratio of conceptual to computational instruction | Model 2 | Deviance statistic | 57,240.048064 | 57,275.470256 | 57,455.377430 |
| | | Number of parameters estimated | 37 | 16 | 2 |
| | | Chi-square (df) for comparison | | 35.42219 (21) | 179.90717 (17) |
| | | Fits as well as more complex model? | | No (p = .025) | No (p = .025) |
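The chi-square values in the appendix equal the difference in deviance between each model and the next more complex one, with degrees of freedom equal to the difference in the number of estimated parameters. The sketch below reproduces that comparison for the computation outcome. It is a minimal illustration that assumes a standard deviance-difference test; the function names are hypothetical, and the chi-square tail probability is computed with a plain series expansion rather than with the software the authors used.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with df degrees of freedom."""
    if x <= 0:
        return 1.0
    a = df / 2.0
    half_x = x / 2.0
    # Series for the regularized lower incomplete gamma function P(a, x / 2):
    # P = exp(-h) * h**a / Gamma(a + 1) * sum_n h**n / ((a + 1) ... (a + n)).
    term = 1.0
    total = 1.0
    n = 0
    while term > 1e-16 * total:
        n += 1
        term *= half_x / (a + n)
        total += term
    log_prefactor = a * math.log(half_x) - half_x - math.lgamma(a + 1.0)
    lower = math.exp(log_prefactor) * total
    return min(1.0, max(0.0, 1.0 - lower))


def deviance_test(deviance_simple, params_simple, deviance_complex, params_complex):
    """Compare a simpler model with a more complex one via the deviance difference."""
    chi_square = deviance_simple - deviance_complex
    df = params_complex - params_simple
    return chi_square, df, chi2_sf(chi_square, df)


# Computation, Model 1: variance-components-fixed model vs. random coefficients model.
chi_1, df_1, p_1 = deviance_test(13285.542639, 7, 13272.795351, 22)

# Computation, Model 2: variance-components-fixed model vs. random coefficients model.
chi_2, df_2, p_2 = deviance_test(13307.938024, 11, 13261.300041, 37)

print(f"Model 1: chi-square = {chi_1:.5f}, df = {df_1}, p = {p_1:.3f}")
print(f"Model 2: chi-square = {chi_2:.5f}, df = {df_2}, p = {p_2:.3f}")
```

A large p value means the simpler model fits about as well as the more complex one, which is how the "Fits as well as more complex model?" rows can be read.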

References

Anderson, L., Ryan, D., & Shapiro, B. (1989). The IEA Classroom Environment Study. New York: Pergamon Press.
Aquilino, W. S. (1994). Interview mode effects in drug surveys. Public Opinion Quarterly, 58, 210–240.
Aquilino, W. S. (1998). Effects of interview mode on measuring depression in younger adults. Journal of Official Statistics, 14, 15–30.
Ashwill, M. (Ed.). (1999). The educational system in Germany: Case study findings. Washington, DC: U.S. Department of Education.
Baker, D., Goesling, B., & LeTendre, G. (2002). Socio-economic status, school quality, and national economic development: A cross-national analysis of the “Heyneman-Loxley effect” on mathematics and science achievement. Comparative Education Review, 46, 291–312.
Baker, D., & LeTendre, G. (2005). National differences, global similarities: World culture and the future of schooling. Palo Alto, CA: Stanford University Press.
Ball, D. L. (1996). Teacher learning and the mathematics reforms: What we think we know and what we need to learn. Phi Delta Kappan, 77, 500–508.
Ball, D. L. (2002). Knowing mathematics for teaching: Relations between research and practice. Mathematics and Education Reform Newsletter, 14(3), 1–5.
Baroody, A. J., & Benson, A. (2001). Early number instruction. Teaching Children Mathematics, 8, 154–158.
Barr, R., & Dreeben, R. (1988). The formation and instruction of ability groups. American Journal of Education, 97, 34–64.
Barr, R., Wiratchai, N., & Dreeben, R. (1983). How schools work. Chicago: University of Chicago Press.
Bempechat, J., Jimenez, N., & Boulay, B. (2002). Cultural-cognitive issues in academic achievement: New directions for cross-national research. In A. C. Porter & A. Gamoran (Eds.), Methodological advances in cross-national surveys of educational achievement (pp. 117–149). Washington, DC: National Academy Press.
Benavot, A., Cha, Y., Kamens, D., Meyer, J., & Wong, S. (1991). Knowledge for the masses: World models and national curricula, 1920–1986. American Sociological Review, 56, 85–100.
Bennett, W. (1987). Implications for American education. NASSP Bulletin, 71, 102–108.
Berliner, D., & Biddle, B. (1995). The manufactured crisis: Myths, fraud and the attack on America’s public schools. Reading, MA: Addison-Wesley.
Betts, J., & Shkolnik, J. L. (1999). The behavioral effects of variation in class size: The case of math teachers. Educational Evaluation and Policy Analysis, 21, 193–213.
Bourke, A. (1986). How smaller is better: Some relationships between class size, teaching practices, and student achievement. American Educational Research Journal, 23, 558–571.
Bracey, G. (1997). On comparing the incomparable: A response to Baker and Stedman. Educational Researcher, 26(3), 19–26.
Bransford, J., Brown, A., & Cocking, R. (Eds.). (1999). How people learn: Brain, mind, experience, and school. Washington, DC: National Academy Press.
Bryk, A. S., & Driscoll, M. E. (1988). The high school as community: Contextual influences and consequences for students and teachers. Madison, WI: National Center on Effective Secondary Schools.
Burrill, G. (2001). Mathematics education: The future and the past create a context for today’s issues. In T. Loveless (Ed.), The great curriculum debate: How should we teach reading and math? (pp. 25–41). Washington, DC: Brookings Institution.
Burstein, L., McDonnell, L. M., Van Winkle, J., Ormseth, T., Mirocha, J., & Guitton, G. (1995). Validating national curriculum indicators. Santa Monica, CA: RAND.
Carpenter, T., & Fennema, E. (1991). Research and cognitively guided instruction. In E. Fennema, T. Carpenter, & S. Lamon (Eds.), Integrating research on teaching and learning mathematics (pp. 1–17). Albany: State University of New York Press.
Carpenter, T. P., Fennema, E., & Franke, M. L. (1996). Cognitively guided instruction: A knowledge base for reform in primary mathematics instruction. Elementary School Journal, 97, 3–20.
Carpenter, T., Fennema, E., Peterson, P., Chiang, C., & Loef, M. (1989). Using knowledge of children’s mathematics thinking in classroom teaching: An experimental study. American Educational Research Journal, 26, 499–531.
Chabbott, C., & Elliott, E. (Eds.). (2004). Understanding others, educating ourselves: Getting more from international comparative studies in education. Washington, DC: National Academy Press.
Clark, C., & Peterson, P. (1986). Teachers’ thought processes. In M. C. Wittrock (Ed.), Handbook of research on teaching (3rd ed., pp. 255–296). New York: Macmillan.
Cohen, D. (1990). A revolution in one classroom: The case of Mrs. Oublier. Educational Evaluation and Policy Analysis, 12, 311–329.
Cohen, D., & Ball, D. (1990). Policy and practice: An overview. Educational Evaluation and Policy Analysis, 12, 347–353.
Cohen, D., McLaughlin, M., & Talbert, J. (Eds.). (1993). Teaching for understanding: Challenges for policy and practice. San Francisco: Jossey-Bass.
Cohen, D., & Spillane, J. (Eds.). (1992). Policy and practice: The relations between governance and instruction. Washington, DC: American Educational Research Association.
Cross, C. (1999). Standards and local control. American School Board Journal, 186(4), 54–55.
Cuban, L. (1993). How teachers taught: Constancy and change in American classrooms, 1890–1990 (2nd ed.). New York: Teachers College Press.
Darling-Hammond, L. (2000). Teacher quality and student achievement: A review of state policy evidence. Education Policy Analysis Archives, 8(1). Retrieved February 17, 2004, from http://olam.ed.asu.edu/epaa/v8n1/
Darling-Hammond, L., Chung, R., & Frelow, F. (2002). Variation in teacher preparation: How well do different pathways prepare teachers to teach? Journal of Teacher Education, 53, 286–302.
Darling-Hammond, L., & Sykes, G. (Eds.). (1999). Teaching as the learning profession: Handbook of policy and practice. San Francisco: Jossey-Bass.
Desimone, L., & LeFloch, K. (2004). Are we asking the right questions? Using cognitive interviews to improve surveys in education research. Educational Evaluation and Policy Analysis, 26, 1–22.
Desimone, L., Smith, T., & Ueno, K. (in press). Who gets sustained, content-focused professional development? An administrator’s dilemma. Educational Administration Quarterly.
Dillman, D. A., & Tarnai, J. (1991). Mode effects of cognitively designed recall questions: A comparison of answers to telephone and mail surveys. In P. N. Beimer, R. M. Groves, L. E. Lyberg, N. A. Mathiowetz, & S. Sudman (Eds.), Measurement errors in surveys (pp. 367–393). New York: Wiley.
Elmore, R., & Burney, D. (1996). Staff development and instructional improvement: Community District 2, New York City. Philadelphia: Consortium for Policy Research in Education.
Elmore, R., Peterson, P., & McCarthy, S. (1996). Restructuring in the classroom: Teaching, learning, and school organization. San Francisco: Jossey-Bass.
Ernest, P. (1998). The history of mathematics in the classroom. Mathematics in School, 27(4), 25–31.
Ernest, P. (1999). Forms of knowledge in mathematics and mathematics education: Philosophical and rhetorical perspectives. Educational Studies in Mathematics, 38, 67–83.
Federal Publications. (2003). Primary mathematics, United States edition. Singapore: Author.
Fennema, E., Carpenter, T. P., Franke, M. L., Levi, L., Jacobs, B., & Empson, V. (1996). A longitudinal study of learning to use children’s thinking in mathematics instruction. Journal for Research in Mathematics Education, 27, 403–434.
Fernandez, C., & Yoshida, M. (2004). Lesson study: A Japanese approach to improving mathematics teaching and learning. Mahwah, NJ: Erlbaum.
Finn, J., & Achilles, C. (1999). Tennessee’s class size study: Findings, implications, misconceptions. Educational Evaluation and Policy Analysis, 21, 97–109.
Fowler, F. J., Jr. (1995). Improving survey questions: Design and evaluation. Thousand Oaks, CA: Sage.
Fuller, B., Hua, H., & Snyder, W., Jr. (1994). When girls learn more than boys: The influence of time in school and pedagogy in Botswana. Comparative Education Review, 38, 347–376.
Fuller, B., & Rubinson, R. (Eds.). (1992). The political construction of education: The state, school expansion, and economic change. New York: Praeger.
Gamoran, A. (1986). Instructional and institutional effects of ability grouping. Sociology of Education, 59, 185–198.
Gamoran, A., & Mare, R. (1989). Secondary school tracking and educational inequality: Compensation, reinforcement or neutrality? American Journal of Sociology, 94, 1146–1183.
Gamoran, A., Porter, A., Smithson, J., & White, P. (1997). Upgrading high school mathematics instruction: Improving learning opportunities for low-achieving low-income youth. Educational Evaluation and Policy Analysis, 19, 325–338.
Garet, M., Porter, A., Desimone, L., Birman, B., & Yoon, K. (2001). What makes professional development effective? Analysis of a national sample of teachers. American Educational Research Journal, 38, 915–945.
Geary, D. (1994). Children’s mathematical development: Research and practical applications. Washington, DC: American Psychological Association.
Geary, D. C. (2001). A Darwinian perspective on mathematics and instruction. In T. Loveless (Ed.), The great curriculum debate: How should we teach reading and math? (pp. 85–107). Washington, DC: Brookings Institution.
Gerges, G. (2001). Factors influencing preservice teachers’ variation in use of instructional methods: Why is teacher efficacy not a significant contributor? Teacher Education Quarterly, 28(4), 71–88.
Goldhaber, D., & Brewer, D. J. (1997). Evaluating the effect of teacher degree level on educational performance. In W. Fowler (Ed.), Developments in school finance, 1996 (pp. 197–210). Washington, DC: National Center for Education Statistics.
Goldhaber, D. D., & Brewer, D. J. (2000). Does teacher certification matter? High school teacher certification status and student achievement. Educational Evaluation and Policy Analysis, 22, 129–145.
Gonzales, P., Calsyn, C., Jocelyn, L., Mak, K., Kastberg, D., Arafeh, S., et al. (2000). Pursuing excellence: Comparisons of international eighth-grade mathematics and science achievement from a United States perspective, 1995 and 1999. Washington, DC: National Center for Education Statistics.
Grant, S., Peterson, P., & Shojgreen-Downer, A. (1996). Learning to teach mathematics in the context of systemic reform. American Educational Research Journal, 33, 509–541.
Hess, R. D., & Azuma, H. (1991). Cultural support for schooling: Contrasts between Japan and the United States. Educational Researcher, 20(9), 2–8.
Hiebert, J. (1999). Relationships between research and the NCTM standards. Journal for Research in Mathematics Education, 30, 3–19.
Hiebert, J., Carpenter, T. P., Fennema, E., Fuson, K., Human, P., Murray, H., et al. (1996). Problem solving as a basis for reform in curriculum and instruction: The case of mathematics. Educational Researcher, 25(4), 12–21.
Hirsch, E. D., Jr. (2001). The roots of the education wars. In T. Loveless (Ed.), The great curriculum debate: How should we teach reading and math? (pp. 13–24). Washington, DC: Brookings Institution.
Hopkins, D., & Stern, D. (1996). Quality teachers, quality schools: International perspectives and policy implications. Teaching and Teacher Education, 12, 501–517.
Jackson, P. W. (1986). The practice of teaching. New York: Teachers College Press.
Jiang, Z. (1995). A brief comparison of the United States and Chinese middle school mathematics programs. School Science and Mathematics, 95, 187–194.
King, J. (1999). The impact of class size on instructional strategies and the use of time in high school mathematics and science courses. Educational Evaluation and Policy Analysis, 21, 215–229.
Kinney, C. (1998). Building an excellent teacher corps: How Japan does it. American Educator, 21(4), 16–23.
Knapp, M. (1995). Teaching for meaning in high-poverty classrooms. New York: Teachers College Press.
Knapp, M. (1997). Between systemic reforms and the mathematics and science classroom: The dynamics of innovation, implementation, and professional learning. Review of Educational Research, 67, 227–266.
Lee, J. (1999). Missing links in international education studies: Can we compare the U.S. with East Asian countries in the TIMSS? International Electronic Journal, 3(18). Retrieved February 17, 2004, from http://www.ucalgary.ca/~iejll
Lee, V. E., Bryk, A., & Smith, J. B. (1993). The organization of effective secondary schools. In L. Darling-Hammond (Ed.), Review of research in education (pp. 171–267). Washington, DC: American Educational Research Association.
LeTendre, G., Baker, D., Akiba, A., Goesling, B., & Wiseman, A. (2001). Teacher’s work: Institutional isomorphism and cultural variation in the U.S., Germany, and Japan. Educational Researcher, 30(6), 3–15.
Lewis, C. (1995). Educating hearts and minds: Reflections on Japanese preschool and elementary education. New York: Cambridge University Press.
Lewis, C., & Tsuchida, I. (1997). Planned educational change in Japan: The case of elementary science instruction. Journal of Educational Policy, 12, 313–331.
Lewis, C., & Tsuchida, I. (1998, Winter). A lesson is like a swiftly flowing river. American Educator, 12–17.
Li, S. (1999). Does practice make perfect? For the Learning of Mathematics, 19(3), 33–35.
Linn, M. C., Lewis, C., Tsuchida, I., & Songer, N. (2000). Beyond fourth-grade science: Why do United States and Japanese students diverge? Educational Researcher, 29(3), 4–14.
Lo Cicero, A., De La Cruz, Y., & Fuson, K. (1999). Teaching and learning creatively: Using children’s narratives. Teaching Children Mathematics, 5, 544–547.
Lockheed, M., & Longford, N. (1991). School effects on mathematics achievement gain in Thailand. In S. Raudenbush & D. Willms (Eds.), Schools, classrooms and pupils: International studies of schooling from a multilevel perspective (pp. 131–148). San Diego, CA: Academic Press.
Louis, K., & Marks, H. (1998). Does professional community affect the classroom? Teachers’ work and student experiences in restructuring schools. American Journal of Education, 106, 532–575.
Loveless, T. (Ed.). (2001). The great curriculum debate: How should we teach reading and math? Washington, DC: Brookings Institution.
Ma, L. (1999). Knowing and teaching elementary mathematics: Teachers’ understanding of fundamental mathematics in China and the United States. Mahwah, NJ: Erlbaum.
Mayer, D. (1999). Measuring instructional practice: Can policymakers trust survey data? Educational Evaluation and Policy Analysis, 21, 29–46.
McAninch, A. R. (1993). Teacher thinking and the case method: Theory and future directions. New York: Teachers College Press.
McKnight, C. C., Crosswhite, F. J., Dossey, J. A., Kifer, E., Swafford, J. O., Travers, K. J., et al. (1987). The underachieving curriculum: Assessing U.S. school mathematics from an international perspective. Champaign, IL: Stipes.
Meier, D. (1995). The power of their ideas: Lessons for America from a small school in Harlem. Boston: Beacon Press.
Monk, D. H., & King, J. R. (1994). Multi-level teacher resource effects on pupil performance in secondary mathematics and science: The role of teacher subject matter preparation. In R. Ehrenberg (Ed.), Contemporary policy issues: Choices and consequences in education (pp. 29–58). Ithaca, NY: ILR Press.
Mosteller, F. (1995). The Tennessee study of class size in the early school grades. Future of Children, 5, 113–127.
Mosteller, F., Light, R., & Sachs, J. (1996). Sustained inquiry in education: Lessons from skill grouping and class size. Harvard Educational Review, 66, 797–842.
Mullens, J. (1995). Classroom instructional processes: A review of existing measurement approaches and their applicability for the teacher follow-up survey. Washington, DC: National Center for Education Statistics.
Mullens, J. E., & Gayler, K. (1999). Measuring classroom instructional processes: Using survey and case study field test results to improve item construction. Washington, DC: National Center for Education Statistics.
Mullens, J., & Kasprzyk, D. (1996). Using qualitative methods to validate quantitative survey instruments. In Proceedings of the Section on Survey Research Methods (pp. 638–643). Alexandria, VA: American Statistical Association.
Mullens, J., & Kasprzyk, D. (1999). Validating item responses on self-report teacher surveys. Washington, DC: U.S. Department of Education.
Mullens, J., Murnane, J., & Willett, J. (1996). The contribution of training and subject matter knowledge to teaching effectiveness: A multilevel analysis of longitudinal evidence from Belize. Comparative Education Review, 40, 139–152.
Murnane, R., & Phillips, B. (1981). Learning by doing, vintage, and selection: Three pieces of the puzzle relating teaching experience and teaching performance. Economics of Education Review, 1, 453–465.
National Council of Teachers of Mathematics. (2000). Principles and standards for school mathematics. Reston, VA: Author.
Newmann, F., King, M., & Youngs, P. (2000). Professional development that addresses school capacity: Lessons from urban elementary schools. American Journal of Education, 108, 259–299.
No Child Left Behind Act, Pub. L. No. 107-110, 115 Stat. 1425 (2002).
Nye, B., Hedges, L., & Konstantopoulos, S. (2002). Do low-achieving students benefit more from small classes? Evidence from the Tennessee class size experiment. Educational Evaluation and Policy Analysis, 24, 201–217.
Pellegrino, J. W., Baxter, G. P., & Glaser, R. (1999). Addressing the “two disciplines” problem: Linking theories of cognition and learning with assessment and instructional practices. In A. Iran-Nejad & P. D. Pearson (Eds.), Review of research in education (Vol. 24, pp. 307–353). Washington, DC: American Educational Research Association.
Pong, S., & Pallas, A. (2001). Class size and eighth-grade math achievement in the United States and abroad. Educational Evaluation and Policy Analysis, 23, 251–273.
Porter, A. C. (1989). External standards and good teaching: The pros and cons of telling teachers what to do. Educational Evaluation and Policy Analysis, 11, 343–356.
Porter, A., & Gamoran, A. (2002). Methodological advances in cross-national surveys of educational achievement. Washington, DC: National Academy Press.
Porter, A. C., Kirst, M. W., Osthoff, E. J., Smithson, J. L., & Schneider, S. A. (1993). Reform up close: An analysis of high school mathematics and science classrooms. Madison: University of Wisconsin.
Putnam, R. T., & Borko, H. (1997). Teacher learning: Implications of new views of cognition. In B. J. Biddle, T. L. Good, & I. F. Goodson (Eds.), International handbook of teachers and teaching (Vol. 2, pp. 1223–1296). Dordrecht, the Netherlands: Kluwer.
Raudenbush, S. W., & Bryk, A. S. (1986). A hierarchical model for studying school effects. Sociology of Education, 59, 1–17.
Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: Applications and data analysis methods (2nd ed.). Thousand Oaks, CA: Sage.
Raudenbush, S. W., Bryk, A., & Congdon, R. (2004). Hierarchical linear and nonlinear modeling software. Lincolnwood, IL: Scientific Software International.
Raudenbush, S. W., & Kim, J. (2002). Statistical issues in analysis of international comparisons of educational achievement. In A. C. Porter & A. Gamoran (Eds.), Methodological advances in cross-national surveys of educational achievement (pp. 267–294). Washington, DC: National Academy Press.
Raudenbush, S., Rowan, B., & Cheong, Y. (1993). The pursuit of high order instructional goals in secondary schools: Class, teacher, and school influences. American Educational Research Journal, 30, 523–553.
Rivkin, S. G., Hanushek, E., & Kain, J. (1998). Teachers, schools and academic achievement (Working Paper No. 6691). Cambridge, MA: National Bureau of Economic Research.
Romberg, T. (1988). One point of view: NCTM’s curriculum and evaluation standards: What they are and why they are needed. Arithmetic Teacher, 35(9), 2–3.
Romberg, T. A. (2000). Changing the teaching and learning of mathematics. Australian Mathematics Teacher, 56(4), 6–9.
Rossi, P., Freeman, H., & Lipsey, M. (1999). Evaluation: A systematic approach (6th ed.). Beverly Hills, CA: Sage.
Rowan, B. (1990). Commitment and control: Alternative strategies for the organizational design of schools. In C. Cazden (Ed.), Review of research in education (Vol. 16, pp. 353–389). Washington, DC: American Educational Research Association.
Rowan, B. (2002). Large-scale, cross-national surveys of educational achievement: Promises, pitfalls, and possibilities. In A. C. Porter & A. Gamoran (Eds.), Methodological advances in cross-national surveys of educational achievement (pp. 321–349). Washington, DC: National Academy Press.
Rowan, B., Chiang, F., & Miller, R. (1997). Using research on employees’ performance to study the effects of teachers on students’ achievement. Sociology of Education, 70, 256–284.
Rowan, B., Schilling, S. G., Ball, D. L., & Miller, R. (2001). Measuring teachers’ pedagogical content knowledge in surveys: An exploratory study. Ann Arbor: University of Michigan.

Sanders, W. L., & Horn, S. P. (1998). Research findings from the Tennessee Value-Added Assessment System (TVAAS) database: Implications for educational evaluation and research. Journal of Personnel Evaluation in Education, 12, 247–256.
Schmidt, W. (Ed.). (1996). Characterizing pedagogical flow: An investigation of mathematics and science teaching in six countries. Boston: Kluwer Academic.
Schmidt, W., Houang, R., & Cogan, L. (2002). A coherent curriculum: The case of mathematics. American Educator, 26(2), 10–26, 47.
Schmidt, W., McKnight, C., & Raizen, S. (1997). A splintered vision: An investigation of United States science and mathematics education. Boston: Kluwer Academic.
Schoenfeld, A. (1985). Mathematical problem solving. New York: Academic Press.
Shavelson, R., & Stern, P. (1981). Research on teachers’ pedagogical thoughts, judgments, decisions and behavior. Review of Educational Research, 51, 455–498.
Shavelson, R. J., Webb, N. M., & Burstein, L. (1986). Measurement of teaching. In M. Wittrock (Ed.), Handbook of research on teaching (3rd ed., pp. 1–36). Washington, DC: American Educational Research Association.
Shkolnik, J. (1999). The behavioral effects of variations in class size: The case of math teachers. Educational Evaluation and Policy Analysis, 21, 193–213.
Shouse, R. (2001). The impact of traditional and reform-style practices on student mathematics achievement. In T. Loveless (Ed.), The great curriculum debate: How should we teach reading and math? (pp. 108–133). Washington, DC: Brookings Institution.
Shulman, L. S. (1987). Knowledge and teaching: Foundations of the new reform. Harvard Educational Review, 57, 1–22.
Silver, E. (Ed.). (1985). Teaching and learning mathematics problem solving: Multiple research perspectives. Hillsdale, NJ: Erlbaum.
Sizer, T. (1992). Horace’s school: Redesigning the American high school. Boston: Houghton Mifflin.
Slavin, R., Madden, N. A., Karweit, N., Livermon, B. J., & Dolan, L. (1990). Success for all: First-year outcomes of a comprehensive plan for reforming urban education. American Educational Research Journal, 27, 255–278.
Smith, J., Lee, V., & Newmann, F. (2001). Instruction and achievement in Chicago elementary schools. Chicago: Consortium on Chicago School Research.
Smith, M., & Glass, G. V. (1980). Meta-analysis of research on class size and its relationship to attitudes and instruction. American Educational Research Journal, 17, 419–433.
Smith, T. M. (2004). Curriculum reform in mathematics and science since “A Nation at Risk.” Peabody Journal of Education, 79, 105–129.
Smith, T. M., & Baker, D. P. (2002). World-wide growth and institutionalization of statistical indicators for education policy-making. Peabody Journal of Education, 76, 141–152.
Smithson, J. L., & Porter, A. C. (1994). Measuring classroom practice: Lessons learned from efforts to describe the enacted curriculum—The Reform Up Close study (CPRE Research Report 31). Madison: University of Wisconsin, Consortium for Policy Research in Education.
Spillane, J., & Zeuli, J. (1999). Reform and teaching: Exploring patterns of practice in the context of national and state mathematics reforms. Educational Evaluation and Policy Analysis, 21, 1–27.
Stevenson, D., & Baker, D. (1991). State control of the curriculum and classroom instruction. Sociology of Education, 64, 1–10.
Stevenson, D., & Baker, D. (1996). Does state control of the curriculum matter? A response to Westbury and Hsu. Educational Evaluation and Policy Analysis, 18, 339–343.


Stevenson, H., & Stigler, J. (1992). The learning gap: Why our schools are failing and what we can learn from Japanese and Chinese education. New York: Summit Books.
Stigler, J., & Hiebert, J. (1997). Understanding and improving classroom mathematics instruction. Phi Delta Kappan, 79, 14–21.
Stigler, J., & Hiebert, J. (1998). Teaching is a cultural activity. American Educator, 22(4), 4–11.
Stigler, J., & Hiebert, J. (1999). The teaching gap: Best ideas from the world’s teachers for improving education in the classroom. New York: Free Press.
Thompson, A., & Thompson, P. (1996). Talking about rates conceptually, Part II: Mathematical knowledge for teaching. Journal for Research in Mathematics Education, 27, 2–24.
Thompson, B. (2004). Exploratory and confirmatory factor analysis: Understanding concepts and applications. Washington, DC: American Psychological Association.
Tomlinson, C. (1999). The differentiated classroom: Responding to the needs of all learners. Alexandria, VA: Association for Supervision and Curriculum Development.
Tsuchida, I., & Lewis, C. (1996). Responsibility and learning: Some preliminary hypotheses about Japanese elementary classrooms. In T. P. Rohlen & G. K. LeTendre (Eds.), Teaching and learning in Japan (pp. 190–212). New York: Cambridge University Press.
Turnbull, B., Welsh, M., Heid, C., Davis, W., & Ratnofsky, A. C. (1999). The Longitudinal Evaluation of School Change and Performance (LESCP) in Title I schools: Interim report to Congress. Washington, DC: Policy Studies Associates/Westat.
U.S. Department of Education. (2003). Digest of education statistics, 2002. Washington, DC: Author.
Wilson, S., Floden, R., & Ferrini-Mundy, J. (2001). Teacher preparation research: Current knowledge, gaps, and recommendations. Seattle: University of Washington.
Wilson, S., Floden, R., & Ferrini-Mundy, J. (2002). Teacher preparation research: An insider’s view from the outside. Journal of Teacher Education, 53, 190–204.

Manuscript received March 31, 2004
Revision received April 8, 2004
Accepted May 16, 2005
