guidelines for impact evaluation in education using experimental design


ucativos - EXALE), and Peru (Evaluación Censal de Estudiantes - ECE), as well as the Trends in International Mathematics and Science Study (TIMSS) and the Progress in International Reading Literacy Study (PIRLS), administered internationally by the International Association for the Evaluation of Educational Achievement, and the Segundo Estudio Regional Comparativo y Explicativo (SERCE), administered internationally by the United Nations Educational, Scientific and Cultural Organization (UNESCO). Some publishers offer standardized tests such as the Test of Early Mathematics Ability (TEMA)10.

Second, developing a test and collecting data are costly, so it is worth considering local or national tests when they are available. Many countries now have standardized test scores that can be used for evaluation.

Third, written tests are usually applicable only to children in a narrow grade range and with specific characteristics. In contexts where the educational level is very low, children may not be able to read instructions, and tests may need to be administered verbally.

Fourth, care is needed when teacher-specific test scores and failure rates are the only instruments used to measure education quality. Such measures are positively correlated with test scores, but it is unlikely that all students are compared against a common scale, because teachers may tend to normalize scores. For example, a grade of 7 out of 10 in a school where students tend to perform very well is unlikely to be equivalent to 7 out of 10 in a school where most students tend to perform poorly.

Fifth, potential sources of measurement error need to be identified and avoided. Muralidharan and Sundararaman (2011) test students twice in each round to include more material, reduce the impact of measurement errors specific to the day of the test, and reduce sample attrition due to student absence on the day of the test. The authors also report on the use of repeated questions across rounds, because improvements may be a result of students remembering questions.

Finally, testing should be externally monitored. Cheating has been well documented both in the United States and in other countries, and it is a particular concern in programs with performance pay schemes.

To summarize: determine whether developing a test is necessary, choose a test that measures the outcome you intend to capture, ensure that the test is applicable to the specific student group, ensure that all students are measured by a common metric, and consider monitoring when collecting data.
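The point about testing students twice in each round can be illustrated with a small simulation. The sketch below is not the procedure or data of Muralidharan and Sundararaman (2011); it assumes, purely for illustration, that an observed score equals a student's true ability plus independent day-specific noise, and it compares the error in a single sitting with the error in the average of two sittings. The function names, sample size, and noise levels are all hypothetical.

```python
import random
import statistics


def simulate_scores(n_students=5000, ability_sd=1.0, day_noise_sd=0.8, seed=0):
    """Return (true abilities, single-sitting scores, two-sitting averages).

    All parameter values are illustrative assumptions, not estimates
    taken from any study.
    """
    rng = random.Random(seed)
    ability = [rng.gauss(0.0, ability_sd) for _ in range(n_students)]
    # Each sitting adds independent, day-specific noise to true ability.
    sitting_1 = [a + rng.gauss(0.0, day_noise_sd) for a in ability]
    sitting_2 = [a + rng.gauss(0.0, day_noise_sd) for a in ability]
    average = [(x + y) / 2 for x, y in zip(sitting_1, sitting_2)]
    return ability, sitting_1, average


def error_variance(scores, ability):
    """Variance of the gap between an observed score and true ability."""
    return statistics.pvariance([s - a for s, a in zip(scores, ability)])


ability, single, averaged = simulate_scores()
var_single = error_variance(single, ability)
var_avg = error_variance(averaged, ability)
print(f"error variance, one sitting:  {var_single:.3f}")
print(f"error variance, two sittings: {var_avg:.3f}")
```

Under these assumptions, averaging two sittings cuts the day-specific error variance roughly in half. If the noise on the two days were correlated (for example, a disruption lasting the whole week), the gain would be smaller.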

10 More information at http://www.psych-edpublications.com/arithmetic.htm#tema
