





1. Front Page
2. Index
3. Introduction
4. Assessment
5. Evaluation
6. Blue Print
7. Type of Tests
8. Final test
9. Conclusion


Teachers need to know different kinds of strategies and must take into account the aspects that influence the teaching-learning process in class: not only the students' behavior but also the teacher's leadership. The following pages present assessment, evaluation, and different types of tests that will help us deal with our daily teaching routine. We will also find a blueprint format that will help us structure an exam correctly. If we take this information into account, we can improve our classes and make them more effective.

CLASS LOG
Date: 07/23/2011
Topic: Assessment and Evaluation

ASSESSMENT
Assessment is an activity that we can apply in our classes without prior planning to check our students' comprehension of a specific topic. Assessment activities can be used at the beginning or at the end of the class. Before moving on to another topic in our plan, we first need to check how well the current topic was understood; according to the results, we can then teach a new topic. Assessment can be divided into two groups:
· Formal assessment
· Informal assessment

EVALUATION
Evaluation is the action of interpreting information over a specific period of time. It helps me check my students' knowledge of the topic that I am teaching. An evaluation is carried out at a specific time and produces a score that summarizes the students' knowledge. It is always required in courses that our students must pass for grading in their degree programs. Evaluation can be divided into three groups:
· Diagnostic
· Formative
· Summative


Chapter 9. Assessment Vocabulary

The definitions in this list were derived from several sources, including:
• Glossary of Useful Terms Related to Authentic and Performance Assessments. Grant Wiggins
• SCASS Arts Assessment Project Glossary of Assessment Terms
• The ERIC Review: Performance-Based Assessment. Vol. 3 Issue 1, Winter, 1994.
• Assessment: How Do We Know What They Know? ASCD. 1992.
• Dissolving the Boundaries: Assessment that Enhances Learning. Dee Dickinson

Accountability – The demand by a community (public officials, employers, and taxpayers) for school officials to prove that money invested in education has led to measurable learning. "Accountability testing" is an attempt to sample what students have learned, how well teachers have taught, and/or the effectiveness of a school principal's performance as an instructional leader. School budgets and personnel promotions, compensation, and awards may be affected. Most school districts make this kind of assessment public; it can affect policy and public perception of the effectiveness of taxpayer-supported schools and be the basis for comparison among schools. It has been suggested that test scores analyzed in a disaggregated format can help identify instructional problems and point to potential solutions.

Action Plans – The statement that indicates the specific changes that a given area plans to implement in the next cycle based on assessment results. "The biology faculty will introduce one special project in the introductory class that will expose the students to the scientific method." "Career Services is implementing a software program called ‘1st Place’. This software will allow better tracking of job openings."

Action Research – Classroom-based research involving the systematic collection of data in order to address certain questions and issues so as to improve classroom instruction and educational effectiveness.

Affective Outcomes – Outcomes of education that reflect feelings more than understanding; likes, pleasures, ideals, dislikes, annoyances, values.

Annual Report – A report from each academic program, based on its assessment plan and submitted annually, which outlines how evidence was used to improve student learning outcomes through curricular and/or other changes, or documents that no changes were needed.

Assessment – The systematic collection, review, and use of information about educational programs undertaken for the purpose of improving student learning and development. In general terms, assessment is the determination of a value, or measurement, based on a "standard." We often refer to this standard as a "target." Standard-based measurement, or assessment, is useful in education for both the placement of students in initial course work and ascertaining the extent of students' acquisition of skills/knowledge.

Assessment Cycle – The assessment cycle in higher education is generally annual and fits within the academic year. Outcomes, targets and assessment tools are established early in the fall semester; data is collected by the end of spring semester; results are analyzed during the summer and early fall.

Assessment Tool – An instrument that has been designed to collect objective data about students' knowledge and skill acquisition. An appropriate outcomes assessment test measures students' ability to integrate a set of individual skills into a meaningful, collective demonstration. Some examples of assessment tools include standardized tests, end-of-program skills tests, student inquiries, common final exams, and comprehensive embedded test items.

Assessment Literacy – The possession of knowledge about the basic principles of sound assessment practice, including terminology, the development and use of assessment methodologies and techniques, and familiarity with standards of quality in assessment. Increasingly, it also includes familiarity with alternatives to traditional measurements of learning.

Authentic Assessment – A circumstance in which the behavior that the learning is intended to produce is evaluated and discussed in order to improve learning. It follows the concept of model, practice, and feedback, in which students know what excellent performance is and are guided to practice an entire concept rather than bits and pieces in preparation for eventual understanding. A variety of techniques can be employed in authentic assessment.

Benchmark – Student performance standards (the level(s) of student competence in a content area).

Cohort – A group whose progress is followed by means of measurements at different points in time.

Course-embedded assessment – A method in which evidence of student learning outcomes for the program is obtained from assignments in particular courses in the curriculum.

Course-level assessment – Assessment to determine the extent to which a specific course is achieving its learning goals.

Course mapping – A matrix showing the coverage of each program learning outcome in each course. It may also indicate the level of emphasis of each outcome in each course.

Criterion Referenced Tests – Tests in which the results can be used to determine a student's progress toward mastery of a content area. Performance is compared to an expected level of mastery in a content area rather than to other students' scores. Such tests usually include questions based on what the student was taught and are designed to measure the student's mastery of designated objectives of an instructional program. The "criterion" is the standard of performance established as the passing score for the test. Scores have meaning in terms of what the student knows or can do, rather than how the test-taker compares to a reference or norm group.

Curriculum Map – A matrix showing where each goal and/or learning outcome is covered in each program course.

Direct Assessment – Assessment to gauge student achievement of learning outcomes directly from their work.

Educational Goals – The knowledge, skills, abilities, capacities, attitudes or dispositions students are expected to acquire as a result of completing your academic program. Goals are sometimes treated as synonymous with outcomes, though outcomes are the behavioral results of the goals, and are stated in precise operational terms.

Formative assessment – The assessment of student achievement at different stages of a course or at different stages of a student's academic career. The focus of formative assessment is on the documentation of student development over time. It can also be used to engage students in a process of reflection on their education.

General Education Assessment – Assessment that measures the campus-wide, general education competencies agreed upon by the faculty. General education assessment is more holistic in nature than program outcomes assessment because competencies are measured across disciplines, rather than just within a single discipline.

Holistic Scoring – In assessment, assigning a single score based on an overall assessment of performance rather than by scoring or analyzing dimensions or traits individually. The product is considered to be more than the sum of its parts and so the quality of a final product or performance is evaluated rather than the process or dimension of performance. A holistic scoring rubric might combine a number of elements on a single scale. Focused holistic scoring may be used to evaluate a limited portion of a learner's performance.

Indirect Assessment – Assessment that deduces student achievement of learning outcomes through the reported perception of learning by students and other agents.

Institutional assessment – Assessment to determine the extent to which a college or university is achieving its mission.

Learning outcomes – Operational statements describing specific student behaviors that evidence the acquisition of desired goals in knowledge, skills, abilities, capacities, attitudes or dispositions. Learning outcomes can be usefully thought of as behavioral criteria for determining whether students are achieving the educational goals of a program, and, ultimately, whether overall program goals are being successfully met. Outcomes are sometimes treated as synonymous with objectives, though objectives are usually more general statements of what students are expected to achieve in an academic program.

Measurable Criteria – An intended student outcome, or administrative objective, restated in a quantifiable, or measurable, statement. "60% of biology students will complete an experiment/project using scientific methods in fall 2003;" "75% of responding MU students will indicate on a survey in fall 2003 that they have read materials about career opportunities on campus."

Metacognition – The knowledge of one's own thinking processes and strategies, and the ability to consciously reflect and act on the knowledge of cognition to modify those processes and strategies.

Norm – A distribution of scores obtained from a norm group. The norm is the midpoint (or median) of scores or performance of the students in that group. Fifty percent will score above and fifty percent below the norm.

Performance-Based Assessment – Direct, systematic observation and rating of student performance of an educational objective, often an ongoing observation over a period of time, and typically involving the creation of products. The assessment may be a continuing interaction between teacher and student and should ideally be part of the learning process. The assessment should be a real-world performance with relevance to the student and learning community. Assessment of the performance is done using a rubric, or analytic scoring guide, to aid in objectivity. Performance-based assessment is a test of the ability to apply knowledge in a real-life setting or performance of exemplary tasks in the demonstration of intellectual ability.

Portfolio – A systematic and organized collection of a student's work that exhibits to others the direct evidence of a student's efforts, achievements, and progress over a period of time. The collection should involve the student in selection of its contents, and should include information about the performance criteria, the rubric or criteria for judging merit, and evidence of student self-reflection or evaluation.

Portfolio Assessment – Portfolios may be assessed in a variety of ways. Each piece may be individually scored, or the portfolio might be assessed merely for the presence of required pieces, or a holistic scoring process might be used and an evaluation made on the basis of an overall impression of the student's collected work. It is common that assessors work together to establish consensus of standards or to ensure greater reliability in evaluation of student work. Established criteria are often used by reviewers and students involved in the process of evaluating progress and achievement of objectives.

Primary Trait Method – A type of rubric scoring constructed to assess a specific trait, skill, behavior, or format, or the evaluation of the primary impact of a learning process on a designated audience.

Process – A generalizable method of doing something, generally involving steps or operations which are usually ordered and/or interdependent. Process can be evaluated as part of an assessment, as in the example of evaluating a student's performance during prewriting exercises leading up to the final production of an essay or paper.

Program assessment – Assessment to determine the extent to which students in a departmental program can demonstrate the learning outcomes for the program.

Reliability – An assessment tool's consistency of results over time and with different samples of students.

Rubric – A set of criteria specifying the characteristics of a learning outcome and the levels of achievement in each characteristic.

Self-efficacy – Students' judgment of their own capabilities for a specific learning outcome.

Senior Project – Extensive projects planned and carried out during the senior year as the culmination of the undergraduate experience. Senior projects require higher-level thinking skills, problem-solving, and creative thinking. They are often interdisciplinary, and may require extensive research. Projects culminate in a presentation of the project to a panel of people, usually faculty and community mentors, sometimes students, who evaluate the student's work at the end of the year.

Summative assessment – The assessment of student achievement at the end point of their education or at the end of a course. The focus of summative assessment is on the documentation of student achievement by the end of a course or program. It does not reveal the pathway of development to achieve that endpoint.

Triangulation – The collection of data via multiple methods in order to determine if the results show a consistent outcome.

Validity – The degree to which an assessment measures (a) what is intended, as opposed to (b) what is not intended, or (c) what is unsystematic or unstable.

CLASS LOG
Date: 07/30/2011
Topic: General review, Assessment and Evaluation

This class was a general review of the topics already taught. We reviewed what assessment and evaluation are. Assessment, as we remember, is an activity that we can use any time we want to check our students' knowledge in class. Evaluation is a timed and planned activity to check our students' progress in the class. The teacher also gave us some examples of informal and formal assessment. She showed us how we can create projects using our imagination and, first of all, the students' imagination. We can assign them projects and let them work on their own. First, however, we have to set our rules and parameters, because we must keep the objectives of each piece of class work in mind. It is also necessary to tell the students not to bring dangerous materials to class, and we must make sure that they do not get hurt while working on the project. If we go outside the school, we have to plan every detail of the project, and then we will have good results.





The Teaching Assessment and Evaluation Guide provides instructors with starting-points for reflecting on their teaching, and with advice on how to gather feedback on their teaching practices and effectiveness as part of a systematic program of teaching development. As well, the Guide provides guidance on how teaching might be fairly and effectively evaluated, which characteristics of teaching might be considered, and which evaluation techniques are best suited for different purposes. The Teaching Assessment and Evaluation Guide is a companion to the Teaching Documentation Guide (1993), also prepared by the Senate Committee on Teaching and Learning (SCOTL). The Documentation Guide (available at the Centre for the Support of Teaching and on the SCOTL website) aims to provide instructors with advice and concrete suggestions on how to document the variety and complexity of their teaching contributions.

Teaching is a complex and personal activity that is best assessed and evaluated using multiple techniques and broadly-based criteria. Assessment for formative purposes is designed to stimulate growth, change and improvement in teaching through reflective practice. Evaluation, in contrast, is used for summative purposes to give an overview of a particular instructor’s teaching in a particular course and setting. Informed judgements on teaching effectiveness can best be made when both assessment and evaluation are conducted, using several techniques to elicit information from various perspectives on different characteristics of teaching. There is no one complete source for information on one’s teaching, and no single technique for gathering it. Moreover, the techniques need to be sensitive to the particular teaching assignment of the instructor being assessed or evaluated, as well as the context in which the teaching takes place. If multiple perspectives are represented and different techniques used, the process will be more valued, the conclusions reached will be more credible, and consequently more valuable to the individual being assessed or evaluated.

CONTENTS
• Introduction
• Need for the Guide
• What is Quality Teaching?
• Formative Assessment
• Summative Evaluation
• Overview of Assessment and Evaluation Strategies:
1. Teaching dossiers
2. Student ratings
3. Peer observations
4. Letters & individual interviews
5. Course portfolios
6. Classroom assessment
• Classroom Assessment Techniques

Current practices at York University are varied. In most departments and units, teaching is systematically evaluated, primarily for summative purposes. Individual instructors are free, if they wish, to use the data so gathered for formative purposes, or they may contact the Centre for the Support of Teaching which provides feedback and teaching analysis aimed at growth, development and improvement. Without denying the value of summative teaching evaluation, the main purpose of this Guide is to encourage committees and individuals to engage in reflective practice through the ongoing assessment of teaching for formative purposes and for professional development. Research indicates that such practice leads to heightened enthusiasm for teaching, and improvement in teaching and learning, both of which are linked to faculty vitality.

The Teaching Assessment and Evaluation Guide© is published by the Senate Committee on Teaching and Learning (SCOTL), York University (revised January 2002)

Teaching Assessment and Evaluation Guide


All assessment and evaluation techniques contain implicit assumptions about the characteristics that constitute quality teaching. These assumptions should be made explicit and indeed should become part of the evaluation process itself in a manner which recognizes instructors’ rights to be evaluated within the context of their own teaching philosophies and goals. First and foremost then, “teaching is not right or wrong, good or bad, effective or ineffective in any absolute, fixed or determined sense.”¹ Instructors emphasize different domains of learning (affective, cognitive, psychomotor, etc.) and employ different theories of education and teaching methodologies (anti-racist, constructivist, critical, feminist, humanistic, etc.)². They encourage learning in different sites (classrooms, field locations, laboratories, seminar rooms, studios, virtual classrooms, etc.). They use different instructional strategies and formats (using case studies, coaching, demonstrating, facilitating discussions, lecturing, problem-based learning, online delivery, etc.), and they do this while recognizing that students have diverse backgrounds and levels of preparedness. In one situation, instructors may see their role as transmitting factual information, and in another as facilitating discussion and promoting critical thinking.

As variable and diverse as quality teaching might be, generalizations may nevertheless be made about its basic characteristics as described in the accompanying text box.

QUALITY TEACHING
Put succinctly, quality teaching is that activity which brings about the most productive and beneficial learning experience for students and promotes their development as learners. This experience may include such aspects as:
• improved comprehension of and ability to use the ideas introduced in the course;
• change in outlook, attitude and enthusiasm towards the discipline and its place in the academic endeavour;
• intellectual growth; and
• improvement in specific skills such as critical reading and writing, oral communication, analysis, synthesis, abstraction, and generalization.

The criteria for evaluating teaching vary between disciplines and within disciplines, and should take into consideration the level of the course, the instructor’s objectives and style, and the teaching methodology employed. Nonetheless, the primary criterion must be improved student learning. Research indicates that students, faculty and administrators alike agree that quality teaching:
• establishes a positive learning environment;
• motivates student engagement;
• provides appropriate challenges;
• is responsive to students’ learning needs; and
• is fair in evaluating their learning.

Concretely, indicators of quality teaching can include:
• effective choice of materials;
• organization of subject matter and course;
• effective communication skills;
• knowledge of and enthusiasm for the subject matter and teaching;
• availability to students; and
• responsiveness to student concerns and opinions.

Some characteristics are more easily measured than others. Furthermore, since instructors are individuals and teaching styles are personal, it is all the more important to recognize that not everyone will display the same patterns and strengths.

ASSESSMENT OF TEACHING FOR FORMATIVE PURPOSES
Formative assessment of teaching can be carried out at many points during an instructional period, in the classroom or virtual environment, to compare the perceptions of the instructor with those of the students, and to identify gaps between what has been taught and what students have learned. The purpose of assessment is for instructors to find out what changes they might make in teaching methods or style, course organization or content, evaluation and grading procedures, etc., in order to improve student learning. Assessment is initiated by the instructor and information and feedback can be solicited from many sources (for example, self, students, colleagues, consultants) using a variety of instruments (surveys, on-line forms, etc. - see classroom assessment below). The data gathered are seen only by the instructor and, if desired, a consultant, and form the basis for ongoing improvement and development.

SUMMATIVE EVALUATION
Summative evaluation, by contrast, is usually conducted at the end of a particular course or at specific points in an instructor’s career. The purpose is to form a judgment about the effectiveness of a course and/or an instructor. The judgment may be used for tenure and promotion decisions, to reward success in the form of teaching awards or merit pay, or to enable departments to make informed decisions about changes to individual courses, the curriculum or teaching assignments.

At most universities, summative evaluation includes the results of teaching evaluations regularly scheduled at the end of academic terms. However, to ensure that summative evaluation is both comprehensive and representative, it should include a variety of evaluation strategies, among them:
• letters from individual students commenting on the effectiveness of the instructor’s teaching, the quality of the learning experience, and the impact of both on their academic progress;
• assessments by peers based on classroom visits;
• samples and critical reviews of contributions to course and curriculum development, as well as of contributions to scholarship on teaching; and
• evidence of exceptional achievements and contributions to teaching in the form of awards, and committee work.

One’s teaching dossier (see below) is an ideal format for presenting these types of evaluation as a cumulative and longitudinal record of one’s teaching.

Important note: It is crucial that the two processes – summative evaluation and formative assessment – be kept strictly apart if the formative assessment of teaching is to be effective and achieve its purpose. This means that the information gathered in a program of formative assessment should not be used in summative evaluation unless volunteered by instructors themselves. It also means that persons who are or have been involved in assisting instructors to improve their teaching should not be asked to provide information for summative evaluation purposes.

______
1. Mary Ellen Weimer (1990). Improving College Teaching (CA: Jossey Bass Publishers), 202.
2. Adapted from George L. Geis (1977), “Evaluation: definitions, problems and strategies,” in Chris Knapper et al Eds., Teaching is Important (Toronto: Clarke Irwin in association with CAUT).

OVERVIEW OF STRATEGIES FOR ASSESSING AND EVALUATING QUALITY TEACHING AND STUDENT LEARNING
This section describes six strategies that teachers may use to assess and evaluate the quality of their teaching and its impact on student learning: 1) teaching dossiers; 2) student ratings; 3) peer observations; 4) letters and individual interviews; 5) course portfolios; and 6) classroom assessment. These descriptions draw on current research in the field (available at the Centre for the Support of Teaching, 111 Central Square) and on practices and procedures at other universities in Canada and abroad. All evaluation and assessment efforts should use a combination of strategies to take advantage of their inherent strengths and to offset their individual limitations.


1. TEACHING DOSSIERS
A teaching dossier or portfolio is a factual description of an instructor’s teaching achievements and contains documentation that collectively suggests the scope and quality of his or her teaching. Dossiers can be used to present evidence about teaching quality for evaluative purposes such as tenure and promotion (T&P) submissions, teaching award nominations, etc., as they can provide a useful context for analyzing other forms of teaching evaluation. Alternatively, dossiers can provide the framework for a systematic program of reflective analysis and peer collaboration leading to improvement of teaching and student learning. For further information on how to prepare a teaching dossier, please consult SCOTL’s Teaching Documentation Guide (available at the Centre for the Support of Teaching and from the SCOTL website).

To focus on:
§ Appraisal of instructor’s teaching and learning context
§ Soundness of instructor’s approach to teaching and learning
§ Coherence of teaching objectives and strategies
§ Vigour of professional development, contributions and accomplishments in the area of teaching.

Benefits: Dossiers provide an opportunity for instructors to articulate their teaching philosophy, review their teaching goals and objectives, assess the effectiveness of their classroom practice and the strategies they use to animate their pedagogical values, and identify areas of strength and opportunities for improvement. They also highlight an instructor’s range of responsibilities, accomplishments, and contributions to teaching and learning more generally within the department, university and/or scholarly community.

Limitations: It is important to note that dossiers are not meant to be an exhaustive compilation of all the documents and materials that bear on an instructor’s teaching performance; rather they should present a selection of information organized in a way that gives a comprehensive and accurate summary of teaching activities and effectiveness.

_______
For further information on teaching dossiers see:
Teaching Documentation Guide (1993, Senate Committee on Teaching and Learning).
Peter Seldin “Self-Evaluation: What Works? What Doesn’t?” and John Zubizarreta “Evaluating Teaching through Portfolios” in Seldin and Associates (1999). Changing Practices in Evaluating Teaching: A Practical Guide to Improved Faculty Performance and Promotion/Tenure Decisions (MA: Anker Press).



2. STUDENT RATINGS OF TEACHING Student ratings of To focus on: teaching or student evaluations are the most § Effectiveness of instructor commonly used source § Impact of instruction on of data for both student learning summative and § Perceived value of the course formative information. to the student In many academic units they are mandatory, and § Preparation and organization in several units, they are § Knowledge of subject matter also standardized. For and ability to stimulate purposes such as tenure interest in the course and promotion, data should be obtained over § Clarity and understandability time and across courses § Ability to establish rapport using a limited number and encourage discussion of global or summary within the classroom type questions. Such § Sensitivity to and concern data will provide a with students’ level of undercumulative record and standing and progress enable the detection of patterns of teaching development1. Information obtained by means of student ratings can also be used by individual instructors to improve the course in future years, and to identify areas of strength and weakness in their teaching by comparison with those teaching similar courses. Longer and more focussed questionnaires are also useful in a program of formative evaluation when designed and administered by an instructor during a course. Benefits: The use of a mandatory, standardized questionnaire puts all teaching evaluations on a common footing, and facilitates comparisons between teachers, courses and academic units. The data gathered also serve the purpose of assessing whether the educational goals of the unit are being met. Structured questionnaires are particularly appropriate where there are relatively large numbers of students involved, and where there are either several sections of a single course, or several courses with similar teaching objectives using similar teaching approaches.

Limitations: While students’ perceptions provide valuable feedback to instructors, recent research has identified specific areas of teaching quality on which students are not able to make informed judgments. These include the appropriateness of course goals, content, 3 design, materials, and evaluation of student work. Thus, the use of a variety of techniques as described elsewhere in this document can help to address the gaps and shortcomings in the student rating data. Further, recent research indicates that care should be taken to control for possible biases based on gender, race, discipline, and teaching approach, particularly for those using non-traditional teaching methods and curriculum. Likewise, ratings can be affected by factors for which it is difficult to control, such as student motivation, complexity of material, level of course, and class size. Care should be taken, therefore, to create an appropriate context for interpreting the data in light of other sources of data and in comparison with other courses. One way to ensure fairness and equity is to ask students to identify the strengths of the instructor’s approach as well as weaknesses, and to ask for specific suggestions for improvement. Teachers have such different perspectives, approaches, and objectives that a standardized questionnaire may not adequately or fairly compare their performance. For example, the implicit assumption behind the design of many evaluation forms is that the primary mode of instruction is the lecture method. Such a form will be inadequate in evaluating the performance of instructors who uses different teaching methods, for example collaborative learning. One way to overcome this limitation and to tailor the questionnaire to the objectives and approaches of a specific course or instructor is to design an evaluation form with a mandatory core set of questions and additional space for inserting questions chosen by the instructor. 
Note: The Centre for the Support of Teaching has sample teaching evaluation forms from numerous Faculties and departments, as well as books and articles which are helpful resources for individuals and committees interested in developing questionnaires. In addition, web resources are posted on the SCOTL website.

Questionnaires are relatively economical to administer, summarize and interpret. Provided that students are asked to comment only on items with which they have direct experience, student responses to questionnaires have been found to be valid. While questionnaire forms with open-ended questions are more expensive to administer, they often provide more reliable and useful sources of information in small classes and for the tenure and promotion process. Also, open-ended questions provide insight into the numerical ratings, and provide pertinent information for course revision.

For further information on student ratings of teaching see:
1. Cashin, William (1995), “Student ratings of teaching: The research revisited.” Idea Paper, Number 32 (Kansas State University, Centre for Faculty Development).
2. See, for example, The Teaching Professor, Vol. 8, No. 4, 3-4.
3. See also Theall, Michael and Franklin, Jennifer, Eds. (1990). Student Ratings of Instruction: Issues for Improving Practice, New Directions in Teaching and Learning, No. 43 (CA: Jossey-Bass Inc.).


Teaching Assessment and Evaluation Guide


Peer observation is especially useful for formative evaluation. In this case, it is important that the results of the observations remain confidential and not be used for summative evaluation. The process of observation in this case should take place over time, allowing the instructor to implement changes, practice improvements and obtain feedback on whether progress has been made. It may also include video-taping the instructor’s class. This process is particularly helpful to faculty who are experimenting with new teaching methods.

To focus on:
§ Quality of the learning environment (labs, lecture halls, online discussion groups, seminars, studios, etc.)
§ Level of student engagement
§ Clarity of presentation, and ability to convey course content in a variety of ways
§ Range of instructional methods and how they support student understanding
§ Student-instructor rapport
§ Overall effectiveness

Peer observations offer critical insights into an instructor’s performance, complementing student ratings and other forms of evaluation to contribute to a fuller and more accurate representation of overall teaching quality. Research indicates that colleagues are in the best position to judge specific dimensions of teaching quality, including the goals, content, design and organization of the course, the methods and materials used in delivery, and evaluation of student work.

A particularly valuable form of observation for formative purposes is peer-pairing. With this technique, two instructors provide each other with feedback on their teaching on a rotating basis, each evaluating the other for a period of time (anywhere between 2 weeks and a full year). Each learns from the other and may learn as much in the observing role as when being observed. Full guidelines for using this technique, as well as advice and assistance in establishing a peer-pairing relationship, are available from the Centre for the Support of Teaching. Benefits: Peer observations can complete the picture of an instructor’s teaching obtained through other methods of evaluation. As well, observations are an important supplement to contextualize variations in student ratings in situations, for example, where an instructor’s teaching is controversial because experimental or non-traditional teaching methods are being used, or where other unique situations exist within the learning environment. Colleagues are better able than students to comment upon the level of difficulty of the material, knowledge of subject matter and integration of topics, and they can place the teaching within a wider context and suggest alternative teaching formats and ways of communicating the material.

Peer observation may be carried out for both summative and formative purposes. For summative evaluation, it is recommended that prior consensus be reached about what constitutes quality teaching within the discipline, what the observers will be looking for, and the process for carrying out and recording the observations. To ensure that a full picture of an instructor’s strengths and weaknesses is obtained, some observers find checklists useful and some departments may choose to designate the responsibility of making classroom observations to a committee. Given the range of activities in a class, some observers find it helpful to focus on specific aspects of the teaching and learning that takes place. It is also advisable that more than one colleague be involved, and that more than one observation take place by each colleague. This will counteract observer bias towards a particular teaching approach and the possibility that an observation takes place on an unusually bad day. These precautions also provide for greater objectivity and reliability of the results.

Before an observation, it is important that the observer and instructor meet to discuss the instructor’s teaching philosophy, the specific objectives and the strategies that will be employed during the session to be observed, and the materials relevant to the course: syllabus, assignments, online course components, etc. Likewise, discussions of the criteria for evaluation and how the observations will take place can help to clarify expectations and procedures. A post-observation meeting allows an opportunity for constructive feedback and assistance in the development of a plan for improvement.

Limitations: There are several limitations to using peer observations for summative purposes. First, unless safeguards are put in place to control for sources of bias, conflicting definitions of teaching quality, and idiosyncrasies in practice, inequities can result in how classroom observations are done [1]. For example, instructors tend to find observations threatening and they and their students may behave differently when there is an observer present. Also, there is evidence to suggest that peers may be relatively generous evaluators in some instances. A second limitation is that it is costly in terms of faculty time since a number of observations are necessary to ensure the reliability and validity of findings. Since observers vary in their definitions of quality teaching and some tact is required in providing feedback on observations, it is desirable that observers receive training before becoming involved in providing formative evaluation. The approaches described above can help to minimize these inequities and improve the effectiveness of peer observation. Finally, to protect the integrity of this technique for both formative and summative purposes, it is critical that observations for personnel decisions be kept strictly separate from evaluations for teaching improvement.

For further information on colleague evaluation of teaching see:
1. DeZure, Deborah. “Evaluating teaching through peer classroom observation,” in Peter Seldin and Associates (1999). Changing Practices in Evaluating Teaching: A Practical Guide to Improved Faculty Performance and Promotion/Tenure Decisions (MA: Anker Press).

4. LETTERS AND INDIVIDUAL INTERVIEWS Letters and/or individual interviews may be used in teaching award nominations, tenure and promotion files, etc. to obtain greater depth of information for the purpose of improving teaching, or for providing details and examples of an instructor’s impact on students.

To focus on:
§ Effectiveness of instructor through detailed reflection
§ Impact of instruction on student learning and motivation over the longer term
§ Preparation and organization
§ Clarity and understandability
§ Ability to establish rapport and encourage discussion
§ Sensitivity to and concern with students’ level of understanding and progress

Benefits: Interviews and letters elicit information not readily available through student ratings or other forms of evaluation. Insights, success stories, and thoughtful analyses are often the outcomes of an interview or request for written impressions of an instructor’s teaching. Students who are reluctant to give information on a rating scale or in written form often respond well to a skilled, probing interviewer.

To focus on:
§ Appropriateness of course goals and objectives
§ Quality of instructional materials and assignments
§ Coherence of course organization, teaching strategies and modes of delivery
§ Comprehensiveness of methods for appraising student achievement
§ Level of student learning and contribution of teaching to students’ progress
§ Innovations in teaching and learning

A course portfolio is a variant on the teaching dossier and is the product of focussed inquiry into the learning by students in a particular course. It represents the specific aims and work of the instructor and is structured to explain what, how and why students learn in a class. It generally comprises four main components:
1) a statement of the aims and pedagogical strategies of the course and the relationship between the method and outcomes;
2) an analysis of student learning based on key assignments and learning activities to advance course goals;
3) an analysis of student feedback based on classroom assessment techniques; and
4) a summary of the strengths of the course in terms of students’ learning, and critical reflection on how the course goals were realised, changed or unmet.
The final analysis leads to ideas about what to change in order to enhance student learning, thinking and development the next time the course is taught [1].

Course portfolios have been described as being closely analogous to a scholarly project, in that: “a course, like a project, begins with significant goals and intentions, which are enacted in appropriate ways and lead to relevant results in the form of student learning. Teaching, like a research project, is expected to shed light on the question at hand and the issues that shape it; the methods used to complete the project should be congruent with the outcomes sought. The course portfolio has the distinct advantage of representing – by encompassing and connecting planning, implementation and results – the intellectual integrity of teaching as reflected in a single course.” [2]

Limitations: The disadvantage of letters is that the response rate can be low. The major disadvantage of interviews is time. Interviews can take approximately one hour to conduct, about 30 minutes to arrange, and another block of time for coding and interpretation. A structured interview schedule should be used to eliminate the bias that may result when an untrained interviewer asks questions randomly of different students.

Benefits: The focus on a specific course allows the portfolio to demonstrate student understanding as an index of successful teaching. For instructors, course portfolios provide a framework for critical reflection and continuous improvement of teaching, and deep insight into how their teaching contributes to students’ knowledge and skills.



For departments, they can highlight cohesion and gaps within the curriculum and enable continuity within the course over time and as different instructional technologies are incorporated. As well, course portfolios can collectively promote course articulation and provide means of assessing the quality of a curriculum and pedagogical approaches in relation to the overall goals and outcomes of a program of study.

Limitations: Because course portfolios focus on one course, they do not reflect the full range of an instructor’s accomplishments, responsibilities, and contributions (such as curriculum development and work with graduate students) that would be documented in a teaching dossier. Also, course portfolios take time to prepare and evaluate, and instructors should not be expected to build a portfolio for every course taught; rather they should concentrate on those courses for which they have the strongest interest or in which they invest the majority of their energy, imagination and time [3].

For further information on course portfolios see:
1. Cerbin, William (1994), “The course portfolio as a tool for continuous improvement of teaching and learning.” Journal on Excellence in College Teaching, 5(1), 95-105.
2. Cambridge, Barbara. “The Teaching Initiative: The course portfolio and the teaching portfolio.” American Association for Higher Education.
3. Cutler, William (1997). The history course portfolio. Perspectives 35 (8): 17-20.

6. CLASSROOM ASSESSMENT*

To focus on:
§ Effectiveness of teaching on learning
§ Constructive feedback on teaching strategies and classroom/online practices
§ Information on what students are learning and level of understanding of material
§ Quality of student learning and engagement
§ Feedback on course design

Classroom assessment is a method of inquiry into the effects of teaching on learning. It involves the use of techniques and instruments designed to give instructors ongoing feedback about the effect their teaching is having on the level and quality of student learning; this feedback then informs their subsequent instructional decisions. Unlike tests and quizzes, classroom assessment can be used in a timely way to help instructors identify gaps between what they teach and what students learn and enable them to adjust their teaching to make learning more efficient and effective. The information should always be shared with students to help them improve their own learning strategies and become more successful self-directed learners.

There are a variety of instruments for classroom assessment, either in class or electronically, such as one-minute papers, one-sentence summaries, critical incident questionnaires, focus groups, and mid-year mini surveys (see page 8). Generally, the instruments are created, administered, and results analysed by the instructor to focus on specific aspects of teaching and student learning. Although the instructor is not obligated to share the results of classroom assessment beyond the course, the results may usefully inform other strategies for evaluating teaching quality.

Classroom assessment can be integrated into an instructor’s teaching in a graduated way, starting out with a simple assessment technique in one class involving five to ten minutes of class time, less than an hour for analysis of the results, and a few minutes during a subsequent class to let students know what was learned from the assessment and how the instructor and students can use that information to improve learning. After conducting one or two quick assessments, the instructor can decide whether this approach is worth further investment of time and energy.

Benefits: Classroom assessment encourages instructors to become monitors of their own performance and promotes reflective practice. In addition, its use can prompt discussion among colleagues about their effectiveness, and lead to new and better techniques for eliciting constructive feedback from students on teaching and learning.

Limitations: As with student ratings, the act of soliciting frank, in-the-moment feedback may elicit critical comments on the instructor and his/her approach to teaching. However, it is important to balance the positive and negative comments and try to link negative commentary to issues of student learning. New users of classroom assessment techniques might find it helpful to discuss the critical comments with an experienced colleague.

Adapted from Core: York’s newsletter on university teaching (2000) Vol 9, No. 3.

* “Classroom Assessment” is a term used widely by scholars in higher education; it is meant to include all learning environments. For examples, see references on page 8.




The One-Minute Paper, or a brief reflection, is a technique that is used to provide instructors with feedback on what students are learning in a particular class. It may be introduced in small seminars or in large lectures, in first year courses or upper year courses, or electronically using software that ensures student anonymity. The One-Minute Paper asks students to respond anonymously to the following questions:

One-Minute Paper
1. What is the most important thing you learned today?
2. What question remains uppermost in your mind?

Depending upon the structure and format of the learning environment, the One-Minute Paper may be used in a variety of ways:
• During a lecture, to break up the period into smaller segments enabling students to reflect on the material just covered.
• At the end of a class, to inform your planning for the next session.
• In a course comprising lectures and tutorials, the information gleaned can be passed along to tutorial leaders giving them advance notice of issues that they may wish to explore with students.

An adaptation of the One-Minute Paper, the Muddiest Point is particularly useful in gauging how well students understand the course material. The Muddiest Point asks students:

What was the ‘muddiest point’ for you today?

Like the One-Minute Paper, use of the Muddiest Point can helpfully inform your planning for the next session, and signal issues that it may be useful to explore.

One Sentence Summaries can be used to find out how concisely, completely and creatively students can summarize a given topic within the grammatical constraints of a single sentence. It is also effective for helping students break down material into smaller units that are more easily recalled. This strategy is most effective for any material that can be represented in declarative form – historical events, story lines, chemical reactions and mechanical processes.

The One Sentence Summary technique involves asking students to consider the topic you are discussing in terms of Who Does/Did What to Whom, How, When, Where and Why, and then to synthesize those answers into a single informative, grammatical sentence. These sentences can then be analyzed to determine strengths and weaknesses in the students’ understanding of the topic, or to pinpoint specific elements of the topic that require further elaboration. Before using this strategy it is important to make sure the topic can be summarized coherently. It is best to impose the technique on oneself first to determine its appropriateness or feasibility for given material.

CRITICAL INCIDENT QUESTIONNAIRES
The Critical Incident Questionnaire is a simple assessment technique that can be used to find out what and how students are learning, and to identify areas where adjustments are necessary (e.g., the pace of the course, confusion with respect to assignments or expectations). On a single sheet of paper, students are asked five questions which focus on critical moments for learning in a course. The questionnaire is handed out about ten minutes before the end of the final session of the week.

Critical Incident Questionnaire
1. At what moment this week were you most engaged as a learner?
2. At what moment this week were you most distanced as a learner?
3. What action or contribution taken this week by anyone in the course did you find most affirming or helpful?
4. What action or contribution taken this week by anyone in the course did you find most puzzling or confusing?
5. What surprised you most about the course this week?

Critical Incident Questionnaires provide substantive feedback on student engagement and may also reveal power dynamics in the classroom that may not initially be evident to the instructor. For further information on Critical Incident Questionnaires see Brookfield, S. D. and Preskill, S. (1999) Discussion as a Way of Teaching: Tools and Techniques for a Democratic Classroom. (CA: Jossey Bass), page 49.

For further information on these and other classroom assessment strategies see:
Cross, K. P. and Angelo, T. A., Eds. (1988) Classroom Assessment Techniques: A Handbook for Faculty (MI: National Center for Research to Improve Post-Secondary Teaching and Learning).


UNIVERSIDAD MARIANO GALVEZ DE GUATEMALA
FACULTAD DE HUMANIDADES
ESCUELA DE IDIOMAS
LICDA. EVELYN R. QUIROA

ASSESSMENT AND EVALUATION VOCABULARY (DIAGNOSTIC)
1. Action Research
2. Affective Outcomes
3. Annual Report
4. Assessment
5. Assessment Cycle
6. Assessment Tool
7. Assessment Literacy
8. Authentic Assessment
9. Benchmark
10. Cohort
11. Course-embedded assessment
12. Course-level assessment
13. Course mapping
14. Criterion Referenced Tests
15. Curriculum Map
16. Diagnostic Evaluation
17. Direct Assessment
18. Educational Goals
19. Formative assessment
20. General Education Assessment
21. Holistic Scoring
22. Learning outcomes
23. Measurable Criteria
24. Metacognition
25. Norm
26. Portfolio
27. Primary Trait Method
28. Process
29. Program assessment
30. Reliability
31. Rubric
32. Self-efficacy
33. Senior Project
34. Summative assessment
35. Validity



COURSE DESCRIPTION
This course is dedicated to the study of the principal theories that underlie evaluation and assessment in the classroom. Students will critically analyze and put into practice the different perspectives, techniques and styles related to performance-based assessment and to summative and formative feedback methods for assessing and evaluating student learning in the classroom.

COURSE GOAL
By the end of the course, students will be able to plan and create assessments and evaluations that provide their students with activities closely related to learning objectives and/or competences.

LEARNING OUTCOMES
Upon completion of the course, students will be able to:
1. Demonstrate development and use of academic standards across the curriculum and application of standards and objectives in classroom assessment and evaluation.
2. Match assessment to learning outcomes, develop rubric criteria and select appropriate assessment and evaluation choices using the tools provided by the course.
3. Apply current research tools to create authentic assessment, discourse analysis, self and peer evaluation, rubrics, surveys, tests and mini-quizzes for self-paced tutorials.
4. Evaluate and utilize appropriate tools such as grade books, calendars, spreadsheets and portfolios.

GENERAL AND SPECIFIC EXPECTATIONS OF THE COURSE
Student Assessment and Evaluation

General Expectation 1: to communicate an overview of evaluation frameworks and processes.
Specific Expectations:
1. Identify the following: a) the purposes of evaluation, b) key terms relative to evaluation, c) types of evaluation, d) links between planning and evaluation
2. Develop student assessment and practice within a philosophical framework
3. Understand equity issues in evaluation and assessment.

General Expectation 2: to understand the purposes of various types of evaluation strategies.
Specific Expectations:
1. Differentiate between diagnostic, formative, and summative evaluation
2. Compare the purpose and function of different information sources for evaluation
3. Identify a variety of evaluation and assessment procedures, their purposes, strengths, and weaknesses
4. Discriminate between traditional and authentic assessment and appropriate application in teaching/learning
5. Incorporate appropriate assessment and evaluation strategies into your teaching practice.

General Expectation 3: to place evaluation strategies in the context of a unit of study.
Specific Expectations:
1. Design student assessment instruments (including rubrics) for a unit of study
2. Accommodate the needs of exceptional students within the unit and its evaluation component.
3. Enhance research in teaching to improve their own practice.
4. Be capable of doing self assessment.
5. Share the knowledge acquired to benefit the school community to which they belong.

EXPECTATIONS:
Students are expected to attend all classes. Class attendance will be a part of the final evaluation.
Students are expected to arrive for class on time. Any student who arrives late will not be given additional time to complete quizzes, exams, or in-class assignments.
Students are expected to submit all assignments on time. Late submissions will be penalized or not be accepted depending on the particular case.
Students are expected to come to class having read and completed all assignments.
Students are expected to participate in class discussions.
Students are expected to complete all quizzes and examinations in class on the date specified by the teacher.
Students are expected to word process assignments as required; handwritten work will not be accepted unless it is a test blueprint.

CONTENTS: EXAM DATE




CONTENT
The difference between evaluation and assessment
Types of evaluation (Diagnostic, Formative & Summative)
Establishing High-Quality (Validity, Reliability etc.)
Becoming aware of content, context and learners
Curriculum and Evaluation
Visualizing your actions: planning and testing
Objectives vs. Competences
Bloom’s Taxonomy
Designing a blueprint
Test type items
Test item type instructions
Organizing test type items according to competencies and domain levels
Analyzing test
Creating different core content tests
Assessment strategies
Self Improvement through self assessment
Self assessment tools: rubrics, checklists, portfolios etc.
Differentiated learning
Declarative and procedural knowledge based assessment
Reflective Teaching and Learning
Administering and interpreting standardized tests

NOTE: Additional content may be added to list.

MEANS TO ACHIEVE OUR GOALS:
1. Summary on subject matter must be turned in weekly. (Except when having test)
2. Teacher and student exchange of knowledge and experiences.
3. Group discussions.
4. Students must read the material in advance.
5. Individual research and enrichment.
6. Multimedia presentations.
7. Teaching Project Portfolio
8. Exams

EVALUATION:
Attendance: 80% to apply for final term
TOTAL ZONE (QUIZZES, CLASS ACTIVITIES, PRESENTATIONS) ........ 10 PTS
TWO MIDTERMS ........ 40 PTS
PORTFOLIO ........ 20 PTS
FINAL EXAM ........ 30 PTS
TOTAL ........ 100 PTS

REFERENCES:

1. LANGUAGE PROGRAM EVALUATION. Brian K. Lynch. Cambridge University applied linguistics.
2. REFLECTIVE PLANNING, TEACHING AND EVALUATION. Judy W. Eby, Adrienne L. Herrell & Jim Hicks. 3rd Edition. Merrill-Prentice Hall. London 2002.
3. PLANNING LESSONS AND COURSES. Tessa Woodward. Cambridge University Press. Cambridge 2001.
4. CLASSROOM ASSESSMENT, PRINCIPLES AND PRACTICES FOR EFFECTIVE INSTRUCTION. James H. McMillan. McMillan Press. Virginia 2001.
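The point scheme listed under EVALUATION can be worked through as simple arithmetic. The following is a minimal sketch only: the component weights and the 80% attendance threshold come from the syllabus, while the function names, the dictionary keys, and the idea of entering each component as a percentage from 0 to 100 are assumptions made for illustration.

```python
# Hypothetical sketch: weights mirror the EVALUATION list in this syllabus;
# names and the 0-100 percentage scale are illustrative assumptions.
WEIGHTS = {"zone": 10, "midterms": 40, "portfolio": 20, "final_exam": 30}


def course_total(percent_scores, weights=WEIGHTS):
    """Convert per-component percentage scores (0-100) into course points."""
    if set(percent_scores) != set(weights):
        raise ValueError("scores must cover exactly the weighted components")
    for name, pct in percent_scores.items():
        if not 0 <= pct <= 100:
            raise ValueError(f"{name} score must be between 0 and 100")
    # Each component contributes its weight scaled by the percentage earned.
    return sum(weights[name] * percent_scores[name] / 100 for name in weights)


def may_sit_final(classes_attended, classes_held, required_percent=80):
    """Check the 80% attendance requirement using integer arithmetic."""
    if classes_held <= 0:
        raise ValueError("classes_held must be positive")
    return classes_attended * 100 >= required_percent * classes_held
```

For example, a student with 80% in the zone, 75% across the midterms, 90% on the portfolio and 70% on the final exam would earn 8 + 30 + 18 + 21 = 77 points under this reading of the scheme.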

CLASS REQUIREMENTS AND GUIDELINES

Submitting Assignments:
All assignments either have or will have an identified “due date”. Extensions beyond the designated due date are not granted except in the most extenuating of circumstances. With the exception of an immediate and pressing “emergency”, all requests for an extension will be written, signed, dated, and delivered in person to me, as your Professor, before the specified due date and in time for me to respond to your request in writing.
All assignments are to include a title page that clearly identifies the assignment topic/title, course name and number, the date submitted, the teacher’s name, and the student’s name and I.D. number.
All assignments are to be given, in person, directly to the teacher. I will take no responsibility for assignments that are given to other students or given to the personnel in the “Escuela de Idiomas” office. While I have not yet lost any student assignment, there is always a first time! Therefore, you would be well advised to back up your assignment electronically and, if feasible, in hard copy.
An assignment will be considered late if it is not directly handed to me, as your Professor, by the end of class on the specified “due date”. Late assignments will be penalized 5% for each day or part thereof following the specified “due date” [including Saturday(s) and Sunday(s)].

Attendance and Participation:
Attendance will be taken at the beginning of each class period. Attendance in each class is mandatory; however, there is a proviso in the University regulations that students are permitted to miss the equivalent of 3 classroom contact hours. Beyond this limit, the student will be issued a warning that any more absences may result in being excluded from writing the final examination. Regular attendance, being prepared, and constructively participating in classroom activities, are all seen as integral components in the growth and development of becoming a professional teacher and in the establishment of a meaningful community of learnership in our class.

Tardiness
Tardiness can be extremely disruptive and disrespectful to members who strive to be on time. Naturally, we all encounter circumstances that occasionally cause us to be late, but habitual tardiness is not acceptable. If you are late for class, no material will be repeated. Therefore, you need to contact your classmates to be filled in on the material covered. If you arrive after attendance has been taken and you have no excuse, you will be marked as absent.

Class Policy on Cell Phones
Cell phones must be turned off at all times. If you are expecting an emergency call make sure to talk to me before class.

Class Policy on Laptop Computers
You may bring your laptop to class, but all work done on laptop computers must be related to the class work of that day.

Academic Dishonesty
Academic honesty is fundamental to the activities and principles of the University, and more broadly to society at large. All members of the academic community must be confident that each person’s work has been responsibly and honorably acquired, developed, and presented.

References
Use the A.P.A format 5th Edition.
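The late-assignment rule under Submitting Assignments (5% for each day or part thereof, weekends included) can be illustrated with a short calculation. This is a sketch under stated assumptions, not the official grading procedure: it assumes that a "day" is counted as a 24-hour block starting at the deadline, and that the penalty is capped at 100%; neither detail is specified in the guidelines, and the function name is hypothetical.

```python
import math
from datetime import datetime, timedelta

PENALTY_PER_DAY = 5  # percent per day or part thereof, as stated in the syllabus


def late_penalty_percent(deadline, handed_in):
    """Return the late penalty in percent for an assignment.

    Assumptions (not stated in the syllabus): days are counted as 24-hour
    blocks from the deadline, and the penalty is capped at 100 percent.
    Saturdays and Sundays count, because elapsed time is never skipped.
    """
    if handed_in <= deadline:
        return 0
    # "Each day or part thereof": round any started day up to a full day.
    days_late = math.ceil((handed_in - deadline) / timedelta(days=1))
    return min(100, PENALTY_PER_DAY * days_late)
```

Under these assumptions, an assignment handed in one minute after the deadline already counts as one day late (5%), and one handed in exactly two days after the deadline incurs 10%.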


What is assessment and evaluation?
Assessment is defined as data-gathering strategies, analyses, and reporting processes that provide information that can be used to determine whether or not intended outcomes are being achieved. Evaluation uses assessment information to support decisions on maintaining, changing, or discarding instructional or programmatic practices. These strategies can inform:
· The nature and extent of learning,
· Curricular decision making,
· Correspondence between learning and the aims and objectives of teaching, and
· The relationship between learning and the environments in which learning takes place.
Evaluation is the culminating act of interpreting the information gathered for the purpose of making decisions or judgments about students' learning and needs, often at reporting time.
Assessment and evaluation are integral components of the teaching-learning cycle. The main purposes are to guide and improve learning and instruction. Effectively planned assessment and evaluation can promote learning, build confidence, and develop students' understanding of themselves as learners. Assessment data assists the teacher in planning and adapting for further instruction. As well, teachers can enhance students' understanding of their own progress by involving them in gathering their own data, and by sharing teacher-gathered data with them. Such participation makes it possible for students to identify personal learning goals.

Types of Assessment and Evaluation
There are three types of assessment and evaluation that occur regularly throughout the school year: diagnostic, formative, and summative.
Diagnostic assessment and evaluation usually occur at the beginning of the school year and before each unit of study. The purposes are to determine students' knowledge and skills, their learning needs, and their motivational and interest levels.
By examining the results of diagnostic assessment, teachers can determine where to begin instruction and what concepts or skills to emphasize. Diagnostic assessment provides information essential to teachers in selecting relevant learning objectives and in designing appropriate learning experiences for all students, individually and as group members. Keeping diagnostic instruments for comparison and further reference enables teachers and students to determine progress and future direction. Diagnostic assessment tools such as the Writing Strategies Questionnaire and the Reading Interest/Attitude Inventory in this guide can provide support for instructional decisions.

Formative assessment and evaluation focus on the processes and products of learning. Formative assessment is continuous and is meant to inform the student, the parent/guardian, and the teacher of the student's progress toward the curriculum objectives. This type of assessment and evaluation provides information upon which instructional decisions and adaptations can be made and provides students with directions for future learning. Involvement in constructing their own assessment instruments or in adapting ones the teacher has made allows students to focus on what they are trying to achieve, develops their thinking skills, and helps them to become reflective learners. As well, peer assessment is a useful formative evaluation technique. For peer assessment to be successful, students must be provided with assistance and the opportunity to observe a model peer assessment session. Through peer assessment students have the opportunity to become critical and creative thinkers who can clearly communicate ideas and thoughts to others. Instruments such as checklists or learning logs, and interviews or conferences provide useful data.

Summative assessment and evaluation occur most often at the end of a unit of instruction and at term or year end when students are ready to demonstrate achievement of curriculum objectives. The main purposes are to determine knowledge, skills, abilities, and attitudes that have developed over a given period of time; to summarize student progress; and to report this progress to students, parents/guardians, and teachers. Summative judgements are based upon criteria derived from curriculum objectives. By sharing these objectives with the students and involving them in designing the evaluation instruments, teachers enable students to understand and internalize the criteria by which their progress will be determined.

Often assessment and evaluation results provide both formative and summative information.
For example, summative evaluation can be used formatively to make decisions about changes to instructional strategies, curriculum topics, or learning environment. Similarly, formative evaluation assists teachers in making summative judgements about student progress and determining where further instruction is necessary for individuals or groups. The suggested assessment techniques included in various sections of this guide may be used for each type of evaluation.

TEST TYPES

True/False

Good for:
· Knowledge level content
· Evaluating student understanding of popular misconceptions
· Concepts with two logical responses

Advantages:
· Can test large amounts of content
· Students can answer 3-4 questions per minute

Disadvantages:
· They are easy
· It is difficult to discriminate between students who know the material and students who don't
· Students have a 50-50 chance of getting the right answer by guessing
· Need a large number of items for high reliability

Tips for Writing Good True/False items:
· Avoid double negatives.
· Avoid long/complex sentences.
· Use specific determiners with caution: never, only, all, none, always, could, might, can, may, sometimes, generally, some, few.
· Use only one central idea in each item.
· Don't emphasize the trivial.
· Use exact quantitative language.
· Don't lift items straight from the book.
· Make more false than true (60/40). (Students are more likely to answer true.)
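The 50-50 guessing chance and the need for many items can be made concrete with a small calculation. The sketch below is only an illustration: it assumes every item is answered by blind guessing, independently of the others, and the function name prob_at_least is hypothetical, not something from these notes.

```python
from math import comb


def prob_at_least(n_items, min_correct, p_guess):
    """Probability of getting at least min_correct of n_items right
    when every item is guessed with success probability p_guess
    (a binomial model; assumes independent, blind guesses)."""
    return sum(
        comb(n_items, k) * p_guess**k * (1 - p_guess) ** (n_items - k)
        for k in range(min_correct, n_items + 1)
    )


# Chance of scoring 7 or more out of 10 by guessing alone.
true_false = prob_at_least(10, 7, 0.5)    # 0.171875, about 17%
four_option = prob_at_least(10, 7, 0.25)  # roughly 0.0035, well under 1%

# The same 70% cut-off on a longer true/false test is much harder to guess.
longer_test = prob_at_least(50, 35, 0.5)
```

Under these assumptions, a ten-item true/false quiz can be passed by luck fairly often, which is one reason a large number of items is recommended for reliability.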

Matching

Good for:
· Knowledge level
· Some comprehension level, if appropriately constructed

Types:
· Terms with definitions
· Phrases with other phrases
· Causes with effects
· Parts with larger units
· Problems with solutions

Advantages:
· Maximum coverage at knowledge level in a minimum amount of space/prep time
· Valuable in content areas that have a lot of facts

Disadvantages:
· Time consuming for students
· Not good for higher levels of learning

Tips for Writing Good Matching items:
· Need 15 items or fewer.
· Give good directions on basis for matching.
· Use items in response column more than once (reduces the effects of guessing).
· Use homogeneous material in each exercise.
· Make all responses plausible.
· Put all items on a single page.
· Put responses in some logical order (chronological, alphabetical, etc.).
· Responses should be short.

Multiple Choice

Good for:
· Application, synthesis, analysis, and evaluation levels

Types:
· Question/Right answer
· Incomplete statement
· Best answer

Advantages:
· Very effective
· Versatile at all levels
· Minimum of writing for student
· Guessing reduced
· Can cover broad range of content

Disadvantages:
· Difficult to construct good test items.
· Difficult to come up with plausible distractors/alternative responses.

Tips for Writing Good Multiple Choice items:
· Stem should present a single, clearly formulated problem.
· Stem should be in simple, easily understood language; delete extraneous words.
· Avoid "all of the above"--students can answer based on partial knowledge (if one is incorrect or two are correct, but unsure of the third...).
· Avoid "none of the above."
· Make all distractors plausible/homogeneous.
· Don't overlap response alternatives (decreases discrimination between students who know the material and those who don't).
· Don't use double negatives.
· Present alternatives in logical or numerical order.
· Place the correct answer at random (answer A tends to be used most often).
· Make each item independent of others on the test.
· Way to judge a good stem: students who know the content should be able to answer before reading the alternatives.
· List alternatives on separate lines, indent, separate by blank line, use letters vs. numbers for alternative answers.
· Need more than 3 alternatives; 4 is best.

Short Answer

Good for:
· Application, synthesis, analysis, and evaluation levels

Advantages:
· Easy to construct
· Good for "who," "what," "where," "when" content
· Minimizes guessing
· Encourages more intensive study: the student must know the answer rather than recognize it.

Disadvantages:
· May overemphasize memorization of facts
· Take care - questions may have more than one correct answer
· Scoring is laborious

Tips for Writing Good Short Answer Items:
· When using with definitions, supply the term, not the definition, for a better judge of student knowledge.
· For numbers, indicate the degree of precision/units expected.
· Use direct questions, not an incomplete statement.
· If you do use incomplete statements, don't use more than 2 blanks within an item.
· Arrange blanks to make scoring easy.
· Try to phrase the question so there is only one answer possible.

Essay

Good for:
· Application, synthesis and evaluation levels

Types:
· Extended response: synthesis and evaluation levels; a lot of freedom in answers
· Restricted response: more consistent scoring, outlines parameters of responses

Advantages:
· Students less likely to guess
· Easy to construct
· Stimulates more study
· Allows students to demonstrate ability to organize knowledge, express opinions, show originality.

Disadvantages:
· Can limit amount of material tested, therefore has decreased validity.
· Subjective, potentially unreliable scoring.
· Time consuming to score.

Tips for Writing Good Essay Items:
· Provide reasonable time limits for thinking and writing.
· Avoid letting students choose which questions to answer. (You won't get a good idea of the breadth of student achievement when they answer only a subset of the questions.)
· Give a definitive task to the student: compare, analyze, evaluate, etc.
· Use a checklist point system to score with a model answer: write an outline and determine how many points to assign to each part.
· Score one question at a time, across all papers in the same session.

Oral Exams

Good for:
· Knowledge, synthesis, evaluation levels

Advantages:
· Useful as an instructional tool: allows students to learn at the same time as testing.
· Allows teacher to give clues to facilitate learning.
· Useful to test speech and foreign language competencies.

Disadvantages:
· Time consuming to give and take.
· Could have poor student performance because they haven't had much practice with it.
· Provides no written record without checklists.

Student Portfolios

Good for:
· Knowledge, application, synthesis, evaluation levels

Advantages:
· Can assess compatible skills: writing, documentation, critical thinking, problem solving
· Can allow student to present totality of learning.
· Students become active participants in the evaluation process.

Disadvantages:
· Can be difficult and time consuming to grade.

Performance

Good for:
· Application of knowledge, skills, abilities

Advantages:
· Measures some skills and abilities not possible to measure in other ways

Disadvantages:
· Cannot be used in some fields of study
· Difficult to construct
· Difficult to grade
· Time-consuming to give and take














CLASS LOG Date: 11/03/2011 Topic: Type of tests

ASSESSMENT PERFORMANCE
An evaluation will have a score, while an assessment is an activity that we can do without our students realizing that they are being evaluated. Rubrics can be used for this.

TRUE AND FALSE
If any one part of the sentence is false, the whole sentence is false, despite many other true statements.

MATCHING TEST
Make two columns, labeling the items in the first column with numbers only and the items in the second with letters.

ESSAY QUESTIONS
Writing an effective essay examination requires two important abilities: recalling information and organizing the information in order to draw relevant conclusions from it. While this process sounds simple, writing an effective essay examination under pressure in limited time can be a daunting task.

COMPLETION TEST
This kind of test helps us find out whether the students know the correct answer about a topic.

MULTIPLE CHOICE TEST
This kind of test makes our students analyze and choose the correct answer, showing whether they understand the concept behind each item.


CLASS LOG Date: 09/10/2011 Topic: Blue Print

Blue Print

It is important to know that our evaluations have to be prepared in a detailed way. You should not just open your books and start writing your tests. First, you must make a blue print to organize the score and the series of the test. In this way your exams will be effective, because you work from a rubric using the scores that you set in the blue print. It will also help our students know which skills will be tested in your exams.
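As a rough illustration of how a blue print can organize the series and the score, the sketch below records one row per series and checks the total. The layout and names (blueprint, total_points) are hypothetical. The values mirror the five-series sample exam in this portfolio (5 points per series), and the essay series is treated as five one-point lines, which is an assumption.

```python
# One row per series of the exam: item type, number of items, points per item.
blueprint = [
    {"series": "I", "item_type": "True/False", "items": 5, "points_each": 1},
    {"series": "II", "item_type": "Matching", "items": 5, "points_each": 1},
    {"series": "III", "item_type": "Essay", "items": 5, "points_each": 1},
    {"series": "IV", "item_type": "Completion", "items": 5, "points_each": 1},
    {"series": "V", "item_type": "Multiple choice", "items": 5, "points_each": 1},
]


def total_points(rows):
    """Sum the points of every series in the blue print."""
    return sum(row["items"] * row["points_each"] for row in rows)


def points_by_type(rows):
    """Map each item type to the points it contributes."""
    return {row["item_type"]: row["items"] * row["points_each"] for row in rows}


exam_total = total_points(blueprint)
```

Checking the total before writing the items helps keep the printed score on each series consistent with the plan.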


Colegio Integral "San Pablo" Básicos - Diversificado

Name: __________________________________ Date: __________________ Score: __________

I SERIES (5 pts., 1 pt. each) Directions: Write T if the item is true or F if it is false. Remember to justify false items.

1. Invertebrates are animals without a backbone.

______ __________________________________________________________________

2. Fish and birds are invertebrates.

______ __________________________________________________________________

3. Echinoderms and worms are vertebrates.

______ __________________________________________________________________

4. Invertebrates are multicellular organisms and mostly form a colony of individual cells that function as one.

______ __________________________________________________________________

5. The vertebrates form the most advanced organisms on the planet.

______ __________________________________________________________________

II SERIES (5 pts., 1 pt. each) Directions: Match each number with the correct letter.

1. can be contrasted to weather, which is the present condition of these elements and their variations over shorter periods.
2. High-level clouds form above 20,000 feet (6,000 meters) and since the temperatures are so cold at such high elevations, these clouds are primarily composed of ice crystals.
3. The bases of mid-level clouds typically appear between 6,500 and 20,000 feet (2,000 to 6,000 meters). Because of their lower altitudes, they are composed primarily of water droplets; however, they can also be composed of ice crystals when temperatures are cold enough.
4. Low clouds are mostly composed of water droplets since their bases generally lie below 6,500 feet (2,000 meters). However, when temperatures are cold enough, these clouds may also contain ice particles and snow.
5. Probably the most familiar of the classified clouds is the cumulus cloud. Generated most commonly through either thermal convection or frontal lifting, these clouds can grow to heights in excess of 39,000 feet (12,000 meters), releasing incredible amounts of energy through the condensation of water vapor within the cloud itself.

a. Vertically developed clouds

b. Mid level clouds

c. Weather

d. Low level clouds

e. High level clouds

III SERIES (5 pts., 1 pt. each) Directions: On the following lines, write a five-line paragraph explaining the difference between vertebrates and invertebrates.

IV SERIES (5 pts., 1 pt. each) Directions: Complete each sentence by writing the correct animal classification.

1. If an animal drinks milk when it is a baby and has hair on its body, it belongs to _________.

2. ________ are animals that have feathers and that are born out of hard-shelled eggs.

3. ____________ are vertebrates that live in water and have gills, scales and fins on their body.

4. ____________________ are a class of animal with scaly skin.

5. ______________________ are born in the water.

V SERIES (5 pts., 1 pt. each) Directions: Choose the correct part of the plant from the following options.

1. Takes in water and food (mineral salts) from the soil. Anchors the plant.
roots stem leaves flowers

2. Transports water through the plant.
roots stem

3. Almost always green but sometimes covered with another colour such as red.
roots stem leaves flowers

4. Produce seeds which form new plants.
roots stem leaves

5. Make food for the plant.
roots stem




• Multiple choice
• True or False
• Completion/Short Answers
• Matching
• Essay Questions
• Performance Assessment


• Multiple-choice items can be used to measure knowledge outcomes and various types of learning outcomes.
• They are most widely used for measuring knowledge, comprehension, and application outcomes.
• The multiple-choice item provides the most useful format for measuring achievement at various levels of learning.
• When selection-type items are to be used (multiple-choice, true-false, matching, check all that apply) an effective procedure is to start each item as a multiple-choice item and switch to another item type only when the learning outcome and content make it desirable to do so. For example, (1) when there are only two possible alternatives, a shift can be made to a true-false item; and (2) when there are a number of similar factors to be related, a shift can be made to a matching item.


• Learning outcomes from simple to complex can be measured.
• Highly structured and clear tasks are provided.
• A broad sample of achievement can be measured.
• Incorrect alternatives provide diagnostic information.
• Scores are less influenced by guessing than true-false items.
• Scores are more reliable than subjectively scored items (e.g., essays).
• Scoring is easy, objective, and reliable.
• Item analysis can reveal how difficult each item was and how well it discriminated between the stronger and weaker students in the class.
• Performance can be compared from class to class and year to year.
• Can cover a lot of material very efficiently (about one item per minute of testing time).
• Items can be written so that students must discriminate among options that vary in degree of correctness.
• Avoids the absolute judgments found in True-False tests.
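The item analysis mentioned in the list above can be sketched in a few lines. This is a minimal illustration, not a method prescribed by these notes: it assumes each answer is scored 0 or 1, measures difficulty as the proportion of students answering correctly, and uses a simple upper-half versus lower-half comparison for discrimination. The function name item_analysis and the sample data are hypothetical.

```python
def item_analysis(responses):
    """Return (difficulty, discrimination) for each item.

    responses: one list per student, holding 0/1 scores per item.
    difficulty: proportion of all students who answered correctly.
    discrimination: proportion correct in the top-scoring half minus
    the proportion correct in the bottom-scoring half.
    """
    n_students = len(responses)
    n_items = len(responses[0])
    # Rank students by total score, highest first.
    ranked = sorted(responses, key=sum, reverse=True)
    half = n_students // 2
    upper, lower = ranked[:half], ranked[-half:]
    results = []
    for i in range(n_items):
        difficulty = sum(r[i] for r in responses) / n_students
        discrimination = (
            sum(r[i] for r in upper) - sum(r[i] for r in lower)
        ) / half
        results.append((difficulty, discrimination))
    return results


# Four students, three items (made-up scores for illustration).
scores = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [1, 0, 0],
]
analysis = item_analysis(scores)
# Item 1: everyone correct, so it does not separate strong from weak students.
# Item 2: answered only by the top half, so it discriminates strongly.
```

An item with discrimination near zero, like the first one here, tells the teacher little about who knows the material, even though it is not wrong in itself.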


• Constructing good items is time consuming.
• It is frequently difficult to find plausible distracters.
• This item is ineffective for measuring some types of problem solving and the ability to organize and express ideas.
• Real-world problem solving differs – a different process is involved in proposing a solution versus selecting a solution from a set of alternatives.
• Scores can be influenced by reading ability.
• There is a lack of feedback on individual thought processes – it is difficult to determine why individual students selected incorrect responses.
• Students can sometimes read more into the question than was intended.
• Often focuses on testing factual information and fails to test higher levels of cognitive thinking.
• Sometimes there is more than one defensible "correct" answer.
• They place a high degree of dependence on the student's reading ability and the instructor's writing ability.
• Does not provide a measure of writing ability.
• May encourage guessing.


• Base each item on an educational or instructional objective of the course, not trivial information.
• Try to write items in which there is one and only one correct or clearly best answer.
• The phrase that introduces the item (stem) should clearly state the problem.
• Test only a single idea in each item.
• Be sure wrong answer choices (distracters) are at least plausible.
• Incorporate common errors of students in distracters.
• The position of the correct answer should vary randomly from item to item.
• Include from three to five options for each item.
• Avoid overlapping alternatives (see Example 3 following).
• The length of the response options should be about the same within each item (preferably short).
• There should be no grammatical clues to the correct answer.
• Format the items vertically, not horizontally (i.e., list the choices vertically).
• The response options should be indented and in column form.
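One way to vary the position of the correct answer, as the list above recommends, is to build the answer key before writing the items. The sketch below is only an illustration under stated assumptions: the function name balanced_key is hypothetical, four options labeled A to D are assumed, and each letter is used nearly the same number of times before the order is shuffled.

```python
import random
from collections import Counter


def balanced_key(n_items, options="ABCD", seed=None):
    """Return a list of correct-answer letters, one per item.

    Each letter appears either floor(n/k) or floor(n/k) + 1 times,
    and the order is shuffled so the position varies from item to item.
    """
    rng = random.Random(seed)
    letters = list(options)
    # Full rounds: every letter the same number of times.
    key = letters * (n_items // len(letters))
    # Remainder: distinct letters picked at random.
    key += rng.sample(letters, n_items % len(letters))
    rng.shuffle(key)
    return key


key = balanced_key(10, seed=1)
counts = Counter(key)
```

With ten items and four options, two letters are used three times and two are used twice, so no position is noticeably favored.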

Helpful Hints

• Word the stem positively; avoid negative phrasing such as "not" or "except." If this cannot be avoided, the negative words should always be highlighted by underlining or capitalization: Which of the following is NOT an example ……
• Avoid excessive use of negatives and/or double negatives.
• Avoid the excessive use of "All of the above" and "None of the above" in the response alternatives.
• In the case of "All of the above", students only need to have partial information in order to answer the question. Students need to know that only two of the options are correct (in a four or more option question) to determine that "All of the above" is the correct answer choice. Conversely, students only need to eliminate one answer choice as implausible in order to eliminate "All of the above" as an answer choice.
• Similarly, with "None of the above", when used as the correct answer choice, information is gained about students' ability to detect incorrect answers. However, the item does not reveal if students know the correct answer to the question.

Multiple-choice questions typically have 3 parts: STEM, KEY & DISTRACTERS

Multiple-Choice Item Writing Guidelines

Procedural Rules:

• Use either the best answer or the correct answer format.
• Best answer format refers to a list of options that can all be correct in the sense that each has an advantage, but one of them is the best.
• Correct answer format refers to one and only one right answer.
• Format the items vertically, not horizontally (i.e., list the choices vertically).
• Allow time for editing and other types of item revisions.
• Use good grammar, punctuation, and spelling consistently.
• Minimize the time required to read each item.
• Avoid trick items.
• Use the active voice.
• The ideal question will be answered by 60-65% of the tested population.
• Have your questions peer-reviewed.
• Avoid giving unintended cues – such as making the correct answer longer in length than the distracters.

Content-related Rules:

• Base each item on an educational or instructional objective of the course, not trivial information.
• Test for important or significant information.
• Focus on a single problem or idea for each test item.
• Keep the vocabulary consistent with the examinees' level of understanding.
• Avoid cueing one item with another; keep items independent of one another.
• Use the author's examples as a basis for developing your items.
• Avoid overly specific knowledge when developing items.
• Avoid textbook, verbatim phrasing when developing the items.
• Avoid items based on opinions.
• Use multiple-choice to measure higher level thinking.
• Be sensitive to cultural and gender issues.
• Use case-based questions that use a common text to which a set of questions refers.

Stem Construction Rules:

• State the stem in either question form or completion form.
• When using a completion form, don't leave a blank for completion in the beginning or middle of the stem.
• Ensure that the directions in the stem are clear, and that wording lets the examinee know exactly what is being asked.
• Avoid window dressing (excessive verbiage) in the stem.
• Word the stem positively; avoid negative phrasing such as "not" or "except." If this cannot be avoided, the negative words should always be highlighted by underlining or capitalization: Which of the following is NOT an example ……
• Include the central idea and most of the phrasing in the stem.
• Avoid giving clues such as linking the stem to the answer (e.g., with a stem ending in "… is an example of an:", test-wise students will know the correct answer should start with a vowel).

General Option Development Rules:

• Place options in logical or numerical order.
• Use letters in front of options rather than numbers; numerical answers in numbered items may be confusing to students.
• Keep options independent; options should not be overlapping.
• Keep all options homogeneous in content.
• Keep the length of options fairly consistent.
• Avoid, or use sparingly, the phrase all of the above.
• Avoid, or use sparingly, the phrase none of the above.
• Avoid the use of the phrase I don't know.
• Phrase options positively, not negatively.
• Avoid distracters that can clue test-wise examinees; for example, absurd options, formal prompts, or semantic (overly specific or overly general) clues.
• Avoid giving clues through the use of faulty grammatical construction.
• Avoid specific determiners, such as never and always.
• Position the correct option so that it appears about the same number of times in each possible position for a set of items.
• Make sure that there is one and only one correct option.

Distracter (incorrect options) Development Rules:

• Use plausible distracters.
• Incorporate common errors of students in distracters.
• Avoid technically phrased distracters.
• Use familiar yet incorrect phrases as distracters.
• Use true statements that do not correctly answer the item.
• Avoid the use of humor when developing options.
• Distracters that are not chosen by any examinees should be replaced.

Suggestions for Writing Good Multiple Choice Items:

• Present practical or real-world situations to the students.
• Present the student with a diagram of equipment and ask for application, analysis or evaluation.
• Present actual quotations taken from newspapers or other published sources and ask for the interpretation or evaluation of these quotations.
• Use pictorial materials that require students to apply principles and concepts.
• Use charts, tables or figures that require interpretation.

General Guidelines to Writing Test Items

• Begin writing items well ahead of the time when they will be used; allow time for revision.
• Match items to intended outcomes at the proper difficulty level to provide a valid measure of the instructional objectives.
• Be sure each item deals with an important aspect of the content area and not with trivia.
• Be sure that the problem posed is clear and unambiguous.
• Be sure that each item is independent of all other items (i.e., a hint to an answer should not be unintentionally embedded in another item).
• Be sure the item has one correct or best answer on which experts would agree.
• Prevent unintended clues to the answer in the statement or question (e.g., grammatical inconsistencies such as 'a' or 'an' give clues).
• Avoid duplication of the textbook in writing test items; don't lift quotes directly from any textual materials.
• Avoid trick or catch questions in an achievement test. (Don't waste time testing how well the student can interpret your intentions.)
• On a test with different question formats (e.g., multiple choice and True-False), one should group all items of similar format together.
• Questions should follow an easy to difficult progression.
• Space the items to eliminate overcrowding.
• Have diagrams and tables above the item using the information, not below.

Examples & Tips

Below are some strategies to reduce the cognitive load of your test items.

1. Keep the stem simple, only including relevant information.

Example: Change

[Stem]: The purchase of the Louisiana Territory, completed in 1803 and considered one of Thomas Jefferson's greatest accomplishments as president, primarily grew out of our need for
a. the port of New Orleans*
b. helping Haitians against Napoleon
c. the friendship of Great Britain
d. control over the Indians

To

[Stem]: The purchase of the Louisiana Territory primarily grew out of our need for
a. the port of New Orleans*
b. helping Haitians against Napoleon
c. the friendship of Great Britain
d. control over the Indians

*an asterisk indicates the correct answer.

Any additional information that is irrelevant to the question, such as the phrase "completed in 1803," can distract or confuse the student, thus providing an alternative explanation for why the item was missed. Keep it simple.

2. Keep the alternatives simple by adding any common words to the stem rather than including them in each alternative.

Example: Change

When your body adapts to your exercise load,
a. you should decrease the load slightly.
b. you should increase the load slightly.*
c. you should change the kind of exercise you are doing.
d. you should stop exercising.

To

When your body adapts to your exercise load, you should
a. decrease the load slightly.
b. increase the load slightly.*
c. change the kind of exercise you are doing.
d. stop exercising.

Instead of repeating the phrase "you should" at the beginning of each alternative, add that phrase to the end of the stem. The less reading the student has to do, the less chance there is for confusion.

3. Put alternatives in a logical order.

Example: Change

According to the 1991 census, approximately what percent of the United States population is of Spanish or Hispanic descent?
a. 25%
b. 39%
c. 2%
d. 9%*

To

a. 2%
b. 9%*
c. 25%
d. 39%

The more mental effort (or cognitive load) that students have to use to make sense of an item the more likely a comprehension error can occur that would provide another rival explanation. By placing the alternatives in a logical order the reader can focus on the content of the question rather than having to reorder the items mentally. Although such reordering might require a limited amount of cognitive load, such load is finite, and it does not take much additional processing to reach the point where concentration is negatively impacted. Thus, this guideline is consistently recommended (Haladyna, Downing, & Rodriguez, 2002).

4. Limit the use of negatives (e.g., NOT, EXCEPT).

Example: Change

Which of the following is NOT true of the Constitution?
a. The Constitution sets limits on how a government can operate
b. The Constitution is open to different interpretations
c. The Constitution has not been amended in 50 years*

To

Which of the following is true of the Constitution?
a. The Constitution has not been amended in 50 years
b. The Constitution sets limits on how a government can operate*
c. The Constitution permits only one possible interpretation

Once again, trying to determine which answer is NOT consistent with the stem requires more cognitive load from the students and promotes the likelihood of more confusion. If that additional load or confusion is unnecessary it should be avoided (Haladyna, Downing, & Rodriguez, 2002).

If you are going to use NOT or EXCEPT, the word should be highlighted in some manner so that students recognize a negative is being used.

5. Include the same number of alternatives for each item.

The more consistent and predictable a test is, the less cognitive load is required by the student to process it. Consequently, the student can focus on the questions themselves without distractions. Additionally, if students must transpose their answers onto a score sheet of some kind, there is less likelihood of error in the transposition if the number of alternatives for each item is always the same.

Reducing the Chance of Guessing Correctly

• It is easy to inadvertently include clues in your test items that point to the correct answer, help rule out incorrect alternatives or narrow the choices.
• Any such clue would decrease your ability to distinguish students who know the material from those who do not, thus providing rival explanations.

Keep the grammar consistent between stem and alternatives.

Example: Change

What is the dietary substance that is often associated with heart disease when found in high levels in the blood?
a. glucose
b. cholesterol*
c. beta carotene
d. proteins

To

a. glucose
b. cholesterol*
c. beta carotene
d. protein

Obviously, "proteins" is inconsistent with the stem since it is plural while the stem and the other alternatives are singular. However, it can be easy for the test writer to miss such inconsistencies. As a result, students may more easily guess the correct answer without understanding the concept - a rival explanation.

Avoid including an alternative that is significantly longer than the rest.

Example: Change

What is the best reason for listing information sources in your research assignment?
a. It is required
b. It is unfair and illegal to use someone's ideas without giving proper credit*
c. To get a better grade
d. To make it longer

To

a. It is required by most teachers
b. It is unfair and illegal to use someone's ideas without giving proper credit*
c. To get a better grade on the project
d. So the reader knows from where you got your information

Students often recognize that a significantly longer, more complex alternative is commonly the correct answer. Even if the longer alternative is not the correct answer, some students who might otherwise answer the question correctly could be misled by this common clue and select the wrong answer. So, to be safe and avoid a rival explanation, keep the alternatives similar in length.

Make all distracters plausible.

Example: Change

Lincoln was assassinated by
a. Lee Harvey Oswald
b. John Wilkes Booth*
c. Oswald Garrison Villard
d. Ozzie Osbourne

To

Lincoln was assassinated by
a. Lee Harvey Oswald
b. John Wilkes Booth*
c. Oswald Garrison Villard
d. Louis Guiteau

If students can easily discount one or more distractors (obviously Ozzie Osbourne does not belong) then the chance of guessing is increased, reducing the discriminability of that item. There is some limited evidence that including humor on a test can have certain benefits such as reducing the anxiety of the test-takers (Berk, 2000; McMorris, Boothroyd, & Pietrangelo, 1997). But humor can be included in a manner that does not reduce the discriminability of the item. For example, the nature of the question in the stem may be humorous but still address the material in a meaningful way.


Avoid giving too many clues in your alternatives.

Example: Change

"Yellow Journalism" is associated with what two publishers?
a. Adolph Ochs and Martha Graham
b. William Randolph Hearst and Joseph Pulitzer*
c. Col. Robert McCormick and Marshall Field III
d. Michael Royko and Walter Cronkite

To

a. Adolph Ochs and Martha Graham
b. William Randolph Hearst and Joseph Pulitzer*
c. Joseph Pulitzer and Adolph Ochs
d. Martha Graham and William Randolph Hearst

Since both of the publishers in choice "b" are associated with yellow journalism and none of the other people mentioned is, the student only has to know of one such publisher to identify that "b" is the correct answer. That makes the item easier than if just one name is listed for each alternative. To make the question more challenging, at least some of the distracters could mention one of the correct publishers but not the other as in the second example (e.g., in distracter "c" Pulitzer is correct but Ochs is not). As a result, the student must recognize both publishers associated with yellow journalism to be certain of the correct answer.


Evaluating our students is a complete action when it is done conscientiously. All students are different, and we have to deal with that and make it easier for them to study a new language. When we find the right structure for evaluation, our students will enjoy our tests and will not feel uncomfortable or nervous. Teachers need to be organized and have the right strategies to score each series of a test correctly and to vary the types of test.


Evaluation portfolio with general information