


Julio David Samayoa Cárcamo
Student ID: 076-08-14353
Guatemala, November 24th, 2012

Introduction: During the last semester of this year I took the Evaluation and Assessment course. This course is intended for teachers and provides them with the theory and tools needed to construct tests and test items adequately. Our teacher mentioned how tests are usually a “collage of exercises” that we take at random from books or websites. The truth is that tests are very important because they are tools to measure students’ knowledge, and a group of randomly chosen exercises won’t do what we need. The objective of the course is that by the end of it we will be capable of making tests appropriately: tests based on the concepts of reliability and validity, so that they can provide the desired, and more importantly, accurate information about our students.

Formative Evaluation: Formative evaluation is a bit more complex than summative evaluation. It is done with a small group of people to "test run" various aspects of instructional materials. For example, you might ask a friend to look over your web pages to see if they are graphically pleasing, if there are errors you've missed, or if they have navigational problems. It's like having someone look over your shoulder during the development phase to help you catch things that you would miss but a fresh set of eyes might not. At times, you might need this help from the target audience. For example, if you're designing learning materials for third graders, you should have a third grader as part of your formative evaluation. Here are some different authors' definitions of formative evaluation that will help you understand the difference.

Scriven (1991): Formative evaluation is typically conducted during the development or improvement of a program or product (or person, and so on). It is conducted, often more than once, for the in-house staff of the program with the intent to improve. The reports normally remain in-house; but serious formative evaluation may be done by an internal or an external evaluator or, preferably, a combination. Of course, many program staff are, in an informal sense, constantly doing formative evaluation.

Weston, McAlpine, and Bordonaro (1995): The purpose of formative evaluation is to validate or ensure that the goals of the instruction are being achieved and to improve the instruction, if necessary, by means of identification and subsequent remediation of problematic aspects.

Worthen, Sanders, and Fitzpatrick (1997): Formative evaluation is conducted to provide program staff evaluative information useful in improving the program.

Robert Stake: "When the cook tastes the soup, that’s formative; when the guests taste the soup, that’s summative."

Scriven (1996):
• "is research-oriented vs. action-oriented"
• "evaluations are intended - by the evaluator - as a basis for improvement"
• "the summative vs. formative distinction is context dependent"

Summative Evaluation: Summative evaluation provides information on the product's efficacy (its ability to do what it was designed to do). For example, did the learners learn what they were supposed to learn after using the instructional module? In a sense, it lets the learners know "how they did," but more importantly, by looking at how the learners did, it helps you know whether the product teaches what it is supposed to teach. Summative evaluation is typically quantitative, using numeric scores or letter grades to assess learner achievement.

So what is the difference between a Summative Evaluation and a Learner Assessment? Although both might look at the same data, a Learner Assessment generally looks at how an individual learner performed on a learning task. It assesses a student's learning -- hence the name Learner Assessment. For example, you might assess an entire class of students, but you assess them individually to see how each did. A Summative Evaluation, on the other hand, looks at more than one learner's performance to see how well a group did on a learning task that utilized specific learning materials and methods. By looking at the group, the instructional designer can evaluate the learning materials and learning process -- hence the name Summative Evaluation. For example, here you may find that, as a group, all of the students did well on Section A of some instructional materials, but didn't do so well on Section B. That would indicate that the designer should go back and look at the design or delivery of Section B.

Summary of How Evaluation, Assessment, Measurement and Testing Terms Are Related: Commonly used assessment and measurement terms are related, and understanding how they connect with one another can help you better integrate your testing and teaching.

Evaluation: Examining information about many components of the thing being evaluated (e.g., student work, schools, or a specific educational program) and comparing or judging its quality, worth or effectiveness in order to make decisions. Evaluation is based on assessment.

Assessment: The process of gathering, describing, or quantifying information about performance. Assessment includes measurement.

Measurement: The process of assigning numbers to qualities or characteristics of an object or person according to some rule or scale, and analyzing that data based on psychometric and statistical theory. A specific way to measure performance is testing.

Testing: A method used to measure the level of achievement or performance.

DOCUMENT 5

Abilities and Behaviors Related to Bloom’s Taxonomy of Educational Objectives

Knowledge - Recognizes students’ ability to use rote memorization and recall certain facts.
• Test questions focus on identification and recall of information
Comprehension - Involves students’ ability to read course content, extrapolate and interpret important information, and put others’ ideas into their own words.
• Test questions focus on use of facts, rules and principles
Application - Students take new concepts and apply them to another situation.
• Test questions focus on applying facts or principles
Analysis - Students have the ability to take new information and break it down into parts to differentiate between them.
• Test questions focus on separation of a whole into component parts
Synthesis - Students are able to take various pieces of information and form a whole, creating a pattern where one did not previously exist.
• Test questions focus on combining ideas to form a new whole
Evaluation - Involves students’ ability to look at someone else’s ideas or principles and see the worth of the work and the value of the conclusions.
• Test questions focus on developing opinions, judgments or decisions

Examples of Instructional Objectives for the Cognitive Domain
1. The student will recall the four major food groups without error. (Knowledge)
2. By the end of the semester, the student will summarize the main events of a story in grammatically correct English. (Comprehension)
3. Given a presidential speech, the student will be able to point out the positions that attack a political opponent personally rather than the opponent’s political programs. (Analysis)
4. Given a short story, the student will write a different but plausible ending. (Synthesis)
5. Given fractions not covered in class, the student will multiply them on paper with 85 percent accuracy. (Application)
6. Given a description of a country’s economic system, the student will defend it by basing arguments on principles of socialism. (Evaluation)
7. From memory, with 80 percent accuracy, the student will match each United States general with his most famous battle. (Knowledge)

8. The student will describe the interrelationships among acts in a play. (Analysis)

Test Blueprint: Once you know the learning objectives and the item types you want to include in your test, you should create a test blueprint. A test blueprint, also known as test specifications, consists of a matrix, or chart, representing the number of questions you want in your test within each topic and level of objective.

The blueprint identifies the objectives and skills that are to be tested and the relative weight given to each on the test. It can help you ensure that you are obtaining the desired coverage of topics and levels of objective. Once you create your test blueprint, you can begin writing your items, matching the level of objective within each topic area.
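As a concrete illustration, a blueprint can be represented as a small topic-by-level matrix. The sketch below uses hypothetical topics and question counts, not figures from any real course; the two helpers simply total the items and compute the relative weight the test gives to each level of objective.

```python
# A minimal test blueprint sketch: topics x objective levels -> item counts.
# All topics and counts here are hypothetical examples.

blueprint = {
    "Verb tenses":    {"knowledge": 4, "comprehension": 3, "application": 3},
    "Vocabulary":     {"knowledge": 5, "comprehension": 2, "application": 1},
    "Reading skills": {"knowledge": 1, "comprehension": 4, "application": 2},
}

def total_items(bp):
    """Total number of questions the blueprint calls for."""
    return sum(sum(levels.values()) for levels in bp.values())

def weight_by_level(bp):
    """Relative weight (fraction of the test) given to each objective level."""
    totals = {}
    for levels in bp.values():
        for level, n in levels.items():
            totals[level] = totals.get(level, 0) + n
    grand = sum(totals.values())
    return {level: n / grand for level, n in totals.items()}

print(total_items(blueprint))      # 25 items in total
print(weight_by_level(blueprint))  # e.g. knowledge carries 40% of the test
```

Checking the weights against your course objectives before writing items is exactly the coverage check the blueprint is meant to provide.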

Description of Multiple-Choice Items

Multiple-Choice Items: Multiple-choice items can be used to measure knowledge outcomes and various other types of learning outcomes. They are most widely used for measuring knowledge, comprehension, and application outcomes. The multiple-choice item provides the most useful format for measuring achievement at various levels of learning. When selection-type items are to be used (multiple-choice, true-false, matching, check all that apply), an effective procedure is to start each item as a multiple-choice item and switch to another item type only when the learning outcome and content make it desirable to do so. For example, (1) when there are only two possible alternatives, a shift can be made to a true-false item; and (2) when there are a number of similar factors to be related, a shift can be made to a matching item.


Advantages of Multiple-Choice Items:
1. Learning outcomes from simple to complex can be measured.
2. Highly structured and clear tasks are provided.
3. A broad sample of achievement can be measured.
4. Incorrect alternatives provide diagnostic information.
5. Scores are less influenced by guessing than with true-false items.
6. Scores are more reliable than subjectively scored items (e.g., essays).
7. Scoring is easy, objective, and reliable.
8. Item analysis can reveal how difficult each item was and how well it discriminated between the stronger and weaker students in the class.
9. Performance can be compared from class to class and year to year.
10. A lot of material can be covered very efficiently (about one item per minute of testing time).
11. Items can be written so that students must discriminate among options that vary in degree of correctness.
12. The absolute judgments found in true-false tests are avoided.
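The item analysis mentioned above is easy to compute from a scored answer sheet. The sketch below uses the classic difficulty index (proportion of students answering correctly) and an upper-minus-lower discrimination index; the response matrix is a hypothetical example, with rows as students and columns as items (1 = correct, 0 = incorrect).

```python
# Classic item analysis: difficulty (p) and discrimination (D) indices.

def item_difficulty(responses, item):
    """Proportion of students answering the item correctly (higher = easier)."""
    correct = sum(r[item] for r in responses)
    return correct / len(responses)

def item_discrimination(responses, item, total_scores):
    """Upper-minus-lower index: p(correct) in the top half minus the bottom half."""
    ranked = sorted(range(len(responses)), key=lambda i: total_scores[i], reverse=True)
    half = len(ranked) // 2
    upper, lower = ranked[:half], ranked[-half:]
    p_upper = sum(responses[i][item] for i in upper) / half
    p_lower = sum(responses[i][item] for i in lower) / half
    return p_upper - p_lower

# Hypothetical scored answer sheet: 4 students, 3 items.
responses = [
    [1, 1, 0],
    [1, 1, 1],
    [1, 0, 0],
    [0, 0, 1],
]
totals = [sum(r) for r in responses]

print(item_difficulty(responses, 0))              # 0.75 (a fairly easy item)
print(item_discrimination(responses, 0, totals))  # 0.5 (favors stronger students)
```

An item with a discrimination index near zero (or negative) is a candidate for revision, since it does not separate stronger from weaker students.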


Disadvantages of Multiple-Choice Items:
1. Constructing good items is time consuming.
2. It is frequently difficult to find plausible distracters.
3. This item type is ineffective for measuring some types of problem solving and the ability to organize and express ideas.
4. Real-world problem solving differs: a different process is involved in proposing a solution versus selecting a solution from a set of alternatives.
5. Scores can be influenced by reading ability.

6. There is a lack of feedback on individual thought processes; it is difficult to determine why individual students selected incorrect responses.
7. Students can sometimes read more into the question than was intended.
8. These items often focus on testing factual information and fail to test higher levels of cognitive thinking.
9. Sometimes there is more than one defensible “correct” answer.
10. They place a high degree of dependence on the student’s reading ability and the instructor’s writing ability.
11. They do not provide a measure of writing ability.
12. They may encourage guessing.

Helpful Hints:

• Base each item on an educational or instructional objective of the course, not trivial information.
• Try to write items in which there is one and only one correct or clearly best answer.
• The phrase that introduces the item (stem) should clearly state the problem.
• Test only a single idea in each item.
• Be sure wrong answer choices (distracters) are at least plausible.
• Incorporate common errors of students in distracters.
• The position of the correct answer should vary randomly from item to item.
• Include from three to five options for each item.
• Avoid overlapping alternatives (see Example 3 following).
• The length of the response options should be about the same within each item (preferably short).
• There should be no grammatical clues to the correct answer.
• Format the items vertically, not horizontally (i.e., list the choices vertically).
• The response options should be indented and in column form.
• Word the stem positively; avoid negative phrasing such as “not” or “except.” If this cannot be avoided, the negative words should always be highlighted by underlining or capitalization: Which of the following is NOT an example ...
• Avoid excessive use of negatives and/or double negatives.
• Avoid the excessive use of “All of the above” and “None of the above” in the response alternatives. In the case of “All of the above”, students only need partial information in order to answer the question: they need to know that just two of the options are correct (in a question with four or more options) to determine that “All of the above” is the correct answer choice. Conversely, students only need to eliminate one answer choice as implausible in order to eliminate “All of the above” as an answer choice. Similarly, when “None of the above” is used as the correct answer choice, information is gained about students’ ability to detect incorrect answers; however, the item does not reveal whether students know the correct answer to the question.

Example 1: The stem of the original item below fails to present the problem adequately or to set a frame of reference for responding.

Original:
1. World War II was:
A. The result of the failure of the League of Nations.
B. Horrible.
C. Fought in Europe, Asia, and Africa.
D. Fought during the period of 1939-1945.




“A data collection method or instrument is considered reliable if the same result is obtained from using the method on repeated occasions.”

“A measurement method or instrument is considered valid if it measures what it intends to measure.”

The meanings of, and relationships between, reliability and validity can be clarified through the metaphor of a target:

• Reliable but not valid: the target is hit consistently, but you are systematically measuring the wrong value for all cases. The measure is consistent but wrong.
• Valid but not reliable: hits are randomly spread across the target. On average you get a valid group estimate, but individual measurements are inconsistent.
• Neither reliable nor valid: hits are spread across the target and consistently miss the centre.
• Both reliable and valid: you consistently hit the centre of the target.

Adapted from: Trochim 2001.

RELIABILITY

Reliability can depend on various factors (the observers/raters, the tools, the methods, the context, the sample…) and can be estimated in a variety of ways, including:


Inter-observer reliability: To what degree are measures taken by different raters/observers consistent?
How to test it: Consider pre-testing whether different raters/observers give consistent results on the same phenomenon.

Test-retest reliability: Is a measure consistent from one time to another?
How to test it: Consider administering the same test to the same (or a similar) sample on different occasions, but be aware of the effects of the time gap.

Parallel forms reliability: Do tests and tools constructed in the same way from the same content domain give similar results?
How to test it: Consider splitting a large set of questions into parallel forms and measuring the correlation of the results.

Internal consistency reliability: Do different measures of a similar issue yield results that are consistent?
How to test it: Consider testing a sampling of all records for inconsistent measures.


Adapted from: Trochim 2001.
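Two of these estimates reduce to computing a correlation between paired sets of scores. The Python sketch below, using hypothetical scores, shows test-retest reliability as a plain Pearson correlation, and a split-half internal consistency estimate with the standard Spearman-Brown correction for full test length. Values near 1.0 indicate a reliable measure.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Test-retest reliability: same test, same students, two occasions (hypothetical).
week1 = [12, 15, 9, 20, 14]
week2 = [13, 14, 10, 19, 15]
print(round(pearson(week1, week2), 3))  # close to 1.0 -> consistent over time

# Split-half internal consistency: correlate odd-item vs. even-item subscores,
# then apply the Spearman-Brown correction: r_full = 2r / (1 + r).
odd_half  = [6, 8, 4, 10, 7]
even_half = [6, 7, 5, 10, 7]
r = pearson(odd_half, even_half)
spearman_brown = (2 * r) / (1 + r)
print(round(spearman_brown, 3))  # close to 1.0 -> internally consistent
```

Inter-observer reliability can be estimated the same way by correlating two raters' scores for the same set of students.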

How to improve reliability? When constructing reliable data collection instruments:
• Ensure that the questions and the methodology are clear.
• Use explicit definitions of terms.
• Use already tested and proven questioning methods.

WHAT IS A SCAVENGER HUNT? A scavenger hunt is an assessment idea in which students try to find all the items or complete all the activities on a list. It can be completed in teams or by individuals, and is often timed.


• Analyze the difference between the types of assessment.
• Evaluate the importance of assessment in the teaching and learning process.
• Synthesize ways in which a teacher can assess his or her own teaching.


Use the link above to complete the following list of questions and activities. The completed assignment must be sent via e-mail before Wednesday, August 15th. DO NOT COPY-PASTE.

PART I: ASSESSMENT BASICS
RESOURCE LINK:
INSTRUCTIONS: Complete and/or answer the following questions or tasks after each instruction.

Why should assessments, learning objectives, and instructional strategies be aligned?

What if the components of a course are misaligned?
What is the difference between formative and summative assessment? Provide examples.
What is the difference between assessment and grading?

PART II: HOW TO ASSESS STUDENTS’ PRIOR KNOWLEDGE
RESOURCE LINK:
What are Performance-Based Prior Knowledge Assessments? Provide examples.
Give examples of your own of appropriately written questions for self-assessment.
What are CATs? List and provide a brief description of the CATs suggested on the site.
Give 3 examples of CATs you’ve used in class.

PART III: HOW TO ASSESS STUDENTS’ LEARNING AND PERFORMANCE
RESOURCE LINK:
List a few tips on how to create assignments.
List a few tips on how to create exams.
Compare and contrast the two previous methods.
What is the difference between concept maps and concept tests? Provide an example of each.
Provide tips on how to assess student group work. Explain what has worked for you.
What benefits are there in using rubrics? Provide an example of a rubric you have used.

PART IV: HOW TO ASSESS YOUR TEACHING
RESOURCE LINK:
What are early course evaluations? What do they consist of?
Explain the importance of classroom observations. What are a few suggestions on how to go about a classroom observation?
What is a student focus group?

Examples of reliable measures

• A reliable reading comprehension test to measure children's level of competency in English would be one that gives the same results from one week to the next, so long as there has been no instruction in the intervening period.
• A ruler is a reliable measure of length.

Examples of measures that may be unreliable

• A questionnaire to measure self-esteem may not be reliable if it is administered to people who have just experienced either success or failure.
• Asking the question "Have you been tested for HIV?" may not yield reliable data because some people may answer truthfully on this sensitive topic and some may not.
• The method of dietary recall to measure food consumption is only as reliable as each respondent’s memory.

VALIDITY “Validity is the best available approximation to the truth of a given proposition, inference, or conclusion. A measurement method or instrument is considered valid if it measures what it intends to measure.”

There are different types of validity, and we will focus on internal and external validity.

Internal validity

What is it?


Internal validity is relevant in studies attempting to establish a causal relationship, and it is only relevant for the specific study in question.

Did the programme really cause the outcome?

“Can change be attributed to a program or intervention and not to other possible causes?”

Basing the result on a single group (i.e., not using a control group) makes it more difficult to establish that the change is due to the programme and not to other factors (e.g., other external influences, changes in the social context).

Even when you have a control group (which is often not the case in development programmes, partly for ethical reasons!), there are challenges in ensuring that the two groups are really comparable and that no social interaction between them muddles the results.

External validity

It is related to generalising: it is the degree to which the conclusions of your study will hold for other persons in other places and at other times.

Can you really make a generalisation based on results?

Consider whether your study could be biased by your choice of:
• People: did you focus on “special” people?
• Places: did you undertake the study in “special” places?
• Times: did the study happen at a particular time, or following an event, that could bias the results?

Adapted from: Trochim 2001.

Examples of challenges to validity

• An evaluation designed to assess the impact of nutrition education on weaning practices is valid if actual weaning practices are observed. An evaluation that relies only on mothers’ reports may find that it is measuring what mothers know rather than what they do.
• An instrument to measure self-esteem in one country may not be valid in another culture.
• Our understandings of validity change over time. It is still debated whether I.Q. tests are a valid measure of intelligence. Long ago, measurements of the skull were thought to be a valid measure of intelligence.

What is the Reflective Teaching Model? Reflective Teaching is an inquiry approach that emphasizes an ethic of care, a constructivist approach to teaching, and creative problem solving (Henderson, 2001). An ethic of care respects the wonderful range of multiple talents and capacities of all individuals regardless of cultural, intellectual, or gender differences. A premium is placed on the dignity of all persons. Teachers using a constructivist approach place emphasis on big concepts, student questions, active learning, and cooperative learning, and they interweave assessment with teaching. A constructivist approach seeks to connect theory to practice and views the student as “thinker, creator, and constructor.” Integral to a constructivist theory of learning is creative problem solving. Teachers take responsibility for assessing and solving problems not with mechanistic “cookbook” recipes, but by asking “What decisions should I be making?”, “On what basis do I make these decisions?”, and “What can I do to enhance learning?”

How does the Reflective Teaching Model Integrate Theory with Practice? Teacher Education at EMU strives to help you make meaningful connections between theory and practice. You are taught to ask significant questions in the context of classroom and field experiences. The Education Department incorporates reflective thinking and teaching into a sequential curriculum pattern with initiatory, developmental, and culminating phases. Courses are arranged within the professional education sequence around five questions:
1. Exploring Teaching - “Shall I Teach?”
2. Academic Preparation - “What Shall I Teach?”
3. Understanding Learners - “How Do Students Learn?”
4. Organizing for Teaching - “How Shall I Teach?”
5. Schooling and Cultural Context - “Why Do We Teach?”

Classes participate in carefully arranged and fully integrated field based experiences beginning in the first year and culminating in the senior year with Student Teaching. The professional education curriculum emphasizes caring relationships, assertive but cooperative classroom management practices, peace and justice issues, and the integration of ethics with professional competency. The ultimate goal of teacher education at EMU is to empower you to develop a spirit of inquiry leading to informed decision making while applying values to action. Members of the education faculty are committed to demonstrating the reflective model in their own teaching. Education classes utilize instructional activities such as cooperative learning strategies, class interaction and role playing, microteaching lessons, and case studies. Instructors give special attention to the application of theory and practice by helping you make connections between relevant concepts through higher order questioning strategies. Reflective thinking skills – the ability to evaluate and interpret evidence, modify views, and make objective judgments – are stressed in all courses.

The Advantages of Rubrics
Part one in a five-part series

What is a rubric?
• A rubric is a scoring guide that seeks to evaluate a student's performance based on the sum of a full range of criteria rather than a single numerical score.
• A rubric is an authentic assessment tool used to measure students' work.
  o Authentic assessment is used to evaluate students' work by measuring the product according to real-life criteria. The same criteria used to judge a published author would be used to evaluate students' writing.
  o Although the same criteria are considered, expectations vary according to one's level of expertise. The performance level of a novice is expected to be lower than that of an expert and would be reflected in different standards. For example, in evaluating a story, a first-grade author may not be expected to write a coherent paragraph to earn a high evaluation. A tenth grader would need to write coherent paragraphs in order to earn high marks.
• A rubric is a working guide for students and teachers, usually handed out before the assignment begins in order to get students to think about the criteria on which their work will be judged.
• A rubric enhances the quality of direct instruction.

Rubrics can be created for any content area, including math, science, history, writing, foreign languages, drama, art, music, and even cooking! Once developed, they can be modified easily for various grade levels. The following rubric was created by a group of postgraduate education students at the University of San Francisco, but could be developed easily by a group of elementary students.

Chocolate chip cookie rubric
The cookie elements the students chose to judge were:
• Number of chocolate chips
• Texture
• Color
• Taste
• Richness (flavor)

4 - Delicious:
• Chocolate chip in every bite
• Chewy
• Golden brown
• Home-baked taste
• Rich, creamy, high-fat flavor

3 - Good:
• Chocolate chips in about 75 percent of the bites taken
• Chewy in the middle, but crispy on the edges
• Either brown from overcooking, or light from being 25 percent raw
• Quality store-bought taste
• Medium fat content

2 - Needs Improvement:
• Chocolate chips in 50 percent of the bites taken
• Texture is either crispy/crunchy from overcooking or doesn't hold together because it is at least 50 percent uncooked
• Either dark brown from overcooking or light from undercooking
• Tasteless
• Low-fat content

1 - Poor:
• Too few or too many chocolate chips
• Texture resembles a dog biscuit
• Burned
• Store-bought flavor with a preservative aftertaste: stale, hard, chalky
• Non-fat content

Here's how the table looks:

Criterion       | 4 - Delicious          | 3 - Good                          | 2 - Needs Improvement                 | 1 - Poor
Number of chips | Chips in every bite    | Chips in about 75% of bites       | Chips in 50% of bites                 | Too few or too many chips
Texture         | Chewy                  | Chewy in middle, crisp on edges   | Either crispy/crunchy or 50% uncooked | Resembles a dog biscuit
Color           | Golden brown           | Brown from overcooking, or light from being 25% raw | Dark brown from overcooking or light from undercooking | Burned
Taste           | Home-baked taste       | Quality store-bought taste        | Tasteless                             | Store-bought flavor, preservative aftertaste
Richness        | Rich, creamy, high-fat | Medium fat content                | Low-fat content                       | Non-fat content

Why use rubrics?

Many experts believe that rubrics improve students' end products and therefore increase learning. When teachers evaluate papers or projects, they know implicitly what makes a good final product and why. When students receive rubrics beforehand, they understand how they will be evaluated and can prepare accordingly. Developing a grid and making it available as a tool for students' use will provide the scaffolding necessary to improve the quality of their work and increase their knowledge.

In brief:
• Prepare rubrics as guides students can use to build on current knowledge.
• Consider rubrics as part of your planning time, not as an additional time commitment to your preparation. Once a rubric is created, it can be used for a variety of activities.

Reviewing, reconceptualizing, and revisiting the same concepts from different angles improves understanding of the lesson for students. An established rubric can be used, or slightly modified, and applied to many activities. For example, the standards for excellence in a writing rubric remain constant throughout the school year; what does change is students' competence and your teaching strategy. Because the essentials remain constant, it is not necessary to create a completely new rubric for every activity.

There are many advantages to using rubrics:
• Teachers can increase the quality of their direct instruction by providing focus, emphasis, and attention to particular details as a model for students.
• Students have explicit guidelines regarding teacher expectations.
• Students can use rubrics as a tool to develop their abilities.
• Teachers can reuse rubrics for various activities.

Create an Original Rubric
Part two in a five-part series

Learning to create rubrics is like learning anything valuable: it takes an initial time investment. Once the task becomes second nature, it actually saves time while creating a higher quality student product. The following template will help you get started:
• Determine the concepts to be taught. What are the essential learning objectives?
• Choose the criteria to be evaluated. Name the evidence to be produced.
• Develop a grid. Plug in the concepts and criteria.
• Share the rubric with students before they begin writing.
• Evaluate the end product. Compare individual students' work with the rubric to determine whether they have mastered the content.

Fiction-writing content rubric

PLOT: "What" and "Why"
4 - Both plot parts are fully developed.
3 - One of the plot parts is fully developed and the less developed part is at least addressed.
2 - Both plot parts are addressed but not fully developed.
1 - Neither plot part is fully developed.

SETTING: "When" and "Where"
4 - Both setting parts are fully developed.
3 - One of the setting parts is fully developed and the less developed part is at least addressed.
2 - Both setting parts of the story are addressed but not fully developed.
1 - Neither setting part is developed.

CHARACTERS: "Who" (described by behavior, appearance, personality, and character traits)
4 - The main characters are fully developed with much descriptive detail. The reader has a vivid image of the characters.
3 - The main characters are developed with some descriptive detail. The reader has a vague idea of the characters.
2 - The main characters are identified by name only.
1 - None of the characters are developed or named.

In the above example, the concepts include the plot, setting, and characters. The criteria are the who, what, where, when, and why parts of the story. The grid is the physical layout of the rubric. Sharing the rubric and going over it step by step is necessary so that students will understand the standards by which their work will be judged. The evaluation is the objective grade determined by the teacher.

The teacher determines the passing grade. For instance, if all three concepts were emphasized, a passing grade of 3 in all three concepts might be required. If any part of the story fell below a score of 3, then that particular concept would need to be re-taught and rewritten with specific teacher feedback. In another example, suppose a teacher emphasized only one concept, such as character development. A passing grade of 3 in character development may constitute a passing grade for the whole project. The purpose in writing all three parts of the story would be to gain writing experience and get feedback for future work.

Share the rubric with students prior to starting the project. It should be visible at all times on a bulletin board or distributed in a handout. Rubrics help focus teaching and learning time by directing attention to the key concepts and standards that students must meet.

GRADING
Grading is not simply a matter of assigning number or letter grades. It is a process that may involve some or all of these activities:

• Creating effective assignments
• Establishing standards and criteria
• Setting curves
• Making decisions about effort and improvement
• Deciding which comments would be the most useful in guiding each student's learning
• Designing assignments and exams that promote the course objectives
• Assessing student learning and teaching effectiveness
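The rubric pass decision described above (a story passes only if every emphasized concept scores at least 3) can be sketched in a few lines of Python. This is an illustrative sketch only: the function name, concept names, and sample scores are made up, not part of the original rubric.

```python
def passes(scores: dict[str, int], emphasized: list[str], threshold: int = 3) -> bool:
    """True only if every emphasized concept meets the threshold."""
    return all(scores[concept] >= threshold for concept in emphasized)

# Illustrative scores for one student's story.
scores = {"plot": 4, "setting": 3, "characters": 2}

print(passes(scores, ["plot", "setting", "characters"]))  # False: characters scored below 3
print(passes(scores, ["plot"]))  # True: only plot is emphasized
```

Any concept that falls below the threshold is exactly the one to re-teach and have the student rewrite.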

Effective grading requires an understanding of how grading may function as a tool for learning, an acceptance that some grades will be based on subjective criteria, and a willingness to listen to and communicate with students. It is important to help students focus on the learning process rather than on "getting the grade," while at the same time acknowledging the importance that grades hold for students. And, since GSIs are students themselves, they must balance the requirements of effective grading with other workload and professional commitments. This section contains general tips on how to make your grading both more effective and more efficient. You will also find specific suggestions here about designing assignments, setting standards and policies, using grading rubrics, and writing comments on student work. You might also find relevant information in other sections of this online guide, for example, Working with Student Writing, Academic Misconduct, and Improving Your Teaching.

Before You Grade

Designing Assignments

As a GSI, you may or may not have input into the course assignments you will grade. Some faculty prefer to design the course assignments themselves; others ask for substantial input from GSIs. Course assignments can be very particular: they depend on the content and objectives of the course, the teaching methods and style of the instructor, the level and background of the students, and the given discipline. However, there are questions to take into account if you are designing assignments:

• What do you want the students to learn? What are the goals and objectives of the course? How does the assignment contribute to those goals and objectives?
• What skills do you want students to employ: to solve, to argue, to create, to analyze, to explain, to demonstrate, to apply, etc.?
• How well focused is the assignment? Are the instructions clear and concise? Does the assignment give the students a clearly defined, unambiguous task?
• How long is the assignment going to be?
• Do you want students to engage in research that goes beyond the course content, or do you want them to stick to the course materials?
• What should the assignment format be?
• When will the assignment be due, and how much time will you need to grade it? When will the assignment be returned to students?
• Will you allow students to rewrite the assignment if necessary?
• Can this assignment be realistically completed given the knowledge, ability, and time constraints of the students?
• Is it clearly related to the course content?
• Are the research materials needed to complete the assignment available in sufficient quantity?
• Is it possible for you to grade this assignment effectively, given your workload and other commitments?
• How is this assignment going to contribute to the student's final course grade?

The Grading Process

Steps in the Process

The process of assigning grades can be broken down into stages.

1. Establish a set of criteria (or understand the criteria given to you by your department or the professor of the course) by thinking about what students must do to complete the assignment successfully, and weighting each criterion accordingly with respect to the final grade. If there is more than one GSI for the course, try to establish the criteria jointly to ensure consistency. To make criteria easier to keep in mind when grading, divide them into areas such as clarity of expression, understanding of material, and quality of argumentation (or whatever areas are relevant to your field and the assignment).
2. Read through all the papers quickly. Try to pick out model papers, for example a model 'A' paper, a model 'B' paper, and so forth. This will give you a good overall sense of the quality of the assignments.
3. Read through the papers more carefully, writing comments and assigning preliminary grades. Write the grades in pencil in case you want to change them later.
4. Sort the papers by grade range and compare them to make sure that you have assigned grades consistently; that is, that all of the B papers are of the same quality, that all of the A papers are of the same quality, etc.
5. Write the final grades in pen, or record them in whatever way you are instructed. If there is more than one GSI or grader for your course, you might want to exchange a couple of papers from each grade range to ensure that you are assigning grades in the same way.

Communicating with Students

Writing Comments on Student Work

Your written comments on students' work should be used to help them understand the strengths and weaknesses of their work, and to make clear how their work has or has not achieved the goals and standards set in the class. Here are some suggestions on how to make your comments meaningful to students.
For more detailed advice about writing comments on papers, see Comments on Student Writing.

• Think about the sorts of comments that you find helpful and unhelpful. For example, avoid one-word comments such as "good" or "unclear." If you think that something is good or unclear, explain in concrete terms why you think so.
• Think about the extent to which you want to comment on each aspect of the assignment. For example, how important are punctuation and spelling? Is it enough to have one or two comments on grammar or syntax, or would more extensive comments be appropriate?
• Don't overwhelm the student with a lot of different comments. Approximately one or two comments per page is enough. Focus on a couple of major points rather than comment on everything.
• Write specific comments in the margin and more general comments at the end of the assignment.

• General comments give the students an overall sense of what went right or wrong and how they might improve their work in the future. Specific comments identify particular parts of the assignment that are right or wrong and explain why.
• What has been omitted from the paper or exam response is as important as what has been included. Ask questions to point out something that's missing or to suggest improvements. Try to give the students a good overall sense of how they might improve their work.
• Don't comment exclusively on weaknesses. Identify strengths and explain them. This helps students know their progress, and helps them build their skills. Write as many comments on good work as on bad work. In addition to commenting on things the student does well, think about how the student might work to improve his or her work even further.

• Write legibly or type your comments.
• Don't be sarcastic or make jokes. What seems funny to you may be hurtful to students and not provide the guidance they need for improvement.
• Discuss difficult cases with other GSIs or the instructor in charge.
• Keep a record of common problems and interesting ideas, and discuss them in class.
• Make sure you have adequately explained the reason for the grade.

Questions to Ask Yourself When Writing Comments

• What were the strengths in this piece of work? What were the weaknesses? What stands out as memorable or interesting?
• Does the work have a clear thesis or main point, either explicit or implicit? Is it clear what point the author is trying to make and why?
• Are the main points and ideas clear? Are they specific enough? Are they clearly related to the assignment?
• Does the author provide sufficient evidence or argumentative support?
• Is the writing clear, concise, coherent, and easy and interesting to read? Are the grammar and syntax acceptable? Is the writing style appropriate? Does the author understand all of the words and phrases that he or she is using?
• Does the work have a clear, logical structure? Are the transitions clear? Is there one main point per paragraph?
• Are the factual claims correct? Does the author provide the appropriate citations and bibliographical references?

Steps to Creating a Rubric

Creating a rubric takes time and requires thought and experimentation. Here, you can see the steps we used to create a rubric for an essay assignment in a large-enrollment, intro-level sociology course. See also Tips on Using Rubrics Effectively and a Rubric Worksheet you can use to make rubrics of your own.

1. Define the traits or learning outcomes you want to assess (usually in nouns or noun phrases).
2. Choose what kind of scale you want to use: analytic or holistic? Five-point scale, three-point scale, letter grades, or a scale of your own devising?
3. Draw a table in which you describe the characteristics of student work at each point on your scale.
4. Test it out!

Example: Sociology 3AC Essay Assignment

Write a 7-8 page essay in which you make an argument about the relationship between social factors and educational opportunity. To complete the assignment, you will use electronic databases to gather data on three different high schools (including your own). You will use this data to locate each school within the larger social structure and to support your argument about the relationship between social status and public school quality. In your paper you should also reflect on how your own personal educational opportunities have been influenced by the social factors you identify. Course readings and materials should be used as background, to define sociological concepts and to place your argument within a broader discussion of the relationship between social status and individual opportunity. Your paper should be clearly organized, proofread for grammar and spelling, and all scholarly ideas must be cited using the ASA style manual.

Using the four-step process for this assignment:

Define the traits or learning outcomes you want to assess (usually in nouns or noun phrases):

• Argument
• Use and interpretation of data
• Reflection on personal experiences
• Application of course readings and materials
• Organization, writing, and mechanics

Choose the kind of scale you want to use: analytic or holistic; a five-point scale, three-point scale, letter grades, or a scale of your own devising.

For this assignment, we decided to grade each trait individually because there seemed to be too many independent variables to grade holistically. We decided to use a five-point scale for each trait, but we could have used a three-point scale, or a descriptive scale, as follows. This choice, again, depends on the complexity of the assignment and the kind of information you want to convey to students. Draw a table in which you describe the characteristics of student work at each point on your scale.

RUBRIC SCALES

Five-Point Scale (Grade/Point)

5: Argument pertains to relationship between social factors and educational opportunity and is clearly stated and defensible.

4: Argument pertains to relationship between social factors and educational opportunity and is defensible, but it is not clearly stated.

3: Argument pertains to relationship between social factors and educational opportunity but is not defensible using the evidence available.

2: Argument is presented, but it does not pertain to relationship between social factors and educational opportunity.

1: Social factors and educational opportunity are discussed, but no argument is presented.

Three-Point Scale (Grade/Point)

3: Argument pertains to relationship between social factors and educational opportunity and is clearly stated and defensible.

2: Argument pertains to relationship between social factors and educational opportunity but may not be clear or sufficiently narrow in scope.

1: Social factors and educational opportunity are discussed, but no argument is presented.

Simplified Three-Point Scale

Ideal Outcome: Argument pertains to relationship between social factors and educational opportunity and is clearly stated and defensible.




Simplified Three-Point Scale, numbers replaced with descriptive terms

Ideal Outcome: Argument pertains to relationship between social factors and educational opportunity and is clearly stated and defensible.




Final Analytic Rubric

Argument

5: Argument pertains to relationship between social factors and educational opportunity and is clearly stated and defensible.

4: Argument pertains to relationship between social factors and educational opportunity and is defensible, but it is not clearly stated.

3: Argument pertains to relationship between social factors and educational opportunity but is not defensible using the evidence available.

2: Argument is presented, but it does not pertain to relationship between social factors and educational opportunity.

1: Social factors and educational opportunity are discussed, but no argument is presented.

Interpretation and Use of Data

5: The data is accurately interpreted to identify each school's position within a larger social structure, and sufficient data is used to defend the main argument.

4: The data is accurately interpreted to identify each school's position within a larger social structure, and data is used to defend the main argument, but it might not be sufficient.

3: Data is used to defend the main argument, but it is not accurately interpreted to identify each school's position within a larger social structure, and it might not be sufficient.

2: Data is used to defend the main argument, but it is insufficient, and no effort is made to identify the school's position within a larger social structure.

1: Data is provided, but it is not used to defend the main argument.

Reflection on Personal Experiences

5: Personal educational experiences are examined thoughtfully and critically to identify significance of external social factors and support the main argument.

4: Personal educational experiences are examined thoughtfully and critically to identify significance of external social factors, but relation to the main argument may not be clear.

3: Personal educational experiences are examined, but not in a way that reflects understanding of the external factors shaping individual opportunity. Relation to the main argument also may not be clear.

2: Personal educational experiences are discussed, but not in a way that reflects understanding of the external factors shaping individual opportunity. No effort is made to relate experiences back to the main argument.

1: Personal educational experiences are mentioned, but in a perfunctory way.

Application of Course Readings and Materials

5: Demonstrates solid understanding of the major themes of the course, using course readings to accurately define sociological concepts and to place the argument within a broader discussion of the relationship between social status and individual opportunity.

4: Uses course readings to define sociological concepts and place the argument within a broader framework, but does not always demonstrate solid understanding of the major themes.

3: Uses course readings to place the argument within a broader framework, but sociological concepts are poorly defined or not defined at all. The data is not all accurately interpreted to identify each school's position within a larger social structure, and it might not be sufficient.

2: Course readings are used, but paper does not place the argument within a broader framework or define sociological concepts.

1: Course readings are only mentioned, with no clear understanding of the relationship between the paper and course themes.

Organization, Writing, and Mechanics

5: Clear organization and natural "flow" (with an introduction, transition sentences to connect major ideas, and conclusion) with few or no grammar or spelling errors. Scholarly ideas are cited correctly using the ASA style guide.

4: Clear organization (introduction, transition sentences to connect major ideas, and conclusion), but writing might not always be fluid, and might contain some grammar or spelling errors. Scholarly ideas are cited correctly using the ASA style guide.

3: Organization unclear or the paper is marred by significant grammar or spelling errors (but not both). Scholarly ideas are cited correctly using the ASA style guide.

2: Organization unclear and the paper is marred by significant grammar and spelling errors. Scholarly ideas are cited correctly using the ASA style guide.

1: Effort to cite is made, but the scholarly ideas are not cited correctly. (Automatic "F" if ideas are not cited at all.)

Holistic Rubric For some assignments, you may choose to use a holistic rubric, or one scale for the whole assignment. This type of rubric is particularly useful when the variables you want to assess just cannot be usefully separated. We chose not to use a holistic rubric for this assignment because we wanted to be able to grade each trait separately, but we’ve completed a holistic version here for comparative purposes.

Grade/Point

A: The paper is driven by a clearly stated, defensible argument about the relationship between social factors and educational opportunity. Sufficient data is used to defend the argument, and the data is accurately interpreted to identify each school's position within a larger social structure. Personal educational experiences are examined thoughtfully and critically to identify significance of external social factors and support the main argument. Paper reflects solid understanding of the major themes of the course, using course readings to accurately define sociological concepts and to place the argument within a broader discussion of the relationship between social status and individual opportunity. Paper is clearly organized (with an introduction, transition sentences to connect major ideas, and conclusion) and has few or no grammar or spelling errors. Scholarly ideas are cited correctly using the ASA style guide.

B: The paper is driven by a defensible argument about the relationship between social factors and public school quality, but it may not be stated as clearly and consistently throughout the essay as in an "A" paper. The argument is defended using sufficient data, reflection on personal experiences, and course readings, but the use of this evidence does not always demonstrate a clear understanding of how to locate the school or community within a larger class structure, how social factors influence personal experience, or the broader significance of course concepts. Essay is clearly organized, but might benefit from more careful attention to transitional sentences. Scholarly ideas are cited accurately, using the ASA style sheet, and the writing is polished, with few grammar or spelling errors.

C: The paper contains an argument about the relationship between social factors and public school quality, but the argument may not be defensible using the evidence available. Data, course readings, and personal experiences are used to defend the argument, but in a perfunctory way, without demonstrating an understanding of how social factors are identified or how they shape personal experience. Scholarly ideas are cited accurately, using the ASA style sheet. Essay may have either significant organizational or proofreading errors, but not both.

D: The paper does not have an argument, or is missing a major component of the evidence requested (data, course readings, or personal experiences). Alternatively, or in addition, the paper suffers from significant organizational and proofreading errors. Scholarly ideas are cited, but without following ASA guidelines.

F: The paper does not provide an argument and contains only one component of the evidence requested, if any. The paper suffers from significant organizational and proofreading errors. If scholarly ideas are not cited, paper receives an automatic "F."

Tips on Using Rubrics

Think through your learning objectives. Put some thought into the various traits, or learning outcomes, you want the assignment to assess. The process of creating a rubric can often help clarify the assignment itself. If the assignment has been well articulated, with clear and specific learning goals in mind, the language for your rubric can come straight from the assignment as written. Otherwise, try to unpack the assignment, identifying areas that are not articulated clearly. If the learning objectives are too vague, your rubric will be less useful (and your students will have a difficult time understanding your expectations). If, on the other hand, your stated objectives are too mechanistic or specific, your rubric will not accurately reflect your grading expectations. For help in articulating learning objectives, see Bloom's Taxonomy.

Decide what kind of scale you will use. Decide whether the traits you have identified should be assessed separately or holistically. If the assignment is complex, with many variables in play, you might need a scale for each trait ("Analytic Rubric"). If the assignment is not as complex, or the variables seem too interdependent to be separated, you might choose to create one scale for the entire assignment ("Holistic Rubric"). Do you want to use a letter-grade scale, a point scale (which can be translated into a grade at the end), or some other scale of your own devising (e.g., "Proficient," "Fair," "Inadequate," etc.)? This decision will depend, again, on how complex the assignment is, how it will be weighed in the students' final grade, and what information you want to convey to students about their grade.

Describe the characteristics of student work at each point on your scale. Once you have defined the learning outcomes being assessed and the scale you want to employ, create a table to think through the characteristics of student work at every point or grade on your scale. You might find it helpful to use the rubric worksheet that follows. Instructors are used to articulating the ideal outcome of a given assignment. It can be more challenging (but often far more helpful to the students) to articulate the differences, for example, between "C" and "B" work. If you have samples of student work from past years, look them over to identify the various levels of accomplishment.
Start by describing the "ideal" outcome, then the "acceptable" outcome, then the "unacceptable" outcome, and fill in the blanks in between. If you don't have student work, try to imagine the steps students will take to complete the assignment, the difficulties they might encounter, and the lower-level achievements we might take for granted.

Test your rubric on student work. It is essential to try your rubric out and make sure it accurately reflects your grading expectations (as well as those of the instructor and other GSIs). If available, use sample work from previous semesters. Otherwise, test your rubric on a sampling of student papers and then revise the rubric before you grade the rest. Make sure, however, that you are not substantially altering the grading criteria you laid out for your students.

Use your rubric to give constructive feedback to students. Consider handing the rubric out with students' returned work. You can use the rubric to facilitate the process of justifying grades and to provide students with clear instructions about how they can do better next time. Some instructors prefer not to hand out the rubric, at least in the form that they use it in grading. An abbreviated form of the rubric can be developed for student communication both before the paper is handed in and when it's handed back after grading.

Use your rubric to clarify your assignments and to improve your teaching. The process of creating a rubric can help you create assignments tailored to clear and specific learning objectives. Next time you teach the assignment, use your rubric to fine-tune the assignment description, and consider handing out the rubric with the assignment itself. Rubrics can also provide you, as the teacher, with important feedback on how well your students are meeting the learning outcomes you've laid out for them. If most of your students are scoring a "2" on "Clarity and Strength of Argument," then you know that next time you teach the course you need to devote more classroom time to this learning goal.

WORKSHEET

List the traits you want the assignment to measure (usually in nouns or noun phrases):

__________________________________________________________________
__________________________________________________________________
__________________________________________________________________
__________________________________________________________________
__________________________________________________________________

Use the following chart to create a rubric. Either fill out one chart for the entire assignment (holistic rubric) or fill out one chart for each trait or learning objective (analytic rubric).

Trait / Assignment being Assessed: ______________________________

Grade / Points:


What is assessment and evaluation? Assessment is defined as data-gathering strategies, analyses, and reporting processes that provide information that can be used to determine whether or not intended outcomes are being achieved. Evaluation uses assessment information to support decisions on maintaining, changing, or discarding instructional or programmatic practices. These strategies can inform:

• the nature and extent of learning,
• curricular decision making,
• the correspondence between learning and the aims and objectives of teaching, and
• the relationship between learning and the environments in which learning takes place.

Evaluation is the culminating act of interpreting the information gathered for the purpose of making decisions or judgments about students' learning and needs, often at reporting time. Assessment and evaluation are integral components of the teaching-learning cycle. The main purposes are to guide and improve learning and instruction. Effectively planned assessment and evaluation can promote learning, build confidence, and develop students' understanding of themselves as learners. Assessment data assists the teacher in planning and adapting for further instruction. As well, teachers can enhance students' understanding of their own progress by involving them in gathering their own data, and by sharing teacher-gathered data with them. Such participation makes it possible for students to identify personal learning goals.

Types of Assessment and Evaluation

There are three types of assessment and evaluation that occur regularly throughout the school year: diagnostic, formative, and summative.

Diagnostic assessment and evaluation usually occur at the beginning of the school year and before each unit of study. The purposes are to determine students' knowledge and skills, their learning needs, and their motivational and interest levels. By examining the results of diagnostic assessment, teachers can determine where to begin instruction and what concepts or skills to emphasize. Diagnostic assessment provides information essential to teachers in selecting relevant learning objectives and in designing appropriate learning experiences for all students, individually and as group members. Keeping diagnostic instruments for comparison and further reference enables teachers and students to determine progress and future direction. Diagnostic assessment tools such as the Writing Strategies Questionnaire and the Reading Interest/Attitude Inventory in this guide can provide support for instructional decisions.

Formative assessment and evaluation focus on the processes and products of learning. Formative assessment is continuous and is meant to inform the student, the parent/guardian, and the teacher of the student's progress toward the curriculum objectives. This type of assessment and evaluation provides information upon which instructional decisions and adaptations can be made and provides students with directions for future learning. Involvement in constructing their own assessment instruments or in adapting ones the teacher has made allows students to focus on what they are trying to achieve, develops their thinking skills, and helps them to become reflective learners. As well, peer assessment is a useful formative evaluation technique. For peer assessment to be successful, students must be provided with assistance and the opportunity to observe a model peer assessment session.
Through peer assessment students have the opportunity to become critical and creative thinkers who can clearly communicate ideas and thoughts to others. Instruments such as checklists or learning logs, and interviews or conferences provide useful data.

Summative assessment and evaluation occur most often at the end of a unit of instruction and at term or year end when students are ready to demonstrate achievement of curriculum objectives. The main purposes are to determine knowledge, skills, abilities, and attitudes that have developed over a given period of time; to summarize student progress; and to report this progress to students, parents/guardians, and teachers. Summative judgments are based upon criteria derived from curriculum objectives. By sharing these objectives with the students and involving them in designing the evaluation instruments, teachers enable students to understand and internalize the criteria by which their progress will be determined.

Often assessment and evaluation results provide both formative and summative information. For example, summative evaluation can be used formatively to make decisions about changes to instructional strategies, curriculum topics, or learning environment. Similarly, formative evaluation assists teachers in making summative judgments about student progress and determining where further instruction is necessary for individuals or groups. The suggested assessment techniques included in various sections of this guide may be used for each type of evaluation.

RUBRICS From an Assessment Workshop presented at Honolulu Community College on August 31, 2004 by Dr. Mary Allen, The California State University System

In general, a rubric is a scoring guide used in subjective assessments. A rubric implies that a rule defining the criteria of an assessment system is followed in evaluation. A rubric can be an explicit description of performance characteristics corresponding to a point on a rating scale; a scoring rubric makes explicit the expected qualities of performance on a rating scale, or the definition of a single scoring point on a scale.

Rubrics are explicit schemes for classifying products or behaviors into categories that vary along a continuum. They can be used to classify virtually any product or behavior, such as essays, research reports, portfolios, works of art, recitals, oral presentations, performances, and group activities. Judgments can be self-assessments by students, or judgments can be made by others, such as faculty, other students, or field-work supervisors. Rubrics can be used to provide formative feedback to students, to grade students, and/or to assess programs.

Rubrics have many strengths:

• Complex products or behaviors can be examined efficiently.
• Developing a rubric helps to precisely define faculty expectations.
• Well-trained reviewers apply the same criteria and standards, so rubrics are useful for assessments involving multiple reviewers.
• Summaries of results can reveal patterns of student strengths and areas of concern.
• Rubrics are criterion-referenced, rather than norm-referenced. Raters ask, "Did the student meet the criteria for level 5 of the rubric?" rather than "How well did this student do compared to other students?" This is more compatible with cooperative and collaborative learning environments than competitive grading schemes and is essential when using rubrics for program assessment because you want to learn how well students have met your standards.
• Ratings can be done by students to assess their own work, or they can be done by others, such as peers, fieldwork supervisors, or faculty.

Developing a Rubric
It is often easier to adapt a rubric that someone else has created, but if you are starting from scratch, here are some steps that might make the task easier:

1. Identify what you are assessing (e.g., critical thinking).
2. Identify the characteristics of what you are assessing (e.g., appropriate use of evidence, recognition of logical fallacies).
3. Describe the best work you could expect using these characteristics. This describes the top category.
4. Describe the worst acceptable product using these characteristics. This describes the lowest acceptable category.
5. Describe an unacceptable product. This describes the lowest category.
6. Develop descriptions of intermediate-level products and assign them to intermediate categories. You might develop a scale that runs from 1 to 5 (unacceptable, marginal, acceptable, good, outstanding), 1 to 3 (novice, competent, exemplary), or any other set that is meaningful.
7. Ask colleagues who were not involved in the rubric's development to apply it to some products or behaviors, and revise as needed to eliminate ambiguities.

Suggestions for Using Scoring Rubrics for Grading and Program Assessment

1. Hand out the grading rubric with an assignment so students will know your expectations and how they'll be graded. This should help students master your learning objectives by guiding their work in appropriate directions.
2. Use a rubric for grading student work, including essay questions on exams, and return the rubric with the grading on it. Faculty save time writing extensive comments; they just circle or highlight relevant segments of the rubric. Each row in the rubric could have a different array of possible points, reflecting its relative importance for determining the overall grade. Points (or point ranges) possible for each cell could be printed on the rubric, and a column for points for each row and a comments section could be added.
3. Develop a rubric with your students for an assignment or group project. Students can then monitor themselves and their peers using agreed-upon criteria that they helped develop. (Many faculty find that students will create higher standards for themselves than faculty would impose on them.)
4. Have students apply your rubric to some sample products (e.g., lab reports) before they create their own. Faculty report that students are quite accurate when doing this, and this process should help them evaluate their own products as they develop them.
5. Have students exchange paper drafts and give peer feedback using the rubric, then give students a few days before the final drafts are turned in to you. (You might also require that they turn in the draft and scored rubric with their final paper.)
6. Have students self-assess their products using the grading rubric and hand in the self-assessment with the product; then faculty and students can compare self- and faculty-generated evaluations.
7. Use the rubric for program assessment. Faculty can use it in classes and aggregate the data across sections, faculty can independently assess student products (e.g., portfolios) and then aggregate the data, or faculty can participate in group readings in which they review student products together and discuss what they found. Field-work supervisors or community professionals also may be invited to assess student work using rubrics. A well-designed rubric should allow evaluators to focus efficiently on specific learning objectives while reviewing complex student products, such as theses, without getting bogged down in the details. Rubrics should be pilot tested, and evaluators should be "normed" or "calibrated" before they apply them (i.e., they should agree on appropriate classifications for a set of student products that vary in quality). If two evaluators apply the rubric to each product, inter-rater reliability can be examined. Once the data are collected, faculty discuss the results to identify program strengths and areas of concern, "closing the loop" by using the assessment data to make changes that improve student learning.
8. Faculty can get "double duty" out of their grading by using a common rubric for both grading and program assessment. Individual faculty may elect to use the common rubric in different ways, combining it with other grading components as they see fit.
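The "norming" and inter-rater reliability check described above can be made concrete with a small computation. This is a minimal sketch using simple percent agreement; the choice of statistic, the function name, and the sample scores are our illustration, since the workshop handout does not prescribe a particular method:

```python
# Sketch: percent agreement between two raters applying the same rubric
# to the same set of student products. The 1-5 rubric levels and sample
# scores below are illustrative, not from the workshop handout.

def percent_agreement(rater_a, rater_b):
    """Fraction of products on which both raters chose the same rubric level."""
    if len(rater_a) != len(rater_b):
        raise ValueError("Both raters must score the same products")
    matches = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
    return matches / len(rater_a)

# Ten portfolios scored on a 1-5 rubric by two calibrated ("normed") raters.
rater_a = [5, 4, 4, 3, 2, 5, 3, 4, 1, 3]
rater_b = [5, 4, 3, 3, 2, 5, 3, 4, 2, 3]

print(percent_agreement(rater_a, rater_b))  # 0.8
```

A low agreement score would signal that the rubric's category descriptions are ambiguous and need another calibration round before program-level conclusions are drawn.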

Analytic vs. Holistic Rubrics
Part three in a five-part series

What's the difference between analytic and holistic rubrics?
• Analytic rubrics identify and assess components of a finished product.
• Holistic rubrics assess student work as a whole.

Which one is better? Neither rubric is better than the other. Both have a place in authentic assessment, depending on the following:
• Who is being taught? Because there is less detail to analyze in the holistic rubric, younger students may be able to integrate it into their schema better than the analytic rubric.
• How many teachers are scoring the product? Different teachers have different ideas about what constitutes acceptable criteria. The extra detail in the analytic rubric will help multiple graders emphasize the same criteria.

Recall the analytic rubric from part two and compare it with the holistic rubric below:

Fiction Writing Content Rubric – HOLISTIC
• 5 – The plot, setting, and characters are developed fully and organized well. The who, what, where, when, and why are explained using interesting language and sufficient detail.
• 4 – Most parts of the story mentioned in a score of 5 above are developed and organized well. A couple of aspects may need to be more fully or more interestingly developed.
• 3 – Some aspects of the story are developed and organized well, but not as much detail or organization is expressed as in a score of 4.
• 2 – A few parts of the story are developed somewhat. Organization and language usage need improvement.
• 1 – Parts of the story are addressed without attention to detail or organization.

Rubric Reminders:
1. Neither the analytic nor the holistic rubric is better than the other one.
2. Consider your students and grader(s) when deciding which type to use.
3. For modeling, present to your students anchor products or exemplars of products at various levels of development.

How to Weight Rubrics
Part four in a five-part series

What is a weighted rubric?
A weighted rubric is an analytic rubric in which certain concepts are judged more heavily than others. If, in a creative writing assignment, a teacher stresses character development, he or she might consider weighting the characters section of the rubric more heavily than the plot or setting. Remember that the purpose of creative writing is to evoke emotion from the reader. The writing needs to be interesting, sad, exciting, mysterious, or whatever the author decides. One way to develop the intended emotion is to focus on each concept separately within the context of creative writing.

Advantages
A weighted rubric clearly communicates to students and their parents which parts of the project are more important to learn for a particular activity. Weights can be changed to stress different aspects of a project. One week a teacher may focus on character development; in the next week or two, plot may take precedence. A weighted rubric focuses attention on specific aspects of a project. When learning something new, it is difficult to assimilate all of the necessary details into a coherent final product. Likewise, it is difficult to learn new things in isolation or out of context. A weighted rubric devised from quality projects will allow new learners to focus on what is being taught, while providing meaningful context to support the entire experience.

Different ways to weight rubrics
1. Refer to the analytic rubric in part two of this series. If you have just focused on character development, simply require students to achieve a passing score of 3.00 in characters, realizing that the other parts are also necessary for quality fiction writing.
2. Assign numeric weights to different concepts. Characters might be worth 50 percent, and the setting and plot might be worth 25 percent each. When grading a story, the teacher would put twice as much weight on characters as on either setting or plot. A passing score of at least 2.00 points, with 1.50 coming from characters, would be required. After a lesson on how to develop the plot, that concept might be worth 50 percent while the setting and characters would be worth 25 percent each.
3. To achieve a cumulative effect after the second lesson, the plot and characters might be worth 40 percent each, and the setting might be worth 20 percent.

Summary
Weighted rubrics are useful for explicitly describing to students and parents which concepts take priority over others for certain activities.
In designing weighted rubrics, it is important not to lose sight of the purpose of an activity by getting bogged down in meaningless details, such as the number of adjectives and verbs used or the number of pages written. The purpose of creative writing is to evoke a response from the reader. Using written words to

elicit emotion effectively requires skill and understanding of the language. The concepts are the form by which good writing is judged. The important criteria become how the author uses language to achieve his or her goals.

Weighted Fiction-Writing Content Rubric

PLOT: "What" and "Why" (25%)
• 4 – Both plot parts are fully developed. (.25 x 4 = 1.00 point)
• 3 – One of the plot parts is fully developed, and the less developed part is at least addressed. (.25 x 3 = .75 point)
• 2 – Both plot parts are addressed but not fully developed. (.25 x 2 = .50 point)
• 1 – Neither plot part is fully developed. (.25 x 1 = .25 point)

SETTING: "When" and "Where" (25%)
• 4 – Both setting parts are fully developed. (.25 x 4 = 1.00 point)
• 3 – One of the setting parts is fully developed, and the less developed part is at least addressed. (.25 x 3 = .75 point)
• 2 – Both setting parts of the story are addressed but not fully developed. (.25 x 2 = .50 point)
• 1 – Neither setting part is developed. (.25 x 1 = .25 point)

CHARACTERS: "Who," described by appearance, personality, character traits, and behavior (50%)
• 4 – The main characters are fully developed with much descriptive detail. The reader has a vivid image of the characters. (.50 x 4 = 2.00 points)
• 3 – The main characters are developed with some descriptive detail. The reader has a vague idea of the characters. (.50 x 3 = 1.50 points)
• 2 – The main characters are identified by name only. (.50 x 2 = 1.00 point)
• 1 – None of the characters are developed or named. (.50 x 1 = .50 point)
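The weighted-rubric arithmetic above (plot and setting at 25 percent each, characters at 50 percent, levels 1 to 4, passing at 2.00 total with 1.50 coming from characters) amounts to a simple weighted sum. A minimal sketch of that calculation; the weights and thresholds come from the article's example, while the function and variable names are ours:

```python
# Sketch of the weighted-rubric arithmetic: each criterion's level (1-4)
# is multiplied by its weight, and the products are summed. Weights and
# the 2.00 / 1.50 passing thresholds come from the article's example.

WEIGHTS = {"plot": 0.25, "setting": 0.25, "characters": 0.50}

def weighted_score(levels):
    """levels maps criterion name -> rubric level (1-4)."""
    return sum(WEIGHTS[c] * level for c, level in levels.items())

def passes(levels):
    """Passing: total >= 2.00 with at least 1.50 coming from characters."""
    return (weighted_score(levels) >= 2.00
            and WEIGHTS["characters"] * levels["characters"] >= 1.50)

story = {"plot": 2, "setting": 2, "characters": 3}
print(weighted_score(story))  # 0.25*2 + 0.25*2 + 0.50*3 = 2.5
print(passes(story))          # True: total is 2.5 and characters contribute 1.5
```

Changing the emphasis for a new lesson (say, plot at 50 percent) is just a matter of editing the weights, which mirrors the article's point that weights can shift week to week.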

Student-Generated Rubrics
Part five in a five-part series

Why should students create their own rubrics?
Reading or listening to a teacher's expectations is very different for a student than creating and accomplishing his or her own goals. The purpose of inviting students to develop their own evaluation structure is to improve their motivation, interest, and performance in the project. As students' overall participation in school increases, they are likely to excel in it.

How can students create their own rubrics?
Students are motivated intrinsically to design their own assessment tool after experiencing project-based learning. Once students have invested a significant amount of time, effort, and energy into a project, they naturally want to participate in deciding how it will be evaluated. The knowledge gained through experience in a particular field of study provides the foundation for creating a useful rubric.

Background
I decided to try out the possibility of student-created rubrics with my class when we did a project on bridges. The purpose of the project was for students to:
• learn basic physics concepts.
• apply fundamental mathematics principles.
• develop technical reading and writing skills.

My third-grade class began the Bridge Project by poring through books, handouts, magazine articles, Internet sites, and pictures of bridges. The class was divided into four work groups of five students each. Each group decided on their own "Company Name" as well as who would fill the following department head positions: project director, architect, carpenter, transportation chief, and accountant. All students were required to help out in every department. Each group received $1.5 million (hypothetically) to purchase land and supplies.

Rubric development
I created the preliminary outline by listing the learning outcomes that were to be emphasized in the project. The outcomes were then divided into suitable categories, and sample products were displayed and discussed. I proceeded to introduce the idea of the rubric to the students, who then generated many ideas for the rubric criteria. Students were asked to think about what parts of the design, construction, budget, and building journal were the most significant to the overall bridge quality. Together, the class came up with four different rubrics.

The budget rubric is provided as an example:

Budget Criteria

Legibility
• 4 (Excellent) – Completely legible.
• 3 (Good) – The budget shows two or three marks or stains, but is legible.
• 2 (Fair) – The budget is barely legible, with numerous marks or stains.
• 1 (Unacceptable) – The budget is messy and illegible.

Supplies & Materials Accountability
• 4 (Excellent) – Completely accounted for.
• 3 (Good) – Five-sixths of the materials and labor are accounted for.
• 2 (Fair) – Two-thirds of the materials and labor are accounted for.
• 1 (Unacceptable) – Materials and labor are not accounted for.

Ledger Activity
• 4 (Excellent) – All daily activities are recorded.
• 3 (Good) – Five-sixths of the daily balance of funds is indicated.
• 2 (Fair) – Two-thirds of the daily balance of funds is indicated.
• 1 (Unacceptable) – The daily balance of funds is nonexistent.

Ledger Balance
• 4 (Excellent) – The balance is completely accurate.
• 3 (Good) – The daily fund balance has two or three inaccuracies.
• 2 (Fair) – The daily fund record has more than three inaccuracies.
• 1 (Unacceptable) – The daily fund balance is inaccurate.

Summary
The experience students gain through an authentic project enables them to understand the various aspects necessary for creating a valuable piece of work. Knowledge that has deep meaning provides the basis for students to objectively judge their own work as well as that of others. Developing a rubric is a reflective process that extends the experience and the knowledge gained beyond simply turning in a project for a teacher-initiated grade.

RUBRIC EXAMPLES TO ANALYZE

Appendix A: Sample Holistic Rubric
• Always prepared and attends class
• Participates constructively in class
• Exhibits preparedness and punctuality in class/class work


Sample Oral Presentation Rubric
(Levels: Distinguished 4, Proficient 3, Apprentice 2, Novice 1)

SUITABILITY / match of material and presentation style to audience

Information (Including Explanation and Instruction):
• 4 – significantly increases audience understanding and knowledge of topic
• 3 – raises audience understanding and awareness of most points
• 2 – raises audience understanding and knowledge of some points
• 1 – fails to increase audience understanding or knowledge of topic

Persuasion:
• 4 – effectively convinces the audience to recognize the validity of a point of view
• 3 – point of view is clear, but development or support is inconclusive or incomplete
• 2 – point of view may be clear, but lacks development or support
• 1 – fails to effectively convince the audience

Entertainment:
• 4 – uses humor appropriately to make significant points about the topic, consistent with the interest of the audience
• 3 – achieves moderate success in using humor
• 2 – humor attempted but inconsistent or weak
• 1 – no use of humor, or humor used inappropriately

CONTENT

Purpose and Subject:
• 4 – purpose and subject are defined clearly; information and logic are self-consistent (a persuasive speech anticipates opposition and provides counterexample[s])
• 3 – has some success defining purpose and subject; information and logic are generally self-consistent
• 2 – attempts to define purpose and subject; has contradictory information and/or logic
• 1 – subject and purpose are not clearly defined; muddled information and/or logic

Quality of material:
• 4 – pertinent examples, facts, and/or statistics
• 3 – some examples, facts, and/or statistics that support the subject
• 2 – weak examples, facts, and/or statistics, which do not adequately support the subject
• 1 – very weak or no support of the subject through examples, facts, and/or statistics

Sufficiency:
• 4 – conclusions or ideas are supported by data or evidence
• 3 – includes some data or evidence which supports conclusions or ideas
• 2 – includes very thin data or evidence in support of ideas or conclusions
• 1 – totally insufficient support for ideas or conclusions

ORGANIZATION

Introduction:
• 4 – introduction has a strong purpose statement which captivates the audience and narrows the topic
• 3 – introductory statement informs the audience of the general purpose of the presentation
• 2 – introduction of the subject fails to make the audience aware of the purpose of the presentation
• 1 – no introductory statement, or an introductory statement which confuses the audience

Core:
• 4 – topic is narrowed, researched, and organized
• 3 – topic needs to be narrowed, research extended, and/or delivery tightened
• 2 – topic too broad, insufficiently researched, and/or haphazardly delivered
• 1 – topic is general, vague, and/or disorganized

Closing:
• 4 – audience informed, major ideas summarized, audience left with a full understanding of the presenter's position
• 3 – may need to refine summary or final idea
• 2 – major ideas may need to be summarized, or the audience is left with only a vague idea to remember
• 1 – major ideas left unclear, audience left with no new ideas

DELIVERY

Composure and Appearance:
• 4 – relaxed, self-confident, and appropriately dressed for purpose or audience
• 3 – quick recovery from minor mistakes; appropriately dressed
• 2 – some tension or indifference apparent, and possibly inappropriate dress for purpose or audience
• 1 – nervous, tension obvious, and/or inappropriately dressed for purpose or audience

Body Language:
• 4 – natural movement and descriptive gestures which display energy, create mood, and help the audience visualize
• 3 – movements and gestures generally enhance delivery
• 2 – insufficient movement and/or awkward gestures
• 1 – no movement or descriptive gestures

Eye Contact:
• 4 – builds trust and holds attention through direct eye contact with all parts of the audience
• 3 – fairly consistent use of direct eye contact with the audience
• 2 – occasional but unsustained eye contact with the audience
• 1 – no effort to make eye contact with the audience

Voice:
• 4 – fluctuation in volume and inflection helps to maintain audience interest and emphasize key points
• 3 – satisfactory variation of volume and inflection
• 2 – uneven volume with little or no inflection
• 1 – low volume and/or monotonous tone causes the audience to disengage

Pacing:
• 4 – good use of pause, giving sentences drama; length matches allotted time
• 3 – pattern of delivery generally successful; slight mismatch between length and allotted time
• 2 – uneven or inappropriate patterns of delivery, and/or length does not match allotted time
• 1 – delivery is either too rushed or too slow, and/or length does not match allotted time

Presentation Aids:
• 4 – are clear, appropriate, not overused, and beneficial to the speech
• 3 – are used and add some clarity and dimension to the speech
• 2 – attempted, but unclear, inappropriate, or overused
• 1 – none used or attempted

Rubric for Paper Based on an Interview

A
• engaging, creative, and thoughtful
• precise, vivid, and sophisticated vocabulary; varied patterns and lengths of sentences
• coherent and organized structure
• chosen form effectively and innovatively conveys content
• relevant and intriguing use of details to convey personality and experience of person interviewed
• few surface feature errors; only noticeable if looking for them

B
• clear and thoughtful
• complex, precise vocabulary and varied sentences
• logical organization
• chosen form effectively conveys content
• relevant and careful use of details to convey personality and experience of person interviewed
• few surface feature errors; occasional spelling or punctuation errors

C
• quite well developed and detailed
• generally precise vocabulary and complex sentence structures containing minimal errors
• obvious organization
• chosen form appropriate for content
• relevant use of details to convey personality and experience of person interviewed
• generally few surface feature errors; some punctuation, spelling, or pronoun reference errors

D
• direct and usually clear
• straightforward vocabulary and effective sentences that are rarely complex or varied
• organization evident
• chosen form generally appropriate for content
• competent use of details to convey personality and experience of person interviewed
• surface feature errors such as comma splices, spelling, or pronoun reference errors

REWRITE
• limited clarity and thought
• unsophisticated and, at times, inappropriate vocabulary with simple sentences
• evidence of some organization
• chosen form rarely conveys content effectively
• inconsistent use of details to convey personality and experience of person interviewed
• surface feature errors may at times distract reader

• message is clear, understandable, and thought-provoking

Collaboration Rubric
(Levels: Beginning 1, Developing 2, Accomplished 3, Exemplary 4)

Contribute

Research & Gather Information
• 1 – Does not collect any information that relates to the topic.
• 2 – Collects very little information--some relates to the topic.
• 3 – Collects some basic information--most relates to the topic.
• 4 – Collects a great deal of information--all relates to the topic.

Share Information
• 1 – Does not relay any information to teammates.
• 2 – Relays very little information--some relates to the topic.
• 3 – Relays some basic information--most relates to the topic.
• 4 – Relays a great deal of information--all relates to the topic.

Be Punctual
• 1 – Does not hand in any assignments.
• 2 – Hands in most assignments late.
• 3 – Hands in most assignments on time.
• 4 – Hands in all assignments on time.

Take Responsibility

Fulfill Team Role's Duties
• 1 – Does not perform any duties of assigned team role.
• 2 – Performs very few duties.
• 3 – Performs nearly all duties.
• 4 – Performs all duties of assigned team role.

Participate in Science Conference
• 1 – Does not speak during the science conference.
• 2 – Either gives too little information or gives information that is irrelevant to the topic.
• 3 – Offers some information--most is relevant.
• 4 – Offers a fair amount of important information--all is relevant.

Share Equally
• 1 – Always relies on others to do the work.
• 2 – Rarely does the assigned work--often needs reminding.
• 3 – Usually does the assigned work--rarely needs reminding.
• 4 – Always does the assigned work without having to be reminded.

Value Others' Viewpoints

Listen to Other Teammates
• 1 – Is always talking--never allows anyone else to speak.
• 2 – Usually does most of the talking--rarely allows others to speak.
• 3 – Listens, but sometimes talks too much.
• 4 – Listens and speaks a fair amount.

Cooperate with Teammates
• 1 – Usually argues with teammates.
• 2 – Sometimes argues.
• 3 – Rarely argues.
• 4 – Never argues with teammates.

Make Fair Decisions
• 1 – Usually wants to have things their way.
• 2 – Often sides with friends instead of considering all views.
• 3 – Usually considers all views.
• 4 – Always helps the team to reach a fair decision.


THE USE OF PORTFOLIO ASSESSMENT IN EVALUATION
Meg Sewell, Mary Marczak, & Melanie Horn

WHAT IS PORTFOLIO ASSESSMENT?
In program evaluation, as in other areas, a picture can be worth a thousand words. As an evaluation tool for community-based programs, we can think of a portfolio as a kind of scrapbook or photo album that records the progress and activities of the program and its participants, and showcases them to interested parties both within and outside of the program. While portfolio assessment has been used predominantly in educational settings to document the progress and achievements of individual children and adolescents, it has the potential to be a valuable tool for program assessment as well. Many programs do keep such albums or scrapbooks and use them informally as a means of conveying their pride in the program, but most do not consider using them in a systematic way as part of their formal program evaluation. However, the concepts and philosophy behind portfolios can apply to community evaluation, where portfolios can provide windows into community practices, procedures, and outcomes, perhaps better than more traditional measures.

Portfolio assessment has become widely used in educational settings as a way to examine and measure progress, by documenting the process of learning or change as it occurs. Portfolios extend beyond test scores to include substantive descriptions or examples of what the student is doing and experiencing. Fundamental to "authentic assessment" or "performance assessment" in educational theory is the principle that children and adolescents should demonstrate, rather than tell about, what they know and can do (Cole, Ryan, & Kick, 1995). Documenting progress toward higher-order goals, such as application of skills and synthesis of experience, requires obtaining information beyond what can be provided by standardized or norm-based tests. In "authentic assessment", information is collected from various sources, through multiple methods, and over multiple points in time (Shaklee, Barbour, Ambrose, & Hansford, 1997). Contents of portfolios (sometimes called "artifacts" or "evidence") can include drawings, photos, video or audio tapes, writing or other work samples, computer disks, and copies of standardized or program-specific tests. Data sources can include parents, staff, and other community members who know the participants or program, as well as the self-reflections of participants themselves. Portfolio assessment provides a practical strategy for systematically collecting and organizing such data.

PORTFOLIO ASSESSMENT IS MOST USEFUL FOR:
*Evaluating programs that have flexible or individualized goals or outcomes. For example, within a program with the general purpose of enhancing children's social skills, some individual children may need to become less aggressive while other, shy children may need to become more assertive. Each child's portfolio assessment would be geared to his or her individual needs and goals.
*Allowing individuals and programs in the community (those being evaluated) to be involved in their own change and decisions to change.
*Providing information that gives meaningful insight into behavior and related change. Because portfolio assessment emphasizes the process of change or growth at multiple points in time, it may be easier to see patterns.
*Providing a tool that can ensure communication and accountability to a range of audiences. Participants, their families, funders, and members of the community at large who may not have much sophistication in interpreting statistical data can often appreciate more visual or experiential "evidence" of success.
*Allowing for the possibility of assessing some of the more complex and important aspects of many constructs (rather than just the ones that are easiest to measure).

PORTFOLIO ASSESSMENT IS NOT AS USEFUL FOR:
*Evaluating programs that have very concrete, uniform goals or purposes. For example, it would be unnecessary to compile a portfolio of individualized "evidence" in a program whose sole purpose is full immunization of all children in a community by the age of five years. The required immunizations are the same, and the evidence is generally clear and straightforward.
*Allowing you to rank participants or programs in a quantitative or standardized way (although evaluators or program staff may be able to make subjective judgments of relative merit).
*Comparing participants or programs to standardized norms. While portfolios can (and often do) include some standardized test scores along with other kinds of "evidence", this is not the main purpose of the portfolio.

USING PORTFOLIO ASSESSMENT WITH THE STATE STRENGTHENING EVALUATION GUIDE

Tier 1 - Program Definition
Using portfolios can help you to document the needs and assets of the community of interest. Portfolios can also help you to clarify the identity of your program and allow you to document the "thinking" behind the program's development and throughout its life. Ideally, the process of deciding on criteria for the portfolio will flow directly from the program objectives that were established in designing the program. However, in a new or existing program where the original objectives are not as clearly defined as they need to be, program developers and staff may be able to clarify their own thinking by visualizing what successful outcomes would look like, and what they would accept as "evidence". Thus, thinking about portfolio criteria may contribute to clearer thinking and better definition of program objectives.

Tier 2 - Accountability
Critical to any form of assessment is accountability. In the educational arena, for example, teachers are accountable to themselves, their students, families, the schools, and society. The portfolio is an assessment practice that can inform all of these constituents. The process of selecting "evidence" for inclusion in portfolios involves ongoing dialogue and feedback between participants and service providers.

Tier 3 - Understanding and Refining
Portfolio assessment of the program or participants provides a means of conducting assessments throughout the life of the program, as the program addresses the evolving needs and assets of participants and of the community involved. This helps to maintain focus on the outcomes of the program and the steps necessary to meet them, while ensuring that the implementation is in line with the vision established in Tier 1.

Tier 4 - Progress Toward Outcomes
Items are selected for inclusion in the portfolio because they provide "evidence" of progress toward selected outcomes. Whether the outcomes selected are specific to individual participants or apply to entire communities, the portfolio documents steps toward achievement. Usually it is most helpful for this selection to take place at regular intervals, in the context of conferences or discussions among participants and staff.

Tier 5 - Program Impact
One of the greatest strengths of portfolio assessment in program evaluation may be its power as a tool to communicate program impact to those outside of the program. While this kind of data may not take the place of statistics about numbers served, costs, or test scores, many policy makers, funders, and community members find visual or descriptive evidence of the successes of individuals or programs to be very persuasive.

ADVANTAGES OF USING PORTFOLIO ASSESSMENT
*Allows the evaluators to see the student, group, or community as individual and unique, each with its own characteristics, needs, and strengths.
*Serves as a cross-section lens, providing a basis for future analysis and planning. By viewing the total pattern of the community or of individual participants, one can identify areas of strength and weakness, and barriers to success.
*Serves as a concrete vehicle for communication, providing ongoing exchanges of information among those involved.
*Promotes a shift in ownership; communities and participants can take an active role in examining where they have been and where they want to go.
*Offers the possibility of addressing shortcomings of traditional assessment, including assessing the more complex and important aspects of an area or topic.
*Covers a broad scope of knowledge and information, from many different people who know the program or person in different contexts (e.g., participants, parents, teachers or staff, peers, or community leaders).

DISADVANTAGES OF USING PORTFOLIO ASSESSMENT
* May be seen as less reliable or fair than more quantitative evaluations such as test scores.
* Can be very time-consuming for teachers or program staff to organize and evaluate the contents, especially if portfolios have to be done in addition to traditional testing and grading.
* Developing your own individualized criteria can be difficult or unfamiliar at first.
* If goals and criteria are not clear, the portfolio can become just a miscellaneous collection of artifacts that doesn't show patterns of growth or achievement.
* Like any other form of qualitative data, data from portfolio assessments can be difficult to analyze or aggregate to show change.

HOW TO USE PORTFOLIO ASSESSMENT
Design and Development
Three main factors guide the design and development of a portfolio: 1) purpose, 2) assessment criteria, and 3) evidence (Barton & Collins, 1997).
1) Purpose. The primary concern in getting started is knowing the purpose the portfolio will serve, since this decision defines the operational guidelines for collecting materials. For example, is the goal to use the portfolio as data to inform program development? To report progress? To identify special needs? For program accountability? For all of these?
2) Assessment Criteria. Once the purpose or goal of the portfolio is clear, decisions are made about what will be considered success (criteria or standards) and what strategies are necessary to meet the goals. Items are then selected for inclusion in the portfolio because they provide evidence of meeting criteria or making progress toward goals.

3) Evidence. In collecting data, many things need to be considered. What sources of evidence should be used? How much evidence do we need to make good decisions and determinations? How often should we collect evidence? How congruent should the sources of evidence be? How can we make sense of the evidence that is collected? How should evidence be used to modify the program and the evaluation? According to Barton and Collins (1997), evidence can include artifacts (items produced in the normal course of classroom or program activities), reproductions (documentation of interviews or projects done outside of the classroom or program), attestations (statements and observations by staff or others about the participant), and productions (items prepared especially for the portfolio, such as participant reflections on their learning or choices). Each item is selected because it adds some new information related to attainment of the goals.

Steps of Portfolio Assessment
Although many variations of portfolio assessment are in use, most fall into two basic types: process portfolios and product portfolios (Cole, Ryan, & Kick, 1995). These are not the only kinds of portfolios in use, nor are they pure types clearly distinct from each other. It may be more helpful to think of them as two steps in the portfolio assessment process, as the participant(s) and staff reflectively select items from their process portfolios for inclusion in the product portfolio.

Step 1: The first step is to develop a process portfolio, which documents growth over time toward a goal. Documentation includes statements of the end goals, criteria, and plans for the future. This should include baseline information, or items describing the participant's performance or mastery level at the beginning of the program. Other items are "works in progress," selected at many interim points to demonstrate steps toward mastery. At this stage, the portfolio is a formative evaluation tool, probably most useful for the internal information of the participant(s) and staff as they plan for the future.

Step 2: The next step is to develop a product portfolio (also known as a "best pieces" portfolio), which includes examples of the best efforts of a participant, community, or program, along with "final evidence," or items which demonstrate attainment of the end goals. Product or "best pieces" portfolios encourage reflection about change or learning. The program participants, either individually or in groups, are involved in selecting the content, the criteria for selection, the criteria for judging merit, and the "evidence" that the criteria have been met (Winograd & Jones, 1992). For individuals and communities alike, this provides opportunities for a sense of ownership and strength, and helps to showcase or communicate the accomplishments of the person or program. At this stage, the portfolio is an example of summative evaluation, and may be particularly useful as a public relations tool.

Distinguishing Characteristics
Certain characteristics are essential to the development of any type of portfolio used for assessment.
According to Barton and Collins (1997), portfolios should be:
1) Multisourced (allowing for the opportunity to evaluate a variety of specific evidence). Multiple data sources include both people (statements and observations of participants, teachers or program staff, parents, and community members) and artifacts (anything from test scores to photos, drawings, journals, and audio or videotapes of performances).
2) Authentic (context and evidence are directly linked). The items selected or produced as evidence should be related to program activities, as well as the goals and criteria. If the portfolio is assessing the effect of a program on participants or communities, then the "evidence" should reflect the activities of the program rather than skills that were gained elsewhere. For example, if a child's musical performance skills were gained through private piano lessons, not through 4-H activities, an audio tape would be irrelevant in his 4-H portfolio. If a 4-H activity involved the same child in teaching other children to play, a tape might be relevant.
3) Dynamic (capturing growth and change). An important feature of portfolio assessment is that data or evidence is added at many points in time, not just as "before and after" measures. Rather than including only the best work, the portfolio should include examples of different stages of mastery. At least some of the items are self-selected. This allows a much richer understanding of the process of change.

4) Explicit (purpose and goals are clearly defined). The students or program participants should know in advance what is expected of them, so that they can take responsibility for developing their evidence.
5) Integrated (evidence establishes a correspondence between program activities and life experiences). Participants should be asked to demonstrate how they can apply their skills or knowledge to real-life situations.
6) Based on ownership (the participant helps determine the evidence to include and the goals to be met). The portfolio assessment process should require that participants engage in some reflection and self-evaluation as they select the evidence to include and set or modify their goals. They are not simply being evaluated or graded by others.
7) Multipurposed (allowing assessment of the effectiveness of the program while assessing the performance of the participant). A well-designed portfolio assessment process evaluates the effectiveness of your intervention at the same time that it evaluates the growth of individuals or communities. It also serves as a communication tool when shared with family, other staff, or community members. In school settings, it can be passed on to other teachers or staff as a child moves from one grade level to another.

Analyzing and Reporting Data
As with any qualitative assessment method, analysis of portfolio data can pose challenges. Methods of analysis will vary depending on the purpose of the portfolio and the types of data collected (Patton, 1990). However, if goals and criteria have been clearly defined, the "evidence" in the portfolio makes it relatively easy to demonstrate that the individual or population has moved from a baseline level of performance to achievement of particular goals. It should also be possible to report some aggregated or comparative results, even if participants have individualized goals within a program. For example, in a teen peer tutoring program, you might report that "X% of participants met or exceeded two or more of their personal goals within this time frame," even if one teen's primary goal was to gain public speaking skills and another's main goal was to raise his grade point average by mastering study skills. Comparing across programs, you might be able to say that the participants in Town X mastered an average of 4 new skills in the course of six months, while those in Town Y mastered only 2, and speculate that lower attendance rates in Town Y could account for the difference.

Subjectivity of judgments is often cited as a concern in this type of assessment (Bateson, 1994). However, in educational settings, teachers or staff using portfolio assessment often choose to periodically compare notes by independently rating the same portfolio to see if they agree on scoring (Barton & Collins, 1997). This provides a simple check on reliability, and can be reported very simply. For example, a local programmer could say, "To ensure some consistency in assessment standards, every 5th portfolio (or 20%) was assessed by more than one staff member. Agreement between raters, or inter-rater reliability, was 88%."
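The inter-rater check described above is just percent agreement: two staff members independently score the same sample of portfolios, and you count how often their scores match. A minimal sketch of the calculation (the rating scale and scores here are hypothetical, not from the text):

```python
# Percent agreement between two raters who independently scored
# the same sample of portfolios (hypothetical 1-4 rating scale).

def percent_agreement(rater_a, rater_b):
    """Return the share of portfolios on which both raters gave the same score."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must score the same non-empty sample.")
    matches = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
    return matches / len(rater_a)

# Hypothetical double-scored sample: every 5th portfolio from a set of 25.
rater_a = [3, 4, 2, 3, 4]
rater_b = [3, 4, 2, 2, 4]  # raters disagree on one portfolio
print(f"Inter-rater agreement: {percent_agreement(rater_a, rater_b):.0%}")
# prints "Inter-rater agreement: 80%"
```

Simple percent agreement is easy to report, but note that it does not correct for agreement expected by chance; chance-corrected statistics such as Cohen's kappa address that concern.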

There are many books and articles that address the problems of analyzing and reporting on qualitative data in more depth than can be covered here. The basic issues of reliability, validity and generalizability are relevant even when using qualitative methods, and various strategies have been developed to address them. Those who are considering using portfolio assessment in evaluation are encouraged to refer to some of the sources listed below for more in-depth information.

ANNOTATED BIBLIOGRAPHY
Barton, J., & Collins, A. (Eds.) (1997). Portfolio assessment: A handbook for educators. Menlo Park, CA: Addison-Wesley Publishing Co.
A book about portfolio assessment written by and for teachers. The main goal is to give practical suggestions for creating portfolios that meet the unique needs and purposes of any classroom. The book includes information about designing portfolios, essential steps to make portfolios work, actual cases of portfolios in action, a compendium of portfolio implementation tips that save time and trouble, how to use portfolios to assess both teacher and student performance, and a summary of practical issues of portfolio development and implementation. This book is very clear, easy to follow, and can easily serve as a bridge between the use of portfolios in the classroom and the application of portfolios in community evaluations.

Bateson, D. (1994). Psychometric and philosophic problems in "authentic" assessment: Performance tasks and portfolios. Alberta Journal of Educational Research, 40(2), 233-245.

Considers issues of reliability and validity in assessment, which are as important in "authentic assessment" methods as in more traditional methods. Care needs to be exercised so that these increasingly popular new methods are not perceived as unfair or invalid.

Cole, D. J., Ryan, C. W., & Kick, F. (1995). Portfolios across the curriculum and beyond. Thousand Oaks, CA: Corwin Press.
The authors discuss the development of authentic assessment and how it has led to portfolio usage. Guidelines are given for planning portfolios, how to use them, selection of portfolio contents, reporting strategies, and use of portfolios in the classroom. In addition, a chapter focuses on the development of a professional portfolio.

Courts, P. L., & McInerny, K. H. (1993). Assessment in higher education: Politics, pedagogy, and portfolios. London: Praeger.
The authors describe a project using portfolios to train teachers to assess exceptional potential in underserved populations. The portfolio includes observations of the children's behavior in the school, home, and community. The underlying assumption of the project is that teachers learn to recognize exceptional potential if they are provided with authentic examples of such behavior. Results indicated that participating teachers experienced a sense of empowerment as a consequence of the project and became both involved in and committed to the project.

Glasgow, N. A. (1997). New curriculum for new times: A guide to student-centered, problem-based learning. Thousand Oaks, CA: Corwin Press.
This book attempts to identify and define current practices and present alternatives that can better meet the needs of a wider range of students in facilitating literacy and readiness for life outside the classroom. Discussion centers on the current curriculum and the need for instruction that meets the changing educational context. Included is information about portfolio assessment, design, and implementation, as well as examples of a new curricular style that promotes flexible and individualistic instruction.

Maurer, R. E. (1996). Designing alternative assessments for interdisciplinary curriculum in middle and secondary schools. Boston: Allyn and Bacon.
This book explains how to design an assessment system that can authentically evaluate students' progress in an interdisciplinary curriculum. It offers step-by-step procedures, checklists, tables, charts, graphs, guides, worksheets, and examples of successful assessment methods. Specific to portfolio assessment, this book shows how portfolios can be used to measure learning, and provides some information on types and development of portfolios.

Patton, M. Q. (1990). Qualitative evaluation and research methods (2nd ed.). Newbury Park, CA: Sage.
A good general reference on issues of qualitative methods, and strategies for analysis and interpretation of qualitative data.

Shaklee, B. D., Barbour, N. E., Ambrose, R., & Hansford, S. J. (1997). Designing and using portfolios. Boston: Allyn and Bacon.
Discusses the history of portfolio assessment; decisions that need to be made before beginning the portfolio assessment process (e.g., what it will look like, who should be involved, what should be assessed, how the assessment will be accomplished); designing a portfolio system (e.g., criteria and standards); using portfolio results in planning; and issues related to assessment practices (e.g., accountability).

Shaklee, B. D., & Viechnicki, K. J. (1995). A qualitative approach to portfolios: The Early Assessment for Exceptional Potential Model. Journal for the Education of the Gifted, 18(2), 156-170.
The authors examine the creation of a portfolio assessment model based on qualitative research principles. Portfolio framework assumptions for classrooms are: designing authentic learning opportunities; interaction of assessment, curriculum, and instruction; multiple criteria derived from multiple sources; and systematic teacher preparation. Additionally, the authors examine the qualitative research procedures embedded in the development of the Early Assessment for Exceptional Potential model. Preliminary results for credibility, transferability, dependability, and confirmability of the design are provided.

Winograd, P., & Jones, D. L. (1992). The use of portfolios in performance assessment. New Directions for Educational Reform, 1(2), 37-50.
The authors examine the use of portfolios in performance assessment. Suggestions are offered to educators interested in using portfolios to help students become better readers and writers. Addresses concerns related to the usefulness of portfolios; educators need support in learning how to use them, including their design, management, and interpretation.


Conclusion: Making tests is much more time-consuming than expected, and it is not easy either. Assessing and testing are tools to know how our students are doing at a given moment; whether it is during the course or at the end of a long year of teaching, tests are important. I believe that properly made tests can help relieve students of the usual fear they have of tests. Students often think that teachers evaluate things that are unrelated or that they didn't teach; sadly, this is because it has happened before. As teachers, our responsibility is to make sure tests reflect our teaching and the content we have given. We also have to encourage students to be confident during tests and to always expect the best. I too have had teachers who "took revenge" during finals, but these are things that must end. As teachers, we must professionally deliver our knowledge and make sure it is correctly and fairly tested.

Evaluation Portfolio  

Portfolio homework for the Evaluation and Assessment course at Mariano Galvez
