
'Testing a Test' – Evaluating Our Assessment Tools
Eddy White, Ph.D.

Assessment Coordinator
Center for English as a Second Language
University of Arizona
Targets

1. My background
2. Classroom-based assessment
3. Tests – purposes/functions
4. The 'cardinal criteria' for evaluating a test
5. Conclusions



(1994–2009)
Tokyo Woman's Christian University

Classroom-based Assessment
• Assessment of Learning
• Assessment for Learning



Vancouver






The goal of assessment is to . . .



The goal of assessment has to be, above all, to support the improvement of learning and teaching.
(Frederiksen & Collins, 1989)



Definition: Classroom Assessment
(cycle diagram: Planning, Reporting, Assessment, Analyzing, Collecting)



ESL Assessment – Purposes
• identify strengths and weaknesses of individual students,
• adjust instruction to build on students' strengths and alleviate weaknesses,
• monitor the effectiveness of instruction,
• provide feedback to students (sponsors, parents, etc.), and
• make decisions about the advancement of students to the next level of the program.
(Source: ESL Senior High Guide to Implementation, 2002)



Consider
• Research suggests that teachers spend from one-quarter to one-third of their professional time on assessment-related activities.
• Almost all do so without the benefit of having learned the principles of sound assessment.
(Stiggins, 2007)



Teachers learn how to teach without learning much about how to assess.
(Heritage, 2007)



Assessment literacy
• the kinds of assessment know-how and understanding that teachers need to assess their students effectively
• Assessment-literate educators should have knowledge and skills related to the basic principles of quality assessment practices
(SERVE Center, University of North Carolina, 2004)



Assessment Literacy
Know-how and understanding teachers need to assess students effectively and maximize learning



Importance of classroom assessment
• We may not like it, but students can and do ignore our teaching;
• however, if they want to get a qualification, they have to participate in the assessment processes we design and implement.
(Brown, S. 2004. Assessment for learning. Learning and Teaching in Higher Education, 1, 81–89)






Who are the assessment 'deciders' at your institution?



Classroom-Based Assessment: Challenges, Choices, and Consequences


Assessment Frameworks



Assessment framework
• the series of assessment tools (exams, tasks, projects, etc.) that are scored and used to arrive at a summative grade for a course
• it should be skills-based and knowledge-based (i.e. Ss demonstrate what they know about and can do with English)
• based on learning outcomes



• The spirit and style of student assessment defines the de facto curriculum.
(Rowntree, 1987)

de facto = existing in fact, actual, whether intended or not






Quiz time!



Assessing an English articles quiz
Context
• Conversation class (listening & speaking)
• high-beginner level



What is a fundamental problem with this quiz?



Answer






What is a test?



A test . . .
• is a method of measuring a person's ability, knowledge, or performance in a given domain.
• is an instrument – a set of techniques, procedures, or items – that requires performance on the part of the test-taker.



Tests – measuring function



A test must measure
• Some tests measure general ability, while others focus on very specific competencies or objectives.
• Examples:
  • A multi-skill proficiency test measures general ability;
  • a quiz on recognizing correct use of definite articles measures very specific knowledge.



• A test measures performance, . . .
• but the results imply the test-taker's ability, or competence.



• Performance-based tests sample the test-taker's actual use of language,
• but from those samples the test administrator infers general competence.



• A well-constructed test is an instrument that provides an accurate measure of a test-taker's ability within a particular domain.
• Constructing a good test is a complex task.



Your assessment practices?



Think about what is happening in your context and your assessment practices.



Your assessment practices?
• True–False Item
• Multiple Choice
• Completion
• Short Answer
• Essay
• Practical Exam
• Papers/Reports
• Projects
• Questionnaires
• Presentations
• Inventories
• Checklists
• Peer Rating
• Self Rating
• Journals
• Portfolios
• Observations
• Discussions
• Interviews



For you, which of the four skills are more/less challenging to test?






Quiz time!



2010



• Exploring how principles of language assessment can and should be applied to formal tests.
• These principles apply to assessment of all kinds.
• How to use these principles to design a good test.



• What are the 'five cardinal criteria' that can be used to design and evaluate all types of assessment?



Q. How do you know if a test is effective, appropriate, useful, or, in down-to-earth terms, a "good" test?



Five key assessment principles?
• Discuss
• 3 minutes
• Hint (five nouns)



Five key assessment principles
• Practicality
• Reliability
• Validity
• Authenticity
• Washback






Key Assessment Principles



• These questions provide excellent criteria to evaluate the tests we design and use.






1. Practicality
• Is the procedure relatively easy to administer?



Practicality considerations
• the logistical and administrative issues involved in making, giving, and scoring an assessment instrument
• the amount of time it takes to construct and administer
• the ease of scoring
• ease of interpreting/reporting the results



An effective test is practical. This means that it:
• is not excessively expensive
• stays within appropriate time constraints
• is relatively easy to administer, and
• has a scoring/evaluation procedure that is specific and time-efficient



The value and quality of a test sometimes hinge on such nitty-gritty practical considerations.



• In classroom-based testing, _________ is almost always a crucial practical factor for busy teachers.






2. Reliability
• Is all work being consistently marked to the same standard?



• A reliable test is consistent and dependable.
• If you give the same test to the same student or matched students on two different occasions, the test should yield similar results.
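The test-retest idea can be made concrete with a quick calculation. This is an illustrative sketch only (the scores and function name are hypothetical, not from the presentation): correlating the scores from two administrations of the same test gives a rough numerical sense of consistency.

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient between two lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical scores of the same five students on two occasions.
first_sitting = [72, 85, 90, 60, 78]
second_sitting = [70, 88, 91, 58, 80]

# A consistent, dependable test should yield a high positive correlation.
print(round(pearson_r(first_sitting, second_sitting), 2))  # → 0.99
```

A single coefficient is only one piece of evidence, but it shows what "similar results on two different occasions" means numerically.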



What factors contribute to the unreliability of a test?



Test unreliability – contributing factors
• Student-related reliability
• Rater reliability (inter, intra)
• Test administration reliability
• Test reliability



Q. What is one key way to increase reliability?

A. Use rubrics



• Rubrics are scoring guidelines.
• They provide a way to make judgments fair and sound when assessing performance.
• A uniform set of precisely defined criteria or guidelines is set forth to judge student work.
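A rubric can be thought of as a small data structure: a fixed set of criteria, each scored on the same band scale, so that every rater judges student work against identical standards. The sketch below is purely illustrative (the criteria, bands, and function name are hypothetical, not from the presentation):

```python
# Hypothetical rubric: three criteria, each rated on the same 1-4 band.
RUBRIC = {
    "content": "ideas are relevant and developed",
    "organization": "clear structure with logical flow",
    "language": "accurate grammar and vocabulary",
}
BANDS = (1, 2, 3, 4)  # 1 = beginning ... 4 = exemplary

def total_score(ratings):
    """Sum a rater's band scores, rejecting ratings that skip a
    criterion or fall outside the agreed bands."""
    if set(ratings) != set(RUBRIC):
        raise ValueError("every rubric criterion must be rated")
    if any(band not in BANDS for band in ratings.values()):
        raise ValueError("band score out of range")
    return sum(ratings.values())

print(total_score({"content": 3, "organization": 4, "language": 3}))  # → 10
```

Because every rater must rate the same criteria on the same bands, scores from different raters become directly comparable, which is how rubrics support reliability.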






3. Validity
• Does the assessment measure what we really want to measure?
– most complex criterion
– most important principle


Validity – definition
• 'The extent to which inferences made from assessment results are appropriate, meaningful, and useful in terms of the purpose of the assessment.'
(Gronlund, 1998, p. 226)



• A valid test of reading ability . . . actually measures reading ability –
  • not math skills
  • or previous knowledge in a subject
  • nor writing skills
  • nor some other variable of questionable relevance



How is the validity of a test established?
1. Content validity
2. Face validity



Content validity
• If a test requires the test-taker to perform the behavior that is being measured . . .
• it can claim content-related evidence of validity (content validity)
• e.g. A test of a person's ability to speak an L2 requires the student to actually speak within some sort of authentic context.
• A test with paper-and-pencil multiple-choice questions requiring grammatical judgments does not achieve content validity.



Another way of understanding content validity is to consider the difference between direct and indirect testing.
• direct testing – involves the test-taker in actually performing the target task
• indirect testing – students not performing the task itself, but a related task (e.g. testing oral production of syllable stress)



To achieve content validity in classroom assessment, try to test performance directly.






How is the validity of a test established?
1. Content validity
2. Face validity



Face validity
• The extent to which students view the assessment as:
  1. fair
  2. relevant
  3. useful for improving learning
• Face validity refers to the degree to which a test looks right, and appears to measure the knowledge or abilities it claims to measure.



High face validity: the test . . .
• is well-constructed, with an expected format and familiar tasks
• is clearly doable within the allotted time
• has items that are clear and uncomplicated
• has directions that are crystal clear
• has tasks related to course work (content validity)
• has a difficulty level that presents a reasonable challenge



• Validity is the most significant cardinal principle of assessment evaluation.
• If validity is not established, all other considerations may be rendered useless.






4. Authenticity
• Are students asked to perform real-world tasks?



Test task authenticity
• tasks represent, or closely approximate, real-world tasks
• the task is likely to be enacted in the "real world"
• not contrived or artificial



Authenticity checklist
• Is the language in the test as natural as possible?
• Are topics as contextualized as possible rather than isolated?
• Are topics and situations interesting, enjoyable, and/or humorous?
• Is some thematic organization provided, such as through a story line or episode?
• Do tasks represent, or closely approximate, real-world tasks?






5. Washback
• Does the assessment have positive effects on learning and teaching?



Washback = the effect of testing on teaching and learning
– positive washback
– negative washback



Washback
• Classroom assessment: the effects of an assessment on teaching and learning prior to the assessment itself (preparation)
• Another form of washback = the information that 'washes back' to students in the form of useful diagnoses of strengths and weaknesses.
• Formal tests provide no washback if students receive a simple letter grade or single overall numerical score.



A test that provides beneficial washback . . .
• positively influences what and how teachers teach
• positively influences what and how students learn
• offers learners a chance to adequately prepare
• gives learners feedback that enhances their language development
• provides conditions for peak performance by the learner



Teachers' challenge
• to create classroom tests that serve as learning tools through which washback is achieved










Q. How do you know if a test is effective, appropriate, useful, or, in down-to-earth terms, a "good" test?



Answer. A 'good' test:
• can be given within appropriate administrative constraints,
• is dependable,
• accurately measures what you want it to measure,
• uses language representative of real-world language use, and
• provides information that is useful for the learner.



• These principles will help you make accurate judgments about the English competence of your students.
• They provide useful guidelines for evaluating existing tests, and designing our own.



Assessment Literacy
Know-how and understanding teachers need to assess students effectively and maximize learning



• There is no getting away from the fact that most of the things that go wrong with assessment are our fault,
• the result of poor assessment design – and not the fault of our students.
(Race et al., 2005)



• Improving student learning implies improving the assessment system.
• Teachers often assume that it is their teaching that directs student learning.
• In practice, assessment directs student learning, because it is the assessment system that defines what is worth learning.
(Havnes, 2004, p. 1)



Rethinking Assessment in Higher Education (Boud & Falchikov, 2007)
• There is substantial evidence that assessment, rather than teaching, has the major influence on students' learning.
• It directs attention to what is important, acts as an incentive for study, and has a powerful effect on students' approaches to their work.



"We owe it to ourselves and our students to devote at least as much energy to ensuring that our assessment practices are worthwhile as we do to ensuring that we teach well."
Dr. David Boud
University of Technology, Sydney, Australia



Thank you for your time and participation. Best wishes with your assessment practices.

The End


