PREFACE
This manual is designed as a companion to the textbook Psychological Testing: History, Principles, and Applications, Sixth Edition, by Robert J. Gregory. The 24 topics in this manual correspond to the 24 topics in the textbook. For each topic, the instructor will find the following:
Classroom Discussion Questions
Extramural Assignments
Classroom Demonstrations
Essay Questions
Test Item File
This manual serves two purposes. First, the instructor will find numerous and diverse approaches to improving the quality of a course on psychological testing. For example, the classroom discussion questions will encourage students to think about difficult and controversial issues in psychological testing. The classroom demonstrations are designed to liven up the class periods. Also, the extramural assignments will help broaden the students’ grasp of psychological and psychometric concepts.
The second purpose of the manual is to provide the instructor with ready-made multiple choice and essay questions. In all, the manual incorporates more than 1,000 multiple choice questions, plus dozens of suggested essay questions. Every single question was written and/or reviewed by the author, so the textbook intentions are well represented in the pedagogy of this manual.
Teaching is a complex and demanding task. I hope that the modest resources provided here will help the instructor with this difficult but rewarding endeavor.
Robert J. Gregory
Topic 1A
The Nature and Uses of Psychological Tests
The Consequences of Testing
Definition of a Test
Case Exhibit 1.1: True-Life Vignettes of Testing
Further Distinctions in Testing
Types of Tests
Uses of Testing
Factors Influencing the Soundness of Testing
Standardized Procedures in Test Administration
Desirable Procedures of Test Administration
Influence of the Examiner
Background and Motivation of the Examinee
Summary
Key Terms and Concepts
Classroom Discussion Questions
1. An interesting way to generate classroom discussion on the nature and definition of a test is to bring in one or more quasi-tests that can be found in any bookstore. For example, the Luscher Color Test or variations thereof can be found in most bookstores. After describing and demonstrating these tests, ask students to discuss whether they meet the criteria of a psychological test.
2. It is usually possible to create a lively debate by asking students who should have access to psychological tests. For example, should anyone be able to purchase a copy of the Wechsler Adult Intelligence Scale-III? Should a high school teacher who has taken a course on individual intelligence tests be allowed to administer the WAIS-III?
3. An interesting discussion question is whether school-based testing (e.g., in high school) should be norm-referenced (e.g., who is at the 99th percentile?) or criterion-referenced (e.g., can each student reach a specific skill level in each subject matter?). The purposes of testing and the nature of a just society usually emerge from this kind of discussion.
4. A useful way to begin Topic 1A is by asking students to catalogue the numerous ways in which test results can be swayed by extraneous factors. That is, other than the variable being measured, what other factors can cause test scores to be artificially high or low? It is especially helpful to have students provide specific examples.
5. Sensitivity to disabilities is another useful discussion topic. What kinds of disabilities might examinees possess? How might examiners recognize these disabilities? What adjustments are appropriate in response to a disability?
6. A good broad-based question for Topic 1A is to ask the class to brainstorm as many different applications of psychological testing as possible. After the initial round of discussion, it may be helpful to list the main types of psychological tests (e.g., intelligence, creativity, personality, neuropsychological) to generate further responses. This exercise should provide a relevant introduction to, and appreciation for, the nature and uses of psychological testing.
Extramural Assignments
1. A challenging assignment is to ask students to invent a test. Once they have selected a construct for measurement, they can be challenged to devise items and develop the test throughout the semester, paying special attention to the concepts introduced in the first few chapters of the text.
2. Students may gain insight into the ethics of testing if they are encouraged to poll others about the extent of cheating on standardized tests. It would be interesting to ask other students (anonymously, of course) to recount instances in which they or others cheated on any kind of standardized test, whether group or individual. Students could produce a brief catalog of these instances, discussing the likely effect on test validity, etc.
3. Tests come in an amazing variety of types and purposes. Ask students to review the latest editions of the journals listed at the end of Topic 1A and write brief descriptions of new tests. Offer a small prize (e.g., extra credit) to the student who finds the most unusual or peculiar test.
4. Ask students to track down biographies, autobiographies, and journal articles about persons who were misdiagnosed by psychological tests because of unrecognized handicaps. The students could write a short synopsis or present a brief oral report to the class.
5. Divide the class into two to four groups and have each group collect data on a digit span task under different conditions. For example, the rate of presentation might be the independent variable, with different groups presenting at 0.5, 1.0, 1.5, and 2.0 seconds between digits. The groups could bring their data back to class and compare the effects of varying the presentation methods.
Classroom Demonstrations
1. The importance of standardized procedure is a topic worthy of demonstration. An easy way to approach this issue is to describe or demonstrate an existing test or subtest, and then ask students to describe the probable effects of variations from standardized procedure. Digit span tests are especially useful in this regard. In addition to discussing the effects of nonstandard procedure, the instructor can demonstrate the effects. For example, students can be asked to write down orally presented digit sequences under various conditions: rapid reading (more than one digit per second), background noise (e.g., have a student cough several times during the presentation), and meaningful sequences (e.g., 1-800-325-3535-1492-1776). By tallying class averages for these various conditions, the students can see the value of standardized procedures.
2. The textbook outlines eight different kinds of tests. For some of the tests in each category, it would be possible to demonstrate sample items. Instructors need to be sensitive to their own responsibilities, but it is usually possible to demonstrate tests without breaching test security. For example, college students can be shown sample items from earlier editions of intelligence tests with no harm; MMPI items can be read to show students the range of item types; the structure of interest inventories can be discussed without invalidating them; and so on.
3. This would be a good time to bring out the Mental Measurements Yearbooks and the Test Critiques volumes and circulate them in class. Also, The Journal of Psychoeducational Assessment and The Journal of Clinical Psychology are useful journals for demonstrating the kinds of research that new tests engender.
4. Subjective judgment in scoring can be demonstrated in class by reading students the criteria for a vocabulary item on an outdated test (e.g., the WISC) and then asking students to rate various responses as 0, 1, or 2. Although there will usually be a high level of agreement, certain responses will prove difficult to score, with the result that ratings vary widely.
5. The importance of rapport can be demonstrated through role playing. The instructor can “test” several students with a hypothetical examination. By alternating demeanor between friendly and harsh, the effect of rapport can be demonstrated quite effectively.
6. Divide the class in half. Have both groups develop a simple and benign test that they can administer in class (e.g., how many times they can flip a coin in 30 seconds, or how many fairy tales they can name in a minute). Once they have created a test, instruct them to test each member in their group and record scores to create a standardization sample. Then have each group administer their test to members of the opposite group and record where each individual falls in relation to the standardization sample. They can chart the results on the blackboard. This is an interactive way to familiarize students with the basic features of norm-referenced tests.
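For demonstration 6, instructors who want a quick way to locate each examinee within the norm group could use a percentile rank. The sketch below is only one possible approach, not a procedure taken from the textbook: the function name `percentile_rank`, the half-credit treatment of tied scores, and the sample coin-flip counts are all assumptions made for illustration.

```python
def percentile_rank(score: float, norm_scores: list[float]) -> float:
    """Percentage of the norm sample scoring below `score`.

    Tied scores count as half, which is one common convention
    (an assumption here, not the textbook's stated method).
    """
    if not norm_scores:
        raise ValueError("norm sample is empty")
    below = sum(1 for s in norm_scores if s < score)
    ties = sum(1 for s in norm_scores if s == score)
    return 100.0 * (below + 0.5 * ties) / len(norm_scores)


# Hypothetical coin-flip counts recorded by one group (illustrative data only)
norm_sample = [12, 15, 15, 18, 20, 22, 25, 25, 27, 30]

# Where does an outside examinee with 22 flips fall in this norm group?
print(percentile_rank(22, norm_sample))
```

With a class-sized norm group the resulting ranks are coarse, which is itself a useful talking point about the size and representativeness of standardization samples.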
Essay Questions
1. Outline the characteristics of a test. Using a hypothetical test of your own invention, provide evidence that it is truly a test as defined in the textbook.
2. Discuss the potential uses of psychological tests.
3. Define and differentiate norm-referenced testing and criterion-referenced testing.
4. What is behavioral assessment? Cite a new example of a behavioral assessment procedure.
5. Define test anxiety and summarize the research findings with respect to its correlates.
6. Describe desirable procedures for the administration of group tests.
7. Describe how a correction for guessing can be used in test scoring. For example, with a 50-item multiple choice test that has four options per question, what should be the corrected score for an individual who answered 35 items correctly, answered 9 items incorrectly, and left 6 items blank? Clarify your answer.
8. Name two mild disabilities that are frequently overlooked in testing. Identify some signs that will help the examiner detect these impairments and describe adjustments they should make when testing these individuals.
9. Why is comprehensive training in test administration critical to proper utilization of tests, and how has this been shown to be insufficient in past studies?
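When grading essay question 7, it may help to have the arithmetic at hand. The sketch below assumes the conventional correction-for-guessing formula, corrected score = R - W / (k - 1), where R is the number right, W the number wrong, and k the number of options per item; omitted items are ignored. The function name `corrected_score` is hypothetical, and instructors should confirm that the formula matches the one presented in the textbook.

```python
from fractions import Fraction


def corrected_score(right: int, wrong: int, options: int) -> Fraction:
    """Correction for guessing: right - wrong / (options - 1).

    Blank items neither add to nor subtract from the score.
    Assumes the conventional formula; verify against the textbook.
    """
    if options < 2:
        raise ValueError("each item needs at least two options")
    return right - Fraction(wrong, options - 1)


# Essay question 7: 35 right, 9 wrong, 6 blank, four options per item
print(corrected_score(35, 9, 4))  # 35 - 9/3 = 32
```

The reasoning students should supply is that an examinee guessing blindly among four options will be wrong three times for every lucky hit, so one point is removed for every three wrong answers.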
Topic 1A: The Nature and Uses of Psychological Tests
1. The ______ test is a multivariate assessment of heart rate, respiration, muscle tone, reflex irritability, and color in newborns.
a. Reflex
b. Newborn
*c. Apgar
d. Alpha
2. A specialist in psychology or education who develops and evaluates psychological tests:
a. clinician
*b. psychometrician
c. psychometrist
d. counselor
3. Which of the following could be a test, according to the definition offered in the textbook?
a. a checklist for rating the social skills of an intellectually disabled youth
b. a non-timed measure of mastery in adding pairs of three-digit numbers
c. a microcomputer appraisal of reaction time
*d. all of the above
4. Which of the following is NOT a typical characteristic of psychological tests?
*a. standardization to a mean of 100
b. sampling of behavior
c. description of behavior with categories or scores
d. use of norms to predict other behaviors
5. Tests that use a well-defined population of persons for their interpretive framework are referred to as:
a. criterion-referenced
b. population-referenced
c. standard-referenced
*d. norm-referenced
6. Criterion-referenced tests
*a. measure what a person can do
b. compare results to the performance levels of others
c. are passed by everyone
d. all of the above
7. Uniformity of administration procedures is the definition of
a. criterion-referencing
*b. standardization
c. norm-referencing
d. reliability
8. Which of the following is an essential step in the standardization of a test?
a. use of identical stimuli with all examinees
b. precise specification of oral instructions for subtests
c. advice to the examiner as to how to handle queries from the examinee
*d. all of the above
9. What is the most reliable source of directions and instructions for administering a specific psychological test?
a. the American Psychological Association’s volume on Testing and Assessment
b. continuing education seminars in psychological testing
*c. the instruction manual that typically accompanies a test
d. a credentialed psychologist
10. Why are tests merely a sample of behavior?
*a. so that the time required for testing is not excessive
b. a sample is as good as the totality of behaviors
c. so that the examiner’s influence is minimized
d. because the examiner has a special interest in that sample of behavior
11. Suppose that answering “true” to the question “I drink a lot of water” happens to help predict depression. Would it be wise to include this item on a test used to identify depression?
*a. yes, because the essential characteristic of a good test is that it predicts relevant behaviors
b. no, because there is no theoretical link between drinking water and being depressed
c. yes, because there is a theoretical link between drinking water and being depressed
d. maybe, depending upon the theoretical orientation of the test developer
12. Which of the following is NOT true in relation to psychological tests:
a. they typically portray an abstraction that is shown useful in prediction
*b. results represent a thing with physical reality
c. every test score will reflect some degree of measurement error
d. they sum up performance in numbers or classifications
13. In the equation X = T + e, what is the best that a test developer can do?
a. make T very large
b. make T very small
c. make e very large
*d. make e very small
14. The norm group is referred to as the
a. criterion sample
*b. standardization sample
c. reference group
d. all of the above
15. The purpose of norms is to
a. establish an average performance
b. indicate the prevalence of high and low scores
c. determine deviations from expectation
*d. all of the above
16. In the selection and testing of a standardized sample, it is crucial that
*a. the sample is representative of the population for whom the test is intended
b. the sample is diverse in composition
c. the sample is uniform in composition
d. all members of the sample are literate
17. The ability of a test to predict non-test behavior is determined by
*a. an extensive body of postpublication validational research
b. the scores of the standardization sample
c. the reliability of the test
d. the prepublication validational research
18. In a(n) ______ test, the objective is to determine where the examinee stands with respect to very tightly defined educational objectives.
a. norm-referenced
b. ability
*c. criterion-referenced
d. aptitude
19. Which is the most comprehensive term?
a. testing
b. scoring
c. norming
*d. assessing
20. Psychological assessment is characterized by all of the following EXCEPT:
a. comparing and combining data from different sources
b. utilizing and understanding a variety of different testing and observational measures
c. an inherently subjective process that makes predictions on a complex gestalt of data
*d. an objective process based on a single source of information
21. The term ______ was invented during World War II to describe a program to select men for secret service assignment in the Office of Strategic Services.
*a. assessment
b. evaluation
c. classification
d. estimation
22. Which of the following was used as a situational test by the Office of Strategic Services during WWII?
a. transporting equipment across a raging brook
b. scaling a ten foot high wall
c. surviving a realistic interrogation
*d. all of the above
23. An important advantage of ______ tests is that the examiner can gauge the level of motivation of the examinee.
a. group
b. personality
*c. individual
d. intelligence
24. Most intelligence tests use a ______ assortment of test items.
a. homogeneous
*b. heterogeneous
c. random
d. culture-free
25. ______ tests are often used to predict success in an occupation, training course, or educational endeavor.
a. Intelligence
b. Personality
*c. Aptitude
d. Achievement
26. ______ tests are often used to measure a person’s degree of learning, success, or accomplishment in a subject matter.
a. Intelligence
b. Personality
c. Aptitude
*d. Achievement
27. Measures of ______ emphasize novelty and originality in the solution of fuzzy problems or the production of artistic works.
a. personality
b. achievement
*c. creativity
d. femininity
28. Putting forth a variety of answers to a complex or fuzzy problem is an example of ______ thinking.
*a. divergent
b. convergent
c. undisciplined
d. intelligent
29. Checklists, inventories, and projective techniques are all examples of ______ tests.
a. creativity
b. intelligence
*c. personality
d. vocational
30. ______ share a common assumption that behavior is best understood in terms of clearly defined characteristics such as frequency, duration, antecedents, and consequences.
a. Intelligence tests
b. Personality inventories
c. Creativity tests
*d. Behavioral procedures
31. What subspecialty of psychology uses specialized tests on people to make inferences about the locus, extent, and consequences of brain damage?
a. Neurology
b. Cognitive Psychology
c. Physiological Psychology
*d. Neuropsychology
32. By far the most common use of psychological tests is to
*a. make decisions about persons
b. diagnose mental and emotional disorders
c. determine personality functioning
d. evaluate learning disabilities
33. Placement, screening, certification, and selection are all examples of
a. diagnosis
b. program evaluation
*c. classification
d. research-based testing
34. A neuropsychologist investigating the hypothesis that low-level lead absorption causes behavior deficits in children would be an example of using psychological testing for
*a. research
b. self-knowledge
c. program evaluation
d. diagnosis and treatment
35. In general, Head Start children show immediate gains in
a. IQ
b. school readiness
c. academic achievement
*d. all of the above
36. It is important that the standardization sample be representative of the population for whom the test is intended because
*a. this allows for the examinee’s relative standing to be determined
b. minority groups must be represented in all samples
c. the high generalizability is no longer a confounding variable
d. test standards require a standardization sample
37. In a(n) ______ test, the performance of each examinee is interpreted in reference to a relevant standardization sample.
a. individually-referenced
b. group-referenced
*c. norm-referenced
d. criterion-referenced
38. A psychometrician is best understood as
a. an expert administrator of personality tests
b. a psychologist who has been trained from the scientist-practitioner model
*c. a developer and evaluator of psychological tests
d. any authorized user of assessment instruments
39. Appraising or estimating the magnitude of one or more attributes in a person is referred to as
a. testing
b. evaluation
c. attribution
*d. assessment
40. The distinction between aptitude tests and achievement tests is based largely upon
*a. usage
b. content
c. format
d. difficulty
41. Suppose a tester asks “What is a sofa?” and the child looks puzzled. In general, is it acceptable for the tester to rephrase the question, asking “What is a couch?”
a. Yes, because valid testing requires the development of rapport.
b. Yes, because the two questions are equivalent.
c. No, because the tester should never deviate from standardized procedure.
*d. No, because the rephrased question is easier and therefore not comparable.
42. In determining the boundaries of flexible testing procedures, the examiner should consider
*a. how the test was likely administered to the norm sample
b. the potential consequences of altering the test items
c. the general dictum that testing procedures should be interpreted literally and strictly
d. all of the above
43. In most cases, if a test question asks “What shape is a ball?” a correct answer would be recorded if
a. the subject responds verbally “round”
b. the subject responds verbally “spherical”
c. the subject gestures with his index finger in a circular pattern
*d. all of the above
44. The necessary prerequisite(s) to administering a new test are:
a. reading the manual
b. memorizing key elements of instructions
c. rehearsing the test
*d. all of the above
45. Which age group is most prone to periodic accumulation of fluid in the middle ear during intervals of mild illness?
*a. young subjects
b. adolescents
c. young adults
d. old adults
46. Which of the following is a possible sign of hearing loss?
a. inattentiveness
b. poor articulation
c. difficulty in following oral directions
*d. all of the above
47. Owing to the special nature of this kind of impairment, subjects may receive less credit on a test item than is due.
a. hearing-impaired
*b. speech-impaired
c. motor-impaired
d. vision-impaired
48. When testing a person with a mild motor handicap, examiners may wish to omit
a. multiple choice spatial items
b. untimed spatial items
*c. timed performance subtests
d. all of the above
49. According to the text, which kind of test generally requires the greatest vigilance from the examiner?
a. group test
b. individual test
*c. group and individual tests require equal vigilance
d. unknown
50. All of the following are common sources of error in group testing EXCEPT:
a. lack of clarity in delivering the directions
*b. failure to provide allotted break time
c. noise distractions
d. failure to explain when and if examinees should guess
51. Undoubtedly the single greatest source of error in group test administration is:
a. reading the wrong instructions
b. giving the wrong form of the test
c. giving a test to the wrong age group
*d. incorrect timing of tests
52. In general, how do test manuals for group standardized tests handle the issue of guessing?
*a. they provide explicit instructions to examinees as to the advantages and potential pitfalls of guessing
b. they warn examinees that guessing is usually counterproductive
c. most commonly, the test manual does not provide any guidance on the pros and cons of guessing
d. they explain that guessing seldom improves the score
53. Suppose a young girl answers correctly on 37 questions from a 50-item test but answers erroneously on 9 questions, leaving 4 questions blank. Suppose there are four alternatives per question. Using established principles of probability, what would be her corrected score?
a. 32
*b. 34
c. 36
d. 37
54. When testing children, testing should begin
a. not longer than 5 to 10 minutes after the child arrives
b. when the test manual says it should begin
*c. when he/she seems relaxed enough to give maximum effort
d. almost immediately so as to prevent the child from developing fear of the tester
55. Which of the examiner characteristics listed below has been found to make a consistent and significant difference in the outcome of individual test results?
a. sex
b. experience
c. race
*d. none of the above
56. In one study reported in the text (Terrell et al., 1981), mistrustful blacks performed relatively poorly when tested by ______ examiners.
a. black
*b. white
c. black or white
d. female
57. What is the relationship between test anxiety and school achievement?
*a. high anxiety correlates with low achievement
b. high anxiety correlates with high achievement
c. test anxiety and school achievement are unrelated
d. the relationship between test anxiety and school achievement is unknown
58. Test-anxious students have study habits that are ______ those of other students.
a. far superior to
b. slightly superior to
c. about equally effective as
*d. worse than
59. When instructions for a task are neutral or nonthreatening, test-anxious subjects
*a. perform just as well as low-anxious subjects
b. show a decrement in performance
c. still perceive the situation to be stressful
d. all of the above
60. Suppose subjects are matched on overall IQ. On timed subtests from an intelligence scale such as the WAIS, the performance of low-anxious subjects ______ that of high-anxious subjects.
a. drops below
b. equals
*c. surpasses
d. is twice as fast as
61. Conscious faking on psychological tests is thought to be
*a. rare
b. commonplace
c. evidence of psychopathology
d. blatant and obvious
62. In a 50-item multiple choice test with four choices per item, what would be the corrected score for an examinee who answered 32 items correctly, answered 9 items incorrectly, and left 9 items blank?
a. 35
b. 32
*c. 29
d. 26
63. Vernon and Brown (1964) relate the tragic case of a young girl who was put in an institution for the intellectually disabled because of a test IQ of 29, when, in fact, it was later shown her real IQ was 113. The original low score was a result of
a. undiagnosed autism in the girl
b. gross scoring errors by the examiner
*c. unrecognized deafness in the girl
d. misreading the original score (of 129)
64. The test item writer’s aim is to make all or nearly all considered guesses ______ guesses.
a. correct
*b. wrong
c. random
d. educated
65. A common form of error made by graduate students in studies of practice administrations of IQ and achievement tests would be:
a. failure to have required materials on hand
b. incorrect readings of test instructions
*c. incorrect calculations of test ceilings
d. excessive queries of responses
Topic 1B
Ethical and Social Implications of Testing
The Rationale for Professional Testing Standards
Responsibilities of Test Publishers
Case Exhibit 1.3: Ethical and Professional Quandaries in Testing
Responsibilities of Test Users
Case Exhibit 1.4: Overzealous Interpretation of the MMPI
Testing of Cultural and Linguistic Minorities
Unintended Effects of High-Stakes Testing
Reprise: Responsible Test Use
Summary
Key Terms and Concepts
Classroom Discussion Questions
1. Discuss each of the broad ethical principles that apply to testing, asking students to cite hypothetical examples where these principles might be violated. The principles are: assessment should be in the best interests of the client; practitioners have a primary obligation to protect the confidentiality of test results; the psychologist must possess the expertise needed to evaluate the tests that are chosen for an assessment; the test user must obtain informed consent from the test taker or a legal representative; the examiner must be knowledgeable about individual differences; and the psychologist must respect the current standards of care.
2. How does culture affect the validity of standard tests? The instructor might ask persons from any nonmajority culture to discuss how certain standard tests (e.g., individual IQ tests) might be misleading when used with persons from their cultural and linguistic background.
3. Are students aware of cheating on group tests? Although it may be difficult to get students to open up on this topic, most students have second-hand knowledge of cheating. The nature and prevalence of cheating would be an interesting discussion topic. How do students feel about this?
4. Ask students if they think that tests can truly be administered and scored in an ethical and unbiased manner. Why or why not?
Extramural Assignments
1. Have students read the latest version of the Ethical Principles of Psychologists and summarize the main points.
2. Have students find journal articles pertaining to the assessment of cultural and linguistic minorities and summarize the conclusions.
3. Ask students to look up recent findings on the duty to warn principle. In particular, what is the relevance of this principle to a therapist whose client is HIV-positive and also sexually active? Is the therapist obligated, if necessary, to break confidentiality and inform the client’s lover?
4. Students might design a simple, anonymous questionnaire on the nature and prevalence of cheating on tests and administer it to classmates.
5. There are numerous free web-based tests that claim to test various individual characteristics. Ask class members to pick one of these brief free online tests and give