RA5.2 Reliability and Validity of EPP Assessments
Alignment to National Standard: CAEP RA5.2 Data Quality. The provider's quality assurance system from RA5.1 relies on relevant, verifiable, representative, cumulative, and actionable measures to ensure that interpretations of data are valid and consistent.
How Alignment Is Assured: The Assessment Coordinator, in consultation with Program/Discipline Chairs, aligns the evaluation measures and assessment tasks with CAEP, InTASC, and the appropriate technology standards, and maintains alignment with applicable Louisiana state laws and policy regulations. All standards are maintained in Watermark Taskstream. The Assessment Coordinator maintains this standards database so that alignments can accommodate updates to standards, program competencies, courses, or assessments.
Evidence Overview
Evidence for this compendium is presented in the following manner: (1) the process for conducting validity and reliability studies, (2) presentation of reliability evidence, and (3) presentation of validity evidence. The evidence documents that EPP-created assessments have met the minimum threshold of 80% (.80) or above to establish content validity and 75% (.75) or above to establish inter-rater reliability (agreement).
Evidence and Analysis
Process for Designing and Developing Assessments: Once an evaluation measure has been established, program leads work with a team of subject matter experts (SMEs) to create the individual activities, assessment prompts, and associated rubric, all aligned with the SPA standards. The SME team is a crucial component of this process because its subject matter expertise ensures that content validity is built into the final design.
After completing the work instructions, the team turns to the rubric. Although the content determines the precise criteria for each level of a particular rubric element, GSU provides basic rules for how each rubric level should be constructed. These definitions are listed in Table 1.
Table 1: Performance Indicators and Descriptions

Performance Indicator | Description
Novice | This rating is equivalent to having emerging performance skills/content knowledge that can be enriched with additional coursework.
Effective: Emerging | This rating is equivalent to having the performance skills/content knowledge needed to move forward into student teaching; however, additional remediation might be needed to hone the candidate's performance.
Effective: Proficient (Target) | This rating is equivalent to having the performance skills/content knowledge needed to be an effective student teacher, where additional skills will be practiced.
Highly Effective | This rating is equivalent to having the performance skills/content knowledge needed as a highly effective first-year teacher.
Each evidence item tagged to Standards 1 and 4 includes a short section labeled "Assurance of Reliability and Validity" that reports information from GSU-created assessments. In addition, the Continuous Improvement/Actionability of Outcomes section of Standard Five Compendium 3 underscores how data insights are made actionable at GSU. The EPP takes a systematic approach to data quality: validity reviews are scheduled every three years unless substantive changes are made to an instrument, and reliability is likewise examined every three years unless the evaluators/instructors in the relevant courses change (Data Quality Review Table).
Program leads and faculty (SMEs) participate in training and calibration exercises to ensure that evaluators interpret and apply rubrics consistently, which is necessary for inter-rater reliability in evaluating candidate performance on assessments (IRR and Norming Training). During calibration, all evaluators in a given area use the scoring rubric to evaluate a selected candidate submission. To promote consistency among raters, evaluators then receive personalized feedback showing where they converge with and diverge from the broader team.
Faculty members are also periodically chosen to take part in a formal inter-rater reliability study, in which faculty members individually score the same pre-selected work sample from a course they actively teach. Internal and external subject matter and content experts are invited to participate in content validity studies of common, EPP-created key assessments on a three-year cycle or following instrument or description revisions.
Formal content validity and reliability studies are conducted electronically via Google Forms surveys using the format presented by Drs. Monaco and Horne at CAEPCon Spring 2022 (Monaco & Horne, 2022). Reliability study forms (Sample: ED 545 Action Research Implementation Assessment Reliability Study Form) provide student work samples to teams of faculty members along with assessment rubrics and assignment directions. Percent agreement is then calculated from the faculty members' scores to gauge inter-rater reliability; GSU seeks 75% or higher agreement (Sample: ED 545 Action Research Implementation Assessment Percentage of Agreement Worksheet). Newly revised assessments are piloted upon completion of the reliability and validity studies and are adopted following review by the QAS Review Panel.
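To illustrate the percent agreement calculation, the sketch below uses a common pairwise formulation in which every pair of raters is compared element by element; the function name and sample scores are illustrative assumptions, not GSU data or the exact worksheet formula.

    from itertools import combinations

    def percent_agreement(scores_by_rater):
        """Percent of rater-pair comparisons with identical rubric scores.

        scores_by_rater: one list of scores per rater, aligned by rubric
        element, e.g., [[3, 4, 3, 2], [3, 4, 2, 2]].
        """
        agreements = 0
        comparisons = 0
        for a, b in combinations(scores_by_rater, 2):  # every pair of raters
            for score_a, score_b in zip(a, b):         # element by element
                comparisons += 1
                agreements += (score_a == score_b)
        return 100.0 * agreements / comparisons

    # Hypothetical calibration: three faculty raters, four rubric elements (1-4 scale)
    raters = [[3, 4, 3, 2], [3, 4, 2, 2], [3, 4, 3, 2]]
    print(f"{percent_agreement(raters):.0f}% agreement")  # 83%, above the 75% threshold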
References
Monaco, M., & Horne, E. T. (2022, March 9). Data quality: Deconstructing CAEP R5.2 [Conference presentation]. CAEPCon Spring 2022. https://caepnet.org/
Reliability of Assessments (Initial Programs)
Compendium | CAEP Standard | Assessment | Inter-Rater Reliability (Percent Agreement, Across All Programs)
Standard One, Compendium 2 (Applications of Data Literacy) | RA1.1 | ED 505: Analysis of Reading Difficulties, Word Study | AY 2022-23: 85%
Standard One, Compendium 3 (Collaborative Activities) | RA1.1 | ED 545: Evaluation and Assessment of P-12 Students in Educational Settings, Action Research Implementation Assessment | AY 2022-23: 86%
Standard One, Compendium 4 (Use of Research) | RA1.1 | SPED 542: Methods & Materials for Teaching Children with Exceptional Learning Needs, Inclusive Lesson Planning | AY 2022-23: 100%
Standard R1, Compendium 2 (Application of Content) and Compendium 3 (Instructional Practices) | RA1.1 | ED 549: Introduction to Techniques of Research, Action Research Proposal Assessment | AY 2022-23: 85%
Standard R1, Compendium 2 (Application of Content) and Compendium 3 (Instructional Practices) | R1.2, R1.3 | Praxis II SPED (Praxis 5543, proprietary) | Proprietary assessment scored outside of GSU by ETS
Standard One, Compendium 5 (Provider Responsibilities) | RA1.2 | SPED 543: Humanistic Approaches, Behavioral Intervention | AY 2022-23: 100%
Standard One, Compendium 5 (Provider Responsibilities) | RA1.2 | ED 505: Analysis of Reading Difficulties, Informal Reading Inventory | AY 2022-23: 90%
The EPP uses the proprietary Educator Dispositions Assessment developed by Almerico, Johnston, and Wilson (2015). Ratings were completed for each candidate by two reviewers who knew or had taught the candidate, and those ratings were then compared for inter-rater agreement using an online calculator (https://calculator.academy/interrater-reliability-calculator/). Over the past three years, the EPP obtained the following data:


Compendium | CAEP Standard | Assessment | Inter-Rater Reliability
Standard R1, Compendium 4 (Professional Responsibility) | R1.4 | Educator Disposition Assessment | Fall 2020: N = 1, IRR = 78% (Reading); Fall 2021-Spring 2022: N = 1, IRR = 78% (Reading); Fall 2022-Spring 2023: N = 4, IRR range = 67-78% (Reading)*; N = 1, IRR = 88% (Special Education)
* Although the ratings for two candidates in 2022 showed 67% agreement, the disagreements were generally between rating the candidate as "Meets Expectations" (3) versus "Exceeds Expectations" (4). Only one disagreement between scorers was between "Developing" (2) and "Meets Expectations" (3).
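The footnote's distinction between exact agreement and adjacent-category disagreement can be made concrete with a short sketch. The ratings below are hypothetical, not the EPP's actual EDA data, and the within-one-category measure is one assumed way to summarize near-misses on the footnote's scale (2 = Developing, 3 = Meets Expectations, 4 = Exceeds Expectations).

    def exact_and_adjacent_agreement(rater1, rater2):
        """Exact and within-one-category agreement for two reviewers' ratings."""
        pairs = list(zip(rater1, rater2))
        exact = sum(a == b for a, b in pairs) / len(pairs)
        adjacent = sum(abs(a - b) <= 1 for a, b in pairs) / len(pairs)
        return 100 * exact, 100 * adjacent

    # Hypothetical ratings for one candidate across nine disposition indicators
    r1 = [3, 3, 4, 3, 2, 3, 4, 3, 3]
    r2 = [3, 4, 4, 3, 3, 3, 3, 3, 3]
    exact, adjacent = exact_and_adjacent_agreement(r1, r2)
    print(f"exact: {exact:.0f}%, within one category: {adjacent:.0f}%")  # 67% / 100%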
Validity Evidence: CAEP recommends establishing content validity using Lawshe's approach. To determine the content validity of EPP-created assessments, GSU uses a panel of subject matter experts (SMEs) to determine how well the elements included within an assessment align with its intended outcomes. Following the Lawshe method, SMEs are provided with a copy of the assessment's directions and rubric and asked to rate each element as essential, useful but not essential, or not necessary (Sample: ED 545 Action Research Implementation Assessment Content Validity Study Form). The content validity ratio (CVR) is calculated for each element, and the content validity index (CVI) is calculated for the instrument, using an Excel worksheet (Sample: ED 545 Action Research Implementation Assessment CVR and CVI Outcomes) formatted with the following formulas:
CVR = (n_e - N/2) / (N/2), where n_e is the number of SMEs who rate an element "essential" and N is the total number of SMEs on the panel.

S-CVI = the scale-level content validity index (CVI) for the instrument, computed by averaging the element-level CVR values.
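As a worked illustration of these formulas, the following is a minimal sketch assuming the instrument-level CVI is the simple mean of element-level CVRs; the panel size and "essential" counts are hypothetical, not taken from a GSU study.

    def cvr(n_essential, n_panelists):
        """Lawshe content validity ratio for a single rubric element."""
        return (n_essential - n_panelists / 2) / (n_panelists / 2)

    def cvi(essential_counts, n_panelists):
        """Scale-level content validity index: mean CVR across elements."""
        return sum(cvr(ne, n_panelists) for ne in essential_counts) / len(essential_counts)

    # Hypothetical panel of 10 SMEs rating a 5-element rubric; each entry is
    # the number of SMEs who marked that element "essential"
    essential = [10, 9, 8, 10, 7]
    print([round(cvr(ne, 10), 2) for ne in essential])  # [1.0, 0.8, 0.6, 1.0, 0.4]
    print(round(cvi(essential, 10), 2))                 # 0.76, just below the .80 target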
Program leads and faculty members review and discuss the experts' feedback to determine what modifications and updates might be necessary, particularly for items or instruments that fail to meet the acceptable CAEP sufficiency of evidence standards.
Validity of GSU Assessments: Initial Programs (ITP)

Compendium | Assessment | Validity Study Results
Standard One, Compendium 2 (Applications of Data Literacy) | ED 505: Analysis of Reading Difficulties, Word Study | Items 4, 11, and 12 do not meet content validity, with a CVR of .60
Standard Three, Compendium 3 (Continuous Improvement): Surveys
Questions or topics are explicitly aligned with aspects of the EPP's mission as well as CAEP, InTASC, national/professional, and state standards. Each item addresses a single subject, and the language is unambiguous. Leading questions are avoided, and items are stated in terms of behaviors or practices rather than opinions whenever possible. Surveys of dispositions make clear to candidates how the survey is related to effective teaching.
The Educator Disposition Assessment was presented by the University of Tampa at the CAEP Conference (September 17-19, 2015, Washington, D.C.) in a session entitled "Educator Disposition Assessment: A Research-Based Measure of Teacher Dispositional Behaviors." The developers indicated that the instrument had already gone through the validity process.
Focus Area(s): GSU will continue to follow its schedule for reviewing EPP-created assessments so that content validity and inter-rater reliability results meet or exceed the CAEP sufficiency of evidence requirements.
Template for the Presentation of Evidence by Dr. Michele Brewer and Dr. Amber Vraim is licensed under Attribution 4.0 International. "College of Education Office of Technology, Assessment, and Compliance: Template for the Presentation of Evidence." Copyright 2020 by Wilmington University.