Overview

The Individuals with Disabilities Education Act (IDEA) requires that each state have a State Performance Plan/Annual Performance Report (SPP/APR) that evaluates the state’s efforts to implement the requirements and purposes of IDEA. States must report annually to the Secretary of Education on its performance. In June 2014, the U.S. Department of Education introduced a new framework known as Results-Driven Accountability (RDA), which also included educational results and outcomes for students with disabilities in making each state’s annual determination under the IDEA. Beginning in 2015, each state submitted an SPP/APR that covered the six-year period for Federal Fiscal Years (FFYs) 2013 through 2018. It included a new performance indicator (Indicator 17), the State Systemic Improvement Plan (SSIP). (There are now 17 RDA performance indicators, some of which are Indicator 2 [Dropout Rates], Indicator 4 [Suspensions and Expulsions], Indicator 5 [Participation/Time in General Education Settings—Least Restrictive Environment], and Indicator 8 [Parental Involvement].)

The SSIP is a comprehensive, multiyear plan that outlines a state’s strategy for improving results for children with disabilities. The U.S. Department of Education required that each state’s SSIP focus on a State Identified Measurable Result (SIMR) that identified outcome targets for children with disabilities. The SIMRs must address one of the indicators which address child or student outcomes. According to the U.S. Department of Education (2021), “the most common SIMRs address child-specific results such as performance on assessments (B-3).”

Across states, SIMR targets that are assessment related included performance data for state summative assessments administered for Elementary and Secondary Education Act (ESEA) accountability as well as other assessments (e.g., interim assessments, etc.). In this report, interim assessments refer to commercially-produced assessments (for example, DIBELS, AIMSweb, etc.) that are administered several times during a school year to measure student progress. Other terms that are sometimes used to describe these assessments are local assessments and formative assessments.

Some states include interim assessments as part of the SSIP’s evaluation plan as a measure of progress toward their SIMR and may include data from these assessments in their SSIP annual reports. The instructions in the Part B Measurement Table for Indicator 17 (the SSIP), which provides information regarding monitoring priorities, data sources and measurement instructions for the submission of the SSIP to the Office of Special Education Programs (OSEP), indicates that: “The State must report on whether the State met its target. In addition, the State may report on any additional data (e.g., progress monitoring data) that were collected and analyzed that would suggest progress toward the SIMR” (U.S. Department of Education, 2019, p. 23).

This report presents the findings of an analysis of states’ FFY 2018 SSIPs. This analysis examined how assessments are included in states’ SIMRs. For states with assessment-related SIMRs, the SSIP evaluation plans were also analyzed to see how assessments were being used for evaluation and reporting.

Table of Contents

Method

In fall 2020, the National Center on Educational Outcomes (NCEO) conducted an analysis of states’ FFY 2018 SSIP documents (submitted to OSEP in April 2020) to learn more about how assessments were included in the SSIPs of states with assessment-related SIMRs. The SSIPs of both regular states (e.g., Alabama, Wyoming, etc.) and unique states (e.g., American Samoa, Federated States of Micronesia, etc.) were analyzed. In this report, the term states refers to both regular and unique states.

For each state that had an assessment-related SIMR, the full-text of the SIMR was identified, copied into a data file, and analyzed by an NCEO staff member. The following information was then coded from each state’s SIMR: (a) content area of the SIMR; (b) grade levels covered by the SIMR; (c) population included in the SIMR (children with disabilities statewide or in a subset of schools, districts, or subgroups of students—e.g., specific disability categories, specific racial/ethnic groups); and (d) assessments used for outcome measures, including state general and alternate assessments used for ESEA accountability and other assessments. States could choose to identify a SIMR based on a specific target grade, multiple grades, or decide not to address (limit) the grade levels included. For this analysis, states that did not specify limitations on grades were presumed to address all grades and, therefore, were coded as grades K,1,2,3,4,5,6,7,8, HS. NCEO staff also analyzed the text in the SSIPs to identify which states included assessments in their evaluation plans. For states that included assessments in their evaluation plan, the names of the interim assessments used were also compiled, as well as any assessment-related data limitations identified by states in their SSIP.

To help ensure data accuracy, a second NCEO staff member coded a random selection of 25 percent of SSIPs included in this analysis. The two researchers then met and reconciled any coding differences. After the data were compiled and coded, researchers analyzed and summarized the findings.

Table of Contents

Results

This section presents the results of the analysis of state SIMRs that were assessment-related. The SIMR for each of these states is shown in Appendix A, Table A1. SIMRs are typically short statements that describe the desired outcome of the state’s SSIP, as well as the measure that will be used to determine whether the outcome was achieved. For example, the Oregon SIMR stated:

To increase the percentage of third grade students with disabilities reading at grade level, as measured by state assessment.

Other examples include:

Iowa: Increase the percentage of learners with IEPs who are proficient readers by the end of third grade, as measured by the Formative Assessment System for Teachers (FAST).

Puerto Rico: To increase the percentage (%) of special education students in the 5th grade who score proficient or advanced on the math regular assessment in the participating schools (all elementary schools from the former Yabucoa School District).

Rhode Island: To improve mathematics achievement (on the statewide assessment) by 4% for students with specific learning disabilities (SLDs) who are Black or Hispanic/Latino in Grades 3-5 by 2018-19.

Texas: Increase the reading proficiency for all children with disabilities in grades 3-8 against grade level and alternate achievement standards, with or without accommodations.

Results presented in this report are organized into five sections: (1) Content Areas, (2) Grade Levels, (3) Target Student Population, (4) Assessment Type, and (5) Data Limitations.

Content Areas

As shown in Figure 1, forty-three states out of 60 regular and unique states had SIMRs related to improving academic achievement in reading/English language arts (ELA) or mathematics. Of the forty-three states with assessment-related SIMRs, the majority targeted ELA (36 states), with approximately one-fifth of the states (eight states) having SIMRs that addressed mathematics. One state included both ELA and math in its SIMR. That state (California) was counted as having both an ELA and a math SIMR. For additional details see Appendix A, Table A2.

Figure 1. Number of States with Specified Content Area in SIMR

N=43
Note: One state (California) included both ELA and Math in SIMR. California was counted both for ELA and Math.

Grade Levels

As indicated in Figure 2, for states with a SIMR focusing on ELA, grade 3 was the grade most commonly included in the SIMR. Thirty-four states had a grade 3 ELA SIMR. Grade 4 and grade 5 were the second and third most commonly included grades, with 14 and 12 states, respectively. For additional details see Appendix A, Table A2.

Figure 2. Number of States with Specified Grades in ELA SIMR

N=43

Figure 3 shows that, for states focusing on math in their SIMR, six states included grade 5. It was the most frequently selected grade. Grade 3 and grade 4 were the second and third most commonly included grades with five states each. Most states included multiple grades in their math SIMRs.

Figure 3. Number of States with Specified Grades in Math SIMR

N =43

Across both ELA and mathematics, the vast majority of states identified grades between 3 and 8 for the focus of their SIMRs with many fewer including either early elementary school or high school. For additional details see Appendix A, Table A2.

Target Student Population

States had the option of developing a statewide SIMR for all students, for all students with disabilities, or a smaller group specifying a smaller group of districts, schools, or subgroups of students.

Figure 4 indicates that 35 states included students with disabilities in all disability categories in their SIMR, while eight states SIMRs focused on selected disabilities categories (e.g., emotional disability, selected learning disability, speech language impairment). Three states indicated that their SIMR included all students. For example, Indiana’s SIMR included “all third grade students, including those with disabilities.”

One state (Rhode Island) had a SIMR focused on specific racial and ethnic groups. Its SIMR was focused on “students with specific learning disabilities (SLDs) who are Black or Hispanic/Latino.”

Figure 4. Number of States that Included Specified Study Groups in SIMR

N=43

Note: The three states in the “all students” category included two states whose SIMRs included “all students” (Federated States of Micronesia, Indiana) and one state that included “students with and without disabilities” (Palau). These three states are also included in the count for the “students with disabilities (all categories)” group.

Some states indicated in the SIMR language whether the population of interest was statewide or a subset of schools or districts, while other states indicated the targeted population elsewhere in the SSIP. Figure 5 shows that 10 states with assessment-related SIMRs had SIMRs that included only statewide data. Twenty-six states limited their SIMR targets to only a subset of schools or districts, and seven states included statewide data and a targeted set of data. For those targeting a subset, the sample ranged from including a single target school (Palau) to several that included multiple school districts (e.g., Arizona, New Mexico). Some states (seven states) included with statewide SIMRs also targeted a sample. For additional details see Appendix A, Table A2.

Figure 5. Number of States that Included Specified Study Groups in SIMR

Assessment Type

Figure 6 shows that 35 states used their state summative ELA or math assessment as an outcome measure. Nine states listed using assessments other than the state summative assessment as an outcome measure for their SIMR. Eight of those states listed a different assessment measure instead of the state summative assessment, and one state listed a different assessment measure in addition to using the state summative assessment. Non-state assessments listed in SIMRs included DIBELS Next and curriculum-based measures. Additional details are in Appendix A, Table A3.

Figure 6. Number of States that Included Specified Assessments in SIMR

N=43
Note: One state (Ohio) included both the state general assessment used for ESEA accountability and another N=43 assessment in its SIMR. It was counted as having both a state assessment and another assessment.

As shown in Figure 7, among the states using the statewide assessment, all 35 states identified using the state general assessment as a measure for the outcome identified by their SIMR. Seven of these states explicitly stated that their SIMR included data from both the state general and the state alternate assessment. No states identified using only the alternate assessment as their outcome measure.

Figure 7. Number of States Whose SIMR Measure Was the State Test that Included the State General Assessment and State Alternate Assessment in the SIMR

N=35
Note: All states that included the alternate assessment in its SIMR also used the general assessment as well.

About four-fifths of states (35 states) with assessment-related SIMRs included one or more interim assessments in their evaluation plans (see Figure 8). Many of these states indicated that the interim assessment was included as a measure of progress toward the SIMR target. Most of these states did not have a SIMR that used an interim assessment as an outcome measure.

Figure 8. Number of State SSIPs that Included an Interim Assessment as a Measure of Progress Toward the SIMR

As shown in Figure 9, STAR (Reading, Early Literacy, or unspecified) (seven states), Acadience, DIBELS, or DIBELS Next (seven states), AIMSweb or AIMSweb Plus (seven states), and NWEA (seven states), and I-Ready (three states) were the interim assessments most frequently included in states’ SSIP evaluation plans. Additionally, 14 states included other specified interim assessments in their evaluation plans, and 12 states included other unspecified assessments in the plans. For additional details see Appendix A, Table A3.

Figure 9. Assessments Included in SSIP Evaluation Plans

Data Limitations

Many states identified assessment-related data challenges and limitations in their SSIPs. Some of the states identified issues related to the state tests used to measure outcomes in their SIMR; others identified data limitations that were related to the interim assessment data included in the SSIP evaluation plans.

State tests. Some states that used the state test used for ESEA accountability as the SIMR measure noted that the state test had changed across the years, which created challenges for determining whether the state was meeting targets. For example, Wyoming’s SSIP said:

One complicating factor to examining state test data over time is that during the 2017-2018 school year, the WDE adopted the Wyoming Test of Proficiency and Progress (WYTOPP) as the new state assessment. This means that any increase or decrease in reading proficiency rates from 2016-17 to 2017-18 could be a function of the new test and not a function of any real increase or decrease in actual reading achievement (p. 13).

A few states noted that some parents opted their child with disabilities out of participation in the state test used for ESEA accountability. Since these assessment data were also used as the SIMR outcome measure in these states, the states asserted that parent opt out could have potentially had an effect on the validity of the reported data. For example, according to the Oregon SSIP:

Oregon law permits students to opt out of participation in summative assessments, contributing to varying rates of district participation. The SEA did not examine summative assessment participation rates when examining statewide assessment data. It may not be accurate to draw conclusions about the performance of all grade three students with disabilities in Oregon when assessment participation rates varied among districts (pp. 30-31).

The 2015 reauthorization of ESSA, known as the Every Student Succeeds Act (ESSA), requires that no more than 1% of students participate in the alternate assessment based on alternate academic achievement standards (AA-AAAS). States have been working with their schools and districts to more appropriately identify students for participation in the AA-AAAS, so a state that excluded students participating in the AA-AAAS from its SIMR (Idaho) asserted that the population shift as a result of students shifting from the AA-AAAS to the general assessment might affect outcomes. Idaho noted:

This large decrease in Idaho Alternate Assessment (IDAA) ELA participation and subsequent increase in regular assessment participation by lower performing students significantly impacted the ELA proficiency rate of Cohort 1 (p. 36).

Interim Assessments. Many states identified challenges related to the use of interim assessment data. States indicated that data challenges occurred because different districts or schools used different interim assessments. This made it challenging to aggregate data across districts. For example, according to Maryland’s SSIP:

Universal screening and progress monitoring data used by districts vary from one local jurisdiction to another; and sometimes across years within one district or across grades within a year. This makes it impossible to aggregate those data for any analyses or to examine trends over time (p. 41).

Some states indicated that not all districts participating in SSIP activities actually used interim assessments, which created an additional challenge for aggregating data. Similarly, some states indicated that some districts used interim assessments, but did not report the data to the state. For example, the Arizona SSIP stated:

Arizona does not mandate administration of PEA [Public Education Agency] benchmarks to assess student progress towards the Arizona English Language Arts Standards 2016. The SEA requests this data from the SSIP PEAs to assist in driving decisions, however the statewide assessment is the only mandated, consistent data source for the SEA to use for the collection of literacy data. As such, some inconsistency is evident in reported benchmark data, including missing benchmark data from PEAs that have either opted out of the benchmarking process, or opted out of the reporting of benchmarks (p. 39).

Still other states indicated that their data systems lacked the capacity to handle interim assessment data. For example, South Carolina (which uses the term “formative assessments,” instead of interim assessments), said in its SSIP:

South Carolina continues to struggle with the lack of an easily accessible, student-level data system to collect and report formative assessment data at the state level (p. 21).

For additional details about the data limitations identified by states with assessment-related SIMRs, see Appendix A, Table A4.

Table of Contents

Discussion

A key reason the U.S. Department of Education shifted to results driven accountability (RDA) in 2014 was to improve outcomes for students with disabilities. In 2020 the SPP/APR was extended for another year to include the FFY 2019 SPP/APR, due February 2021. Additionally, in 2020, OSEP released a new SPP/APR measurement package for FFY 2020–2025. States will be refining their SSIPs as they move into the new five-year cycle. The first submission in this new cycle will be for FFY 2020 which is due February, 2022. States may continue to use their current SIMR or identify a new one:

If a State is continuing to implement its current SSIP and has not identified a new State Identified Measurable Results (SIMR), then information previously reported does not need to be reported in the FFY 2020 submission.
States may continue the SIMR that was identified in the previous SPP/APR. In the FFY 2020 report due February 1, 2022, all States must set targets for FFY 2020–FFY 2025. Alternatively, States may choose to change their SIMR. States that change their SIMR for the FFY 2020–FFY 2025 SPP/APR must provide baseline data (in addition to FFY 2020–FFY 2025 targets). Although States are encouraged to discuss changes to their SSIP with their OSEP State Lead, and work with their TA providers, States will not be required to obtain pre-approval from OSEP in order to change their SIMR. Questions or concerns about a State’s new or revised SIMR will be documented in OSEP Response and addressed with the State during the SPP/APR clarification period (U.S. Department of Education, 2021).

States that choose to change the SIMR focus should provide details in the FFY 2020 report regarding the system analysis, data analysis, and stakeholder engagement activities that were conducted to reach the decision to change. Additionally, States should report on the infrastructure improvement activities/coherent improvement activities from previous SSIP activities that it will leverage to improve the new outcome or result area as well as any newly identified system components and evidence-based practices.

This analysis provided insights into how states are currently including assessments in their SSIPs. It is important to understand the myriad of ways assessment data, including interim assessment data, are contributing to the tracking and evaluation of states’ SSIP efforts to help inform data interpretation, possible revisions to SSIPs and SIMRs, policy decisions—and ultimately improve outcomes for students with disabilities.

Table of Contents

References

U.S. Department of Education. (2019). Part B state performance plan (SPP) and annual performance report (APR) part B indicator measurement table. Office of Special Education Programs. https://osep.grads360.org/services/PDCService.svc/GetPDCDocumentFile?fileId=39973

U.S. Department of Education (2021). Part B SPP/APR instructions. Office of Special Education and Rehabilitative Services. https://sites.ed.gov/idea/grantees/#SPP-APR

Table of Contents

Appendix A

Analysis of States’ FFY 2018 SSIP (Indicator 17) Submission

This appendix contains findings of the analysis of states’ 2020 SSIP submission. The SSIPs were submitted to OSEP in April, 2020 and are part of states’ State Performance Plan/Annual Performance Report (APR) (SPP/APR). The SSIPs are Indicator 17 in these reports.

States’ SSIPs are available at https://sites.ed.gov/idea/spp-apr-letters. To locate the 2020 SSIP for a specific state:

Select: Part B, 2020, [state]
Then click on MS Word. Next scroll to end of the document to Indicator 17
Click on the pdf for the SSIP (Indicator 17)

Table A1. SIMRs of States with Assessment-related SIMRs, FY2018 SSIPs

State	SIMR
American Samoa	Increase the percentage (%) of students with disabilities who will be proficient in reading as measured by Standard Base Assessment (SBA) in the third grade (3rd grade) on the three pilot schools that are implementing the Dual Language Program for students with disability. (p. 2)
Arizona	Targeted PEAs (Public Education Agencies) will increase the performance of students with disabilities in grades 3–5 on the English/Language Arts (ELA) state assessment from 6.4% to 12.99% by FFY 2019 to meet the State proficiency average for students with disabilities in grades 3–5. (p. 4)
Arkansas	[Increase] percent of students with disabilities in grades 3-5 whose value-added score in reading is moderate or high for the same subject and grade level in the state. (p. 5)
California	California’s SIMR for the SSIP is the performance of all SWD who took the California Assessment of Student Performance and Progress in both English Language Arts and Mathematics during the FFY 2018 school year. (p. 27); FFY 2019 target: 15.6%. (CA clarification submitted to OSEP, p. 1)
Colorado	Students^* in kindergarten, first, second and third grades who are identified at the beginning of the school year as Well Below Benchmark according to the DIBELS Next Assessment, will significantly improve their reading proficiency as indicated by a decrease in the percentage of students who are identified at the end of the school year as Well Below Benchmark. (p. 8) ^*Who attend one of the 17 SSIP project schools
Commonwealth of the Northern Mariana Islands	By June 30, 2020, at least 55% of 3rd grade students with IEPs in three target schools will perform at or above reading proficiency against grade level and alternate academic achievement standards as measured by the state assessment. (p. 2)
Connecticut	To increase the reading performance of all third-grade students with disabilities (SWD) statewide, as measured by Connecticut’s English Language Arts (ELA) Performance Index. (p. 20)
Delaware	To increase the literacy proficiency of students with disabilities in K-3rd grade, as measured by a decrease in the percentage of 3rd grade students with disabilities scoring below proficiency on Delaware’s statewide assessments. (p. 1)
Federated States of Micronesia	Increase English literacy skills of all students in ECE through Grade 5 in the FSM, with a particular focus on students identified as having a disability. (p. 1)
Guam	There will be an increased percent of students with disabilities in the 3rd grade that will be proficient in reading in the four participating schools as measured by the district-wide assessment. (p. 1)
Hawaii	1. [Increase] the percentage of 3rd and 4th grade students, combined, with eligibility categories of OHD, SLD, and SoL who are proficient on the Smarter Balanced Assessment (SBA) for English Language Arts (ELA)/Literacy; and 2. [Increase] the Median Growth Percentile (MGP) of 4th grade students with eligibility categories of OHD, SLD, and SoL on the SBA for ELA/Literacy. (p. 6)
Idaho	Increase the percent of fourth-grade students with disabilities in Idaho who will be proficient in literacy as measured on the state summative assessment, currently ISAT by Smarter Balanced. (p. 35)
Illinois	The percentage of 3rd grade students with disabilities who are proficient or above the grade level standard on the state English-language arts assessment will increase. (p. 5)
Indiana	Indiana will increase reading proficiency achievement on the Indiana Reading Evaluation and Determination (IREAD-3) assessment by at least .5% each year for all third grade students, including those with disabilities attending elementary schools participating in the Indiana SSIP Initiatives. (p. 3)
Iowa	Increase the percentage of learners with IEPs who are proficient readers by the end of third grade, as measured by the Formative Assessment System for Teachers (FAST). (p.1)
Kansas	Increased percentage of students with disabilities in grades K-5 score at grade level in reading as measured by Curriculum-Based Measure General Outcome Measure (CBM-GOM). (p.25)
Kentucky	To increase the percentage of students with disabilities performing at or above proficient in middle school math, specifically at the 8th grade level, with emphasis on reducing novice performance, by providing professional learning, technical assistance and support to elementary and middle school teachers around implementing, scaling and sustaining evidence-based practices in math. (p.1)
Louisiana	Louisiana’s SiMR is to increase ELA proficiency rates on statewide assessments for students with disabilities in third through fifth grades, in eight school systems (SSIP cohort¹) across the state. (p.4)
Maine	Students in grades 3–8 with Individualized Education Programs (IEPs) will demonstrate improved math proficiency as measured by math scores on the statewide Maine Educational Assessment (MEA) in the schools in which teachers receive evidence-based professional development in the teaching of math. To express proficiency as a percent, Maine reports proficiency as follows: Percent = number of grade 3–8 students with IEPs in the identified schools who demonstrate proficiency in math divided by the number of grades 3–8 students with IEPs in the identified schools who are evaluated on the math assessment. (p.1)
Maryland	State-identified Measurable Result (SiMR) or target of our SSIP: Students in grades 3, 4, and 5 will demonstrate progress and narrowing of the gap in mathematics performance. (p. 4)
Michigan²	Current SIMR: The percent of K-3 students with an Individualized Education Program (IEP) in participating schools who achieve benchmark status in reading as defined by a curriculum-based measurement. Data are inclusive of all participating districts in the transformation zone. (2019, p.5) The future SiMR will represent the percentage of target students who score at or above benchmark on the spring Acadience Reading K-6 Composite Score. In addition, MDE will describe progress monitoring results for target students, schoolwide reading performance for all students with and without disabilities, and MTSS implementation data (Reading Tiered Fidelity Inventory, intervention fidelity, DBI fidelity). (2020, p.29)
Mississippi	Increase the percentage of third grade students with Specific Learning Disability and Language/Speech rulings in targeted districts who score proficient or higher on the regular statewide reading assessment to 24 percent by FFY 2018. (p.32)
Missouri	To increase the percent of students with disabilities in grades three to eight and in their tested grade in high school who perform at proficiency levels in English/language arts (ELA) in the Collaborative Work (CW) schools by 6.5 percentage points by FFY 2018. (p. 3)
Nebraska	Increase reading proficiency for students with disabilities at the 3rd grade level as measured by the statewide reading assessment. (p. 6)
Nevada	The Nevada Department of Education will improve the performance of third-grade students with disabilities in Clark County School District on statewide assessments of reading/language arts through building the school district’s capacity to strengthen the skills of special education teachers in assessment, instructional planning, and teaching. (p. 3)
New Mexico	By federal fiscal year (FFY) 2018, 42.5% of students with disabilities in 3rd Grade of Cohort 1 in the RAMS [Reading, Achievement, Math, and School-Culture] schools will score benchmark on the End of Year reading accountability assessment. (p. 2)
New York	For students classified as students with learning disabilities in SSIP schools (grades 3-5), increase the percent of students scoring at proficiency levels 2 and above on the grades 3-5 English Language Arts State Assessments. (p. 3)
Ohio	1. The percentage of students with disabilities scoring proficient or higher on Ohio’s third grade English language arts achievement test; and 2. State-identified measurable result 2 (SIMR 2): The percentage of all kindergarten through third grade students who are on track for reading proficiency, as measured by state-approved diagnostic reading assessments. (p. 1)
Oklahoma³	By FFY 2018, Oklahoma will see improved early literacy performance in specific districts in Tulsa County among students with disabilities taking the 3rd grade annual reading assessment. The passing rate (proficiency or above) in Tulsa County will increase from 14.9 percent in FFY 2016 to at least 15.5 percent in FFY 2018. Participating districts will also realize statistically significant improvement in the rate of growth toward proficiency among these students. (p. 2)
Oregon	To increase the percentage of third grade students with disabilities reading at grade level, as measured by state assessment. (p. 1)
Palau	1. Increase percentage of students with and without disabilities in grades 1-3 in the target school performing at the proficient level in the Post-PERA for reading comprehension. 2. Increase proficiency percentage from Pre to Post PERA in reading comprehension for grades 1-3 for students with and without disabilities in the target school. (p. 2)
Puerto Rico	To increase the percentage (%) of special education students in the 5th grade who score proficient or advanced on the math regular assessment in the participating schools (all elementary schools from the former Yabucoa School District). (p. 6)
Rhode Island	To improve mathematics achievement (on the statewide assessment) by 4% for students with specific learning disabilities (SLDs) who are Black or Hispanic/Latino in Grades 3-5 by 2018-19. (p. 1)
South Carolina	To increase the percent of students with disabilities at the end of third grade scoring proficient and above on the statewide assessment in reading. (p. 3)
South Dakota	Students with specific learning disabilities (SLD) will increase reading proficiency prior to 4th grade from 4.84% in spring 2015 to 44.49% by spring 2020 as measured by the statewide assessment. (p. 1)
Tennessee	Increasing by three percent annually the percent of students with a specific learning disability (SLD) in grades 3-8 scoring at or above basic on the statewide English/language arts (ELA) assessment. (p. 6)
Texas	Increase the reading proficiency for all children with disabilities in grades 3-8 against grade level and alternate achievement standards, with or without accommodations. (p. 6)
Utah	To increase the number of students with SLI or SLD in grades 6–8 who are proficient on the Readiness Improvement Success Empowerment (RISE) statewide end of level (mathematics) assessment by 0.25 standard deviations over ten years (or a target proficiency rate of 10.95% in five years [by 2022-2023]). (p. 5)
Vermont	To improve proficiency of math performance for students identified as having an emotional disturbance in grades 3, 4, and 5. (p. 1)
Virgin Islands	To increase the percentage of third grade students with disabilities who score proficient or above on state-wide reading and language assessments. (p. 1)
Washington	To reduce the early literacy achievement gap between kindergartners with disabilities and typically developing peers. (p. 5)
Wisconsin	Increasing literacy achievement for students with individualized education plans (IEPs) in grades three through eight. (p. 3)
Wyoming	The percentage of third grade students with disabilities will increase their state test reading proficiency from 23.63% in 2017-2018* to 29.63% in 2019-2020. (Appendix A)

Note: Page numbers refer to the page where the information is found in each state’s SSIP.

¹Louisiana’s SSIP cohort measures students with disabilities in grades three through five. Each year, new students will enter the cohort (typically in third grade) and will exit the cohort when they move from fifth to sixth grade. Since the SSIP supports educator effectiveness, it tracks the outcomes of the students they directly educate.

²Michigan recently changed its SIMR. It reported data in its FFY 2018 for the previous SIMR, which was listed in its 2019 SSIP. The 2020 SSIP listed the new SIMR.

³Oklahoma’s SIMR included in this report went from FFY 2013-2018; beginning in FY2019 the state shifted to a new SIMR unrelated to assessment in 2019, so the SIMR included in this report is from 2013-2018.

Table A2. Content Area, Student Groups Included, and Grades Included Targeted Sample, 2020 SSIPs

State	Content Area	Student Groups and Grades Included	Targeted Sample & Description
American Samoa	ELA	SWD, Grade 3	3 target schools implementing dual language program for SWD
Arizona	ELA	SWD, Grades 3-5	Targeted PEAs in 3 cohorts
Arkansas	ELA	SWD, Grades 3-5	Targeted schools
California	ELA Math	SWD, Grades 3-8, HS	Statewide
Colorado	ELA	SWD, Grades K-3	Students in 17 SSIP project schools in k-3 who score Well Below Benchmarks on DIBELS Next assessment
Commonwealth of the Northern Mariana Islands	ELA	SWD, Grade 3	3 target schools
Connecticut	ELA	SWD, Grade 3	Statewide with comparison to three cohorts engaged in SSIP-related activities
Delaware	ELA	SWD, Grades K-3 (Assessment in Grade 3)	Statewide, with three cohorts implementing DELI program highlighted
Federated States of Micronesia	ELA	All Students (with focus on SWD) Grade ECE-5	Statewide
Guam	ELA	SWD, Grade 3	4 schools
Hawaii	ELA	SWD who qualified as SLD, OHD, and/or SoL, Grades 3- 4	Statewide
Idaho	ELA	SWD, Grade 4	4 cohorts
Illinois	ELA	SWD, Grade 3	11 transformation zone districts
Indiana	ELA	All students including those with disabilities Grade 3	Statewide including schools participating in the Indiana SSIP Initiatives
Iowa	ELA	SWD Grade 3	Statewide
Kansas	ELA	SWD, Grades K-5	Targeted buildings
Kentucky	Math	SWD, Grade 8	Statewide
Louisiana	ELA	SWD, Grades 3-5	8 School systems (SSIP cohort)
Maine	Math	SWD, Grades 3-8	Subset of schools receiving professional development
Maryland	Math	SWD, Grades 3-5	Statewide
Michigan	ELA	SWD, Grades K-5	Subset of schools participating in professional development
Mississippi	ELA	Students with SLI, Grade 3	Selected districts
Missouri	ELA	SWD Grades 3-8, HS	“Collaborative Work” schools
Nebraska	ELA	SWD, Grade 3	Statewide
Nevada	ELA	SWD, Grade 3	Clark County Schools
New Mexico	ELA	SWD, Grade 3	NM RAMS Schools Cohort 1
New York	ELA	SWD classified SLD, Grades 3-5	SSIP learning sites
Ohio	ELA	SWD, K-3	Statewide data reported, as well as data for 2 cohorts
Oklahoma	ELA	SWD, Grade 3	Participating districts in Tulsa County
Oregon	ELA	SWD, Grade 3	Statewide (but also highlights four cohorts) ,
Palau	ELA	SWD and Students Without Disabilities, Grades 1-3	1 Target School (Koror Elementary School)
Puerto Rico	Math	SWD, Grade 5	Selected schools in the former Yabucoa School District
Rhode Island	Math	SWD with SLD who are Hispanic or Black, Grades 3-5	19 schools across 9 districts in 3 cohorts
South Carolina	ELA	SWD, Grade 3	Selected schools
South Dakota	ELA	SWD with SLD. K-3	15 implementation sites
Tennessee	ELA	SWD with SLD, Grades 3-8	28 participating districts in old cohort (from 2017-18), 20 districts in new cohort (added in 2018-2019)
Texas	ELA	SWD, Grades 3-8	Statewide
Utah	Math	SWD with SLD or SLI, Grades 6,8	Statewide
Vermont	Math	SWD with ED, Grades 3-5	Statewide
Virgin Islands	ELA	SWD, Grade 3	4 pilot schools in 2 districts
Washington	ELA	SWD, Kindergarten	Statewide, three transformational zones initially with plans to expand to additional geographic zones
Wisconsin	ELA	SWD, Grades 3-8	Statewide, with identified transformational zones
Wyoming	ELA	SWD, Grade 3	4 cohorts
Totals
States with assessment–related SIMR = 43	ELA = 36 Math = 8	Student Groups Students with disabilities (all categories) = 35 Selected disability categories = 8 All students = 3¹ Selected groups (e.g., race/ethnicity)=1 Grades (ELA) K = 8 1 = 8 2 = 8 3 = 34 4 = 14 5 = 12 6 = 5 7 = 5 8 = 5 HS =2 Grades (math) K = 0 1 = 0 2 = 0 3 = 5 4 = 5 5 = 6 6 = 3 7 = 2 8 = 4 HS =1	Statewide data = 10 Targeted set of districts or schools = 26 Both statewide data and targeted set of schools or districts =7

Notes: SWD=Students with disabilities; PEA = Public Education Agency; General Assessment = Statewide General Assessment used for ESEA accountability; Alternate Assessment = Alternate Assessment based on Alternate Academic Achievement Standards (AA-AAAS) used for ESEA accountability

Disability Categories: ED = emotional disability, SLD = specific learning disability; SLI = speech language impairment

¹States included in “all students” group were also included in the “students with disabilities (all categories)” group (i.e., Federated States of Micronesia, Indiana, Palau).

Table A3. Assessments Used to Measure SIMR Outcomes and Progress by State, FFY 2018 SSIPs

State	Assessment Used to Measure SIMR Outcomes		Assessments Used to Measure Progress Toward SIMR Outcomes in SSIP Evaluation Plans
State	State Assessment Used for ESEA Accountability	Other Assessment
American Samoa	General Assessment (Standard Based Assessment - SBA)		pre- and post- vocabulary tests in English and Samoan Language (Samoan English Picture Vocabulary Test – SEPVT, Samoan Picture Vocabulary Test – SPVT) Standard Based Assessment (SBA) - DL¹ (p. 28)
Arizona	General Assessment (AzMERIT)		DIBELS (p.36) Benchmark tools determined by each individual PEA (p. 16)
Arkansas	Value-added score (VAS) calculated using General Assessment²
California	General Assessment (CASPP)
Colorado		DIBELS Next	DIBELS Next (p. 8)
Commonwealth of the Northern Mariana Islands	General Assessment (ACT Aspire) Alternate Assessment (Multi-State Alternate Assessment)		General Assessment (ACT Aspire) Star Early Literacy Star Reading (p. 8)
Connecticut	General Assessment (Smarter Balanced) Assessment Alternate Assessment (Connecticut Alternate Assessment -CTAA) (p. 20)		Aimsweb Tests of Early Literacy or Reading DIBELS DIBELS Next and mCLASS NWEA measures of academic progress (MAP) STAR reading assessment I-Reading diagnostic assessment³ (p. 16)
Delaware	General Assessment (Smarter Balanced) Alternate Assessment (Delaware System of Student Assessment (DeSSA) Alternate Assessment) (p. 6)		School screening/benchmark data (p. 29)
Federated States of Micronesia		DIBELS (p. 7)	DIBELS (p. 7)
Guam	General Assessment (ACT Aspire) Alternate Assessment (MSAA) (p. 2)		aimswebPlus⁴ (p. 6)
Hawaii	General Assessment (Smarter Balanced)		i-Ready (p. 47) Planning Reading Tiered Fidelity Inventory (T-TFI) p. 47) STAR (p.30) DIBELS⁵ (p. 28)
Idaho	General Assessment (ISAT by Smarter Balanced)		General Assessment (ISAT) Idaho Reading Indicator (IRI)⁶ Planning and Evaluation Tool for Schoolwide Reading Programs (PET-R) Recognizing Effective Special Education Teachers rubric (RESET) (p. 19)
Illinois	General Assessment (Illinois Assessment of Readiness - IAR)		AIMSweb STAR NWEA MAP (p.14)
Indiana		Indiana Reading Evaluation and Determination (IREAD-3)	General Assessment (IREAD) ISTAR-KR ISPROUT (p.19)
Iowa		Formative Assessment System for Teachers (FAST)	FAST (p.33)
Kansas		A curriculum based measurement- general outcome measurement (CBM-GOM)	A curriculum based measurement- general outcome measurement (CBM-GOM) (p. 36)
Kentucky	General Assessment (Kentucky Performance Rating for Educational Progress - K-PREP)
Louisiana	General Assessment (LEAP 360)		LEAP 360 diagnostic and interim assessments (p. 15)
Maine	General Assessment (Maine Educational Assessment)		Unspecified assessments⁷ (p. 4)
Maryland	General Assessment (Maryland Comprehensive Assessment Program - MCAP)		NWEA MAP⁸ (p. 9)
Michigan		Acadience (formerly DIBELS Next)	NWEA (p. 6)
Mississippi	General Assessment (Mississippi Academic Assessment Program)		STAR NWEA MAP I-Ready (p. 30-31)
Missouri	General Assessment (Missouri Assessment Program)		Common formative assessments⁹ (p. 4)
Nebraska	General Assessment (Nebraska Student Centered Assessment System -NSCAS)		NWEA MAP (p. 8) Teaching Strategies GOLD (TS GOLD) (p. 36)
Nevada	General Assessment (Smarter Balanced)		AIMSweb (p. 37)
New Mexico		Istation (p. 9)	General Assessments (NMSTAMELA) (p. 27) Istation (p. 27, 40)
New York	General Assessment (New York State English Language Arts Assessment)		STAR (p. 26) DIBELS (p. 29) AimsWeb (p. 26) Fountas and Pinnell (p. 26) NWEA (p. 26)
Ohio	General Assessment	State approved diagnostic reading assessment (K-2)	Acadience (formerly DIBELS Next) (p. 21) AIMSweb (p. 21) Kindergarten Readiness Assessment (p. 10)
Oregon	General Assessment (Smarter Balanced)⁹		Screening data (specific assessments not identified) (p. 21)
Oklahoma¹⁰	General Assessment (Oklahoma School Testing Program – OSTP)
Palau	General Assessment (Palau English Reading Assessment – PERA)		Reading Success Network (RSN) English reading screening tool (p. 16) easyCBM (p. 16)
Puerto Rico	General Assessment (Measurement and Evaluation for Academic Transformation of Puerto Rico – META-PR)
Rhode Island	General Assessment (Rhode Island Comprehensive Assessment System – RICAS)		STAR AIMSweb Monitoring Basic Skills Progress (MBSP)¹¹ (p. 23)
South Carolina	General Assessment (SC Ready)		Universal screening tools¹² (p.17)
South Dakota	General Assessment (Smarted Balanced)		Formative school-based assessments (p. 3)
Tennessee	General Assessment (Tennessee Comprehensive Assessment Program – TCAP)		Universal screening tools (p. 20)
Texas	General Assessment (STAAR) Alternate Assessment (STAAR Alternate 2)
Utah	General Assessment (Readiness Improvement Success Empowerment – RISE) Alternate Assessment		Unspecified assessments¹³ (p. 28)
Vermont	General Assessment (Smarter Balanced)		Unspecified formative/interim assessments (optional) (p. 34)
Virgin Islands	General Assessment (Smarter Balanced)		i-Ready (p. 16)
Washington		Washington Kindergarten Inventory of Developing Skills (WaKIDS) (p. 10)
Wisconsin	General Assessments¹⁴ Alternate Assessment (Dynamic Learning Maps) (p. 49)
Wyoming	General Assessment (PAWS/WYTOPP)		Unspecified evaluation measures (p. 13)
Totals
States with assessment – related SIMR = 43	General Assessment = 35 Alternate Assessment= 7	Other Assessments= 9	States including assessments as measures of progress toward SIMR outcomes = 35 STAR = 7 Acadience/ DIBELS/DIBELS Next= 7 AIMSweb= 7 NWEA= 7 I-Ready= 3 Other specified assessments¹⁵ = 14 Unspecified assessments16 = 12

Note: Most of the data in this table was compiled from analysis of state’s SIMRs (see Table A1). For data elements not in the SIMR, but that were found elsewhere in the SSIP, page numbers are listed after the data element.

¹American Samoa: The SBA – DL is different from the SBA used in 3rd grade (which is the SIMR measure). The SBA-DL is a pre- and post-test, which addresses the standards and benchmarks taught in each level (K - 3).

²Arkansas: See https://myschoolinfo.arkansas.gov/Content/ESSA/2019/16_School_Growth_Explanation.pdf for information about how the value added score is calculated using the general assessment.

³Connecticut: Districts can select which of these approved assessments they use.

⁴Guam: The Department’s Curriculum & Instructional Improvement (CII) procured the use of AIMSwebPlus system-wide as the reading screening and progress monitoring tool. In addition, the following additional reading assessments were identified and procured to support the AIMS: (1) DRA-2 Developmental Reading Assessment, 2nd Edition (DRA-2), (2) Fountas & Pinnell Benchmark Assessment System, 3rd Edition (BAS-3), and Qualitative Reading Inventory, 6th Edition (QRI-6) for grades 3-5. (p. 100)

⁵Hawaii: Measures of progress selected by Complex Area.

⁶Idaho: Measures of progress selected by districts. Some used Idaho’s Reading Indicator (IRI) as a measure of progress. The IRI is an early reading screener and diagnostic assessment administered to all K-3 public school students. For additional details about the IRI see https://www.sde.idaho.gov/assessment/iri/

⁷Maine: SSIP discusses training on formative assessments but does not list specific assessments.

⁸Maryland: NWEA MAP used as a universal screener in some schools.

⁹Missouri: Educators develop and administer common formative assessments (p. 4).

¹⁰Oklahoma: Oklahoma shifted to a new SIMR unrelated to assessment in 2019, so the SIMR included in this report is from 2013-2018. Data reported is from Oklahoma’s old SIMR (2013-2018).

¹¹Rhode Island: Different sites use different assessments and tools (p. 49)

¹²South Carolina: Different schools use different universal screening tools.

¹³Utah: Local Education Agencies (LEAs) develop or select their own benchmarks for formative assessment (assessments not specified). (p. 28)

¹⁴Wisconsin: Wisconsin has had several general assessments during its SSIP cycle (Wisconsin Knowledge and Concepts Exam - until 2014-15; Badger Exam - 2014-15; and Forward Exam – 2015-16 - present)

¹⁵Other specified assessments include: Samoan English Picture Vocabulary Test (SEPVT); Samoan Picture Vocabulary Test (SPVT); Standard Based Assessment (SBA) – DL; I-Reading diagnostic assessment; Planning Reading Tiered Fidelity Inventory (T-TFI); Idaho Reading Indicator (IRI); Planning and Evaluation Tool for Schoolwide Reading Programs (PET-R); Recognizing Effective Special Education Teachers rubric (RESET); ISTAR-KR; ISPROUT; FAST; LEAP 360 diagnostic and interim assessments; Teaching Strategies GOLD (TS GOLD); Istation; Fountas and Pinnell; Kindergarten Readiness Assessment; Reading Success Network (RSN) English reading screening tool; easyCBM; and Monitoring Basic Skills Progress (MBSP)

¹⁶Unspecified assessments refer to assessments that are not mentioned by name in the SSIP (e.g., screening data, universal screening tools, common formative assessments).

Table A4. Details and Specifications, Including Any Data Limitations Identified by State in FFY 2018 SSIP

State	Details
American Samoa	Data Limitations A data analysis issue (being discussed as a quality of analysis issue) is the small number of students with disabilities in the pilot schools. It is not whether the data is correct or not, but how small numbers of students lead to data fluctuation from year to year due to individual student characteristics or other reasons such as school staff changes, and, as a consequence, data on small numbers of students may limit the analysis. (p. 34)
Arizona	Data Limitations Benchmark tools determined by each individual PEA. (p. 16) Arizona does not mandate administration of PEA benchmarks to assess student progress towards the Arizona English Language Arts Standards 2016. The SEA requests this data from the SSIP PEAs to assist in driving decisions, however the statewide assessment is the only mandated, consistent data source for the SEA to use for the collection of literacy data. As such, some inconsistency is evident in reported benchmark data, including missing benchmark data from PEAs that have either opted out of the benchmarking process, or opted out of the reporting of benchmarks. While the benchmark tools reported are aligned to grade-level ELA standards, SSIP PEAs administer a variety of assessments, making data-based decisions related to benchmarks impossible to do at the SEA level. (p 39)
Arkansas	Data Limitations The SiMR uses a value-added growth model that does not set projection scores, but rather prediction scores for each student. This difference between the actual score and the prediction score results in a residual or the value-added score (VAS). By using the same model approved in the Arkansas ESSA Plan, there are less data quality concerns. However, a student must have two or more years of state assessment data to be included in the growth model. (p. 74)
California	Data Limitations The SED continues to build an integrated approach to monitoring. As highlighted in last year’s phase III, year three SSIP report, the SED incorporated changes to selecting LEAs for SED monitoring processes using the same data and accountability indicators that are used in the Dashboard when possible and as appropriate. (p. 13) Additionally, the nature of California’s SSIP lends itself to qualitative evaluation measures, which produces information that is more representative of SSIP implementation progress. However, connecting this qualitative information to a single quantitative measure, the SIMR, presents unique challenges. California continues to work toward effectively demonstrating how SSIP implementation and the creation of the SOS will impact outcomes for SWD and the SIMR, specifically. (p. 24)
Colorado	Details/ Specifications We plan to establish a new baseline in FFY 2019 based on the Project Schools rather than the statewide baseline that is in place. (p. 10)
Commonwealth of the Northern Mariana Islands	Data Limitations The school data reflects that students who were not able to be screened in reading with Renaissance STAR Early Literacy or STAR Reading were screened with an alternative screening tool or that significant efforts were made to screen students on alternate dates if they were absent during the screening period. No data was provided indicating the level of proficiency for these students. (p. 51) [Need to] assist the schools in determining the proficiency levels of students screened with an alternative screening tool and incorporating into classroom, grade level, and school-wide screening data. (p. 51) In Phase III, the SSIP Core Team, with input from teachers, was unsure about the validity or reliability of the screening data prior to the Renaissance training and prior to the finalization of the Standard Operating Procedures (SOP). There were no procedures in place to observe the implementation of the screening to ensure the procedures were carried out with fidelity. There were a number of students who were not screened due to absenteeism or other reasons during the screening window and there were no procedures in place to reopen the window for the students who were missed. There were no procedures in place to use an alternative measure to screen students who could not perform on the STAR EL or STAR Reading. (pp. 55-56)
Connecticut	Data Limitations For FFY 2018, the area that continued to be the biggest challenge is the analysis of district universal screening data. The State has developed a menu of approved assessments from which districts may select. There are currently six assessments on the list: 1. AIMSweb Tests of Early Literacy or Reading 2. Dynamic Indicators of Basic Early Literacy Skills (DIBELS) 3. DIBELS Next and mCLASS 4. NWEA Measures of Academic Progress (MAP) 5. STAR Reading Assessment 6. i-Ready Diagnostic Reading Assessment As there is not one uniform assessment used by districts throughout the state, it is difficult to incorporate data from these assessments in the district selection process and follow-up progress monitoring. Additionally, some districts do not have the capacity to easily disaggregate the data by subgroup, and different subtests may be administered in the different grade levels (K-3), which hinders cross-grade comparison. In fact, some districts use different assessments at different grade levels. As part of the Tier 2 technical assistance session, the CSDE asks districts to provide the previous year’s universal screening data from the fall, winter, and spring administrations for SWD is grades K-3. These data are also requested as part of our progress monitoring of districts; however, the follow-up monitoring cycle is affected by the time it takes to provide technical assistance and for improvement activities to be implemented. As a result, the subsequent data reviewed for progress monitoring represents different points in time across two school years. (p. 16)
Delaware	Data Limitations We are still working with participating schools to gather accurate and reliable student screening/benchmarking data, as well as the percentage of students receiving tiered instruction to complement the statewide SBAC assessment is our only measure of student performance…These data limitations should not significantly impact the ability to assess progress. (p. 27)
Federated States of Micronesia
Guam	Data Limitations Due to the discontinuation of the aimsweb2.0 version, the GDOE schools had to transition from aimsweb2.0 to aimswebPlus version as the universal screener. As a result of the transition, a comparison of results for SY2019-2020 cannot be made with previous school years as there are differences in reading measures as well as administration. (p. 61) Standards-based grading is referenced in meetings but performance on proficiency is not reported. The baseline data referenced is aimswebPlus with measurement for goals using teacher standards. (p. 77) AimswebPlus has been in use by the district for one school year. Therefore, as a result of this relatively short time period, teachers are still not proficient in the administration and scoring of the measures. (p. 81)
Hawaii	Data Limitations As with any large, dynamic system, assessing the quality of implementation of stated strategies and the impact on student achievement within an entire state is challenging. Each complex area and school utilized a variety of methodologies and measurement instruments, including walk-throughs and progress monitoring assessment tools (e.g., timely execution of compliance requirements, IEP case sampling), to ensure high quality data. HIDOE continues to evaluate alternative methods of data collection, such as the collection and use of existing planning and implementation artifacts. Implementation data needs to be assessed on a continual basis. Although implementation tools are being discussed as a larger component, subsections within those implementation tools will need further assessment. (p. 44) The implementation data set that continues to be a challenge is the variation of progress monitoring tools measuring the effect professional development, training and technical assistance has on student achievement and where these resources best fit in a tri-level system. As HIDOE continues to gather, evaluate, and aggregate school-level progress monitoring tools, a constant shift continues to make it difficult to look at possible correlations between statewide assessments and use of these tools. Although it is for the betterment of improving education, aggregating the data to focus on academic change and determining the need to change or adopt a new strategy at the system or local level has proven to be an ongoing challenge for both state and complex area staff. (p.44)
Idaho	Details/ Specifications Schools in Cohort 1 reduced the percentage of 4th grade students with disabilities taking the Idaho Alternate Assessment (IDAA) in ELA from 20% to 3.66% during the same time period. This large decrease in IDAA ELA participation and subsequent increase in regular assessment participation by lower performing students significantly impacted the ELA proficiency rate of Cohort 1. (p 36) To control for the relatively large decrease in percentage of students moving off the IDAA from 2018 to 2019, the ISDE evaluation team explored ISAT data by removing students from the analysis who took the IDAA in one year but not the following year or vice versa. (p. 38)
Illinois	Details/ Specifications Because the ELA state assessment is a distal measure of students’ performance, the IL MTSS-N worked on its capacity to collect and analyze local district assessments that are more proximal and sensitive to change, (e.g., AIMSweb, STAR, and NWEA MAP) in Phase III, Year 4. (p.14) Data Limitations One data limitation is in the area of standardization across progress monitoring tools across TZ districts. In Phase III, Year 4, the IL MTSS-N piloted a process to report uniform local district assessment data (e.g., AIMSweb, STAR, and NWEA MAP) for the SSIP. The IL MTSS-N purchased a software license for a data dashboard to meet multiple needs, such as collecting and analyzing data in meaningful ways. The goal was to enhance ISBE’s ability to judge progress toward achieving the SSIP long-term outcome (SIMR). The IL MTSS-N did pilot the collection of local progress monitoring data across all but one of the TZ districts. A survey that would standardize the data across schools and benchmarking assessments was utilized. However, most district staff demonstrated that they did not have the capacity to organize their data in alignment with the survey; therefore, districts sent PDFs, Excel documents, and Word documents of each of their different types of benchmark assessments to the IL MTSS-N. Not all districts provided clean, clear, and workable data. Some local district data systems did not have the capability to disaggregate by students with IEPs. Pilot results indicated a lack of conformity across the eleven TZ districts. The lack of conformity did not allow for the necessary standardization. Therefore, ISBE could not include the depth of information regarding growth as evidenced by local progress monitoring data as it originally intended for Phase III, Year 4. IL MTSS-N has a team that is working to improve the local assessment data collection procedures for next year. This team has gathered input from relevant parties and is developing a plan to improve the data collection process for next year given the observed issues with the pilot. (p. 38)
Indiana
Iowa	Details/ Specifications (including any stated purposes for using interim assessments): Student performance in grades 2 through 6 was assessed using the universal literacy screening assessment, the Curriculum-Based Measurement for reading (CBMr) from FastBridge Learning. The CBMr provides an index for word reading efficiency—a predictor of reading comprehension—by measuring the number of words read correctly (WRC) in a 1-minute timed test. The study measured changes in (1) the percentage of students who met grade-level benchmarks for the number of WRC, (2) the average rate of improvement, and (3) the percentage of students who made expected and ambitious growth gains from the 2016–2017 to the 2017–2018 school year. (p. 33)
Kansas	Details/ Specifications For students to reach grade-level benchmark on a Curriculum-Based Measure General Outcome Measure (CBM-GOM), both fluency consistent with the grade-level criteria and 95% accuracy must be achieved. When students struggle learning to read, initial intervention focuses on improvement in accuracy and then shifts to improvement in fluency, which allows the students to achieve benchmark. (p. 3) CBM-GOM Universal Screening Data in Reading: Reading CBM-GOM screening is conducted in fall, winter, and spring. Student and grade level composite data support customized coaching and collaborative team, school, and district decision making. These data are reviewed by District Leadership Teams, Building Leadership Teams, and Collaborative Teams and summarized in the Kansas MTSS and Alignment Collaborative Team Progress Planner. (p. 36)
Kentucky
Louisiana	Details/ Specifications The SSIP utilizes LEAP 360 (a new statewide assessment system with diagnostic and interim assessments aligned to state standards) to measure student progress throughout the school year as well as curriculum imbedded formative assessments. (p. 31) Data Limitations Changes within the SSIP cohort over time. Due to the long-term commitment of the SSIP, work and changing personnel and priorities within school systems our SSIP cohort has changed over time. Three of the original nine school systems have decided to discontinue their participation in the SSIP cohort. To stabilize the cohort and maintain the integrity of data collection, three new school systems were identified, through the school redesign process, to participate in the SSIP cohort. These new members of the cohort are school systems that have been identified as having low performance for particular sub-group student populations, specifically students with disabilities. These three new school systems are exceptionally eager to engage in the work of our SSIP. Because of these changes, any comparison of the cohort over time will be challenging. (p. 39)
Maine	Details: Specifications The training included a focus on the use of a diagnostic screening tool to pinpoint student difficulties across several areas and the use of formative assessments and formative feedback. (pp. 3-4) Data Limitations Stakeholders, Math4ME external evaluators, and DOE staff have discussed the concern that the current measure of student proficiency, the statewide Maine Educational Assessment (MEA), is relatively broad-based and, compared to other assessments that might be used, not as focused on the more specific aspects of student learning that are expected to increase as a result instruction by Math4ME teachers. Other assessments that are commonly used in classrooms may be more sensitive to increases in student performance. (p. 23)
Maryland	Data Limitations Originally, the Partnership for Assessment of Readiness for College and Careers test (PARCC) was identified as the measure for this outcome. However, it was given for the last time in 2018 and has been replaced with the Maryland Comprehensive Assessment Program (MCAP). The new tests are broken down into math and English, the same as the PARCC exams, although MCAP exams will be broken down further in order to give more flexibility for schools. (p. 4) Comparisons in data collection over time, across districts, and among school, district, and State data sources. State assessment data is collected only once a year, and the PARCC data has not been sensitive to changes in growth of student proficiency over time, especially for lower performing subgroups. While this is the primary data source identified to measure progress toward the SiMR, MSDE has looked to local data sources to evaluate student performance and progress. At the school level, teachers use formative assessments to monitor their students, which are important to inform instruction, but not to evaluate progress. Universal screening and progress monitoring data used by districts vary from one local jurisdiction to another; and sometimes across years within one district or across grades within a year. This makes it impossible to aggregate those data for any analyses or to examine trends over time. (p. 41)
Michigan	Details/ Specifications The State Identified Measurable Result (SiMR) is intended to stay the same—that is, improved reading proficiency for students with disabilities. However, a recommendation was made to SEAC to expand the SiMR from grades K-3 to grades K-5. The measure will be the Acadience Reading assessment, a universal screening and progress monitoring assessment that measures the acquisition of early literacy skills from kindergarten through sixth grade. Acadience Reading is comprised of six brief measures that function as indicators of the essential skills that every child must master to become a proficient reader. In previous years, two ISDs and five districts within the MDE transformation zone served as the data source for reporting on the SiMR. Moving forward, the sample of schools for the SiMR will be drawn from districts receiving professional learning and technical assistance support in the identified EBPs within an MTSS framework from the MiMTSS TA Center. (p. 28)
Mississippi	Details: Specifications Districts could select a state approved universal screener. Of the 21 districts with SSIP literacy coaches, 15 selected STAR, 5 NWEA MAP, and 10 i-Ready. (pp. 30-31)
Missouri	Data Limitations Concerns about understanding what data to collect and if the systems are collecting this data accurately. (p. 26)
Nebraska	Details/ Specifications During the 2017-18 school year, the state developed interim data measures for the SIMR. The State began obtaining MOUs between the districts and NWEA to obtain MAP data that is planned to be used to monitor reading proficiency prior to the 3rd grade statewide reading assessment to better analyze the extent to which the strategies implemented have had an effect. MAP data will also be used to measure progress toward the Growth Goals that were established when the SIMR was updated for Phase III. (p. 38) Data Limitations The biggest data limitation is the number of times districts administer the MAP assessment. Only districts who administered the MAP assessment three times during the 2018-19 school year were analyzed which omitted some districts from the interim analysis. However, given there were so few districts that didn’t test three times, NDE is confident in the baseline data obtained from the analysis and hopes to establish a trend in the number of students identified as “at-risk” readers in order to establish targets to reduce the overall number of students considered “at-risk.” (p. 38) The current statewide data collection does not permit real-time viewing of data and has limits based on collection fields. Nebraska changed the vendor providing the statewide assessments in 2017 which impacted the ability of the Office of Special Education to compare reading proficiency results for students with disabilities in an equitable manner. Another consideration with the measurement of the SIMR is that the statewide measure of reading proficiency begins at the 3rd grade level. (p. 39)
Nevada
New Mexico	Data Limitations Data limitations affecting progress reports included change in state accountability reading assessment (DIBELS to Istation) as well as the end of year state assessment (PARCC to NMSTAMELA), and data collection processes and procedures. (p. 40)
New York	Data Limitations NYSED does not prescribe or require specific instruments to be implemented for collection of student-level data (i.e., screening, benchmark academic, behavior); this is a local decision. Regional Specialists were directed to leverage existing assets of each school to ensure efficiency and to not exceed capacity of district and school resources. As mentioned earlier in this report, a data collection workbook and collection schedule were provided to Regional Teams supporting SSIP schools to ensure consistency of data points, disaggregation methods, and reliable tools. Nevertheless, many challenges arose indicating that data systems and structures are lacking across the 14 schools. Some common themes emerged: • Even within the same district, schools are using different tools to gather and report data (AIMSweb, DIBELS, Fountas and Pinnell, etc.). Some schools/districts are further ahead with how they share data at a glance to drive decision making. (p. 29)
Ohio	Data Limitations Ohio’s education system should interpret results for SIMR 2 with caution. There may be inconsistencies in reading diagnostic assessments across time as schools select different assessments each year. Additionally, each district using a reading diagnostic can select its own benchmark to measure “on track,” if that benchmark is above the vendor-recommended cutoff. The state does not track the benchmarks or reading diagnostic selected by districts each year. It is possible that either the benchmark, assessment, or both have changed in each district since the start of the pilot” (pp. 2-3). Again, Ohio must interpret these differences with caution, understanding they may not be due to pilot implementation because the state also saw a significant decrease (12.9 percent) in the percentage of students on track for reading proficiency since the start of the pilot.” (p. 3) Changing definitions and instruments limit the ability to make comparisons over time. For example, there may be inconsistency with reading diagnostic assessments, used to measure state-identified measurable result 2, over time as schools are able to select different assessments each year. (p.36)
Oregon	Data Limitations The SEA considers both summative and interim (screening) assessments when noting progress toward the SIMR. Oregon law permits students to opt out of participation in summative assessments, contributing to varying rates of district participation. The SEA did not examine summative assessment participation rates when examining statewide assessment data. It may not be accurate to draw conclusions about the performance of all grade three students with disabilities in Oregon when assessment participation rates varied among districts. Furthermore, the population included in the SIMR target includes students statewide, and the SEA is only able to provide implementation supports for a limited number of districts. The SEA cannot expect that intervening directly with few districts will significantly impact statewide assessment results within the reporting phases of the SSIP. The SEA also notes limitations related to using reading screening data as a measure of progress toward the SIMR. Both of the districts in Cohort B selected to focus MTSS implementation at the secondary level. While the SEA continued to collect reading screening data for these districts, these districts did not select to include literacy as a priority focus area of MTSS implementation. Of the three Cohort C districts participating in SSIP/SPDG supports and ORTII literacy supports in Phase III-4, one district submitted literacy screening data to the SEA. Due to the limited quantity and applicability of reading screening data, the SEA is not able to reliably infer progress toward the SIMR from these reading screening data. The SSIP/SPDG coordinators also examined literacy screening data for districts statewide participating in ORTIi supports. There were 12 districts receiving ORTIi elementary literacy supports during Phase III-4. Of these 12, three districts also receiving SSIP/SPDG supports (Cohort C). While the screening data represents pockets of implementation across the state, these are not necessarily the same districts working within the MTSS coaching established through the SSIP/SPDG. (pp. 30-31)
Oklahoma
Palau	Data Limitations: Some of the issues on the data process were related to: • Challenges continue for data reporting to the SSIP Core Team on a timely manner. • The SSIP Core Team needs to receive the data analysis timely so that planning for training activities could be based on the data, such as previous training results, student screening results, survey results, and other relevant information. (p. 55)
Puerto Rico	Data Limitations: One of the principal limitations that affected the data collection was obtaining the data for the pre-posttest for the students in the participating schools. PRDE suggests to teachers to administer a pretest at the beginning of the school year. The results give important information to the teachers to identify the needs of their students and gives a base to the teachers on what material needs to be reinforced (from the last semester). The pre and posttest weren’t administered in the participating schools due to all the work load that the teachers had since the Humacao ORE was the one more affected by Hurricane Maria. (p. 37)
Rhode Island	Data Limitations: One major area of concern is that sites use different local assessments and tools to collect universal screening and ongoing progress monitoring data. The data collection tool we refined after pilot use has been helpful as we look across various screening results from different measures. The student-level DBI [data-based individualization] case studies also reflect schools’ use of different local assessments. This reporting year is the first year in which we aggregated formative assessment data at the student level gathered through the student-level DBI case studies. Only seven case study students had complete data, which limits the Math Project’s ability to determine if the progress they made toward ambitious, individualized goals in targeted areas of need would extend to other students in the schools. (p. 49) A critical component of the student case study was to select and implement a progress monitoring tool to track growth in the student’s mathematical skills and abilities. Tools used to monitor students’ progress were AIMSweb, STAR Math, and Monitoring Basic Skills Progress (MBSP). The frequency with which the assessments were conducted varied according to the student deficit areas being targeted and the progress monitoring measure’s administration recommendations. For example, MBSP is administered weekly, whereas STAR Math typically is administered monthly. (p. 23) Reviewing progress on the SiMR from Phase I through the April 2019 submission has been challenging with two state assessment changes and two baseline resets. Examination of local data, implementation data, and other evaluation measures as described previously continue to be vital to understanding progress in improving outcomes for the target population. (p. 49) To address the data quality issues raised in the previous year’s report related to the lack of common assessments to screen and progress monitor students, the Math Project created a screening data collection tool. Continued training of school-level participants to extract universal screening data by disability category and race will improve future outcome measures. In addition, continuing to expand the case-study approach to examine progress monitoring data for specific disabilities and races will strengthen data quality in the evaluation. (pp. 49-50)
South Carolina	Data Limitations As pointed out in previous SSIP reports, comparison of third grade scores year-by-year does not yield sensitive and reliable measures of growth. South Carolina continues to struggle with the lack of an easily accessible, student-level data system to collect and report formative assessment data at the state level. (p. 21)
South Dakota	Data Limitations No data quality issues are surrounding the evaluation measures in the 2019-20 school year. There could be a data quantity issue in that only one SSIP district is providing information on all the evaluation measures in the 2019-20 Evaluation Plan. Though data is collected from one district, this district is one of the largest in the state. As mentioned in the prior SSIP APR, four of the five SSIP districts decided to sustain the SSIP work on their own, so they are not providing any evaluation information other than participating in the follow-up phone interviews (which gauges the extent of their sustainability efforts) and the state test. Both of these measures are very important for judging the success of the SSIP, so having the “sustainability” districts participate in these measures was very important. (p. 14)
Tennessee	Details: Specifications For question 15 in the evaluation plan, a sampling of students’ universal screening data is required to determine improvement in scores from the beginning of the school year to the end of the school year. Though these data are valuable and appropriately address the goal of increasing the rate of improvement in areas of deficit, capacity once again becomes a concern for both the department and district staff, who will be responsible for providing the universal screening data. In light of this, the evaluation team had to pull a limited selection of student records to determine improvements. (p. 24) The department has developed a method by which to evaluate progress across different universal screeners and communicating the need for this data with participating districts. To address concerns about different universal screeners providing different data for districts, the department developed a more fundamental metric in which progress was assessed at the district level, and categories of “increase,” “decrease,” or “same” were used to see change in universal screener data, rather than more nuanced data that might be tool-specific. This same methodology was employed for the SSIP 1.0 districts in the 2017-18 school year. (p. 21)
Texas
Utah	Data Limitations: Because LEAs develop or select their own benchmarks for formative assessment and measuring fidelity of implementation, Utah will continue to provide guidance on assessing the reliability and validity of these measures and interpreting findings, particularly if the outcomes reported by LEAs using these measures do not correlate with the statewide end of level assessment data. To date, this has not been an issue, and Utah will address the discrepancies with individual LEAs as they arise. It is less likely that these measures will be assessed for reliability of data, so Utah will not know the extent to which they provide reliable data and accurately measure the constructs they target. Formative evaluation findings based on these potentially less reliable measures will be tempered accordingly. However, given the focus on the SIMR and RISE results, Utah is confident that our summative conclusions are valid and will remain the key target. Given Utah’s political focus on local control, LEAs report other aggregated data (i.e., formative assessments, implementation fidelity using LEA- created/selected instrumentation) and sample selection procedures to the USBE. These samples and procedures may vary across LEAs. (pp. 28-29)
Vermont	Data Limitations: Information and activities need to be more closely targeting the SiMR in a way to effect change (i.e., math proficiency for students identified as having an emotional disturbance in grades 3, 4, & 5); Vermont is a small state, therefore small “n” size continues to be a limitation within certain regions of the state. Data from those regions will need to be reported in aggregate form during the scale-up phase of the SSIP work. The VT SiMR was originally established to only include students in grades 3-5 identified as having an emotional disturbance as their primary disability on their IEP. Beginning with the 2019 SBAC data included in this report, Vermont has broadened the reporting of its SiMR data to include all students in grades 3, 4, & 5 identified as having an emotional disturbance, regardless if the disability was considered primary, secondary, or tertiary. Expanding the SiMR requires changing our SPP/APR baseline and target numbers. Vermont is extending current targets through federal fiscal year (FFY) 2019. New targets will be set after presenting data to stakeholder groups and receiving their feedback and advisement. The aim is to have targets which are rigorous yet achievable. Key stakeholder input on this was obtained through the Special Education Advisory Council. (p. 25)
Virgin Islands	Data Limitations: The VIDE/SOSE was unable to carry-out a large number of the coherent improvement strategies listed in RtI and PBIS logic models, particularly data collection and implementation of PBIS and RtI in both districts. (p. 39)
Washington	Data Limitations The State Design Team noted that Franklin Pierce School District participated in a separate surveying process for state monitoring and their data was not included in the overall data summary shared by WSU for indicator 17. For this reason, the total number of respondents, and other factors (race/ethnicity, LRE, survey language, etc.) have shifted significantly from FFY 2017. (p. 30) As stated previously, of significance is that the requirement for full implementation of the WaKIDS assessment as part of the Full-Day Kindergarten legislation took place over a series of stages, first being a pilot in 2010–11, leading to full implementation in 2017–18. Stakeholders expressed concern that there appears to be a correlation between the increase in the number of students with disabilities participating in the WaKIDS assessment and a variety of factors, including: TSG platform change which required new learning for seasoned staff; uploading errors that were not identifying students by race, gender, or IEP status; and poor recruitment of special education staff and specialists. (p. 34)
Wisconsin	Details/ Specifications Wisconsin continues to be a leader in designing and implementing high quality integrated data systems for student-level data. In 2016-17, WDPI transitioned to a new system, WISEdata, to reduce duplicate data collection tools and processes and replace outdated data collection software. This has resulted in reduced burden and streamlined data reporting requirements for districts. (p. 55). Data Limitations: Like many states, WDPI has experienced changes in regular statewide assessment tools (in 2014-15 and 2015-16) that complicates year to year comparison of test results. However, Wisconsin’s SiMR is designed as a points-based proficiency measure averaged over three years of data, and is thus more resilient to changes in assessment than a raw single year proficiency rate might be. Maintaining accurate and comprehensive data has been a key goal in the design of data collection tools and systems used in the SSIP Evaluation, and Wisconsin’s depth of application development resources will allow us to accomplish this goal. (p. 55)
Wyoming	Data Limitations In general, the data collected have been of high quality, and the WYSSIP Team has had very few concerns. The most important data for evaluating progress is the State Test Data. This high-quality data is being collected on all students. One complicating factor to examining state test data over time is that during the 2017-2018 school year, the WDE adopted the Wyoming Test of Proficiency and Progress (WYTOPP) as the new state assessment. This means that any increase or decrease in reading proficiency rates from 2016-17 to 2017-18 could be a function of the new test and not a function of any real increase or decrease in actual reading achievement. (p. 13)

Note: Page numbers refer to the page where the information is found in each state’s SSIP.

NCEO Report 425

The Role of Assessment Data in State Systemic Improvement Plans (SSIPs): An Analysis of FFY 2018 SSIPs

Acknowledgments

Table of Contents

Executive Summary

Overview

Method

Results

Content Areas

Grade Levels

Target Student Population

Assessment Type

Data Limitations

Discussion

References

Appendix A

Analysis of States’ FFY 2018 SSIP (Indicator 17) Submission