States' Out-of-Level Testing Policies

Out-of-Level Testing Project Report 4

Published by the National Center on Educational Outcomes

Prepared by Martha Thurlow and Jane Minnema

June 2001

This document has been archived by NCEO because some of the information it contains may be out of date.

Any or all portions of this document may be reproduced and distributed without prior permission, provided the source is cited as:

Thurlow, M., & Minnema, J. (2001). States' out-of-level testing policies (Out-of-Level Testing Project Report 4). Minneapolis, MN: University of Minnesota, National Center on Educational Outcomes. Retrieved [today's date], from the World Wide Web: http://education.umn.edu/NCEO/OnlinePubs/OOLT4.html

Executive Summary

Throughout the past decade, state policy agendas on education have supported reform movements that foster a standards-based approach to classroom instruction. Large-scale assessments that measure student progress toward meeting grade level content standards are a prominent feature of these educational reform efforts. One of the overarching goals of a statewide testing program is to provide accountability results that can be used to improve local schools. Forty-eight states have designed their own statewide assessment systems that strive to meet the assessment specifications set out in federal policy. Unfortunately, even with the addition of alternate assessments to statewide assessment systems, states are finding that their standards-based, statewide tests are not good measures of all students’ abilities.

In an attempt to include more students with disabilities in large-scale assessments, 12 states (Alaska, Arizona, California, Connecticut, Delaware, Iowa, Louisiana, North Dakota, South Carolina, Utah, Vermont, and West Virginia) allowed out-of-level testing during the 2000-2001 school year. However, to date, there are no empirical data to guide policymakers’ decisions about out-of-level testing. While descriptive data are accruing that describe the status of testing students with disabilities across those states that allow this approach to testing, no study has described the state level policies that guide the implementation of out-of-level testing.

The purpose of this report is two-fold. First, we discuss the context of state assessment systems, including the accountability practices in which out-of-level testing is implemented. Second, we describe specific out-of-level testing policies in those states with policies that allow students to be tested out of level in large-scale assessments. By reviewing all 12 states’ policies on out-of-level testing, we gleaned themes of results concerning state-level policy features, assessment instrument characteristics, required implementation practices, and test score uses. Generally speaking, there was wide variability across states in both policy content and suggested practices for implementing out-of-level testing at the local level. The one point of commonality (11 of 12 states) was the practice of testing students with disabilities out of level.

We conclude our report by highlighting four discussion points. First, there is a need to increase the specificity of out-of-level testing policy language to guide testing at the local level in a suitable manner. Second, our analyses indicated that four labels (accommodation, modification, nonstandard, and alternate assessment) are used for out-of-level testing across the states, but there is little consistency in what the terms mean. Third, what is reflected in state policy may not reflect actual implementation. Fourth, the long-term effects on students who are tested out of level in large-scale assessment programs are unknown at this time. There is a strong need for research, policy, and assessment communities to join together in an effort to resolve the many contentious issues that surround out-of-level testing.

Overview

Throughout the past decade, state policy agendas on education have supported reform movements that foster a standards-based approach to classroom instruction. Large-scale assessments that measure student progress toward meeting grade level content standards are a prominent feature of these educational reform efforts. One of the overarching goals of a statewide testing program is to provide accountability results that can be used to improve local schools (Olson, Bond, & Andrews, 1999). Forty-eight states have designed their own statewide assessment systems that strive to meet the assessment specifications set out in federal policy.

Federal legislation has, in part, shaped the context for improving local schools through statewide assessments. With the reauthorization of the Individuals with Disabilities Education Act of 1997 (IDEA 97), schools were mandated to include all students in statewide assessment programs. To support this endeavor, states were also required to develop and administer alternate assessments by July 2000 for the small number of students in each school district whose assessment needs were not met by the regular statewide assessment. Unfortunately, even with the addition of alternate assessments to statewide assessment systems, states are finding that their standards-based statewide tests are not good measures of all students’ abilities. Often they view this as a student problem – the student does not fit into the assessment system. Sometimes they recognize that the problem lies in the assessment system itself (Almond, Quenemoen, Olsen, & Thurlow, 2000).

Large-scale assessments tend to be global quantitative measures of student group progress toward attaining state standards in specific curricular content areas. With the implementation of these global measures, concerns arose at the practical level about how all students could possibly participate in statewide tests. Policymakers, educators, and parents of students with disabilities contended that the test item content on statewide assessments did not adequately test what all students know (Minnema, Thurlow, & Scott, 2001). In other words, the instruments constructed to satisfy the assessment policies developed by state education agencies did not satisfy all individuals who had a stake in implementing statewide assessments in local education agencies.

Since the purpose of a statewide assessment is to measure students’ progress toward achieving state standards in specific curricular content areas, tests are administered at specific grade levels. However, some advocacy groups maintain that there is a segment of students who are striving to achieve grade level standards, but are doing so at a slower pace than their grade level peers. Consequently, the statewide assessment may be too difficult for these students. From this perspective, the test experience for these students is frustrating and embarrassing (Minnema et al., 2001). In addition, teachers believe that a statewide assessment that does not adequately measure what a student knows yields useless information for making good instructional decisions. As a partial solution to these stakeholder concerns, 12 states (Alaska, Arizona, California, Connecticut, Delaware, Iowa, Louisiana, North Dakota, South Carolina, Utah, Vermont, and West Virginia) had developed out-of-level testing programs that were active during the 2000-2001 school year. All but one of these states use out-of-level testing in their statewide large-scale assessment systems. Iowa is the only state that does not have a large-scale assessment mandated statewide.

The number of states choosing to use some type of out-of-level measure has increased in recent years, possibly in response to the requirement to include all students in assessments. See Table 1 for a display of the status of out-of-level testing as of school year 2000–2001.

Table 1. Implementation History of States with Out-of-Level Testing

Past Implementation	Implementation 1999	Implementation 2000
Arizona, California, Connecticut, Iowa, North Dakota, Utah, Vermont, West Virginia	Alaska, Louisiana, South Carolina	Delaware

Note: Out-of-level testing policy status current as of September, 2000.

Eight of the 12 states that currently test students out of level have implemented an out-of-level testing program for a number of years. Four other states have more recently decided to test students with disabilities in large-scale assessments at levels other than a student’s assigned grade level. When considering the status of out-of-level testing across states, it is important to note that state-level assessment policies change rapidly. The list of states that test students out of level was current when we began preparing this report. It is possible that some of these states may discontinue their out-of-level testing policy while new states may adopt this approach to testing students with disabilities during or after publication of this report.

Regardless of the status of testing students out of level, it is certain that there is renewed interest in this approach to testing. However, policymakers have no empirical data to guide their decisions about out-of-level testing. There are no research studies that demonstrate the advantages or disadvantages of testing students out of level in large-scale assessments (Minnema, Thurlow, Bielinski, & Scott, 2000). Initial descriptive data are accruing that describe the content of out-of-level testing policies at the state level and the context within which those policies are implemented.

The purpose of this report is to inform policymakers about the status of out-of-level testing nationwide. We do this in two ways. First, we discuss the context of state assessment systems, including the accountability practices in which out-of-level testing is implemented. Second, we describe specific out-of-level testing policies in those states with policies that allow students to be tested out of level in large-scale assessments.

Method

We used two sources of data to collect information for this study. First, we searched the Web sites of each of the 12 targeted states for any available public information about their out-of-level testing policies. For those states that described their out-of-level testing policy thoroughly on their Web sites, we downloaded the information; this often included state-level policy documents. When we were unable to collect adequate information about a state’s out-of-level testing policy on-line, we contacted the state education agency (SEA) for additional policy information. Second, we referred to the analysis of a series of telephone interviews with state assessment directors about out-of-level testing practices to supplement any information missing from the policy reviews (Minnema et al., 2001).

We were able to obtain policy information or documents from all of the 12 states that tested students out of level in large-scale assessments during 2000-2001. Two states were in the process of reviewing their written out-of-level testing policies, so some of the data were not available at the point of data collection for this project. Once these policy data were collected, we reviewed each policy separately to ascertain the specific content of the out-of-level testing policies on a state-by-state basis. Then, to obtain a broader understanding, we considered all of the policies as a composite data set, from which we identified state-specific contextual features of implementing out-of-level testing programs, the current state status of these testing programs, and important content details from out-of-level testing policies. We conclude this report with a discussion of key policy issues that are relevant to testing students with disabilities out of level in statewide assessments.

Overview of Out-of-Level Testing

To understand the status of out-of-level testing policies nationwide, it is helpful to first briefly review background information about the history of use and the definition of out-of-level testing. Out-of-level testing was first introduced to the field of measurement and testing in the 1960s for monitoring student progress and evaluating program effectiveness (Minnema et al., 2000). By the 1970s, test companies had developed norm-referenced tests with normative data that extended above and below specific grade levels. At this time, it became possible to administer a level of a norm-referenced test that was either above or below a student’s assigned grade level.

The actual use of out-of-level testing from the 1970s through the 1980s is unknown. With the introduction of standards-based statewide assessments throughout the 1990s, there has been a renewed interest in testing students out of level. However, the recent increase in testing students out of level has not evolved without conflict. In fact, the adoption or rejection of an out-of-level testing policy has been contested in state legislatures, public town meetings, parent group meetings, state and local school board meetings, state education agencies, and local education agencies.

Those who support testing students out of level contend that out-of-level testing avoids student frustration and emotional trauma, improves the accuracy of measurement because the test items match a student’s level of instruction, and reduces student guessing on test items that are too difficult. Those who oppose out-of-level testing argue that out-of-level tests serve a different purpose from large-scale assessments, the out-of-level test scores are difficult to aggregate for group reporting, and teachers tend to lower their instructional expectations for students who are tested at a lower grade level than their assigned grade level. These arguments about out-of-level testing persist at the local, state, and federal levels. None of the issues that underlie the arguments have been resolved empirically by either past or current research studies.

Until 1999, the literature also had not provided a clear definition of out-of-level testing. In a paper commissioned by the State Collaborative on Assessments and Student Standards (SCASS), a study group defined out-of-level testing as the “administration of a test at a level above or below the level generally recommended for students based on their age-grade level” (Study Group on Alternate Assessment, 1999, p. 20). Minnema et al. (2001) indicated that the term most often used today when testing students below grade level is out-of-level testing. There are a few states that prefer the term “off-level testing” when referring to the testing of students below their assigned grade level.

Out-of-level Testing Context

To best understand out-of-level testing across those states that allow this approach to testing students in large-scale assessment systems, it is helpful to understand the context within which out-of-level testing is implemented. Table 2 presents a description of the statewide tests for those 12 states that currently test students out of level. We present the test name, the grades at which the state test is administered, the core content areas that are tested by the state test, the type of accountability systems in place in those states, and the type of high stakes that are tied to the state tests. The source for these Internet-based data was a survey research project conducted by the Consortium for Policy Research in Education (CPRE, 2000). Throughout 1999 and 2000, CPRE conducted a larger study of standards-based reform that included a survey of state assessment and accountability systems from which we took our out-of-level testing context information.

Table 2. Out-of-Level Testing Context - State Assessments by States

State	State Tests	Grades Tested	Content Areas Tested	Accountability System(s)	High Stakes Effects
Alaska	AL Benchmark Exams; High School Graduation Exam; California Achievement Test	3rd, 6th, 8th; Begin at 10th; 4th, 8th	Reading, writing, math; Test battery	Student accountability with voluntary system accountability	Students must pass graduation exam to receive regular high school diploma
Arizona	AZ Instrument to Measure Standards (AIMS); Stanford 9 Achievement Test	3rd, 4th, 8th, high school; 3rd – 11th	Reading, writing, math; Test battery	Student accountability by 2002	Class of 2005 must pass AIMS for regular high school diploma
California	Standardized Testing & Reporting Program (STAR); High School Exit Exam	2nd – 11th	English, language arts, math; Science, social studies (9th – 11th only)	Student accountability; System accountability	Class of 2004 must pass graduation for regular high school diploma; Identify and assist low performing schools
Connecticut	CT Mastery Test (CTM); CT Academic Performance Test (CAPT)	4th, 6th, 8th; 10th	Reading, writing, math; Math, language arts, writing, science	Student accountability; System accountability	Students receive Certificates of Mastery for each CAPT content area passed; Low performing schools identified with plan to assist in development
Delaware	Delaware Student Testing Program (DSTP); Stanford Achievement Test	3rd, 5th, 8th, 10th	Reading, writing, math (3rd, 5th, 8th, 10th); Science, social studies (6th, 8th, 11th)	Student accountability; System accountability	Students must pass DSTP for promotion and graduation; Rewards and sanctions within district accreditation system
Iowa	(Not mandated tests) Iowa Tests of Basic Skills (ITBS); Iowa Tests of Educational Development (ITED)	3rd – 8th; 9th – 12th	Test battery	System accountability (At district level by establishing local annual improvement goals)	None
Louisiana	LEAP 21 for the 21st Century Test; Iowa Test of Basic Skills (ITBS); Graduation Exit Exam	4th, 8th, 10th, 11th; 3rd, 5th, 6th, 7th, 9th; 10th, 11th	English, Math, Science, Social Studies; Test battery	Student accountability; System accountability	Student must pass LEAP for 4th & 8th grade promotion; 10 year growth goals with recognition or sanctions
North Dakota	Standards-based Standardized Tests; Comprehensive Tests of Basic Skills (CTBS/5) TerraNova; Tests of Cognitive Skills (2nd Ed.)	4th, 6th, 8th, 10th	Reading, writing, speaking, listening, math; Test battery	School accountability	None
South Carolina	Palmetto Achievement Challenge Tests (PACT); Comprehensive Tests of Basic Skills (CTBS/5) TerraNova	1st – 8th (1st, 2nd grade optional); 3rd, 6th, 9th	English language arts, math, science, social studies; Test battery	System accountability	Schools that meet student achievement benchmarks receive incentive awards and employee bonuses. Schools that do not receive assistance or employees lose jobs.
Utah	Core Curriculum Assessment Program; Direct Writing Assessment; Stanford Achievement Test (9th Ed.); Utah Basic Skills Competency Test	1st – 12th; 6th, 9th; 5th, 8th, 11th; 10th	Reading, math, science; Test battery; Reading, writing, & math	System accountability (At district level by submitting accreditation report to Northwest Association of Schools & Colleges); Student accountability (by 2005)	None; Must pass for basic high school diploma
Vermont	VT Developmental Reading; English/Language Arts/Math Assessment; Science Assessment	2nd; 4th, 8th, 10th; 6th, 11th	Reading, language arts, math, science	Student accountability; System accountability	Must meet standards-based graduation requirements for high school diploma; Very low performing schools receive technical assistance
West Virginia	Stanford Achievement Test (9th Ed.)	1st – 11th	Test battery	Student accountability; System accountability	Need 70th %ile for college warranty, 50th %ile for workforce warranty; SEA controls districts with low performing schools

Generally speaking, there is wide variability across states in the context in which students are tested out of level. No two states use the same set of statewide assessment instruments, test at the same grade levels, or use high stakes effects in the same ways, and accountability systems also differ from state to state. There is one point of agreement about the content areas that are tested. While no two states test exactly the same subject areas, all states do test academic abilities that are related to basic reading, math, and writing.

State Status of Out-of-Level Testing

Many of the states that test students out of level have done so for a number of years. Table 3 displays the number of years that states have tested students out of level. In fact, Iowa has a history of testing students out of level that spans at least the last three decades. Two states (Connecticut and North Dakota) have implemented an out-of-level testing policy for 10 years. Four states have several years of out-of-level testing experience: two years in Vermont, three years in Arizona and California, and four years in West Virginia. One of these states, Arizona, varies in its years of experience according to the type of state test administered; it has tested out of level with a norm-referenced test for three years, but only one year with a criterion-referenced test. In four other states (Alaska, Louisiana, South Carolina, and Delaware), the decision to test out of level is a recent one, with Delaware the most recent state to develop and implement an out-of-level testing policy. We were not able to ascertain the specific number of years that Utah has tested students out of level due to changes in personnel; however, we did learn that out-of-level testing has been allowed in Utah for many years.

Table 3. Time Testing Out of Level by State

State	Number of Years Testing Out of Level
Alaska	1 year
Arizona	1 year (CRT); 3 years (NRT)
California	3 years
Connecticut	10 years
Delaware	Less than 1 year
Iowa	More than 10 years
Louisiana	1 year
North Dakota	10 years
South Carolina	1 year
Utah	Many years
Vermont	2 years
West Virginia	4 years

Of those states that most recently began testing students out of level in large-scale assessments, there was a consistent pattern as to what drove the decision to allow out-of-level testing. In all four states (Alaska, Delaware, Louisiana, and South Carolina), special interest groups organized to oppose the participation of subgroups of students in state tests. These special interest groups included students, parents, and teachers who advocated for either students with disabilities or English language learners, claiming that the state tests were unfairly difficult for these students. The SEA in each state decided to allow the administration of lower levels of state tests in response to this external pressure.

In deciding to test out of level, three of these states sought a broad base of support to develop an out-of-level testing policy. For instance, Louisiana convened a special task force of state board members, local educators, parents, and state department personnel to negotiate a fair policy that included all students in state assessments. South Carolina and Delaware also sought a wide range of support from various stakeholders by convening either state-level committee meetings or public hearings. Alaska is an exception: the SEA made an internal decision to test students from an English language immersion program out of level. Fourth grade students in that program are allowed to take the third grade standards-based exam, at a time when English language learners are more proficient in English.

Out-of-level Testing Policy Content

The intent of out-of-level testing policy is to provide policy language that guides the practice of out-of-level testing at the local district and school level. Overall, there is wide variability in the policy content among those states that test students out of level. Some states’ policies specify detailed procedures for testing students out of level while other states’ policies contain language that is more general. To more clearly describe the variability across policies, we organized policy content into four categories: policy features, instrument characteristics, implementation practices, and test score use. Each category is displayed in a table format that provides detailed policy information across states. Policy content is then discussed through thematic generalizations gleaned from reviewing each table.

State-Level Policy Features

Table 4 contains state-specific policy content according to features of the written out-of-level testing policies. Four generalizations emerged from the review of this policy information.

Table 4. Out-of-Level Testing Policies by States - State-Level Policy Features

State	Written Policy Format	OOLT Classification	Selection Criteria	Students Tested
Alaska	“Participation Guidelines for Alaska Students in State Assessments”	Modification	Must be ELL in specific language immersion or bilingual program	Only 4th grade in Language Immersion Program
Arizona	“Special Education Guidelines”	Modification	Yes	Students with IEPs or 504 Plans
California	“Questions and Answers on the STAR Augmentation”	Non-standard accommodation	Not specified in policy	Students with IEPs
Connecticut	“Assessment Guidelines (7th Ed.)”	Alternate Assessment Option #1	Yes	Students with IEPs
Delaware	“DSTP Guidelines”	Accommodation	Yes	Students with IEPs
Iowa	Policy in review	Alternate Assessment	Not specified in policy	Students with IEPs
Louisiana	“Questions & Answers About Out-of-level Testing”	Modification in test administration (not officially stated in policy)	Yes	Students with IEPs
North Dakota	“Test Coordinator’s Manual”	Accommodation	Not specified in policy	Students with IEPs
South Carolina	“Guidelines for Testing Students with Documented Disabilities”	Modification	Yes	Students with IEPs
Utah	“Guidelines for Participation of Students with Special Needs in the Utah Performance Assessment System for Students”	Modification	Yes	Students with IEPs
Vermont	“Participation Guidelines for Students with Special Assessment Needs”	Adapted Assessment	Yes	Any student. Typically students with IEPs, 504 Plans, or LEP
West Virginia	“Special Education & 504 Questions & Answers”	Modification	Yes	Students with IEPs or 504 Plans

State level policies on out-of-level testing are in a variety of written formats. Most of the states that allow out-of-level testing have created written policies to guide practice at the local school level. Two states (Louisiana and Vermont) have developed written policies that more thoroughly explain their particular states’ expectations for implementing an out-of-level testing program. Louisiana specifies an out-of-level testing policy in a Questions and Answers document focused specifically on testing students out of level. Vermont has extensive procedures detailed about testing students out of level within a set of guidelines for testing students with special assessment needs. Seven states (Arizona, California, Connecticut, Delaware, South Carolina, Utah, and West Virginia) provide policy language that refers to out-of-level testing within policy documents that describe statewide testing systems overall. These policy formats are either in Question and Answer format or testing guidelines format. For example, Delaware has a special section in its testing guidelines document that describes its procedures for testing students out of level.

Only one state (North Dakota) does not have a separate document that explains an out-of-level testing policy. Teachers in this state are encouraged to implement state tests at lower grade levels by using the test coordinator’s manual prepared by the test company that developed the North Dakota statewide assessment. Another state (Alaska) does not use the term out-of-level testing in its assessment guidelines; rather, out-of-level testing is referred to as delaying the administration of a specific grade level exam by “administering a 3rd grade test to a 6th grade student” (Alaska Department of Education & Early Development, 2000). At the time of data collection for this project, the remaining state (Iowa) was in the process of revising its state policies on out-of-level testing.

States that allow out-of-level testing do not treat out-of-level testing similarly in their written policies. There are four labels used for out-of-level testing across states: modification, accommodation, nonstandard, and alternate assessment. Six states (Alaska, Arizona, Louisiana, South Carolina, Utah, and West Virginia) label out-of-level testing as a test modification while two states (Delaware and North Dakota) label out-of-level testing as a test accommodation. Even though some states may use the same label, it is unlikely that the labels mean the same thing in the different states (Thurlow & Wiener, 2000). California refers to out-of-level testing as a nonstandard accommodation. Two states (Connecticut and Iowa) treat out-of-level testing as an alternate assessment.

Most states provide criteria in their assessment policies for selecting students for out-of-level testing. As Table 4 shows, nine of the states that allow out-of-level testing (Alaska, Arizona, Connecticut, Delaware, Louisiana, South Carolina, Utah, Vermont, and West Virginia) give criteria to identify students eligible for out-of-level testing. Specificity of the criteria varies considerably across states. For instance, some states provide general expectations, such as the requirement that performance data be available to support the decision to test out of level, or the mandate that the level of an out-of-level test must be consistent with the student’s instructional level. Another state provides more explicit criteria by stating that a student with disabilities must spend at least 50% of his or her instructional time at a lower grade level in a content area to qualify for an out-of-level test in that particular content area. Another state provides a documentation of eligibility form that records a student’s eligibility for an alternate (out-of-level) assessment. The form contains a series of questions for teachers and parents to answer, thus guiding the decision to administer a modified, an adapted (out-of-level), or a lifeskills assessment. (See Appendix A for a copy of the documentation form. For an on-line version use http://www.state.vt.us/educ/cses/alt/eligdoc.htm.) We were unable to obtain policy language that contained out-of-level testing criteria from either California or Iowa.

Students with disabilities who have an Individualized Education Program (IEP) are typically the only students who may be tested out of level. Generally speaking, each state’s out-of-level testing policy specifies which students can be selected for an out-of-level test. All but one state (Alaska) administer out-of-level tests to students with disabilities who have IEPs. In fact, 8 of the 12 states that test out of level do so only for students with IEPs. Of the remaining four states, Alaska tests out of level only fourth grade students enrolled in a language immersion program, who take the third grade benchmark exam. Arizona and West Virginia allow students with IEPs and 504 Plans to be tested out of level. Vermont allows any student to be tested out of level, but the practice is qualified by the indication that students with IEPs, 504 Plans, or limited English proficiency typically are tested out of level.

Instrument Characteristics

The characteristics of the instruments that states use to test students out of level are presented in Table 5. We drew four generalizations from reviewing this information.

Table 5. Out-of-Level Testing Policies - Instrument Characteristics by States

State	Type of Test	Grade Levels Tested Out	Levels Tested Out	Content Areas Tested
Alaska	CRT	Only 4^th grade take 3^rd grade benchmark exam	1 level	Reading, writing, & math
Arizona	CRT NRT	Any available level of CRT or NRT needed	Number of levels to match test level to instructional level	Reading, writing, & math (May test one area)
California	NRT/CRT	Any available level of CRT/NRT needed	1 level (standard presentation) 2 or more levels (nonstandard presentation)	Reading, language arts, math, science, & social studies
Connecticut	CRT	Grades 2, 4, 6, & 8	Number of levels necessary to match test level & instructional level	Reading, writing, & math
Delaware	CRT/NRT	Grades 3, 5, & 8 (All 3rd grade students take 3rd grade level)	Only levels tested by statewide assessment	Reading, writing, math, science, & social science (May test one area)
Iowa	NRT	Grades K - 12	3 - 4 levels below (or above)	Vocabulary, reading, math, writing, science, & social studies
Louisiana	NRT (In lieu of CRT)	Grades 3 - 9	3 levels or more in at least one subject	Reading, writing, & math (May test one area)
North Dakota	NRT	4, 6, 8, & 10	No more than 2 levels recommended	Test battery
South Carolina	CRT	Grades 1 - 8	Test level at 50% of instruction time	Language Arts, math, science, & social studies
Utah	CRT	Grades 1 - 11	Number of levels necessary to match test level & instructional level	Reading/language arts, math, & science
Vermont	CRT	Grades 4, 6, 8, 10, & 11	Test at state test levels only	Language arts, math, & science
West Virginia	NRT	Grades 1 - 11	1 level	Test battery

Both criterion-referenced and norm-referenced tests are used to test students out of level at a variety of grade levels. In terms of the type of test administered out of level, five states (Alaska, Connecticut, South Carolina, Utah, and Vermont) use criterion-referenced tests, three states (Iowa, North Dakota, and West Virginia) use norm-referenced tests, two states (Arizona and Louisiana) use both criterion-referenced and norm-referenced tests, and two states (California and Delaware) use a test that combines a criterion-referenced with a norm-referenced test.

Louisiana is unique in its approach to testing students with disabilities out of level. An out-of-level norm-referenced test is substituted for the criterion-referenced state test if a student cannot be appropriately tested by an on-grade-level version of the state test.

Of the 12 states that allow out-of-level testing, Iowa differs in that there is no mandated statewide assessment although the SEA recommends a norm-referenced test for testing groups of students. All of the other 11 states administer out-of-level tests in large-scale statewide assessment programs.

Every grade level is tested out of level across assessment programs, although not all states test every grade out of level. Of those states that have mandated large-scale assessments, only two states (Utah and West Virginia) test the same grade levels. Grades tested out of level in state tests range from 1st grade through 12th grade with only two states (Utah and West Virginia) testing all elementary and secondary grade levels through 11th grade. There also appears to be no consistent pattern in the grades tested by state tests. Only Utah and West Virginia test 1st grade students in statewide assessments. Most other states test approximately four nonconsecutive grade levels during the elementary and early secondary grades.

Because of the variety of grade levels tested by state tests, it is difficult to determine either an age or grade level at which students are more likely to be tested out of level. Generally speaking, both elementary and secondary students are tested out of level across states that allow out-of-level testing, although the grade level tested depends on the state in which a student is tested.

States differ in the number of levels they allow for an out-of-level test. All state assessment policies specify the number of grade levels that can be tested out of level. However, the number of levels below grade level allowable for out-of-level tests differs across the 12 states. For instance, two states allow testing only one level below a student’s grade level (Alaska and West Virginia), while one state recommends no more than two levels below grade level (North Dakota). In contrast, another state tests three or more levels below grade level (Louisiana). Two other states offer out-of-level tests at only those grade levels tested within their large-scale assessment program (Delaware and Vermont). Of the remaining six states with out-of-level testing, three test as many levels below as are necessary to match a student’s instructional level (Arizona, Connecticut, and Utah).

The three remaining states differ in that their policies qualify the allowable levels to be tested below grade level (California, Iowa, and South Carolina). For instance, California identifies an out-of-level test one level below grade level as a “standard presentation”; in contrast, an out-of-level test two or more levels below grade level is a “nonstandard presentation.” Iowa, on the other hand, equates those out-of-level test scores that are three or four levels below grade level back to in-level scores. However, beyond four levels below, the out-of-level test scores are not equated in Iowa. South Carolina allows any level below grade level to be tested out of level, but only if 50% of the student’s instructional time is spent at the grade level tested out of level.

All states test core content areas out of level. Every state that tests students out of level does so in the core content areas of reading, writing, and math. Six states (California, Delaware, Iowa, South Carolina, Utah, and Vermont) also test out of level in the content areas of either science or social studies. For those states that use a norm-referenced test for a statewide assessment, the full test battery is administered, which includes an assessment of core subject areas. There is one distinction across states regarding the instructional areas tested. Three states allow out-of-level testing for one or more content areas. The remaining states require that the entire state test be administered out of level, which means that all content areas are tested out of level in one test presentation. Thus, in nine states, if an IEP indicates that a student needs to be tested below grade level for math, that student is automatically tested below grade level in reading, writing, and any other content areas the state assesses.

Implementation Practices

Table 6 displays the tenets of implementing out-of-level testing programs across states that allow out-of-level testing. The following three generalizations were gleaned from our review of out-of-level testing policies.

Table 6. Out-of-Level Testing Policies - Implementation Practices by States

State	OOLT Decision Process	Parent Involvement	Required Documentation	Monitoring Procedures
Alaska	Not stated in policy	Not stated in policy	Not stated in policy	Currently none
Arizona	IEP process	IEP process	In IEP or 504 Plan	IEP documentation monitored
California	IEP process	IEP process	In IEP	Currently none
Connecticut	IEP process	IEP process. Parents & students informed of OOLT high stakes effects	In IEP	Currently none
Delaware	IEP process	IEP process	In IEP	State reviews number of accommodated tests
Iowa	IEP process	IEP process	In IEP	Currently none
Louisiana	IEP process	IEP process	In IEP with parent signature on “Out-of-level Testing Criteria” form	In development
North Dakota	IEP process	IEP process	Required in IEP & student test booklet	Currently none
South Carolina	IEP process	IEP process	In IEP with parent signature on “Parent/Guardian Acknowledgement of Off-Level Testing” form	Monitors percentage of students tested
Utah	IEP process	IEP process	In IEP	Relies on collective decision within IEP team
Vermont	IEP process	IEP process	In IEP using Allowable Accommodations Grid & Modified Assessment planning worksheet	State reviews documentation forms for approval prior to testing
West Virginia	IEP process	IEP process	In IEP or 504 Plans	Currently under review

The decision to test students out of level is made by the IEP team in most states. Currently, the IEP team selects those students who are best assessed out of level in 11 of the 12 states that allow out-of-level testing. Alaska is the only state that does not require this decision-making process to identify students for out-of-level testing.

States vary according to the amount of guidance that SEAs provide to practitioners in making testing decisions (see discussion above about out-of-level testing criteria). All 12 states report that IEP team members receive training through a variety of formats that supports appropriate decision making about out-of-level testing (Minnema et al., 2000). These training formats include SEA and district organized assessment training workshops, mass mailings statewide that describe changes in assessment policies, and postings on the Internet that are updated periodically. We do not know at this time whether states provide any out-of-level testing information for parents of students with disabilities. Some states assume that information is shared within the IEP team process to foster informed decision making by all team members (Minnema et al., 2000).

Most states assume that parents of students with disabilities are involved in the decisions to administer out-of-level tests. Of the 12 states that currently allow out-of-level testing, 10 have policies indicating that parents are to be involved in the IEP decision-making process about testing students out of level. One state (Connecticut) has policy language that specifies the expectation that both parents and students who select the option of out-of-level testing must be informed about the future high stakes consequences of taking an out-of-level test. Another state (Alaska) does not indicate that parents of students to be tested out of level are to be included in the decision to do so.

To better understand the level of parent involvement in the decision to test a student out of level, it is helpful to consider how this decision is documented in student records. All but one state (Alaska) requires documentation in a student’s IEP that a state assessment is to be administered out of level. Eight of these states’ policies (Arizona, California, Connecticut, Delaware, Iowa, North Dakota, Utah, and West Virginia) require documentation, but do not provide specific procedures for doing so. However, three states (Louisiana, South Carolina, and Vermont) have developed procedures that both document the administration of an out-of-level test and parent involvement in the IEP team decision. These states provide specific forms that require parent signature or planning worksheets that both guide the decision to test out of level as well as confirm parent awareness of this decision. See Appendix B for a copy of a parent signature form. (For an on-line version, see http://www.doe.state.la.us/DOE/Assessment/OOLform.pdf.) Two states (Louisiana and South Carolina) require parents to sign forms indicating agreement with the decision to administer an out-of-level test. In addition, these forms state the specific high stakes consequences for students who do not pass the regular statewide assessment.

Most states do not specifically monitor out-of-level testing at the local level. Only one state (Vermont) has specified monitoring procedures whereby the SEA approves the decision to test out of level prior to the day of testing on a student-by-student basis. In this way, both the number of out-of-level tests and the students who are tested are monitored. As a way to ensure that an unusually large number of students is not tested out of level, South Carolina reviews the percentage of students tested both on and below grade level. Another state (Arizona) monitors the IEP documentation of out-of-level testing within the state’s regular monitoring of IEP content. The remaining states do not have monitoring procedures in place that ensure appropriate policy implementation of out-of-level testing. Two states (Louisiana and West Virginia) are either developing or reviewing procedures to monitor out-of-level testing at the local level.

Test Score Use

In Table 7, we present information that is related to how states use out-of-level test scores. Three generalizations are important to understanding this information.

Table 7. Out-of-Level Testing Policies – Test Score Use by States

State	Equated to In-Level Scores	Reporting Methods	High Stakes Effects
Alaska	No	Disaggregated	Grade Promotion – None Graduation – None
Arizona	Not determined	Reporting procedures in process	Grade Promotion – None Graduation – Yes with IEP specifying passing level
California	No	Nonstandard scores not reported at state level	Grade Promotion – None Graduation – None
Connecticut	No	Only participation reported	Grade Promotion – None Graduation – Do not take CAPT
Delaware	No	Disaggregated without reporting NRT scores	Grade Promotion – None Graduation – Yes as do not receive regular high school diploma
Iowa	Yes (Could equate)	Not disaggregated	Grade Promotion – None Graduation – None
Louisiana	No	Disaggregated	Grade Promotion – None Graduation – Yes as do not receive regular high school diploma
North Dakota	Yes (Could equate)	Aggregated	Grade Promotion – None Graduation – None
South Carolina	No	Disaggregated	Grade Promotion – Not yet Graduation – Not yet
Utah	No	Disaggregated	Grade Promotion – None Graduation – None
Vermont	Yes	Equated scores entered in accountability index	Grade Promotion – None Graduation – Not yet
West Virginia	Yes (Could equate)	Aggregated with all non-standard SAT-9 scores	Grade Promotion – None Graduation – None

For most states, it is difficult to acquire information about how states manage out-of-level test scores. Only one state (Vermont) describes on its Web site the procedures that are used to manage out-of-level test scores once students have been assessed. For those states that use norm-referenced instruments for testing students out of level (Iowa, Louisiana, North Dakota, and West Virginia), we assumed that out-of-level test scores could be equated to in-level test scores by using test companies’ conversion procedures. In some cases, the test protocols are returned to the test companies, which in turn report the raw test data to the SEAs.

With the exception of two states (Arizona and Vermont), states that use a criterion-referenced instrument for out-of-level testing do not have readily available information on how data managers prepare these test scores for reporting purposes. Arizona is currently reviewing this process. Vermont has developed a set of transformation rules whereby various proficiency levels of adapted assessment scores earn a specific point total so that these “transformed” scores can be entered into an accountability index. For instance, students who take an adapted (out-of-level) alternate assessment and receive a score of “achieved the standard with honors” receive 300 points as an accountability index score. Students who receive an adapted assessment score of “nearly achieved the standard” receive 0 points. On the other hand, students who take an on-grade level alternate assessment and obtain a score of “achieved the standard with honors” receive 600 points for entry into an accountability index. No points are awarded for an on-grade level alternate assessment score of “little or no evidence of achievement.” In other words, students who take an out-of-level test in Vermont’s statewide assessment program receive a lower point total for accountability purposes than students who take the assessment on grade level, even though the student tested out of level earned the same proficiency level on a test designed for a lower grade. (See Appendix C for a picture of the transformation rules.)
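The point values quoted above can be collected into a small lookup. The following Python sketch is illustrative only: it contains just the four values stated in this report, the names KNOWN_INDEX_POINTS and index_points are hypothetical, and the sketch does not reproduce Vermont's full transformation rules (see Appendix C for those).

```python
# Illustrative sketch only. The four entries below are the point values quoted
# in this report; point values for other proficiency levels are not given in
# the text and are deliberately left out rather than guessed.
KNOWN_INDEX_POINTS = {
    ("adapted", "achieved the standard with honors"): 300,
    ("adapted", "nearly achieved the standard"): 0,
    ("on-grade", "achieved the standard with honors"): 600,
    ("on-grade", "little or no evidence of achievement"): 0,
}


def index_points(assessment_type, proficiency_level):
    """Return the accountability index points for a known combination.

    Raises KeyError for any combination this report does not describe.
    """
    key = (assessment_type, proficiency_level)
    if key not in KNOWN_INDEX_POINTS:
        raise KeyError(f"No point value given in the report for {key!r}")
    return KNOWN_INDEX_POINTS[key]


# The same proficiency level earns fewer points on the adapted assessment.
honors = "achieved the standard with honors"
difference = index_points("on-grade", honors) - index_points("adapted", honors)
```

In this sketch, difference is the gap in index points between an on-grade level and an adapted (out-of-level) assessment for the same proficiency level, which is the comparison the paragraph above draws.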

Reporting practices for out-of-level testing scores vary widely across states that test students out of level. Generally speaking, out-of-level test scores are either disaggregated or aggregated for public reporting; however, discerning a pattern within these reporting practices is difficult since few states have adopted similar reporting practices. One state (Vermont) enters weighted out-of-level test scores in an accountability index (described above). Another state (Connecticut) reports the participation numbers for those students tested out of level, but does not report the performance of students on out-of-level testing. Four states (Alaska, Delaware, Louisiana, and South Carolina) disaggregate out-of-level test scores for public reporting while two states (North Dakota and West Virginia) aggregate out-of-level test scores. North Dakota aggregates out-of-level test scores with in-level test scores for reporting purposes while West Virginia aggregates out-of-level test scores with other non-standard test scores. None of these six states use the same approach for reporting these scores at the state level. For instance, one SEA (North Dakota) converts out-of-level test scores to in-level test scores for reporting on-grade level aggregated performance. Another SEA (West Virginia) does not disaggregate out-of-level test scores separately; instead it reports aggregated non-standard test scores that include out-of-level test scores. Yet another SEA disaggregates out-of-level test scores without reporting the norm-referenced out-of-level test scores at the state level. One state (California) indicated that nonstandard out-of-level test scores are not reported at the state level. The remaining SEA (Arizona) reported that it is in the process of developing state-level reporting procedures for out-of-level testing.

Out-of-level testing in most of these states does not have high stakes effects on either grade promotion or high school graduation. One state (Arizona) indicated that out-of-level testing impacts a student’s graduation. Materials on the Arizona Web site indicate that when students complete “the testing requirement appropriate to the student (as defined in the student’s IEP or accommodation plan), a student will be eligible for a diploma” (see http://www.ade.state.az.us/standards/specialed.htm, page 7). Two states’ policies (Delaware and Louisiana) do not allow a student tested out of level on a statewide assessment to receive a regular high school diploma. Two other states (South Carolina and Vermont) indicated that there were no high stakes effects yet for students who are tested out of level, but the implications of out-of-level testing may change in the future. The remaining seven states have no high stakes effects on graduation for out-of-level testing and no plans to change their policies in the near future. None of these states has a policy that impacts grade promotion for students tested out of level in large-scale assessments.

Discussion

Out-of-level testing policy and practice are rapidly changing. As we were gathering data, one state (Utah) was in the process of developing new guidelines for its out-of-level testing program. The state assessment director requested that we not report on “old” out-of-level testing policy. It was only because Utah’s new guidelines were posted on the state Web site that we were able to include the state in our analysis. Because of the volatile nature of out-of-level testing policy and practice, any report is likely to be outdated quickly if only the “facts” of who is doing what and how are reviewed. The underlying trends and issues are likely to be more enduring.

It is also important to note that our sources for data may be considered limited in that we relied primarily on Web sites. While Internet-based information is rapidly becoming an important venue for acquiring information, not all Web sites are easy to navigate or consistently updated with current information. It is possible to miss newly posted information because of the timing of data collection. We were careful in checking the accuracy of our information and in accessing any missing information. Still, it is possible that the policy information gleaned from these 12 state Web sites may be somewhat incomplete or inaccurate.

With these limitations in mind, our analyses of out-of-level testing policies that guide the implementation of out-of-level testing in large-scale assessments raise four key discussion points. While these points are important to consider, it should also be noted that there are no definitive research results that guide this discussion. Rather, we raise additional questions about testing students with disabilities out of level. In that sense, we view these discussion points as the next steps in researching this approach to testing.

First, generally speaking, few out-of-level testing policies contain the level of specificity needed to guide the testing of students out of level in a suitable manner. Assessment policies that do not address all relevant aspects of testing students with disabilities out of level create testing programs that may be open to misuse. For instance, few states provide selection criteria that are written in a concrete manner. Without concrete criteria, practitioners who are charged with deciding which students to test out of level must rely on a subjective decision-making process. When choices are not objective about whose assessment needs are best met by an out-of-level test, the decision to test below grade level can be grounded in faulty assumptions about the student’s academic functioning or test taking skills. The resulting test score, while thought to be a more valid measure of what a student knows, may not be more valid simply because of inappropriate decisions about testing level. In order to measure students’ academic abilities with high levels of precision and accuracy, out-of-level policies need to provide enough direction for practitioners across a state to implement consistent testing practices.

A second discussion point pertains to how states treat out-of-level testing within their large-scale assessment programs. Some states call an out-of-level test an accommodated test while others call an out-of-level test a modified test. Two states define out-of-level testing as one option in their alternate assessment program. Beyond this variation in how policy defines an out-of-level test, there is also variation within each classification. Among those states that refer to out-of-level testing as an accommodation, some consider the test administration to be a standard administration while others do not. States also vary in how “accommodation” and “modification” are defined.

Third, it is important to note that what states present in assessment policies does not necessarily represent how those policies are implemented in practice. Many questions remain about testing students out of level in large-scale assessment programs. For instance, many states rely on the collective judgment of an IEP team to make the decision to test out of level. However, no research to date has described how this decision making plays out in practice. States assume that parents of students with disabilities help make informed decisions about testing their child out of level, but researchers have not yet examined parents’ perceptions of out-of-level testing or their participation in the decision to test out of level. In fact, the literature has not adequately addressed whether those students for whom out-of-level testing is intended are actually the students tested below grade level.

As a final discussion point, it is important not to overlook the fact that the long term effects on the educational experiences of students with disabilities who are tested out of level are unknown. To date, the literature has not adequately explained the interplay between the decision to test a student below grade level and instructional decisions. Does testing a student out of level affect teachers’ learning expectations so that students do not receive challenging curriculum that supports striving to meet grade level standards? Is the decision to test out of level based solely on a student’s current classroom performance? If so, how does out-of-level testing affect a student’s learning over time? The literature to date has not described whether the long-term effects of out-of-level testing (e.g., high school graduation with a regular diploma, benefiting from school improvement planning) are considered when the decision is made to test a student out of level.

Future Research

It is critical that research better meet policymakers’ information needs. Well-designed research studies that address the important questions about testing students out of level are needed for policymakers to make informed decisions about the content of educational policies whose implementation will have long term effects on the school results of students with disabilities. Too often, state legislatures have mandated large, sweeping educational reforms that rely on assessment programs as an easy, perhaps logical, solution to complex issues. Assessment programs that test students with disabilities out of level often seem to be grounded in this type of logic. What seems logical (e.g., testing a student at his or her instructional level) must be considered at a deeper level. This has not been done so far. We hope that this policy and contextual analysis is a step in this direction.

The research that is needed cannot be accomplished without the support of the policy and assessment communities. There is a critical need for all out-of-level testing scores to be reported publicly at both the district and state levels. Reporting must be based on procedures that ensure accuracy and fairness. It is essential that states articulate and implement strict monitoring procedures that guard against misuse of out-of-level testing. IEP team decision making needs to be opened to research like that conducted by Shriner and DeStefano (2001), so that we can determine how decisions are made to test students with disabilities out of level. Finally, determining how out-of-level test scores can be used for both student and system accountability is a topic that needs the joint attention of researchers and policymakers.

References

Alaska Department of Education & Early Development. (December, 2000). Participation guidelines: For Alaska students in state assessments. [Brochure]. Juneau, AK: Author.

Almond, P., Quenemoen, R., Olsen, K., & Thurlow, M. (2001). Gray areas of assessment systems (Synthesis Report 32). Minneapolis, MN: University of Minnesota, National Center on Educational Outcomes.

CPRE. Assessment and accountability in the fifty states: 1999-2000. (2000). Philadelphia, PA: Consortium for Policy Research in Education (CPRE). Retrieved January, 2001, from the World Wide Web: http://www.gse.upenn.edu/cpre/frames/pubs.html

Arizona. Special education guidelines. (n.d.). Phoenix, AZ: Arizona Department of Education. Retrieved December 12, 2000, from the World Wide Web: http://www.ade.state.az.us/standards/specialed.htm

Louisiana. Out-of-level testing criteria for LEAP. (n.d.). Baton Rouge, LA: Louisiana Department of Education. Received September 26, 2000. Available on the World Wide Web: http://www.doe.state.la.us/DOE/Assessment/OOLform.pdf

Minnema, J., Thurlow, M., Bielinski, J., & Scott, J. (2000). Past and present understandings of out-of-level testing: A research synthesis (Out-of-Level Report 1). Minneapolis, MN: University of Minnesota, National Center on Educational Outcomes.

Minnema, J., Thurlow, M., & Scott, J. (2001). Testing students out of level in large-scale assessments: What states perceive and believe (Out-of-Level Testing Report 5). Minneapolis, MN: University of Minnesota, National Center on Educational Outcomes.

Olson, J. F., Bond, L., & Andrews, C. (Fall, 1999). Annual survey: State student assessment programs (Summary Report). Washington, D.C.: Council of Chief State School Officers.

Shriner, J. G., & DeStefano, L. (2001, April). Curriculum access and state assessment for students with disabilities: A research update. Paper presented at the annual conference of the Council for Exceptional Children, Kansas City, MO.

Study Group on Alternate Assessment (1999). Alternate assessment resource matrix: Considerations, options, and implications (ASES SCASS Report). Washington, DC: Council of Chief State School Officers.

Thurlow, M., & Wiener, D. (2000). Non-approved accommodations: Recommendations for use and reporting (Policy Directions No. 11). Minneapolis, MN: University of Minnesota, National Center on Educational Outcomes.

Vermont. Documentation of eligibility for alternate assessment (n.d.). Burlington, VT: University of Vermont. Retrieved December 20, 2000, from the World Wide Web: http://www.state.vt.us/educ/cses/alt/eligdoc.htm

Appendix A

Documentation of Eligibility Form

See http://www.state.vt.us/educ/cses/alt/Eligdocumentation.doc

(An online version can be found at http://www.state.vt.us/educ/cses/alt/eligdoc.htm)

Appendix B

Parent Signature Form

OUT-OF-LEVEL TESTING CRITERIA FOR LEAP

This form must be completed to determine whether the student with a disability is eligible for out-of-level testing.

Student DOB School/District Grade Enrolled

Definition: Out-of-level testing in the Louisiana Educational Assessment Program (LEAP) means the student

· will be assessed at his/her functioning grade level(s) in language/reading and/or mathematics, not the actual grade level in which he or she is enrolled;

· will be assessed with the Iowa Tests of Basic Skills (ITBS); and

· will not be assessed with the LEAP for the 21st Century (LEAP 21) at grades 4 and 8.

The decision to test a student out-of-level cannot be

· based on a disability category,

· based on placement setting, or

· determined administratively.

The LEA is required to provide the student with

· LEAP remediation, and

· accommodations and modifications to ensure the student progresses towards meeting his or her IEP goals and objectives related to the general education curriculum.

Circle “Agree” or “Disagree” for each item below.

Agree Disagree The student does not meet the LEAP Alternate Assessment Participation Criteria.

Agree Disagree The student scored at the Unsatisfactory level on the previous year’s LEAP 21 in English language arts and/or mathematics.

The student’s previous year’s Total(s) on the ITBS in language, reading, and/or mathematics was at or below the fifth percentile.

Agree Disagree The student’s IEP reflects a functioning grade level in English language arts (including reading) and/or mathematics at least three (3) grade levels below the actual grade level in which he or she is enrolled.

Agree Disagree The parent agrees his or her child should participate in out-of-level testing.

Note: For the student with a disability to be eligible for out-of-level testing, the response to each statement above must be “Agree.”

Parental Understanding: If my child is eligible for and participates in out-of-level testing, my initials indicate I understand the statements below.

_____ CAUTION: Out-of-level testing means my child is performing below grade level. If my child continues to be tested below grade level, it is highly unlikely that he or she will earn a regular high school diploma. I am aware that my child must pass all components of the Graduation Exit Examination (GEE) and earn the necessary 23 Carnegie Units in order to receive a regular high school diploma.

_____ If my child is enrolled in either grade 4 or 8, he or she will not be assessed with the LEAP 21. He or she will be assessed with the Iowa Tests of Basic Skills at his or her functioning level(s).

_____ If my child is enrolled in either grade 4 or 8 and is assessed below grade level, then my child is not entitled to the LEAP 21 summer school remediation and the decision to promote or retain my child will be made by the School Building Level Committee (SBLC).

_____ If my child is enrolled in grade 3, 5, 6, 7, or 9, the decision to promote or retain my child will be based on the local school district’s Pupil Progression Plan.

Decision-Making: This decision must also be documented on the student’s IEP.

_________________________will be assessed in all content areas at the actual grade level in which he or she is enrolled.

If the decision is to test out-of-level, document this decision on the student’s IEP along with the grade level(s) in which the student will be assessed in language/reading and mathematics. Out-of-level testing is allowed only if the parent agrees. If the parent disagrees with having his or her child test out-of-level, the child must be tested on grade level.

Approved BESE 9/99 Copies must be provided to teacher(s), parent, and central office.

Appendix C

Transformation Rules

http://www.state.vt.us/educ/cses/alt/Assessment_Participation_Presentation/sld027.htm
