Testing Students with Disabilities Out of Level: State Prevalence and Performance Results


Out-of-Level Testing Project Report 9

Published by the National Center on Educational Outcomes

Prepared by:
Martha Thurlow, Jane Minnema, John Bielinski, and Kamil Guven

October 2003


This document has been archived by NCEO because some of the information it contains may be out of date.


Any or all portions of this document may be reproduced and distributed without prior permission, provided the source is cited as:

Thurlow, M., Minnema, J., Bielinski, J., & Guven, K. (2003). Testing students with disabilities out of level: State prevalence and performance results (Out-of-Level Testing Project Report 9). Minneapolis, MN: University of Minnesota, National Center on Educational Outcomes. Retrieved [today's date], from the World Wide Web: http://cehd.umn.edu/NCEO/OnlinePubs/OOLT9.html


Executive Summary

Over the past decade, states have struggled to include all students, particularly students with disabilities, in large-scale assessment and accountability programs. The use of accommodations and alternate assessments for students with disabilities has increased access to states’ standards-based measures. However, states continue to indicate that there is a group of students who have not yet been exposed to grade-level curriculum, so that testing them at their grade of enrollment is not considered possible. In response to this concern, 14 states tested students below their grade of enrollment during 2000-2001.

This approach to testing has grown within a contentious environment in which the advantages and disadvantages of out-of-level testing are debated at the local, state, and federal levels of the American education system. One of many concerns is that too many students will be tested out of level. A second concern is that below-grade-level testing reflects inappropriately low expectations, and that too many students may be tested at too low a grade level.

To address these concerns, we requested state data to answer two research questions: (1) How many students with disabilities are tested below their grade of enrollment in each state’s standards-based large-scale assessments? (2) What do test performance data show about the difficulty of tests for students with disabilities who are tested below their grade of enrollment? Of the 14 states invited to participate, three provided test results for our analysis. The participating states provided data that they had already analyzed; as a result, data are reported here in different ways.

Results showed widely divergent rates of students tested below grade level, from approximately 20% in one state to 50% in another. Percentages varied somewhat by content area (higher in reading) and grade (higher in the upper grades). Results on the difficulty of the tests for students tested below grade level also varied by state. Overall, just one state had large numbers of students performing at high levels on below-level tests, suggesting that a more difficult test should have been administered to those students; another state had high numbers performing well in one content area (math).

The results of the study clearly indicate that the percentages of out-of-level tests within a state are influenced by state policies. Also, the finding that students perform within expected ranges (i.e., not too high) on many out-of-level tests may be more an indictment of instruction and of these students’ access to grade-level curriculum than an endorsement of out-of-level testing. Finally, while the data from this study are not necessarily representative of all states that tested students with disabilities out of level in the 2000-2001 school year, they do demonstrate wide variability, even across only three states. Such variability is certainly a red flag for states or districts with out-of-level testing policies.


Overview

Since the advent of standards-based reform with federally mandated statewide testing as a means to measure student and system achievement toward grade-level academic standards, states have struggled with the best ways to include students with disabilities. Most students with disabilities take the regular assessment with or without accommodations, and about 1% of the total student population (approximately 10% of students with disabilities) participates in an alternate assessment developed for students with significant cognitive disabilities. Yet states have still expressed concerns about the appropriateness of the assessments (Almond, Quenemoen, Olsen, & Thurlow, 2000). To better include these students in large-scale assessments, states have added other testing options to their statewide testing programs.

One alternative option to the regular assessment with or without accommodations or the alternate assessment that is used in some states (14 in 2000-2001) is testing students with disabilities using tests intended for students at a lower grade level. This option is often called out-of-level testing. Out-of-level testing is a controversial and politicized approach to standards-based assessment (Thurlow & Minnema, 2001). Historically, out-of-level testing was thought to reduce student test anxiety, yield a more accurate measure of student academic achievement, and allow more students with disabilities to be included in state testing and accountability programs. However, recent work at the National Center on Educational Outcomes (NCEO) has demonstrated that out-of-level test scores are rarely reported publicly (Minnema & Thurlow, 2003) or used for either student or system accountability purposes. There are also no data that definitively determine whether student test anxiety is actually reduced during below-grade level testing. In fact, in case studies (e.g., Minnema, 2003), teachers reported that students with disabilities are embarrassed when taking a test that is lower than the test level of their peers. In addition, research has not sorted out the accuracy and precision of out-of-level test results in terms of measuring students’ progress toward academic standards.

Besides the lack of definitive research, clarification of the issues that surround out-of-level testing is also complicated by the variety of testing options that states have developed. In fact, there are several closely related approaches to non-grade-level testing that may or may not be viewed as the same thing as out-of-level testing (e.g., levels testing), making it difficult for the field to sort out what really is out-of-level testing and what is not. To complicate the situation further, it is difficult to find any non-grade-level test results in states’ data reports. In other words, there is a lack of information about the number of students who are involved in any type of non-grade-level testing within states that have standards-based assessments (Minnema & Thurlow, 2003). Further, states’ data reports do not indicate the grade levels at which students with disabilities were tested when non-grade-level tests were used. Without these test data, it is impossible to discern patterns in students’ test performance that point to the appropriateness of the test levels administered. We view these data, the number of students taking non-grade-level tests (prevalence) and the performance of these students, as necessary first steps toward understanding how states are including students with disabilities by administering non-grade-level tests.

The purpose of this study was to analyze data from states that offer out-of-level testing for students with disabilities. Analyses were conducted to determine, first, the prevalence of students with disabilities participating in non-grade-level testing options. Specifically, we examined test results from three states for the school year 2000-2001. As a second step in data analysis, we examined the overall performance patterns in the test results to begin to evaluate the appropriateness of administering non-grade-level tests to specific groups of students with disabilities. The study had two research questions:

(1) How many students with disabilities are tested below their grade of enrollment in each state’s standards-based large-scale assessments?

(2) What do test performance data show about the difficulty of tests for students with disabilities who are tested below their grade of enrollment in each state’s standards-based large-scale assessments?


Method

Each of the 14 states that used out-of-level testing in its statewide testing program during the 2000-2001 school year was invited to participate in this study. States could either send raw data files for NCEO researchers to analyze, or they could provide state-analyzed test results for NCEO to use. For states that had not yet determined how to make out-of-level test scores public, we requested special data runs of their out-of-level test results.

Three states consented to provide data. The other states either were not interested in participating or had a variety of issues related to identifying out-of-level test results on a statewide basis. Two of the states that provided data had analyzed data on their own and publicly reported those data. For example, one state analyzed its data and distributed a report statewide for districts to review and examine local patterns of out-of-level test results. The publicly distributed data were examined and included here. Another state analyzed data specifically for NCEO in response to its request.

We varied our analyses of data for prevalence information based on the nature of each state’s data. Generally we used frequency counts, and percentages if possible, to describe the prevalence of testing below grade level for each content area tested in a state. In some states, however, information was available only for the grade in which the test was administered, and not for the grade in which students were enrolled. In some states, it was the opposite – data were available for the students’ grades of enrollment, but did not indicate the specific grade level of the test that was taken. The nature of the prevalence data is clarified in the presentation of each state’s data.
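To make the frequency-count approach concrete, the sketch below tallies out-of-level prevalence by content area from a student-level file. This is a minimal illustrative sketch, not any state’s actual analysis code; the record keys are hypothetical, since each state’s file layout differed.

```python
from collections import Counter

def prevalence_by_content(records):
    """Tally students tested below their grade of enrollment, per content area.

    `records` is an iterable of dicts with hypothetical keys:
    'content' (e.g., 'reading'), 'grade_enrolled', and 'grade_tested'.
    Returns {content: (n_out_of_level, n_total, percent_out_of_level)}.
    """
    totals, out_of_level = Counter(), Counter()
    for r in records:
        totals[r["content"]] += 1
        # A test taken below the grade of enrollment counts as out of level.
        if r["grade_tested"] < r["grade_enrolled"]:
            out_of_level[r["content"]] += 1
    return {c: (out_of_level[c], totals[c], 100 * out_of_level[c] / totals[c])
            for c in totals}
```

In a state whose file recorded only the grade of the test (or only the grade of enrollment), one of these two fields would be unavailable, which is why the prevalence figures reported below take different forms across states.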

To examine test difficulty for students tested out of level, we used states’ performance data as a reasonable indicator of whether a student was appropriately challenged, with criteria for a test being "too hard" or "too easy" based on the available performance data. Because the data varied widely across the three participating states, we used two different approaches to gauge the overall difficulty of out-of-level tests. In State One, we used an indication of the proficiency levels attained by students. We did not have this type of indicator for out-of-level tests administered in State Two and State Three; in those two states, we examined the percentage of items answered correctly, treating fewer than 30% correct as a proxy for a test that was "too hard" (the student scored "quite low") and more than 80% correct as a proxy for a test that was "too easy" (the student scored "quite high").
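For States Two and Three, these criteria reduce to a simple three-way classification of each student’s percent-correct score. A minimal sketch of that rule (the function name is ours, not the states’):

```python
def classify_difficulty(percent_correct: float) -> str:
    """Apply the report's percent-correct criteria for test difficulty."""
    if percent_correct < 30:
        return "quite low"   # proxy for a test that was "too hard"
    if percent_correct > 80:
        return "quite high"  # proxy for a test that was "too easy"
    return "expected"        # 30-80% correct: appropriately challenging
```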


Results

The results of our analyses are presented below for each state separately. States’ names are not reported here, but a general description of the assessment system in each state as it existed in 2000-2001 is provided to give context to the results.

 

State One

Assessment Program. State One’s statewide assessment program included two criterion-referenced tests, one administered in grades 4, 6, and 8, and the other administered in grade 10. These tests were aligned with the state’s framework of K-12 curricular goals and standards in reading, writing, and mathematics. The grade 10 assessment, which also included science, was not specifically a graduation exit exam, but students who met or exceeded the goal standard in each content area on this test received a certification of mastery in those areas.

In this state, special education students who were thought to be unable to participate in the standard statewide assessment program had the option of participating in one of two alternate assessments. Alternate Assessment Option 1, out-of-level testing, was designed for students who had not received any grade-level instruction on the skills covered by the regular state assessments. These students typically had moderate disabilities and had been instructed below grade level over consecutive school years; a standard test administration at these students’ assigned grade levels was thought to result in an invalid assessment of their academic achievement. Therefore, the test was administered two or more grade levels below the grade in which the student was enrolled (e.g., a tenth-grade student takes the grade 8 test). This option was not available to fourth-grade students for the writing test. The second option, Alternate Assessment Option 2, was a skills checklist designed for students who did not participate in an academic curriculum due to severe disabilities.

The decision of which assessment option to use with each student was made by the student’s Individualized Education Program (IEP) team and was documented in the student’s IEP. It was expected that about 15 percent of special education students would participate in Alternate Assessment Option 1 (out-of-level testing) and approximately 5 percent would participate in Alternate Assessment Option 2 (skills checklist).

Prevalence of Out-of-Level Testing. Table 1 displays the number of special education students enrolled in each grade who were tested out of level in reading, math, and writing in 2000-2001 in State One (e.g., 28% of grade 8 special education students taking the reading test were tested out of level). Overall, approximately 30% of the special education students were tested out of level in reading and math across test grades 4, 6, and 8. Fewer students, approximately 20%, were tested out of level in writing, probably because no writing prompt was available at grade 2.

 

Table 1. Out-of-Level Testing Prevalence by Grade and Content Area in State One

| Grade Enrolled | Total Number of Special Education Students | Out-of-Level Reading | Out-of-Level Math | Out-of-Level Writing |
|----------------|--------------------------------------------|----------------------|-------------------|----------------------|
| Grade 4        | 5,064                                      | 1,612 (32%)          | 1,363 (27%)       | *                    |
| Grade 6        | 5,376                                      | 1,794 (33%)          | 1,672 (31%)       | 1,036 (20%)          |
| Grade 8        | 5,503                                      | 1,582 (28%)          | 1,595 (29%)       | 1,227 (22%)          |

* No grade 4 student was able to take a lower level test in writing because a prompt was not available.

 

Table 2 presents, for each grade of enrollment, the number of students tested out of level in reading and the percentage tested at each lower grade level. For instance, of the 1,794 6th grade students who were tested out of level, 32% took the grade 2 test and 68% took the grade 4 test. Overall, the largest number of students tested out of level was in the 6th grade. For each grade of enrollment, the largest percentage of students was tested two grade levels below that grade.

 

Table 2. Percent of Students Enrolled in Each Grade Who Were Tested at Each Out-of-Level Test Grade in Reading in State One

| Grade Enrolled | Total Count of Out-of-Level Tests | Tested at Grade 2 | Tested at Grade 4 | Tested at Grade 6 |
|----------------|-----------------------------------|-------------------|-------------------|-------------------|
| 4              | 1,612                             | 100%              |                   |                   |
| 6              | 1,794                             | 32%               | 68%               |                   |
| 8              | 1,582                             | 16%               | 40%               | 44%               |

 

Placement Accuracy. In Table 3, we show the percentage of students at each grade of enrollment scoring at or above the goal level on each out-of-level reading test; scoring at this level would indicate that the test probably was too easy for them. As is evident in the table, approximately 30% of students at each grade of enrollment (4, 6, 8) who took the grade 2 test scored at or above goal (level 3). The percentage dropped to about 10% for the grade 4 and grade 6 tests.

 

Table 3. Number and Percentage of Students by Score Band for Students Tested Out of Level in Reading in State One

Grade 2 Test

| Grade Enrolled | Level 1*  | Level 2   | Level 3** | Total |
|----------------|-----------|-----------|-----------|-------|
| 4              | 759 (47%) | 388 (24%) | 465 (29%) | 1,612 |
| 6              | 288 (50%) | 126 (22%) | 159 (28%) | 573   |
| 8              | 96 (38%)  | 68 (27%)  | 87 (35%)  | 251   |

Grade 4 Test

| Grade Enrolled | Level 1*  | Level 2   | Level 3   | Level 4*** | Total |
|----------------|-----------|-----------|-----------|------------|-------|
| 6              | 789 (65%) | 165 (14%) | 149 (12%) | 118 (10%)  | 1,221 |
| 8              | 423 (67%) | 76 (12%)  | 78 (12%)  | 59 (9%)    | 636   |

Grade 6 Test

| Grade Enrolled | Level 1*  | Level 2   | Level 3   | Level 4*** | Total |
|----------------|-----------|-----------|-----------|------------|-------|
| 8              | 436 (63%) | 99 (14%)  | 85 (12%)  | 75 (11%)   | 695   |

* 1 = Intervention Level
** 3 = Goal Level for Grade 2 Test
*** 4 = Goal Level for Grades 4 and 6 Tests

 

Similar data were presented by State One for the content areas of mathematics and writing. These data are summarized in Table 4, which lists just the percentages of students in each grade of enrollment who performed at the intervention level and at the goal level. As is evident in this table, in each grade there were some students who performed at or above goal level. It is also evident, however, that the percentage of students performing at the intervention level increased with the grade level of the test.

 

Table 4. Percentage of Students Tested Out of Level Who Scored in Intervention and Goal Level Bands in State One

Math

| Grade Enrolled | Test Taken     | Intervention | Goal | Total |
|----------------|----------------|--------------|------|-------|
| 4              | Grade 2 Test*  | 27%          | 22%  | 1,363 |
| 6              | Grade 2 Test*  | 26%          | 19%  | 491   |
| 6              | Grade 4 Test** | 39%          | 13%  | 1,181 |
| 8              | Grade 2 Test*  | 33%          | 17%  | 230   |
| 8              | Grade 4 Test** | 47%          | 9%   | 641   |
| 8              | Grade 6 Test** | 54%          | 5%   | 724   |

Writing

| Grade Enrolled | Test Taken     | Intervention | Goal | Total |
|----------------|----------------|--------------|------|-------|
| 6              | Grade 2 Test*  | 41%          | 11%  | 1,036 |
| 8              | Grade 2 Test*  | 46%          | 11%  | 600   |
| 8              | Grade 4 Test** | 38%          | 9%   | 627   |

* Goal Level was 3.
** Goal Level was 4.

 

State Two

Assessment Program. State Two assessed students in grades 3, 5, 8, and 10 on the state’s content and performance standards. Corresponding to each grade were "Benchmarks": Benchmark 1 corresponded to 3rd grade, Benchmark 2 to 5th grade, and Benchmark 3 to 8th grade. In 10th grade, the benchmark was called the Certificate of Mastery Benchmark. For each benchmark, State Two had three test levels, referred to as Levels A, B, and C. The three levels addressed the same content and concepts, and they shared some common items; however, they differed in the overall difficulty of the items. All students could take one of the three levels (A, B, or C) corresponding to their grade benchmark, with Level A referred to as "challenging down" and Level C referred to as "challenging up." For students with disabilities, the challenge down could extend into lower benchmarks; this arrangement had to be designated in the student’s IEP. Another option in State Two’s assessment system for students receiving special education services was to participate in one of two alternate assessments: (1) the Extended Reading, Extended Math, and Extended Writing Assessments, intended for students whose instructional level was well below Benchmark 1; and (2) the Extended Career and Life Roles Assessment.

For students "challenged" against their enrolled grade level benchmark, students were assigned to the level best aligned to their ability as indicated by the four criteria: (a) student’s performance from a prior grade, (b) 20-item locator test, (c) results from a sample test provided by the state, and (d) professional judgment.

 

Prevalence of Out-of-Level Testing. Prevalence and performance results for State Two’s Reading Literature test are displayed in Table 5. The table shows the number of students with disabilities taking each benchmark test below their enrolled grade. Overall, of all students tested on Benchmarks 1, 2, and 3, 12% (n=1,344) were enrolled in a grade level above the benchmark on which they were tested. The percentage of students taking each benchmark who were enrolled in higher grades decreased as the benchmark increased: 19% (n=730) of all students taking Benchmark 1 (n=3,794) were enrolled in higher grade levels; for those taking Benchmark 2 (grade 5), the percentage decreased to 11%; and by Benchmark 3 (grade 8), the percentage was 4%.

 

Table 5. Percent of State Two Students in Each Performance Group on Reading Literature Test

| Benchmark | Test Condition | N      | Percent | <30% correct | 30-80% correct | >80% correct |
|-----------|----------------|--------|---------|--------------|----------------|--------------|
| 1         | Below grade    | 730    | 19      | 17           | 80             | 3            |
| 1         | On grade       | 3,064  | 81      | 12           | 84             | 4            |
| 2         | Below grade    | 478    | 11      | 18           | 74             | 8            |
| 2         | On grade       | 3,976  | 89      | 8            | 80             | 12           |
| 3         | Below grade    | 136    | 4       | 18           | 79             | 3            |
| 3         | On grade       | 3,228  | 96      | 13           | 83             | 4            |
| Overall   | Below grade    | 1,344  | 12      | 18           | 77             | 5            |
| Overall   | On grade       | 10,268 | 88      | 11           | 82             | 7            |
 

Placement Accuracy. Table 5 also shows that the percentages of students scoring above 80% correct (quite high) were relatively low, never exceeding 8% of the students taking a below-grade benchmark. In other words, students taking the reading test below grade were more than three times as likely to score below 30% correct (quite low) as to score above 80% correct. In comparison, among students taking the reading test on grade, 7% scored above 80% correct and 11% scored below 30% correct.

 

State Three

Assessment System. The assessment system in State Three differed from the other states in that it measured performance on a statewide curriculum, with assessments in reading and math in grades 3-8. The out-of-level tests in this state were alternative assessments developed specifically to assess students in special education who received instruction on the curriculum below their grade of enrollment, covering instructional levels K-8. The student’s IEP team decided which test the student would take based on the student’s primary level of instruction, and this level could differ by content area.

Both the regular assessments and the alternative tests were mastery assessments, with distributions of scores skewed so that most students scored well above 50% correct and very few scored below 30% correct. The criteria of more than 80% correct (scored quite high) and less than 30% correct (scored quite low) therefore do not apply well to results from these kinds of assessments. It would be informative to examine the actual alignment between the assessment assigned to the student and the curriculum in which the student received instruction. However, since information on that alignment was not available to us, we could only compare performance on the regular assessment to performance on the out-of-level assessment, with the assumption that comparable performance would indicate a comparable level of challenge for the student.
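Under that assumption, the comparison amounts to contrasting the score-band distributions across the three test conditions. A minimal sketch of such a contrast, with hypothetical record keys rather than State Three’s actual data layout:

```python
def band_distribution(records, condition):
    """Percent of students in each score band for one test condition.

    `records` is an iterable of dicts with hypothetical keys:
    'condition' ('below', 'on', or 'regular') and 'percent_correct' (0-100).
    """
    scores = [r["percent_correct"] for r in records if r["condition"] == condition]
    n = len(scores)
    low = sum(s < 30 for s in scores)    # scored quite low
    high = sum(s > 80 for s in scores)   # scored quite high
    return {"<30%": 100 * low / n,
            "30-80%": 100 * (n - low - high) / n,
            ">80%": 100 * high / n}

# Roughly matching distributions for 'below' and 'regular' would suggest
# a comparable level of challenge under the report's assumption.
```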

Prevalence of Out-of-Level Testing. Tables 6 and 7 summarize the frequency with which students with disabilities participated in State Three’s assessment system under three options: alternative assessment below grade, alternative assessment on grade, and regular assessment. As is evident in these tables, more students with disabilities took the alternative assessments below grade than the other two options. Overall for reading, 62% of students with disabilities took the below-grade alternative assessment and 2% took the on-grade level alternative assessment; 36% took the regular assessment on-grade level. The percentages in math were 56%, 3%, and 41%.

 

Table 6. Percent of State Three Students in Each Performance Group on Reading Tests

| Grade   | Condition    | N       | Percent | <30% correct | 30-80% correct | >80% correct |
|---------|--------------|---------|---------|--------------|----------------|--------------|
| 3       | Below grade  | 21,262  | 57      | 1            | 46             | 53           |
| 3       | On grade     | 1,370   | 4       | 2            | 60             | 39           |
| 3       | Regular test | 14,747  | 39      | 2            | 39             | 59           |
| 4       | Below grade  | 26,149  | 63      | 1            | 46             | 53           |
| 4       | On grade     | 1,140   | 3       | 1            | 46             | 52           |
| 4       | Regular test | 14,116  | 34      | 1            | 42             | 57           |
| 5       | Below grade  | 27,388  | 64      | 1            | 45             | 54           |
| 5       | On grade     | 989     | 2       | 2            | 56             | 42           |
| 5       | Regular test | 14,223  | 33      | 2            | 50             | 48           |
| 6       | Below grade  | 26,852  | 64      | 2            | 52             | 46           |
| 6       | On grade     | 798     | 2       | 4            | 56             | 41           |
| 6       | Regular test | 14,336  | 34      | 3            | 56             | 41           |
| 7       | Below grade  | 25,221  | 62      | 3            | 53             | 44           |
| 7       | On grade     | 799     | 2       | 4            | 53             | 43           |
| 7       | Regular test | 14,811  | 36      | 4            | 61             | 35           |
| 8       | Below grade  | 23,446  | 60      | 3            | 49             | 48           |
| 8       | On grade     | 644     | 2       | 1            | 44             | 55           |
| 8       | Regular test | 15,081  | 38      | 2            | 58             | 40           |
| Overall | Below grade  | 150,318 | 62      | 2            | 48             | 50           |
| Overall | On grade     | 5,740   | 2       | 2            | 53             | 45           |
| Overall | Regular test | 87,314  | 36      | 2            | 51             | 47           |

 

Table 7. Percent of State Three Students in Each Performance Group on Math Tests

| Grade   | Condition    | N       | Percent | <30% correct | 30-80% correct | >80% correct |
|---------|--------------|---------|---------|--------------|----------------|--------------|
| 3       | Below grade  | 17,630  | 48      | 1            | 46             | 53           |
| 3       | On grade     | 1,743   | 5       | 2            | 60             | 39           |
| 3       | Regular test | 17,602  | 48      | 5            | 67             | 28           |
| 4       | Below grade  | 22,551  | 55      | 1            | 46             | 53           |
| 4       | On grade     | 1,383   | 3       | 1            | 46             | 52           |
| 4       | Regular test | 16,857  | 41      | 2            | 66             | 32           |
| 5       | Below grade  | 24,285  | 57      | 1            | 45             | 54           |
| 5       | On grade     | 1,072   | 3       | 2            | 56             | 42           |
| 5       | Regular test | 16,988  | 40      | 2            | 50             | 48           |
| 6       | Below grade  | 24,327  | 58      | 2            | 52             | 46           |
| 6       | On grade     | 888     | 2       | 4            | 56             | 41           |
| 6       | Regular test | 16,577  | 40      | 6            | 69             | 25           |
| 7       | Below grade  | 23,609  | 58      | 3            | 53             | 44           |
| 7       | On grade     | 836     | 2       | 4            | 53             | 43           |
| 7       | Regular test | 16,185  | 40      | 5            | 79             | 16           |
| 8       | Below grade  | 22,651  | 58      | 3            | 49             | 48           |
| 8       | On grade     | 704     | 2       | 1            | 44             | 55           |
| 8       | Regular test | 15,550  | 40      | 4            | 79             | 17           |
| Overall | Below grade  | 135,053 | 56      | 1            | 49             | 50           |
| Overall | On grade     | 6,626   | 3       | 2            | 53             | 45           |
| Overall | Regular test | 99,759  | 41      | 4            | 68             | 28           |

 

Placement Accuracy. Overall, the percentage of students scoring more than 80% correct was similar across the assessment options on the reading test. However, on the math test, a substantially greater percentage of students taking the alternative test below grade or on grade scored more than 80% correct than of students taking the regular test. In both reading and math, the percentage of students taking the regular test who scored more than 80% correct dropped as grade level increased; the percentage remained about the same for students taking the alternative assessment.


Discussion

Our examination of the prevalence of out-of-level testing in three states revealed that it is important to consider both the purpose of the test and the context within which it occurs. While standards-based reforms are prompting the serious study of states’ assessments in their function as accountability tools, this is not the context within which all states started their assessment policies, particularly those related to students with disabilities and out-of-level testing. It is important to recognize that out-of-level testing comes in a variety of packages, reflected in the fact that states across the nation do not all use the term "out-of-level testing" for the practice of allowing a student to take an assessment designed for a grade level lower than the student’s grade of enrollment.

Each of the three states for which we were able to obtain data on the prevalence of out-of-level testing is distinct in certain ways from the others. In some cases, the differences are reflected in statements about purpose and about the expected numbers of students who will participate in out-of-level testing. In other cases, the differences are reflected in the decision-making criteria and processes used to determine which level of a test a student will take. The mechanism by which a student who is to be tested out of level actually takes the lower level test also varies, with at least one state shifting students into a test specifically developed for assessing content at lower grade levels.

 

Prevalence of Out-of-Level Testing

Evident in this report is the fact that there are different ways to look at out-of-level testing and its prevalence. One way is to look at the number of 4th grade tests administered, for example, and determine how many of those were taken by students enrolled in a higher grade. Another approach is to look at all students enrolled in a grade and determine what percentage of those students are taking the test for the grade level in which they are enrolled and what percentage are taking a lower-level test. The two approaches give slightly different pictures of out-of-level testing prevalence.
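The two views amount to a change of denominator. As a hypothetical illustration, assuming student-level records like those sketched in the Method section:

```python
def prevalence_two_ways(records, grade):
    """Contrast two denominators for out-of-level prevalence at one grade.

    View 1: of all tests administered at `grade`, the share taken by
            students enrolled in a higher grade.
    View 2: of all students enrolled in `grade`, the share who took a
            lower-level test.
    """
    taken_at = [r for r in records if r["grade_tested"] == grade]
    view1 = sum(r["grade_enrolled"] > grade for r in taken_at) / len(taken_at)

    enrolled_at = [r for r in records if r["grade_enrolled"] == grade]
    view2 = sum(r["grade_tested"] < grade for r in enrolled_at) / len(enrolled_at)
    return view1, view2
```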

How large the out-of-level percentages are seems to be influenced by state policies. For example, if a state requires that a student be at least three years behind before being allowed to take an out-of-level test, the number of students taking tests at the lowest grades is going to be larger than might be expected. If a state has no limitation on how far behind the student must be, or if the state simply requires that the student not be performing on grade level, then out-of-level testing numbers are more likely to be spread across the grades, and likely larger overall.

It is clear from the three states included in our study that prevalence rates do vary from state to state. The state that reported the percentage of students enrolled in grades who took lower grade level tests (State One) had a fairly consistent percentage of approximately 30% of students taking out-of-level tests.

The two states for which prevalence data were expressed in terms of the percentage of tests taken that were out of level (States Two and Three) showed widely divergent percentages of out-of-level testing. State Two, which had a locator test available to help in the selection of testing level, had percentages that ranged from 19% down to 4%; in other words, most of the out-of-level tests were at the lowest grades, with students taking the Benchmark 1 (3rd grade) tests regardless of their actual grade levels. In State Three, which required only that the student not be performing on grade level to be eligible for out-of-level testing (the alternative test), well over 50% of students with disabilities were tested below their actual grade level. It is interesting that in State Three, some students took the alternative test at the same level as their grade of enrollment. This unexpected finding may indicate that something about the alternative assessments makes them more appropriate for students with disabilities even at the same difficulty level, such as being more universally designed (Thompson, Johnstone, & Thurlow, 2002), or that decision makers perceive them to be more appropriate for special education students. Other factors may also be at play; for example, if the alternative test scores did not factor into school accountability measures during this test cycle, teachers may have been tempted to place students in an assessment where their scores would not count, even when the students were not actually performing below grade level.

 

Placement Accuracy

Our examination of the performance of students who were tested out of level was an attempt to gauge the accuracy of the decision that those students should be tested out of level. What this approach cannot factor in is the possibility that placement accuracy is, in part, a function of the instruction the student received rather than of the student’s capabilities. Looking at the percentages of students who performed in the 30-80% correct range, most of the data that we obtained from the states indicated that this was happening, perhaps leading to the conclusion that many out-of-level placements are accurate and appropriate.

Looking just at the percentages of students who scored quite high (>80% correct), we see that many students were placed at a test level that was probably too low. Clearly, there was variability among the three states, variability that reflects their criteria for placement in out-of-level testing. Thus, in State Three, which had the least restrictive policy (i.e., the student simply had to be judged to be in a curriculum below grade level), we saw the largest percentages of students performing at levels indicating that the test was too easy; nearly half of the students in this state who took a below-grade-level test were taking a test that was too easy for them. In states with much stricter policies (e.g., State One, where students must not have received any grade-level instruction over consecutive school years), the percentages of students performing in the "too easy" range were much lower. These students performing at the upper levels, whether few or many, are significant because they indicate inappropriate placement, suggesting that educators hold expectations for them that are too low, and that as these students go up in grades they are likely to fall farther and farther behind their peers.

While it is important to look at prevalence figures, it is also important to look beyond the figures to what is happening instructionally for students who are tested out of level. We already know that it is unlikely (or at least, has been in the past) that the scores of students tested out of level will be publicly reported or included in accountability measures (Minnema & Thurlow, 2003). The prevalence data suggest that, generally, as students go up in grade they are tested at lower and lower grade levels; in other words, instead of staying a fixed number of levels below their grade of enrollment, students fall an increasing number of levels below it. This suggests that something is not right. Students who start at a lower level should at least be progressing at the same pace as their peers; otherwise, they will never make adequate yearly progress.

It will be important to continue to look at prevalence and performance information from out-of-level testing. Adjustments to policy seem to be one of the most likely interventions to restrict the number of students taking out-of-level tests. In fact, state policies allowing out-of-level testing seem to be waning in response to No Child Left Behind, which strongly discourages its use (see Title I Regulations, July 5, 2002). At the same time, adjustments to instruction are also likely to be an intervention that will decrease the prevalence of out-of-level testing in those states that allow it.


References

Almond, P., Quenemoen, R., Olsen, K., & Thurlow, M. (2000). Gray areas of assessment systems (Synthesis Report 32). Minneapolis, MN: University of Minnesota, National Center on Educational Outcomes.

Minnema, J., & Thurlow, M. (2003). Reporting out-of-level test scores: Are these students included in accountability programs? (Out-of-Level Testing Project Report 10). Minneapolis, MN: University of Minnesota, National Center on Educational Outcomes.

Minnema, J., & Thurlow, M. (2003, April). Report on a case study of out-of-level testing in a school district. Paper presented at the meeting of the American Educational Research Association, Chicago, IL.

Minnema, J., Thurlow, M., & Scott, J. (2001). Testing students out of level in large-scale assessments: What states perceive and believe (Out-of-Level Testing Project Report 5). Minneapolis, MN: University of Minnesota, National Center on Educational Outcomes.

Thompson, S.J., Johnstone, C.J., & Thurlow, M.L. (2002). Universal design applied to large scale assessments (Synthesis Report 44). Minneapolis, MN: University of Minnesota, National Center on Educational Outcomes.

Thurlow, M., & Minnema, J. (2001). States’ out-of-level testing policies (Out-of-Level Testing Project Report 4). Minneapolis, MN: University of Minnesota, National Center on Educational Outcomes.

U.S. Department of Education. (2001). Twenty-third annual report to Congress on the implementation of the Individuals with Disabilities Education Act. Washington, DC: Author.