Prepared by Rachel Quenemoen, Sandra Thompson, Martha Thurlow and Ken Olsen
This document has been archived by NCEO because some of the information it contains is out of date.
Any or all portions of this document may be reproduced and distributed without prior permission, provided the source is cited as:
Quenemoen, R., Thompson, S., Thurlow, M., & Olsen, K. (1999). Forum on alternate assessment and "gray area" assessment. Minneapolis, MN: University of Minnesota, National Center on Educational Outcomes. Retrieved [today's date], from the World Wide Web: http://cehd.umn.edu/NCEO/GrayAreaForum/Conference_Report.htm
One hundred sixty-one representatives from 42 state departments of education, three large school districts, one territory and the Department of Defense Dependent Schools participated in a forum on June 11-12, 1999 in Snowbird, Utah to discuss alternate assessment and "gray areas" in large scale assessments. The forum, a second annual pre-session to the CCSSO National Large Scale Assessment Conference, was co-sponsored by the Regional Resource and Federal Centers (RRFCs), in partnership with the Council of Chief State School Officers (CCSSO), the National Association of State Directors of Special Education (NASDSE), and the National Center on Educational Outcomes (NCEO).
Purpose of the Forum
The purpose for the forum came from the 1997 Amendments to the Individuals with Disabilities Education Act (Public Law 105-17) and its provisions related to participation in State and district-wide assessment programs. These provisions reflect the increased emphasis on accountability to improve curriculum and instruction, and the demand for more and better information about educational results for children with disabilities. The specific focus of the forum was on:
At this forum a consensus emerged that it is not the students but the assessments for which gray areas currently exist, thus the term "gray area" assessment.
Participants of this two-day forum were state teams working to meet the Alternate Assessment provisions of IDEA. The goal for the forum was to have the teams meet and share with each other what they are doing and the challenges they face. Most of the time was spent in small groups organized by similar demographics, similar levels of development, or similar practices. Resource people from the sponsoring agencies (RRFC, CCSSO, NASDSE, NCEO) were available to assist participants with making needed links with resources. The forum focused on three outcomes for participants:
The conference design included state panels on five major topics of inclusive large-scale assessment, provided in two strands (Strand A and Strand B) relevant to the differing developmental stages of participating states (see the Forum Agenda in Appendix B for design details). Each presentation was followed by small group discussions. The five major topics were:
Two participant observers recorded each presentation and group discussion. Across the content of all presentations, discussions, forum materials, and small group recording forms, 5 areas of concern and 17 related issues were identified through theme analysis. They are presented here with a summary of key concepts from the state presentations and discussion groups. Presenter and participant discussion examples and notes for each issue are displayed in chart format following the summary.
Purpose of This Report
The purpose of this report is to facilitate further networking across states through the coming year. According to the conference conveners, this was a forum of questions more than answers, and the continuation of a process of jointly discovering good answers to tough questions and issues.
Participants are encouraged to use the issue charts as impetus to connect to presenters or other participants to pose new questions and arrive at better answers. State presenters uniformly identified their states' practices as in development, emerging as they spoke, and not as a final process and product. With continued exchanges and discussion, it is hoped that by the next annual forum, we will be able to move from discussions of "emerging practices" to discussing "best practices."
Structure of This Report
In this report, we have taken recorded ideas from presenters and participants and organized them in chart form by area of concern and related issue. There is no further "narrative" of the proceedings, simply an idea by idea display of key concepts discussed and explored by the presenting states and discussion groups. State presenters were asked to review ideas attributed to them for validity, and to verify they were willing to be listed as their state contact. The updated participant list from the forum is in Appendix D.
Earlier work on inclusive assessment and accountability systems is reflected in the areas of concern that emerged from this forum. See the resource section following the charts and notes for Web addresses and additional citations on these topics.
The appendices contain:
Process approach: The data from the conference strongly suggest that developing an inclusive assessment system is a complex process with interacting key components. For example, philosophy and belief (why), whether articulated or left unarticulated, shape how standards (what) are chosen, which in turn influences decision rules to include or exclude groups or individuals (who). Lack of clarity on purpose and use of assessment (why), or lack of skills and understanding on the part of key implementers (training and development), leads to stakeholder concern and misunderstanding or unexpected political outcomes (context), at times causing revisiting or even redesigning of earlier decisions. Lack of alignment of beliefs, policies, and procedures creates difficult technical problems (how).
Continuing definitional issues: Participants noted that difficult definitional issues remain across all areas of concern, including basic terms such as accommodations and modifications. Terms mean different things to different people. A tool used at the conference for resolving these differences is in Appendix C, an Assessment Terminology Test.
The five areas of concern that emerged from the forum analysis reflect key processes states uniformly reported as important to their development and implementation of inclusive assessment and accountability systems. These five areas of concern or key processes are:
The related issues that emerged from this forum for each area of concern and summaries of content are presented here. Following the brief overviews of each area of concern, we provide charts that contain key concepts discussed and explored by the presenting states and discussion groups.
AREA OF CONCERN A. Articulating beliefs across stakeholder groups, and building support for a system based on those beliefs.
Related issues: Three issues emerged as key to this area. Forum discussion examples for each issue are on the charts labeled "A1-A3." This area of concern addresses the "WHY?" question, but also focuses on stakeholder involvement, and the need to carefully define use and purpose of assessment strategies.
Chart A1. Philosophy, beliefs: examples of importance or actual belief statements.
Chart A2. Collaborative stakeholder involvement and training: focus on who needs to be involved at all steps.
Chart A3. Use and purpose of assessment strategies: necessity of clarifying use and purpose, and working with stakeholders to maintain clarity.
AREA OF CONCERN B. Identifying the standards students are expected to meet, and determining how to include all students within that framework based on articulated beliefs.
Related issues: Five issues grouped into this area. They focus on "WHAT?" and "WHO?" The examples of discussion ideas that emerged vary, and reflect differing philosophies. These examples clearly show the developing nature of the systems. See charts labeled B1-B5.
Chart B1. Content and performance standards for all students: examples on charts show how presenters and participants are grappling with pros and cons of same vs. different standards, expanded or extended standards.
Chart B2. How many "alternates?" What does the law require? Here again, examples show a range of interpretation.
Chart B3. Decision rules and processes for participation, accommodations: who will participate in what setting, and how they will participate, but also discussion on determining what instruction is given, who makes decisions.
Chart B4. Linkages to school improvement processes including all students; defining "improvement" at lowest levels: focus of examples in this chart is on expectations, and need to rethink basic expectations, holding students and schools accountable.
Chart B5. Out-of-level testing: examples show the developmental process again, with range of current opinion.
AREA OF CONCERN C. Responding to political realities, and identifying those that do not match articulated beliefs, research based understanding, or current technical realities.
Related issues: These two issues are related to "CONTEXT" and "UNINTENDED OUTCOMES." These issues were not part of the organizing principles of the conference as were "Why, Who, How, What, and Reporting and Use," but they clearly emerged as an undercurrent to much of the discussion on development and implementation. See charts labeled C1-C2.
Chart C1. Shifting political context: political realities as a "given" condition.
Chart C2. Intended and unintended outcomes of high stakes: state presenters mentioned specific consequences; discussion groups grappled with the complexities and unknowns.
AREA OF CONCERN D. Building understanding and skills among all school personnel and IEP teams and community partners.
Related issues: These issues relate to "HOW" in that training issues for practitioners emerged from multiple presentations. The related issues of public awareness and understanding emerged as well. See charts D1-D3 for specific examples.
Chart D1. Staff development at preservice and inservice levels for general and special educators: multiple presenters reported the need for intensive training, as did discussion groups.
Chart D2. IEP team training: the IEP team was identified as the major decision making unit.
Chart D3. General community training, policymakers, journalists: examples are related to a marketing and reporting approach.
AREA OF CONCERN E. Working with best practice leaders and researchers to develop valid assessments that "work" for maximum numbers of students, and valid accountability systems that reflect progress of all students.
Related issues: The examples of issues in Charts E1-E4 are difficult and challenging, and are the "HOW," and "REPORTING AND USE" components. All other topics of WHY, WHO, and WHAT as well as CONTEXT and UNINTENDED OUTCOMES play out here. There is a range of opinion and concern expressed that the research base is not adequate for implementation status.
Chart E1. Technical development and design of alternate assessments: a range of options for alternate assessment is included from state presentations.
Chart E2. Technical development and design of inclusive large-scale assessments, including research on accommodations: presentations showed concerns, strategies to address inclusive accountability.
Chart E3. Scoring issues: differences of opinion emerged on how to score across all students, what scores mean depending on use, and purpose of strategy.
Chart E4. Reporting issues: reporting has both technical and contextual challenges.
The remainder of this report contains the charts of findings by areas of concern and related issues and resources. You may find it helpful to browse all the charts, or you may choose to focus on those issues of greatest concern to you. Continuing discussion is encouraged.
Below are the contact names for states that gave presentations. For complete contact information for each state, see Appendix D.
|Mary Ann Mieczkowski
|C. Scott Trimble
Please note that the charts reflect a process framework, with the areas of concern as key process components. States graciously shared thoughtful insights and issues from their "emerging practices." The key concepts from their presentations represent a snapshot of the development of inclusive assessment and accountability systems at one point in time.
A1. Philosophy, beliefs:
State Key Concepts
|Standards-based reforms should benefit all students, including those who receive special education services. Inclusion in assessment is essential for participation in standards-based reform and documents the benefits of reform.
|Our guiding philosophy was we wanted students with disabilities in regular curriculum, and keeping them in regular assessment would promote that.
|We believe all learners can learn at high levels, educators and schools will expect that, and schools will account for ALL.
|This is an opportunity to look at testing issues across the spectrum. "Being counted in the conversation counts a lot."
|Our philosophy was inclusive accountability.
|Local control issues demanded an open process, but it is so important to build one system, not many.
Group discussion: One group said, "You have to know WHO you mean, when you say ALL. That's the first thing that should be done."
A2. Collaborative stakeholder involvement and training:
|We built guiding principles with stakeholders, and we continually check back as we develop.
|Student, parent, teacher needs were balanced with federal requirements.
|Stakeholders helped link regular curriculum to alternate.
|There was active involvement of key stakeholders; partners have actively influenced the direction of the alternate assessment development; partnership of SEA, contractor, advisory committee (stakeholders) with work groups in four areas:
1. Communications, including materials and training.
2. Participation, philosophy, guidelines for IEP teams.
3. Curriculum frameworks, define access to the general curriculum for students taking the MCAS alternate.
4. Assessment strategies.
|Special Education Advisory Council developed principle statements that guided alternate assessment plans.
|SEA special and general education assessment and evaluation sections plus two universities have been involved.
|It is hard to convince people "all means all," but highest level state support was there. Stakeholders made the decisions on numbers of and which learning areas were chosen from the state learning areas; and from them which standards, benchmarks and real world demonstrations were to be expected.
Group discussion: Discussion suggested the stakeholder group needs to be broader than just teachers. Participants were concerned that teachers set expectations too low. Another group suggested inviting students to give feedback on what works and what doesn't.
A3. Use and purpose of assessment strategies:
|Flexibility is needed to satisfy multiple criteria, for example, informing parents about student progress, holding students accountable, and documenting school and district effectiveness.
|There are three purposes of their state test - student, instruction, system accountability - and discussion of those purposes has raised some validity issues.
|If test is used for system accountability, it's okay to give a 0 score if that reflects where student is. Use another method to get at diagnostics.
Group discussion: Varying opinions and some confusion about use and purpose marked the discussion groups. Some concern was expressed about how to clarify use and purpose to media outlets, general public, parents, students.
On the question, "What difference does it make if you decide that your assessment is for school accountability versus student accountability?" one group reported, "It is difficult to use the same assessment to measure all three areas of system, school, and student. It is also important to define what each of these areas mean and how information will be used." Another group responded, "There are differences in test construction, reliability measures, confidentiality issues, and reporting differences." However, one group reported, "Some assessments are for all three purposes, and the unit of analysis for reporting would vary." This is an area that needs careful and public discussion, related to overall philosophy and belief as well as technical consideration.
B1. Content and performance standards for all students:
|We are using expanded state content standards, and defining "access skills," underlying skills leading to content standards, life outcomes, career development, and community membership.
The state's 12 content standards are highly academic in content, but we've learned from work done in school to career and special populations that there are defined "access skills" to get to those content standards, so there is a continuum on which every student fits. Some students may need direct instruction on things other kids are getting incidentally. The target for all students is the grade level benchmarks and proficiencies. Extensions of the benchmark for instruction and assessment focus on these key components or basic proficiencies required by the standard. Access skills and key components aren't separate, but are all foundational skills that help support learning of more advanced steps in the continuum of learning. We have interpreted the reading, writing and math benchmarks to their essential concepts. The expanded standards process can be for a broad range of kids. It can be infused into the IEP and be used for students whether or not they're in the alternate test path. When you systematically are looking at student learning and performance related to one set of standards, it helps educators hold higher expectations for students to meet standards.
|We are bridging between state content standards, looking at essential elements in broad meaning of standards, worked with teachers to develop performance indicators for 14 of 38 state standards.
We asked teachers to take 38 content standards into classrooms to see how students with disabilities did, what they could do, what were the essential elements. We looked for broad meaning of standards and listed examples of performance indicators aligned with 14 of the 38 standards, called functional standards. Teachers like it because it gives them curriculum, some structure to use in developing IEPs, and leads students in the same direction as general education.
|They developed a subset of standards for students who go for a special diploma. The tie-in to State standards is that the subset begins with a standard that says, "the student will participate and make progress in the Sunshine State Standards as appropriate to the individual student."
|Standards for all are linked with the general curriculum, based on the state's academic learning standards.
|They developed separate outcomes for 12 categories of disability, shifted to life roles; they had difficulty linking to state standards, but are starting to think about how to do that.
We had developed outcomes for 12 categories of disability, and shifted from those categories to "life roles." There are 4 levels: full independence (should not need alternate), functional independence, supported independence, participation.
|There is a functional context of state standards for our state alternate assessment, linked to curriculum framework.
Our Show Me standards define what all students should know and be able to do. For our Alternate, we have a functional context defined for each state standard. We also give teachers sample learning scenarios for these in the Alternate Curriculum Framework.
|A decision was made to stay with Board adopted standards, and we developed lower level items for career related learning strand.
The state assessment director went to the board to develop lower level items for the career related learning strand, functionally assessing kids in natural environments for reporting purposes.
|Stakeholder groups determined how many of state learning areas required, minimum number of standards in each area, minimum number of benchmarks required for each standard, how many real world performance indicators required.
Group discussion: There was extensive discussion of pros and cons of same versus different standards.
Positive effects of having "same" standards for all students: Access to general education, high expectations, spirit of the law is met with full inclusion.
Negative effects of having "same" standards for all students: Some students will never meet them, could expand them so far as to be meaningless, and it changes from performance standards to a questionable standard.
Positive effects of "different" standards for some students: May meet individual needs of kids better.
Negative effects of "different" standards for some students: It is a separate system, doesn't meet the spirit of the law, doesn't encourage access to general education curriculum.
B2. How many "alternates?" What does the law require?
|Assessment is seen as a continuum, somewhere every student should fit in ONE continuum - not "if" kids are going to be included, but "how." State alternate will be parallel to state general assessments by content and grade levels.
Suggested strategies will be developed for districts to collect a body of evidence that can be used to describe student progress and serve as alternates to district tests. You may measure with no accommodations, accommodated, modified or on expanded standards, but assessment is on one continuum of learning.
|They include as many as possible in general education curriculum, but there are some for whom the regular test is not accessible without modifications, yet they are not eligible for alternate.
Students with 504 or IEP plans could take a regular assessment, a regular assessment with accommodations, or the alternate assessment. But a modified assessment - which is the regular assessment with modifications - must be cleared with the state, inservice would be provided, and results will not be reported as regular assessments. They will monitor this closely to measure effectiveness.
|The alternate is part of, not apart from MCAS - one alternate, but linked.
|Considering two alternates - one a developmental alternate in reading, writing, math; other a functional assessment.
B3. Decision rules and processes for participation, accommodations:
|State guidelines restrict eligibility for alternate assessment to those students with more significant disabilities who are primarily involved in a functional curriculum. IEP teams need training in order to make educationally appropriate decisions, particularly for students who may be borderline. Because there are parent exemptions allowed by law, we are concerned that parents may be asked by schools, "Don't you really want to exempt your child?" when in fact the student is special education but should be in regular assessment. We don't know that yet.
|There are five state assessments, and a decision can be made to exempt from one area and not others; we need to see if criteria provide sufficient direction to IEP teams.
|Developed decision rules with a seven point scale, scored by IEP team.
|Post school goals for the student help determine decisions - instructional focus is either independence or academic. May participate in some content and area assessments, not others; accommodated in some, not others.
|Curricular validity is an issue - students should be getting curriculum that allows them to make progress toward standards.
Group discussion: There was concern that an IEP team could overrule state law regarding participation and use of accommodations; this was not resolved. Groups discussed a need to audit the IEP processes. There was quite a lot of discussion on use of sanctions for "cheating" in the decision process, in order to avoid a more rigorous curriculum and appropriate supports. There was a question on "caps" in numbers allowed to participate in the alternate assessment.
B4. Linkages to school improvement processes including all students; defining "improvement" at lowest levels:
|Content standards are implemented by having IEP teams classify instructional goals into functional domains using a rubric. Performance standards are applied by having IEP teams rate goal attainment in terms of level of mastery/progress on a four point scale. Districts are encouraged to use a variety of other measures, ranging from local portfolios to locally adopted assessments.
|Student can get a special diploma, but continue working toward regular standards until 22.
|They looked at the IEP as the potential measure of "value added" by special education, and found quality problems - they are intervening on IEP goal quality to get to this. They hope to use the IEP to aggregate results, and they saw that quality had to improve. There are three important decisions involved:
1. How much progress should the student make in a given period of time?
2. Discrepancy between what child does and the criteria the child should attain.
3. An independence conclusion - does instruction on the goal lead to more independence for the student, and how will we know?
The question they posed is: does a standards referenced system affect the type of goals written for students receiving special education services? If yes, does it improve student learning? There are philosophical positions, but are there data?
|The AUEN materials focus on four levels of independence in major life roles that students with varying levels of impairment can realistically be expected to achieve, with framework and key to judge quality.
Group discussion: Several small groups said stakeholders and teachers need to work on development of what content and performance standards are used to define expectations. Both groups need to rethink basic expectations. This is not just about accountability, it is "keeping your eye on the prize," raising the bar for students and improving results, not just about "alternate assessment." Others suggested a need to carefully validate the benchmarks, study the validity of performance indicators for alternate assessment. A consensus of many groups was the hope of increasing the number of students who have access to general education curriculum.
B5. Out-of-level testing:
|Test publishers' practice permits full reporting of results when tests are administered one year out-of-level; IEP teams have the authority to decide on the level.
|We are not doing out-of-level testing at this time - we want to know "how are 4th Graders doing on math, all kids in the content area, at that level" - to allow out-of-level this early, before changes occur in MO classrooms, may
We're convinced through professional development and high expectations for all kids, kids will achieve better results and those who are not obtaining an achievement level now will be able to do so in the future. It's back to the purpose of the test: How are 10th graders doing? Then you need to test 10th graders at level.
|They will allow out-of-level testing for the first time in 1999-2000, intend to measure along a continuum toward state standards, first step will be
"Off grade" (state tests are at 3, 5, 8, and 10; districts may opt for testing at 4th, 6th, 7th, and 9th grades) - This "off grade" testing is not provided at the state level, individual districts may contract for this with testing vendors. "Out-of-level" for them is focused on a continuum toward the standard. You test all 10th graders, but you learn what proportion meets the 10th grade standard, meets 8th grade standard, 5th grade standard, 3rd grade standard, or NO standard.
C1. Shifting political context:
|The statewide assessment was adopted prior to approval of the content standards. There has been progress in aligning the assessment with the Reading and Mathematics standards. The public supports the assessment and sees the need for continuity in measurement and accountability. Although state law calls for assessing all students, IEP teams and parents can obtain exemptions.
Group discussion: Most state presentations referred to shifting political winds being a complicating factor. Discussion groups spent time discussing the political realities and their unintended outcomes. The consensus was you have to understand the internal politics in the governor's office, the state legislature, the state board of education, and come up with a system that will sell. The discussion groups focused on "unintended outcomes." High test scores promote real estate values, said one group, and that has lots of outcomes. Many presenters made a humorous aside about political changes or upheavals that were causing them to redo parts of their work. Each time, the participants would laugh, perhaps in recognition of a reality many shared. Yet no presenter spent much time on it, simply referring to the condition as a given.
C2. Intended and unintended outcomes of high stakes:
|Students in alternate assessment get a certificate, not a diploma.
|If the accountability index goes up for schools, including scores from ALL students, there are substantial rewards; if it goes down or stays flat, there are negative consequences.
|We are exploring high stakes alternate assessment; high stakes MCAS is controversial, we do not have answers yet.
Group discussion: According to several groups, the policy of high stakes for individuals and systems is a double edged sword. In the absence of high stakes, people aren't motivated, but if there are high stakes, it increases pressures to exempt kids. Some groups talked about lack of clarity about what or who is being measured. If teachers are punished for not doing a "good job" the question is "a good job at what?" How is a good job measured, and who is included in the measurement? We don't have our system built or even thought out, but there's lots of expectations of measurement right now.
There should be rewards and sanctions to help move forward. For districts who involve more kids, demonstrate progress, we could waive state red tape, provide more flexibility, more local control, some money for building level or school district.
Area of Concern D. Building understanding and skills among all school personnel and IEP teams and community partners.
D1. Staff development at preservice and inservice levels for general and special educators:
|An audit collection of student data and teacher decision-making showed a reliability problem - went to more training the second year, also formatted reports so teachers weren't threatened by comparisons.
|As they looked at IEP goals as a measurement device, they had to develop intensive training to build skills of teachers.
|From Alternate Assessment eligibility criteria survey, they found more teachers than administrators thought LD and BD should be in alternate - need to focus on training.
|In implementing alternate portfolio system, they found teacher skills needed building, skills on reporting, selecting evidence, and biases against push to numeracy and literacy for students with severe disabilities need intervention.
North Carolina, Tennessee, Kentucky, Indiana were among presenting states that have developed portfolio approaches to alternate assessment. Each reported learning a lot during their pilots, and each reported training as an issue in implementation. North Carolina specifically stated they were disappointed with the quality of their pilot alternate portfolio system, with respect to the skills of the teachers in matching authentic evidences with the assessment objectives. The issue of quality and teacher preparation must be addressed. They found teachers who were involved in the pilot were eager to learn, reported learning much as they looked at other teacher/student work during statewide scoring, and reported that they believed their teaching had been better because of the portfolio alternate assessment.
Group discussion: Both general education and special education staff remain "segregated" in most systems. Moving from "these are our kids" TO "these are ALL our kids" is necessary. That requires staff development, pre and in service. We have to demystify why we are doing this, and build ownership. You have to consider the state's capacity and willingness to do training, or this won't fly.
D2. IEP team training:
Group discussion: If the IEP team is the major decision-making unit, then the skills of all IEP team members in basic assessment and accountability issues have to be raised. If we are to raise low expectations, we need to intervene at the IEP team level.
D3. General community training, policymakers, journalists:
|Audiences for the statewide assessment include students, parents, teachers, administrators, legislators, researchers, and state and federal agencies. These audiences have varying information needs and varying degrees of understanding of measurement topics. Effective communication of assessment results requires sensitivity to the needs and sophistication of these audiences.
Group discussion: This is marketing and training. If the media handles this well, it may control for irrational political decisions.
Area of Concern E. Working with best practice leaders and researchers to develop valid assessments that "work" for maximum numbers of students, and valid accountability systems that reflect progress of all students.
E1. Technical development and design of alternate assessments:
|Technical quality is an issue for alternate assessment. If high stakes are associated with the data gathered, reliability and validity should be measured.
|State assesses some standards, district others - state will build a tool that measures expanded state standards, backed up by a body of evidence in the districts. State alternate strategies focus on content, district focus is the individual. State is point in time, district over time.
|Not developing tools for alternate assessment, but working from local level training teachers on performance based assessment.
|Demonstration of portfolio software for alternate assessment with user friendly technologically based documentation system - allows multiple points in time assessment for alternate.
|Right now IEP teams develop alternate locally while the MCAS state alternate is being developed (based on state standards).
|Strategies include: out-of-level testing, off grade testing (see Section B5 for discussion), standard error of measurement with test item calibration, augment existing measurements, adjust student response mode, employ predictive validity measures, curriculum based measures, access skills, developing lower level items for career related standards strands.
|Alternate assessment is a set of procedures rather than a single test - includes observation, recollection, record review, performance events. Gathered from among Wyoming's educational standards, with benchmarks for each standard, and at least two real world performance indicators for each benchmark. IEP manages process, score given locally on skill level, forwarded to SEA.
E2. Technical development and design of inclusive large-scale assessments, including research on accommodations:
|Given lack of research on accommodations vs. modifications, decisions made by test publishers appear to be based on opinion. Until research is done, very careful reporting needs to occur, and test publishers and state need to work closely.
In a standards-based reform context, decisions about scoring and reporting can have significant downstream policy and educational consequences. Pending the availability of research, due consideration should be given to the interests of students and parents, the integrity of assessment programs, pertinent legal issues, and professionally acceptable practices for scoring and reporting accommodated tests. Looking at the "bigger picture" requires close collaboration between test publishers, educators, and policymakers.
|They decided on Alternate Assessment eligibility criteria, and then where the line is between accommodations and modifications, and then had to report results for both.
|They are exploring the possibility of modified assessments, "special" accommodations, "disability-unique" formats, with assistive technology, and are looking for more and better research on how these affect testing results and reporting.
|They are using NCEO's list of accommodations, but find many who are eligible do not use them. Oral reading, extended time, and taking assessment in small group or individually were used most heavily, others seldom used.
|98% participate in state assessment - some complain "these kids are taking the test but we are not getting any useful data from them." The useful information is that they are or are not making expected progress. Accommodations that do not change the measured construct are available to students, and the results tell us about the system. Only accommodations used by the student in instruction that year are allowed.
E3. Scoring issues:
|On Alternate Assessment, we plan to score on proficiency levels, and we can aggregate those, even if the test is modified.
|We can combine scores on the Kentucky Core Content Test with alternate assessment scores in systems reporting, using standards established and the scores are comparable. But building that in a norm referenced test is harder, don't know how exactly yet. Some technical people raise the concern, "Have you changed the meaning of 'math' by doing this?" but we'd suggest that we've changed the meaning of who gets addressed in the teaching of those areas. .5 to 1% of population is in alternate, counts equally with other student assessments - not more or less.
|Of the students with disabilities who take the assessment, we know those who had an achievement level vs. a "level not determined" outcome. NCEO is doing an analysis of who is in our Step 1 achievement level. We need to learn more about these kids and those who take the test but don't generate an achievement level.
|On state assessment, we believe it is "ok" to give a "0" score - that is a belief about reporting, linked to the technical reality of a systems accountability piece.
Group discussion: Some are concerned that if a student does not take the test and there is no alternate, you shouldn't give a "0". This concept appears to be quite controversial among the participants. Issues discussed ranged from concern about the unintended outcome of increased exemptions if students are not counted to concern about the emotional toll on students and parents who perceive a "0" as "failing" when, in fact, the system is unprepared to assess the students appropriately.
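The aggregation idea described in the scoring comments above (alternate assessment results expressed on the same proficiency levels as the general assessment, with each student counting equally) can be illustrated with a minimal Python sketch. The level names, the function name `level_distribution`, and the sample data are assumptions made for illustration only; they are not taken from any state's actual scoring system.

```python
from collections import Counter

# Hypothetical proficiency labels; each state defines its own levels.
LEVELS = ["novice", "apprentice", "proficient", "distinguished"]


def level_distribution(general_levels, alternate_levels):
    """Return the share of students at each level, counting every student once.

    Students in the alternate assessment are pooled with students in the
    general assessment, so neither group counts more or less than the other.
    """
    combined = list(general_levels) + list(alternate_levels)
    if not combined:
        raise ValueError("no student results to aggregate")
    counts = Counter(combined)
    total = len(combined)
    return {level: counts[level] / total for level in LEVELS}


# Illustrative data only: four general-assessment results, one alternate.
general = ["novice", "proficient", "proficient", "apprentice"]
alternate = ["proficient"]
shares = level_distribution(general, alternate)
print(shares)
```

The sketch deliberately leaves out the contested question raised in the group discussion, namely how to treat students who generate no achievement level at all.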
E4. Reporting issues:
|There is concern about getting reports to parents that give them better information after nonstandard administration. State reporting of the results of the system needs to be done carefully until better research on accommodations effects is available.
|It is harder to find the alternate assessment kids in the state report, have to look in the fine print, but they are there. In the summary reports, the students are aggregated. If there are small numbers, less than 10, then we do not aggregate. Scores are tracked to school child would attend if not disabled. Academic expectations are for all students, including those at risk of failure.
|All scores for students using accommodations (standard) are included in the report, and accommodations are recorded on the test document. If scores are at lowest end, or unscoreable (0), that gives us valuable information about who is making progress within the system.
|The technical reality of a systems accountability piece means that "0" scores are meaningful in the reporting system. There is concern about confidentiality with very small numbers. If more than 10 students, we will report, for both alternate and general.
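The small-numbers rule mentioned in the reporting comments above can be sketched in a few lines of Python. This is a sketch under stated assumptions: the threshold of 10 comes from the comments, but the treatment of a group of exactly 10 students, the function name `reportable_summary`, and the use of a mean as the summary statistic are all illustrative choices, not a description of any state's reporting system.

```python
# Threshold taken from the state comments; whether a group of exactly 10
# is reported is an assumption (here it is reported).
MIN_GROUP_SIZE = 10


def reportable_summary(scores, min_group_size=MIN_GROUP_SIZE):
    """Return a group summary, or None when the group is too small to report.

    Suppressing small groups protects confidentiality: with very few
    students, a published result could identify an individual child.
    """
    n = len(scores)
    if n < min_group_size:
        return None
    return {"n": n, "mean": sum(scores) / n}


# Illustrative data only.
print(reportable_summary([3, 2, 4]))   # too few students: suppressed
print(reportable_summary([2] * 12))    # large enough: summary returned
```

Suppression applies only to public reporting; the students in a suppressed group are still assessed.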
(available on-line at http://www.cehd.umn.edu/NCEO)
Alternate Assessment for Students with Disabilities. NCEO Policy Directions 5, 1996.
A Comparison of State Assessment Systems in Maryland and Kentucky. NCEO State Assessment Series, Maryland/Kentucky Report 1, 1996.
Issues and Considerations in Alternate Assessments. NCEO Synthesis Report 27, 1997.
NCEO Framework for Educational Accountability. NCEO, 1998.
Putting Alternate Assessments Into Practice: Possible Sources of Data. NCEO Synthesis Report 28, 1997.
Questions and Answers: Tough Questions About Accountability and Students with Disabilities. NCEO Synthesis Report 24, 1996.
Understanding Educational Assessment and Accountability. NCEO and PEER, 1997.
Alternate Assessment (videotape). National Association of State Directors of Special Education, 1997.
Assessment Desk Reference, 12th Installment. Northeast Regional Resource Center, March, 1997.
Educating One & All. National Academy Press, 1997.
The Full Measure: Report of the NASBE Study Group on Statewide Assessment Systems. National Association of State Boards of Education, October, 1997.
Testing Our Children: A Report Card on State Assessment Systems. National Center for Fair Open Testing (Fair Test), September, 1997.
Testing Students with Disabilities: Practical Strategies for Complying with District and State Requirements. Corwin Press, 1998.
Trends in State Student Assessment Programs. Council of Chief State School Officers, 1997.
Who Takes the Alternate Assessment? State Criteria. Mid-South Regional Resource Center, 1998.
MSRRC Publications and Resources
(available on-line at http://www.ihdi.uky.edu/msrrc)
|Large Scale Assessment
|Assessment and Standards
(links to other Web sites)
Alternate Assessment Pre-conference Session:
Teleconference with Dr. Thomas Hehir, Director
Office of Special Education Programs
June 11, 1999
Dr. Hehir began the session with some opening remarks and devoted the majority of the session to issues raised by participants. He discussed the importance of assessments, and emphasized how important it is to make sure that students with disabilities are part of the accountability systems in education. He further stressed the importance of public reporting of how well students with disabilities are doing so that we can adjust what we do educationally to help them achieve high levels of performance. He observed that this is not a simple area. Most students with disabilities have not been included in accountability systems until very recently, and there are tremendous issues, some technical and some philosophical. The Individuals with Disabilities Education Act of 1997 (IDEA 1997) statutory and regulatory requirements will likely play out differently in every state since the states use statewide assessments and local assessments for different purposes. The fundamental point, according to Dr. Hehir, was that "the law and regulations reflect an equity standard that kids with disabilities should be benefiting from the accountability systems in education." He continued, "those of us who administer these programs, need to make sure that we know how well kids are doing and that we adjust our programs accordingly to make sure kids have the skills and knowledge that they need to be successful when they leave school."
At the conclusion of the opening remarks, Dr. Hehir invited participant inquiry and discussion. Questions and requests for clarification were provided by Jean Taylor (Idaho), Mark Fetler (California), Katie Dunlap (Oklahoma), Joe Eric (Oregon), Peggy Dutcher and Kathy Bradford (Michigan), Liz Healey and Pat Ozrello (Pennsylvania), John Haigh (Maryland), and Jim Ysseldyke (National Center on Educational Outcomes). Several of the participants raised similar issues that are summarized by topic as follows:
This summary of the six topics includes major points from the discussion of issues. It also includes IDEA 1997 statutory and regulatory information that was not presented during Dr. Hehir's discussion, but is provided here for clarification and reference. An electronic copy of the IDEA 1997 final regulations can be accessed through the Department's Web site at: http://www.ed.gov/.
The Federal statute and accompanying Part B regulations do not prescribe that an alternate assessment must be a "test." Dr. Hehir indicated that, in some instances, the concept of a (traditional paper-pencil) "test" might not be appropriate. It may not provide the child an opportunity to demonstrate his or her progress in the areas being assessed. Dr. JoLeta Reynolds explained further that "if you have a state or district level general assessment that you are using for non-disabled children to measure a particular area, you would need an alternate to measure that same area for children with disabilities who cannot participate in the general assessment program."
IDEA 1997 requires that the State or local educational agency (SEAs and LEAs) must develop and, beginning not later than July 1, 2000, conduct alternate assessments for those children who cannot participate in the State and district-wide assessment programs (34 CFR §300.138). Although SEAs and LEAs are not required by §300.138 to conduct alternate assessments until July 1, 2000, each SEA and LEA is required to ensure, beginning July 1, 1998, that, if a child will not participate in the general assessment, his or her IEP documents show the child will be assessed (Federal Register, Vol. 64, p. 12565).
The Analysis of Comments and Changes that accompanies the IDEA 1997 regulations states that, "alternate assessments need to be aligned with the general curriculum standards for all students and should not be assumed appropriate only for those students with significant cognitive impairments" (Federal Register, Vol. 64, p. 12564).
Dr. Hehir recognized that States have an interest in knowing: Are students with disabilities exiting school with the skills that they need to be successful in life? He urged States/districts to look broadly at alternate assessment content with an emphasis on meaningful assessments leading to meaningful results. Broadly speaking, he continued, "children with disabilities would need to be assessed in the same domains that are assessed for children without disabilities." Children with disabilities also need to be provided opportunities in school to learn what other children are learning and they need to participate in assessments. Ultimately, their performance on assessments needs to feed back into the State's accountability for special education as it relates to performance goals under IDEA 1997.
IDEA requires that IEP teams have the responsibility and the authority to determine what, if any, individual modifications in the administration of State or district-wide assessments are needed in order for a particular child with a disability to participate in the general assessment. As part of each State's general responsibility under 34 CFR §300.600, it must ensure the appropriate use of modifications in the administration of State and district-wide assessments. (Federal Register, Vol.64, p.12565)
In discussing the IEP team decision-making role, Dr. Hehir presented a research-based caution that inappropriate decisions may result in "over accommodation" and "over modification." A related concern was raised about how IEP teams use State lists of accommodations and modifications. Some participants saw potential problems with prescribed State lists. Dr. Hehir indicated that there is support for State policies and guidance on accommodations and modifications. However, problems may arise with State policies that only allow prescribed accommodations and modifications without providing a vehicle by which the child can still be assessed with the general assessment, especially when the prescribed accommodations and modifications are not appropriate for that child. He explained that "being too prescriptive" could potentially deny some children access "in a way that would be inconsistent with the law."
While recognizing concerns raised about "out-of-level" testing and children "who fall between the cracks," Dr. Hehir emphasized the importance of developing general assessments "that are broadly acceptable" for administration to children with and without disabilities. He cautioned further that development of assessments for children with disabilities other than the general or alternate assessment and the use of "out-of-level" testing might lead States in a direction opposite the intent of IDEA 1997. Conceivably, such actions could inappropriately limit children's access to the general curriculum, as well as prevent them from being part of the State or district assessment program. Dr. Hehir referred to the House and Senate Committee Reports that accompanied IDEA 1997 and told participants that: "What the law envisions is that the great majority of children with disabilities are within the general assessment programs with appropriate accommodations and modifications in administration if necessary." It is envisioned that there will be relatively few children for whom the general assessment is clearly not appropriate. The number of children with disabilities whose IEP teams determine their appropriate participation in the alternate assessments or the general assessment, with or without accommodations and modification, will vary from State to State and from test to test. Both Dr. Hehir and Dr. Reynolds reminded participants that the Congress and the Department recognized this variability by leaving some flexibility in the Statute and the Federal regulations.
IDEA requires that students with disabilities participate in State or district-wide assessment programs or, if the child's IEP team determines this is not appropriate, participate in an alternate assessment. Under Part B, the determination of what level of an assessment is appropriate for a particular child is to be made by the IEP team (34 CFR §300.138). The Analysis of Comments and Changes that accompanies the IDEA 1997 regulations states that, "out-of-level testing will be considered a modified administration of a test rather than an alternative test and as such should be reported as performance at the grade level at which the child is placed unless such reporting would be statistically inappropriate" (Federal Register, Vol.64, p.12565).
One of the issues raised for discussion was the notion of "statistical soundness," as used in IDEA 1997. Dr. Hehir told participants that the statistical soundness criteria primarily refer to assessments that use sampling techniques, as opposed to assessing the whole population. He explained that the statistical soundness provision is included in IDEA 1997 to reflect the ability, or the lack of ability, to make inferences from samples of populations.
Participants stated that among other complex issues facing States and districts was the issue of how to ensure the reporting of valid test results for children with disabilities who participate in State and district-wide assessment programs with accommodations and modifications to the assessment administration. Dr. Hehir indicated that States and districts would need to address the issue of validity in relationship to any assessment used as it pertains to this issue. The Department is supporting research efforts designed to address these issues that may be still evolving. Dr. Hehir restated the concept of States and districts making assessments "broadly acceptable" to students with disabilities in a manner that reduces validity issues and provides useful information. He also underscored the importance of remembering two major purposes of an individualized educational program - providing access to the general curriculum and addressing factors related to the child's disability in order to ensure the provision of a free and appropriate public education.
Another inquiry was whether States need to report both State and district-wide results. The criteria used for public reporting of general assessment and alternate assessment data for children with disabilities are the same as the State's criteria for public reporting about children without disabilities. The only exception is a situation wherein data would be personally identifiable for an individual child. For example, a small LEA has one child in grade 3 with a disability who is assessed using the State's general assessment and one child in grade 8 who is assessed using the State's alternate assessment. The children must be assessed, but it would not be permissible to report data for either child to the public in a way that would allow the child to be identified.
Dr. Hehir recognized that States/districts use assessment results for different purposes. Concerns involve whether State/district guidelines include rigid policies on accommodations and modifications that have a negative effect, for example, on promotion from grade to grade or graduation from high school. Dr. Hehir told participants that the Americans with Disabilities Act and/or Section 504 of the Rehabilitation Act of 1973, as amended, may provide children with disabilities Federal protections in this area, rather than IDEA 1997, per se. As he reiterated, "there are some issues that relate to how tests are used and whether they are used to deny benefits inappropriately for a person with a disability." These may be described as big issues that require continued research and study. The Department has invested discretionary funds, such as funding to the National Center on Educational Outcomes and others, for providing assistance to States regarding appropriate accommodations, modifications, and other large scale assessment issues. Dr. Hehir acknowledged the complexity of these issues and indicated that guidance to the field would be needed. He also stated that the Office of Special Education Programs would continue its collaborative work with other Offices in the Department in a way that is beneficial to children with disabilities, States, and districts.
Forum on Alternate Assessment and "Gray Area" Students
A Pre-session to the CCSSO Large Scale Assessment Conference
Agenda for Friday June 11, 1999
7:30 Registration, Coffee, Showcase set-ups and Initial Conversations
8:30 Introduction: Ken Olsen, MSRRC
9:30 Break to plan state team sessions and/or identify "Demographics-Alike" groups
9:45 Forum #1 (Plenary session): Rationale for Inclusive Large Scale Assessment
Host: John Olson, CCSSO; Facilitators: Carol Massanari, MPRRC
1. Massachusetts - Dan Wiener
2. Wyoming - Alan Sheinker
3. Oregon - Pat Almond
11:15 Connection and Conversation Break
11:45 Federal Requirements for Inclusive Large Scale Assessment
Luncheon and Plenary Session:
Drs. Thomas Hehir and JoLeta Reynolds, OSERS/OSEP
Drs. David Malouf and Gerrie Hawkins, OSERS/OSEP, Moderators
1:15 Break to move to Forum #2
1:30 Forum #2: Who gets what test?
|Strand A Moderator:
Pat Burgess, MSRRC
1. Kansas - Lynnett Wright, Sid Cooley
2. Tennessee - Ann Sanders
|Strand B Moderator:
Michele Rovins, FRC
1. Missouri - Melodie Friedebach
2. California - Mark Fetler
3:00 Connection and Conversation Break
3:30 Forum #3 (Plenary): How to collect and score alternate assessment information
Facilitator: Cesar D'Agord, GLARRC
1. Indiana - Deb Bennett
2. Iowa - David Tilly
3. North Carolina - David Mills
5:00 Connection and Conversation by State Teams and Demographics-Alike Groups - Eileen Ahearn, NASDSE
5:30 Adjourn Day One
Agenda for Saturday June 12, 1999
7:30 Coffee, Connections and Conversations
8:00 Forum #4: What content and performance standards apply?
|Strand A Facilitator:
Clay Starlin, WWRC
1. Delaware - Mary Ann Mieczkowski
2. Michigan - Peggy Dutcher
|Strand B Facilitator:
Marian Parker, SERRC
1. Colorado - Sue Bechard
2. Florida - Carol Allman
9:30 Final Connection and Conversation Break
10:00 Forum #5 (Plenary Session): Reporting and Use
Host: Jim Ysseldyke, NCEO; Facilitator: Dee Spinkston, NERRC
1. Minnesota - Mike Trepanier
2. Scott Trimble and Sarah Kennedy
11:15 State planning and Pre-session evaluation - Eileen Ahearn, NASDSE
11:55 Afternoon Session Overview - NCEO
12:00 Adjourn (An NCEO "Clinic" and update session will occur from 1 to 5 PM)
Assessment Terminology Test
Background: The terms below and the attached definitions are extensions of work by the National Center on Educational Outcomes, the State Collaborative on Assessing Special Education Students, IDEA and the Joint Committee on Standards for Educational Evaluation.
Purpose & Use: This is a test of your table's ability to collaborate and agree upon definitions of the terms below. However, the scores are irrelevant. The intent is to get us all thinking about what words we use.
Procedures: Match the term with the definition or add a definition.
____ 1. Accommodations
____ 2. Accountability
____ 3. Adaptations
____ 4. Alternate Assessment
____ 5. Assessment
____ 6. Content Standards
____ 7. Evaluation
____ 8. Evaluation (Program)
____ 9. Exclusion from Testing
____ 10. Exemption from Testing
____ 11. Extended Response
____ 12. Gray Area Students
____ 13. High Stakes Testing
____ 14. IEP Committee
____ 15. Individualized Education Plan (IEP)
____ 16. Modifications
____ 17. Opportunity to Learn Standards
____ 18. Out-of-Level Testing
____ 19. Participation Rate
____ 20. Performance Assessment
____ 21. Performance Standards
____ 22. Portfolio Assessment
____ 23. Reliability
____ 24. Testing
____ 25. Validity
|A substitute way of gathering information on the performance and progress of students who do not participate in the assessments used with the majority of students who attend schools. An alternate to the typical state test, generally reserved for students who are not working toward the state standards and who are not seeking a typical diploma.
|The group which meets to discuss a student's areas of strength and need, and develops an individualized plan for the student's educational program.
|A systematic method to assure to those inside and outside of the educational system that schools are moving in desired directions: commonly included elements are goals, indicators of progress toward meeting those goals, analysis of data, reporting procedures, and consequences or sanctions. Consequences or sanctions might include additional/fewer resources, removal of accreditation, provision of professional development training, etc.
|A task that requires a student to create an answer or a product rather than simply fill in a blank or select a correct answer from a list; the task performed by the student is intended to simulate real life situations.
|A collection of products that provide the basis for judging student accomplishment; in school settings, portfolios typically contain extended projects and may also contain drafts, teacher comments and evaluations, assessment results, and self-evaluations. The products typically depict the range of skills the student has or the improvement in a student's skill over time.
|The process of collecting data for the purpose of making decisions about individuals, groups, or systems. (Salvia & Ysseldyke, 1995).
|The systematic investigation of the worth or merit of educational and training programs, projects or materials. (Joint Committee Standards, 1994).
|Exemption from Testing: The act of releasing someone from a testing requirement to which others are held.
|Assessment that has significant consequences for an individual or school system, e.g., a high school graduation test that is used to determine whether a student receives a diploma is a high stakes test for the student, whereas a test that determines whether a school receives a financial reward (or is accredited) is a high stakes test for the school. (Texas Education Agency, 1995)
|Administration of a test at a level above or below one generally recommended for a student based on his/her grade level/age. Done to enable students who are either much above or below the average of students their age to demonstrate the entire range of skills they have.
|The administration of a particular set of questions to an individual or group of individuals for the purpose of obtaining a score.
|Benchmarks for how good a student's skills must be in areas aligned with content standards. Typically, performance standards are indices of level of performance.
|Requirements for educational inputs and processes designed to ensure that all students are given the opportunity to achieve the knowledge and skills contained in national, state, district and/or school content and performance standards.
|Statements of the subject-specific knowledge and skills that schools are expected to teach and students are expected to learn. They indicate what students should know and be able to do.
|Alteration in how a test is presented to the test taker or how the test taker responds; includes a variety of alterations in presentation format, response format, setting in which the test is taken, timing or scheduling. The alterations do not substantially change level, content or performance criteria. The changes are made in order to level the playing field (i.e., to provide equal opportunity to demonstrate what is known).
|Changes made in assessment practices to allow students to participate in the assessment. Adaptations include (1) accommodations and (2) modifications.
|A document which reflects the decisions made by the IEP committee during an IEP meeting. Included in this document is a description of the student's performance level and the corresponding goals and objectives to address the areas of need.
|Number of students with disabilities taking a test divided by the number of students with disabilities at the grade level or corresponding age level (for ungraded students) covered by the assessment.
|In measurement, the extent to which it is possible to generalize from an observation of a specific behavior observed at a specific time by a specific person to observations conducted on similar behavior, at different times, or by different observers. (Salvia & Ysseldyke, 1995)
|The act of barring someone from participation in an assessment program.
|Procedures to determine whether a child has a disability and the nature and extent of the special education and related services that the child needs [IDEA Regulations 300.500(b)(2)].
|Substantial changes in what a student is expected to learn and/or demonstrate. The changes include changes in instructional level, content, and performance criteria, as well as changes in test form or format.
|The extent to which a test measures what its authors or users claim it measures. Specifically, test validity concerns the appropriateness of the inferences that can be made on the basis of test results. (Salvia & Ysseldyke, 1995)
|A multi-purpose term used variously to mean: (a) students who cannot validly take the regular assessment, but do not qualify for the alternate, (b) slow learner, (c) poor students, (d) students reading far below grade level, (e) other students who are hard to test.
|A test item that requires the student to provide a narrative or multiple-step response to a stimulus question.
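The participation rate definition in the list above is a simple ratio, and a short Python sketch may make the numerator and denominator concrete. The function name `participation_rate`, its argument names, and the sample counts are hypothetical and chosen only for illustration.

```python
def participation_rate(tested, enrolled):
    """Return the share of students with disabilities who took the test.

    tested:   number of students with disabilities taking the test
    enrolled: number of students with disabilities at the grade level
              (or corresponding age level, for ungraded students)
              covered by the assessment
    """
    if enrolled <= 0:
        raise ValueError("enrolled must be a positive count")
    if tested < 0 or tested > enrolled:
        raise ValueError("tested must be between 0 and enrolled")
    return tested / enrolled


# Illustrative counts only: 49 of 50 eligible students tested.
rate = participation_rate(49, 50)
print(f"{rate:.0%}")
```

The denominator is the count of students covered by the assessment, not the count of students present on test day, so exempted or absent students lower the rate.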
Inclusive assessment pre-session participant survey
Snowbird, June, 1999
|WHO IS HERE? JOB TITLES
|State Dir/Asst dir
|State commissioner or supt
|State spec ed coord
|School Board Member
|(NB - State assessment directors' meeting is occurring)
|Here last year?
|Staying for NCEO clinic?
|Staying for large scale?