Published by the National Center on Educational Outcomes
Number 17 / November 2003
Prepared by Rachel Quenemoen and Scott Marion
Any or all portions of this document may be reproduced and distributed without prior permission, provided the source is cited as:
Quenemoen, R., & Marion, S. (2003). Rethinking basic assumptions of test development: Assessment frameworks for inclusive accountability tests (Policy Directions No. 17). Minneapolis, MN: University of Minnesota, National Center on Educational Outcomes. Retrieved [today's date], from the World Wide Web: http://cehd.umn.edu/NCEO/OnlinePubs/Policy17.htm
Under the No Child Left Behind Act of 2001 (NCLB), states are required to assess the achievement of all students on state-defined grade-level content standards. They are to use the results to hold schools, and the state, accountable for all students achieving state-defined proficiency on grade-level content by the year 2014. In the meantime, schools and the state are to show "adequate yearly progress" (AYP) toward that goal. Results from each year’s assessment of student achievement can be used to redesign instructional programs so that more students are successful each year.
If educators were asked today whether all students are achieving at the proficiency level expected by 2014, virtually no educator in the U.S. could say yes. Some would say that the goal is unrealistic; others would say that fundamental changes in instructional approaches, structures, and budgets have to occur to make the goal achievable.
Some educators see a need to improve assessments so that they inform instruction on grade-level content. These educators are calling for assessments based on a limited number of clearly defined and important constructs that reflect what all students should know and be able to do at grade level, and that are constructed to yield unambiguous evidence of whether groups or classrooms of students have mastered those constructs at grade level. They are rethinking common assumptions about the entire test development process, from the definition of the constructs to be tested, through development, to implementation.
This rethinking requires a transparent discussion of how to put into practice the new assumptions about teaching and learning that have come out of standards-based reform, so that assessments can be designed to meet the accountability requirements of NCLB yet give better information about the performance of students within the prescribed content. Compatible classroom assessments can provide additional diagnostic information for all students. Clarifying and articulating content standards (that is, precisely defining the constructs) is the foundation of good assessments, because it maximizes the measurement of achievement rather than the effects of disability, cultural background, or socioeconomic status.
As an example, the Commission on Instructionally Supportive Assessment (2002) identified nine requirements for assessment development (see Table 1). The Commission believed that these requirements, if implemented, would create responsible assessments for the improvement of students’ learning.
Table 1. Instructionally Supportive Assessments
Purpose, Procedure, and Challenges
Purpose: To rethink test development procedures, starting from the alignment of content standards, instruction, constructs being assessed, and assessment. Instructionally supportive assessments are based on the belief that required accountability tests must be useful to educators concerned about improving the instruction of children.
Procedure: Starting from a very comprehensive approach, the Commission on Instructionally Supportive Assessment (2002) identified nine requirements for assessment development:
Specific reasons to support each of these requirements are provided. The requirements can be met either by drawing on the capabilities of state agencies or by issuing competitive requests for proposals to firms or individuals capable of carrying out one or more of the required activities.
Challenges: Instructionally supportive assessment involves rethinking the entire system, including the content standards, aligned curriculum, and rich and varied instruction, to ensure that educators know what to teach and that children are taught. It is easier to focus on narrow aspects of assessments.
Dr. Jim Popham, chair of the Commission on Instructionally Supportive Assessment, summarized what he considers to be the implications of the assessment requirements of the No Child Left Behind Act for children with disabilities. This summary is presented in Table 2.
Table 2. Inclusive Assessment Requirements and Instructionally Supportive Assessments
Implications Identified by Jim Popham
(See NCEO teleconference materials at http://cehd.umn.edu/NCEO/Presentations/tele5.htm)
There are opportunities created when "extremely significant" content standards are identified, as explained in point 6 in Table 2. For example, universally designed assessments, which are assessments designed from the beginning to be accessible and valid for the widest range of students, are typically portrayed as one aspect of assessments that are designed to support instruction and accountability for all students. The identification of a small number of "extremely significant" content standards to be the focus of large-scale assessments in itself can be viewed as an aspect of a universally designed assessment. Benefits of merging these ideas for students with disabilities are shown in Table 3.
Table 3. Benefits of Instructionally Supportive Assessments for Students with Disabilities
Benefits
As state policymakers grapple with the challenge of truly "rethinking" entrenched approaches to content standards and assessment, there are practical and political issues to consider. These include the "yeah, but…" scenarios heard by assessment directors in discussions among policymakers and practitioners across the country. Here are a few of the scenarios and responses related to the implications for students with disabilities of the NCLB assessment requirements.
Yeah, but how can we expect essentially all students, who may be instructed at very different levels, to participate in the same tests?
Response:
Federal laws require not only assessments but also identical content expectations for all students (other than the small exception proposed for students with significant cognitive disabilities). NCLB specifically requires that all children be assessed on grade-level content and achievement standards. There are multiple solutions to this "yeah, but…" scenario, and they include revisiting how we design instruction and curriculum to ensure that all students learn the challenging content from the earliest grade levels.
Clarification and articulation of extremely significant content standards is essential to ensuring a coherent, aligned, and focused teaching and learning system that is accessible to all learners. Even with shifts in how we ensure all children learn, we still have to address the adequacy of current assessment approaches. With the opportunities of instructionally supportive and universally designed assessments aligned to extremely significant constructs, many more students can meet the assessment targets.
Yeah, but if we hunker down, this too shall pass.
Response:
The basic assumptions of standards-based reform that underpin the Federal laws – all students, meeting high standards, and accountability on the part of schools and states – have been in place for over a decade. These assumptions have broad bipartisan support, and have shown remarkable political and popular durability. Also, these assumptions are consistent with the predominant agenda of public educators at local and state levels, that is, of meaningful education with high expectations for all students. Looking for ways that we can use the law to further our agenda of improving results for all students, and improving the quality of our educational assessments to measure the results of instruction more accurately, is consistent with that agenda.
Yeah, but how can we change our content standards? Our content experts worry that "if it is not tested, it will not get taught." And we don’t want to inadvertently narrow or "dumb down" the curriculum.
Response:
This is perhaps the biggest challenge. In many states, content experts and teachers worked over a period of years to negotiate the academic content standards. In that process, they tended to include more content, in part to avoid conflict and to satisfy multiple constituencies in an inherently politicized system. In many states, this may have resulted in a vast number of content standards or benchmarks, too many to assess in the time available for state-level testing and, perhaps, too many to teach and learn in the time available for instruction.
The pressures created by NCLB provide an opportunity to revisit this approach. Content experts and teachers will agree that schools and teachers should be held accountable for reasonable learning targets, and that assessments designed to measure the targets should be reasonable and fair as well. As long as the state-defined content cannot be assessed in total on an assessment, and states have not defined and communicated the content to be assessed, assessment becomes a punishing variation of Russian roulette, with teachers and schools guessing at what will be assessed. No content experts or teachers support that approach, yet it may be the current reality.
One of the foundations for instructionally supportive accountability tests is that they focus on a modest number of assessment targets. If the accountability purpose of the test is to encourage teaching of important content, then those targets must be truly significant and communicated clearly to students, teachers, and the general citizenry.
The entire system of state content standards does not have to be redesigned. Instead, states can work to define an assessment framework that functions as a companion document to already approved content standards. This requires alignment justifications demonstrating how assessed skills and bodies of knowledge are derived from the state’s existing content standards. The identified content clusters and important constructs that are assessed will incorporate many of the smaller grain size content standards or benchmarks.
There is no magic way to do this, but there are some general principles that should be followed, based on an example used in one state.
The steps that state took can be summarized more generically as:
1. Raise awareness and develop consensus among various groups of the need to prioritize or rework content standards.
2. For each content area and grade, form groups to identify:
   - non-essential knowledge and skills;
   - unnecessary duplications of content across or within grade levels;
   - content that cannot be assessed through large-scale methods; and
   - content that can be logically clustered into larger constructs incorporating small grain content.
3. Share the resulting revisions across content areas and grades as a check on progression and importance.
4. Have people who are very familiar with the content area and people less familiar review the resulting content and progression to ensure that the resulting descriptions are considered important by people other than the writing team, and by those who are not content experts.
5. Develop an assessment framework based on the resulting content descriptions.
This process does not require new content standards, and specifically avoids identifying "easier" content standards. Instead, it requires identifying high import, challenging content clusters that incorporate essential bits of content and knowledge across the discipline and across grades. It should result in a limited number of challenging and important constructs that will serve as the framework for the accountability test.
The process could also be modified by using a version of Norman Webb’s Balance of Representation protocol. This is another tool that would push people to prioritize the required knowledge and skills for their content areas (see N.L. Webb’s Alignment of Science and Mathematics Standards... and Criteria for Alignment of Expectations... in the Resources).
Yeah, but the NCLB accountability requirements are so challenging that many of our schools will fail to meet AYP requirements within the first few years. This seems like an undoable challenge!
Response:
We need to use the best research in teaching, learning, and assessment to help states design assessment systems that can promote student learning. As we organize fewer, high import, challenging content clusters to assess, we also provide a clearer "path" to the essential skills and knowledge. This will allow all teachers and all IEP teams, for the first time in many states, to understand how all the hundreds of "tiny grain size" standards work together to result in durable and important skills and concepts along the grade levels. Although NCLB accountability provisions may not allow for a great deal of innovation and tailoring, working to make the assessment instructionally sensitive and universally designed will make a big difference in ensuring that all schools, districts, and states are successful. Most important of all, it will make a difference in ensuring that all students are successful.
Resources
Alignment of Science and Mathematics Standards and Assessments in Four States (NISE Monograph No. 18). Webb, N.L. (1999). Published jointly by the National Institute for Science Education and the Council of Chief State School Officers.
Criteria for Alignment of Expectations and Assessments in Mathematics and Science Education (Monograph No. 6). Webb, N.L. Madison: Wisconsin Center for Education Research, Council of Chief State School Officers and National Institute for Science Education.
Building Tests to Support Instruction and Accountability: A Guide for Policymakers. Commission on Instructionally Supportive Assessment. (2001). AASA, NAESP, NASSP, NEA, NMSA. Available from the American Association of School Administrators Web site at http://aasa.org/issues_and_insights/assessment/Building_Tests.pdf. Also see companion piece on Illustrative Language for an RFP to Build Tests to Support Instruction and Accountability, at http://aasa.org/issues_and_insights/assessment/Illustrative_Language_for_an_RFP.pdf.
Crafting Curricular Aims for Instructionally Supportive Assessment, with Examples of Appropriate Curricular Aims. Popham, W. J., Farr, R., & Lindquist, M. (2003). Available from the National Center on Educational Outcomes at http://cehd.umn.edu/NCEO/Presentations/CraftingCurricula.pdf.
Materials for the January 27, 2003 NCEO teleconference: Part Two of "Building Tests to Support Instruction and Accountability for All Students." Popham, W. J., Thurlow, M.L., & Marion, S., Presenters. (2003). Available from the National Center on Educational Outcomes at http://cehd.umn.edu/NCEO/Presentations/tele5.htm.
Universal Design Applied to Large-Scale Assessments (Synthesis Report 44). Thompson, S.J., Johnstone, C. J., & Thurlow, M. L. (2002). Minneapolis, MN: University of Minnesota, available from the National Center on Educational Outcomes at http://cehd.umn.edu/NCEO/OnlinePubs/Synthesis44.html.
Universally Designed Assessments: Better Tests for Everyone! (Policy Directions 14). Thompson, S., & Thurlow, M. (2002). Minneapolis, MN: University of Minnesota, available from the National Center on Educational Outcomes at http://cehd.umn.edu/NCEO/OnlinePubs/Policy14.htm.