Focus Group Follow-up Survey on

Psychometric Impact of Out-of-level Testing

This survey asks for your expert opinion about how out-of-level testing affects test score reliability and validity. An out-of-level test is defined here as any test taken by a student other than the test taken by the majority of students in her/his grade. A context is provided for each question because the conditions under which out-of-level testing occurs may influence your opinions.

Please note: you may change your survey responses at any time before pressing the submit button. However, once you have submitted the survey, all responses will be final and additional changes will not be possible. If you have any questions, or if you run into any problems filling out this survey, please contact John Bielinski at bieli001@umn.edu.

Thank you for taking the time to complete this survey.

Test Score Reliability

From item response theory, it can be shown that measurement precision increases as the match between test difficulty and person ability increases. One benefit of out-of-level testing is that test score reliability may be increased for those students who would otherwise earn a very low or a very high score on the in-level (grade level) test. However, when the performance of the students taking an out-of-level test is to be reported on the scale of the in-level test, some form of equating of scales or linking of item parameter estimates is required. Equating may add measurement error to the scaled score. Questions 1-3 ask your opinion on the degree to which the potential gain in measurement precision from out-of-level testing is offset by the measurement error introduced in the equating process.

Scenario I

A state wants to assess math proficiency for all its 8th graders. An off-the-shelf norm-referenced test was chosen that was standardized on an 8th grade sample and was reasonably well aligned to that state’s mathematics standards. The 8th grade test is part of a multi-level testing system in which test scores from any level of the test can be placed onto a common scale. The test publisher conducted vertical scaling/equating studies for this purpose.

Please rate the extent to which the measurement error added in the vertical equating process offsets the gain in measurement precision obtained by giving a student an out-of-level test under each condition.

A.) Assume that a brief locator test was used to assign each student to a test level….

The student was assigned to the test	None	Too little to warrant concern	Some	A lot
· One level below	A	A	A	A
· Two levels below	A	A	A	A
· More than two levels below	A	A	A	A

B.) Assume that the classroom teacher made the decision as to which level each student should take….

The student was assigned to the test	None	Too little to warrant concern	Some	A lot
· One level below	A	A	A	A
· Two levels below	A	A	A	A
· More than two levels below	A	A	A	A

Briefly explain your choices

Scenario II

A state wants to assess math proficiency for all its 8th graders. The state has developed a test specifically designed to measure that state’s math standards. The state has also developed a math test to assess the math standards for its 5th graders, and one to assess the math standards for its 3rd graders. The state conducted an equating study so that a score from either the 3rd grade or the 5th grade test could be translated into a score on the 8th grade test. A random sample of 500 8th graders took both the 8th grade and the 5th grade test, and another random sample of 500 8th graders took both the 8th grade test and the 3rd grade test. Equipercentile equating was used to translate scores from the 3rd grade and the 5th grade test to the scale of the 8th grade test. A classroom teacher familiar with the student made the decision as to which test to give that student. All scores, regardless of the level of the test a student took, were reported on the scale of the 8th grade test.

An 8th grader was assigned to the	None	Too little to warrant concern	Some	A lot
· 5th grade test	A	A	A	A
· 3rd grade test	A	A	A	A
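For readers less familiar with the method named in Scenario II, the following is a minimal, unsmoothed sketch of equipercentile equating for a single-group design: a score on the lower-level test is mapped to the 8th grade test score that has the same percentile rank. The function names and the tiny score lists are hypothetical and chosen only for illustration; operational equating would normally add presmoothing and use far larger samples, and this is not a description of any state's actual procedure.

```python
def percentile_rank(score, scores):
    """Percent of scores below `score`, plus half of those equal to it."""
    below = sum(1 for s in scores if s < score)
    equal = sum(1 for s in scores if s == score)
    return 100.0 * (below + 0.5 * equal) / len(scores)

def score_at_percentile_rank(rank, scores):
    """Score whose percentile rank is `rank`, by linear interpolation
    between the distinct observed scores (clamped at the extremes)."""
    distinct = sorted(set(scores))
    ranks = [percentile_rank(s, scores) for s in distinct]
    if rank <= ranks[0]:
        return float(distinct[0])
    if rank >= ranks[-1]:
        return float(distinct[-1])
    for i in range(1, len(distinct)):
        if rank <= ranks[i]:
            fraction = (rank - ranks[i - 1]) / (ranks[i] - ranks[i - 1])
            return distinct[i - 1] + fraction * (distinct[i] - distinct[i - 1])

def equipercentile_equate(x, lower_level_scores, in_level_scores):
    """Map score x on the lower-level test to the in-level test scale."""
    rank = percentile_rank(x, lower_level_scores)
    return score_at_percentile_rank(rank, in_level_scores)

# Hypothetical scores from the same students on both tests (illustration only).
grade5_scores = [10, 20, 30, 40, 50]
grade8_scores = [5, 10, 15, 20, 25]

equated = equipercentile_equate(30, grade5_scores, grade8_scores)
print(f"5th grade test score 30 maps to 8th grade scale score {equated}")
```

Because the mapping is estimated from a sample, each equated score carries sampling error of its own, which is the added measurement error the ratings above refer to.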

Scenario III

Participants in the vertical equating studies conducted by the large test publishing companies are selected so that they best represent the demographic characteristics and ability levels of the population of students in a particular grade. However, the students who actually take an out-of-level test are likely to differ from their grade level peers in both ability and demographic characteristics. What effect, if any, do you think this dissimilarity between the participants in vertical equating studies and the students who actually take an out-of-level test has on test results?

No effect A

Introduces test score bias A

Introduces random error A

Other possible effect A

Test Score Validity

Out-of-level testing may increase measurement precision for low or high scoring students. However, there is concern that it may alter the validity of the test score. The questions below ask for your expert opinion about the degree to which an out-of-level test score alters the validity of the data. Because meaningful evaluation of validity requires information about how the scores are to be reported and used, we present each set of questions within a specific context. The contexts differ in the way test scores are reported and in how the results are to be used.

Scenario I

· An off-the-shelf norm-referenced test was used. The test was part of a multi-level testing system in which vertical equating was done so that scores from each test could be placed onto a common scale.

· No student was allowed to take a level of the test more than two levels below the test intended for her/his grade level.

· The percent of students within a grade taking an out-of-level test varied across schools.

· No distinction was made between students getting an out-of-level test and those getting the in-level test.

Select the box that best reflects your opinion as to the effect out-of-level testing (using the context above) has on the validity of the test results. For school level results, assume that scores from out-of-level tests are pooled with those from in-level tests to generate the summary statistic.

Report Format A: At the school level, the percent of students in a grade that met the state standard
	Effect on Validity
Scores will be used…	Dramatically Reduce	Somewhat Reduce	No Effect	Somewhat Enhance	Dramatically Enhance
for school-to-school comparisons.	A	A	A	A	A
to monitor adequate yearly progress. Each school must demonstrate gain in the percent of students meeting the state standard.	A	A	A	A	A

Report Format B: At the school level, the mean scaled score for each grade tested
	Effect on Validity
Scores will be used…	Dramatically Reduce	Somewhat Reduce	No Effect	Somewhat Enhance	Dramatically Enhance
for school-to-school comparisons.	A	A	A	A	A
to monitor adequate yearly progress. Each school must demonstrate a specified gain in their mean scaled score.	A	A	A	A	A

Report Format C: Individual student score
	Effect on Validity
Scores will be used…	Dramatically Reduce	Somewhat Reduce	No Effect	Somewhat Enhance	Dramatically Enhance
to determine eligibility for high school graduation. Each student must pass the test to graduate.	A	A	A	A	A
to guide classroom instruction.	A	A	A	A	A

Scenario II

· A state-developed test was used. The testing system included a 3rd grade, a 5th grade, and an 8th grade test. A score from a lower-level test (e.g., the 5th grade test) could be translated into a score on a higher-level test (e.g., the 8th grade test). The equating procedure described in Scenario II of the reliability section was used.

· An 8th grader could take either the 3rd grade, the 5th grade, or the 8th grade test. A teacher familiar with the student made the determination as to which test was most appropriate.

· The percent of students within a grade taking an out-of-level test varied across schools.

· No distinction was made between students getting an out-of-level test and those getting the in-level test.

· Consider only the 8th grade test results.

Report Format A: At the school level, the percent of students in a grade that met the state standard
	Effect on Validity
Scores will be used…	Dramatically Reduce	Somewhat Reduce	No Effect	Somewhat Enhance	Dramatically Enhance
for school-to-school comparisons.	A	A	A	A	A
to monitor adequate yearly progress. Each school must demonstrate gain in the percent of students meeting the state standard.	A	A	A	A	A

Report Format B: Individual student score
	Effect on Validity
Scores will be used…	Dramatically Reduce	Somewhat Reduce	No Effect	Somewhat Enhance	Dramatically Enhance
to determine eligibility for high school graduation. Each student must pass the test to graduate.	A	A	A	A	A
to guide classroom instruction.	A	A	A	A	A
