Focus Group Follow-up Survey on Psychometric Impact of Out-of-level Testing

 

This survey asks for your expert opinion about how out-of-level testing affects test score reliability and validity. An out-of-level test is defined here as any test taken by a student other than the test taken by the majority of students in her/his grade.  A context is provided for each question because the conditions under which out-of-level testing occurs may influence your opinions.

 

                                                                                                                                                                                   

 

Please note: you may change your survey responses at any time before pressing the submit button.  However, once you have submitted the survey, all responses will be final and additional changes will not be possible.  If you have any questions, or if you run into any problems filling out this survey, please contact John Bielinski at bieli001@umn.edu.

Thank you for taking the time to complete this survey.

 

                                                                                                                                                                                   

 

 

Test Score Reliability

 

From item response theory, it can be shown that measurement precision increases as the match between test difficulty and person ability increases.  One benefit of out-of-level testing is that test score reliability may be increased for those students who would otherwise earn a very low or a very high score on the in-level (grade level) test.  However, when the performance of the students taking an out-of-level test is to be reported on the scale of the in-level test, some form of equating of scales or linking of item parameter estimates is required.  Equating may add measurement error to the scaled score.  Questions 1-3 ask your opinion on the degree to which the potential gain in measurement precision from out-of-level testing is offset by the measurement error introduced in the equating process.   

 

Scenario I

A state wants to assess math proficiency for all its 8th graders.  An off-the-shelf norm-referenced test was chosen that was standardized on an 8th grade sample, and was reasonably well aligned to that state’s mathematics standards.  The 8th grade test is part of a multi-level testing system in which test scores from any level of the test can be placed onto a common scale.  The test publisher conducted vertical scaling/equating studies for this purpose.

 

Please rate the extent to which the measurement error added in the vertical equating process offsets the gain in measurement precision obtained by giving a student an out-of-level test under each condition.

 

A.)    Assume that a brief locator test was used to assign each student to a test level….

 

 

 

The student was assigned to the test | None | Too little to warrant concern | Some | A lot
One level below | [ ] | [ ] | [ ] | [ ]
Two levels below | [ ] | [ ] | [ ] | [ ]
More than two levels below | [ ] | [ ] | [ ] | [ ]

 

B.)    Assume that the classroom teacher made the decision as to which level each student should take….

 

 

 

The student was assigned to the test | None | Too little to warrant concern | Some | A lot
One level below | [ ] | [ ] | [ ] | [ ]
Two levels below | [ ] | [ ] | [ ] | [ ]
More than two levels below | [ ] | [ ] | [ ] | [ ]

 

 

Briefly explain your choices

 

 

 

 

 

 

 


Scenario II

A state wants to assess math proficiency for all its 8th graders.  The state has developed a test specifically designed to measure that state’s math standards.  The state has also developed a math test to assess the math standards for its 5th graders, and one to assess the math standards for its 3rd graders.  The state conducted an equating study so that a score from either the 3rd grade or the 5th grade test could be translated into a score on the 8th grade test.  A random sample of 500 8th graders took both the 8th grade test and the 5th grade test, and another random sample of 500 8th graders took both the 8th grade test and the 3rd grade test.  Equipercentile equating was used to translate scores from the 3rd grade and the 5th grade test to the scale of the 8th grade test.  A classroom teacher familiar with the student made the decision as to which test to give that student.  All scores, regardless of the level of the test a student took, were reported on the scale of the 8th grade test.
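For readers less familiar with the method, equipercentile equating can be sketched as follows. This is a simplified illustration, not the state's actual procedure: it uses mid-percentile ranks with linear interpolation and no smoothing, the function names are hypothetical, and the score lists are made-up values standing in for a sample that took both tests.

```python
from bisect import bisect_left, bisect_right

def percentile_rank(score, scores_sorted):
    """Mid-percentile rank (0 to 100) of `score` within a sorted score list."""
    n = len(scores_sorted)
    below = bisect_left(scores_sorted, score)
    at = bisect_right(scores_sorted, score) - below
    return 100.0 * (below + 0.5 * at) / n

def score_at_percentile(pr, scores_sorted):
    """Score at percentile rank `pr`, interpolating between sorted scores."""
    n = len(scores_sorted)
    pos = pr / 100.0 * n - 0.5          # inverse of the mid-percentile position
    pos = min(max(pos, 0.0), n - 1.0)   # clamp to the observed score range
    lo = int(pos)
    hi = min(lo + 1, n - 1)
    frac = pos - lo
    return scores_sorted[lo] + frac * (scores_sorted[hi] - scores_sorted[lo])

def equipercentile_equate(x, lower_scores, upper_scores):
    """Map score x on the lower-level test to the upper-level score
    that has the same percentile rank in the equating sample."""
    xs = sorted(lower_scores)
    ys = sorted(upper_scores)
    return score_at_percentile(percentile_rank(x, xs), ys)

# Hypothetical scores from one sample that took both tests.
grade5_scores = [10, 20, 30, 40, 50]
grade8_scores = [5, 10, 15, 20, 25]
equated = equipercentile_equate(30, grade5_scores, grade8_scores)
print(equated)
```

With a sample of 500 students, as in the scenario, the percentile ranks at the extremes of the score distribution rest on few observations, which is one source of the equating error the rating question below asks about.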

 

Please rate the extent to which the measurement error added in the vertical equating process offsets the gain in measurement precision obtained by giving a student an out-of-level test under each condition.

 

 

 

An 8th grader was assigned to the | None | Too little to warrant concern | Some | A lot
5th grade test | [ ] | [ ] | [ ] | [ ]
3rd grade test | [ ] | [ ] | [ ] | [ ]

 

Scenario III

 

Participants in the vertical equating studies conducted by the large test publishing companies are selected so that they best represent the demographic characteristics and ability levels of the population of students in a particular grade.  However, the students who actually take an out-of-level test are likely to differ from their grade level peers in both ability and demographic characteristics.  What effect, if any, do you think this dissimilarity between the participants in vertical equating studies and the students who actually take an out-of-level test has on test results?

 

[ ] No effect
[ ] Introduces test score bias
[ ] Introduces random error
[ ] Other possible effect

 

 

 

 

 

 

 

 


                                                                                                                                                                                          

 

 

Test Score Validity

 

Out-of-level testing may increase measurement precision for low or high scoring students.  However, there is concern that it may alter the validity of the test score.  Below we ask for your expert opinion about the degree to which an out-of-level test score alters the validity of the data.  Because meaningful evaluation of validity requires information about how the scores are to be reported and used, we present each set of questions within a specific context.  The contexts differ in the way test scores are reported and in how the results are to be used.

 

 

Scenario I

·        An off-the-shelf norm-referenced test was used.  The test was part of a multi-level testing system in which vertical equating was done so that scores from each test could be placed onto a common scale.

·        No student was allowed to take a level of the test more than two levels below the test intended for her/his grade level.

·        The percent of students within a grade taking an out-of-level test varied across schools.

·        No distinction was made between students getting an out-of-level test and those getting the in-level test.

 

 

Select the box that best reflects your opinion as to the effect out-of-level testing (using the context above) has on the validity of the test results.  For school level results, assume that scores from out-of-level tests are pooled with those from in-level tests to generate the summary statistic.

 

 

 

Report Format A: At the school level, the percent of students in a grade that met the state standard

 

Effect on Validity

Scores will be used… | Dramatically Reduce | Somewhat Reduce | No Effect | Somewhat Enhance | Dramatically Enhance
for school-to-school comparisons. | [ ] | [ ] | [ ] | [ ] | [ ]
to monitor adequate yearly progress.  Each school must demonstrate a gain in the percent of students meeting the state standard. | [ ] | [ ] | [ ] | [ ] | [ ]

 

 

 

Report Format B: At the school level, the mean scaled score for each grade tested

 

Effect on Validity

Scores will be used… | Dramatically Reduce | Somewhat Reduce | No Effect | Somewhat Enhance | Dramatically Enhance
for school-to-school comparisons. | [ ] | [ ] | [ ] | [ ] | [ ]
to monitor adequate yearly progress.  Each school must demonstrate a specified gain in its mean scaled score. | [ ] | [ ] | [ ] | [ ] | [ ]

 

 

 

Report Format C: Individual student score

 

Effect on Validity

Scores will be used… | Dramatically Reduce | Somewhat Reduce | No Effect | Somewhat Enhance | Dramatically Enhance
to determine eligibility for high school graduation.  Each student must pass the test to graduate. | [ ] | [ ] | [ ] | [ ] | [ ]
to guide classroom instruction. | [ ] | [ ] | [ ] | [ ] | [ ]

 

 

 

Scenario II

·        A state-developed test was used.  The testing system included a 3rd grade, a 5th grade, and an 8th grade test.  A score from a lower-level test (e.g. 5th grade test) could be translated into a score on a higher-level test (e.g. 8th grade test).  The equating procedure described in Scenario II of the reliability section was used.

·        An 8th grader could take either the 3rd grade, the 5th grade, or the 8th grade test.  A teacher familiar with the student made the determination as to which test was most appropriate. 

·        The percent of students within a grade taking an out-of-level test varied across schools.

·        No distinction was made between students getting an out-of-level test and those getting the in-level test.

·        Consider only the 8th grade test results.

 

 

Report Format A: At the school level, the percent of students in a grade that met the state standard

 

Effect on Validity

Scores will be used… | Dramatically Reduce | Somewhat Reduce | No Effect | Somewhat Enhance | Dramatically Enhance
for school-to-school comparisons. | [ ] | [ ] | [ ] | [ ] | [ ]
to monitor adequate yearly progress.  Each school must demonstrate a gain in the percent of students meeting the state standard. | [ ] | [ ] | [ ] | [ ] | [ ]

 

 

Report Format B: Individual student score

 

Effect on Validity

Scores will be used… | Dramatically Reduce | Somewhat Reduce | No Effect | Somewhat Enhance | Dramatically Enhance
to determine eligibility for high school graduation.  Each student must pass the test to graduate. | [ ] | [ ] | [ ] | [ ] | [ ]
to guide classroom instruction. | [ ] | [ ] | [ ] | [ ] | [ ]

 
