Addressing the Blue Book Problem: An IRT Mixture Model for Item Position Effects [Online]


High-stakes admissions testing is often carried out using items with pre-calibrated parameters. Though this approach often works well in practice, factors such as item order and time pressure can modify the testing context enough that pre-calibrated parameters are no longer valid, threatening the validity of score interpretations. One example occurred during the 2014 administration of ENEM, the Brazilian national college entrance exam. Although all students took the same items, students exposed to one particular ordering performed the worst in math, implying that item order affects student performance and that the color of the booklet a student is assigned could partially determine whether or not that student is able to attend college. Previous approaches that model position effects as variation in either item parameters or person abilities may make unreasonable homogeneity assumptions. To address this gap, we propose an item response model that treats position effects as both person-side and item-side by modeling heterogeneity in individual response processes over the course of the test. Here an individual’s encounter with an item is treated as a smoothly varying mixture of how the student would interact with the item if it were encountered early in the test and how the student would interact with it if it were encountered late in the test, weighted by actual item position. This directly models a difference in response processes for students who are “fresh” and “fatigued,” estimating ability net of individual endurance. This presentation will focus on the estimation and properties of this model derived from simulation and look at applications to the 2014 ENEM Math administration.
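
As a rough illustration of the mixture idea described above, the sketch below assumes a Rasch-type logistic response function, separate "early" and "late" abilities for each person, and a weight that grows linearly with item position. None of these specifics come from the abstract: the functional form, the linear weight, and the names `theta_early`, `theta_late`, `position_weight`, and `mixture_prob` are all hypothetical, and the model presented in the talk may differ.

```python
import math


def logistic(x: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def position_weight(position: int, n_items: int) -> float:
    """Assumed linear weight: 0 at the first position, 1 at the last.

    The abstract only says the mixture varies smoothly with position;
    the linear form is an illustrative assumption.
    """
    if n_items < 2:
        return 0.0
    return (position - 1) / (n_items - 1)


def mixture_prob(
    theta_early: float,
    theta_late: float,
    difficulty: float,
    position: int,
    n_items: int,
) -> float:
    """Probability of a correct response as a position-weighted mixture.

    p_early: response probability if the item were met while "fresh".
    p_late:  response probability if the item were met while "fatigued".
    Both use an assumed Rasch-type form, logistic(ability - difficulty).
    """
    w = position_weight(position, n_items)
    p_early = logistic(theta_early - difficulty)
    p_late = logistic(theta_late - difficulty)
    return (1.0 - w) * p_early + w * p_late
```

Under these assumptions, a student whose "late" ability is lower than their "early" ability has a lower probability of answering the same item correctly when it appears later in the booklet, which is the kind of ordering effect the abstract describes.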

Klint Kanopka is a Ph.D. candidate at the Stanford Graduate School of Education. His research interests combine psychometrics, machine learning, and network analysis, with a specific focus on the computational analysis of process data and text to gather validity evidence in computerized testing scenarios. Prior to his time at Stanford, he taught physics in the Philadelphia School District.

Tuesday, April 27, 2021 - 2:00pm
Online session
Presentation slides (PDF, 3 MB)