Biblio
NAEP Pilot Learning Progression Framework. Report to the National Assessment Governing Board. (2007).
A multidimensional Rasch analysis of gender differences in PISA mathematics. Journal of Applied Measurement, 9, 18. (2008).
Multidimensional classification of examinees based on the mixture random weights linear logistic test model. Educational and Psychological Measurement. (In Press).
Modeling randomness in judging rating scales with a random-effects rating scale model. Journal of Educational Measurement, 43, 335–353. (2006).
A model of cognition: The missing cornerstone of assessment. Educational Psychology Review, 23, 221–234. (2011).
Mixture models in a developmental context. Advances in Latent Variable Mixture Models, 199. (2008).
Measuring progressions: Assessment structures underlying a learning progression. Journal of Research in Science Teaching, 46, 716–730. (2009).
Measuring pregnancy planning: An assessment of the London Measure of Unplanned Pregnancy among urban, south Indian women. Demographic Research, 23, 293. (2010).
Measuring measuring: Toward a theory of proficiency with the Constructing Measures framework. Journal of Applied Measurement, 296. (2009).
Mapping student understanding in chemistry: The perspectives of chemists. Science Education, 93, 56–85. (2009).
Mapping multiple dimensions of student learning: The ConstructMap program. Journal of Applied Measurement, 10, 1. (2009).
A LLTM approach to the examination of teachers' ratings of classroom assessment tasks. Psychology Science, 50, 417. (2008).
An IRT modeling of change over time for repeated measures item response data using a random weights linear logistic test model approach. Asia Pacific Education Review, 13, 487–494. (2012).
An introduction to multidimensional measurement using Rasch models. Journal of Applied Measurement, 4, 87–100. (2003).
Introducing multidimensional item response modeling in health behavior and health education research. Health Education Research, 21, i73–i84. (2006).
Introducing equating methodologies to compare test scores from two different self-regulation scales. Health Education Research, 21, i110–i120. (2006).
Improving measurement in health education and health behavior research using item response modeling: Introducing item response modeling. Health Education Research, 21, i4–i18. (2006).
Improving measurement in health education and health behavior research using item response modeling: Comparison with the classical test theory approach. Health Education Research, 21, i19–i32. (2006).
Improving assessment evidence in e-learning products: Some solutions for reliability. International Journal of Learning Technology, 5, 191–208. (2010).
A gentle introduction to Rasch measurement models for metrologists. Journal of Physics: Conference Series, 459, 012002. (2013). Retrieved from http://stacks.iop.org/1742-6596/459/i=1/a=012002
Generalizability in item response modeling. Journal of Educational Measurement, 44, 131–155. (2007). Retrieved from http://onlinelibrary.wiley.com/doi/10.1111/j.1745-3984.2007.00031.x/full
Gender differences in large-scale math assessments: PISA trend 2000 and 2003. Applied Measurement in Education, 22, 164–184. (2009).
Gender differences and similarities in PISA 2003 mathematics: A comparison between the United States and Hong Kong. International Journal of Testing, 9, 20–40. (2009).