Biblio
. (2011). The Nature of Assessment Systems to Support Effective Use of Evidence through Technology. E-Learning and Digital Media, 8, 121–132.
. (2011). Road maps for learning: A guide to the navigation of learning progressions. Measurement: Interdisciplinary Research & Perspectives, 9, 71–123.
. (2011). The role of mathematical models in measurement: A perspective from psychometrics. Universitätsbibliothek Ilmenau.
. (2011). "Triangulating Validity Evidence from Classroom Discussions, Written Assessments, and Cognitive Interviews." In J. Matthews-Lopez (Chair), Dimensions of Test Validation. American Educational Research Association (AERA). Symposium, New Orleans, LA.
. (2011). Validating a learning progression in mathematical functions for college readiness. Mathematical Thinking and Learning, 13, 259–291.
. (2010). Articulating Assessments Across Childhood: The Cross-Age Validity of the Desired Results Developmental Profile–Revised. Educational Assessment, 15, 1–26.
. (2010). The evidence-based reasoning framework: Assessing scientific reasoning. Educational Assessment, 15, 123–141.
. (2010). "Exploring Contexts of Assessment." In R. Lehrer (Chair), Constructing A Multidimensional Learning Progression of Data Modeling: Design Studies, Psychometric Modeling and Brokering Professional Development. National Council of Teachers of Mathematics. Paper, San Diego, CA.
. (2010). Improving assessment evidence in e-learning products: some solutions for reliability. International Journal of Learning Technology, 5, 191–208.
. (2010). Measuring pregnancy planning: An assessment of the London Measure of Unplanned Pregnancy among urban, south Indian women. Demographic Research, 23, 293.
. (2010). Selecting cut scores with a composite of item types: the construct mapping procedure. Journal of Applied Measurement, 12, 298–309.
. (2010). Sources of self-efficacy belief: development and validation of two scales. Journal of Applied Measurement, 11, 24.
. (2010). Unfair treatment? The case of Freedle, the SAT, and the standardization approach to differential item functioning. Harvard Educational Review, 80, 106–134.
. (2010). Validation of the International Classification of Functioning, Disability and Health framework using multidimensional item response modeling. Disability & Rehabilitation, 32, 1397–1405.
. (2010). Concrete, abstract, formal, and systematic operations as observed in a "Piagetian" balance-beam task series. Journal of Applied Measurement, 11, 11–23.
. (2009). 
Constructing One Scale to Describe Two Statewide Exams. Journal of applied measurement, 10, 170–184.
. (2009). Gender differences and similarities in PISA 2003 mathematics: A comparison between the United States and Hong Kong. International Journal of Testing, 9, 20–40.
. (2009). 