Berkeley Evaluation & Assessment Research Center
Director: Mark Wilson

Convener: Mark Wilson
Coordinator: PJ Hallam


BEAR Seminars, Spring 2006

The Berkeley Evaluation and Assessment Research (BEAR) Center coordinates several seminars designed to provide a forum for researchers to share cutting-edge findings and to prompt congenial discussion of educational assessment and evaluation topics.

Events take place on Tuesdays from 2 to 4 p.m. at:
UC Berkeley, Graduate School of Education
2515 Tolman Hall, unless otherwise noted.


Date | Additional Information | Speaker | Title
Jan. 24 | | Alicia Alonzo, Stanford University (CAESL) | Supporting Teachers' Formative Assessment Practices: An Example Involving Science Notebooks
Jan. 31 | Noon to 2 p.m. | CAESL Brown Bag Lunch: Rich Shavelson, Stanford University; Steve Schneider, Alice Fu and Mike Timms, WestEd | The 2009 NAEP Science Framework
Feb. 7 | 5634 Tolman | Cathleen Kennedy, Sevan Tutunciyan and Richard Vorp, UC Berkeley | BEAR IT: Berkeley Evaluation and Assessment Research Information Technology
Feb. 21 | | Diane Allen, UC Berkeley | Validity, Reliability and Responsiveness of the Movement Ability Measure: Using Item Response Models
March 9 | Thursday | Cathleen Kennedy and the Technology and Assessment Group (TAG): Kathleen Scalise, Mike Timms, Diana J. Bernbaum, Kristen Burmester, and S. Veeragouder Harrell, UC Berkeley | Assessment for e-Learning: Case Studies of an Emerging Field
March 21 | | Juliet P. Shaffer, UC Berkeley | Some Problems with Confidence Intervals
Apr. 5-7 | International House, UCB | | IOMW 2006: The 13th International Objective Measurement Workshop
Apr. 18 | | Mike Timms, UC Berkeley | Using IRT in an Intelligent Tutoring System

January 24

Supporting Teachers' Formative Assessment Practices: An Example Involving Science Notebooks

Alicia Alonzo

Abstract:
Literature on teachers’ formative assessment practices has both extolled their impact on student achievement and lamented their seeming absence in most classrooms. To realize the potential benefits of formative assessment, teachers need support in incorporating these practices into their classrooms. In this presentation, I will discuss the beliefs, knowledge, and skills which have been shown to affect teachers’ formative assessment practices – drawing on both research literature and my recent study of science notebooks. These are used to propose professional development experiences which may impact teachers’ formative assessment practices.



January 31

CAESL Brown Bag Lunch:
The 2009 NAEP Science Framework

Rich Shavelson, Stanford University
Steve Schneider, Alice Fu & Mike Timms, WestEd

Abstract:
This brownbag session will cover various aspects of the development of the new 2009 Science Framework for the National Assessment of Educational Progress (NAEP). Under contract to the National Assessment Governing Board, WestEd's National Center for Improving Science Education (NCISE) and the Council of Chief State School Officers conducted an 18-month process to develop the Framework involving hundreds of individuals across the country, including some of the nation's leading scientists, science educators, policymakers, and assessment experts. The Governing Board also engaged an external review panel to evaluate the draft Framework and convened a public hearing to gather additional input during the development process. Four CAESL members will address different aspects of the development of the Framework during this session.
* Steve Schneider: The committee process and review cycles by which the NAEP Framework and Test Specifications documents were produced.
* Rich Shavelson: The assessment specifications that are in the Framework and the Test Specs documents and what is new for 2009.
* Alice Fu: The science content of the Framework, with particular reference to what is different for 2009.
* Mike Timms: The new Interactive Computer Tasks that will form a new part of the assessments from 2009.



February 7

BEAR IT: Berkeley Evaluation and Assessment Research Information Technology

Cathleen Kennedy, Sevan Tutunciyan & Richard Vorp
University of California at Berkeley

Abstract:
Come and see new information technologies under development in the BEAR Center:
Advances in GradeMap - Software to facilitate multidimensional item response modeling and the interpretation of longitudinal response data. The GradeMap program accommodates the calibration of multiple forms linked by common items and produces reports of respondent change. We will also demonstrate reports that support the analysis of item and person fit, alternate forms analysis, and traditional item statistics.
Standard Setting using ConstructMap - This software assists users in the evaluation of criterion-referenced cut-points. Calibrated item estimates and person proficiency estimates are imported into ConstructMap and then the software is used to demonstrate the impact of selecting alternative cut-points.
The BEAR Scoring Engine - This web-based software is called by external applications to compute multidimensional proficiency estimates. Calibrated item parameters and response data are sent to the Scoring Engine in XML files via an HTTP request, the Scoring Engine computes proficiency estimates, and then transmits the estimates back to the calling program via an XML output file. Input response data and the returned proficiency data files comply with IMS/QTI (Question-Test Interface) XML specifications. We will demonstrate two applications that call the Scoring Engine and highlight interface techniques.
Online Assessment Delivery - Preview an online system under development that delivers an assessment, gathers responses, and produces multidimensional proficiency reports in real time.
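
The request and response flow described for the Scoring Engine can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the BEAR Center's implementation: the element names (`scoringRequest`, `item`, `response`, `scoringResponse`, `estimate`) are hypothetical and do not follow the IMS/QTI schema, the HTTP transport is omitted, and a unidimensional Rasch model with maximum-likelihood estimation stands in for the multidimensional model mentioned above.

```python
import math
import xml.etree.ElementTree as ET

# Hypothetical request format for illustration; NOT the IMS/QTI schema.
REQUEST_XML = """
<scoringRequest>
  <item id="i1" difficulty="-1.0"/>
  <item id="i2" difficulty="0.0"/>
  <item id="i3" difficulty="1.0"/>
  <response item="i1" score="1"/>
  <response item="i2" score="1"/>
  <response item="i3" score="0"/>
</scoringRequest>
"""

def rasch_mle(difficulties, scores, iterations=50):
    """Newton-Raphson maximum-likelihood ability estimate under a
    dichotomous Rasch model. Assumes a mixed response pattern
    (an all-0 or all-1 pattern has no finite estimate)."""
    theta = 0.0
    for _ in range(iterations):
        probs = [1.0 / (1.0 + math.exp(-(theta - b))) for b in difficulties]
        gradient = sum(x - p for x, p in zip(scores, probs))
        information = sum(p * (1.0 - p) for p in probs)
        theta += gradient / information
    return theta

def score_request(xml_text):
    """Parse calibrated item parameters and responses from XML,
    estimate proficiency, and return the estimate as XML."""
    root = ET.fromstring(xml_text)
    difficulty = {e.get("id"): float(e.get("difficulty")) for e in root.iter("item")}
    responses = [(e.get("item"), int(e.get("score"))) for e in root.iter("response")]
    bs = [difficulty[item_id] for item_id, _ in responses]
    xs = [score for _, score in responses]
    theta = rasch_mle(bs, xs)
    out = ET.Element("scoringResponse")
    ET.SubElement(out, "estimate", {"theta": f"{theta:.3f}"})
    return ET.tostring(out, encoding="unicode")

reply = score_request(REQUEST_XML)
print(reply)
```

In a deployed service of this kind, a calling application would presumably send the request body over HTTP and parse the returned XML in the same way.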



February 21

Validity, Reliability, and Responsiveness of the Movement Ability Measure: Using Item Response Models

Diane Allen
University of California at Berkeley

Abstract:
Instruments created to test subjective factors related to human performance frequently lack validity because participant responses can indicate interpretations of items that differ radically from the measurers'. Item response modeling (IRM) methods can assist in the development and testing of instruments that retain verifiable links to their subjective constructs and thus support using them to test theory. The purpose of this talk is to demonstrate IRM methods used to generate and test the Movement Ability Measure (MAM), a self-report questionnaire asking for people's perceptions of their ability to move. The MAM was generated to match closely the Movement Continuum Theory of physical therapy, proposed by Cott et al. (1995) and extended and operationalized for this study. The responses of 318 adults (age range 18-101 years) provided evidence of content, construct, and criterion validity, and an internal consistency of .94; responses from 34 adults revealed a test-retest reliability of .84. Wright Maps showed the strong relationship between the theorized construct and the empirical data. A six-dimensional model fit the data better than a unidimensional model, although the dimensions correlate highly. Results of the MAM and a 32-item self-reported functional assessment instrument correlated at r = .76. Repeated measures of 34 patients (age range 19-85 years) undergoing physical therapy in outpatient clinics indicated that the MAM was responsive to intervention both after 2 weeks (p < .00003) and at 2 months or discharge, whichever came first (p < .00002). Correlation between these patients' and their physical therapists' responses regarding their movement at initial visit was moderate, at r = .68. The evidence supported the predictions of the Movement Continuum Theory: current movement ability increased, and the gap between current and preferred movement abilities decreased following physical therapy for these patients with mostly orthopedic diagnoses.
Thus, the IRM methods supported generation of an instrument closely linked to its underlying construct, and able to provide evidence supporting the overall theory within which the construct rests. Similar IRM methods might enhance the development and testing of additional theories related to human performance.
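
For readers unfamiliar with the reliability figures quoted in this abstract, the sketch below shows classical versions of two such coefficients: Cronbach's alpha for internal consistency and a Pearson correlation for test-retest reliability. The abstract does not say which coefficients were actually computed, so this is an assumption for illustration only, and the ratings below are made-up toy data, not MAM responses.

```python
from statistics import mean, pvariance

def cronbach_alpha(item_scores):
    """Cronbach's alpha. item_scores is a list of respondents,
    each a list of scores on the same k items."""
    k = len(item_scores[0])
    item_vars = [pvariance([row[j] for row in item_scores]) for j in range(k)]
    total_var = pvariance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

def pearson_r(x, y):
    """Pearson correlation, e.g. between scores at test and retest."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Toy data invented for this example: 5 respondents, 3 items.
ratings = [[1, 2, 1], [2, 2, 3], [3, 4, 3], [4, 4, 5], [5, 5, 5]]
alpha = cronbach_alpha(ratings)

# Toy test and retest total scores for the same 5 respondents.
test_scores = [4, 7, 10, 13, 15]
retest_scores = [5, 6, 11, 12, 15]
retest_r = pearson_r(test_scores, retest_scores)

print(f"alpha = {alpha:.2f}, test-retest r = {retest_r:.2f}")
```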


Thursday, March 9

Assessment for e-Learning: Case Studies of an Emerging Field

Cathleen Kennedy and the Technology and Assessment Group (TAG):
Diana J. Bernbaum, Kristen Burmester, S. Veeragouder Harrell, Kathleen Scalise, and Mike Timms
UC Berkeley

Abstract:
This symposium will discuss the rapidly emerging field of computer-based assessment in e-learning. In e-learning products, a variety of assessment approaches are being used for such diverse purposes as adaptive delivery of content, individualizing learning materials, dynamic feedback, cognitive diagnosis, score reporting and course placement. This symposium discusses evidence-based assessment principles in e-learning. Four case studies will be presented of e-learning products with assessment components. The products in the case studies were selected for exhibiting at least one exemplary aspect regarding assessment and measurement. The principles of the BEAR Assessment System will be used as a framework of analysis for these products with respect to key measurement principles, such as evidence identification and accumulation.


March 21

Some Problems with Confidence Intervals

Juliet P. Shaffer
University of California at Berkeley

Abstract:
Social-behavioral scientists are often warned about the defects of hypothesis testing, and exhorted to rely instead on confidence interval and effect size estimation. However, there are also problems unique to confidence intervals that are much more rarely addressed. For example, if attention is paid only to intervals not including the null value, the confidence coverage of those intervals is often much less than the nominal value. This phenomenon will be explained and illustrated, related issues will be discussed, and the possible impact of the results in the educational context will be noted.
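
The selective-coverage effect described in this abstract can be illustrated with a small simulation. This is a sketch under stated assumptions, not the speaker's analysis: normally distributed data with known standard deviation 1, a null value of 0, a 95% z-interval for the mean, and an arbitrarily chosen small true mean of 0.1. The function name and its parameters are hypothetical.

```python
import random

def coverage_rates(true_mean, n=25, trials=20000, seed=0):
    """Return (overall coverage, coverage among intervals that exclude 0)
    for a 95% z-interval on a normal mean with known sigma = 1."""
    rng = random.Random(seed)
    z = 1.96
    se = 1.0 / n ** 0.5  # standard error of the sample mean
    covered = 0
    selected = 0
    covered_selected = 0
    for _ in range(trials):
        # The mean of n N(true_mean, 1) draws is N(true_mean, se^2),
        # so the sample mean is drawn directly.
        xbar = rng.gauss(true_mean, se)
        lo, hi = xbar - z * se, xbar + z * se
        covers = lo <= true_mean <= hi
        excludes_null = lo > 0 or hi < 0
        covered += covers
        if excludes_null:
            selected += 1
            covered_selected += covers
    return covered / trials, covered_selected / selected

overall, conditional = coverage_rates(true_mean=0.1)
print(f"overall coverage:            {overall:.3f}")
print(f"coverage given 0 is excluded: {conditional:.3f}")
```

Under these assumptions the overall coverage stays close to the nominal 0.95, while the coverage among the "significant" intervals alone comes out well below it, because intervals that exclude 0 are mostly those whose sample mean landed unusually far from the small true mean.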


April 5-7

International Objective Measurement Workshop (IOMW)

Nathaniel Brown & Brent Duckor, Coordinators
University of California at Berkeley


April 18

Using IRT in an Intelligent Tutoring System

Mike Timms, Ph.D.
University of California at Berkeley

Abstract:
Providing feedback, including hints, is one of the key steps in the tutoring process. However, a persistent challenge in the development of computer-based intelligent tutoring systems (ITSs) is how to determine accurately when a student needs help, and then determine what the best help is for that individual student. In this session I will describe a study in which I investigated the feasibility of predicting students’ need for help in an ITS using Item Response Theory (IRT). The first part of my study involved analysis of data from the PACT (Pittsburgh Advanced Cognitive Tutors) Geometry Tutor and a randomized study that compared three versions of a tutoring system that used IRT. The analysis of prior data showed that the use of hints was related to the students’ beginning ability and the size of the gap between that initial ability and the difficulty of the item. For the second part of my study, I worked with staff from the Principled Assessment Design for Inquiry (PADI) project to develop three versions of a computer-based self-assessment system used with the Full Option Science System (FOSS) curriculum on Force and Motion. The self-assessment system, or tutor, was designed to help middle-school students to learn to solve physics problems using the equation for calculating speed. The full version of the tutor used item response theory to give students hints appropriate to the size of their learning gap. The feedback version of the tutor provided feedback on errors that they made, but gave no hints on how to repair those errors. The limited version of the tutor gave neither error feedback nor hints, just confirming if their responses were right or wrong. I will report on the result of the comparative study of these three versions and discuss the implications of design decisions that were made in the development process.
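
The idea of matching hints to the gap between a student's ability and an item's difficulty can be sketched with the Rasch model, the simplest IRT model. This is an illustration only, not the tutor described in the talk: the function names, the gap thresholds, and the hint labels are all hypothetical, and the abstract does not say which IRT model or decision rule was used.

```python
import math

def p_correct(ability, difficulty):
    """Rasch model: probability of a correct response, with ability
    and difficulty on the same logit scale."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def hint_level(ability, difficulty, thresholds=(0.0, 1.0)):
    """Map the learning gap (difficulty minus ability, in logits) to a
    hint level. Thresholds and labels are hypothetical placeholders."""
    gap = difficulty - ability
    if gap <= thresholds[0]:
        return "none"      # item is at or below the student's ability
    if gap <= thresholds[1]:
        return "light"     # small gap: a brief prompt
    return "detailed"      # large gap: a worked step

# Example: one student (ability 0.0 logits) facing three items.
for difficulty in (-0.5, 0.5, 2.0):
    p = p_correct(0.0, difficulty)
    print(f"difficulty {difficulty:+.1f}: P(correct) = {p:.2f}, "
          f"hint = {hint_level(0.0, difficulty)}")
```

A real system would also need to estimate ability from the student's responses as they accumulate; here ability is simply taken as given.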



BEAR Center
Graduate School of Education
University of California, Berkeley
Berkeley, CA 94720

© 2002-2008 BEAR Center