Item Estimates and Fit Graph


This report helps users evaluate how well each item fits the model. It shows the parameter estimates, their standard errors, and the outfit (unweighted) and infit (weighted) mean squares with their associated t-statistics. Item and step parameters are displayed according to the measurement model:

  • Dichotomous items display the item difficulty, the difficulty of achieving a score of 1 rather than 0 (no step parameters).
  • Partial credit items display the average of all the step difficulties for the item, and difficulties for each step, representing the difficulty of achieving a score at that score level rather than at the preceding level.
  • Rating scale items display the average of all the step difficulties, and a tau parameter for each step, representing the difficulty of achieving a score at that score level rather than at the preceding level. Tau parameters are constant across all items within a dimension and separate tau parameters are displayed for each dimension.
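
The meaning of a step difficulty can be made concrete with a small sketch. It assumes the standard partial credit formulation, in which the probability of each score depends on the cumulative sum of (ability minus step difficulty). The function name and the numeric values are hypothetical and are not taken from the software or from any sample project.

```python
import math

def pcm_category_probabilities(theta, step_difficulties):
    """Return P(X = 0), ..., P(X = m) under the partial credit model.

    theta: person ability in logits.
    step_difficulties: step difficulties delta_1..delta_m in logits.
    """
    # Cumulative sums of (theta - delta_j); score 0 has an empty sum of 0.
    exponents = [0.0]
    for delta in step_difficulties:
        exponents.append(exponents[-1] + (theta - delta))
    # Subtract the maximum before exponentiating for numerical stability.
    largest = max(exponents)
    weights = [math.exp(e - largest) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]

# Dichotomous item: one step, so the item difficulty is the step difficulty.
print(pcm_category_probabilities(0.0, [0.0]))

# Three-category item with two hypothetical step difficulties.
print(pcm_category_probabilities(0.5, [-1.0, 0.208]))
```

When a person's ability equals a step difficulty, scores at that level and at the preceding level are equally likely, which is the sense in which a step difficulty describes achieving one level "rather than" the one before it.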

Selecting the Reports - Item Reports - Item Estimates & Fit Graph option for a partial credit model produces a report similar to Figure 1. In this example, item names are used rather than item numbers. Values in the report are explained below.

This example uses the Unidimensional Partial Credit sample project.

Item Estimates

The first item shown in Figure 1, item “i1”, has three categories, and therefore two steps. The first entry, labeled “i1” with an estimate of -0.396, is the average of the two step difficulties, that is, the average item difficulty. The next two entries, labeled “i1.1” and “i1.2”, are the step difficulties for j=1 and j=2. The last step, in this case i1.2, is constrained and therefore has no standard error displayed.
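
As a quick illustration of how the average item difficulty relates to the step difficulties, the sketch below averages two step values. The step values are hypothetical, chosen only so that their mean reproduces the -0.396 reported for “i1”; they are not read from Figure 1.

```python
from statistics import mean

# Hypothetical step difficulties for i1.1 and i1.2 (not taken from Figure 1),
# chosen so that their mean matches the -0.396 reported for "i1".
step_difficulties = {"i1.1": -1.000, "i1.2": 0.208}

# The average item difficulty is the mean of the item's step difficulties.
average_item_difficulty = mean(step_difficulties.values())
print(f"i1 average difficulty: {average_item_difficulty:.3f}")
```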

Figure 1. Item Estimates report for the Example 2, partial credit, project.

Standard Errors of the Estimates

Referring again to Figure 1, standard errors are produced for the average item estimates and for the estimated step parameters. Standard errors are not produced for constrained parameters.

Mean Squares

The infit mean square for an item is defined as the ratio of the observed residual variance to the residual variance expected under the model. When this ratio is close to 1.0, the observed residuals vary about as much as the model expects.

Large item mean squares indicate more variance in the observed scores than expected, suggesting that an item does not measure the underlying variable as well as other items in the model (i.e., it has a flat slope). These items are less predictable than expected and may suggest unmodelled noise in the data. Items exhibiting large mean squares contribute less to the measurement of the latent trait, and so are the most important to evaluate and correct or remove.

Small values are an indication of less variance than expected, which typically occurs when an item discriminates proficiencies very well over a relatively small range of person abilities (i.e., it has a steep slope). These items are overly predictable, implying less stochastic variation than expected for meaningful measurement.

Outfit mean squares are unweighted averages of the squared standardized residuals for an item, while infit mean squares are variance-weighted averages of the squared standardized residuals for an item. Outfit mean squares are influenced by outliers, while infit mean squares are influenced by unexpected patterns among more average observations. Adams and Khoo (1996) suggest 0.75 and 1.33 as bounds of acceptable mean square values.
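
The difference between the two mean squares can be sketched for a single dichotomous item. This is the textbook Rasch calculation with known person abilities, offered only as an illustration; the software's own computation may differ in detail. The function names and data are hypothetical.

```python
import math

def rasch_probability(theta, delta):
    """Probability of a score of 1 for ability theta and item difficulty delta."""
    return 1.0 / (1.0 + math.exp(-(theta - delta)))

def item_fit_mean_squares(responses, abilities, delta):
    """Return (outfit, infit) mean squares for one dichotomous item."""
    squared_residuals = []
    variances = []
    for x, theta in zip(responses, abilities):
        p = rasch_probability(theta, delta)
        variances.append(p * (1.0 - p))          # model variance of the response
        squared_residuals.append((x - p) ** 2)   # squared raw residual
    n = len(responses)
    # Outfit: plain average of squared standardized residuals.
    outfit = sum(r / w for r, w in zip(squared_residuals, variances)) / n
    # Infit: variance-weighted average, which reduces to this ratio of sums.
    infit = sum(squared_residuals) / sum(variances)
    return outfit, infit

# The last person is very able (theta = 4) but answers incorrectly: an outlier.
outfit, infit = item_fit_mean_squares(
    responses=[0, 1, 1, 0], abilities=[-1.0, 0.0, 1.0, 4.0], delta=0.0
)
print(f"outfit = {outfit:.2f}, infit = {infit:.2f}")
```

In this example the single unexpected response inflates the outfit far more than the infit, because the infit gives little weight to observations from persons far from the item's difficulty.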


The t-statistic transforms the mean square to a scale that is approximately standard normal when the data fit the model. Values above 2.0 or below -2.0 are generally considered significant. Because the statistic is sensitive to sample size and can become large in large samples, we recommend using the mean square and the t-statistic together (Wilson, 2005). When both indicate misfit, the item should be investigated further to understand why.
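
One common way to obtain such a t-statistic is the Wilson-Hilferty cube-root transformation, sketched below together with a simple rule that flags an item only when both criteria agree. Whether the software uses exactly this transformation is an assumption. The `sd` argument (the model standard deviation of the mean square) is not shown in the report and is supplied here as a hypothetical input, and both function names are illustrative.

```python
def mean_square_to_t(mean_square, sd):
    """Wilson-Hilferty cube-root transformation of a mean square.

    sd is the standard deviation of the mean square under the model
    (an assumed input; the report does not display it).
    """
    cube_root = mean_square ** (1.0 / 3.0)
    return (cube_root - 1.0) * (3.0 / sd) + sd / 3.0

def flag_misfit(mean_square, t, lower=0.75, upper=1.33, t_cutoff=2.0):
    """Flag an item only when the mean square AND the t-statistic indicate misfit."""
    outside_bounds = mean_square < lower or mean_square > upper
    significant = abs(t) > t_cutoff
    return outside_bounds and significant

# Hypothetical item: infit mean square of 1.5 with an assumed sd of 0.1.
t = mean_square_to_t(1.5, 0.1)
print(f"t = {t:.2f}, flagged = {flag_misfit(1.5, t)}")

# A large t alone (for example, from a very large sample) does not flag the item.
print(flag_misfit(1.10, 3.0))
```

Requiring both conditions follows the recommendation above: in large samples the t-statistic can exceed 2.0 even when the mean square is well inside the 0.75 to 1.33 bounds.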

Fit Graph

In addition to the table of item parameter estimates and fit statistics, this report produces a graph of fit statistics by item. As shown in Figure 2, the graph helps the user identify items that do not fit the model well. The vertical columns of dots at 0.75 and 1.33 are theoretical boundaries defining “acceptable values” for infit mean squares. The *’s represent the infit mean square values for the item averages; those near the vertical column of |’s, indicating a value of 1.00, fit the model quite well. These item parameters were produced from the report shown in Figure 1.

Figure 2. Fit Graph of item infit mean squares.

Figure 3 shows output for a rating scale model. All of the average item difficulties are presented in one section near the beginning of the report, while the tau parameters (step intervals) are presented later in the report. In multidimensional models, a separate group of tau parameters is displayed for each dimension.

Figure 3. Item Estimates report for the Example 3, rating scale, project.
  1. Because this report is written to a text file as well as displayed on screen, you can open and print it with Word or Notepad. The file is saved in the folder you specified (the filename appears in the upper left-hand corner of the heading area).
  2. Close the report display by clicking the close box (X) in the upper right-hand corner.