Abstract
In assessment programs where scores are reported for individual examinees, it is desirable to have responses to performance exercises graded by more than one rater. If more than one item on each test form is graded this way, it is also desirable that different raters grade the responses of any one examinee. This gives rise to sampling designs in which raters are nested within items. Such designs lead to simple methods for estimating variance components due to examinees, to examinee-by-item interactions, and to examinee-by-rater-within-item interactions. The authors review some useful results from generalizability analysis based on these estimates and show that the results can be used to correct item response information functions and standard errors for the conditional dependence of multiple ratings. Examples based on data from two performance testing studies are presented.
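As a concrete illustration of the design the abstract describes, below is a minimal sketch of ANOVA (expected-mean-square) estimation of the variance components in a balanced persons × (raters nested in items) design, together with the standard generalizability coefficient and an illustrative factor showing why several conditionally dependent ratings of the same response add less information than the same number of independent ratings. The function names, the simulated component values, and the gain formula are this sketch's own assumptions; the article's actual correction to the information function is not reproduced here.

```python
import numpy as np

def nested_rater_components(X):
    """ANOVA (expected-mean-square) variance-component estimates for a
    balanced persons x (raters : items) design.

    X : array, shape (n_p, n_i, n_r); X[p, i, r] is the score that rater r
        (nested within item i) gave person p's response to item i.
    """
    n_p, n_i, n_r = X.shape
    grand = X.mean()
    p_m  = X.mean(axis=(1, 2))        # person means
    i_m  = X.mean(axis=(0, 2))        # item means
    ri_m = X.mean(axis=0)             # rater-within-item means, (n_i, n_r)
    pi_m = X.mean(axis=2)             # person-by-item means, (n_p, n_i)

    ms_p   = n_i * n_r * ((p_m - grand) ** 2).sum() / (n_p - 1)
    ms_i   = n_p * n_r * ((i_m - grand) ** 2).sum() / (n_i - 1)
    ms_ri  = n_p * ((ri_m - i_m[:, None]) ** 2).sum() / (n_i * (n_r - 1))
    ms_pi  = (n_r * ((pi_m - p_m[:, None] - i_m[None, :] + grand) ** 2).sum()
              / ((n_p - 1) * (n_i - 1)))
    resid  = X - pi_m[:, :, None] - ri_m[None, :, :] + i_m[None, :, None]
    ms_pri = (resid ** 2).sum() / ((n_p - 1) * n_i * (n_r - 1))

    # Solve the expected-mean-square equations; truncate negatives at zero.
    v = {"pr:i,e": ms_pri,
         "pi":     (ms_pi - ms_pri) / n_r,
         "p":      (ms_p - ms_pi) / (n_i * n_r),
         "r:i":    (ms_ri - ms_pri) / n_p,
         "i":      (ms_i - ms_pi - ms_ri + ms_pri) / (n_p * n_r)}
    return {k: max(val, 0.0) for k, val in v.items()}

def g_coefficient(v, n_i, n_r):
    """Generalizability coefficient for relative decisions, averaging over
    n_i items with n_r raters nested in each."""
    rel_err = v["pi"] / n_i + v["pr:i,e"] / (n_i * n_r)
    return v["p"] / (v["p"] + rel_err)

def rating_gain(v, n_r):
    """Illustrative precision gain from averaging n_r ratings of one
    response.  Independent ratings would give a factor of n_r; the shared
    examinee-by-item component caps the gain below that, which is the
    conditional dependence the abstract refers to."""
    return (v["pi"] + v["pr:i,e"]) / (v["pi"] + v["pr:i,e"] / n_r)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_p, n_i, n_r = 500, 4, 2
    # Simulate from the nested design with known standard deviations.
    X = (rng.normal(0.0, 1.0, (n_p, 1, 1))        # person
         + rng.normal(0.0, 0.5, (1, n_i, 1))       # item
         + rng.normal(0.0, 0.3, (1, n_i, n_r))     # rater within item
         + rng.normal(0.0, 0.6, (n_p, n_i, 1))     # person x item
         + rng.normal(0.0, 0.4, (n_p, n_i, n_r)))  # person x rater:item, error
    v = nested_rater_components(X)
    print({k: round(val, 3) for k, val in v.items()})
    print("G:", round(g_coefficient(v, n_i, n_r), 3),
          "gain from", n_r, "ratings:", round(rating_gain(v, n_r), 2))
```

Because the examinee-by-item component does not shrink as raters are added, the gain factor plateaus well below the number of raters, which is why treating each rating as an independent item overstates test information and understates standard errors.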
Original language | English
---|---
Pages (from-to) | 364-375
Number of pages | 12
Journal | Applied Psychological Measurement
Volume | 26
Issue number | 4
DOIs |
Publication status | Published - 2002 Jan 1
Keywords
- Conditional dependence
- Generalizability analysis
- Item response analysis
- Multiple ratings of responses
- Performance exercises
- Test information
ASJC Scopus subject areas
- Social Sciences (miscellaneous)
- Psychology (miscellaneous)