Computer Adaptive Testing: New directions (12)

Chair: Maarten Speekenbrink, Friday 24th July, 11.30 - 12.50, Palmerston Lecture Theatre, Fisher Building.

Maarten Speekenbrink, Nick Chater and David R. Shanks, Department of Cognitive, Perceptual and Brain Sciences, University College London, UK. Adaptive tests for model discrimination. (072)

Niki-Nils Seitz and Andreas Frey, Leibniz Institute for Science Education, Kiel, Germany. Multiple-category classification using multidimensional adaptive testing. (208)

Chen-Wei Liu and Wen-Chung Wang, Department of Psychology, National Chung-Cheng University, Taiwan. The application of the random-threshold generalized graded unfolding model to computerized adaptive testing and computerized classification testing. (154)

ABSTRACTS

Adaptive tests for model discrimination. (072)
Maarten Speekenbrink, Nick Chater and David R. Shanks
Adaptive tests are used routinely in psychometric ability assessment. Here, we discuss our recent work on adaptive testing for a different purpose, namely to distinguish between competing models of cognitive functioning (e.g., learning and categorization models). In this context, the goal is the sequential design of an experiment such that each consecutive trial maximizes the information for discriminating a set of competing models, while simultaneously estimating the models' parameters. As analytical solutions are typically intractable, our method uses approximations based on recent methods of sequential sampling (particle filters). We will illustrate the method with a simple example of category learning. Based on this example, we then discuss a number of problems and challenges for future work.
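
The trial-selection idea in this abstract can be illustrated with a minimal sketch. It is not the authors' method: it assumes each model's parameters are fixed and known, so no particle filter is needed, and it scores each candidate trial by the expected information gain about which model is true. The function names, the candidate stimuli "A", "B", "C" and their response probabilities are all hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def update(prior, likelihoods, response):
    """Posterior model probabilities after observing a binary response.

    likelihoods[i] is P(response = 1 | model i) for the presented trial.
    """
    per_model = [q if response == 1 else 1 - q for q in likelihoods]
    marginal = sum(p * q for p, q in zip(prior, per_model))
    return [p * q / marginal for p, q in zip(prior, per_model)]


def expected_information_gain(prior, likelihoods):
    """Expected reduction in entropy over models from one binary trial."""
    gain = entropy(prior)
    for response in (1, 0):
        per_model = [q if response == 1 else 1 - q for q in likelihoods]
        marginal = sum(p * q for p, q in zip(prior, per_model))
        if marginal > 0:  # skip outcomes that cannot occur
            gain -= marginal * entropy(update(prior, likelihoods, response))
    return gain


def select_trial(prior, candidates):
    """Name of the candidate trial with the largest expected gain."""
    return max(
        candidates,
        key=lambda name: expected_information_gain(prior, candidates[name]),
    )


# Hypothetical example: two models, three candidate stimuli.
prior = [0.5, 0.5]
candidates = {
    "A": [0.5, 0.5],  # both models predict the same: uninformative
    "B": [0.9, 0.1],  # models disagree strongly
    "C": [0.6, 0.4],  # models disagree weakly
}
best = select_trial(prior, candidates)
posterior = update(prior, candidates[best], response=1)
```

In a full sequential design the posterior would become the prior for the next trial, and the fixed response probabilities would be replaced by averages over the current parameter estimates of each model.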

Multiple-category classification using multidimensional adaptive testing. (208)
Niki-Nils Seitz and Andreas Frey
With multidimensional adaptive testing (MAT), individuals can be classified into two or more categories (such as pass/fail) on multiple dimensions. One possible criterion is to classify an individual once the probability of an incorrect classification falls below a predefined level (e.g., 5%). However, results regarding the applicability of MAT for classifying individuals into multiple categories are still lacking. In a simulation study, MAT was compared with both one-dimensional adaptive testing (CAT) and sequential testing with a fixed item set (FIT) with regard to measurement efficiency (ME). ME was calculated as the ratio of measurement precision to the number of items used in the test. The test conditions included several pass/fail test situations and a multiple-category test. In a pass/fail situation, measurement efficiency was lower for FIT (ME = 0.295) than for CAT (ME = 1.423) and MAT (ME = 1.537). In a multiple-category situation, measurement efficiency was generally lower, with MAT (ME = 0.100) and CAT (ME = 0.105) both performing better than FIT (ME = 0.063). Thus, MAT is applicable for classification and increases efficiency, especially in a pass/fail test situation. Based on the results, additional gains in efficiency with MAT are discussed.
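
The classification criterion and the efficiency ratio described in this abstract can be sketched as follows. This is a one-dimensional illustration under stated assumptions, not the procedure used in the study: the ability estimate is assumed to have a normal posterior, the cut scores are hypothetical, and because the abstract does not say how measurement precision was operationalized, the precision argument is left to the caller.

```python
from statistics import NormalDist


def misclassification_probability(mean, sd, cuts):
    """Probability that the true ability lies outside the most likely category.

    Assumes a normal posterior N(mean, sd) for ability; `cuts` are the
    cut scores separating adjacent categories.
    """
    dist = NormalDist(mean, sd)
    # Cumulative probabilities at the category boundaries, padded with 0 and 1.
    bounds = [0.0] + [dist.cdf(c) for c in sorted(cuts)] + [1.0]
    category_probs = [hi - lo for lo, hi in zip(bounds, bounds[1:])]
    return 1.0 - max(category_probs)


def should_stop(mean, sd, cuts, level=0.05):
    """Stop testing once the misclassification probability is below `level`."""
    return misclassification_probability(mean, sd, cuts) < level


def measurement_efficiency(precision, n_items):
    """Ratio of measurement precision to the number of items administered."""
    return precision / n_items


# Hypothetical pass/fail example with a single cut score at 0.
stop_far = should_stop(mean=2.0, sd=1.0, cuts=[0.0])   # estimate far from cut
stop_near = should_stop(mean=1.0, sd=1.0, cuts=[0.0])  # estimate close to cut

# Hypothetical three-category example with cut scores at -1 and 1.
stop_multi = should_stop(mean=0.0, sd=0.5, cuts=[-1.0, 1.0])
```

With more cut scores, an estimate is more often close to some boundary, so more items are typically needed before the criterion is met; this is consistent with the lower efficiency values reported for the multiple-category condition.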

The application of the random-threshold generalized graded unfolding model to computerized adaptive testing and computerized classification testing. (154)
Chen-Wei Liu and Wen-Chung Wang
The random-threshold generalized graded unfolding model (RTGGUM) was proposed in this study. This model takes the randomness in the thresholds over persons into account by treating it as a random effect and adding a random variable for each threshold (Roberts et al., 2000). To investigate the applicability of the RTGGUM more thoroughly, it was further applied to both computerized adaptive testing (CAT) and computerized classification testing (CCT). By taking the random effect as another dimension, the RTGGUM-CAT can be regarded as a special case of multidimensional adaptive testing, in which the latent trait and the random effects are estimated simultaneously. The focus of the RTGGUM-CCT, on the other hand, is on the accuracy of classification, and the random effects are therefore viewed as randomness. The results indicated that the existence of random effects leads to inaccurate latent trait estimates and increases classification errors. Therefore, the reliability of latent trait estimates and the accuracy of classification are doubtful if the random effects are ignored. Issues related to the application of the RTGGUM in CAT and CCT are discussed.