Cognitive diagnosis: Classification and clustering (15)

Chair: Jimmy de la Torre, Wednesday 22nd July, 9.55 - 11.15, Boys Smith Room, Fisher Building.

Young-Sun Lee and Jimmy de la Torre, Department of Human Development, Teachers College, Columbia University, New York, USA. Type 1 error and power of the Likelihood Ratio and Wald tests under the generalized DINA model framework. (021)

Chia-Yi Chiu, Department of Educational Psychology, Rutgers, The State University of New Jersey, USA and Jeff Douglas, Department of Statistics, University of Illinois Urbana-Champaign, USA. Finding the number of clusters and labeling clusters for profile classification in cognitive diagnosis. (019)

Elizabeth Ayers, Rebecca Nugent and Nema Dean, Department of Statistics, Carnegie Mellon University, Pittsburgh, USA. A comparison of student skill knowledge estimates. (120)

Pei-Hua Chen, National Chiao Tung University, Taipei, Taiwan and Haiyan Wu, Department of Educational Psychology and Learning Systems, Florida State University, USA. A sampling and classification approach to construct parallel tests based on a cognitive diagnosis model. (089)

ABSTRACTS

Type 1 error and power of the Likelihood Ratio and Wald tests under the generalized DINA model framework. (021)
Young-Sun Lee and Jimmy de la Torre
As a general cognitive-diagnosis model (CDM) framework, the generalized deterministic input, noisy "and" gate (G-DINA) model includes a component for estimating reduced forms of the G-DINA model, and a component for testing the adequacy of the reduced models in place of the general model. For the testing component, two tests have been proposed: the likelihood ratio (LR) test and the Wald test. This study investigates the Type I error and power of the two tests in the context of three reduced models: the DINA model, the deterministic input, noisy "or" gate model, and the additive CDM. The Type I error for each of the three reduced models is examined by comparing the empirical and nominal rejection rates of the LR and Wald tests across different significance levels. The power of the tests is investigated by examining the rejection rates for models with varying degrees of disparity relative to a particular reduced model. In addition to the theoretical chi-squared distributions, the empirical distributions of the test statistics obtained in the Type I error study are used in determining the power of the tests. Finally, the Type I error and power of the LR and Wald tests are documented for three sample sizes.
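As a rough illustration of the comparison between empirical and nominal rejection rates described above, the following Python sketch computes an LR statistic, a chi-squared p-value, and an empirical rejection rate. It is not the authors' code. The function names are hypothetical, the closed-form tail probability holds only for even degrees of freedom, and the p-values in the usage lines are made-up placeholders, not results from the study.

```python
import math

def lr_statistic(loglik_general, loglik_reduced):
    """Likelihood ratio statistic: twice the log-likelihood difference."""
    return 2.0 * (loglik_general - loglik_reduced)

def chi2_sf_even(x, df):
    """Upper-tail probability of a chi-squared variable.

    Uses the closed form exp(-x/2) * sum_{j<df/2} (x/2)^j / j!,
    which is valid only when df is a positive even integer.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch supports positive even df only")
    if x <= 0:
        return 1.0
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j      # (x/2)^j / j!
        total += term
    return math.exp(-half) * total

def rejection_rate(p_values, alpha):
    """Share of replications whose p-value falls below the nominal level."""
    return sum(p < alpha for p in p_values) / len(p_values)

# Illustrative use with placeholder p-values from four replications.
placeholder_p = [0.01, 0.20, 0.04, 0.60]
print(rejection_rate(placeholder_p, alpha=0.05))
```

Under a true reduced model, an empirical rejection rate close to the nominal alpha would indicate good Type I error control; under a false reduced model, the same quantity estimates power.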

Finding the number of clusters and labeling clusters for profile classification in cognitive diagnosis. (019)
Chia-Yi Chiu and Jeff Douglas
An asymptotic theory for clustering examinees by their cognitive profiles using cluster analysis has been developed and examined (Chiu, Douglas, & Li, 2009). The theory shows that, given a particular cognitive diagnosis model and a long test, K-means and hierarchical agglomerative cluster analysis (HACA) can perform nearly as well as the model-based method. Its efficiency and ease of access make cluster analysis an attractive alternative. However, cluster analysis does not directly address how to determine the number of clusters or how to label them, both of which are essential, from the cognitive diagnosis perspective, for a fine-grained understanding of examinees’ cognitive strengths and weaknesses. This paper presents methods for handling both problems. The number of clusters is chosen from a scree plot of the fusion coefficients, that is, the distances between the clusters joined at each step of a HACA. For labeling clusters, an objective index of the consistency of the clusters with the underlying attribute patterns is proposed as the criterion in an exhaustive search over all possible labelings. Details of the methods and the empirical results are provided in the paper.
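The idea of reading the number of clusters off the fusion coefficients can be sketched in Python as follows. This is a minimal illustration, not the authors' procedure: it assumes complete linkage and Euclidean distance, and it replaces visual inspection of the scree plot with a simple "largest jump" rule. The function names and the toy data are hypothetical.

```python
from itertools import combinations

def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def fusion_coefficients(points):
    """Run complete-linkage agglomerative clustering and return the
    distance at which clusters were joined at each step, in merge order."""
    clusters = [[i] for i in range(len(points))]
    heights = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            # Complete linkage: distance between the farthest members.
            d = max(euclidean(points[i], points[j])
                    for i in clusters[a] for j in clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        heights.append(d)
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
    return heights

def suggest_k(heights):
    """Number of clusters remaining just before the largest jump
    between consecutive fusion coefficients (assumed elbow rule)."""
    n = len(heights) + 1
    jumps = [heights[i + 1] - heights[i] for i in range(len(heights) - 1)]
    i = max(range(len(jumps)), key=jumps.__getitem__)
    return n - (i + 1)

# Toy score vectors for six examinees forming three well-separated groups.
toy = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0), (10, 1)]
print(suggest_k(fusion_coefficients(toy)))
```

In practice the fusion coefficients would be plotted against the merge step and the elbow judged by eye; the automatic rule above is only a stand-in for that judgment.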

A comparison of student skill knowledge estimates. (120)
Elizabeth Ayers, Rebecca Nugent and Nema Dean
A fundamental goal of educational research is identifying students' current stage of skill mastery (complete/partial/none). Recently, cognitive diagnosis models have become a popular means of estimating student skill knowledge. However, estimation becomes difficult as the number of students, items, and skills grows. Two currently used alternatives are sum-scores (Henson et al., 2007) and the capability matrix (Ayers et al., 2008); these estimates are clustered to find groups of students with similar skill set profiles. While initial theoretical work on sum-scores has been done, the behavior of sum-scores and the capability matrix is not well understood with respect to each other or to estimates from cognitive diagnosis models. We compare the performance of the DINA model, sum-scores, and the capability matrix under a variety of clustering methods. In simulated examples, recovery of the true skill set profiles was better across all estimation and clustering methods when both single and multiple skill items were used, compared to examples with only multiple skill items. In addition, we note that the alternative methods of estimating student skill knowledge behave similarly. It is interesting to note that sum-scores and the capability matrix perform as well as the DINA model estimates while being computationally more attractive.
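To make the sum-score idea concrete, here is a small Python sketch, assuming that a student's sum-score on a skill is the number of correct responses to items that require that skill according to a Q-matrix. The proportion-correct variant is included only as an assumed normalization for comparison; it should not be read as the exact capability matrix definition of Ayers et al. The function names and the toy data are hypothetical.

```python
def sum_scores(responses, q_matrix):
    """responses: students x items (0/1); q_matrix: items x skills (0/1).
    Returns students x skills counts of correct answers on items
    that require each skill."""
    n_skills = len(q_matrix[0])
    return [
        [sum(y * q[k] for y, q in zip(row, q_matrix)) for k in range(n_skills)]
        for row in responses
    ]

def proportion_correct(responses, q_matrix):
    """Sum-scores divided by the number of items requiring each skill
    (an assumed normalization; skills with no items get None)."""
    n_skills = len(q_matrix[0])
    items_per_skill = [sum(q[k] for q in q_matrix) for k in range(n_skills)]
    return [
        [s / m if m else None for s, m in zip(row, items_per_skill)]
        for row in sum_scores(responses, q_matrix)
    ]

# Toy example: three items, two skills; item 3 requires both skills.
q = [[1, 0], [0, 1], [1, 1]]
y = [[1, 0, 0],   # student A
     [1, 1, 1]]   # student B
print(sum_scores(y, q))
print(proportion_correct(y, q))
```

The resulting student-by-skill vectors are what would then be passed to a clustering method such as K-means or hierarchical clustering.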

A sampling and classification approach to construct parallel tests based on a cognitive diagnosis model. (089)
Pei-Hua Chen and Haiyan Wu
Most large-scale assessments developed on the basis of item response theory (IRT) models provide only a single summary score that indicates the overall performance level of a student. According to the No Child Left Behind Act of 2001, assessments should provide useful diagnostic information in addition to single overall scores, so that it is known whether students have mastered the assessed topics. A cognitive diagnostic test can offer a skill mastery profile for each examinee and identify areas where additional instruction is needed. In the three-parameter IRT framework, when the test information functions (TIFs) of alternative test forms are identical, the test forms can be treated as weakly parallel forms (Samejima, 1977). Most existing methods build multiple forms by using a constrained combinatorial approach to produce similar TIFs. Chen & Chang (2005) developed a new sampling and classification approach to assemble equivalent forms by mimicking the joint distribution of the discrimination and difficulty parameters. The purpose of this research is to further develop an algorithm, based on the sampling and classification approach proposed in my dissertation, for generating information-rich tests using the Cognitive Diagnostic Index, so that the tests can provide diagnostic information.