Challenges of modeling in cognitive diagnostic assessment and solving them in TIMSS data

Document Type: Research Article


1 Assistant Professor, National Center for TIMSS and PIRLS, Research Institute for Education (RIE), Tehran, Iran

2 Professor, Sociology Department, University of Tehran, Iran


Objective: Cognitive diagnostic assessment is a relatively new approach in educational measurement. It seeks more detailed information about how students learn and master cognitive attributes in school. Because it differs from conventional statistical modeling, it raises several data modeling issues.
Methods: In the present study, grade-eight science data from TIMSS were analyzed with cognitive diagnostic assessment as an empirical example, and the problems encountered were framed as modeling challenges. Each challenge is explained to highlight how it differs from conventional statistical modeling.
Results: The challenges were: unidimensionality versus multidimensionality, number of attributes, correlation between attributes, number of items per attribute, operationalization of attributes, reliability of attributes, validity, item parameters, model fit, identification and specification, convergence, and complex sampling.
Conclusion: Each topic is discussed in the context of modeling TIMSS science data, and the experience of solving these challenges is shared.


