Abstract
The single-strategy deterministic, inputs, noisy “and” gate (SS-DINA) model has previously been extended to a model called the multiple-strategy deterministic, inputs, noisy “and” gate (MS-DINA) model to address more complex situations where examinees can use multiple problem-solving strategies during the test. The main purpose of this article is to adapt an efficient estimation algorithm, the Expectation–Maximization algorithm, that can be used to fit the MS-DINA model when the joint attribute distribution is most general (i.e., saturated). The article also examines through a simulation study the impact of sample size and test length on the fit of the SS-DINA and MS-DINA models, and the implications of misfit on item parameter recovery and attribute classification accuracy. In addition, an analysis of fraction subtraction data is presented to illustrate the use of the algorithm with real data. Finally, the article concludes by discussing several important issues associated with multiple-strategies models for cognitive diagnosis.
Traditional (i.e., unidimensional) item response models (IRMs) primarily use overall scores to compare and rank examinees along a proficiency continuum. In more recent applications of traditional IRMs to large-scale assessments (e.g., the National Assessment of Educational Progress; Lee, Grigg, & Dion, 2007), these models have been used to identify what examinees with varying proficiencies can do differentially by providing exemplar problems at different points along the proficiency continuum. However, because items in such tests are not purposely designed to be diagnostic in nature, they may not provide a sufficiently informative diagnosis of students’ strengths and weaknesses. In contrast, cognitive diagnosis models (CDMs), a family of psychometric models that can provide score profiles in place of, or possibly in addition to, the overall test scores, have been developed in recent years to assist educational practitioners in evaluating students’ mastery or nonmastery of the finer grained skills or attributes required for solving problems in a test. Among the various CDMs, the deterministic, inputs, noisy “and” gate (DINA; Haertel, 1984; Junker & Sijtsma, 2001) model is one of the most popular and widely studied (e.g., de la Torre, 2009a; de la Torre & Douglas, 2004; C. Tatsuoka, 2002).
Implicitly, the DINA model, like most CDMs, is a single-strategy model. For the purposes of this article, the DINA model will be referred to as the single-strategy DINA (SS-DINA) model. Recently, the SS-DINA model has been extended to a model called the multiple-strategy DINA (MS-DINA) model to address the more complex situation where examinees can use multiple problem-solving strategies during the test (de la Torre & Douglas, 2008). Such situations are not uncommon in education. For example, Fuson et al. (1997) identified three strategies, which were invented by children at elementary schools rather than taught by teachers in class, to solve problems in the domain of multidigit addition and subtraction. These strategies are called the sequential, the combining-units-separately, and the compensating strategies. Detailed illustrations of these three strategies and their four associated attributes are presented in Appendix A.
In cases where examinees can apply more than one strategy to solve a problem, a single-strategy model such as the SS-DINA model may not adequately capture the complex nature of multiple-strategy phenomena. In these situations, CDMs that can accommodate multiple strategies such as the MS-DINA model are more appropriate. Specifically, the MS-DINA model evaluates whether an examinee fulfills at least one of the possible strategies for solving a problem. Previous research (i.e., de la Torre & Douglas, 2008) has shown that this model can correct the size of the guessing parameter estimate when an alternative strategy is being used.
Objectives
One serious limitation of the current implementation of the MS-DINA model is that it relies on a Markov chain Monte Carlo (MCMC; e.g., Carlin & Louis, 2000; Gamerman, 1997; Gelman, Carlin, Stern, & Rubin, 2003) algorithm to estimate its model parameters. Although flexible, MCMC is a computer-intensive and time-consuming estimation algorithm and, thus, can be impractical when dealing with larger data sets. Another limitation of the MS-DINA model as described and estimated by de la Torre and Douglas (2008) is that the joint distribution of the attributes is expressed as a function of a higher order ability. This is a very specific form of the joint distribution of the attributes and, therefore, may not apply to all settings.
To address these two issues, the main purpose of this article is to develop a more efficient estimation procedure, namely, an Expectation–Maximization (EM; Dempster, Laird, & Rubin, 1977) algorithm, that can estimate the parameters of the MS-DINA model when the joint attribute distribution is allowed to be as general as possible (i.e., saturated). This article also examines the impact of sample size and test length on the SS-DINA and MS-DINA model fit, and the implications of misfit on item parameter recovery and attribute classification accuracy. The remaining sections of the article are laid out as follows. The “Overview and Background” section introduces the notations and formulations for the SS-DINA and the MS-DINA models, as well as the joint distributions of the attributes. The EM algorithm for the MS-DINA model is described in the “Estimation” section. A simulation study and a real data analysis are given in the “Simulation Study” and “Fraction Subtraction Data Illustration” sections, respectively. Last, the “Discussion and Conclusion” section presents the conclusions of this article.
Overview and Background
Let
SS-DINA Model
In the SS-DINA model, the entire examinee population is partitioned into two latent groups at the level of each item. In one group, examinees possess all the attributes required for the item, and in the other group, examinees lack at least one required attribute. Let
Using the DINA model, the conditional distribution of
In this model, parameter
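The two-group logic of the SS-DINA model can be illustrated with a minimal Python sketch. The function names, the three-attribute item, and the guessing and slip values below are hypothetical choices made only for illustration; the probability rule is the standard DINA form (Junker & Sijtsma, 2001), in which examinees who possess all required attributes answer correctly unless they slip, and all other examinees answer correctly only by guessing.

```python
def ideal_response(alpha, q_row):
    """Return 1 if the examinee has every attribute the item requires, else 0."""
    return int(all(a >= q for a, q in zip(alpha, q_row)))


def p_correct(alpha, q_row, guess, slip):
    """DINA success probability: 1 - slip for the mastery group, guess otherwise."""
    eta = ideal_response(alpha, q_row)
    return (1.0 - slip) ** eta * guess ** (1 - eta)


q_row = [1, 1, 0]        # hypothetical item requiring Attributes 1 and 2
master = [1, 1, 0]       # possesses both required attributes
nonmaster = [1, 0, 1]    # lacks Attribute 2

print(p_correct(master, q_row, guess=0.2, slip=0.1))     # 0.9
print(p_correct(nonmaster, q_row, guess=0.2, slip=0.1))  # 0.2
```

Note that the nonmaster's extra attribute (Attribute 3) does not help: only the attributes named in the item's Q-matrix row affect the latent group.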
MS-DINA Model
As an extension of the SS-DINA model, the MS-DINA model proposed by de la Torre and Douglas (2008) offers the possibility that problems can be solved in multiple ways, and those alternative strategies can be decoupled from lucky guessing. To do so, the MS-DINA model requires constructing
In other words,
The item response function of the MS-DINA model is the same as Equation 1 once
The item response function for the DINA models assumes that examinees belonging to the same group with respect to item
The formulation of the item response function demonstrates two advantages associated with the SS-DINA and MS-DINA models. First, the DINA models are parsimonious because only two parameters (i.e.,
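As a hedged illustration of how the MS-DINA model differs from the SS-DINA model, the sketch below treats an examinee as belonging to the mastery group for an item when the attribute requirements of at least one strategy are fulfilled. The function names, the four-attribute item, and the parameter values are hypothetical; the sketch also assumes, as the MS-DINA model does, that the guessing and slip parameters are shared across strategies.

```python
def strategy_met(alpha, q_row):
    """True if the examinee has all attributes one strategy requires."""
    return all(a >= q for a, q in zip(alpha, q_row))


def ms_ideal_response(alpha, q_rows):
    """1 if at least one strategy's requirements are met, else 0."""
    return int(any(strategy_met(alpha, q_row) for q_row in q_rows))


def ms_p_correct(alpha, q_rows, guess, slip):
    """Same two-parameter response rule as the SS-DINA model."""
    return 1.0 - slip if ms_ideal_response(alpha, q_rows) else guess


# Hypothetical item: Strategy 1 needs Attributes 1 and 2, Strategy 2 needs 3 and 4.
q_rows = [[1, 1, 0, 0], [0, 0, 1, 1]]
alpha = [0, 0, 1, 1]  # can only apply Strategy 2

print(ms_p_correct(alpha, q_rows[:1], guess=0.2, slip=0.1))  # 0.2 (Strategy 1 only)
print(ms_p_correct(alpha, q_rows, guess=0.2, slip=0.1))      # 0.9 (both strategies)
```

When only the first strategy is modeled, this examinee's correct responses can only be attributed to guessing, which is the mechanism behind the inflated guessing estimates discussed in this article.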
Joint Distribution of Attributes
In addition to the item response function given in the previous section, the DINA models also need the specification of the joint distribution of latent attribute patterns, each of which represents a unique latent class. The most general and fundamental formulation of the joint distribution of attributes allows for all the possible latent classes. Assuming the total number of attributes is
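To make the size of the saturated specification concrete, the following sketch enumerates every latent class for a given number of binary attributes. The function name and the uniform starting proportions are assumptions made for illustration; with K attributes there are 2^K classes and therefore 2^K - 1 free class proportions.

```python
from itertools import product


def latent_classes(num_attributes):
    """All binary attribute patterns; each pattern is one latent class."""
    return [list(pattern) for pattern in product((0, 1), repeat=num_attributes)]


classes = latent_classes(3)
# Uniform proportions, used here only as a placeholder starting point.
proportions = [1.0 / len(classes)] * len(classes)

print(len(classes))             # 8
print(classes[0], classes[-1])  # [0, 0, 0] [1, 1, 1]
print(len(latent_classes(7)))   # 128
```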
Estimation
To date, the estimation of the DINA model typically uses two types of model parameter estimation algorithms. One is the EM algorithm for the DINA model with the saturated latent class specification (e.g., de la Torre, 2009b), and the other is the MCMC algorithm for the HO-DINA model (e.g., de la Torre & Douglas, 2004). Although both algorithms were originally developed for the SS-DINA model, de la Torre and Douglas (2008) have extended the MCMC algorithm for the MS-DINA model with a higher order specification of the joint distribution. This article aimed to adapt the EM algorithm for the MS-DINA model with the saturated joint attribute distribution. The estimation algorithm for the MS-DINA model differs from that of the SS-DINA model in that it needs to be able to handle multiple strategies and incorporate the
The MS-DINA EM algorithm is a straightforward extension of the SS-DINA EM algorithm. The MS-DINA EM algorithm is a two-stage process. The first stage determines the appropriate
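The general shape of such an algorithm can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the starting values of 0.2, and the use of the change in log-likelihood as the convergence criterion are all hypothetical choices. The first stage is taken here to be the computation of the ideal responses across strategies for every latent class; the second stage is a standard EM iteration with a saturated latent class distribution.

```python
from itertools import product
import math


def ideal(alpha, q_rows):
    """MS-DINA ideal response: 1 if any strategy's attributes are all mastered."""
    return int(any(all(a >= q for a, q in zip(alpha, row)) for row in q_rows))


def fit_ms_dina(responses, q_by_item, num_attributes, max_iter=500, tol=1e-4):
    """responses: 0/1 vectors per examinee; q_by_item: per item, one Q row per strategy."""
    classes = list(product((0, 1), repeat=num_attributes))
    num_items = len(q_by_item)
    # Stage 1: ideal responses per latent class and item (fixed during EM).
    eta = [[ideal(c, q_by_item[j]) for j in range(num_items)] for c in classes]
    guess = [0.2] * num_items   # arbitrary starting values
    slip = [0.2] * num_items
    prior = [1.0 / len(classes)] * len(classes)
    log_lik = -math.inf
    for _ in range(max_iter):
        previous = log_lik
        log_lik = 0.0
        exp_n = [[0.0, 0.0] for _ in range(num_items)]  # expected group sizes
        exp_r = [[0.0, 0.0] for _ in range(num_items)]  # expected correct counts
        class_mass = [0.0] * len(classes)
        # E-step: posterior class probabilities for each examinee.
        for x in responses:
            joint = []
            for c_idx in range(len(classes)):
                like = prior[c_idx]
                for j in range(num_items):
                    p = 1.0 - slip[j] if eta[c_idx][j] else guess[j]
                    like *= p if x[j] else 1.0 - p
                joint.append(like)
            marginal = sum(joint)
            log_lik += math.log(marginal)
            for c_idx, like in enumerate(joint):
                post = like / marginal
                class_mass[c_idx] += post
                for j in range(num_items):
                    group = eta[c_idx][j]
                    exp_n[j][group] += post
                    exp_r[j][group] += post * x[j]
        # M-step: closed-form updates of guessing, slip, and class proportions.
        for j in range(num_items):
            if exp_n[j][0] > 0:
                guess[j] = exp_r[j][0] / exp_n[j][0]
            if exp_n[j][1] > 0:
                slip[j] = 1.0 - exp_r[j][1] / exp_n[j][1]
        prior = [mass / len(responses) for mass in class_mass]
        if abs(log_lik - previous) < tol:
            break
    # log_lik is evaluated at the parameters in effect before the final M-step.
    return guess, slip, prior, log_lik
```

Passing a single Q-matrix row per item reduces the sketch to the SS-DINA case, which is why the same routine can serve both models.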
Simulation Study
Equipped with the newly adapted EM algorithm, a simulation study was designed to examine the performance of the algorithm on the MS-DINA model calibration. The simulation study also aimed to investigate the impact of sample size and the test length on the SS-DINA and MS-DINA model fit, and the implications of misfit on item parameter recovery and attribute classification accuracy. In addition, given that previous research (e.g., de la Torre & Douglas, 2008) has been conducted through the MCMC algorithm for the SS-DINA and MS-DINA models, the article also compared the current simulation results obtained through the EM algorithms for the SS-DINA and MS-DINA models with the previous simulation results using the MCMC algorithms for both models in de la Torre and Douglas.
Design
The simulation study examined four factors: generating or true model (SS-DINA and MS-DINA models), fitted model (SS-DINA and MS-DINA models), sample size (N = 500, 1,000, and 2,000), and test length (
To generate the item response data, the examinee’s parameters
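A data-generating step of this kind can be sketched as follows. The function name, the seed handling, and in particular the uniform draw of attribute patterns are assumptions made for illustration and may differ from the generating distribution used in the study. Supplying one Q-matrix row per item yields SS-DINA data, and supplying several rows per item yields MS-DINA data.

```python
import random
from itertools import product


def simulate_responses(num_examinees, q_by_item, guess, slip, num_attributes, seed=0):
    """Draw attribute patterns and 0/1 responses under a DINA-type model."""
    rng = random.Random(seed)
    classes = list(product((0, 1), repeat=num_attributes))
    alphas, responses = [], []
    for _ in range(num_examinees):
        alpha = rng.choice(classes)  # assumption: uniform over latent classes
        row = []
        for j, q_rows in enumerate(q_by_item):
            # Mastery group if at least one strategy's requirements are met.
            eta = any(all(a >= q for a, q in zip(alpha, q_row)) for q_row in q_rows)
            p = 1.0 - slip[j] if eta else guess[j]
            row.append(int(rng.random() < p))
        alphas.append(alpha)
        responses.append(row)
    return alphas, responses


q_by_item = [[[1, 0]], [[0, 1]], [[1, 0], [0, 1]]]  # hypothetical 3-item test
alphas, data = simulate_responses(5, q_by_item, [0.2] * 3, [0.1] * 3, num_attributes=2)
print(len(data), len(data[0]))  # 5 3
```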
Results
Item parameter estimates
The item parameter (i.e.,
Table 1 shows the results of
Summary of
Note. MAD = mean absolute deviation;
SS-DINA = single-strategy deterministic, inputs, noisy “and” gate; MS-DINA = multiple-strategy deterministic, inputs, noisy “and” gate.
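For readers who wish to reproduce this kind of summary, the mean absolute deviation can be computed as in the sketch below. The function name and the example estimates are hypothetical; the values are not results from this study.

```python
def mean_absolute_deviation(estimates, true_values):
    """Average absolute difference between estimates and generating values."""
    deviations = [abs(e - t) for e, t in zip(estimates, true_values)]
    return sum(deviations) / len(deviations)


# Hypothetical guessing estimates for three items whose true value is 0.2.
print(round(mean_absolute_deviation([0.25, 0.18, 0.31], [0.2, 0.2, 0.2]), 4))  # 0.06
```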
Substantial differences in terms of estimation accuracy were observed between the two types of misfit as shown in columns 3 through 7. Using the SS-DINA model to fit the data generated by the MS-DINA model produced less accurate estimates than using the MS-DINA model to fit the SS-DINA data. As will be shown later, the overestimation of the guessing parameter was clearly observed at the item level. Because the SS-DINA model cannot detect the usage of alternative strategies, and treats unexpected correct responses as the product of random guessing, the guessing parameters were always overestimated. The discrepancies between these two incorrectly fitted models were largely caused by overestimation of the guessing parameter by the SS-DINA model.
Moreover, in examining the results from all possible conditions, it can be seen that test length and sample size were also important factors that affected the parameter estimation accuracy. In general, as expected, larger sample sizes and longer test length yielded more accurate item parameter estimates. However, the impact of test length and sample sizes was more pronounced and complex when incorrectly fitted models were involved, and the magnitudes of the difference varied depending on the nature of the fitted models. For instance, when the data were incorrectly fitted with the MS-DINA model, the shorter test length produced less accurate item parameter estimates of
In addition to the point estimates of
Average of Computed
Note. SS-DINA = single-strategy deterministic, inputs, noisy “and” gate; MS-DINA = multiple-strategy deterministic, inputs, noisy “and” gate.
Chosen from the results on all possible simulation conditions, the estimates of
Attribute classification
Estimated attribute patterns for all examinees were compared with the true attribute patterns to see how accurately the examinees’ attribute patterns can be recovered by the different DINA models. Table 3 shows the percent of correct classification for the attribute vector
Percent of Correct Attribute Classification of
The individual attribute classification rates shown in Tables C4 and C5 were generally consistent with the attribute vector rates. When models were correctly specified, the SS-DINA model always yielded higher correct classification rates for
MS-DINA model can provide additional information about attribute classification of
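The two classification summaries discussed here, agreement on the whole attribute vector and agreement on each individual attribute, can be computed as in this sketch. The function name and the four example examinees are hypothetical.

```python
def classification_rates(estimated, true):
    """Return (vector-level rate, list of attribute-level rates)."""
    num_examinees = len(true)
    num_attributes = len(true[0])
    vector_rate = sum(e == t for e, t in zip(estimated, true)) / num_examinees
    attribute_rates = [
        sum(e[k] == t[k] for e, t in zip(estimated, true)) / num_examinees
        for k in range(num_attributes)
    ]
    return vector_rate, attribute_rates


true = [[1, 0, 1], [0, 0, 1], [1, 1, 1], [0, 1, 0]]
estimated = [[1, 0, 1], [0, 1, 1], [1, 1, 1], [0, 1, 1]]
print(classification_rates(estimated, true))  # (0.5, [1.0, 0.75, 0.75])
```

Because a vector counts as correct only when every attribute in it is correct, the vector-level rate can never exceed the lowest attribute-level rate.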
Fraction Subtraction Data Illustration
Data
To further compare the SS-DINA and MS-DINA models, real data involving responses of 536 middle school students to 15 fraction items that can be solved by two strategies were analyzed. After the original data with responses to 20 items were introduced by K. K. Tatsuoka (1987, 1990), a subset of the data with 15 items was analyzed in several studies (de la Torre, 2009b; de la Torre & Douglas, 2004, 2008; Mislevy, 1996; C. Tatsuoka, 2002). For these data, students solved the mixed number subtraction problems using either Strategy A or B. The major difference between Strategies A and B lies in how students dealt with mixed numbers. Students who used Strategy A separated mixed numbers into the whole number and fractional parts before performing subtraction on each part. In contrast, students who used Strategy B converted mixed numbers to improper fractions prior to the subsequent subtraction operation. A total of seven attributes were involved across the two strategies. These seven attributes, originally described by Mislevy (1996), are summarized in Table C7. The attributes for Strategy A were numbered as 1, 2, 3, 4, and 5, and for Strategy B, 1, 2, 5, 6, and 7. The Q-matrices for both strategies are given in Table C8.
Results
Item parameter estimation
The SS-DINA (assuming the use of Strategy A only) and MS-DINA (assuming the use of both Strategies A and B) models have been used to fit the fraction subtraction data, and the model parameter estimates are summarized in Table 4. As expected, the results of parameter estimates in the two DINA models, as well as their corresponding theoretical
Parameter Estimates for the Fraction Subtraction Data Based on the SS-DINA and MS-DINA Models.
Note. DINA = deterministic, inputs, noisy “and” gate; SS-DINA = single-strategy deterministic, inputs, noisy “and” gate; MS-DINA = multiple-strategy deterministic, inputs, noisy “and” gate.
In addition, to compare the relative model fit between these two models at the test level, the statistics of the
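Why model complexity matters in this comparison can be seen by counting parameters. The sketch below assumes a saturated attribute distribution, so each model has two parameters per item plus 2^K - 1 free class proportions. The information criteria shown (AIC and BIC) are commonly used relative fit indices and are an assumption here; they may not be the exact statistics reported in this article, and no log-likelihood values from the study are reproduced.

```python
import math


def num_parameters(num_items, num_attributes):
    """Two item parameters per item plus 2^K - 1 free class proportions."""
    return 2 * num_items + 2 ** num_attributes - 1


def aic(log_lik, num_params):
    return -2.0 * log_lik + 2.0 * num_params


def bic(log_lik, num_params, sample_size):
    return -2.0 * log_lik + num_params * math.log(sample_size)


# Fraction subtraction data: 15 items; 5 attributes under Strategy A alone,
# 7 attributes when Strategies A and B are modeled jointly.
print(num_parameters(15, 5))  # 61
print(num_parameters(15, 7))  # 157
```

With 96 additional parameters, the MS-DINA model must improve the log-likelihood substantially before a penalized criterion favors it.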
Attribute classification
The SS-DINA model can provide diagnostic information on Attributes 1 to 5; the MS-DINA model can offer examinees additional diagnostic information on Attributes 6 and 7. The classifications of Attributes 1 to 5, which are common to both models, were compared. The SS-DINA and MS-DINA models had a high degree of attribute classification agreement in most cases. The agreement rates for the attributes were around 95% or higher, with the exception of Attribute 3, for which the rate was 89.74%. Further investigation showed that the discrepancy in the Attribute 3 classifications arose because the proportion of examinees estimated to have mastered Attribute 3 was higher under the SS-DINA model than under the MS-DINA model. In addition, the SS-DINA and MS-DINA models produced the same attribute vector classification for 81% of the students.
Discussion and Conclusion
de la Torre (2009b) discussed the EM and MCMC algorithms for estimating the SS-DINA model parameters. A previous study (de la Torre & Douglas, 2008) implemented the MCMC algorithm for the SS-DINA and MS-DINA models. This article shows that the EM algorithm for the SS-DINA model can be extended to the multiple-strategy situation. The EM algorithm is also more efficient. On a desktop computer with a 3.0 GHz processor and 1 GB of memory, fitting the MS-DINA model took about 20 seconds when the convergence criterion was set at 0.0001 for the fraction subtraction data set used in this article. For comparison purposes, the higher order multiple-strategy deterministic, inputs, noisy “and” gate (HO-MS-DINA) model MCMC algorithm (de la Torre & Douglas, 2008), which was run on the same computer, took almost 2 hours to analyze the same fraction subtraction data using a single chain of 250,000 iterations. Additional time would be needed to run multiple chains (e.g., four) to examine convergence. In addition to its efficiency, both the simulation and real data analyses showed that the EM algorithm produced results that were comparable with those obtained using the MCMC algorithm.
Another contribution of this article is the systematic investigation of the impact of model misspecification, sample size, and test length on model parameter estimation and attribute classification for both the SS-DINA and MS-DINA models. A misspecified MS-DINA model always yields more accurate item parameter estimates and has generally better attribute classification accuracy than a misspecified SS-DINA model. These findings may have important practical implications because with real data the true underlying model and number of strategies employed are seldom, if ever, known. For this reason, it might be “safer” to fit the DINA model using the MS-DINA approach because the MS-DINA model is more robust against model misspecification than the SS-DINA model. However, the authors would like to note that such potential gain from the MS-DINA model is contingent on some prerequisites, such as the availability of correct multiple-strategy Q-matrices, and that constructing and validating Q-matrices for multiple strategies can be more challenging than for a single strategy. As such, the advantages of the MS-DINA model may not be realized in situations where additional effort cannot be expended to guarantee that the Q-matrices of the MS-DINA model are correct. In addition, a more comprehensive approach to model selection should also take into account the complexity of the model vis-à-vis its fit to the data.
The fraction subtraction data example illustrates that both the SS-DINA and MS-DINA models produce comparable item parameters in most cases and that the SS-DINA model provides better model fit than does the MS-DINA model when model complexity is taken into account. Given these findings, the necessity of the MS-DINA model might be called into question. Although it is true that there is no guarantee that the MS-DINA model can outperform the SS-DINA model all the time, the authors would like to argue that statistical criteria (e.g., model fit indices) are not the only standards to evaluate the necessity and utility of more complex models, such as the MS-DINA model. In addition to statistical evaluation, it would be helpful to also pay attention to non-statistical rationales for considering the MS-DINA model.
Multiple-strategies CDMs were developed in response to existing practical needs and their potential applications in the near future. Single-strategy problem solving may be too strict an assumption and may not always be appropriate for all kinds of problems encountered in practical educational settings. School teachers typically teach students more than one way to solve problems; at the same time, students may spontaneously invent multiple strategies on their own. It is therefore reasonable to expect more mature learners to have a greater likelihood of applying multiple strategies to solve problems in more complex and advanced teaching and learning environments. Moreover, the unique feature of CDMs that distinguishes them from traditional item response theory (IRT) lies in their ability to provide refined attribute profiles to diagnose what skills students have or have not mastered. As indicated in the real data illustration, although the MS-DINA model is not necessarily better than the SS-DINA model in the statistical sense, the MS-DINA model can provide attribute profiles that contain two more attributes than those of the SS-DINA model. Such additional information is afforded by using a multiple-strategy model, and represents the type of information one would expect of cognitively diagnostic assessments.
For future research, it is worthwhile to further investigate whether the substantial difference in theoretical and empirical SEs is due to the complexity of the Q-matrix in the simulation study. As indicated by the simulation results shown in a previous study by de la Torre et al. (2010), the structure of the Q-matrix can affect the accuracy of item parameter estimates. In the future, a simpler Q-matrix where items are measured by fewer attributes can be considered to examine its impact on the SEs. The current MS-DINA model assumes that the guessing and slip probabilities are identical across different strategies. This assumption could be too strict and may not be suitable in many practical situations. A straightforward extension of this article is to develop a CDM that allows the guessing and slip parameters of the same item to change across strategies. The current MS-DINA model also assumes that the application of each strategy is equally difficult (de la Torre & Douglas, 2008). Therefore, students can apply different strategies from item to item throughout the entire test according to the MS-DINA model. However, this flexibility does not always reflect the truth for all real data applications. In the fraction subtraction data set, it is more reasonable to assume that students have a tendency to use one particular strategy over the other. As already noted by de la Torre and Douglas (2008), another extension of the MS-DINA model can incorporate a latent variable to associate individual students with each strategy. Such a latent variable can define which strategy is more likely to be chosen than others by each student. It would also be interesting to explore additional variations of the MS-DINA model. For example, a more general version of the MS-DINA model would allow for different numbers of strategies for the different items. This would subsume the current MS-DINA model, and the model where, in the same test, some items can be solved using only a single strategy, and the remaining items using multiple strategies. Developing efficient algorithms to accompany those more general extensions of the MS-DINA model would also be necessary.
The main focus of this article is the applicability of the CDMs in multiple-strategies settings. The two Q-matrices used in the fraction subtraction illustration in this article were theoretically derived (Mislevy, 1996), and have been used in the previous and current studies for the purpose of examining multiple-strategy cognitive modeling in fraction subtraction. Although the authors assume that the two Q-matrices are correct for the purpose of this article, they do not necessarily assume that the Q-matrices, in the single or multiple-strategy context, are always correct. However, examining the correctness of Q-matrices is beyond the scope of the current work. As is well known, Q-matrix validation is another key component in cognitive diagnostic assessment in addition to diagnostic modeling. For example, as noted by DeCarlo (2011), misspecification of the Q-matrix can have serious consequences on the latent class classification obtained from the specific CDMs. Although specification of the Q-matrix is usually initiated by test content domain experts, psychometricians have been taking more and more active roles in Q-matrix validation for several reasons. First, Q-matrix construction and validation requires tremendous cost in terms of time and human effort. Second, relying only on experts’ judgment can lead to Q-matrix construction and validation processes that are highly subjective in nature. Recently, a few studies on Q-matrix validation from psychometric perspectives have been undertaken (e.g., Barnes, 2010; de la Torre, 2008; de la Torre & Chiu, 2010; Huo & de la Torre, 2013; Liu, Xu, & Ying, 2012, 2013). However, there have been no particular psychometric methods for determining Q-matrices in multiple-strategy situations thus far. Technically, validating multiple Q-matrices is a straightforward extension of single Q-matrix validation. However, advancements in Q-matrix construction and validation in a multiple-strategy setting require theoretically driven judgments that explicitly specify the need to classify problem-solving skills into more than one strategy. This would be in addition to the subsequent psychometric Q-matrix validation procedures. As such, it is anticipated that constructing and validating multiple Q-matrices in multiple-strategy settings would be more complicated than in single-strategy settings. To the extent that it is worthwhile to incorporate more than one Q-matrix, it would also be necessary to investigate how to validate multiple Q-matrices using psychometric procedures.
Last, the authors want to address the identifiability issue associated with the MS-DINA model in particular, and the CDMs in general. Compared with the SS-DINA model, the MS-DINA model has greater complexity both in terms of the number of attributes involved, and how the latent or ideal response
In sum, as more potential applications call for more complex model specifications, identification issues may also emerge. Despite many theoretical efforts dealing with identifiability issues on item parameters and attribute patterns, some empirical approaches can be easily implemented in real data analysis to examine problems with model identification. For example, in the fraction subtraction data analysis, real data estimates were used as the generating parameters to simulate item responses. The results showed that the item parameters can be recovered very well. As such, the simulation study provided evidence that model identification is not an issue in fitting the MS-DINA model to the fraction subtraction data.
Footnotes
Appendix A
Appendix B
Appendix C
The Q-Matrices for the Fraction Subtraction Data.
Columns A1 to A5 are Attributes 1 to 5 under Strategy A; columns B1, B2, B5, B6, and B7 are Attributes 1, 2, 5, 6, and 7 under Strategy B.

| Item | A1 | A2 | A3 | A4 | A5 | B1 | B2 | B5 | B6 | B7 |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
| 2 | 1 | 1 | 1 | 1 | 0 | 1 | 0 | 0 | 1 | 0 |
| 3 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
| 4 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 0 |
| 5 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 1 | 1 |
| 6 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 0 | 1 | 0 |
| 7 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 0 | 1 | 0 |
| 8 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 |
| 9 | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 0 |
| 10 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 |
| 11 | 1 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 0 |
| 12 | 1 | 0 | 1 | 1 | 0 | 1 | 1 | 0 | 1 | 0 |
| 13 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 0 | 1 | 1 |
| 14 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 15 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 0 | 1 | 1 |
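
As a worked illustration of how the two Q-matrices operate together, the sketch below encodes the table rows as bit strings (Strategy A entries for Attributes 1 to 5, then Strategy B entries for Attributes 1, 2, 5, 6, and 7), expands them to the common seven-attribute space, and lists the items for which a student is in the mastery group. The variable and function names are hypothetical, and the example student is invented for illustration.

```python
# One string per item: Strategy A bits, a space, then Strategy B bits.
TABLE = [
    "10000 10000", "11110 10010", "10000 10000", "11111 10110", "00100 01111",
    "11110 11010", "11110 11010", "11000 11000", "10100 10010", "10111 11100",
    "10100 11010", "10110 11010", "11110 11011", "11111 11111", "11110 11011",
]
A_COLUMNS = (1, 2, 3, 4, 5)
B_COLUMNS = (1, 2, 5, 6, 7)


def expand(bits, columns, num_attributes=7):
    """Place a strategy's entries at their attribute positions (1-based)."""
    row = [0] * num_attributes
    for bit, column in zip(bits, columns):
        row[column - 1] = int(bit)
    return row


Q_A = [expand(entry.split()[0], A_COLUMNS) for entry in TABLE]
Q_B = [expand(entry.split()[1], B_COLUMNS) for entry in TABLE]


def mastered_items(alpha, q_matrices):
    """Items (1-based) for which at least one strategy's requirements are met."""
    items = []
    for j in range(len(TABLE)):
        rows = [q[j] for q in q_matrices]
        if any(all(a >= r for a, r in zip(alpha, row)) for row in rows):
            items.append(j + 1)
    return items


# Hypothetical student who has mastered only the Strategy B attributes.
alpha = [1, 1, 0, 0, 1, 1, 1]
print(mastered_items(alpha, [Q_A]))       # [1, 3, 8]
print(mastered_items(alpha, [Q_A, Q_B]))  # all 15 items
```

Under Strategy A alone, this student's correct answers to the other 12 items could only be explained as guesses, whereas the two-strategy specification places the student in the mastery group for every item.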
Appendix D
Authors’ Note
The opinions expressed herein are those of the authors, and do not necessarily represent the views of the National Science Foundation or the National Institute on Alcohol Abuse and Alcoholism.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by the National Science Foundation Grant DRL-0744486 and the National Institute on Alcohol Abuse and Alcoholism Grant R01AA019511.
