An Explicit Form With Continuous Attribute Profile of the Partial Mastery DINA Model

Abstract

Cognitive diagnosis models (CDMs) are the statistical framework for cognitive diagnostic assessment in education and psychology. They generally assume that subjects’ latent attributes are dichotomous (mastery or nonmastery), which is quite deterministic. As an alternative to dichotomous attribute mastery, recent literature has drawn attention to a continuous attribute mastery format. To obtain finer-grained attribute mastery information for more precise diagnosis and guidance, this article proposes an equivalent but more explicit form of the partial-mastery deterministic inputs, noisy “and” gate (DINA) model, termed the continuous attribute profile (CAP)-DINA form. A parameter estimation algorithm for this form, based on Bayesian techniques with a Markov chain Monte Carlo algorithm, is also presented. Two simulation studies are then conducted to explore parameter recovery and model misspecification, and the results demonstrate that the CAP-DINA form performs robustly with satisfactory efficiency in both respects. A real data study of an English test also indicates that it fits better than the DINA model.

Keywords

cognitive diagnosis model, continuous attribute profile, DINA

1. Introduction

Cognitive diagnosis (CD) is a new trend in the development of contemporary psychometrics theory (von Davier & Lee, 2019). One of the main advantages of CD is that it can provide more in-depth and detailed diagnostic information to the subjects so that they can be provided with person-oriented remedial teaching and guidance for greater improvement (Leighton & Gierl, 2007; Rupp et al., 2010).

CD models (CDMs) are statistical models integrated with cognitive variables. They define the core structure of cognitive diagnostic assessment (CDA), and their properties directly determine the accuracy and effectiveness of CDA. The well-known CDMs developed in the last 20 years include the deterministic inputs, noisy “and” gate (DINA) model (Haertel, 1989; Junker & Sijtsma, 2001); the noisy inputs, deterministic “and” gate (NIDA) model (Maris, 1999); the deterministic inputs, noisy “or” gate (DINO) model (Templin & Henson, 2006); the general diagnostic model (von Davier, 2008); and the generalized DINA (GDINA) model (de la Torre, 2011).

Most CDMs developed in the literature adopt discrete attribute mastery variables. That is, the subjects’ attribute mastery level is characterized by several discrete values. Initially, dichotomous values were used in CDMs (e.g., DINA, DINO, and NIDA) to characterize the attribute mastery level, with 0 representing nonmastery and 1 representing mastery (Chen et al., 2014; Cheng, 2009; Hsu et al., 2013; Mao & Xin, 2013; Wang, 2013; Wang et al., 2011). However, from a statistical point of view, it is deterministic to use only 0 and 1 to characterize two levels of mastery (complete mastery or complete nonmastery; Zhan, Wang, et al., 2019). The framework of polytomous attributes was developed to alleviate the problem (Chen & de la Torre, 2013; de la Torre et al., 2010; Karelitz, 2004), but it did not solve it, because the subjects’ diagnosis at each attribute level is still an absolute and imprecise value, 0 or 1. In fact, the subjects’ attribute mastery level should not be deterministic: even subjects with the same value on a certain attribute (i.e., all 0s or all 1s) are likely to differ to some extent in their degree of mastery. In such cases, the differences in attribute mastery degree among subjects may be substantially amplified or narrowed, resulting in less refined diagnoses and radically different guidance and remedies.

In CDMs, for example, the subject’s attribute profile is usually estimated via expected a posteriori (EAP), which is considered the simplest and fastest method and conforms to traditional Bayesian statistical thinking (Huebner & Wang, 2011). During estimation, the subject’s attribute mastery level is initially continuous, represented by the attribute mastery probability (AMP), and is only artificially discretized at the end to facilitate CD. Suppose Subjects A and B obtain AMPs of 0.4 and 0.6, respectively, on a certain attribute during estimation. Their attribute mastery levels will then be truncated to 0 (complete nonmastery) and 1 (complete mastery), respectively, which amplifies their difference; guidance and remedy will therefore be provided for A but not for B. In fact, neither of them has a good attribute mastery level, and whether B needs guidance and remedy should also be considered. Likewise, if their AMPs are 0.6 and 0.9, respectively, the problem still exists: their difference will be narrowed, and A will not get reasonable guidance and remedy. In addition, when partial or incomplete mastery exists among subjects, the standard CDMs may not do a good job of explaining the heterogeneity in the responses (Shang et al., 2021). Therefore, it is reasonable to introduce continuous variables to describe the subjects’ attribute mastery degree in CDMs.

Before CDMs were developed, item response theory (IRT) was already well established and widely used in educational and psychological applications. It refers to a family of probabilistic models that attempt to explain the relationship between latent traits and their manifestations (Baker & Kim, 2004; DeMars, 2010; Embretson & Reise, 2000; Embretson & Steven, 2013; Hambleton & Swaminathan, 1985; Sijtsma & Junker, 2006; Van der Linden & Hambleton, 1997). In fact, multidimensional IRT (MIRT) models contain the idea of continuous latent traits and can also provide certain diagnostic information (Ackerman, 1994; Embretson & Yang, 2013; Reckase, 1997, 2009; Stout, 2007; Wang & Nydick, 2015; Whitely, 1980); a typical one is the multicomponent latent trait model (MLTM; Whitely, 1980). However, it should be noted that the latent trait θ in MIRT has no boundaries, and even if its absolute location is determined, its relative location cannot be determined, so MIRT cannot provide a direct and accurate diagnosis for subjects (Shang et al., 2021).

CDMs can also be assumed to be constructed in an unobservable multidimensional continuum to calibrate a subject’s location. Nevertheless, there is a major difference between this continuous view of CDMs and MIRT: each attribute mastery (α) in CDMs lies in the range $[0, 1]$ , while each latent trait (θ) in MIRT lies in the range $(- \infty, + \infty)$ . This continuous approach is not only in line with the statistical point of view, and thus more reasonable and precise, but also has the boundary $[0, 1]$ to determine relative location and provide a direct diagnosis. Hong et al. (2015) developed a continuous conjunctive model (CCM). It assumes conjunctive relations among the latent abilities, as many CDMs do, but it has no item-level parameters, and the stochastic components are absorbed into the subjects’ ability profiles. Zhan et al. (2018) proposed a different conceptualization of attribute mastery as a probabilistic concept and two models, the probabilistic-input, noisy conjunctive (PINC) model for independent attributes and the higher order PINC model for correlated attributes, to provide a finer description of mastery status, but the attributes are still binary.

Shang et al. (2021) proposed several more specific and flexible CDMs, the partial-mastery CDMs (PM-CDMs), which allow for partial mastery based on continuous attributes. They emphasized that these models are “mixed membership generations of the binary attribute CDMs.” However, the construction and estimation of these PM-CDMs are somewhat complicated. Because the DINA model is a simple and intuitive CDM that can be easily estimated, this article introduces a DINA-type form for the continuous attribute profile (CAP), which is in fact an equivalent form of the PM-DINA model. To avoid confusion, it is referred to below as the CAP-DINA form. The CAP-DINA form is much simpler in terms of form, construction, and the data-generating process of subjects, which gives it a computational advantage.

The remainder of this article is structured as follows. In Section 2, the CAP-DINA form is proposed from a different perspective after reviewing the original DINA model; then, in Section 3, two simulation studies are presented. Study 1 simulates various conditions to explore the parameter recovery of the CAP-DINA form but with data-generating process of subjects different from the PM-DINA model, and the simulation in Study 2 compares the model misspecification of DINA and the CAP-DINA form. Section 4 applies the CAP-DINA form to English test data and demonstrates a good model fit. Finally, this article is concluded with a discussion and some directions for further research.

2. The CAP-DINA Form

2.1. The DINA Model

The DINA model is a simple, intuitive parameterized model that is easily estimated and has received much attention in recent CDM literature. As a result, we will take DINA as an example in this study. We assume the number of attributes measured by the test is K. The item response function (IRF) for the DINA model, which represents the correct response probability of Subject i on item j, is defined as follows:

P (x_{i j} = 1 | α_{i}) = {(1 - s_{j})}^{η_{i j}} g_{j}^{1 - η_{i j}},

in which

η_{i j} = \prod_{k = 1}^{K} α_{i k}^{q_{j k}},

where $x_{i j}$ denotes the response of Subject i to Item j ( $j = 1, \dots, J$ ), with 1 representing a correct response and 0 representing an incorrect response. $g_{j}$ and $s_{j}$ represent the guessing and slipping parameters for Item j, respectively. $η_{i j}$ represents the ideal response of Subject i with binary attribute profile vector $α_{i} = {(α_{i 1}, α_{i 2}, \dots, α_{i K})}^{′}$ on Item j with binary attribute vector $q_{j} = {(q_{j 1}, q_{j 2}, \dots, q_{j K})}^{′}$ . The ideal response takes a value of 1 if Subject i possesses all the attributes required for Item j and a value of 0 if Subject i lacks at least one of the required attributes. Correspondingly, the correct response probability $P (x_{i j} = 1 | α_{i})$ is $1 - s_{j}$ in the first case and $g_{j}$ in the second.
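The DINA item response function above can be illustrated with a minimal Python sketch. The function name `dina_prob` and the item values (three attributes, $s_{j} = 0.2$ , $g_{j} = 0.1$ ) are hypothetical and chosen only for illustration.

```python
def dina_prob(alpha, q, s, g):
    """Correct-response probability under the DINA model (illustrative sketch)."""
    # Ideal response: 1 only if every attribute required by the item is mastered.
    eta = 1
    for a_k, q_k in zip(alpha, q):
        eta *= a_k ** q_k  # 0 ** 0 == 1 in Python, so unrequired attributes drop out
    return (1 - s) ** eta * g ** (1 - eta)


# Hypothetical item requiring attributes 1 and 3.
q = (1, 0, 1)
p_master = dina_prob((1, 1, 1), q, s=0.2, g=0.1)     # all required attributes mastered
p_nonmaster = dina_prob((1, 1, 0), q, s=0.2, g=0.1)  # attribute 3 is missing
```

Under these assumed values, a subject who masters all required attributes answers correctly with probability $1 - s_{j}$ , and any other subject with probability $g_{j}$ .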

2.2. Formulation of CAP-DINA Form

Subjects’ attribute mastery in standard CDMs is dichotomous (mastery or nonmastery), which is deterministic. In addition, from a statistical point of view, a subject’s attribute mastery degree is usually determined by calculating the proportion of correct responses on that attribute: the greater the proportion, the higher the attribute mastery degree. It is therefore reasonable to assume that the attribute mastery degree is continuous. In this article, we assume that Subject i’s attribute mastery $α_{i k} (k = 1, \dots, K)$ is continuous on the interval $[0, 1]$ and represented by the attribute mastery proportion; the attribute profile $α_{i} = {(α_{i 1}, α_{i 2}, \dots, α_{i K})}^{′}$ is then continuous on the hypercube ${[0, 1]}^{K}$ . According to the conjunctive condensation rule, Subject i’s mastery proportion of the attribute profile examined by Item $j (j = 1, 2, \dots, J)$ is $\prod_{k = 1}^{K} α_{i k}^{q_{j k}}$ . We denote it by $η_{i j}$ as in the DINA model; note, however, that it is now not dichotomous but continuous on the interval $[0, 1]$ . Correspondingly, the nonmastery proportion is $1 - η_{i j}$ .

Next, the mixed membership model (Erosheva, 2005; Erosheva et al., 2004; Haberman, 1995; Manton et al., 1994) is applied as the theoretical basis to obtain the IRF. The notion of mixed membership arises naturally in the context of multivariate data analysis (Hair, 2011) when attributes collected on individuals or objects originate from a mixture of different categories or components. The assumption that individuals or objects may combine attributes from several basis categories in a stochastic manner, according to their proportions of membership in each category, is a distinctive feature of mixed membership models. The CAP-DINA form follows the same assumption. For Subject i’s mastery proportion, slipping on the item should be considered, so its contribution to the correct response probability on Item j is $η_{i j} (1 - s_{j})$ . Likewise, for Subject i’s nonmastery proportion, guessing on the item should be considered, so its contribution is $(1 - η_{i j}) g_{j}$ . The correct response probability on Item j is the sum of these two contributions, and the IRF for the CAP-DINA form is defined as follows:

p_{i j} = P (x_{i j} = 1 | α_{i}) = η_{i j} (1 - s_{j}) + (1 - η_{i j}) g_{j},

in which

η_{i j} = \prod_{k = 1}^{K} α_{i k}^{q_{j k}} .

An example of its IRF computation is given as follows:

if $s_{j} = 0.2$ , $g_{j} = 0.1$ , $q_{j} = (1, 0, 1, 0, 1)$ , $α_{i} = (0.9, 0.1, 0.8, 0.7, 0.1)$ , then

η_{i j} = {0.9}^{1} \times {0.1}^{0} \times {0.8}^{1} \times {0.7}^{0} \times {0.1}^{1} = 0.072,

p_{i j} = 0.072 \times (1 - 0.2) + (1 - 0.072) \times 0.1 = 0.1504.
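The worked example above can be reproduced with a short Python sketch of the CAP-DINA item response function. The function name `cap_dina_prob` is an illustrative choice and not part of any package.

```python
import math


def cap_dina_prob(alpha, q, s, g):
    """Correct-response probability under the CAP-DINA form (illustrative sketch)."""
    # Mastery proportion: product of the continuous masteries of the required attributes.
    eta = math.prod(a_k ** q_k for a_k, q_k in zip(alpha, q))
    # Mixture of the "mastery" branch (may slip) and the "nonmastery" branch (may guess).
    return eta * (1 - s) + (1 - eta) * g


alpha = (0.9, 0.1, 0.8, 0.7, 0.1)
q = (1, 0, 1, 0, 1)
p = cap_dina_prob(alpha, q, s=0.2, g=0.1)
print(round(p, 4))  # 0.1504
```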

2.2.1. Monotonically increasing property

For any fixed Item j, the larger the subject’s mastery proportion ( $η_{i j}$ ) is, the higher the correct response probability $p_{i j}$ is. As $η_{i j}$ goes to 1, $p_{i j}$ goes to $1 - s_{j}$ . As $η_{i j}$ goes to 0, $p_{i j}$ goes to $g_{j}$ . That is, the IRF for the CAP-DINA form is monotonically increasing with respect to $η_{i j}$ . The proof is as follows.

Taking the first derivative of Equation 3 with respect to $η_{i j}$ will yield

\frac{\partial P_{i j}}{\partial η_{i j}} = 1 - s_{j} - g_{j} .

Considering the monotonicity restriction $1 - s_{j} > g_{j}$ , the first derivative

\frac{\partial P_{i j}}{\partial η_{i j}} > 0.

So, $P_{i j}$ is a monotonically increasing function of $η_{i j}$ . Because $0 \leq η_{i j} \leq 1$ , it follows that

P_{i j} (η_{i j} = 0) \leq P_{i j} \leq P_{i j} (η_{i j} = 1),

that is,

g_{j} \leq P_{i j} \leq 1 - s_{j} .

As a result, the CAP-DINA form satisfies the monotonically increasing property with respect to $η_{i j}$ .

2.2.2. Relationships with DINA

The core difference between the CAP-DINA form and DINA is that the former assumes each subject has a CAP $α = {(α_{1}, α_{2}, \dots, α_{K})}^{′}$ , which lives on the hypercube ${[0, 1]}^{K}$ and indicates the subject’s mastery degree on each attribute of interest, while the latter assumes the attribute profile is discrete. Their responses $x_{i j}$ are generated in the same way, both following $x_{i j} \sim Bernoulli (p_{i j})$ . From Equations 3, 4, and 8, it is not difficult to see that when the attributes in the CAP-DINA form take binary values, the resulting $η_{i j}$ also takes binary values, and the CAP-DINA form reduces to the DINA model. Specifically, both models satisfy $P_{i j} (η_{i j} = 0) = g_{j}$ and $P_{i j} (η_{i j} = 1) = 1 - s_{j}$ , which are the only two correct response probabilities in the DINA model but the minimum and maximum correct response probabilities in the CAP-DINA form. As a result, the CAP-DINA form subsumes the DINA model as a special case.
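The special-case relationship and the bounds $g_{j} \leq P_{i j} \leq 1 - s_{j}$ can be checked numerically. The sketch below is only a check under assumed values (three attributes, $s_{j} = 0.2$ , $g_{j} = 0.1$ , and a coarse grid of mastery values); the function names are hypothetical.

```python
import math
from itertools import product


def dina_prob(alpha, q, s, g):
    # DINA: binary ideal response, then slip/guess.
    eta = math.prod(a ** qk for a, qk in zip(alpha, q))
    return (1 - s) ** eta * g ** (1 - eta)


def cap_dina_prob(alpha, q, s, g):
    # CAP-DINA: continuous mastery proportion, then a two-component mixture.
    eta = math.prod(a ** qk for a, qk in zip(alpha, q))
    return eta * (1 - s) + (1 - eta) * g


q, s, g = (1, 0, 1), 0.2, 0.1

# On binary profiles the two forms give the same probability.
agree_on_binary = all(
    abs(cap_dina_prob(a, q, s, g) - dina_prob(a, q, s, g)) < 1e-12
    for a in product((0, 1), repeat=3)
)

# On continuous profiles the CAP-DINA probability stays within [g, 1 - s].
grid = (0.0, 0.25, 0.5, 0.75, 1.0)
within_bounds = all(
    g - 1e-12 <= cap_dina_prob(a, q, s, g) <= 1 - s + 1e-12
    for a in product(grid, repeat=3)
)
```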

2.2.3. Relationships with MLTM

In MIRT, there is a typical model, MLTM (Whitely, 1980), that is similar to the CAP-DINA form. Both are noncompensatory diagnosis models based on products of continuous latent traits. The differences between them are as follows: (1) The latent trait $θ$ in MLTM has no boundaries, and even if its absolute location is determined, its relative location cannot be determined, so MLTM cannot provide a direct and accurate diagnosis for subjects (see Section 1); (2) the construction of MLTM is so complicated that it has computational limitations. Specifically, (a) it has demanding requirements for acceptable estimation, such as a sample size of 4,000 people and six unidimensional items per dimension, and (b) it has practical limitations when used with real data. The CAP-DINA form overcomes these limitations, as can be seen in the following simulation and real data studies.

2.2.4. Relationships with PM-DINA

Four similarities can be found between the CAP-DINA form and the PM-DINA model (Shang et al., 2021): (1) both assume that each subject has a continuous attribute mastery on the interval $[0, 1]$ , (2) both subsume DINA as a special case when the elements of the attribute profile all take the discrete values 0 or 1, (3) both apply the mixed membership model as the theoretical basis to obtain the IRF, and (4) most interestingly and importantly, the IRF in the CAP-DINA form is essentially equivalent to the IRF in the PM-DINA model. The proof is as follows.

Shang et al. (2021) also allow the subject’s attribute mastery to be partial and continuous on the hypercube ${[0, 1]}^{K}$ ; it is called the attribute mastery score and denoted by $d = (d_{1}, \dots, d_{K})$ . $α = (α_{1}, \dots, α_{K})$ is the subject’s attribute profile, which is discrete on ${\{0, 1\}}^{K}$ . $q_{j} = (q_{j 1}, \dots, q_{j K})$ provides the full attribute requirements for Item j.

For DINA model, the ideal response $ξ_{j, α}^{D I N A}$ is represented as

ξ_{j, α}^{D I N A} = I (α ≽ q_{j}),

which can also be represented as

ξ_{j, α}^{D I N A} = \prod_{k = 1}^{K} α_{k}^{q_{j k}} .

Then, the probability of a positive response to Item j, $θ_{j, α}$ , takes the form

θ_{j, α} = {(1 - s_{j})}^{ξ_{j, α}^{D I N A}} g_{j}^{1 - ξ_{j, α}^{D I N A}} .

The mastery score $d$ is then ascribed to all the possible attribute profiles $α = (α_{1}, \dots, α_{K}) \in {\{0, 1\}}^{K}$ with various probabilities as follows:

p_{α | d} = \prod_{k = 1}^{K} d_{k}^{α_{k}} {(1 - d_{k})}^{1 - α_{k}} \in [0, 1] .

For a subject with a general mastery score $d$ , the marginal probability of a positive response to Item j is a weighted combination of $θ_{j, α} = P (R_{j} = 1 | α)$ :

θ_{j, d} = P (R_{j} = 1 | d) = \sum_{α \in {\{0, 1\}}^{K}} θ_{j, α} \times p_{α | d},

in which $p_{α | d}$ denotes the mixture weight of $θ_{j, α}$ given $d$ , and $\sum_{α} p_{α | d} = 1$ .
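The mixture weights $p_{α | d}$ can be tabulated directly for a small case. The sketch below assumes a hypothetical mastery score $d = (0.9, 0.4)$ with $K = 2$ ; the function name `mixture_weight` is illustrative only.

```python
from itertools import product


def mixture_weight(alpha, d):
    """Weight p_{alpha|d}: product of d_k for mastered and 1 - d_k for unmastered attributes."""
    w = 1.0
    for a_k, d_k in zip(alpha, d):
        w *= d_k if a_k == 1 else (1 - d_k)
    return w


d = (0.9, 0.4)  # hypothetical mastery score
weights = {alpha: mixture_weight(alpha, d) for alpha in product((0, 1), repeat=len(d))}
total = sum(weights.values())  # the weights over all 2^K profiles sum to 1
```

With these assumed values, the profile $(1, 1)$ receives weight $0.9 \times 0.4 = 0.36$ and the profile $(0, 0)$ receives weight $0.1 \times 0.6 = 0.06$ .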

Next, we simplify Equation 13.

The number of all the possible attribute profiles $α = (α_{1}, \dots, α_{K}) \in {\{0, 1\}}^{K}$ is $2^{K}$ . We list them as the rows of a matrix:

α = {\begin{pmatrix} 0 & \dots & 0 & 0 & 0 \\ 0 & \dots & 0 & 0 & 1 \\ 0 & \dots & 0 & 1 & 0 \\ 0 & \dots & 0 & 1 & 1 \\ 0 & \dots & 1 & 0 & 0 \\ ⋮ & ⋮ & ⋮ & ⋮ & ⋮ \\ 1 & \dots & 0 & 1 & 0 \\ 1 & \dots & 0 & 1 & 1 \\ 1 & \dots & 1 & 0 & 0 \\ 1 & \dots & 1 & 0 & 1 \\ 1 & \dots & 1 & 1 & 0 \\ 1 & \dots & 1 & 1 & 1 \end{pmatrix}}_{2^{K} \times K} .

The rows of the α-matrix are arranged in ascending order from top to bottom according to the decimal value of each row read as a binary number; note that the subscript (row index) of each row is the decimal value of that row plus 1.

(i) Without loss of generality, suppose that Item j measures the first $m (1 \leq m \leq K)$ attributes, that is,

q_{j} = (\underbrace{1, 1, \dots, 1}_{m}, \underbrace{0, \dots, 0}_{K - m}) .

For $α = (α_{1}, \dots, α_{K}) \in {\{0, 1\}}^{K}$ , as long as $α ≽ q_{j}$ , that is, the first m values of α are 1, the ideal response $ξ_{j, α}^{D I N A}$ is always 1 regardless of whether the last $K - m$ values are 0 or 1. Each of the last $K - m$ values of α may take 0 or 1, so there are $2^{K - m}$ attribute mastery profiles satisfying $α ≽ q_{j}$ . We denote them by the set

α_{α ≽ q_{j}} = \{ (\underbrace{1, 1, \dots, 1}_{m}, \underbrace{α_{m + 1}, \dots, α_{K}}_{K - m}) : α_{m + 1}, \dots, α_{K} \in \{0, 1\} \},

in which $(α_{m + 1}, \dots, α_{K}) \in {\{0, 1\}}^{K - m}$ . It is obvious that $(\underbrace{1, 1, \dots, 1}_{m}, \underbrace{0, \dots, 0}_{K - m})$ is the smallest $α$ satisfying $α ≽ q_{j}$ , and its row subscript is its decimal value plus 1, that is, $2^{K - m} + 2^{K - m + 1} + \dots + 2^{K - 1} + 1 = 2^{K} - 2^{K - m} + 1$ , which we denote by $β_{m}$ . The largest $α$ satisfying $α ≽ q_{j}$ is $(\underbrace{1, 1, \dots, 1}_{K})$ with all elements equal to 1, and its row subscript is $2^{K}$ . Then, the subscripts of the profiles in $α_{α ≽ q_{j}}$ in Equation 14 run from $β_{m}$ to $2^{K}$ .

It can be expressed as

α_{α ≽ q_{j}} = {\begin{pmatrix} 1 & \dots & 1 & 0 & \dots & 0 & 0 & 0 \\ 1 & \dots & 1 & 0 & \dots & 0 & 0 & 1 \\ 1 & \dots & 1 & 0 & \dots & 0 & 1 & 0 \\ 1 & \dots & 1 & 0 & \dots & 0 & 1 & 1 \\ 1 & \dots & 1 & 0 & \dots & 1 & 0 & 0 \\ ⋮ & ⋮ & ⋮ & ⋮ & ⋮ & ⋮ & ⋮ & ⋮ \\ 1 & \dots & 1 & 1 & \dots & 0 & 0 & 1 \\ 1 & \dots & 1 & 1 & \dots & 0 & 1 & 0 \\ 1 & \dots & 1 & 1 & \dots & 0 & 1 & 1 \\ 1 & \dots & 1 & 1 & \dots & 1 & 0 & 0 \\ 1 & \dots & 1 & 1 & \dots & 1 & 0 & 1 \\ 1 & \dots & 1 & 1 & \dots & 1 & 1 & 0 \\ 1 & \dots & 1 & 1 & \dots & 1 & 1 & 1 \end{pmatrix}}_{2^{K - m} \times K} .

Note that each value in the first m columns is 1.

Because the number of profiles in $α_{α ≽ q_{j}}$ is $2^{K - m}$ , there are $2^{K} - 2^{K - m}$ attribute mastery profiles that do not satisfy $α ≽ q_{j}$ ; we write this as $α ≺ q_{j}$ and denote the set by $α_{α ≺ q_{j}}$ . It is obvious that $(\underbrace{1, 1, \dots, 1}_{m - 1}, 0, \underbrace{1, \dots, 1}_{K - m})$ is the largest $α$ satisfying $α ≺ q_{j}$ , and its row subscript is $β_{m} - 1$ , that is, $2^{K - 1} + 2^{K - 2} + \dots + 2^{K - m} = 2^{K} - 2^{K - m}$ . Then, the subscripts of the profiles in $α_{α ≺ q_{j}}$ run from 1 to $β_{m} - 1$ .
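The row-index bookkeeping can be verified by enumeration. The sketch below assumes $K = 5$ and $m = 2$ purely for illustration and relies on `itertools.product` yielding the binary profiles in ascending order, so that the row subscript equals the decimal value plus 1.

```python
from itertools import product

K, m = 5, 2  # hypothetical sizes chosen only for illustration
q = (1,) * m + (0,) * (K - m)

# Rows of the alpha-matrix in ascending binary order (row subscript = position + 1).
profiles = list(product((0, 1), repeat=K))

# Row subscripts of the profiles that satisfy alpha >= q_j elementwise.
rows_geq = [i + 1 for i, alpha in enumerate(profiles)
            if all(a >= qk for a, qk in zip(alpha, q))]

beta_m = 2 ** K - 2 ** (K - m) + 1  # subscript of the smallest qualifying profile
```

For these assumed sizes, the qualifying profiles occupy the contiguous rows from $β_{m} = 25$ to $2^{K} = 32$ , and the remaining rows 1 through 24 hold the profiles with $α ≺ q_{j}$ .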

According to Equation 11,

θ_{j, α_{α ≽ q_{j}}} = 1 - s_{j},

θ_{j, α_{α ≺ q_{j}}} = g_{j} .

Combining Equations 13, 16, and 17, we can get

θ_{j, d} = P (R_{j} = 1 | d) = \sum_{α_{α ≽ q_{j}}} θ_{j, α_{α ≽ q_{j}}} \times p_{α | d} + \sum_{α_{α ≺ q_{j}}} θ_{j, α_{α ≺ q_{j}}} \times p_{α | d} = (1 - s_{j}) \sum_{α_{α ≽ q_{j}}} p_{α | d} + g_{j} \sum_{α ≺ q_{j}} p_{α | d} .

Note that

\sum_{α_{α ≽ q_{j}}} p_{α | d} + \sum_{α ≺ q_{j}} p_{α | d} = 1.

First, we calculate $\sum_{α_{α ≽ q_{j}}} p_{α | d}$ .

For $d = (d_{1}, \dots, d_{K})$ ,

\sum_{α_{α ≽ q_{j}}} p_{α_{α ≽ q_{j}} | d} = p_{α_{2^{K - 1} + 2^{K - 2} + \dots + 2^{K - m} + 1} | d} + p_{α_{2^{K - 1} + 2^{K - 2} + \dots + 2^{K - m} + 2} | d} + \dots + p_{α_{2^{K}} | d} .

Here, according to Equation 12,

p_{α_{2^{K - 1} + 2^{K - 2} + \dots + 2^{K - m} + 1} | d} = (\prod_{k = 1}^{m} d_{k}^{α_{k}} {(1 - d_{k})}^{1 - α_{k}}) \times (1 - d_{m + 1}) (1 - d_{m + 2}) \times \dots \times (1 - d_{K}),

p_{α_{2^{K - 1} + 2^{K - 2} + \dots + 2^{K - m} + 2} | d} = (\prod_{k = 1}^{m} d_{k}^{α_{k}} {(1 - d_{k})}^{1 - α_{k}}) \times (1 - d_{m + 1}) (1 - d_{m + 2}) \times \dots \times (1 - d_{K - 1}) \times d_{K},

p_{α_{2^{K - 1} + 2^{K - 2} + \dots + 2^{K - m} + 3} | d} = (\prod_{k = 1}^{m} d_{k}^{α_{k}} {(1 - d_{k})}^{1 - α_{k}}) \times (1 - d_{m + 1}) (1 - d_{m + 2}) \times \dots \times (1 - d_{K - 2}) \times d_{K - 1} \times (1 - d_{K}),

p_{α_{2^{K - 1} + 2^{K - 2} + \dots + 2^{K - m} + 4} | d} = (\prod_{k = 1}^{m} d_{k}^{α_{k}} {(1 - d_{k})}^{1 - α_{k}}) \times (1 - d_{m + 1}) (1 - d_{m + 2}) \times \dots \times (1 - d_{K - 3}) \times (1 - d_{K - 2}) \times d_{K - 1} \times d_{K},

⋮

p_{α_{2^{K}} | d} = (\prod_{k = 1}^{m} d_{k}^{α_{k}} {(1 - d_{k})}^{1 - α_{k}}) \times d_{m + 1} \times d_{m + 2} \times \dots \times d_{K - 2} \times d_{K - 1} \times d_{K} .

As we can see, the first factor on the right-hand side of Equations 21 through 25 is the same. Because $α_{1}$ through $α_{m}$ are all 1, $\prod_{k = 1}^{m} d_{k}^{α_{k}} {(1 - d_{k})}^{1 - α_{k}} = d_{1} \times d_{2} \times \dots \times d_{m}$ , which is just $η$ in the CAP-DINA form with $d$ in the role of the continuous attribute profile. This is important and should be noted. Then

\begin{array}{l} \sum_{α_{α ≽ q_{j}}} p_{α_{α ≽ q_{j}} | d} = η \times [(1 - d_{m + 1}) (1 - d_{m + 2}) \times \dots \times (1 - d_{K}) \\ + (1 - d_{m + 1}) (1 - d_{m + 2}) \times \dots \times (1 - d_{K - 1}) \times d_{K} \\ + (1 - d_{m + 1}) (1 - d_{m + 2}) \times \dots \times (1 - d_{K - 2}) \times d_{K - 1} \times (1 - d_{K}) \\ + (1 - d_{m + 1}) (1 - d_{m + 2}) \times \dots \times (1 - d_{K - 3}) \times (1 - d_{K - 2}) \times d_{K - 1} \times d_{K} \\ + \dots + d_{m + 1} \times d_{m + 2} \times \dots \times d_{K - 2} \times d_{K - 1} \times d_{K}] . \end{array}

To calculate Equation 26, we just need to calculate the part in the square bracket first.

To simplify the part in the square brackets, consider the $K - m$ dimensional space. Assume that there are $K - m$ independent events with probabilities $d_{m + 1}$ , $d_{m + 2}$ , $\dots$ , $d_{K}$ , respectively. What is the sum of the probabilities of all $2^{K - m}$ mutually exclusive outcomes of these $K - m$ events? Because these outcomes exhaust all the possibilities in the $K - m$ dimensional space, the sum of their probabilities is 1. Equivalently, the bracketed sum factorizes as $\prod_{k = m + 1}^{K} [d_{k} + (1 - d_{k})] = 1$ . The bracketed part of Equation 26 is exactly this sum because it also contains all $2^{K - m}$ outcomes.

So, the sum of the part in the square bracket of Equation 26 is also 1, then

\sum_{α_{α ≽ q_{j}}} p_{α | d} = η .

Hence, the first part in Equation 18

(1 - s_{j}) \sum_{α_{α ≽ q_{j}}} p_{α | d} = (1 - s_{j}) η.

Second, we calculate $\sum_{α ≺ q_{j}} p_{α | d}$ in Equation 18.

According to Equation 19,

\sum_{α_{α ≺ q_{j}}} p_{α | d} = 1 - η,

then

g_{j} \sum_{α ≺ q_{j}} p_{α | d} = g_{j} (1 - η) .

Substitute Equations 29 and 30 into Equation 18,

θ_{j, d} = (1 - s_{j}) η + g_{j} (1 - η) .

The above is the proof for Item j measuring the first $m (1 \leq m \leq K)$ attributes.

(ii) If Item j measures any m $(1 \leq m \leq K)$ of the K attributes rather than the first m, Equation 31 can be proved in the same way.

So, the IRF in the CAP-DINA form is essentially equivalent to the IRF in the PM-DINA model.
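The equivalence can also be checked numerically by brute force. The sketch below is a sanity check rather than a proof: it assumes $K = 4$ , $s_{j} = 0.2$ , $g_{j} = 0.1$ , and randomly drawn mastery scores, and it covers every nonzero q-vector, including case (ii) where the measured attributes are not the first m. The function names are hypothetical.

```python
import math
import random
from itertools import product


def pm_dina_prob(d, q, s, g):
    """Marginal probability as a mixture over all 2^K binary profiles."""
    total = 0.0
    for alpha in product((0, 1), repeat=len(d)):
        weight = math.prod(dk if a == 1 else 1 - dk for a, dk in zip(alpha, d))
        ideal = all(a >= qk for a, qk in zip(alpha, q))
        total += ((1 - s) if ideal else g) * weight
    return total


def cap_dina_prob(d, q, s, g):
    """Closed form: eta * (1 - s) + (1 - eta) * g."""
    eta = math.prod(dk ** qk for dk, qk in zip(d, q))
    return eta * (1 - s) + (1 - eta) * g


rng = random.Random(0)
K, s, g = 4, 0.2, 0.1
max_gap = 0.0
for q in product((0, 1), repeat=K):
    if sum(q) == 0:
        continue  # skip the q-vector that measures no attribute
    d = [rng.random() for _ in range(K)]
    gap = abs(pm_dina_prob(d, q, s, g) - cap_dina_prob(d, q, s, g))
    max_gap = max(max_gap, gap)
```

The closed form needs one product over K terms per item, whereas the mixture enumerates $2^{K}$ profiles, which is one way to see the computational advantage of the explicit form.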

2.2.5. Model identifiability

The model identifiability issue is critical in the Bayesian approach. Shang et al. (2021) represent the PM-DINA model by a restricted latent class model (RLCM), which is identifiable if the Q-matrix satisfies certain structural conditions (Gu & Xu, 2020; Xu, 2017). With this equivalent RLCM representation, the identifiability of the model parameters of PM-DINA is expected to be established under a similar set of structural conditions for the Q-matrix (Shang et al., 2021). Since the CAP-DINA form is shown to be essentially equivalent to the PM-DINA model, we would also conclude that the identifiability of the CAP-DINA form could be established if the Q-matrix satisfies similar structural conditions.

2.3. Model Estimation

The parameters of the CAP-DINA form are estimated via the Bayesian technique using the Markov chain Monte Carlo (MCMC) algorithm, which is implemented in the freeware JAGS programme (Version 4.3.0; Plummer, 2015). By default, the Gibbs sampling algorithm is used by JAGS (Gelfand & Smith, 1990), and an additional tutorial on using JAGS for Bayesian CDM is provided in Zhan, Jiao, et al. (2019).

α is generated by a Gaussian copula model to allow for dependencies among the attributes, as in Shang et al. (2021). Specifically,

{\{Φ^{- 1} (α_{k}); k = 1, \dots, K\}}^{T} \sim MVN (μ, Σ),

where $Φ^{- 1}$ is the inverse cumulative distribution function of a standard normal distribution, $μ$ is a K-dimensional mean vector, and $Σ$ is a $K \times K$ covariance matrix.
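A minimal sketch of drawing one continuous attribute profile from the Gaussian copula is given below. It is written under simplifying assumptions that are not stated in the text: $Σ$ is taken to have unit variances and a single common correlation $ρ \in [0, 1]$ , and $μ$ is a common scalar mean, so that a shared-factor construction can replace a general matrix factorization. The function names are hypothetical.

```python
import math
import random


def std_normal_cdf(z):
    """Standard normal CDF (Phi), computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def draw_profile(K, rho, rng, mu=0.0):
    """Draw alpha in [0, 1]^K with Phi^{-1}(alpha) ~ MVN(mu, Sigma).

    Assumption: Sigma has unit variances and common correlation rho (0 <= rho <= 1),
    so z_k = mu + sqrt(rho) * shared + sqrt(1 - rho) * e_k has the required covariance.
    """
    shared = rng.gauss(0.0, 1.0)
    alpha = []
    for _ in range(K):
        z = mu + math.sqrt(rho) * shared + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0)
        alpha.append(std_normal_cdf(z))  # map the normal score onto [0, 1]
    return alpha


rng = random.Random(1)
profiles = [draw_profile(K=3, rho=0.8, rng=rng) for _ in range(1000)]
```

For a general covariance matrix, the shared-factor step would be replaced by a Cholesky factor of $Σ$ ; the mapping through $Φ$ stays the same.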

The hyperprior of $μ$ is conjugate and specified as follows:

μ \sim MVN (0, Σ_{μ}),

and the hyperprior of $Σ_{μ}$ is

Σ_{μ} \sim InvWishart (R, K),

where $R$ is a K-dimensional identity matrix.

The response of Subject i to Item j is assumed to be independently distributed following a Bernoulli distribution:

x_{i j} \sim Bernoulli (p_{i j}),

where $p_{i j}$ is given in Equation 3.
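The response-generation step can be sketched as follows. The Q-matrix, item parameters, and attribute profiles below are hypothetical placeholders (profiles are drawn uniformly here only to keep the sketch short), and `simulate_responses` is an illustrative name.

```python
import math
import random


def simulate_responses(alphas, Q, s, g, rng):
    """Generate a binary response matrix from the CAP-DINA form."""
    data = []
    for alpha in alphas:
        row = []
        for q_j, s_j, g_j in zip(Q, s, g):
            eta = math.prod(a ** qk for a, qk in zip(alpha, q_j))
            p = eta * (1 - s_j) + (1 - eta) * g_j
            row.append(1 if rng.random() < p else 0)  # Bernoulli(p) draw
        data.append(row)
    return data


rng = random.Random(2)
Q = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0)]  # hypothetical 4-item Q-matrix
s = [0.1] * len(Q)
g = [0.1] * len(Q)
alphas = [[rng.random() for _ in range(3)] for _ in range(5)]  # placeholder profiles
X = simulate_responses(alphas, Q, s, g, rng)
```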

The priors of the item parameters $s_{j}$ and $g_{j}$ are specified as follows:

s_{j} \sim Beta (1, 1),

g_{j} \sim Beta (1, 1) T (0, 1 - s_{j}),

in which T denotes truncation, and $T (0, 1 - s_{j})$ imposes the monotonicity restriction $0 < g_{j} < 1 - s_{j}$ .
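To make the truncated prior concrete, the sketch below draws $(s_{j}, g_{j})$ pairs from it. It uses the fact that Beta(1, 1) is the uniform distribution on $(0, 1)$ , so truncating it to $(0, 1 - s_{j})$ gives a uniform distribution on that interval. This illustrates the prior only; it is not the JAGS sampler, and `draw_item_prior` is a hypothetical name.

```python
import random


def draw_item_prior(rng):
    """Draw (s_j, g_j) from the priors with the monotonicity restriction."""
    s = rng.betavariate(1, 1)      # Beta(1, 1), i.e., uniform on (0, 1)
    g = rng.uniform(0.0, 1.0 - s)  # Beta(1, 1) truncated to (0, 1 - s_j)
    return s, g


rng = random.Random(3)
draws = [draw_item_prior(rng) for _ in range(1000)]
```

Every draw satisfies $g_{j} \leq 1 - s_{j}$ , so the implied item response function is nondecreasing in $η_{i j}$ .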

Finally, the posterior mean is treated as the estimate for the item parameters (i.e., $s_{j}$ and $g_{j}$ ), and the posterior mode is treated as the estimate for the person parameters (i.e., $α_{i}$ ).

It is necessary to compare the data-generating processes of a subject under the CAP-DINA form and the PM-CDM, as they affect the recovery of the parameter values. To make the difference more intuitive, we illustrate it in Figure 1. The PM-CDM introduces for each subject a vector of dichotomous auxiliary latent indicators $α_{j}^{*} = (α_{j, 1}^{*}, \dots, α_{j, K}^{*}) (j = 1, \dots, J)$ for the item responses, as shown in Step 3 of PM-CDM in Figure 1, and a Gibbs sampling algorithm is then designed to estimate the model parameters. The CAP-DINA form, by contrast, generates item responses directly from the CAP α . The auxiliary latent indicators $α_{j}^{*}$ add steps to parameter estimation within the PM-CDM and therefore make it more complicated, whereas estimation within the CAP-DINA form is relatively straightforward and efficient.

Figure 1.

Data-generating processes for a subject under the continuous attribute profile (CAP)-deterministic inputs, noisy “and” gate (DINA) form and the partial-mastery (PM)-DINA model.

3. Simulation Studies

Two simulation studies were conducted to explore the performance of the CAP-DINA form under various conditions. Specifically, Simulation Study 1 was to demonstrate the model parameter recovery accuracy. The data were simulated from the CAP-DINA form and analyzed with the true model. Simulation Study 2 was to compare the misspecification of the CAP-DINA form and the DINA model. Specifically, both the DINA model and the CAP-DINA form are fitted for the DINA and the CAP-DINA form generated datasets.

3.1. Simulation Study 1

3.1.1. Design and data generation

Two numbers of independent attributes ( $K = 3, 5$ ), one mean vector ( $μ = 0$ ), two correlation coefficients ( $ρ = 0, 0.8$ ), three test lengths (short, moderate, and long; $L = 15$ , 20, and 30), and four sample sizes ( $N = 500$ , 1,000, 1,500, and 2,000) were considered. In summary, this simulation study contained $2 \times 1 \times 2 \times 3 \times 4 = 48$ conditions. For each condition, 50 replications were generated. The Q-matrices used in the simulation study are given in Equations 34 and 35 (Minchen et al., 2017; Shang et al., 2021) and denoted by $Q_{15}$ , $Q_{20}$ , $Q_{30}$ , $Q_{15}^{'}$ , $Q_{20}^{'}$ , and $Q_{30}^{'}$ , respectively; they are complete under the DINA model, each containing a $K \times K$ identity submatrix. The slipping and guessing parameters s and g in the CAP-DINA form were set to 0.10:

Q_{15} = (\begin{matrix} \begin{matrix} \begin{matrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{matrix} \\ \begin{matrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{matrix} \\ \begin{matrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{matrix} \\ \begin{matrix} 1 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \end{matrix} \\ \begin{matrix} 1 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \end{matrix} \end{matrix} \end{matrix}) Q_{20} = (\begin{matrix} \begin{matrix} \begin{matrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{matrix} \\ \begin{matrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{matrix} \\ \begin{matrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{matrix} \\ \begin{matrix} 1 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \end{matrix} \\ \begin{matrix} 1 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \end{matrix} \\ \begin{matrix} 1 & 1 & 0 \\ 1 & 0 & 1 \\ 1 & 1 & 1 \end{matrix} \\ \begin{matrix} 1 & 1 & 1 \\ 1 & 1 & 1 \end{matrix} \end{matrix} \end{matrix}) Q_{30} = (\begin{array}{l} \begin{matrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{matrix} \\ \begin{matrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{matrix} \\ \begin{matrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{matrix} \\ \begin{matrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{matrix} \\ \begin{matrix} 1 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \end{matrix} \\ \begin{matrix} 1 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \end{matrix} \\ \begin{matrix} 1 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \end{matrix} \\ \begin{matrix} 1 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \end{matrix} \\ \begin{matrix} 1 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \end{matrix} \\ \begin{matrix} 1 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \end{matrix} \\ \begin{matrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{matrix} \end{array})

$$
Q_{15}^{'} = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 1 & 1 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 1 \\ 0 & 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 \\
0 & 0 & 0 & 1 & 1 \\ 1 & 1 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 & 1 \\
1 & 0 & 0 & 1 & 1 \\ 0 & 1 & 1 & 1 & 0 \\ 0 & 0 & 1 & 1 & 1
\end{pmatrix}, \quad
Q_{20}^{'} = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\
1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\
1 & 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 & 1 \\ 1 & 0 & 0 & 0 & 1 \\
1 & 1 & 1 & 0 & 0 \\ 0 & 1 & 1 & 1 & 0 \\ 0 & 0 & 1 & 1 & 1 \\ 1 & 0 & 0 & 1 & 1 \\ 1 & 1 & 0 & 0 & 1
\end{pmatrix}, \quad
Q_{30}^{'} = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1 \\ 1 & 1 & 0 & 0 & 0 \\ 1 & 0 & 1 & 0 & 0 \\
1 & 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 & 1 \\ 0 & 1 & 1 & 0 & 0 \\
0 & 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 & 1 \\ 0 & 0 & 1 & 1 & 0 \\
0 & 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 1 & 1 \\ 1 & 1 & 1 & 0 & 0 \\
1 & 1 & 0 & 1 & 0 \\ 1 & 1 & 0 & 0 & 1 \\ 1 & 0 & 1 & 1 & 0 \\
1 & 0 & 1 & 0 & 1 \\ 1 & 0 & 0 & 1 & 1 \\ 0 & 1 & 1 & 1 & 0 \\
0 & 1 & 1 & 0 & 1 \\ 0 & 1 & 0 & 1 & 1 \\ 0 & 0 & 1 & 1 & 1
\end{pmatrix}. \tag{35}
$$
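The completeness condition and the size of the design can be checked mechanically. The following is a minimal sketch, not the authors' code: the helper name `is_complete_dina` and the variable names are hypothetical, and the check only tests for the presence of every single-attribute row (the $K \times K$ identity submatrix mentioned above).

```python
from itertools import product

import numpy as np


def is_complete_dina(Q):
    """True if Q contains every single-attribute row, i.e. a K x K identity submatrix."""
    Q = np.asarray(Q)
    K = Q.shape[1]
    identity = np.eye(K, dtype=int)
    return all(any((row == unit).all() for row in Q) for unit in identity)


# Q_15 for K = 3, assembled from the blocks of Equation 34:
# three copies of the identity block, then two copies of the two-attribute block.
I3 = np.eye(3, dtype=int)
pairs = np.array([[1, 1, 0], [1, 0, 1], [0, 1, 1]])
Q15 = np.vstack([I3, I3, I3, pairs, pairs])

print(Q15.shape)                # (15, 3)
print(is_complete_dina(Q15))    # True
print(is_complete_dina(pairs))  # False: no single-attribute items

# Fully crossed design of Simulation Study 1: K x rho x L x N (the mean vector is fixed).
conditions = list(product([3, 5], [0.0, 0.8], [15, 20, 30], [500, 1000, 1500, 2000]))
print(len(conditions))          # 48
```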

3.1.2. Analysis

The CAP-DINA form was fitted to each replication. For each replication, two Markov chains with random starting values were run for 5,000 iterations each; the first 2,000 iterations of each chain were discarded as burn-in, and the remaining 3,000 iterations were used for model parameter inference. The potential scale reduction factor $\hat{R}$ (Brooks & Gelman, 1998) was computed to assess the convergence of each parameter. A value of $\hat{R}$ less than 1.10 or 1.20 is taken to indicate convergence (Brooks & Gelman, 1998; de la Torre & Douglas, 2004). In this study, $\hat{R}$ was generally less than 1.10, suggesting good convergence under the specified settings.
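For readers who want to reproduce the convergence check, the sketch below computes the basic potential scale reduction factor for one scalar parameter from several chains. It is an illustration only: it omits the degrees-of-freedom correction discussed by Brooks and Gelman (1998), the function name `psrf` is hypothetical, and the draws are simulated stand-ins rather than output from the CAP-DINA sampler.

```python
import numpy as np


def psrf(chains):
    """Basic potential scale reduction factor for one scalar parameter.

    chains: array of shape (m, n), i.e. m chains with n post-burn-in draws each.
    """
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    W = chains.var(axis=1, ddof=1).mean()     # mean within-chain variance
    B = n * chains.mean(axis=1).var(ddof=1)   # between-chain variance
    var_plus = (n - 1) / n * W + B / n        # pooled estimate of the posterior variance
    return float(np.sqrt(var_plus / W))


rng = np.random.default_rng(0)

# Hypothetical draws for one guessing parameter: 2 chains x 5,000 iterations.
draws = rng.normal(loc=0.10, scale=0.02, size=(2, 5000))
kept = draws[:, 2000:]                        # discard the first 2,000 iterations as burn-in
print(psrf(kept) < 1.10)                      # well-mixed chains pass the 1.10 rule

# Two chains stuck around different values fail the check.
stuck = np.stack([rng.normal(0.0, 1.0, 3000), rng.normal(5.0, 1.0, 3000)])
print(psrf(stuck) > 1.20)
```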

To evaluate parameter recovery accuracy, the absolute bias standard error (ABSE) and the root mean square error (RMSE) were reported. For the item parameters, both indices were averaged across all items and over all replications; for the person parameters, both indices were computed for each attribute and averaged across all subjects and over all replications.
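The two indices can be sketched as follows. The exact ABSE formula is not restated in this section, so the code assumes that ABSE is the mean absolute difference between estimate and true value; that reading, the function names, and the toy numbers are all assumptions made for illustration.

```python
import numpy as np


def abse(est, true):
    """Mean absolute difference between estimates and true values (assumed ABSE)."""
    return float(np.mean(np.abs(np.asarray(est) - np.asarray(true))))


def rmse(est, true):
    """Root mean square error between estimates and true values."""
    return float(np.sqrt(np.mean((np.asarray(est) - np.asarray(true)) ** 2)))


# Toy example: 2 replications x 2 items, true guessing parameter 0.10 for both items.
true_g = np.array([0.10, 0.10])
est_g = np.array([[0.12, 0.08],
                  [0.10, 0.14]])

# Averaging runs over items and replications at once because the arrays broadcast.
print(round(abse(est_g, true_g), 4))
print(round(rmse(est_g, true_g), 4))
```

Because RMSE weights large deviations more heavily, it is never smaller than the mean absolute difference, which matches the ordering of the two columns in Tables 1 through 4.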

3.1.3. Results

Tables 1 through 4 present the ABSE and RMSE of the parameter estimates under the various settings. We first analyze each table individually, taking Table 2 as an example. First, for a fixed test length (L), the larger the sample size (N), the smaller the ABSE and RMSE. For example, for $L = 15$ , the ABSE and RMSE for g are 0.040 and 0.050, respectively, when the sample size is 500, but reduce to 0.019 and 0.024 when the sample size is 2,000; the ABSE and RMSE for s are 0.064 and 0.077 when the sample size is 500, but reduce to 0.026 and 0.032 when the sample size is 2,000; and the ABSE and RMSE for Alpha1 are 0.131 and 0.168 when the sample size is 500, but reduce to 0.127 and 0.161 when the sample size is 2,000. Similar results hold for Alpha2 and Alpha3, and for $L = 20$ and $L = 30$ .

Second, for a given sample size (N), the longer the test (L), the smaller the ABSE and RMSE. For example, for $N = 500$ , the ABSE and RMSE for g are 0.040 and 0.050, respectively, when L is 15, but reduce to 0.030 and 0.039 when L is 30; the ABSE and RMSE for s are 0.064 and 0.077 when L is 15, but reduce to 0.055 and 0.069 when L is 30; and the ABSE and RMSE for Alpha1 are 0.131 and 0.168 when L is 15, but reduce to 0.109 and 0.140 when L is 30. Similar results hold for Alpha2 and Alpha3. Notably, test length changes the precision of the person parameter estimates far more than sample size does. The direct reason is that a longer test provides more measurements on each subject, which leads to better estimation of person parameters, whereas a larger sample affects person parameter estimation only indirectly, by improving the item parameter estimates. As expected, enlarging the sample size and lengthening the test both improve the precision of item and person parameter estimation of PM-DINA-2.

Third, for the same sample size and test length, the ABSE and RMSE of the person parameter estimates are very close across attributes, which indicates that the same level of estimation precision is achieved for all attributes. Fourth, item parameter estimates are more precise than person parameter estimates, and every condition yields an acceptable recovery of item parameters (e.g., RMSE < 0.1). Similar conclusions can be drawn from Tables 1, 3, and 4.
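The relative size of the two effects on person parameter precision can be checked directly from the Alpha1 ABSE values in Table 2. The short calculation below is an added illustration; the helper name `relative_reduction` is hypothetical.

```python
def relative_reduction(before, after):
    """Proportional drop from `before` to `after`."""
    return (before - after) / before


# Alpha1 ABSE values taken from Table 2 (K = 3, rho = 0.8).
by_sample_size = relative_reduction(0.131, 0.127)  # N: 500 -> 2,000 at L = 15
by_test_length = relative_reduction(0.131, 0.109)  # L: 15 -> 30 at N = 500

print(f"{by_sample_size:.1%}, {by_test_length:.1%}")  # 3.1%, 16.8%
```

Quadrupling the sample size lowers the Alpha1 ABSE by about 3%, whereas doubling the test length lowers it by about 17%.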

Table 1.

ABSE and RMSE of Parameter Estimates (K = 3 and $ρ = 0$ )

N	L	Item				Alpha
		g		s		Alpha1		Alpha2		Alpha3
		ABSE	RMSE	ABSE	RMSE	ABSE	RMSE	ABSE	RMSE	ABSE	RMSE
500	15	.042	.053	.096	.115	.164	.207	.168	.212	.163	.204
	20	.037	.047	.110	.134	.156	.198	.159	.201	.160	.202
	30	.031	.039	.096	.119	.141	.179	.139	.177	.139	.177
1,000	15	.029	.037	.063	.077	.163	.204	.164	.204	.162	.203
	20	.024	.031	.069	.086	.155	.195	.158	.198	.158	.198
	30	.023	.029	.061	.078	.138	.175	.139	.175	.138	.174
1,500	15	.025	.031	.041	.052	.161	.201	.161	.201	.162	.201
	20	.021	.027	.055	.072	.156	.195	.157	.196	.157	.196
	30	.019	.024	.045	.058	.137	.173	.138	.173	.137	.173
2,000	15	.021	.026	.036	.044	.162	.200	.163	.202	.163	.202
	20	.019	.024	.043	.055	.156	.193	.157	.195	.158	.196
	30	.016	.021	.037	.048	.138	.174	.137	.173	.137	.173

Note. ABSE = absolute bias standard error; RMSE = root mean square error.

Table 2.

ABSE and RMSE of Parameter Estimates (K = 3 and $ρ = 0.8$ )

N	L	Item				Alpha
		g		s		Alpha1		Alpha2		Alpha3
		ABSE	RMSE	ABSE	RMSE	ABSE	RMSE	ABSE	RMSE	ABSE	RMSE
500	15	.040	.050	.064	.077	.131	.168	.131	.168	.130	.167
	20	.035	.046	.061	.074	.121	.156	.123	.159	.122	.158
	30	.030	.039	.055	.069	.109	.140	.108	.139	.109	.140
1,000	15	.029	.036	.041	.050	.128	.163	.128	.163	.128	.163
	20	.024	.032	.038	.047	.119	.153	.120	.153	.120	.154
	30	.021	.027	.033	.041	.106	.136	.107	.137	.106	.136
1,500	15	.023	.029	.030	.037	.127	.161	.128	.162	.128	.162
	20	.019	.026	.032	.039	.118	.151	.120	.153	.120	.152
	30	.017	.023	.027	.033	.107	.137	.107	.136	.106	.136
2,000	15	.019	.024	.026	.032	.127	.161	.129	.162	.128	.161
	20	.018	.023	.024	.030	.119	.151	.121	.153	.120	.152
	30	.017	.022	.023	.029	.106	.136	.107	.136	.107	.136

Note. ABSE = absolute bias standard error; RMSE = root mean square error.

Table 3.

ABSE and RMSE of Parameter Estimates (K = 5 and $ρ = 0$ )

N	L	Item				Alpha
		g		s		Alpha1		Alpha2		Alpha3		Alpha4		Alpha5
		ABSE	RMSE	ABSE	RMSE	ABSE	RMSE	ABSE	RMSE	ABSE	RMSE	ABSE	RMSE	ABSE	RMSE
500	15	.041	.054	.153	.179	.219	.260	.220	.261	.217	.258	.217	.257	.217	.258
	20	.029	.038	.108	.138	.189	.229	.189	.229	.189	.229	.189	.229	.191	.231
	30	.027	.034	.124	.156	.175	.216	.173	.214	.174	.216	.173	.214	.174	.215
1,000	15	.030	.036	.077	.099	.218	.256	.216	.255	.215	.254	.215	.254	.215	.254
	20	.025	.031	.056	.077	.186	.225	.185	.224	.184	.223	.185	.224	.185	.224
	30	.022	.027	.081	.106	.170	.210	.172	.212	.170	.210	.171	.212	.171	.210
1,500	15	.036	.041	.045	.060	.216	.255	.216	.255	.217	.256	.218	.257	.215	.254
	20	.030	.035	.041	.056	.185	.223	.185	.223	.184	.222	.185	.223	.185	.223
	30	.020	.026	.058	.078	.170	.209	.170	.209	.170	.209	.170	.209	.170	.209
2,000	15	.045	.045	.035	.048	.215	.253	.216	.254	.215	.253	.215	.254	.216	.255
	20	.031	.037	.034	.045	.184	.222	.185	.222	.184	.221	.185	.223	.184	.223
	30	.020	.025	.049	.067	.170	.209	.170	.209	.170	.208	.170	.209	.160	.208

Note. ABSE = absolute bias standard error; RMSE = root mean square error.

Table 4.

ABSE and RMSE of Parameter Estimates (K = 5 and $ρ = 0.8$ )

N	L	Item				Alpha
		g		s		Alpha1		Alpha2		Alpha3		Alpha4		Alpha5
		ABSE	RMSE	ABSE	RMSE	ABSE	RMSE	ABSE	RMSE	ABSE	RMSE	ABSE	RMSE	ABSE	RMSE
500	15	.030	.039	.068	.084	.141	.180	.141	.180	.140	.178	.141	.179	.140	.179
	20	.035	.046	.059	.072	.130	.166	.129	.165	.129	.165	.129	.167	.128	.165
	30	.026	.034	.060	.075	.120	.154	.120	.154	.120	.154	.119	.153	.120	.154
1,000	15	.021	.027	.037	.047	.139	.175	.139	.175	.140	.176	.139	.175	.140	.177
	20	.023	.029	.036	.045	.128	.162	.128	.161	.128	.162	.128	.162	.128	.163
	30	.019	.026	.035	.043	.117	.150	.118	.151	.118	.151	.119	.152	.118	.150
1,500	15	.018	.022	.028	.035	.139	.174	.141	.176	.139	.175	.140	.175	.139	.175
	20	.018	.023	.027	.033	.128	.161	.129	.162	.128	.161	.129	.162	.129	.161
	30	.017	.022	.028	.035	.118	.150	.118	.150	.118	.150	.119	.150	.118	.150
2,000	15	.021	.026	.025	.030	.141	.176	.142	.176	.141	.176	.141	.175	.142	.176
	20	.021	.026	.025	.030	.130	.162	.129	.161	.129	.162	.129	.162	.129	.162
	30	.016	.021	.025	.030	.119	.150	.119	.150	.119	.150	.119	.150	.119	.150

Note. ABSE = absolute bias standard error; RMSE = root mean square error.

Next, we compare Tables 1 through 4. First, comparing Tables 1 and 3 shows that, for the same sample size and test length, a larger K yields larger ABSE and RMSE for the person parameter estimates, that is, less precise person parameter estimates. This is expected: a fixed set of items carries a fixed amount of information, and as the number of attributes increases that information must be shared among more person parameters, so each is estimated less precisely. The same pattern appears in the comparison of Tables 2 and 4. Second, comparing Tables 1 and 2 shows that both item and person parameters are estimated more accurately when $ρ = 0.8$ than when $ρ = 0$ . The same holds for Tables 3 and 4. This result is encouraging for real applications because independent attributes are rare in practice.

In general, the MCMC estimation method demonstrates good parameter recovery in this study.

3.2. Simulation Study 2

3.2.1. Design and data generation

The Q-matrices, mean vector $μ$ , and correlation coefficients $ρ$ in Simulation Study 2 were the same as those in Simulation Study 1, and the item parameters were again fixed at 0.10. When $K = 3$ , the sample size N was fixed at 500, and when $K = 5$ , N was fixed at 1,000. To allow a direct comparison, both the DINA model and the CAP-DINA form were fitted to the data sets generated from the DINA model and to those generated from the CAP-DINA form. In summary, this simulation study contained $3 \times 2 \times 2 \times 4 = 48$ conditions. For each condition, 50 replications were generated.

3.2.2. Analysis

For the MCMC estimation of the CAP-DINA form, the number of chains, burn-in iterations, and post-burn-in iterations were the same as those in Simulation Study 1. For the DINA model, the EAP method was used for estimation. The two models were evaluated with the following indicators. (1) When the true model was DINA, that is, the data were generated following DINA, we evaluated the two models by the total ABSE and RMSE of the item parameters (rather than of each item parameter separately) and by the attribute marginal match ratio (AMMR), that is, $AMMR = 1 - \sum_{i = 1}^{N} \sum_{k = 1}^{K} | α_{i k} - {\hat{α}}_{i k} | / (N K)$ , in which ${\hat{α}}_{i k}$ and ${\hat{α}}_{i}$ are the estimates of $α_{i k}$ and $α_{i}$ . Note that when computing AMMR for the CAP-DINA form, both ${\hat{α}}_{i k}$ and ${\hat{α}}_{i}$ need to be rounded first and then substituted into the above formula. (2) When the true model was the CAP-DINA form of the PM-DINA model, that is, the data were generated following the CAP-DINA form, we evaluated the DINA model and the CAP-DINA form by the total ABSE and RMSE of both item and person parameters and also by AMMR. Note that (a) when computing AMMR, the true attribute $α_{i k}$ should be rounded first for both the DINA model and the CAP-DINA form, and ${\hat{α}}_{i k}$ should be rounded for CAP-DINA form estimation, and (b) when computing the ABSE and RMSE of person parameter estimates, we take the posterior probability of each attribute as ${\hat{α}}_{i k}$ for DINA estimation.
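The rounding step can be sketched as follows. AMMR is computed here as the proportion of subject-by-attribute entries whose rounded estimate matches the rounded true value, which for binary values equals one minus the mean absolute difference and is consistent with the values near 1 reported in Tables 5 and 6. The function name `ammr`, the 0.5 cut-off (with ties rounded up), and the toy profiles are assumptions made for illustration.

```python
import numpy as np


def ammr(alpha_true, alpha_hat, cut=0.5):
    """Attribute marginal match ratio after dichotomizing both inputs at `cut`."""
    a = (np.asarray(alpha_true) >= cut).astype(int)
    a_hat = (np.asarray(alpha_hat) >= cut).astype(int)
    return float(np.mean(a == a_hat))


# Toy continuous profiles for N = 2 subjects and K = 3 attributes.
alpha_true = np.array([[0.90, 0.20, 0.60],
                       [0.40, 0.70, 0.10]])
alpha_hat = np.array([[0.80, 0.30, 0.40],
                      [0.45, 0.90, 0.20]])

# Rounded truth: [[1, 0, 1], [0, 1, 0]]; rounded estimate: [[1, 0, 0], [0, 1, 0]].
print(round(ammr(alpha_true, alpha_hat), 4))  # 0.8333 (5 of 6 entries match)
```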

3.2.3. Results

Tables 5 and 6 present the results for all the 16 settings. We take Table 5, which reports the results for the condition $K = 3$ and $N = 500$ of Simulation Study 2, as an example for specific analysis. Compared with DINA, two findings stand out. (1) When the true model is the CAP-DINA form of the PM-DINA model, the CAP-DINA form performs better on every indicator of both item and person parameter estimation, which is very important for CD. Its advantage grows with test length: when $L = 30$ , the maximum difference in RMSE is close to 0.25. (2) When the true model is DINA, the ABSE and RMSE of the item parameters under the CAP-DINA form exceed those under DINA only in the third decimal place, and the difference shrinks, in some cases to zero, as the test length increases. Moreover, the CAP-DINA form also performs well in AMMR, with values almost identical to those of DINA. Similar results can be found in Table 6. Hence, compared with DINA, the CAP-DINA form is robust in parameter recovery. Based on these results, we conclude that whether the subject's mastery is partial or binary, the CAP-DINA form is more advantageous than DINA, especially for moderate or long tests.

Table 5.

Results of Simulation 2 for K = 3 With N = 500

			True Model
			CAP-DINA Form					DINA
			Item		Alpha			Item		Alpha
$ρ$	L	Fitted Model	ABSE	RMSE	AMMR	ABSE	RMSE	ABSE	RMSE	AMMR
0	15	CAP-DINA	.068	.089	.773	.165	.208	.024	.029	.986
	15	DINA	.186	.208	.769	.215	.278	.017	.021	.986
	20	CAP-DINA	.075	.103	.784	.158	.200	.020	.026	.987
	20	DINA	.202	.239	.779	.217	.278	.017	.023	.988
	30	CAP-DINA	.064	.089	.817	.140	.178	.018	.023	.992
	30	DINA	.206	.321	.808	.216	.274	.017	.022	.992
0.8	15	CAP-DINA	.052	.066	.829	.130	.168	.019	.024	.984
	15	DINA	.175	.190	.824	.214	.269	.016	.020	.985
	20	CAP-DINA	.045	.059	.844	.121	.156	.017	.021	.986
	20	DINA	.177	.198	.835	.215	.269	.016	.021	.986
	30	CAP-DINA	.041	.054	.861	.108	.140	.016	.020	.996
	30	DINA	.182	.203	.852	.221	.272	.015	.019	.996

Note. ABSE is the mean absolute bias standard error of parameter estimates; RMSE is the root mean square error of parameter estimates; AMMR is the attribute marginal match ratio. CAP = continuous attribute profile; DINA = deterministic inputs, noisy “and” gate.

Table 6.

Results of Simulation 2 for K = 5 With N = 1,000

			True Model
			CAP-DINA Form					DINA
			Item		Alpha			Item		Alpha
ρ	L	Fitted model	ABSE	RMSE	AMMR	ABSE	RMSE	ABSE	RMSE	AMMR
0	15	CAP-DINA	.054	.075	.676	.216	.256	.046	.053	.890
	15	DINA	.203	.256	.696	.242	.310	.016	.021	.930
	20	CAP-DINA	.041	.060	.741	.186	.225	.041	.047	.958
	20	DINA	.199	.241	.732	.224	.290	.014	.018	.958
	30	CAP-DINA	.051	.077	.765	.171	.211	.031	.037	.980
	30	DINA	.213	.262	.756	.219	.282	.013	.018	.980
0.8	15	CAP-DINA	.031	.040	.810	.140	.176	.027	.033	.931
	15	DINA	.165	.191	.792	.210	.270	.014	.019	.946
	20	CAP-DINA	.028	.037	.828	.128	.162	.025	.030	.978
	20	DINA	.171	.193	.817	.211	.266	.012	.016	.979
	30	CAP-DINA	.027	.033	.838	.116	.149	.017	.021	.985
	30	DINA	.177	.204	.831	.216	.269	.011	.014	.986

4. Real Data Example

To illustrate the application of the CAP-DINA form, we applied both the CAP-DINA form and the DINA model to the English test data set collected by the Examination for the Certificate of Proficiency in English. The data set is distributed with the CDM package in R and has been analyzed in several previous studies. It contains responses to 28 items designed to assess three skills: morphosyntactic form ( $α_{1}$ ), cohesive form ( $α_{2}$ ), and lexical form ( $α_{3}$ ).

The Q-matrix and the MCMC estimates of the item parameters under the CAP-DINA form are reported in Table 7. First, the 28 items vary in their estimated item parameters g_j and s_j . Specifically, g_j ranges from 0.019 to 0.782 and s_j ranges from 0.013 to 0.032, that is, $1 - s_{j}$ ranges from 0.968 to 0.987. As given in the model formulation, the probability of a correct response to item j ranges from g_j to $1 - s_{j}$ . Take Item 20 as an example: subjects who have the requisite attributes answer it correctly with the maximum probability $1 - 0.015 = 0.985$ (98.5%), subjects who lack the requisite attributes answer it correctly with the minimum probability 10.5%, and subjects who have partially mastered any requisite attribute answer it correctly with a probability between 10.5% and 98.5%, depending on their degree of mastery. For Item 8, however, subjects who lack the requisite attributes answer it correctly with a minimum probability of 74.4%, which is not far from the $1 - 0.019 = 0.981$ (98.1%) of subjects who have the requisite attributes. The example indicates that heterogeneity in item complexity and attribute difficulty may result in heterogeneity of item parameters. Second, items with large g_j and s_j may not provide diagnostic information of high quality. Several items of the English test have large g_j and s_j , especially g_j , which may be related to misspecification of the Q-matrix. Consequently, additional research on the item quality of the English test is worth considering.
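The contrast between Items 8 and 20 can be made concrete by computing, for a few items, the interval $[g_j, 1 - s_j]$ and its width $1 - s_j - g_j$ from the estimates in Table 7. Using this width as a rough indicator of how well an item separates masters from nonmasters is an added illustration, not part of the original analysis; the variable names are hypothetical.

```python
# (g_hat, s_hat) posterior means for selected items, copied from Table 7.
items = {
    8: (0.744, 0.019),
    17: (0.782, 0.014),
    20: (0.105, 0.015),
    22: (0.019, 0.015),
}

# Width of the range of correct-response probabilities for each item.
gaps = {item: 1 - s - g for item, (g, s) in items.items()}

for item, (g, s) in items.items():
    print(f"Item {item}: P(correct) in [{g:.3f}, {1 - s:.3f}], width = {gaps[item]:.3f}")
```

Items 8 and 17 leave only about 0.2 between the lowest and highest response probabilities, whereas Items 20 and 22 span almost the full range, which is why items with large g_j carry less diagnostic information.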

Table 7.

Q-Matrix for English Test Data and the Estimation Results

Item	Q			$\hat{g}$		$\hat{s}$
Item	$α_{1}$	$α_{2}$	$α_{3}$	Mean	SE	Mean	SE
1	1	1	0	.674	.014	.014	.010
2	0	1	0	.654	.064	.022	.015
3	1	0	1	.349	.153	.017	.019
4	0	0	1	.219	.116	.031	.012
5	0	0	1	.622	.018	.024	.006
6	0	0	1	.544	.034	.027	.008
7	1	0	1	.472	.004	.016	.003
8	0	1	0	.744	.011	.019	.007
9	0	0	1	.323	.159	.032	.012
10	1	0	0	.317	.055	.026	.017
11	1	0	1	.479	.008	.016	.007
12	1	0	1	.070	.140	.013	.019
13	1	0	0	.504	.038	.024	.014
14	1	0	0	.386	.128	.024	.017
15	0	0	1	.589	.013	.025	.006
16	1	0	1	.454	.010	.016	.007
17	0	1	1	.782	.021	.014	.008
18	0	0	1	.579	.058	.027	.008
19	0	0	1	.193	.100	.030	.011
20	1	0	1	.105	.122	.015	.022
21	1	0	1	.547	.009	.015	.007
22	0	0	1	.019	.138	.015	.011
23	0	1	0	.528	.019	.024	.010
24	0	1	0	.176	.247	.028	.027
25	1	0	0	.391	.190	.026	.019
26	0	0	1	.384	.181	.028	.012
27	1	0	0	.102	.261	.023	.020
28	0	0	1	.453	.049	.030	.008

Table 8 reports the AIC, BIC, and DIC (Spiegelhalter et al., 2002) of the CAP-DINA form and the DINA model to assess relative model fit. The CAP-DINA form shows a better fit than DINA, as indicated by its smaller AIC, BIC, and DIC.

Table 8.

AIC, BIC, and DIC for English Test Data

Model	AIC	BIC	DIC
CAP-DINA form	80,157	80,545.7	86,125.810
DINA	81,353	81,729.7	87,364.710

Note. AIC = Akaike Information Criterion; BIC = Bayesian Information Criterion; DIC = Deviance Information Criterion.
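As a worked check of Table 8, the gaps between the two models (DINA minus CAP-DINA form) are computed below; this arithmetic is added for illustration, and lower values of each criterion indicate better relative fit.

```latex
\begin{aligned}
\Delta\mathrm{AIC} &= 81{,}353 - 80{,}157 = 1{,}196, \\
\Delta\mathrm{BIC} &= 81{,}729.7 - 80{,}545.7 = 1{,}184.0, \\
\Delta\mathrm{DIC} &= 87{,}364.710 - 86{,}125.810 = 1{,}238.900.
\end{aligned}
```

All three differences are positive and of similar size, so the three criteria agree in favoring the CAP-DINA form.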

5. Conclusion and Discussion

To obtain more precise and in-depth diagnostic information about subjects’ attribute profiles than mastery or nonmastery alone, we propose the CAP-DINA form, an equivalent form of the PM-DINA model that subsumes the DINA model as a special case. A Bayesian estimation procedure based on an MCMC algorithm is developed, and it shows accurate parameter recovery in Simulation Study 1. Simulation Study 2 examines model misspecification for DINA and the CAP-DINA form and shows that the CAP-DINA form is more robust. Specifically, when the data are generated under the continuous attribute condition, the CAP-DINA form fits the simulated data better than the DINA model, and when the data are generated under the binary attribute condition, the results of the CAP-DINA form are comparable to those of the DINA model. The real data study demonstrates the application of the CAP-DINA form in practical educational testing and shows that it fits the English test data better than DINA according to goodness-of-fit measures such as AIC, BIC, and DIC. Therefore, we believe that the CAP-DINA form has advantages over DINA and is well suited to application.

The simulation and the real data studies also show that (1) the CAP-DINA form has the advantage over MIRT models, for the former can determine subjects’ relative location and then provide a direct and accurate diagnosis for subjects. With a simple form, the CAP-DINA form also overcomes the limitations of noncompensatory MIRT models; (2) although the CAP-DINA form is a CCM like the CCM, the CCM only contains person parameters but no item parameters; thus, the stochastic components are absorbed into the subjects’ ability profiles, while the CAP-DINA form adds item parameters; thus, part of stochastic components are absorbed into the items. In this regard, the CAP-DINA form is more reasonable; and (3) both the CAP-DINA form and PM-DINA model assume CAP and are essentially equivalent in terms of IRF. However, model complexity of PM-DINA leads to high computation cost especially when the number of attributes is large (Shang et al., 2021), whereas the estimation within CAP-DINA form is relatively straightforward and efficient due to its simplicity in terms of form, construction, and data-generating process of subjects. In conclusion, the concept of attribute continuation and the proposal of CAP-DINA form are considered significant, with the anticipation that it opens a new research area for CDM.

With the CAP-DINA form, potentially promising application procedures related to Fisher information (FI) could be developed using the continuity of attribute mastery. FI has been developed and widely used in IRT models (Chang & Ying, 1996). Although distance-based information measures such as Shannon entropy and Kullback–Leibler divergence (Shannon, 1948; Cover & Thomas, 1991; Kullback, 1959) are also applied, FI plays a unique role in that it reflects the precision of parameter estimation by measuring each subject’s estimation error. However, the methodological and theoretical development of FI in CDM appears to lag behind because of the discreteness of the attribute profile. The introduction of attribute continuity in the CAP-DINA form makes it possible to propose Fisher-type information in CDM. Hence, the model will have more applications based on Fisher-type information.
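To indicate what such a Fisher-type quantity could look like, the standard expression for a dichotomous item is sketched below. It is a generic textbook form rather than a result of this article: $P_{j}(α)$ stands for the item response function of the CAP-DINA form (not restated here), and differentiability with respect to the continuous mastery $α_{k}$ is assumed.

```latex
I_{j}(\alpha_{k})
  = \frac{\left[ \partial P_{j}(\alpha) / \partial \alpha_{k} \right]^{2}}
         {P_{j}(\alpha)\,\bigl(1 - P_{j}(\alpha)\bigr)},
\qquad
I(\alpha_{k}) = \sum_{j = 1}^{L} I_{j}(\alpha_{k}).
```

The sum over the L items relies on local independence of the responses given $α$ . With binary attributes the derivative in the numerator does not exist, which is the obstacle noted above.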

Another interesting extension of this study is to nest the continuous attribute into polytomous attribute CDMs (Chen & de la Torre, 2013; von Davier, 2008). Although the framework of polytomous attributes has been put forward to relax the binary assumption of attributes and thus provide a more accurate diagnosis for subjects, the diagnosis at each attribute level is still an absolute and imprecise value, 0 or 1. The idea of attribute continuity offers a solution, allowing us to construct continuous polytomous attribute CDMs, such as a polytomous CAP-DINA form.

Extending other CDMs by incorporating the idea of attribute mastery continuation will also be interesting. Because of the simplicity and intuitiveness of DINA, attribute mastery continuation is applied only to the DINA model in this article. In fact, there are many CDMs, such as GDINA (de la Torre, 2011) and RRUM (Hartz, 2002), each with a different role, and a similar continuation could be applied to them. The most direct extension is DINO (Templin & Henson, 2006), because the DINA and DINO models share a “dual” relation (Köhn & Chiu, 2016). How to extend the idea to more CDMs is an interesting topic.

Although the real data study indicates that the CAP-DINA form fits the English test data better than DINA, two issues should be noted: (1) heterogeneity in item complexity and attribute difficulty may result in heterogeneity of item parameters, so item complexity and attribute difficulty are worth studying in the future; and (2) large estimates of g_j and s_j , especially g_j , may be related to misspecification of the Q-matrix. Consequently, additional research on the item quality of the English test is necessary (Yu & Cheng, 2020).

Finally, although the MCMC algorithm developed for CAP-DINA form estimation provides accurate parameter recovery, the estimation process is somewhat time-consuming, and its computational efficiency needs further improvement. In particular, in situations where efficiency is essential, such as computerized adaptive testing, an efficient algorithm for the CAP-DINA form is imperative. The EM algorithm is known to be much faster than MCMC, so developing a more efficient EM-type algorithm for the CAP-DINA form is a direction for our future research.

Footnotes

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research and/or authorship of this article: This study is supported in part by the 14th Five-Year Plan for Education Science in Jiangxi Province (Grant No. 21YB257, 21YB027), the 14th Five-Year Plan for Social Science in Jiangxi Province (Grant No. 21JY06, 22JY16), and the Humanities and Social Sciences Program of Jiangxi Provincial Department of Education (Grant No. XL20202).

ORCID iD

Tian Shu

References

Ackerman

T. A.

(1994). Using multidimensional item response theory to understand what items and tests are measuring. Applied Measurement in Education, 7, 255–278.

Baker

F. B.

Kim

S. H.

(2004). Item response theory: Parameter estimation techniques. Marcel Dekker.

Brooks

S. P.

Gelman

(1998). General methods for monitoring convergence of iterative simulations. Journal of Computational and Graphical Statistics, 7, 434–455.

Chang

H.-H.

Ying

(1996). A global information approach to computerized adaptive testing. Applied Psychological Measurement, 20, 213–229.

Chen

de la Torre

(2013). A general cognitive diagnosis model for expert-defined polytomous attributes. Applied Psychological Measurement, 37(6), 419–437.

Chen

Y. X.

Liu

J. C.

Ying

Z. L.

(2014). Online item calibration for Q-matrix in CD-CAT. Applied Psychological Measurement, 38(1), 5–15.

Cheng

(2009). When cognitive diagnosis meets computerized adaptive testing: CD-CAT. Psychometrika, 74, 619–632.

Cover

T. M.

Thomas

J. A.

(1991). Elements of information theory. Wiley.

de la Torre

(2011). The generalized DINA model framework. Psychometrika, 76, 179–199.

10.

de la Torre

Douglas

(2004). Higher-order latent trait models for cognitive diagnosis. Psychometrika, 69, 333–353.

11.

de la Torre

Lam

Rhoads

Tjoe

(2010). Measuring grade 8 proportional reasoning: The process of attribute identification and task development and validation. Paper presented at the annual meeting of the American Educational Research Association.

12.

DeMars

(2010). Item response theory. Oxford University Press.

13.

Embretson

S. E.

Reise

S. P.

(2000). Item response theory for psychologists. Lawrence Erlbaum Associates.

14.

Embretson

S. E.

Steven

P. R.

(2013). Item response theory. Psychology Press.

15.

Embretson

S. E.

Yang

X. D.

(2013). A multicomponent latent trait model for diagnosis. Psychometrika, 78, 14–36.

16.

Erosheva

E. A.

(2005). Comparing latent structures of the grade of membership, Rasch, and latent class models. Psychometrika, 70(4), 619–628.

17.

Erosheva

E. A.

Fienberg

Lafferty

(2004). Mixed-membership models of scientific publications. Proceedings of the National Academy of Sciences, 101(suppl 1), 5220–5227.

18.

Gelfand

A. E.

Smith

A. F. M.

(1990). Sampling-based approaches to calculating marginal densities. Journal of the American Statistical Association, 85, 398–409.

19.

(2020). Partial identifiability of restricted latent class models. The Annals of Statistics, 48(4), 2082–2107.

20.

Haberman

(1995). Book review of “Statistical applications using fuzzy sets,” by K.G. Manton, M.A. Woodbury, and L.S. Corder. Journal of the American Statistical Association, 90(431), 1131–1133.

21.

Haertel

E. H.

(1989). Using restricted latent class models to map the skill structure of achievement items. Journal of Educational Measurement, 26, 301–321.

22.

Hair

J. F.

(2011). Multivariate data analysis: An overview. Springer Berlin Heidelberg.

23.

Hambleton

R. K.

Swaminathan

(1985). Item response theory principles and applications. Kluwer-Nijhoff Publishing.

24.

Hartz

S. M.

(2002). A Bayesian framework for the unified model for assessing cognitive abilities: Blending theory with practicality. PhD thesis, University of Illinois, Urbana-Champaign.

25. Hong, Wang, Lim, Y. S., & Douglas (2015). Efficient models for cognitive diagnosis with continuous and mixed-type latent variables. Applied Psychological Measurement, 39(1), 31–43.

26. Hsu, C. L., Wang, W. C., & Chen, S. Y. (2013). Variable-length computerized adaptive testing based on cognitive diagnosis models. Applied Psychological Measurement, 37(7), 563–582.

27. Huebner & Wang (2011). A note on comparing examinee classification methods for cognitive diagnosis models. Educational and Psychological Measurement, 71, 407–419.

28. Junker, B. W., & Sijtsma (2001). Cognitive assessment models with few assumptions, and connections with nonparametric item response theory. Applied Psychological Measurement, 25, 258–272.

29. Karelitz, T. M. (2004). Ordered category attribute coding framework for cognitive assessments (Unpublished doctoral dissertation). University of Illinois at Urbana-Champaign.

30. Köhn, H.-F., & Chiu, C.-Y. (2016). A proof of the duality of the DINA model and the DINO model. Journal of Classification, 33, 171–184.

31. Kullback (1959). Information theory and statistics. Wiley.

32. Leighton, J. P., & Gierl, M. J. (2007). Cognitive diagnostic assessment for education: Theory and applications. Cambridge University Press.

33. Manton, Woodbury, & Tolley (1994). Statistical applications using fuzzy sets. Wiley.

34. Mao, X. Z., & Xin (2013). The application of the Monte Carlo approach to cognitive diagnostic computerized adaptive testing with content constraints. Applied Psychological Measurement, 37(6), 482–496.

35. Maris (1999). Estimating multiple classification latent class models. Psychometrika, 64, 187–212.

36. Minchen, N. D., de la Torre, & Liu (2017). A cognitive diagnosis model for continuous response. Journal of Educational and Behavioral Statistics, 42(6), 651–677.

37. Plummer (2015). JAGS Version 4.0.0 user manual. Retrieved from http://sourceforge.net/projects/mcmc-jags/

38. Reckase, M. D. (1997). The past and future of multidimensional item response theory. Applied Psychological Measurement, 21, 25–36.

39. Reckase, M. D. (2009). Multidimensional item response theory. Springer.

40. Rupp, A. A., Templin, & Henson (2010). Diagnostic measurement: Theory, methods and applications. Guilford Press.

41. San Martín, Jara, Rolin, J.-M., & Mouchart (2011). On the Bayesian nonparametric generalization of IRT-type models. Psychometrika, 76, 385–409.

42. San Martín, Rolin, J.-M., & Castro, L. M. (2013). Identification of the 1PL model with guessing parameter: Parametric and semi-parametric results. Psychometrika, 78, 341–379.

43. Shang & Erosheva, E. A. (2021). Partial-mastery cognitive diagnosis models. The Annals of Applied Statistics, 15(3), 1529–1555.

44. Shannon (1948). A mathematical theory of communication. Bell System Technical Journal, 27, 379–423.

45. Sijtsma & Junker, B. W. (2006). Item response theory: Past performance, present developments, and future expectations. Behaviormetrika, 33, 75–102.

46. Spiegelhalter, D. J., Best, N. G., Carlin, B. P., & van der Linde (2002). Bayesian measures of model complexity and fit. Journal of the Royal Statistical Society, Series B, 64, 583–616.

47. Stout (2007). Skills diagnosis using IRT-based continuous latent trait models. Journal of Educational Measurement, 44, 313–324.

48. Templin, J. L., & Henson, R. A. (2006). Measurement of psychological disorders using cognitive diagnosis models. Psychological Methods, 11(3), 287–305.

49. Van der Linden, W. J., & Hambleton, R. K. (Eds.). (1997). Handbook of modern item response theory. Springer.

50. von Davier (2008). A general diagnostic model applied to language testing data. British Journal of Mathematical and Statistical Psychology, 61, 287–307.

51. von Davier & Lee, Y. S. (2019). Handbook of diagnostic classification models. Springer International Publishing.

52. Wang (2013). Mutual information item selection method in cognitive diagnostic computerized adaptive testing with short test length. Educational and Psychological Measurement, 73(6), 1017–1035.

53. Wang, Chang, H. H., & Huebner (2011). Restrictive stochastic item selection methods in cognitive diagnostic computerized adaptive testing. Journal of Educational Measurement, 48, 255–273.

54. Wang & Nydick, S. W. (2015). Comparing two algorithms for calibrating the restricted non-compensatory multidimensional IRT model. Applied Psychological Measurement, 39, 119–134.

55. Whitely, S. E. (1980). Multicomponent latent trait models for ability tests. Psychometrika, 45, 479–494.

56. (2017). Identifiability of restricted latent class models with binary responses. The Annals of Statistics, 45, 675–707.

57. Cheng (2020). Data-driven Q-matrix validation using a residual-based statistic in cognitive diagnostic assessment. British Journal of Mathematical and Statistical Psychology, 73(1), 145–179. https://doi.org/10.1111/bmsp.12191

58. Zhan, Jiao, Man, & Wang (2019). Using JAGS for Bayesian cognitive diagnosis modeling: A tutorial. Journal of Educational and Behavioral Statistics, 44(4), 473–503.

59. Zhan, Wang, W. C., Jiao, & Bian (2018). Probabilistic-input, noisy conjunctive models for cognitive diagnosis. Frontiers in Psychology, 9, 997. https://doi.org/10.3389/fpsyg.2018.00997

60. Zhan & Wang, W. C. (2019b). A partial mastery, higher-order latent structural model for polytomous attributes in cognitive diagnostic assessments. Journal of Classification, 37(2), 328–351.