Abstract
In this article, we systematically introduce the just another Gibbs sampler (JAGS) software program to fit common Bayesian cognitive diagnosis models (CDMs) including the deterministic inputs, noisy “and” gate model; the deterministic inputs, noisy “or” gate model; the linear logistic model; the reduced reparameterized unified model; and the log-linear CDM (LCDM). Further, we introduce the unstructured latent structural model and the higher order latent structural model. We also show how to extend these models to consider polytomous attributes, the testlet effect, and longitudinal diagnosis. Finally, we present an empirical example as a tutorial to illustrate how to use JAGS codes in R.
Keywords
Introduction
In recent years, many cognitive diagnosis models (CDMs) have been proposed, such as the deterministic inputs, noisy “and” gate (DINA) model (Haertel, 1989; Junker & Sijtsma, 2001; Macready & Dayton, 1977); the deterministic inputs, noisy “or” gate (DINO) model (Templin & Henson, 2006); the linear logistic model (Maris, 1999); and the reduced reparameterized unified model (rRUM; Hartz, 2002). A few general CDMs have also become available, such as the log-linear CDM (LCDM; Henson, Templin, & Willse, 2009), the generalized DINA model (de la Torre, 2011), and the general diagnostic model (von Davier, 2008).
With the advancement in computing power and the Markov chain Monte Carlo (MCMC) algorithms, Bayesian CDM has become increasingly popular (Culpepper, 2015a; Culpepper & Hudson, 2018; DeCarlo, 2012; de la Torre & Douglas, 2004; Huang & Wang, 2014; F. Li, Cohen, Bottge, & Templin, 2016; X. M. Li & Wang, 2015; Sinharay & Almond, 2007; Zhan, Jiao, & Liao, 2018; Zhan, Jiao, Liao, & Bian, 2018). Multiple software programs are available to implement certain Bayesian MCMC algorithms, such as WinBUGS (Lunn, Thomas, Best, & Spiegelhalter, 2000), OpenBUGS (Spiegelhalter, Thomas, Best, & Lunn, 2014), just another Gibbs sampler (JAGS; Plummer, 2015), and MCMCpack (Martin, Quinn, & Park, 2011) package in R (R Core Team, 2016). However, there remains a lack of systematic introduction to using such software programs to fit Bayesian CDMs.
Unlike the frequentist approach that treats model parameters as fixed, the Bayesian approach considers them as random and uses (prior) distributions to model our beliefs regarding them. Within the frequentist framework, parameter estimation refers to a point estimate of each model parameter. In contrast, in a fully Bayesian analysis, we seek a whole (posterior) distribution of the model parameter, which includes the entire terrain of peaks, valleys, and plateaus (Levy & Mislevy, 2016). The posterior distribution of parameters, given the data, is proportional to the product of the likelihood of the data, given the parameters, and the prior distribution of the parameters. Typically, the posterior distribution is represented in terms of the posterior mean (or median, or mode) as a summary of central tendency, as well as the posterior standard deviation as a summary of variability.
Adopting the Bayesian MCMC estimation over a frequentist estimation—for example, a maximum likelihood estimation (see, e.g., Wagenmakers, Lee, Lodewyckx, & Iverson, 2008)—includes the following advantages: (a) It does not depend on asymptotic theory, (b) it treats both item and person parameters as random effects, (c) it incorporates the principle of parsimony by marginalization of the likelihood function, and (d) it is more robust in terms of handling complex models. Moreover, in the Bayesian estimation, the percentiles of the posterior can be used to construct the credible interval (or Bayesian confidence interval), which can be used for testing significance (Box & Tiao, 1973). Further, it is also easy to conduct model–data fit with posterior predictive model checking (PPMC).
Currently, numerous CDM studies are rather technical and limited to statistical and psychometric researchers (Templin & Hoffman, 2013). There is a lack of available software for more applied practitioners who would like to use CDMs in developing their diagnostic testing programs or conducting empirical research. Moreover, most existing programs for cognitive diagnosis are either limited in terms of model options or commercialized. For example, the Arpeggio Suite (Bolt et al., 2008), mdltm (von Davier, 2005), CDM package (George, Robitzsch, Kiefer, Gross, & Uenlue, 2016), and the GDINA package (Ma & de la Torre, 2016a) limit users to a few options. In addition, although the Mplus (Muthén & Muthén, 2010) and the flexMIRT (Cai, 2017) can be used to fit many CDMs (Hansen, Cai, Monroe, & Li, 2016; Templin & Hoffman, 2013), their commercialization may limit their access to certain researchers or students. More importantly, CDMs can only be estimated on the basis of the frequentist estimation methods (e.g., maximum likelihood estimation) that are embedded in their software.
In this article, we demonstrate how to use the freeware JAGS to fit several popular CDMs and present the codes of these CDMs. Then, researchers will be able to adapt these codes to fit extended CDMs, which cannot be fitted into existing software or packages.
The Gibbs sampler is used by default in JAGS. In general, JAGS makes it easy to construct a Markov chain for parameters and does not require users to derive the posterior distribution of the model parameters by hand. Moreover, the R2jags package (Version 0.5-7; Su & Yajima, 2015) in R could easily be used to call JAGS. Further, it must be noted that the JAGS codes presented in this study can be easily generalized to other BUGS software programs with minor revisions, such as WinBUGS and OpenBUGS. 1
The following sections illustrate JAGS codes for five CDMs: (1) the DINA model, (2) the DINO model, (3) the log-linear model (LLM), (4) the rRUM, and (5) the LCDM. Apart from these five models, which are based on the unstructured (or saturated) latent structural model, we also demonstrate the higher order latent structural model (de la Torre & Douglas, 2004). Further, we also present the extensions to the polytomous attributes, the testlet effect, and the longitudinal diagnosis using JAGS. Lastly, we conduct an empirical example analysis to illustrate how to use the R2jags package to run the example JAGS codes.
The DINA Model
Let Yni be the response of person n (n = 1,…, N) to item i (i = 1,…, I). Let α
nk
be the binary variable for person n on attribute k (k = 1,…, K,), where α
nk
= 1 indicates that person n shows mastery over attribute k and α
nk
= 0 indicates nonmastery, and let
Among CDMs, the DINA model is one of the most popular models because of its simple structure and straightforward interpretation. The DINA model can be expressed in the following manner:
where pni is the correct response probability of person n to item i; si and gi are the slipping and guessing probabilities, respectively, of item i, which describe the item-level aberrant response probability; η ni is the ideal response for person n to item i based on the conjunctive condensation rule (Maris, 1999), assuming a value of 1 if person n possesses all the attributes required for item i and a value of 0 if the person lacks at least one of the required attributes; mathematically, it is expressed in the following manner:
where wnik can be treated as the latent response on item i for person n to attribute k. Table 1 presents the JAGS codes to fit the DINA model. The codes are elaborated below.
The DINA Model
Note. DINA = deterministic inputs, noisy “and” gate.
Line 1 signals the beginning of the model. Lines 2 through 7 specify the measurement model, lines 8 through 10 are the unstructured latent structural model and priors, and lines 11 through 13 are the priors assumed for the item parameters.
A part of the parameters in Table 1 is assigned with certain previously defined values, including
In the unstructured latent structural model, line 8 describes the method used to obtain the following attributes: α nk = α ck , where c ∈ {1,…, C} indicates person n’s attribute profile and is assumed to follow a categorical distribution, with the mixing proportion of the cth pattern.
Lines 12 and 13 specify the prior distribution of si and gi, respectively. In addition, a monotonicity restriction (gi < 1 − si) is added by truncation
In addition, when no previous experience is available for the scale parameters, a prior on the scale parameters—which is called a hyperprior—can be used. A few extra lines can be added in the following manner:
Then,
The JAGS codes of the reparameterized DINA (RDINA) model (DeCarlo, 2012) are presented in Table 2. The RDINA model uses the logit link function and it is equivalent to the regular DINA model, which can be expressed in the following manner:
where the intercept parameter (λ0,i) defines the log odds of a correct response to item i for a person who is not a master of either one of the attributes; λ(K),i is the K-way interaction effect parameter for item i. In this formulation, the regular gi and si parameters in Equation 1 can be described in the following manner:
The RDINA Model
Note. RDINA = reparameterized deterministic inputs, noisy “and” gate.
Lines 4 through 10 specify the model. Line 12 specifies the distribution for the λ0,i parameter. A normal prior distribution is assumed targeting a mean of −1.096—that is,
Further, the monotonicity restriction (gi < 1 − si) is realized by constraining λ(K),i parameters to be positive. Thus, a truncated normal distribution is specified for λ(K),i in line 13 by truncation
The DINO Model
The DINO model, similar to the DINA model, models the probability of a correct response as a function of a slipping parameter, si, and a guessing parameter, gi. However, the ideal response, η ni , in the DINO model is modeled on the basis of the disjunctive condensation rule (Maris, 1999) rather than the conjunctive condensation rule, as in the DINA model. η ni is expressed as
which is an indicator of whether person n has mastered at least one of the required attributes for item i. Thus, η ni = 1 for any person who has mastered one or more of the item’s required attributes, and η ni = 0 for a person who has mastered none of the required attributes. Although the DINO model shares a dual relationship with the DINA model (Köhn & Chiu, 2016), it is easier for practitioners to directly fit the DINO model to the data.
Table 3 presents the JAGS codes for the DINO model. The differences between the DINO and DINA models are easily handled by JAGS, as shown in lines 4 and 5 in Tables 1 and 3, respectively.
The DINO Model
Note. DINO = deterministic inputs, noisy “or” gate.
The LLM
The LLM (also called the compensatory reparameterized unified model) is constructed on the basis of the compensatory condensation rule (Maris, 1999). The LLM can be expressed in the following manner:
where λ
k,i
is the kth main effect parameter and all λ
k,i
≥ 0. In the LLM, the lowest correct response probability is
The JAGS codes of the LLM are presented in Table 4. For 1 item, the number of main effect parameters is
The LLM
Note. LLM = log-linear model.
The rRUM
In the DINA model, the aberrant responses are modeled at the item level. However, in practice, it may appear reasonable that a respondent lacking only one of the measured attributes has a higher chance of a correct response than a respondent who has not mastered any of the measured attributes. To further differentiate between respondents who have not mastered at least one attribute, the noisy inputs, deterministic “and” gate (NIDA) model (Junker & Sijtsma, 2001) models the aberrant responses at the attribute level, but with equality constraints across items. A straightforward extension of the NIDA model is the generalized NIDA (G-NIDA) model (de la Torre, 2011), in which the slipping and guessing parameters are allowed to vary across items. Thus, there are
where
The rRUM
Note. rRUM = reduced reparameterized unified model.
Following the same sequence, the rRUM is first specified from lines 4 through 6, and priors are specified in the subsequent lines of the table. Please note the distinction between
For 1 item, only
The LCDM
Among the CDMs, the LCDM is sufficiently general to encompass many popular CDMs (e.g., the DINA model, the DINO model, the rRUM, and the LLM), which are special cases obtained by imposing different constraints on item parameters (Henson et al., 2009; Rupp, Templin, & Henson, 2010). In the LCDM, the correct response probability for person n on item i is defined in the following manner:
where the intercept parameter, λ0,i, defines the log odds of a correct response for a person who does not master any attribute, λ k ,i is the main effect of α nk , λ kk′,i is the two-way interaction effect of α nk and α nk′ , and λ(K),i is the K-way interaction effect. To keep pni increasing as the number of mastered attributes increase, λ k,i s and all interaction effects are typically nonnegative. The interaction effects can assume any values.
For simplicity, we assume that only three attributes are required by a test, which means that there are three main effects, three two-way interaction effects, and one three-way interaction in the LCDM. The corresponding JAGS codes are presented in Table 6.
The LCDM
Note. LCDM = log-linear cognitive diagnosis model.
In the LCDM, the number of main effect parameters is
By setting different constraints, the LCDM can be transferred into different CDMs. For example, if we set all interaction effect parameters in lines 25–28 to zeros, then the codes in Table 6 are equivalent to the codes in Table 4—namely, the LCDM reduced to the LLM:
The Higher Order Latent Structural Model
In CDMs, the number of all possible attribute patterns is typically 2 K . The unstructured latent structural model that was used in previous sections requires 2 K − 1 structural parameters for such 2 K possible patterns and leads to a substantial computational burden when there are numerous attributes. de la Torre and Douglas (2004) proposed a solution to reduce the calculations related to the estimation of CDM parameters by involving a higher order latent structure beyond the attributes. They proposed a higher order latent structural model in which all attributes are assumed to be conditionally independent, given a continuous latent trait θ:
where pnk is the probability of person n mastering attribute k, given θ. β k and ξ k denote the intercept and slope parameter of the kth attribute, respectively, and θ is assumed as N(0, 1) for model identification. Owing to this higher order structure, the number of attribute parameters to be estimated is only 2K (i.e., K attribute intercept parameters and K attribute slope parameters) rather than 2 K − 1. Because the number of parameters grows linearly, not exponentially, this formulation significantly reduces the computational burden. Theoretically, as a latent structural model, Equation 9 can be employed in any CDM. For the sake of simplicity, the DINA model is used to illustrate how to incorporate the higher order latent structural model into the DINA model to yield the higher order DINA (HO-DINA) model.
We use the codes from lines 8 through 12 to describe the method used to obtain attributes from the higher order latent structural model. Lines 13 through 15 are the priors of the latent structural parameters. According to the estimated results in previous studies (e.g., Zhan et al., 2018), the absolute values of β
k
and ξ
k
may be large and go up to a value of 3 or even 4. Thus, the scale parameters are suggested to be set as
From the seven examples presented in Tables 1 through 7, an obvious advantage of using JAGS is that previously introduced models could be easily extended by altering a few lines of JAGS codes. In the next three sections, we further extend CDMs to address the polytomous attribute, the testlet effect, and the longitudinal data.
The HO-DINA Model
A DINA Model for Polytomous Attributes
Previously presented models are limited to binary attributes (i.e., mastery or nonmastery). In such a binary classification, it may be difficult to differentiate persons within the same category who master a specific attribute at different levels. For example, it could be that one person fully masters an attribute, while another person is a borderline master who is slightly above the threshold (e.g., 0.5). Thus, the polytomous attributes and the polytomous Q-matrix (Karelitz, 2004; von Davier, 2008) could be a better option. While a binary attribute is related to only two categories, a polytomous attribute is related to more than two categories (e.g., 0, 1, 2). This fine-grained sizing helps to provide a more informative diagnosis of respondents in terms of their mastery levels. Substantively, the polytomous categories can differ for each attribute with well-defined meanings by content experts. Currently, the ordered category attribute coding (OCAC) framework (Karelitz, 2004) is used to address polytomous attributes (e.g., J. Chen & de la Torre, 2013; Zhan, Bian, & Wang, 2016). In the OCAC framework, the ordinal levels of each attribute are coded as nonnegative integers, beginning from 0 to 1 and going up to the highest level. For illustration, the reparameterized polytomous attributes DINA (RPa-DINA) model (Zhan et al., 2016) is demonstrated as an example here. The RPa-DINA model can be expressed in the following manner:
where α
nk
is a polytomous variable for person n on attribute k, α
nk
= l − 1 if person n masters the lth level (l = 1,…, Lk) of attribute k; moreover, we let Lk be the number of ordinal levels of attribute k. As the first level of attribute k is labeled as 0, α
nk
= l − 1; the polytomous Q-matrix is an I × K matrix with element qik = l − 1, thereby indicating the lth level of attribute k is required to answer item i correctly. Further, Anik = I{α
nk
≥ qik} is the ideal response to item i for person n on attribute k, where I{·} is an indicator function. Thus, Anik = 1, if person n’s attribute mastery level is at or above the specific attribute level that is required by item i and 0 otherwise;
The RPa-DINA Model
Note. RPa-DINA = reparameterized polytomous attributes deterministic inputs, noisy “and” gate.
In line 5,
Note that
A DINA Model for Testlet Design
Testlets have been widely adopted in educational and psychological tests. A testlet is a cluster of items that share a common stimulus (Wainer & Kiely, 1987). For example, in a reading comprehension test, a testlet is formed as a bundle of items based on one reading passage. Local item dependence among items within a testlet is called the testlet effect. The testlet effect could be an indication of a noise dimension. In the item response theory (IRT) framework, testlet effects are accounted for by adding a set of additional random effect parameters to standard IRT models: one for each testlet (Wainer, Bradlow, & Wang, 2007) or multiples for each testlet (Zhan, Wang, Wang, & Li, 2014). In practice, testlets can be used in cognitive diagnosis assessment. Although it is not conceptually challenging to add a set of random effect parameters into CDMs, limited efforts have been made for the development of testlet CDMs (Hansen et al., 2016; Liao & Jiao, 2016; Zhan, Li, Wang, Bian, & Wang, 2015; Zhan, Liao, & Bian, 2018).
For illustration, the RDINA model (see Table 2) is used as a template, and this method can be easily extended to the LCDM and other cases. To address the testlet effect, γ nd (i), a random effect parameter, is added to the RDINA model:
where γ
nd
(i) is assumed from a normal distribution
The number of testlets (i.e.,
In this model,
In addition to the unstructured latent structural model, the higher order latent structural model (Equation 9) can be introduced (see the next section).
A DINA Model for Longitudinal Data
Providing diagnostic feedback on growth is crucial to formative decisions, such as targeted remedial instructions or interventions. Measuring individual growth or change relies on longitudinal data collected over multiple measures of achievement constructed along the growth trajectory. However, only a few studies focus on measuring growth with regard to several related attributes over multiple occasions (e.g., F. Li et al., 2016; Wang, Yang, Culpepper, & Douglas, 2018; Zhan, Jiao, Liao, & Li, in press). Unlike continuous latent traits in IRT models, the attributes in CDMs are categorical. Therefore, the methods for modeling growth in the IRT framework may not be directly extended to capture growth in the mastery of attributes.
Currently, there are two main approaches for analyzing longitudinal data in cognitive diagnosis. The first approach adopts the latent class modeling perspective (Y. Chen, Culpepper, Wang, & Douglas, 2018; F. Li et al., 2016; Wang et al., 2018), which can all be taken as a particular case or an application of the mixture hidden Markov model (Vermunt, Tran, & Magidson, 2008). The second approach adopts the IRT modeling perspective, such as the longitudinal higher order DINA (Long-DINA) model (Zhan et al., in press), which uses the variance–covariance-based method by assuming multiple continuous higher order latent traits (see Equation 9) that follow a multivariate normal distribution.
The Testlet-DINA Model
where Ynit denotes the response of person n to item i on occasion t;
For illustration, we assume two occasions (T = 2), 10 items on each occasion, and the first 2 items (M = 2) on each occasion are used as anchor items. The corresponding JAGS codes are provided in Table 10.
The Long-DINA Model
Note. Long-DINA = longitudinal higher order deterministic inputs, noisy “and” gate.
For the test scenario specified above,
The multivariate normal distribution in JAGS is also parameterized in the precision matrix, which is the inverse of the covariance matrix. Thus, the covariance matrix
The expectation–maximization algorithm via flexMIRT (Version 3.51; Cai, 2017) was used in Zhan, Jiao, Liao, and Li (in press).
5
However, due to the restriction of the flexMIRT, multiple Q-matrices and attribute patterns from different occasions must be combined and rebuilt together in an analysis, which may lead to large computing burden. For example, if T = 4 and K = 5, then 2
TK
= 1,048,576 attribute patterns need to be estimated in the flexMIRT. In contrast, due to the flexibility of the JAGS, multiple Q-matrices and attribute patterns on different occasions are used separately (e.g.,
An Empirical Example: A Tutorial
To demonstrate how to use the JAGS codes presented in the earlier sections to analyze a real data set, fraction subtraction data from de la Torre (2009)—originally used by Tatsuoka (1990)—was analyzed. The data set contained a total of 536 people who responded to 15 items that measure five required attributes. The total number of possible attribute profiles was 32. The Q-matrix can be found in de la Torre (2009). The response data and the Q-matrix can be read in R first:
This section illustrates how to employ the R2jags package and JAGS codes to analyze the fraction subtraction data step-by-step. The DINA model and the rRUM were employed and compared. For simplicity, only the DINA model is presented for illustration. Readers can directly adapt the codes in R for other models.
Step 1: Construct all.patterns
Within the codes,
Step 2: Load JAGS Codes for the DINA Model
In R, to overcome incompatibility, the dummy operator
where pni is defined in the same manner as in Equation 1. Note that other kinds of discrepancy measures also can be used for different purposes (Levy & Mislevy, 2016).
Step 3: Load the R2jags Package and Data
Step 4: Preliminary Study for Parameter Convergence
A preliminary study was conducted to obtain a necessary number of iterations to achieve convergence. In Bayesian CDMs, item parameters and mixing proportions are typically checked for convergence (Culpepper, 2015a; Zhan et al., 2018). In the preliminary study, two Markov chains
Step 5: Parameter Estimation
Although the preliminary study indicates that a burn-in of 1,000 iterations is adequate, we employed a burn-in of 5,000 iterations in this study to ensure the stability of the results. Two Markov chains were used with 10,000 iterations per chain, and the first 5,000 iterations in each chain were excluded as burn-in. The thinning interval was set to be 1. Finally, 10,000 iterations were used for model parameter inferences. The
Step 6: Save Estimated Parameters
The file “summary_DINA.txt” presents the summary statistics based on the sampled values of all monitored parameters. Taking the mixing proportions as an example, Table 11 presents the first three estimated mixing proportions,
Sample Output Results of Mixing Proportions for the DINA Model
Note. SD = standard deviation; Rhat = potential scale reduction factor; n.eff = the effective sample size.
The file “ppp_DINA.txt” contains the PPP value of the DINA model for these data. In addition, the file “deviance_DINA.txt” extracts the posterior mean of
Table 12 presents the model fit comparison between the DINA and rRUM models to the fraction subtraction data. The rRUM model was identified as having a better fit based on AIC, DIC, and a PPP value of 0.701. It took approximately 1,535 and 3,370 seconds to run the DINA and rRUM models, respectively.
Model Fit and Computing Time in the Empirical Example
Note. NP = number of estimated parameters; −2LL = −2 log likelihood; AIC = Congdon’s version of Akaike’s information criterion; DIC = deviance information criterion; Time = overall computing time for two Markov chains (in seconds).
Table 13 presents the estimates of item parameters for two models. Table 14 presents the estimates of the mixing proportions for the two models. It must be noted that the comparison between these two models is beyond the scope of this article. Thus, no further explanation of the results is provided.
Estimates of Item Parameters in the Empirical Example
Note. Posterior standard deviations (i.e., standard errors) are in parentheses. DINA = deterministic inputs, noisy “and” gate; rRUM = reduced reparameterized unified model.
Estimates of Mixing Proportions in the Empirical Example
Note. Posterior standard deviations (i.e., standard errors) are in parentheses; the middle 26 patterns are omitted. DINA = deterministic inputs, noisy “and” gate; rRUM = reduced reparameterized unified model.
Summary
This article presents a systematic introduction to using JAGS for Bayesian CDM estimation. Several JAGS codes are presented to fit a few representative CDMs. The unstructured latent structural model and the higher order latent structural model are both introduced. Further, this article demonstrates how to extend these models to polytomous attributes, the testlet effect, and longitudinal data. Finally, an empirical example is presented to illustrate how the R2jags package must be illustrated to run the JAGS codes.
As a tutorial, this article has its limitations. First, only selected CDMs were demonstrated in this tutorial. Thus, the readers are encouraged to consult other sources for other model extensions. Second, certain emerging research topics are not included, such as the Q-matrix estimation (Y. Chen, Culpepper, Chen et al., 2018; Chung & Johnson, 2018), joint CDMs for response accuracy and response times (e.g., Zhan et al., 2018), and CDMs for polytomous scoring items (Ma & de la Torre, 2016b; von Davier, 2008). Third, only the R2jags package was used to call JAGS; however, other R packages such as the rjags (Plummer, Stukalov, & Denwood, 2016) and jagsUI (Kellner, 2017) can also be used. Fourth, the computing time tended to be rather long, particularly for large-scale tests with a large sample size. Thus, it is desirable to develop more effective Bayesian estimation programs to increase the efficiency in model parameter estimation for the new models, such as the dina (Culpepper, 2015b) package in R. In addition, a new Bayesian software package named Stan (Carpenter et al., 2016) has been developed. Stan uses the no-U-turn sampler (Hoffman & Gelman, 2014), an extension to the Hamiltonian Monte Carlo (Neal, 2011) algorithm. The Hamiltonian Monte Carlo is considerably faster than the Gibbs sampler, which is used in JAGS. Further exploration of using Stan to fit Bayesian CDMs would be valuable (e.g., Lee, 2016).
Overall, given the increasing number of applications of the Bayesian MCMC algorithm in fitting numerous CDMs, JAGS has the potential of becoming a popular tool in the field. Further, it is hoped that researchers can adapt the codes presented in this article for their own testing situations.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Natural Science Foundation of Zhejiang Province, China (Grant No. LY16C090001) and the MOE (Ministry of Education in China) Project of Humanities and Social Sciences (Grant No. 19YJC190025).
