Bayesian Nonparametric Monotone Regression of Dynamic Latent Traits in Item Response Theory Models

Abstract

Parametric methods, such as autoregressive models or latent growth modeling, are usually too inflexible to capture the dependence and nonlinear effects among changes in latent traits when the time gaps are irregular and the recorded time points vary across individuals. In practice, the growth trend of latent traits is often subject to certain monotonicity and smoothness conditions. To incorporate such conditions and to relax the strong parametric assumptions on latent trajectories, a flexible nonparametric prior is introduced to model the dynamic changes of latent traits in item response theory models over the study period. Suitable Bayesian computation schemes are developed for the analysis of longitudinal dichotomous item responses. Simulation studies and a real data example from educational testing illustrate the proposed methods.

Keywords

Bayesian nonparametric monotonic regression; dynamic changes; item response theory model; Markov chain Monte Carlo

1. Introduction

Longitudinal studies play a prominent role in investigating temporal changes in a construct of interest; such investigations are often referred to as growth curve analysis in the social and behavioral sciences. The advent of computerized testing and online rating offers social and behavioral researchers an entirely new way to collect longitudinal data. Test takers have much more freedom than before to choose when they are tested. As a result, the responses are often observed at variable and irregular time points across individuals. The randomness of response times may further create sparsity in certain periods, such as the summer or winter holidays.

Traditionally, two models are widely used for studying individual changes. One is latent growth curve modeling (Bollen & Curran, 2006), and the other is multilevel modeling or hierarchical linear modeling (Raudenbush & Bryk, 2002). However, when outcomes are observed at individually varying and irregularly spaced time points, inferences from these two traditional models may become problematic because their parametric structures are difficult to adjust (Geiser, Bishop, Lockhart, Shiffman, & Grenard, 2013). For instance, their analysis typically requires the same time span of the study and the same testing points for all examinees. Furthermore, for computerized testing or surveys in education, the manifest responses are usually dichotomous, ordinal, or nominal, while the latent traits to be inferred are often continuous. This makes the inference even more difficult because information is lost when the underlying latent variables are discretized. In this article, we mainly focus on extending the classic item response theory (IRT) model framework to model longitudinal dichotomous data collected at irregular and variable time points.

1.1. Review of Relevant Literature

Two major approaches, the multidimensional approach and the multilevel approach, are available in the current literature on IRT models for the analysis of longitudinal data. First, in the multidimensional approach, a multidimensional IRT model is used to represent the change of an ability as an initial ability and one or more modified abilities in unidimensional or multidimensional tests (Cho, Athay, & Preacher, 2013; Embretson, 1991; te Marvelde, Glas, Landeghem, & Damme, 2006). However, this approach allows little variation of items across occasions, and it often requires the individuals to take the same tests. These drawbacks prevent us from extending such methods to analyze a time series of computerized testing data.

Second, in the multilevel approach, the first level is often assumed to follow a classic IRT model, while at the higher level there are two common ideas for modeling the growth. One idea is to assume that the growth of a latent trait is a parametric function of time, such as a linear or polynomial regression on the time variable with fixed or random coefficients. This idea is a variation of latent growth curve modeling for the analysis of binary or categorical longitudinal data (Albers, Does, Imbos, & Janssen, 1989; Hsieh, von Eye, Maier, Hsieh, & Chen, 2013; Johnson & Raudenbush, 2006; Tan, Ambergen, Does, & Imbos, 1999; Verhagen & Fox, 2012). The other idea is to employ Markov chain models, in which the changes of a latent trait over time are assumed to depend on its previous value or status (Bartolucci, Pennoni, & Vittadini, 2011; Kim & Camilli, 2014; Martin & Quinn, 2002; Park, 2011). However, there are many instances in which neither of the two ideas is enough to describe the growth (Bollen & Curran, 2004). One such instance is computerized testing, especially when the time lapses between tests are unequally spaced both across and within individuals.

To tackle the challenges in computerized testing for modeling the growth of latent traits, Wang, Berger, and Burdick (2013) proposed a dynamic model by combining the ideas of parametric functions of time as well as Markov chain models to describe the growth. They imbedded IRT models into a new class of state space models for analyzing longitudinal data located at individually varying and irregularly spaced time points. Nevertheless, assuming a particular functional relationship for the growth in general is restrictive and usually difficult to justify. Instead, a nonparametric model could be much more flexible to describe changes of latent traits and avoid the errors of model misspecification.

On further investigation of the results shown in Wang et al. (2013), we found that the trajectory of one’s reading ability grows more quickly in the initial period but slows down as it approaches maturation. Overall, the ability exhibits an increasing trend but often has a flat region at the end. This shape of the growth trajectory is consistent with the prior beliefs and experience of practitioners. In the social and behavioral sciences, prior knowledge about the shape of the trajectory, such as monotonicity, convexity, or concavity, may be available ahead of the analysis to aid the modeling process and enhance interpretability. This motivates us to incorporate shape constraints as prior information in nonparametric modeling. The use of shape information is expected to improve the efficiency and accuracy of the nonparametric estimates.

From the Bayesian perspective, nonparametric regression with monotonicity constraints has already been considered in the literature. For instance, Gelfand and Kuo (1991) used an ordered Dirichlet process prior to impose the monotone constraint. Neelon and Dunson (2004) imposed a piecewise linear regression, with an autoregressive prior for the parameters of basis functions. Shively, Sager, and Walker (2009) as well as Brezger and Steiner (2012) implemented restricted splines to ensure monotonicity. McKay Curtis and Ghosh (2011) used Bernstein polynomials with restrictions on parameter space, while Choi, Kim, and Jo (2016) extended this idea by allowing incorporation of uncertainty of the constraints through the prior. Lin and Dunson (2014) proposed Gaussian process projection to perform shape-constrained inference and applied the method to multivariate surface estimation. Wang and Berger (2016) imposed the constraints on the derivative process of the original Gaussian process to estimate shape-constrained functions.

However, typical nonparametric models are often less interpretable. In this article, inspired by the idea of Bornkamp and Ickstadt (2009), we embed a flexible and easily interpretable Bayesian nonparametric monotone regression of latent traits into IRT models. The monotone regression can be written as the sum of two parts: (1) an intercept parameter (interpretable as one’s initial ability) and (2) the product of a scale parameter (interpretable as the maximum amount by which one’s ability can grow during the study period) and a continuous function of the time variable (which is monotonically increasing over the study period and can easily capture the plateau effect of one’s ability at the end). Another advantage of our proposed approach is that the parameters of the underlying base functions that constitute the monotone function are not constrained within the parameter space. The lack of constraints on the parameter space makes the model more flexible and reduces the computational burden. Additionally, the base functions themselves can be learned directly from the data.

1.2. EdSphere Test Bed Application

We will apply our proposed method to the EdSphere data set provided by Highroad Learning Company. EdSphere is a personalized literacy learning platform that continuously collects data about student performance and strategic behaviors each time a student reads an article. During a typical reading test, a student selects from a system-generated list of articles whose difficulty levels lie in a range targeted to the current estimate of the student’s ability. Then, for the selected article, a subset of words from the article is eligible to be clozed, that is, removed and replaced by a blank. The computer, following a prescribed protocol, randomly selects a sample of the eligible words to be clozed and presents the article to the student with these words clozed. The question items produced by this procedure are randomized items. They are single-use items generated at the time of an encounter between a student and an article. If another student selects the same article to read, a new set of clozed words is selected. As a consequence, it is highly improbable that the same item is seen by more than one student, so obtaining empirical estimates of item parameters is not feasible.

The difficulty levels of the items in the reading test are provided by MetaMetrics using proprietary data and methods. The ensemble mean and the variance of the difficulty levels of the items in a test are known due to the test design of the EdSphere learning platform.

Currently, the EdSphere data set consists of 16,949 students from a school district in Mississippi, who participated in the EdSphere learning platform over 5 years (2007–2011). A snapshot of the data set is included in Table B1 in the online version of the article. The students were in different grades and could enter and leave the program at different times. They were free to take tests on different days and had different time lapses between tests. This design yields longitudinal observations located at individually varying and irregularly spaced time points and suggests that we need to model the changes of latent traits with a dynamic structure. Further, given the EdSphere test design, factors such as overall comprehension, emotional status, and others may exist and undermine the local independence assumption of IRT models, as mentioned in Wang et al. (2013). Therefore, we aim to extend the classic IRT models to accommodate modern computerized (adaptive) testing (not merely the EdSphere data set), which has the distinctive features described above, that is, randomized items, longitudinal observations, and local dependence.

2. Nonparametric Monotone Regression

In this article, we keep the discussion focused on the one-parameter IRT model that links latent ability and item difficulty to the correctness of each item. The idea could be implemented similarly in two-parameter or three-parameter IRT models or in other continuous, ordinal, and nominal latent variable models. In the classic one-parameter IRT model, the latent ability of each individual and the difficulty level of each item are the two key components, and they are often assumed to be static. But in the EdSphere data set, the item responses are longitudinal, and thus, the latent ability of individuals and the item difficulty levels both vary with time. In addition, test takers are allowed to take tests at any time they wish in computerized testing. Moreover, they can take more than one exam per day. Therefore, following the discussion of Wang et al. (2013), we need to extend the classic one-parameter IRT model as below.

The proposed shape-constrained IRT model involves two levels, that is,

Level 1 : Pr (X_{i, j, s, k} = 1 | θ_{i, j}, η_{i, j, s}, d_{i, j, s, k}) = F (θ_{i, j} - d_{i, j, s, k} + η_{i, j, s}),

Level 2 : θ_{i, j} = f_{i} (t_{i, j}) + ω_{i, j} .

In the first level, Equation 1 extends the classic one-parameter IRT models to the scenario of computerized testing, where $θ_{i, j}$ represents the $i$th person’s ability (latent trait) on the $j$th day, assuming that a person’s ability is constant over a given day; $d_{i, j, s, k}$ is the difficulty of the $k$th item in the $s$th test on the $j$th day taken by subject $i$; $η_{i, j, s}$ accounts for the random effects that cannot be explained by the person’s ability and the item difficulty in the $s$th test on the $j$th day for the $i$th person; $F (\cdot)$ is a cumulative distribution function (CDF) for continuous random variables; and $i = 1, \dots, n$, $j = 1, \dots, T_{i}$, $s = 1, \dots, S_{i, j}$, and $k = 1, \dots, K_{i, j, s}$. For the EdSphere data set, since the ensemble mean and variance of the item difficulties in a test are known quantities due to the test design, we assume

d_{i, j, s, k} = a_{i, j, s} + υ_{i, j, s, k},

where $υ_{i, j, s, k} \sim N (0, σ^{2})$, and $a_{i, j, s}$ and σ are known. Further, we presume that for each individual $i$, the random effects $η_{i, j, s} \overset{i.i.d.}{\sim} N (0, τ_{i}^{- 1})$ for $j = 1, \dots, T_{i}$ and $s = 1, \dots, S_{i, j}$, where $τ_{i}$ is a precision parameter that varies across individuals. For the link function $F^{- 1} (\cdot)$, we use $F^{- 1} (\cdot) = Φ^{- 1} (\cdot)$ (called the normal ogive or probit link), where $Φ^{- 1} (\cdot)$ is the inverse of the standard normal CDF; this link eases the computation for Bayesian analysis.
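As a minimal sketch of the Level 1 model under the probit link, the following Python snippet evaluates the response probability for a known item difficulty and after averaging over $υ_{i, j, s, k}$. The function names and the numerical inputs are hypothetical; the marginal form $Φ ((θ - a + η) / \sqrt{1 + σ^{2}})$ is not stated above but follows from the normality of $υ_{i, j, s, k}$.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, supplies the probit CDF Phi


def prob_correct(theta, d, eta):
    """Level 1 probability Phi(theta - d + eta) for a known item difficulty d."""
    return STD_NORMAL.cdf(theta - d + eta)


def prob_correct_marginal(theta, a, eta, sigma):
    """Probability after averaging over d = a + v with v ~ N(0, sigma^2)."""
    return STD_NORMAL.cdf((theta - a + eta) / sqrt(1.0 + sigma ** 2))


# Hypothetical inputs, chosen only for illustration.
p_known = prob_correct(theta=0.5, d=0.5, eta=0.0)
p_marginal = prob_correct_marginal(theta=1.0, a=0.0, eta=0.0, sigma=0.7333)
print(p_known, p_marginal)
```

Uncertainty about the item difficulty pulls the marginal probability toward 0.5, which is why the second value is smaller than the probability obtained with the difficulty fixed at its mean.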

In the second level (Equation 2), $ω_{i, j}$ represents the random residuals, $ω_{i, j} \sim N (0, ϕ^{- 1} Δ_{i, j})$, where ϕ is an unknown parameter proportional to the precision of $ω_{i, j}$ and $Δ_{i, j}$ is the time lapse between the $j$th test day and the $(j - 1)$th test day of subject $i$, $j = 1, \dots, T_{i}$. We assume the variance of $ω_{i, j}$ is proportional to the time lapse because this implies that the uncertainty about one’s ability becomes larger when he or she has not taken a test for a while. $f_{i} (\cdot)$ is the latent trajectory of the $i$th individual and is assumed to be a continuous function, where $t_{i, j}$ is the time location, that is, the actual day on which an examinee takes a test in the study period. Often, some prior information is available when we model $f_{i} (\cdot)$. For example, psychologists and educators can assume that the mean trend of a student’s reading ability is growing, or at least not decreasing, during the study period. We choose to impose such prior beliefs on the modeling of $f_{i} (\cdot)$ since they apply to a large part of the potential applications, and we may be able to check the reasonableness of this assumption from the fit to the data.

We will utilize the idea of Bornkamp and Ickstadt (2009) to model the unknown latent trajectory $f_{i} (\cdot)$ flexibly and conveniently with a monotone shape constraint. First, for each individual $i$, $i = 1, \dots, n$, we rescale the original time units into [0,1] by subtracting the minimum time value of the $i$th individual from the original time and then dividing by the time range of individual $i$. After such rescaling, we continue to use $t_{i, j}$ to denote time for convenience and assume

f_{i} (t_{i, j}) = β_{i, 0} + β_{i, 1} f_{i}^{0} (t_{i, j}), i = 1, \dots, n, j = 1, \dots, T_{i},

where $f_{i}^{0} (\cdot)$ is the CDF of a bounded and continuous random variable on [0,1]. Thus, $f_{i} (\cdot)$ will be increasing if $β_{i,1} \geq 0$ and decreasing otherwise. Moreover, $f_{i} (\cdot)$ can accommodate flat regions if $f_{i}^{0} (\cdot)$ is chosen properly. This is because the CDF of a random variable reaches its plateau as the variable approaches the bound of its domain. Under the current time rescaling mechanism and since $0 \leq f_{i}^{0} (t_{i, j}) \leq 1$ for each individual $i$, the intercept $β_{i,0}$ can be interpreted as the initial level of the latent ability of the $i$th individual, while $β_{i,0} + β_{i,1}$ can be treated as the maximum level that an individual can reach during the study period. Second, to make the modeling of $f_{i}^{0} (\cdot)$ more flexible, we introduce the nonparametric ideas,

f_{i}^{0} (t_{i, j}) = \int_{Ξ} F (t_{i, j}, ξ) G_{i} (d ξ) = \sum_{ℓ = 1}^{L_{i}} π_{i, ℓ} F (t_{i, j}, ξ_{i, ℓ}), ξ_{i, ℓ} \overset{i.i.d.}{\sim} P_{0},

where we model $f_{i}^{0} (\cdot)$ as a discrete mixture of parametric CDFs. In Equation 5, $F (\cdot, ξ)$ is a CDF with parameters $ξ$ belonging to the parameter space $Ξ$; $G_{i}$ is a discrete probability measure on $Ξ$, to which we assign the general discrete random measure prior introduced by Ongaro and Cattaneo (2004), that is, $G_{i} (d ξ) = \sum_{ℓ = 1}^{L_{i}} π_{i, ℓ} δ_{ξ_{i, ℓ}} (d ξ)$, where δ is a Dirac delta function; the $ξ_{i, ℓ}$’s are independent and identically distributed realizations from a continuous distribution $P_{0}$ on $Ξ$ and are assumed to be independent of the $π_{i, ℓ}$’s and $L_{i}$’s; the $π_{i, ℓ}$’s satisfy $\sum_{ℓ = 1}^{L_{i}} π_{i, ℓ} = 1$; and $L_{i}$ has its support on the positive integers. Also, note that the common choice of $P_{0}$ in Equation 5 helps us to borrow strength among individual latent trajectories. The construction of Equation 5 contains many popular discrete random probability measures in the current literature as special cases, such as the Dirichlet process, general stick-breaking processes, and so on. In addition, this construction is very flexible in modeling monotone functions since it does not impose any restrictive structure on the parameters (i.e., on the weights $π_{i, ℓ}$’s and the parameters $ξ_{i, ℓ}$’s), in contrast to the other methods mentioned in the Introduction section.

In Equation 5, a proper choice of the base distribution function $F (\cdot, ξ)$ is the key to successfully modeling $f_{i} (\cdot)$. A typical requirement is that a convex combination of the functions $F (\cdot, ξ_{i,1}), F (\cdot, ξ_{i,2}), \dots$, for any $i = 1, \dots, n$, can approximate an arbitrary continuous CDF on [0,1]. The beta distribution functions (i.e., the regularized incomplete beta functions) satisfy this requirement. However, they usually carry a heavy computational burden because they have no closed form. To balance the computational burden and the adequacy of approximation, we consider the CDF of the two-sided power (TSP) distribution (cf. Van Dorp & Kotz, 2002) for $F (\cdot, ξ)$:

F (t_{i, j}, ξ) = \begin{cases} b \left( \frac{t_{i, j}}{b} \right)^{γ}, & \text{if } 0 \leq t_{i, j} \leq b, \\ 1 - (1 - b) \left( \frac{1 - t_{i, j}}{1 - b} \right)^{γ}, & \text{if } b \leq t_{i, j} \leq 1, \end{cases} \qquad (b, γ) \in [0, 1] \times R_{+},

which has two key parameters $(b, γ)$ and is a viable alternative to the beta distribution functions. Here, we define $ξ = (b, γ)$. Bornkamp and Ickstadt (2009) proved that convex combinations of TSP CDFs are capable of approximating any continuous CDF on [0,1].
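The construction above can be sketched in a few lines of Python: the time rescaling, the TSP CDF, the discrete mixture $f_{i}^{0}$, and the monotone trajectory $f_{i} = β_{i,0} + β_{i,1} f_{i}^{0}$. This is an illustrative sketch rather than the authors' implementation. All function names are hypothetical, the test days and the values of $β_{i,0}$ and $β_{i,1}$ are made up for illustration, and the mixture parameters are those listed for the first simulated subject in Table 1.

```python
def rescale_times(days):
    """Map raw test days to [0, 1] by subtracting the minimum and dividing by the range."""
    t_min, t_max = min(days), max(days)
    if t_max == t_min:
        raise ValueError("need at least two distinct test days")
    return [(d - t_min) / (t_max - t_min) for d in days]


def tsp_cdf(t, b, gamma):
    """CDF of the two-sided power distribution on [0, 1] with parameters (b, gamma)."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    if t <= b and b > 0.0:
        return b * (t / b) ** gamma
    return 1.0 - (1.0 - b) * ((1.0 - t) / (1.0 - b)) ** gamma


def mixture_cdf(t, weights, bs, gammas):
    """Discrete mixture of TSP CDFs; the weights are assumed to sum to one."""
    return sum(w * tsp_cdf(t, b, g) for w, b, g in zip(weights, bs, gammas))


def trajectory(t, beta0, beta1, weights, bs, gammas):
    """Monotone latent trajectory beta0 + beta1 * f0(t)."""
    return beta0 + beta1 * mixture_cdf(t, weights, bs, gammas)


# Mixture parameters of the first simulated subject (Table 1).
weights, bs, gammas = [0.5, 0.5], [0.5, 0.1], [3.0, 5.0]
# Hypothetical raw test days and regression coefficients.
days = [0, 10, 25, 60, 100]
beta0, beta1 = -1.0, 2.0

times = rescale_times(days)
values = [trajectory(t, beta0, beta1, weights, bs, gammas) for t in times]
print(times)
print(values)
```

Because every component is a CDF and the weights are nonnegative, the trajectory starts at $β_{i,0}$, ends at $β_{i,0} + β_{i,1}$, and is nondecreasing in between whenever $β_{i,1} \geq 0$, without any constraint on $(b, γ)$ beyond their domains.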

We illustrate several examples of TSP functions in Figure 1. From Figure 1, we can see that γ controls the steepness of the curve. When γ is small, the pace of increase is comparatively slow, and as γ becomes larger, the increasing trend becomes steeper. The parameter b is the unique mode of the TSP distribution when $γ > 1$. A TSP function with a small b and a large γ describes a curve that increases steeply in the beginning and stabilizes afterward, which could be viewed as a “fast learner” growth curve, while a TSP function with a large b and a small γ can be viewed as the growth trend of an individual who improves steadily at a slower pace. Other scenarios can be interpreted accordingly, and they vary with the values of b and γ.

Figure 1.

Illustration of several two-sided power functions with different γ and b values.

3. Bayesian Computation Scheme

The hierarchical model of Equations 1 and 2 can accommodate the complex structure of computerized testing (e.g., EdSphere data sets), and it also allows the incorporation of prior information. However, because of the complexity of the model considered, we have to resort to Markov chain Monte Carlo (MCMC) computational techniques for the analysis. A by-product of Bayesian inference is that all uncertainties in all quantities are combined in the overall assessment of inferential uncertainty.

3.1. Prior Specification and Posterior Distribution

Before starting the Bayesian inference, we first have to specify the prior distributions of the unknowns in the model. For the parameters ϕ, $τ_{i}$’s, $β_{i,0}$’s, and $β_{i,1}$’s in Equations 1, 2, and 3, there is a lack of scientific knowledge, so we use the following objective priors for them: $π (ϕ) \propto ϕ^{- \frac{3}{2}}$, $π (τ_{i}) \propto τ_{i}^{- \frac{3}{2}}$, $π (β_{i,0}) \propto 1$, and $π (β_{i,1}) \propto 1$, for $i = 1, \dots, n$. The objective priors used for $π (ϕ)$ and $π (τ_{i})$ are recommended by Wang et al. (2013).

According to Lemma 1 in Bornkamp and Ickstadt (2009), the prior distributions of the $ξ_{i, ℓ}$’s determine the prior mean and prior correlation structure of $f_{i}^{0} (\cdot)$. Without expert information, a uniform distribution on a finite subset of the parameter space $Ξ$ for the $ξ_{i, ℓ}$’s is a reasonable choice to start with. Theoretically, we would like to elicit an unbounded prior for $L_{i}$. One viable choice is the zero-truncated Poisson distribution with rate parameter $λ > 0$, whose prior mean is $\frac{λ e^{λ}}{e^{λ} - 1}$. The larger the value of λ, the more components there are in the TSP mixture. For the prior of $π_{i} = (π_{i, 1}, \dots, π_{i, L_{i}})^{'}$, a natural choice is a symmetric Dirichlet distribution with common parameter $ρ > 0$. Notice that the prior variability of $f_{i}^{0} (\cdot)$ increases when $L_{i}$ or ρ gets smaller (see Lemma 1 in Bornkamp & Ickstadt, 2009). Hence, in practice, the values of λ and ρ are chosen according to the desired prior variability of $f_{i}^{0} (\cdot)$ and the expected number of jumps in the model response.

Using the priors specified above, we can derive the posterior of the unknowns in the proposed model, as shown in Equation 8 of online Appendix A. We can show that this posterior is proper (see details in online Appendix A). Hence, statistical inferences based on sampling from this posterior are legitimate. To facilitate the computation of the posterior of the unknowns, we implement the idea of data augmentation (Albert & Chib, 1993) by introducing a latent variable $Z_{i, j, s, k}$ for each dichotomous response $X_{i, j, s, k}$, that is, defining $Z_{i, j, s, k} \sim N (θ_{i, j} - a_{i, j, s} + η_{i, j, s},1 + σ^{2}) I (Z_{i, j, s, k} > 0)$ if $X_{i, j, s, k} = 1$ and $Z_{i, j, s, k} \sim N (θ_{i, j} - a_{i, j, s} + η_{i, j, s},1 + σ^{2}) I (Z_{i, j, s, k} \leq 0)$ if $X_{i, j, s, k} = 0$. Therefore, the two-level hierarchical model (Equations 1 and 2) can be simplified as
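To make the prior on the mixture concrete, the sketch below draws one realization of $(L_{i}, π_{i}, b_{i}, γ_{i})$: a zero-truncated Poisson count by CDF inversion, symmetric Dirichlet weights via normalized gamma variates, and uniform draws for b and γ. The function names are hypothetical, and the settings $λ = 2$, $ρ = 1$, Uniform(0,1) for b, and Uniform(1,50) for γ are the ones used in the first simulation study of this article rather than general recommendations.

```python
import random
from math import exp


def zt_poisson_mean(lam):
    """Prior mean of the zero-truncated Poisson: lam * e^lam / (e^lam - 1)."""
    return lam * exp(lam) / (exp(lam) - 1.0)


def sample_zt_poisson(lam, rng):
    """Draw from Poisson(lam) conditioned on being at least 1, by CDF inversion."""
    u = rng.random()
    k = 1
    pmf = lam * exp(-lam) / (1.0 - exp(-lam))  # P(L = 1)
    cdf = pmf
    while u > cdf and k < 1000:  # the cap guards against rounding in the far tail
        k += 1
        pmf *= lam / k  # P(L = k) = P(L = k - 1) * lam / k
        cdf += pmf
    return k


def sample_dirichlet(rho, size, rng):
    """Symmetric Dirichlet(rho, ..., rho) draw via normalized Gamma(rho, 1) variates."""
    draws = [rng.gammavariate(rho, 1.0) for _ in range(size)]
    total = sum(draws)
    return [g / total for g in draws]


def sample_mixture_prior(lam, rho, gamma_range, rng):
    """One prior draw of the number of components, weights, and TSP parameters."""
    n_comp = sample_zt_poisson(lam, rng)
    weights = sample_dirichlet(rho, n_comp, rng)
    bs = [rng.uniform(0.0, 1.0) for _ in range(n_comp)]
    gammas = [rng.uniform(*gamma_range) for _ in range(n_comp)]
    return n_comp, weights, bs, gammas


rng = random.Random(0)
n_comp, weights, bs, gammas = sample_mixture_prior(2.0, 1.0, (1.0, 50.0), rng)
print(n_comp, weights, bs, gammas)
print(zt_poisson_mean(2.0))
```

Repeating the draw many times and plotting the implied $f_{i}^{0}$ curves is one informal way to judge whether the chosen λ and ρ give the desired prior variability.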

Z_{i, j, s, k} = f_{i} (t_{i, j}) + ω_{i, j} - d_{i, j, s, k} + η_{i, j, s} + ∊_{i, j, s, k},

with $∊_{i, j, s, k} \overset{i.i.d.}{\sim} N (0, 1)$. Our computation schemes for drawing samples from the joint posterior distribution of the unknowns are derived from this data augmentation model.
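The augmentation step amounts to drawing each $Z_{i, j, s, k}$ from a normal distribution with mean $θ_{i, j} - a_{i, j, s} + η_{i, j, s}$ and variance $1 + σ^{2}$, truncated to the positive or nonpositive half-line according to the observed response. A minimal sketch using inverse-CDF sampling is given below; the function name is hypothetical, and the clamping constant is a numerical convenience that is only adequate when the mean is not extremely far from zero.

```python
import random
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()
EPS = 1e-12  # keeps the uniform draw strictly inside (0, 1) for inv_cdf


def sample_latent_z(x, mean, sigma, rng):
    """Draw Z ~ N(mean, 1 + sigma^2) truncated to Z > 0 if x == 1, else Z <= 0."""
    sd = sqrt(1.0 + sigma ** 2)
    p_nonpos = STD_NORMAL.cdf((0.0 - mean) / sd)  # untruncated P(Z <= 0)
    if x == 1:
        u = p_nonpos + (1.0 - p_nonpos) * rng.random()  # uniform on [p_nonpos, 1)
    else:
        u = p_nonpos * rng.random()  # uniform on [0, p_nonpos)
    u = min(max(u, EPS), 1.0 - EPS)
    return mean + sd * STD_NORMAL.inv_cdf(u)


rng = random.Random(2024)
# Hypothetical mean theta - a + eta = 0.3; sigma = 0.7333 as in the simulation study.
z_correct = sample_latent_z(1, 0.3, 0.7333, rng)
z_incorrect = sample_latent_z(0, 0.3, 0.7333, rng)
print(z_correct, z_incorrect)
```

For means many standard deviations away from zero, a dedicated tail sampler would be a safer choice than the clamped inverse CDF used here.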

Let us denote $Λ_{i} = (L_{i}, π_{i,1}, \dots, π_{i, L_{i}}, ξ_{i,1}, \dots, ξ_{i, L_{i}})^{'}$, where each $ξ_{i, ℓ} = (b_{i, ℓ}, γ_{i, ℓ})$ contains the parameters of the $ℓ$th TSP mixture component for the $i$th subject and $π_{i, ℓ}$ is the weight assigned to the $ℓ$th mixture component, with $i = 1, \dots, n$ and $ℓ = 1, \dots, L_{i}$. Thus, $Λ_{i}$ contains all information about the TSP mixture for the $i$th individual, and we write $Λ = \{Λ_{1}, \dots, Λ_{n}\}$ and $ξ = \{ξ_{1, 1}, \dots, ξ_{1, L_{1}}, \dots, ξ_{n,1}, \dots, ξ_{n, L_{n}}\}$ for the sets of variables over all n subjects.

Similarly, we use the notations $θ, β, Z, η, τ$ for the sets of the corresponding variables introduced in Section 2 over all indices, that is, $β = \{β_{1, 0}, β_{1, 1}, \dots, β_{n,0}, β_{n,1}\}$, $η = \{η_{i, j, s} : i = 1, \dots, n, j = 1, \dots, T_{i}, s = 1, \dots, S_{i, j}\}$, $τ = \{τ_{1}, \dots, τ_{n}\}$, $Z = \{Z_{i, j, s, k} : i = 1, \dots, n, j = 1, \dots, T_{i}, s = 1, \dots, S_{i, j}, k = 1, \dots, K_{i, j, s}\}$, and $θ = \{θ_{1, 1}, \dots, θ_{1, T_{1}}, \dots, θ_{n,1}, \dots, θ_{n, T_{n}}\}$. Likewise, define $X = \{X_{i, j, s, k} : i = 1, \dots, n, j = 1, \dots, T_{i}, s = 1, \dots, S_{i, j}, k = 1, \dots, K_{i, j, s}\}$. Then, the joint posterior distribution of the parameters $θ, β, Λ, Z, η, τ, ϕ$ given the data $X$ is derived as

\begin{array}{l} f (θ, β, Λ, Z, η, τ, ϕ | X) \\
\propto f (X | Z) f (Z | θ, η) f (θ | β, Λ, ϕ) f (η | τ) π (β) π (Λ) π (ϕ) π (τ) \\
\propto \prod_{i = 1}^{n} \prod_{j = 1}^{T_{i}} \prod_{s = 1}^{S_{i, j}} \prod_{k = 1}^{K_{i, j, s}} \Big\{ \big[ I (Z_{i, j, s, k} > 0) I (X_{i, j, s, k} = 1) + I (Z_{i, j, s, k} \leq 0) I (X_{i, j, s, k} = 0) \big] \\
\quad \times \sqrt{\frac{ψ_{i, j, s, k}}{2 π}} \exp \Big[ - \frac{ψ_{i, j, s, k} (Z_{i, j, s, k} - θ_{i, j} + a_{i, j, s} - η_{i, j, s})^{2}}{2} \Big] \Big\} \\
\times \prod_{i = 1}^{n} \prod_{j = 1}^{T_{i}} \sqrt{\frac{ϕ}{2 π Δ_{i, j}}} \exp \Big\{ - \frac{ϕ [θ_{i, j} - β_{i,0} - β_{i,1} f_{i}^{0} (t_{i, j})]^{2}}{2 Δ_{i, j}} \Big\} \\
\times \prod_{i = 1}^{n} \prod_{j = 1}^{T_{i}} \Big( \frac{τ_{i}}{2 π} \Big)^{\frac{S_{i, j}}{2}} \exp \Big\{ - \frac{τ_{i} \sum_{s = 1}^{S_{i, j}} η_{i, j, s}^{2}}{2} \Big\} \Big[ \prod_{i = 1}^{n} π (τ_{i}) π (β_{i}) π (Λ_{i}) \Big] π (ϕ), \end{array}

where $ψ_{i, j, s, k} = (1 + σ^{2})^{- 1}$ and $π (τ_{i})$, $π (β_{i})$, $π (Λ_{i})$, and $π (ϕ)$ denote the priors specified at the beginning of this subsection.

3.2. The MCMC Sampling Schemes

The key part of developing an MCMC algorithm to draw samples from the joint posterior distribution (Equation 7) is to estimate the latent trajectories $f_{i}^{0} (t_{i, j})$ in Equation 4. Notice that once Λ is known, the $f_{i}^{0} (t_{i, j})$’s are fully determined. Since Λ can vary in dimension, we employ a reversible jump Markov chain Monte Carlo (RJ-MCMC) sampling scheme (Green & Hastie, 2009). Further, to reduce the correlation among the MCMC samples of the parameters and to achieve faster convergence, we implement the idea of partially collapsed Gibbs sampling (Van Dyk & Park, 2008). Thus, at the $q$th iteration, we sample the unknowns in the order below:

Step 1. Sample $Z^{(q)}$ from $f (Z^{(q)} | θ^{(q - 1)}, η^{(q - 1)})$, which is a truncated normal distribution for the full conditional distribution of each individual $Z_{i, j, s, k}$ given the rest;

Step 2. Sample $η^{(q)}$ from $f (η^{(q)} | θ^{(q - 1)}, τ^{(q - 1)}, Z^{(q)})$, which is a normal distribution for the full conditional distribution of each $η_{i, j, s}$ given the rest;

Step 3. Sample $τ^{(q)}$ from $f (τ^{(q)} | η^{(q)})$, which is a gamma distribution for the full conditional distribution of each $τ_{i}$ given the rest;

Step 4. Sample $Λ^{(q)}$ from $f (Λ^{(q)} | θ^{(q - 1)})$, which has no closed form; moreover, the dimension of Λ varies across iterations, and thus, we employ the Metropolis–Hastings algorithm within the RJ-MCMC to sample from the full conditional distribution of each $Λ_{i}$ given the rest;

Step 5. Sample $ϕ^{(q)}$ from $f (ϕ^{(q)} | θ^{(q - 1)}, Λ^{(q)})$, which is a gamma distribution for the full conditional distribution of ϕ given the rest;

Step 6. Sample $β^{(q)}$ from $f (β^{(q)} | θ^{(q - 1)}, Λ^{(q)}, ϕ^{(q)})$, which is a multivariate normal distribution for the full conditional distribution of β given the rest;

Step 7. Sample $θ^{(q)}$ from $f (θ^{(q)} | β^{(q)}, Λ^{(q)}, Z^{(q)}, η^{(q)}, ϕ^{(q)})$, which is a normal distribution for the full conditional distribution of each individual $θ_{i, j}$ given the rest.

The details of each sampling step are described in online Appendix A. The MCMC sampling loops through Steps 1 through 7 and repeats until the chain has converged. The initial values (i.e., the values at iteration $q - 1 = 0$) used in the simulations and the application are as follows: each $θ_{i, j}^{(0)}$ is drawn from $N (0, 1)$; $β_{i,0}^{(0)}$, $β_{i,1}^{(0)}$, and $η_{i, j, s}^{(0)}$ are set to 0; $ϕ^{(0)} = 200$; and $τ_{i}^{(0)} = 6$ for $i = 1, \dots, n$. The choice of the initial values $Λ_{i}^{(0)} = (L_{i}^{(0)}, π_{i}^{(0)}, b_{i}^{(0)}, γ_{i}^{(0)})$ for $i = 1, \dots, n$ is discussed separately for each example. Convergence is evaluated informally by inspecting trace plots.
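For orientation, the three conjugate updates for $η_{i, j, s}$, $τ_{i}$, and $θ_{i, j}$ can be sketched as follows. The formulas below were obtained by completing the square in the joint posterior as written in Section 3.1; they are not copied from online Appendix A, which may organize the computations differently, and all function names are hypothetical. The updates for Λ, ϕ, and β involve partially collapsed and reversible jump steps and are not covered by this sketch.

```python
import random
from math import sqrt


def eta_conditional(residuals, psi, tau):
    """Mean and variance of eta_{i,j,s}; residuals are Z - theta + a over the items of one test."""
    precision = tau + psi * len(residuals)
    return psi * sum(residuals) / precision, 1.0 / precision


def tau_conditional(etas):
    """Gamma shape and rate of tau_i under the prior tau^(-3/2); etas are all eta_{i,j,s} of subject i."""
    if len(etas) < 2:
        raise ValueError("at least two random effects are needed for a proper gamma")
    return (len(etas) - 1) / 2.0, sum(e * e for e in etas) / 2.0


def theta_conditional(pseudo_obs, psi, f_t, phi, delta):
    """Mean and variance of theta_{i,j}; pseudo_obs are Z + a - eta over all items on day j."""
    precision = phi / delta + psi * len(pseudo_obs)
    mean = (phi * f_t / delta + psi * sum(pseudo_obs)) / precision
    return mean, 1.0 / precision


def draw_normal(mean, variance, rng):
    return rng.gauss(mean, sqrt(variance))


def draw_gamma(shape, rate, rng):
    # random.gammavariate is parameterized by shape and scale, so scale = 1 / rate.
    return rng.gammavariate(shape, 1.0 / rate)


rng = random.Random(7)
psi = 1.0 / (1.0 + 0.7333 ** 2)  # psi = (1 + sigma^2)^(-1)
# Hypothetical residuals and current values, for illustration only.
eta_new = draw_normal(*eta_conditional([0.4, -0.1, 0.3], psi, tau=6.0), rng)
tau_new = draw_gamma(*tau_conditional([0.2, -0.3, 0.1, 0.4]), rng)
theta_new = draw_normal(*theta_conditional([0.9, 1.2, 0.7], psi, f_t=1.0, phi=200.0, delta=11.0), rng)
print(eta_new, tau_new, theta_new)
```

Each conditional mean is a precision-weighted average of the prior location and the data, so a long time lapse $Δ_{i, j}$ lets the item responses on that day dominate the update of $θ_{i, j}$.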

Statistical inference then follows directly from the MCMC samples. For example, an estimate and a 95% credible interval (CI) for the latent trajectory of one’s ability $θ_{i} = (θ_{i,1}, \dots, θ_{i, T_{i}})^{'}$ can be plotted from the median, 2.5%, and 97.5% empirical quantiles of the corresponding MCMC realizations of each $θ_{i, j}$, for $j = 1, \dots, T_{i}$. In the examples, these are graphed as a function of time $t_{i, j}$, so that the dynamic changes of an examinee are apparent.
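The summary described above can be sketched with the Python standard library: for each test day, the retained draws of $θ_{i, j}$ are reduced to their 2.5% quantile, median, and 97.5% quantile. The function names are hypothetical, and the inclusive interpolation rule used here is only one of several common conventions for empirical quantiles.

```python
from statistics import median, quantiles


def summarize_draws(draws):
    """Return the (2.5%, 50%, 97.5%) empirical quantiles of one parameter's MCMC draws."""
    cuts = quantiles(draws, n=40, method="inclusive")  # cut points at multiples of 2.5%
    return cuts[0], median(draws), cuts[-1]


def credible_band(draws_by_day):
    """Pointwise summaries; draws_by_day[j] holds the draws of theta_{i,j} for day j."""
    return [summarize_draws(draws) for draws in draws_by_day]


# Toy draws (the integers 0..100) standing in for real MCMC output.
lower, mid, upper = summarize_draws(list(range(101)))
print(lower, mid, upper)
```

Plotting the three summaries against $t_{i, j}$ gives the pointwise credible band around the estimated trajectory.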

4. Simulation Examples

To validate the inference procedure of the MCMC schemes and to show the benefit of using monotone shape constraints in nonparametric modeling, two simulation studies are conducted. In the first simulated example, we know the true underlying curves $f_{i}^{0} (\cdot)$ that generate the latent trajectory of one’s ability, while in the second simulated example, we have no information about the true underlying curves $f_{i}^{0} (\cdot)$.

4.1. An Example Using the Mixture of TSP Functions as the Latent Trajectory

In this section, we apply our proposed method to a simulated data set that uses a mixture of TSP functions as the true latent trajectory. We consider 10 test takers, that is, $n = 10$; each of them is examined on 60 different test days, that is, $T_{i} = 60, i = 1, \dots,10$. On each distinct test day, there are four examinations for each individual, thus $S_{i, j} = 4$ for $i = 1, \dots,10, j = 1, \dots, T_{i}$, and there are 10 questions (or items) in each test, that is, $K_{i, j, s} = 10$ for $i = 1, \dots,10, j = 1, \dots, T_{i}$ and $s = 1, \dots, S_{i, j}$. For each person $i$, we assume the time lapse between two consecutive test days is a function of $j$, which is set to be $Δ_{i, j} = j + 10$ for $j = 1, \dots, T_{i} / 2$ and $Δ_{i, j} = j - 10$ for $j = T_{i} / 2 + 1, \dots, T_{i}$, with $i = 1, \dots,10$.

We set the true values of the model parameters as follows. First, $ϕ^{- 1 / 2} = 0.05$, and thus, the corresponding standard deviation of the random component $ω_{i, j}$ in Equation 2 is $0.05 \sqrt{Δ_{i, j}}$. Second, $τ^{- 1 / 2} = (0.361, 0.286, 0.322, 0.362, 0.359, 0.302, 0.347, 0.325, 0.360, 0.378)^{'}$, where the $i$th element is the standard deviation of the random effects $η_{i, j, s}$ for the $i$th person. Third, $σ^{2} = 0.7333^{2}$, which is chosen based on the test design of the EdSphere data. The parameters specified for the mixture components of TSP functions are listed in Table 1.

Table 1.

Simulated Parameters for Each Individual of Two-Sided Power Functions

	L	π	b	γ
$Λ_{1}$	2	[0.5, 0.5]	[0.5, 0.1]	[3, 5]
$Λ_{2}$	2	[0.2, 0.8]	[0.4, 0.7]	[2, 8]
$Λ_{3}$	1	1	0.15	10
$Λ_{4}$	2	[0.3, 0.7]	[0.3, 0.5]	[5, 13]
$Λ_{5}$	1	1	0.3	4
$Λ_{6}$	1	1	0.5	17
$Λ_{7}$	2	[0.1, 0.9]	[0.2, 0.3]	[3, 15]
$Λ_{8}$	1	1	0.3	5
$Λ_{9}$	2	[0.6, 0.4]	[0.1, 0.5]	[8, 5]
$Λ_{10}$	2	[0.7, 0.3]	[0.2, 0.45]	[4, 10]
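The time design of this simulation can be reproduced with a short sketch. The lapses $Δ_{i, j}$ and the value $ϕ^{- 1 / 2} = 0.05$ are taken from the description above; the assumption that the test days are the cumulative sums of the lapses, starting from day 0, is ours and is made only to obtain concrete time points.

```python
from math import sqrt

T = 60  # number of test days per subject

# Delta_j = j + 10 for the first half of the test days and j - 10 for the second half.
lapses = [j + 10 if j <= T // 2 else j - 10 for j in range(1, T + 1)]

# Assumption: test days are cumulative sums of the lapses, counted from day 0.
days = []
elapsed = 0
for gap in lapses:
    elapsed += gap
    days.append(elapsed)

# Rescale the test days of one subject to [0, 1].
t_min, t_max = days[0], days[-1]
rescaled = [(d - t_min) / (t_max - t_min) for d in days]

# Standard deviation of omega_j: phi^(-1/2) * sqrt(Delta_j).
phi_inv_sqrt = 0.05
omega_sd = [phi_inv_sqrt * sqrt(gap) for gap in lapses]

print(lapses[:3], lapses[-3:])
print(days[-1], rescaled[:3], omega_sd[:3])
```

Under this design the lapses grow within each half of the study, so the residual standard deviation of $ω_{i, j}$ ranges from $0.05 \sqrt{11}$ to $0.05 \sqrt{50}$.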

We use the prior specified in Section 3.1 for the unknown parameters of the proposed model. Particularly, for this simulated example, we use a zero-truncated Poisson for each L_i with $λ = 2$ , which implies there are expecting no more than three jumps since each of the mixture of TSP functions in the simulation consists at most two components. Without available scientific information, we employ a symmetric Dirichlet distribution with the parameter $ρ = 1$ as the prior for $π_{i, ℓ}$ ’s, assign a Uniform(0,1) for the prior of each $b_{i, ℓ}$ ; and specify a Uniform(1,50) for the prior of each $γ_{i, ℓ}$ .

Model fitting is done with the MCMC algorithm described in Section 3.2, based on 50,000 iterations in total. The first 10,000 samples are discarded as burn-in, and only every 20th value is retained for inference to reduce dependence. After burn-in, the MCMC samples mix well, and the trace plots of the parameters indicate convergence.
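The burn-in and thinning settings determine how many draws are actually used for inference. A minimal sketch follows, in which the array `chain` is only a stand-in for stored MCMC output and the exact indexing of "every 20th value" is an assumption.

```python
import numpy as np

n_iter, burn_in, thin = 50_000, 10_000, 20

# Stand-in for the stored draws of one parameter (iteration index as value).
chain = np.arange(n_iter)

# Drop the burn-in draws, then keep every 20th of the remaining ones.
kept = chain[burn_in::thin]

print(len(kept))
```

Under this indexing, (50,000 − 10,000) / 20 = 2,000 draws remain for posterior summaries.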

In Figure 2, we display the results for four representative individuals among the 10 simulated individuals in the example. These four individuals differ noticeably in the shape of their trajectories. The growth curve of the first subject increases steadily without an obvious transition. The second individual, in contrast, has obvious turning points: he or she grows slowly during the initial period, then grows rapidly in the middle, and reaches a plateau at the end. The third and sixth subjects in Figure 2 show a pattern similar to that of the second individual but with different lengths of the three stages.

Figure 2.

The estimation of latent trajectories for the first, second, third, and sixth subjects. (A) First individual. (B) Second individual. (C) Third individual. (D) Sixth individual.

In each subfigure of Figure 2, the dashed line represents the true underlying function $f_{i} (\cdot)$, the bullet dots are simulated values of the $θ_{i, j}$’s for $j = 1, \dots, T_{i}$, and the star-dotted line is the posterior median estimate of $f_{i} (\cdot)$, along with solid lines representing the corresponding 2.5% and 97.5% credible bands. We can see that the fitted trajectory captures the trend well even though the number of distinct test dates is comparatively small, that is, 60. Simulations with larger sample sizes have also been run, and the results were generally better; they are omitted here to save space.
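Pointwise summaries of this kind can be computed directly from retained MCMC draws of the trajectory evaluated on a time grid. The sketch below assumes a hypothetical array `draws` with one row per retained draw and one column per time point; it illustrates the summary step only, not the sampler.

```python
import numpy as np

def pointwise_band(draws, level=0.95):
    """Return (lower, median, upper) pointwise summaries.

    draws: array of shape (n_draws, n_timepoints) holding posterior draws
    of the trajectory at each time point (hypothetical layout).
    """
    alpha = (1.0 - level) / 2.0
    lower, median, upper = np.percentile(
        draws, [100 * alpha, 50, 100 * (1 - alpha)], axis=0
    )
    return lower, median, upper

# Toy input: 101 "draws" taking the values 0, 1, ..., 100 at each of 3 time points.
draws = np.tile(np.arange(101.0)[:, None], (1, 3))
lower, median, upper = pointwise_band(draws)
print(lower[0], median[0], upper[0])
```

Note that such a band is pointwise: each time point is summarized separately, so the band does not by itself guarantee simultaneous coverage of the whole curve.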

We also calculate the posterior estimates of all other unknowns. The posterior median of $ϕ^{- 1 / 2}$ is 0.0561, with a 95% credible interval (CI) of [0.0491, 0.0634], which includes the true value of 0.05. The posterior medians and 95% CIs of the other parameters are displayed in Figures 3 and 4. We can see that their respective true values, which are marked by crosses, are all included in the 95% CIs.

Figure 3.

Posterior median and 95% CIs of $τ_{i}$ ’s in the mixture of two-sided power function based simulation.

Figure 4.

The results of posterior median and 95% CIs of $β_{i, 0}$ ’s and $β_{i, 1}$ ’s in the mixture of two-sided power function based simulation.

To account for randomness in the simulations, another 100 independent data sets are simulated, with the same parameter setup but different random seeds, to check frequentist coverage probabilities. For each data set, we again run the MCMC sampler for 50,000 iterations, discard the first 10,000 samples as burn-in, and thin the chain by keeping every 20th sample. The results are shown in Table 2. The coverage probabilities of the 95% CIs are generally close to the nominal level of 95%, ranging from 0.880 to 1.000 across the model parameters. Thus, while the inferential method is Bayesian, it seems to yield sets that have good frequentist coverage.

Table 2.

Coverage Probabilities of ϕ, $θ_{i}$ ’s, $τ_{i}$ ’s, $β_{i, 0}$ ’s, and $β_{i, 1}$ ’s by 100 Independent Simulations

Parameter	Coverage Probability	Parameter	Coverage Probability
$τ_{1}$	0.940	$τ_{2}$	0.940
$τ_{3}$	0.960	$τ_{4}$	0.920
$τ_{5}$	0.940	$τ_{6}$	0.900
$τ_{7}$	0.900	$τ_{8}$	0.920
$τ_{9}$	0.880	$τ_{10}$	0.940
$θ_{1}$	0.944	$θ_{2}$	0.932
$θ_{3}$	0.950	$θ_{4}$	0.975
$θ_{5}$	0.969	$θ_{6}$	0.965
$θ_{7}$	0.971	$θ_{8}$	0.931
$θ_{9}$	0.945	$θ_{10}$	0.963
$β_{1, 0}$	0.960	$β_{2, 0}$	0.920
$β_{3, 0}$	0.960	$β_{4, 0}$	1.000
$β_{5, 0}$	0.980	$β_{6, 0}$	0.980
$β_{7, 0}$	1.000	$β_{8, 0}$	0.940
$β_{9, 0}$	0.960	$β_{10, 0}$	0.960
$β_{1, 1}$	0.960	$β_{2, 1}$	0.960
$β_{3, 1}$	0.960	$β_{4, 1}$	1.000
$β_{5, 1}$	0.980	$β_{6, 1}$	0.960
$β_{7, 1}$	0.960	$β_{8, 1}$	1.000
$β_{9, 1}$	0.980	$β_{10, 1}$	1.000
$ϕ$	0.940
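The coverage values in Table 2 are Monte Carlo estimates based on 100 replicates, so some deviation from 0.95 is expected from simulation noise alone. The sketch below shows how a single replicate contributes an indicator of coverage and how large the Monte Carlo standard error of an estimated coverage is under a simple binomial assumption; the function names are hypothetical, and entries that average over several time points (such as the $θ_{i}$ rows) would have a different standard error.

```python
import math
import numpy as np

def interval_covers(draws, truth, level=0.95):
    """True if the equal-tailed credible interval from `draws` contains `truth`."""
    alpha = (1.0 - level) / 2.0
    lo, hi = np.quantile(draws, [alpha, 1.0 - alpha])
    return bool(lo <= truth <= hi)

def coverage_mc_se(nominal, n_rep):
    """Binomial standard error of an estimated coverage probability,
    assuming each replicate contributes one independent indicator."""
    return math.sqrt(nominal * (1.0 - nominal) / n_rep)

se = coverage_mc_se(0.95, 100)
print(round(se, 4))
```

With 100 replicates, the standard error at the nominal level is about 0.022, so estimated coverages between roughly 0.91 and 0.99 lie within two standard errors of 0.95.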

4.2. An Example of the Logistic Curve as the Latent Trajectory

In this section, we apply our proposed method to some non-TSP-based true latent trajectories. The setup is the same as in the previous simulation, except that the true trajectories $f_{i} (\cdot)$, $i = 1, \dots, 10$, are now the logistic curves below:

$$\begin{array}{ll}
f_{1} (t) = \frac{- 0.5 + 2}{\exp (- 10 t + 5) + 1}; & f_{2} (t) = \frac{- 0.5 + 2}{\exp (- 5 t + 2) + 1}; \\
f_{3} (t) = \frac{0.5 + 1}{\exp (- 10 t + 2.5) + 1}; & f_{4} (t) = \frac{0.5 + 2}{\exp (- 12 t + 3) + 1}; \\
f_{5} (t) = \frac{2.5 + 1}{\exp (- 5 t + 3) + 1}; & f_{6} (t) = \frac{- 0.5 + 2}{\exp (- 10 t + 2) + 1}; \\
f_{7} (t) = \frac{1}{\exp (- 10 t + 2) + 1}; & f_{8} (t) = \frac{0.5 + 2}{\exp (- 4 t + 1) + 1}; \\
f_{9} (t) = \frac{- 1.5 + 2}{\exp (- 5 t + 1) + 1}; & f_{10} (t) = \frac{1}{\exp (- 2 t + 1) + 1} .
\end{array}$$

The motivation for using logistic curves as the true latent trajectories in the simulation is that logistic curves have been widely applied in growth curve analysis. Model fitting is based on 200,000 iterations in total. The first 40,000 samples are discarded as burn-in, and every 20th value of the MCMC samples is retained to reduce dependence. We use a zero-truncated Poisson prior for each $L_{i}$ with $λ = 3$, expecting that a few more TSP components are needed than in the previous example to fit the logistic curves. As before, we use a symmetric Dirichlet distribution with parameter $ρ = 1$ for the $π_{i, ℓ}$’s, assign a Uniform(0, 1) prior to each $b_{i, ℓ}$, and specify a Uniform(1, 50) prior for each $γ_{i, ℓ}$. The priors for the remaining unknowns are the same as in Section 4.1.

Figure 5 displays the results for four selected individuals, where the dashed lines represent the true underlying growth curve, the bullet dots are simulated values of the $θ_{i, j}$’s for $j = 1, \dots, T_{i}$, the star dots correspond to the posterior median estimates of one’s ability, and the solid lines indicate the 95% credible band of the estimates. Figure 5(B) and 5(C) show that the true growth curves of the fourth and sixth examinees both increase steadily at the beginning and reach a plateau after half of the study period; our estimated growth curves (star-dotted lines) capture the trend of the truth (dashed lines) very well, and all true values are within the 95% credible bands of our estimation (solid lines). For the second and eighth subjects, shown in Figure 5(A) and 5(D), the growth curves are strictly increasing over the study period, and our proposed method similarly captures the underlying trend well.

Figure 5.

The estimation of latent trajectories for the second, fourth, sixth, and eighth subjects. (A) Second individual. (B) Fourth individual. (C) Sixth individual. (D) Eighth individual.

In addition, we calculate the posterior estimates of all other unknowns. The posterior median of $ϕ^{- 1 / 2}$ is 0.0539, with a 95% CI of [0.0471, 0.0610], which includes the true value of 0.05. The posterior medians and 95% CIs of the $τ_{i}$’s, $β_{i,0}$’s, and $β_{i,1}$’s are displayed in Figures 6 and 7, where we can see that the true simulated values of the $τ_{i}$’s are all inside their corresponding 95% CIs. Note that in the logistic curve simulations, the true values of the $β_{i,0}$’s and $β_{i,1}$’s are unknown, and thus we are not able to compare the truth with the corresponding 95% CIs for the $β_{i,0}$’s and $β_{i,1}$’s.

Figure 6.

Posterior median and 95% CIs of $τ_{i}$ ’s in logistic curve simulation.

Figure 7.

Posterior median and 95% CIs of $β_{i, 0}$ ’s and $β_{i, 1}$ ’s in logistic curve simulation.

5. Application to EdSphere Data

Since our approach has successfully estimated the trend of $f_{i} (\cdot)$ from simulated data and recovered the true parameter values well, we now apply our two-level hierarchical model to the EdSphere data. Due to the limitation of our computer’s RAM (8 GB), a sample of 10 individuals from the EdSphere database was randomly chosen for illustration purposes. The characteristics of the individuals are described in online Appendix B. There are two goals for the analysis of the EdSphere data. One is to assess the appropriateness of the local independence assumption for this type of data, and the other is to understand the growth in ability of students by retrospectively producing the estimated growth trajectories of their latent abilities under the monotonicity assumption.

The prior specification is the same as in the simulation examples above, except that we use a zero-truncated Poisson prior for each $L_{i}$ with $λ = 1$, which corresponds to a prior belief that about two different TSP functions suffice to explain the changes of the response trajectory in the data. This prior assumption can, however, be overridden by the data if they carry strong information that more mixture components of TSP functions are needed to explain one’s ability growth. In addition, to examine the sensitivity to the prior specification for $L_{i}$, we have tried other values of $λ$; the resulting estimates of the latent trajectories are almost the same as those shown in Figure 8.

Figure 8.

The estimation of latent trajectories for the second, third, sixth, and seventh subjects. (A) Second individual. (B) Third individual. (C) Sixth individual. (D) Seventh individual.

We run 500,000 iterations in total and discard the first one fifth of the samples as burn-in; to reduce dependence among the MCMC samples, we keep only every 20th value. We have checked the trace plots of the model parameters to assess the convergence of the MCMC samples and observed good mixing after the burn-in and thinning process. Figure 8 shows the estimated trajectories of four individuals, which represent four types of growth curves we typically observed in the data.

In Figure 8, the star-dotted lines denote the 95% credible band of one’s latent trajectory from our proposed method, the square dots represent the estimated trajectory used in the current EdSphere learning platform (which assumes an AR(1) model for $θ_{i, j}$ and an observation equation as in Model 1 but without the random effect $η_{i, j, s}$), and the bullet dots correspond to the estimates of one’s ability obtained by solving the equation that sets the expected score given a person’s ability equal to the observed score (these can roughly be thought of as the raw test scores put on the same scale as the $θ_{i, t}$’s). We can see that our estimated latent trajectory of one’s ability growth (i.e., the red dots for the posterior median of the $θ_{i, t}$’s) displays a much smoother, monotonically increasing trend than the current EdSphere estimate, which oscillates up and down continuously.
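The raw-score-based ability estimates (bullet dots) come from equating an expected score to the observed score. The sketch below illustrates that idea only. It assumes a Rasch-type logistic item response function as a stand-in for Model 1, which is not reproduced in this section, and the item difficulties, function names, and bisection bracket are all hypothetical.

```python
import math

def expected_score(theta, difficulties):
    """Expected number of correct answers under an assumed Rasch logistic model."""
    return sum(1.0 / (1.0 + math.exp(-(theta - d))) for d in difficulties)

def ability_from_raw_score(score, difficulties, lo=-10.0, hi=10.0, tol=1e-10):
    """Solve expected_score(theta) = score by bisection.

    The expected score is increasing in theta, so the root is unique.
    Assumes the root lies inside the bracket [lo, hi].
    """
    if not 0 < score < len(difficulties):
        raise ValueError("score must lie strictly between 0 and the number of items")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if expected_score(mid, difficulties) < score:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Ten hypothetical items of difficulty 0; a raw score of 7.5 corresponds to p = 0.75.
theta_hat = ability_from_raw_score(7.5, [0.0] * 10)
print(round(theta_hat, 4))
```

Because each such estimate uses only the responses from one testing occasion, a sequence of them carries no smoothness or monotonicity across time, which is consistent with the raw-score dots being noisier than a model-based trajectory.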

Noticeably, in Figure 8, the latent growth curve of the second individual (see Figure 8[A]), a sixth grader, increases sharply at about the 300th day and reaches a plateau before the 450th day, while the sixth individual (Figure 8[C]) experiences moderate growth during most of the period before stabilizing by the end of the study. The third and seventh students, in Figure 8(B) and 8(D), respectively, are both in Grade 2 and have a similar, steadily increasing growth shape over the entire study period. However, as seen from Figure 8, the growth rate (or learning speed) of the seventh individual is much faster than that of the third individual. Since the study period of the seventh individual is much shorter than that of the third individual, we can also compare the posterior medians and the corresponding 95% CIs of $β_{7, 1}$ and $β_{3, 1}$ (which represent the magnitude of one’s ability growth during the study period) in Figure 9 to validate the difference between the growth rates of these two individuals. The results in Figure 8 show that the timing of reaching the “learning ceiling” (i.e., plateau) as well as the “learning speed” differs among individuals. Further investigation or clustering of students based on the shape of their growth curves might help teachers tailor educational practice or assignments to each individual student.

Figure 9.

Posterior median and 95% CIs of $β_{i, 0}$ ’s and $β_{i, 1}$ ’s.

Moreover, we summarize the results for the other parameters in the model. The posterior median of $ϕ^{- 1 / 2}$ is 0.0566, and its 95% CI is [0.0301, 0.0797]. The posterior estimates and 95% CIs for τ and β are summarized in Figures 10 and 9, respectively. In Figure 10, all $τ_{i}$ values and their corresponding CIs are far away from 0 except $τ_{2}^{- 1 / 2}$ and $τ_{3}^{- 1 / 2}$, which suggests that local dependence indeed exists in the EdSphere data. In Figure 9, all $β_{i,1}$ values as well as their corresponding 95% CIs are above 0, which shows that the data support the belief that the growth curve of one’s reading ability is always increasing. Note that our method does not require any additional restriction on the parameters in the model; thus, the values of the parameters are fully determined by the data. We can also see that the values of the $β_{i,0}$’s and $β_{i,1}$’s vary considerably across individuals.

Figure 10.

Posterior median and 95% CIs of $τ_{i}$ ’s.

6. Conclusion

In this article, we proposed a Bayesian nonparametric two-level hierarchical model for the analysis of one’s latent ability growth in educational testing for longitudinal scenarios. The advantage of our method is that it can incorporate monotonic shape constraints into the estimation of latent trajectories. Due to the flexibility of our nonparametric method, we are able to fit any monotonic continuous curve without further restrictions on the estimation of parameters. Therefore, the fact that the 95% CIs of the slopes $β_{i,1}$ exclude 0 indicates that monotonicity is supported by the EdSphere data.

The latent trajectories of ability growth estimated by our approach can help educators and practitioners better understand the behavior of students in the study, such as their growth patterns (e.g., continuous increase or sharp increase) and the time at which a student reaches the ceiling of his or her learning ability (i.e., the timing of reaching the plateau). Further studies on clustering these behavioral patterns will guide the design of educational practice or teaching tailored to individuals. In addition, since the evidence of local dependence is generally strong in the analysis of the EdSphere data, we conclude that the use of random effects to model local dependence seems to be necessary and successful.

As the MCMC computation is time-consuming and resource-demanding for our current approach, our next goal is to improve computational efficiency by developing big data schemes, using ideas similar to those of Wei, Wang, and Conlon (2017), to make parallel computing possible and thus apply our approach conveniently to the entire data set. With improved computational efficiency, we might be able to develop a computable evaluation criterion to assess the fit of the proposed nonparametric model using predictive scoring ideas (Gneiting & Raftery, 2007). Another potential extension is to explore covariates, such as grade, and their relationships to the growth of one’s ability trajectory. This direction will help us group individuals based on similar characteristics and encourage the development of personalized education.

Supplemental Material

Supplemental Material, Appendix for Bayesian Nonparametric Monotone Regression of Dynamic Latent Traits in Item Response Theory Models by Yang Liu and Xiaojing Wang in Journal of Educational and Behavioral Statistics

Footnotes

Acknowledgments

The authors are grateful to Dr. Carl Swartz from Highroad Learning company for generously sharing their data with us and to the Editor, Dr. Li Cai, and referees for numerous suggestions that significantly improved this article.

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The research of Dr. Yang Liu was part of his dissertation at the Department of Statistics, University of Connecticut, and was supported by start-up fund from Dr. Xiaojing Wang. Also, Dr. Xiaojing Wang wants to acknowledge the receipt of the National Science Foundation award (#1848451) for support for her research, authorship, and/or publication of this article.

References

Albers, Does, Imbos, & Janssen (1989). A stochastic growth model applied to repeated tests of academic knowledge. Psychometrika, 54, 451–466.

Albert, J. H., & Chib (1993). Bayesian analysis of binary and polychotomous response data. Journal of the American Statistical Association, 88, 669–679.

Bartolucci, Pennoni, & Vittadini (2011). Assessment of school performance through a multilevel latent Markov Rasch model. Journal of Educational and Behavioral Statistics, 36, 491–522.

Bollen, K. A., & Curran, P. J. (2004). Autoregressive latent trajectory (ALT) models: A synthesis of two traditions. Sociological Methods and Research, 32, 336–383.

Bollen, K. A., & Curran, P. J. (2006). Latent curve models: A structural equation perspective (Vol. 467). Hoboken, NJ: John Wiley.

Bornkamp & Ickstadt (2009). Bayesian nonparametric estimation of continuous monotone functions with applications to dose–response analysis. Biometrics, 65, 198–205.

Brezger & Steiner, W. J. (2012). Monotonic regression based on Bayesian P–splines. Journal of Business & Economic Statistics, 26, 90–104.

Cho, S.-J., Athay, & Preacher, K. J. (2013). Measuring change for a multidimensional test using a generalized explanatory longitudinal item response model. British Journal of Mathematical and Statistical Psychology, 66, 353–381.

Choi & Kim, H.-J. (2016). Bayesian variable selection approach to a Bernstein polynomial regression model with stochastic constraints. Journal of Applied Statistics, 23, 1–21.

Embretson (1991). A multidimensional latent trait model for measuring learning and change. Psychometrika, 56, 495–515.

Geiser, Bishop, Lockhart, Shiffman, & Grenard, J. L. (2013). Analyzing latent state-trait and multiple-indicator latent growth curve models as multilevel structural equation models. Frontiers in Psychology, 4, 1–23.

Gelfand, A. E., & Kuo (1991). Nonparametric Bayesian bioassay including ordered polytomous response. Biometrika, 78, 657–666.

Gneiting & Raftery, A. E. (2007). Strictly proper scoring rules, prediction, and estimation. Journal of the American Statistical Association, 102, 359–378.

Green, P. J., & Hastie, D. I. (2009). Reversible jump MCMC. Genetics, 155, 1391–1403.

Hsieh, C.-A., von Eye, Maier, Hsieh, H.-J., & Chen, S.-H. (2013). A unified latent growth curve model. Structural Equation Modeling: A Multidisciplinary Journal, 20, 592–615.

Johnson & Raudenbush, S. W. (2006). A repeated measures, multilevel Rasch model with application to self-reported criminal behavior. Methodological Issues in Aging Research, 5, 131–164.

Kim & Camilli (2014). An item response theory approach to longitudinal analysis with application to summer setback in preschool language/literacy. Large-Scale Assessments in Education, 2, 1.

Lin & Dunson, D. B. (2014). Bayesian monotone regression using Gaussian process projection. Biometrika, 101, 303–317.

Martin, A. D., & Quinn, K. M. (2002). Dynamic ideal point estimation via Markov chain Monte Carlo for the U.S. Supreme Court, 1953–1999. Political Analysis, 10, 134–153.

McKay Curtis & Ghosh, S. K. (2011). A variable selection approach to monotonic regression with Bernstein polynomials. Journal of Applied Statistics, 38, 961–976.

Neelon & Dunson, D. B. (2004). Bayesian isotonic regression and trend analysis. Biometrics, 60, 398–406.

Ongaro & Cattaneo (2004). Discrete random probability measures: A general framework for nonparametric Bayesian inference. Statistics & Probability Letters, 67, 33–45.

Park, J. H. (2011). Modeling preference changes via a hidden Markov item response theory model. In Jones, G. L., Brooks, Gelman, & Meng, X.-L. (Eds.), Handbook of Markov chain Monte Carlo (pp. 479–491). Boca Raton, FL: CRC Press.

Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: Applications and data analysis methods (Vol. 1). Thousand Oaks, CA: Sage.

Shively, T. S., Sager, T. W., & Walker, S. G. (2009). A Bayesian approach to non-parametric monotone function estimation. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 71, 159–175.

Tan, E. S., Ambergen, A. W., Does, R. J. M. M., & Imbos (1999). Approximations of normal IRT models for change. Journal of Educational and Behavioral Statistics, 24, 208–223.

te Marvelde, J. M., Glas, C. A. W., Landeghem, G. V., & Damme, J. V. (2006). Application of multidimensional item response theory models to longitudinal data. Educational and Psychological Measurement, 66, 5–34.

Van Dorp, J. R., & Kotz (2002). The standard two-sided power distribution and its properties: With applications in financial engineering. The American Statistician, 56, 90–99.

Van Dyk, D. A., & Park (2008). Partially collapsed Gibbs samplers: Theory and methods. Journal of the American Statistical Association, 103, 790–796.

Verhagen & Fox, J.-P. (2012). Longitudinal measurement in health-related surveys. A Bayesian joint growth model for multivariate ordinal responses. Statistics in Medicine, 32, 2988–3005.

Wang & Berger, J. O. (2016). Estimating shape constrained functions using Gaussian processes. SIAM/ASA Journal on Uncertainty Quantification, 4, 1–25.

Wang, Berger, J. O., & Burdick, D. S. (2013). Bayesian analysis of dynamic item response models in educational testing. The Annals of Applied Statistics, 7, 126–153.

Wei, Wang, & Conlon, E. M. (2017). Parallel Markov chain Monte Carlo for Bayesian dynamic item response models in educational testing. Stat, 6, 420–433.
