Abstract
Using planned missing data (PMD) designs (e.g., 3-form surveys) enables researchers to ask participants fewer questions during the data collection process. An important question, however, is just how few participants are needed to effectively employ planned missing data designs in research studies. This article explores this question by using simulated three-form planned missing data to assess analytic model convergence, parameter estimate bias, standard error bias, mean squared error (MSE), and relative efficiency (RE). Three models were examined: a one-time-point, cross-sectional model with 3 constructs; a two-time-point model with 3 constructs at each time point; and a three-time-point mediation model with 3 constructs over three time points. Both full-information maximum likelihood (FIML) and multiple imputation (MI) were used to handle the missing data. Models were found to meet convergence rate and acceptable bias criteria with FIML at smaller sample sizes than with MI.
Keywords
High-quality approaches to infer population information in the presence of missing data, such as full-information maximum likelihood (FIML) and multiple imputation (MI), have been around for decades (Dempster, Laird, & Rubin, 1977; Rubin, 1976); however, implementing either of these procedures was initially very time-consuming and computationally intensive. With the rapid advance of computer technology, modern approaches are now readily available in most current statistical software packages (Enders, 2010; Enders & Gottschall, 2011; Graham, Cumsille, & Elek-Fisk, 2003). Specifically, either FIML or MI can be used to recapture the missing information and represent the population’s characteristics very well (Dempster et al., 1977; Enders, 2010; Graham, Taylor, & Cumsille, 2001; Graham, Taylor, Olchowski, & Cumsille, 2006; Rubin, 1976). The benefit of using either FIML or MI to handle missingness is that they increase power compared to traditional methods of handling missing data (e.g., listwise deletion), while yielding unbiased parameter estimates.
Now that FIML and MI are widely available, an alternate view of missing data has become viable; namely, to plan for and easily address past limitations caused by missing data (Graham, 2009). This planned missing data (PMD) approach involves intentionally introducing missingness into the data collection in a way that results in missing data patterns that are “missing completely at random (MCAR).” Methodologists have started investigating the strengths and limitations of using a PMD approach, and enough empirical evidence now exists to justify integrating this approach as a routinely used research methodology (Graham, 2009). For developmental researchers, an important strength of the PMD approach is the ability to collect longitudinal data while minimizing the burden placed upon participants. The purpose of this article is to provide preliminary guidance for researchers using PMD survey designs (specifically the three-form design; see below) with small sample sizes (N = 60–300).
Background
A PMD is based upon a particular characteristic pattern of missingness among the data: completely random missingness. There are three categories or types of missingness in data: missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR; Graham, 2009; Rubin, 1976; Schafer & Graham, 2002). When data is missing completely at random (MCAR), there is no systematic cause for the missingness. When the reason for the data’s missingness is captured by other variables in the data, it is called a missing at random (MAR) process (Graham, 2009; Rubin, 1976; Schafer & Graham, 2002). Lastly, MNAR data is missing for a reason that was not captured in any way by the other variables measured. Since the cause of missingness is not measured in MNAR situations, we lack relevant information to inform the FIML or MI process when missing data is MNAR. Both MCAR and MAR data can be recaptured by FIML or MI approaches by using the relationships (covariances) that exist within the observed data to inform the missing information. To know how much the covariances are able to inform the missing information, the proportion of information lost due to missing data can be measured. This value is the fraction of missing information (FMI). The lower the FMI, the less information is lost, and the higher the quality of estimation (Enders, 2010; Savalei & Rhemtulla, 2012).
The ability to recapture data that is missing due to either an MCAR or MAR process is the foundation for PMD theory and practice. The data that is missing by design is, by definition, MCAR and is therefore easily recaptured. When either MCAR or MAR data is present, the observed data can inform the modern missing data techniques (FIML and MI, described in detail later) as they recapture the missing information (Dempster et al., 1977; Enders, 2010; Graham et al., 2001, 2006; Rubin, 1976).
For PMD designs to be effective and to improve data quality, FIML or MI must be used to recapture the data that was missing by design. There are, however, slight differences between the two that may affect the success of their application. FIML handles the missing data during model estimation (Graham, 2009). Therefore, only the variables in an analysis model, plus a select few auxiliary variables (variables that are not part of the analysis model and are included only to inform the estimation of any MAR mechanism), may inform model parameter estimates (Graham, 2009; Graham et al., 2003). FIML estimation is now the default in some SEM software (e.g., Mplus) and readily available in others (e.g., lavaan). MI, in contrast, infers the descriptive statistics of the dataset’s variables using all possible information from all answered items. MI was originally intended for use with datasets having a large number of variables, particularly when those variables may not all be used in a single model (Rubin, 1987). The imputed values are based upon the relationships among the present data. Each dataset that is imputed will have slightly different values in the missing data positions. The greater the variability in values between imputations, the greater the uncertainty due to the missingness present. By using multiple imputed datasets, the parameter estimates and standard error values are less biased than if only a single imputed dataset were used (Graham, Olchowski, & Gilreath, 2007; Rubin, 1996; van Buuren, Boshuizen, & Knook, 1999). Although researchers agree that with enough imputations, MI results are equivalent to FIML results, there is not yet consensus on the number of imputations that should be used (Bodner, 2008; Savalei & Rhemtulla, 2012). Once multiple datasets have been imputed, the analysis model is run on each dataset and the results are combined by Rubin’s Rules (see Rubin, 1987), which is now an automatic step in many software packages.
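The pooling step can be illustrated with a minimal sketch of Rubin's (1987) Rules for a single parameter. The function name `pool_rubin` and the numeric inputs are hypothetical and chosen only for illustration; they are not values from this study, and the software used here performs this step internally.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, standard_errors):
    """Pool one parameter across m imputed datasets using Rubin's (1987) rules."""
    m = len(estimates)
    if m < 2 or m != len(standard_errors):
        raise ValueError("need matching estimates and SEs from at least 2 imputations")
    pooled_estimate = mean(estimates)                      # average of the m point estimates
    within = mean([se ** 2 for se in standard_errors])     # average within-imputation variance
    between = variance(estimates)                          # between-imputation variance
    total = within + (1 + 1 / m) * between                 # total sampling variance
    return pooled_estimate, math.sqrt(total)


# Illustrative values only (not results from the study).
est, se = pool_rubin([0.70, 0.74, 0.72], [0.10, 0.10, 0.10])
print(round(est, 3), round(se, 3))
```

The between-imputation term is what carries the extra uncertainty due to missingness: the more the estimates differ across imputed datasets, the larger the pooled standard error.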
One benefit of MI is that imputed items can be averaged (e.g., into subscale scores) prior to model estimation. With FIML, subscale scores have to be formed by averaging across items that include missing values, which may lead to more biased estimates (Schafer & Graham, 2002).
Purpose
The purpose of this article is to provide preliminary guidance on sample size minimums for researchers using PMD 3-form survey designs. After fitting models to the data, the convergence of the PMD models under FIML and MI was examined. These results provide guidance to researchers regarding the minimum sample size FIML and MI may need in order to use these designs. The PMD models’ relative bias of parameter estimates and standard errors, mean squared error (MSE), and relative efficiency (RE) were also assessed. The smaller the relative bias, the more closely the PMD models’ estimates represent the true parameter values specified in the simulation (Schafer & Graham, 2002). These results suggest the minimum sample size FIML and MI may need in order to obtain accurate and precise parameter estimates.
Methods
The simulation designs involved three latent variable models: a one-time-point, cross-sectional confirmatory factor analysis (CFA) model, a two-time-point CFA model, and a three-time-point mediation model. Simulations for sample sizes ranging from 60 to 300 were conducted to examine the minimum sample size necessary to successfully complete an SEM analysis of 3-form PMD design data collected cross-sectionally (Figure 1) and longitudinally (Figure 2 and Figure 3). For the guidance from these simulations to be as applicable as possible, an additional 5% MCAR missingness was included on top of the PMD design, so that the impact of a commonly present amount of missingness in real-world datasets would also be present and accounted for. The models’ parameters—loadings, within-time latent covariances, autoregressive coefficients and cross-lagged coefficients—were specified to be within the ranges commonly seen in psychological and developmental research. The specifics for each model are detailed later. To streamline the methods section, the longitudinal studies’ expanded methodology will build upon the cross-sectional model’s methodology provided initially. All study designs include performance comparisons between FIML and MI.

Figure 1. Model 1: Cross-sectional CFA model.

Figure 2. Model 2: Two-time-point saturated CFA model.

Figure 3. Model 3: Three-time-point mediation model.
3-form PMD survey design
The 3-form PMD survey design is being adopted by researchers in the social sciences, and so it is the PMD design focus of this article. Other, more complex, multi-form designs are, of course, also possible. The basic principle of the 3-form design is that scale items are divided into four blocks: X, A, B, and C (Graham, 2009; Graham et al., 2006; Moore, 2011; Rhemtulla, Little, Moore, Gibson, & Wei, 2012). The X-block, also called the common block, is included in all survey versions (i.e., Versions 1, 2, and 3; Graham, 2009; Graham et al., 2006; Moore, 2011; Rhemtulla et al., 2012). The X-block comprises demographic information and often includes the strongest and most representative indicator of each construct. Each construct’s remaining indicators are then distributed across the A-, B-, and C-blocks as evenly as possible based upon their psychometric properties (see Moore, 2011, for a more detailed 3-form design example; and Little, Jorgensen, Lang, & Moore, 2013, for an example of analyses with data collected using a 3-form design). Finally, three survey versions are created by combining two of the A-, B-, and C-blocks with the X-block. For example, Version 1 would contain the X-block items along with the items from blocks A and B (see Table 1). In this article’s example, the X-block contained one item from each latent construct, and the other items were evenly spread across the A-, B-, and C-blocks. This distribution resulted in 33% of the items being in the X-block and 22% in each of the A-, B-, and C-blocks. Therefore, each survey version presents only 78% of all the study items to each participant (see Table 1).
Table 1. Three-form PMD survey design.
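The block arithmetic behind these percentages can be checked with a short sketch. It assumes the nine-item example described in the text (three items in X, two in each of A, B, and C). Only the composition of Version 1 is stated in the text; the compositions of Versions 2 and 3 below follow the usual 3-form layout and are an assumption, as are all names in the code.

```python
from fractions import Fraction

# Items per block for the nine-item example: one item per construct in X,
# the remaining six items spread evenly over A, B, and C.
BLOCK_SIZES = {"X": 3, "A": 2, "B": 2, "C": 2}

# Version 1 = X + A + B is stated in the text; Versions 2 and 3 are ASSUMED
# to follow the usual 3-form layout.
VERSIONS = {1: ("X", "A", "B"), 2: ("X", "A", "C"), 3: ("X", "B", "C")}

total_items = sum(BLOCK_SIZES.values())


def share_presented(version):
    """Fraction of all study items that a given survey version presents."""
    shown = sum(BLOCK_SIZES[block] for block in VERSIONS[version])
    return Fraction(shown, total_items)


for version in VERSIONS:
    share = share_presented(version)
    print(version, share, f"{float(share):.0%}")
```

Because every version drops exactly one two-item block, each participant sees seven of the nine items regardless of the version received.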
Data generation
Model 1
The simplest latent structure model used for data generation was a cross-sectional CFA model (Model 1) with three latent variables, η1, η2, and η3. Each latent variable was indicated by three items (see Figure 1). Latent variances were fixed at 1 for model identification. A range of population values representing the variety of conditions seen in applied studies was simulated. Specifically, the factor loadings ranged from .70 to .85 in increments of .05 (i.e., .70, .75, .80, and .85). The factor loadings in a given model had the same value across each construct (i.e., tau-equivalent). Additionally, the within-time correlations between latent variables were simulated to range between .2 and .5 in increments of .1 (i.e., .2, .3, .4, and .5). Simulating each of these possible combinations resulted in 16 conditions (4 loadings × 4 covariances) to be assessed.
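As a rough illustration of how these conditions translate into population covariance matrices, the sketch below enumerates the 16 conditions and builds the model-implied covariance matrix of the nine indicators for one of them. The residual variances are not reported in this section; the sketch assumes unit-variance indicators (residual variance = 1 − λ²), and the function name is hypothetical.

```python
from itertools import product

LOADINGS = [0.70, 0.75, 0.80, 0.85]
LATENT_CORRELATIONS = [0.2, 0.3, 0.4, 0.5]

# Fully crossed design: 4 loadings x 4 latent correlations.
conditions = list(product(LOADINGS, LATENT_CORRELATIONS))


def implied_covariance(loading, correlation, n_factors=3, items_per_factor=3):
    """Model-implied indicator covariance matrix (Lambda Phi Lambda' + Theta)."""
    n_items = n_factors * items_per_factor
    factor_of = [i // items_per_factor for i in range(n_items)]
    residual = 1.0 - loading ** 2  # ASSUMPTION: indicators have unit variance
    sigma = []
    for i in range(n_items):
        row = []
        for j in range(n_items):
            # Latent variances are fixed at 1; different factors correlate at `correlation`.
            phi_ij = 1.0 if factor_of[i] == factor_of[j] else correlation
            cov = loading * phi_ij * loading
            if i == j:
                cov += residual
            row.append(cov)
        sigma.append(row)
    return sigma


sigma = implied_covariance(0.70, 0.2)
print(len(conditions), round(sigma[0][1], 3), round(sigma[0][3], 3))
```

Under this assumption, two items of the same construct covary at λ², and items of different constructs covary at λ² times the latent correlation.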
Model 2
The two-time-point CFA model (Model 2) had the same latent variables and indicators as Model 1, but measured at two time points (see Figure 2). The Time 1 latent variances were fixed at 1, whereas the Time 2 latent variances and all latent covariances were freely estimated. The population values of the factor loadings and within-time covariances were simulated with the same ranges as for the cross-sectional model (Model 1) mentioned earlier. The conditions used to derive the population values of between-time covariances for Model 2 were based upon a cross-lagged panel model, in which the autoregressive coefficients ranged from .4 to .9 in increments of .1, and the cross-lagged paths ranged from .1 to .4 in increments of .1. These conditions were used to derive the population covariance matrices for the indicators using (simplified from equation 4.7 of Bollen, 1989):
$$\Sigma = (I - B)^{-1}\,\Psi\,\left[(I - B)^{-1}\right]^{\prime}$$
where Σ represents the latent covariance matrix of Model 2, Ψ is the latent residual covariance matrix of the cross-lagged panel model, B is the coefficient matrix, and I is the identity matrix. Simulating each possible combination resulted in 340 conditions (4 loadings × 4 covariances × 6 autoregressive coefficients × 4 cross-lagged coefficients, minus 44 combinations that produced negative latent variances).
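A minimal sketch of this derivation is given below, assuming the standard recursive form Σ = (I − B)⁻¹ Ψ [(I − B)⁻¹]′ for the latent covariance matrix. Several details are assumptions made only for illustration: every Time 2 construct is regressed on all three Time 1 constructs with equal cross-lagged paths, the Time 2 residuals are uncorrelated, and the residual variance of .5 is a hypothetical value. The sketch does not attempt to reproduce the authors' screening of inadmissible combinations.

```python
def matmul(a, b):
    """Plain-Python matrix product for small matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def two_wave_latent_cov(autoregressive, cross_lagged, phi_t1, resid_var_t2):
    """Latent covariance matrix implied by a two-wave cross-lagged panel model."""
    k = len(phi_t1)
    size = 2 * k
    # B: rows are outcomes, columns are predictors; only Time 2 is regressed on Time 1.
    B = [[0.0] * size for _ in range(size)]
    for i in range(k):
        for j in range(k):
            B[k + i][j] = autoregressive if i == j else cross_lagged
    # Psi: Time 1 (co)variances plus ASSUMED uncorrelated Time 2 residuals.
    Psi = [[0.0] * size for _ in range(size)]
    for i in range(k):
        for j in range(k):
            Psi[i][j] = phi_t1[i][j]
        Psi[k + i][k + i] = resid_var_t2
    # With two waves B @ B = 0, so (I - B)^-1 equals I + B exactly.
    inv = [[(1.0 if i == j else 0.0) + B[i][j] for j in range(size)] for i in range(size)]
    return matmul(matmul(inv, Psi), transpose(inv))


phi_t1 = [[1.0, 0.3, 0.3], [0.3, 1.0, 0.3], [0.3, 0.3, 1.0]]
sigma = two_wave_latent_cov(0.5, 0.1, phi_t1, resid_var_t2=0.5)  # hypothetical values
print(round(sigma[3][0], 3), round(sigma[3][3], 3))
```

The lower-left block of the result holds the between-time covariances, and the lower-right block holds the implied Time 2 variances and covariances.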
Model 3
The most complex model, the three-time-point mediation model (Model 3), had the same latent variables and indicators as Model 1 and Model 2, but measured at three time points (see Figure 3). The Time 1 latent variances were fixed at 1, while the other latent variances were freely estimated. The population values of factor loadings and within-time covariances were simulated with the same ranges as for Model 1. Model 3’s autoregressive and cross-lagged coefficients took the same values that were used to derive the Model 2 parameters. Therefore, Model 3 had the same number of conditions as Model 2.
Because sample size was the primary focus of these simulations, we assessed each model’s conditions with sample sizes from 60 to 300 in increments of 20 (i.e., 60, 80, 100, 120, 140, … 300). After examining these results, the threshold range in which each model reached convergence and accurate estimates was investigated further by conducting additional simulations with sample sizes in increments of 5.
Missing values were imposed based on a 3-form PMD design. In the longitudinal models (Model 2 and Model 3), we systematically switched the form each participant received. Specifically, participants who responded to Version 1 at time one responded to Version 2 at time two and to Version 3 at time three. In the last step, 5% of unplanned MCAR missingness was imposed on top of the planned missingness. Therefore, the percentage of missing data in each survey version was 28%.
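The way planned and unplanned missingness combine can be sketched as a missingness mask. The item-to-block map, the compositions of Versions 2 and 3, the continuation of the rotation for participants who start on Version 2 or 3, and the rule that the extra MCAR loss strikes each administered item independently with probability .05 are all assumptions made for illustration; the study itself used functions in simsem for this step.

```python
import random

# Hypothetical item-to-block map for nine items (first item of each construct in X).
ITEM_BLOCKS = ["X", "A", "B", "X", "C", "A", "X", "B", "C"]
# Version 1 omits block C (stated in the text); Versions 2 and 3 are ASSUMED.
OMITTED_BLOCK = {1: "C", 2: "B", 3: "A"}


def missing_mask(n_participants, n_waves=3, extra_mcar=0.05, seed=0):
    """Return mask[p][w][i] = True where item i is missing for participant p at wave w."""
    rng = random.Random(seed)
    mask = []
    for p in range(n_participants):
        first_version = p % 3 + 1
        waves = []
        for w in range(n_waves):
            # Rotate versions across waves: 1 -> 2 -> 3 (ASSUMED to wrap around to 1).
            version = (first_version - 1 + w) % 3 + 1
            planned = [block == OMITTED_BLOCK[version] for block in ITEM_BLOCKS]
            # ASSUMPTION: unplanned MCAR loss hits each administered item with prob. extra_mcar.
            waves.append([m or rng.random() < extra_mcar for m in planned])
        mask.append(waves)
    return mask


planned_only = missing_mask(6, extra_mcar=0.0)
with_mcar = missing_mask(300)
cells = [m for person in with_mcar for wave in person for m in wave]
print(sum(cells) / len(cells))
```

Setting `extra_mcar=0.0` isolates the planned component, which removes exactly two of the nine items per participant per wave.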
Two hundred samples (replications) were drawn from each condition, and the three models described above were estimated for each of the samples using FIML and MI. In simulation studies, there is always a trade-off between the number of replications and the number of conditions examined. A large number of replications (e.g., 1000) is usually recommended to improve precision (Bentler, 1995). However, Skrondal (2000, p. 157) argues that this “practice unduly favors precision at [the] expense of external validity.” In this view, the empirical applications of statistical methods deserve more attention than “the exaggerated precision of convention.” Using a comparatively small, but not too small, number of replications (i.e., 200) allows us to examine more simulation conditions, and thereby achieve a higher level of validity without sacrificing too much precision.
All samples were generated and analyzed in R using the simsem package (Version 0.4-6; Pornprasertmanit, Miller, & Schoemann, 2012). The simsem package was designed to automate Monte Carlo simulations that use either a CFA or SEM analytical framework. Using functions in simsem, data were first generated and then modified (i.e., the simulated 3-form PMD design and unplanned MCAR missingness were imposed). The models analyzed with the FIML approach were run through lavaan (Version 0.5-11; Rosseel, 2012). For the MI approach, the data were imputed with Amelia (Version 1.6.4; Honaker, King, & Blackwell, 2011), and the resultant 20 imputed datasets were run through lavaan and the results combined by Rubin’s (1987) Rules. The decision to use 20 imputed datasets was based on the findings of previous studies and the computational effort needed. Rubin (1987) suggested that 2 to 10 imputations were sufficient in most realistic situations. Then, von Hippel (2005), in response to Hershberger and Fisher (2003), supported Rubin’s guidance and argued that the marginal gains of using more than 10 imputations are outweighed by the computational costs. In 2007, Graham and colleagues, who also noted that computational cost is worth considering, suggested using at least 20 imputed datasets. Thus, given that the large number of simulation conditions already required a heavy computational effort, we considered 20 imputed datasets a reasonable number for this study.
Analysis
The effectiveness of FIML and MI to properly estimate the true parameter values for each model across sample size conditions was assessed over the following characteristics: model convergence, relative biases of parameter estimates and standard errors, MSE, and RE.
Model convergence
To assess a model, it must converge. Therefore, our first step was to examine the effect of sample size and model complexity on the quantity and quality of the model’s convergence when applying FIML or MI. The convergence rate helped us answer our first question of how small a sample size one could have when collecting data with a 3-form PMD design. Our next question concerned differences in convergence rate between FIML and MI. For the FIML conditions, we considered a replication as converged if its number of iterations was smaller than 250 (the default in lavaan). The rate of convergence for the FIML results can then be obtained simply by counting the number of converged replications in each condition and dividing it by the total number of replications (i.e., 200). However, assessing the convergence rate of the MI results was not as straightforward, since each replication comprised 20 imputed datasets. As there is no guidance about how many imputed datasets need to converge when using MI, we chose the strictest cutoff (i.e., 20 of 20 datasets converge) to make the convergence rate more comparable to FIML. In other words, we considered a replication convergent only when the model converged on all 20 of its imputed datasets. Then, for a given sample size in each model, we computed the average convergence rate across all the parameter value conditions.
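The two convergence rules can be sketched as follows. The function names and the toy inputs (four replications with three imputations each, rather than 200 replications with 20 imputations) are hypothetical and serve only to show the counting logic.

```python
def fiml_convergence_rate(converged_flags):
    """Share of replications whose FIML model converged."""
    return sum(converged_flags) / len(converged_flags)


def mi_convergence_rate(imputation_flags):
    """Share of replications in which the model converged on ALL imputed datasets.

    imputation_flags holds one list per replication, with one flag per imputed dataset.
    """
    converged = [all(flags) for flags in imputation_flags]
    return sum(converged) / len(converged)


# Toy example: 4 replications, 3 imputations each (illustrative only).
fiml_flags = [True, True, False, True]
mi_flags = [
    [True, True, True],
    [True, False, True],   # one failed imputation makes the whole replication non-convergent
    [True, True, True],
    [False, False, False],
]
print(fiml_convergence_rate(fiml_flags), mi_convergence_rate(mi_flags))
```

Under the strict rule, a single non-converging imputed dataset is enough to count the entire replication as non-convergent, which makes the MI rate a conservative figure.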
Relative bias of parameter estimates and standard errors (SE)
Once a model converges, the next important question is how accurate its parameter estimates are. To determine the minimum sample size at which parameter estimates were unbiased, we focused on the model’s factor loadings and latent covariances/coefficients. This assessment was based upon the relative parameter bias, which expresses the raw bias (the difference between the average estimate across replications and the true parameter value) relative to the population value. In other words, the relative parameter bias is the raw bias divided by the population value. Therefore, the smaller the absolute relative parameter bias, the more accurately the model estimates the true parameter value. In addition to assessing the parameter estimates’ bias, we also assessed the standard errors’ bias. Standard errors (SEs) represent the sampling variability of the parameter estimates of interest (e.g., factor loading, latent covariance, or latent pathway). A parameter’s SE is used in determining whether the parameter estimate is significantly different from zero. The relative SE bias was assessed in the same way as the relative parameter bias. First, we evaluated how much each sample’s SE deviated from its population value (i.e., raw bias). This raw value was then divided by the population value to obtain the relative SE bias. Although the population SE cannot be set directly in a simulation study, the standard deviation of the empirical sampling distribution of the parameter estimates served this purpose. After computing the absolute relative parameter and SE biases, the values for the same parameter type (e.g., factor loadings, within-time-point covariances) were averaged.
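A minimal sketch of the two bias measures for a single parameter is shown below. The function names and the four illustrative estimates are hypothetical; the sketch takes the empirical standard deviation of the estimates as the stand-in for the population SE, as described above.

```python
from statistics import mean, stdev


def relative_parameter_bias(estimates, true_value):
    """(Average estimate across replications - true value) / true value."""
    return (mean(estimates) - true_value) / true_value


def relative_se_bias(estimates, standard_errors):
    """(Average SE - empirical SD of the estimates) / empirical SD of the estimates."""
    empirical_sd = stdev(estimates)
    return (mean(standard_errors) - empirical_sd) / empirical_sd


# Illustrative replications of one factor loading with population value .70.
estimates = [0.68, 0.70, 0.72, 0.74]
standard_errors = [0.9 * stdev(estimates)] * 4   # SEs that run 10% too small

print(round(relative_parameter_bias(estimates, 0.70), 4))
print(round(relative_se_bias(estimates, standard_errors), 4))
```

A negative relative SE bias indicates that the estimated standard errors understate the actual sampling variability of the estimates.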
Mean squared error (MSE)
MSE was also used to evaluate the performance of FIML and MI at the different sample sizes. The MSE is equal to the squared bias of the estimate plus the variance of the estimate. Of the two components, the first measures the estimator’s bias (i.e., accuracy), while the second measures the variability of the estimator (i.e., precision). Generally, a good estimator has small combined variance and bias. That is, when performing well, FIML and MI would produce accurate and precise parameter estimates. Thus, the smaller a parameter’s MSE, the better that parameter is recovered.
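In symbols, for an estimator of a population value θ, the decomposition described above is the standard identity below (the notation is introduced here for illustration and does not appear in the original text):

```latex
\mathrm{MSE}(\hat{\theta})
  = E\!\left[(\hat{\theta} - \theta)^{2}\right]
  = \underbrace{\left(E[\hat{\theta}] - \theta\right)^{2}}_{\text{squared bias (accuracy)}}
  + \underbrace{\mathrm{Var}(\hat{\theta})}_{\text{variance (precision)}}
```

In a simulation, the expectation and variance are approximated by the mean and variance of the estimates across replications.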
Relative efficiency (RE)
Relative efficiency (RE) reflects the amount of information retained despite missing data and is therefore negatively related to the fraction of missing information (FMI). RE is computed as the ratio of the sampling variances (i.e., squared standard errors) of the complete data estimates to those of the missing data estimates (Rhemtulla, Jia, Wu, & Little, 2014). Ranging from 0 to 1, RE is computed for each parameter, and a value below 1 can be interpreted as the loss of statistical power caused by missing data (Savalei & Rhemtulla, 2012). For example, if data were collected with missingness from 100 participants and a parameter is found to have an RE of .8, then a complete dataset could provide the same information about that parameter with only 80 (0.8 × 100) participants.
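The ratio and the worked example above can be expressed in a few lines. The function names and the two standard errors are hypothetical and serve only to illustrate the computation.

```python
def relative_efficiency(se_complete, se_missing):
    """Ratio of sampling variances (squared SEs): complete data over missing data."""
    return se_complete ** 2 / se_missing ** 2


def equivalent_complete_n(re_value, n_with_missing):
    """Complete-data sample size carrying the same information about the parameter."""
    return re_value * n_with_missing


re_value = relative_efficiency(se_complete=0.08, se_missing=0.10)  # illustrative SEs
print(round(re_value, 2), round(equivalent_complete_n(0.8, 100)))
```

Because the standard error under missing data can be no smaller than its complete-data counterpart in expectation, the ratio is bounded above by 1.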
Criteria
Several criteria are used to determine how small a sample size is acceptable. The first criterion (C1) is that the convergence rate is greater than .9 (Muthén & Muthén, 2002). The second criterion (C2) is that the parameter estimate’s relative bias does not exceed .05 and the standard error bias does not exceed .1 (Hoogland & Boomsma, 1998). C1 and C2 are commonly used criteria, and together they help avoid nonconvergence, bias, or inaccuracy in a statistical procedure. In addition, Hoogland and Boomsma (1998) proposed a more stringent criterion (C3), suggesting that the mean standard error bias across parameters should be smaller than .05. In this article, we consider C1 + C2 the cutoff for determining the minimum sample size required under each condition. Larger sample sizes might be needed when C3 is employed.
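One way to apply the C1 + C2 cutoff to a set of simulation summaries is sketched below. The function names, the rule of taking the smallest tested N from which all larger tested N also pass, and the numeric inputs are assumptions for illustration; they are not the study's results or its exact decision procedure.

```python
def meets_c1_c2(convergence, parameter_bias, se_bias):
    """C1: convergence rate > .90; C2: |parameter bias| <= .05 and |SE bias| <= .10."""
    return convergence > 0.90 and abs(parameter_bias) <= 0.05 and abs(se_bias) <= 0.10


def minimum_sample_size(results):
    """Smallest tested N such that it and every larger tested N meet C1 + C2.

    results maps N -> (convergence rate, relative parameter bias, relative SE bias).
    Returns None when even the largest tested N fails.
    """
    minimum = None
    for n in sorted(results, reverse=True):
        if not meets_c1_c2(*results[n]):
            break
        minimum = n
    return minimum


# Hypothetical summaries, not values from the study.
results = {
    60: (0.71, 0.08, 0.15),
    80: (0.88, 0.04, 0.09),
    100: (0.93, 0.03, 0.08),
    120: (0.97, 0.02, 0.06),
}
print(minimum_sample_size(results))
```

Scanning from the largest N downward guards against reporting a small N that passes by chance while a slightly larger N fails.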
We ran analyses of variance (ANOVAs) to investigate the effects of the design factors on the magnitude of the relative bias of the parameter estimates and standard errors. Four univariate ANOVAs were performed on FIML point estimate bias, FIML standard error bias, MI point estimate bias, and MI standard error bias. The simulation design factors (e.g. factor loading value, sample size) were all treated as the between-subject factors. Due to the large number of total replications (with the sample size increments of 20, there were 208,000 replications for Model 1 and 4,992,000 for Model 2 and Model 3), statistical significance tests were not reported; instead, we examined the partial η2 for each effect and only reported and interpreted those that exceeded .01, which is considered a “small effect” according to Cohen (1973).
Results
Model 1
Figure 4a depicts convergence rates for Model 1. When N = 60, FIML tended to converge better than MI, though both had very low convergence rates (.71 and .55, respectively). As expected, convergence rates increased with sample size. A 90% convergence rate was reached by FIML when N = 90, and by MI when N = 110. Both FIML and MI achieved 100% convergence when N = 180.

Figure 4. Convergence rate.
Next, the relative bias of the parameters and standard errors for this model’s factor loadings (λ; see Figure 5a) and latent covariances (φ; see Figure 5b) was examined. Even when sample size was as small as 60, the biases in the point estimates of loadings were still trivial for both FIML and MI. The top-right panel shows that the SE bias for loadings was smaller with FIML than with MI. As N increased, the difference in SE bias between FIML and MI decreased. With larger sample sizes (N > 150), FIML and MI performed equally well, with bias values at or below .05. The latent covariances’ relative bias was much higher than the corresponding factor loadings’ relative bias at all sample sizes (N = 60 to 300). Specifically, when applying FIML, the latent covariances in Model 1 were accurately estimated once N ≥ 70, while MI estimates of Model 1’s latent covariances were accurate when N ≥ 115. Unlike for the loadings, the latent covariance SE bias showed no notable difference between FIML and MI at any sample size.

Figure 5. Parameter bias and SE bias for Model 1. λ and φ represent factor loadings and factor covariances, respectively.
None of the design factors had any notable effect on the point estimate bias, with either FIML or MI. Notable effects (η2 ≥ .01) were present only for SE bias. When estimating factor loadings with MI, the SE bias decreased with increases in N (η2 = .014), factor loading value (η2 = .016), and latent covariance value (η2 = .042). In the estimation of latent covariances, N had a negative effect on SE bias with both FIML and MI (η2 = .08 and .05, respectively).
MSE and RE in Model 1 (see Appendix A) did not show notable patterns. MSEs were almost identical (MSE < .05) when applying either FIML or MI. REs remained stable across sample sizes. On average, FIML tended to be less efficient than MI when estimating factor loadings, but more efficient for covariances.
In addition to Figure 5, Table 2 provides a decision guide for choosing a minimum sample size with Model 1. For FIML, N = 90 was the minimum sample size for the model to converge. This sample size was also large enough to achieve accurate estimates of the major parameters in the model. MI results showed a different pattern. A sample size of 110 was the minimum for a three-construct model to converge. However, to obtain accurate estimation of latent covariances, which are often a parameter of interest in a simple CFA model, the sample size should be no smaller than 115. To meet the most stringent criterion for SE (mean standard error bias across parameters smaller than .05), sample sizes of 120 and 155 appear to be needed for FIML and MI, respectively, when fitting cross-sectional models similar to Model 1.
Table 2. Decision table of sample size requirements for Model 1.
Note. λ = Factor loadings; φ = Factor covariances. C1 = Convergence > 0.9; C2 = Parameter bias < 0.05 and SE bias < 0.1; C3 = mean SE bias < 0.05.
Model 2
Figure 4b depicts convergence rates for Model 2. When N = 60, FIML’s convergence rate (27%) was much lower than for Model 1, whereas MI’s (56%) was similar to its Model 1 rate. However, FIML’s convergence rate reached 93% by N = 80, and 99% with N = 110. When MI was applied, Model 2 converged better than Model 1 at every sample size examined. MI’s convergence rate reached 90% with N = 90, and 99% with N = 135. Both MI and FIML attained acceptable convergence rates with smaller sample sizes for Model 2 than for Model 1.
Next, the parameter bias and SE bias for this two-time-point CFA model’s parameters were examined: factor loadings (λ; see Figure 6a), within-time-point latent covariances (φWT; Figure 6b), and between-time-point latent covariances (φBT; Figure 6c). Neither FIML nor MI produced notable parameter bias in estimating factor loadings, although FIML performed slightly better than MI. Even when sample size was as small as 60, the biases in point estimates of loadings were still below the .05 cutoff value for both FIML and MI. The right panel of Figure 6a shows that the SE biases of loadings for FIML and MI overlapped and were smaller than .1 at all sample sizes. The parameter bias of within-time-point latent covariances and between-time-point latent covariances told the same story: unbiased estimates were generated when N = 65 for FIML and N = 115 for MI. The SE biases of covariances for FIML and MI were very close and all smaller than .1, and both gradually decreased as N increased.

Figure 6. Parameter bias and SE bias for Model 2. λ, φWT and φBT represent factor loadings, within-time-point latent covariances, and between-time-point latent covariances, respectively.
None of the design factors had any notable effect on the point estimate bias, with either FIML or MI. For the FIML conditions, two effects were detected for the SE bias in factor loadings: N (η2 = .102) and factor loading value (η2 = .044). As N and the population value of the factor loadings increased, the bias in the factor loadings’ SEs became smaller. N was also found to have a negative effect on the SE bias of within-time-point latent covariances and between-time-point latent covariances (η2 = .015 and .017, respectively). For the MI conditions, the SE bias in factor loadings was influenced by two design factors, N (η2 = .012) and within-time-point covariance (η2 = .013). An increase in N and a decrease in the population value of the within-time-point covariance each reduced the SE bias in factor loadings. In addition, N influenced the SE bias of within-time-point covariance (η2 = .017).
Similar to Model 1, MSEs for FIML and MI in Model 2 (see Appendix B) were almost identical and all smaller than .05. The REs for the MI conditions were found to be lower than those for the FIML conditions (see Appendix B).
In addition to Figure 6, Table 3 provides a decision guide for choosing a minimum sample size with Model 2. For FIML, N = 80 was the minimum sample size for the model to converge with unbiased parameter estimates. MI results showed a slightly different pattern. A sample size of 90 was the minimum for our two-time-point model to converge. Similar to the pattern seen for Model 1’s cross-sectional CFA, a greater sample size (N ≥ 115) was needed to obtain accurate latent covariance estimates, which usually are the parameters of interest in a CFA model. However, for more precise estimation (mean SE bias < .05), 100 and 200 are suggested as the smallest sample sizes for FIML and MI, respectively, when fitting two-time-point models such as Model 2.
Table 3. Decision table of sample size requirements for Model 2.
Note. λ = Factor loadings; φWT = Within-time-point latent covariances; φBT = Between-time-point latent covariances. C1 = Convergence > 0.9; C2 = Parameter bias < 0.05 and SE bias < 0.1; C3 = mean SE bias < 0.05.
Model 3
Figure 4c depicts the convergence rates for the three-time-point mediation model (Model 3). When N = 80, convergence for both FIML (0%) and MI (30%) was poor for Model 3. FIML’s convergence rate reached 90% by N = 130, and MI’s reached 90% with N = 160. As expected, this more complex model needed a larger sample size for either FIML or MI to attain a 90% convergence rate.
Next, the parameter bias and SE bias for this model’s parameters were examined: factor loadings (λ; see Figure 7a), autoregressive coefficients (βAR; Figure 7b), and mediation pathways (βM, i.e., β5,1 and β9,5; see Figures 3 and 7c). Since FIML did not converge until N = 90, and MI also performed poorly at N = 60 and 80, we only present the bias and MSE when N ≥ 90. As long as Model 3 converged, the factor loadings’ parameter biases were smaller than .05 (see Figure 7a). Not surprisingly, the loadings’ SE bias was more sensitive to convergence, especially for FIML: bias was greater than .10 with FIML when N < 100. As sample size increased, SE bias for both FIML and MI diminished.

Figure 7. Parameter bias and SE bias for Model 3. λ, βAR and βM represent factor loadings, autoregressive effects and mediating effects, respectively.
The latent pathways’ relative parameter bias and SE bias decreased as sample size increased (Figure 7b). However, the patterns for FIML and MI differed both from each other and from those of the prior models. First, when FIML was applied at N = 90, the bias was .1; at N = 95, the FIML autoregressive pathways’ bias averaged .05, and it continued to improve as sample size increased. The SE bias for the FIML-estimated autoregressive paths showed a pattern similar to that of the loadings, but with slightly larger values at all sample sizes: it reached .10 by N = 105 and decreased as N increased. Second, when MI was used to handle missing data, the autoregressive pathway estimates’ bias averaged .07 when N = 90 and then gradually decreased until it reached .05 at N = 175, while the SE bias gradually decreased and fell below .10 at N = 105.
Since this is a mediation model, we were keenly interested in the mediation paths: β5,1 and β9,5. As depicted in Figure 7c, FIML performed extremely poorly in estimating these mediation paths until N ≥ 125. The trend for the SE bias when FIML estimated the mediation paths was similar to that for the autoregressive effects, only with slightly higher bias values. When MI was applied, the mediation pathway estimates’ bias tended to fluctuate around .07 until N ≥ 170, at which point acceptable (.05) estimate bias was reached. The SE bias for the MI-estimated mediation pathways was .11 at N = 90 and dropped as N increased.
The design factors had no effect on the point estimate bias for either the FIML or MI conditions. Notable negative effects of N on the estimates’ bias of autoregressive and mediation paths (η² = .04 and .013, respectively) were found for the FIML conditions. For the MI conditions, the SE bias in factor loadings was influenced by the population value of factor loadings (η² = .027). As the value of factor loadings increased from .7 to .85, the marginal mean of the factor loading’s SE bias changed from negative to positive.
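For readers unfamiliar with the effect size reported here, the following Python sketch computes η² for a single factor as the between-group sum of squares divided by the total sum of squares. It is a simplified one-way illustration under assumed inputs; the function name and the example values are hypothetical, and the factorial analysis behind the values reported above is not reproduced.

```python
from statistics import mean


def eta_squared(groups):
    """One-way eta-squared: SS_between / SS_total.

    `groups` is a list of lists, one list of outcome values (for example,
    bias values) per level of a single design factor.
    """
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    if ss_total == 0:
        raise ValueError("outcome has no variance; eta-squared is undefined")
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    return ss_between / ss_total


# Hypothetical bias values at two levels of a design factor.
example_groups = [[0.02, 0.04], [0.04, 0.06]]
example_eta_squared = eta_squared(example_groups)
```

With these made-up values, half of the total variation lies between the two levels, so the function returns 0.5.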
The MSE and RE values for Model 3 can be found in Appendix C. Similar to Model 2, the factor loadings’ MSE values showed no difference from applying FIML or MI, and were minimal (MSE < .03). For the autoregressive and the mediation effects, MSE values generated by FIML and MI were also very close. The factor loadings’ RE values for FIML conditions were generally smaller than those for MI conditions. In contrast, when estimating the autoregressive and the mediation effects FIML had higher RE than MI.
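As a rough guide to how summaries of this kind can be computed from a set of converged replications, consider the Python sketch below. The formulas are common simulation-study definitions and are assumptions here, not a restatement of this study's code: relative parameter bias is (mean estimate − true value) / true value, relative SE bias is (mean SE − empirical SD) / empirical SD, MSE is the mean squared deviation from the true value, and RE is the ratio of the complete-data to the missing-data sampling variance. All names and numbers are hypothetical.

```python
from statistics import mean, stdev, variance


def relative_parameter_bias(estimates, true_value):
    """Signed relative bias of the average estimate."""
    return (mean(estimates) - true_value) / true_value


def relative_se_bias(standard_errors, estimates):
    """Average estimated SE compared with the empirical SD of the estimates."""
    empirical_sd = stdev(estimates)
    return (mean(standard_errors) - empirical_sd) / empirical_sd


def mean_squared_error(estimates, true_value):
    """Average squared deviation of the estimates from the true value."""
    return mean([(e - true_value) ** 2 for e in estimates])


def relative_efficiency(complete_estimates, missing_estimates):
    """Sampling variance with complete data over that with missing data."""
    return variance(complete_estimates) / variance(missing_estimates)


# Made-up replication results for one parameter with true value .70.
true_loading = 0.70
missing_data_estimates = [0.70, 0.74, 0.72, 0.76]
missing_data_ses = [0.03, 0.03, 0.03, 0.03]
complete_data_estimates = [0.70, 0.72, 0.71, 0.73]

bias = relative_parameter_bias(missing_data_estimates, true_loading)
se_bias = relative_se_bias(missing_data_ses, missing_data_estimates)
mse = mean_squared_error(missing_data_estimates, true_loading)
re = relative_efficiency(complete_data_estimates, missing_data_estimates)
```

Under these definitions an RE below 1 indicates that the missing-data estimates vary more across replications than the complete-data estimates do.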
Table 4 serves as a decision guide when choosing a sample size for a three-time-point mediation model estimated using data collected with a PMD 3-form survey design. More than with the prior models, the sample sizes needed for acceptable convergence, unbiased parameter estimates, and unbiased SEs differed from one another. Therefore, taking all of these aspects into account, the minimum sample size needed with FIML would be 130 to attain an acceptable convergence rate and unbiased latent parameter estimates, while with MI one might need a sample size no smaller than 175. To achieve more precise estimation (mean SE < .05), larger sample sizes might be needed for both FIML and MI (160 and greater than 300, respectively).
Decision table of sample size requirements for Model 3.
Note. λ = Factor loadings; βAR = Autoregressive effects; βM = Mediating effects. C1 = Convergence > 0.9; C2 = Parameter bias < 0.05, and SE bias < 0.1; C3 = mean SE bias < 0.05.
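Criteria C1 and C2 from the table notes can be applied mechanically to a grid of simulation summaries. The Python sketch below is a hypothetical helper, not the authors' procedure: the `Summary` record, the function names, and the example numbers are illustrative assumptions. It returns the smallest N from which the criteria hold at that N and at every larger N in the grid.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Summary:
    n: int                 # sample size of the condition
    convergence: float     # proportion of replications that converged
    parameter_bias: float  # relative parameter bias
    se_bias: float         # relative SE bias


def meets_criteria(s: Summary) -> bool:
    """C1: convergence > .90; C2: parameter bias < .05 and SE bias < .10."""
    return (
        s.convergence > 0.90
        and abs(s.parameter_bias) < 0.05
        and abs(s.se_bias) < 0.10
    )


def minimum_n(summaries: Sequence[Summary]) -> Optional[int]:
    """Smallest N such that it and all larger N in the grid meet C1 and C2."""
    best = None
    for s in sorted(summaries, key=lambda item: item.n, reverse=True):
        if not meets_criteria(s):
            break  # a failure here invalidates every smaller N as well
        best = s.n
    return best


# Made-up summaries for illustration only.
example_grid = [
    Summary(80, 0.85, 0.06, 0.12),
    Summary(90, 0.92, 0.04, 0.11),
    Summary(100, 0.95, 0.04, 0.09),
    Summary(110, 0.97, 0.03, 0.08),
]
example_minimum = minimum_n(example_grid)
```

Scanning from the largest N downward is a deliberate choice in this sketch: it avoids recommending a sample size at which the criteria happen to be met by chance while a larger N still fails them.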
Discussion
We examined the ability to use a 3-form PMD design with small sample sizes. The three simulated models were a one-time-point, cross-sectional model with three latent constructs; a two-time-point model with three latent constructs at each time point; and a three-time-point mediation model with three latent constructs at each time point. Three-form PMD missingness and 5% MCAR missingness were imposed on the simulated data, which were then analyzed in a CFA or SEM framework. These analyses were conducted with either FIML estimation or 20 multiply imputed datasets.
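To make the missingness mechanism concrete, the Python sketch below imposes three-form planned missingness plus a small amount of MCAR missingness on complete data. The block layout (a common block X given to everyone, and rotating blocks A, B, and C, with each form omitting one of them), the random assignment of participants to forms, the application of MCAR to every administered item, and all names are assumptions for illustration; the item-to-block allocation used in this study is not reproduced here.

```python
import random

# Each form omits exactly one rotating block; block X is on every form.
FORM_OMITS = {1: "C", 2: "B", 3: "A"}


def impose_three_form(data, blocks, mcar_rate=0.05, seed=0):
    """Return (incomplete_data, forms).

    data:   list of rows, each a list of item scores
    blocks: dict mapping block name ("X", "A", "B", "C") to column indices
    Missing values are represented by None.
    """
    rng = random.Random(seed)
    incomplete, forms = [], []
    for row in data:
        form = rng.choice([1, 2, 3])
        omitted = set(blocks[FORM_OMITS[form]])
        new_row = []
        for col, value in enumerate(row):
            planned_missing = col in omitted
            mcar_missing = rng.random() < mcar_rate
            new_row.append(None if planned_missing or mcar_missing else value)
        incomplete.append(new_row)
        forms.append(form)
    return incomplete, forms


# Hypothetical layout: 9 items, 3 in block X and 2 in each rotating block.
example_blocks = {"X": [0, 1, 2], "A": [3, 4], "B": [5, 6], "C": [7, 8]}
complete_data = [[1.0] * 9 for _ in range(300)]
planned_only, assigned_forms = impose_three_form(
    complete_data, example_blocks, mcar_rate=0.0, seed=1
)
```

Setting `mcar_rate=0.0` isolates the planned component, which is convenient for checking that each participant is missing exactly one rotating block before the unplanned missingness is added.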
There were three major model characteristics used to assess the ability of FIML and MI to handle missing data when a 3-form PMD approach to data collection was utilized. Each of these characteristics displayed expected trends. The first trend was that convergence rates increased as sample size increased. This was true for both FIML and MI across all three models and all conditions. In addition, FIML consistently reached the 90% convergence rate at a smaller sample size than MI. The second trend was that the relative parameter bias diminished as sample size increased. Again, FIML estimation performed better than MI with respect to the sample size needed to produce estimates that met the acceptable relative bias cutoff. The third trend, seen across all three models and conditions, was that SE bias decreased gradually as the sample size increased. This trend was so consistent across models, conditions, and estimators that the SE bias values for FIML and MI were often indistinguishable.
The differences in performance between FIML and MI are understandable. In theory, FIML and MI produce equivalent results when the input data and model are the same and the number of imputations is infinite (Collins, Schafer, & Kam, 2001; Graham et al., 2007; Savalei & Rhemtulla, 2012). In our study, the first condition, that the input data and model used with FIML and MI are the same, was met. However, the second condition, infinite imputations for MI, was not met, as only 20 imputed datasets were used for each model replication. Given this limitation on imputed datasets, FIML outperforming MI is theoretically expected (Savalei & Rhemtulla, 2012), and this is exactly what our study’s findings illustrated. With very few exceptions, FIML produced more accurate parameter estimates than MI and needed a smaller sample size to meet the 90% convergence rate. In fact, as long as the sample size was large enough for the FIML-estimated model to converge, the parameter estimates were accurate. MI, on the other hand, needed a greater sample size to meet the 90% convergence rate and to estimate parameters with an acceptably small level of bias. As discussed in this article, the 20 imputations were chosen for both theoretical and practical reasons. “Equivalent results” between FIML and MI only exist when the number of imputations goes to infinity. Empirical studies have shown that to obtain estimates identical to FIML’s, one may need as many as 50,000 imputations (Savalei & Rhemtulla, 2012). Applied researchers are advised to use a relatively large number of imputations to obtain better estimates with MI when computing power allows, although estimates identical to FIML’s are not usually necessary. To illustrate, we conducted a follow-up study over a small set of conditions for each model, with N = 200. We found that increasing the number of imputations (e.g., 20, 100, or 500) moved the parameter bias toward the FIML values, while little effect was seen on the SE bias (see Appendix D).
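One way to see why a finite number of imputations matters is through Rubin's rules for pooling, in which the total sampling variance of a pooled estimate is the average within-imputation variance plus (1 + 1/m) times the between-imputation variance. The Python sketch below pools a single parameter across m imputed-data analyses. It is a generic illustration with made-up inputs, and the function name is hypothetical; it is not the pooling code used in this study.

```python
from statistics import mean, variance


def pool_rubin(estimates, squared_ses):
    """Pool one parameter across m imputations using Rubin's rules.

    estimates:   the m point estimates, one per imputed dataset
    squared_ses: the m squared standard errors (within-imputation variances)
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or m != len(squared_ses):
        raise ValueError("need at least two imputations with matching inputs")
    pooled_estimate = mean(estimates)
    within = mean(squared_ses)     # average within-imputation variance
    between = variance(estimates)  # between-imputation variance (m - 1 divisor)
    total_variance = within + (1 + 1 / m) * between
    return pooled_estimate, total_variance


# Made-up results from m = 5 imputed datasets.
example_estimates = [0.50, 0.54, 0.46, 0.52, 0.48]
example_squared_ses = [0.010] * 5
pooled, total = pool_rubin(example_estimates, example_squared_ses)
```

The between-imputation term is inflated by 1/m, so with few imputations the pooled variance carries extra simulation noise that shrinks only as m grows.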
A final trend illustrated by this study’s results is the relationship among sample size needs, model complexity, and the data available to inform the estimation process. This relationship is a good example of the importance of the fraction of missing information (FMI) to model convergence (Longford, 2005). Among the three models, the two-time-point CFA model (i.e., Model 2) attained the 90% convergence rate at the lowest sample sizes (N = 80 for FIML and N = 90 for MI). FMI is an informative measure of the information lost due to missingness for each parameter. Unlike percent missing data (i.e., a global measure of dataset missingness), FMI is computed for each parameter. FMI is equal to 1 minus the ratio of the sampling variance of the complete-data estimate to that of the missing-data estimate. Therefore, FMI ranges from 0 to 1, and a higher value of FMI suggests a stronger impact of missing data (Savalei & Rhemtulla, 2012). Savalei and Rhemtulla (2012) proposed a 3-step procedure to compute FMI via FIML. Following these steps, we found that with N = 200 the average FMI values across all parameters for Model 1, Model 2, and Model 3 were 0.31, 0.26, and 0.36, respectively. These values suggest that Model 2 converged more successfully at the smaller sample sizes because more information was available to inform the model’s complete covariance matrix than for Models 1 and 3.
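The verbal definition of FMI given above translates directly into a short computation. In the Python sketch below, the two sampling variances are hypothetical inputs (for example, squared standard errors of the same parameter from a complete-data fit and from a missing-data fit). The sketch does not implement the three-step FIML procedure of Savalei and Rhemtulla (2012), and the function name is an assumption.

```python
def fraction_missing_information(var_complete, var_missing):
    """FMI = 1 - (complete-data sampling variance / missing-data sampling variance)."""
    if var_complete <= 0 or var_missing <= 0:
        raise ValueError("sampling variances must be positive")
    return 1.0 - var_complete / var_missing


# Made-up sampling variances for a single parameter.
example_fmi = fraction_missing_information(var_complete=0.0075, var_missing=0.0100)
```

With these illustrative inputs the result is 0.25, meaning that a quarter of the information about this parameter is lost to missingness.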
Application
An important application of this finding is that when preparing to collect longitudinal data with a 3-form PMD design, it is important to be aware of the FMI differences among the different data collection waves, to ensure that the covariance matrix is covered well enough for the estimator to converge on a model with acceptably unbiased parameter estimates. For example, say a researcher conducted a power analysis, got an idea of the FMI for a two-time-point CFA model, and determined that the minimum sample size necessary for that model was 80 participants. When this researcher went to analyze the first-wave-only data, s/he might not have enough information for the estimator to converge on a model. If the researcher had run the power analysis and learned the FMI for a cross-sectional CFA, then s/he would have known that a higher minimum sample size was actually necessary for a cross-sectional model to converge. This is an important finding for applied researchers, as relationships are often published from both the single-wave and multi-wave data that are collected during longitudinal studies.
There are ways to increase a model’s convergence rate, and they depend on whether FIML or MI is used. The inclusion of auxiliary variables can improve the FMI for a model estimated with FIML. In this study, all available variables were included in the estimated model. However, in applied research there are often variables not included in the model that may still be able to inform the estimation of the model’s covariance matrix. The traditional benefit of the MI approach over FIML is that all the variables collected can be included in the imputation process. Therefore, any “auxiliary” information is included. MI-estimated models can benefit, however, from a greater number of imputed datasets. As mentioned earlier, when all else is the same, the greater the number of imputed datasets used, the closer the MI results will be to those of FIML.
In conclusion, when applied researchers utilize PMD designs, obtaining analytic models that converge successfully with acceptably unbiased parameter estimates is not simply a matter of a larger sample being better. How much larger the sample needs to be depends on the complexity of the analytic model, the missing data technique used, the criteria for estimation assessment, and the FMI. This study’s preliminary results suggest that when researchers use a 3-form PMD design with factor loadings greater than .7 and covariances of .2 to .5, reasonable convergence (convergence rate > .9) and reasonable parameter and SE estimates (parameter bias < .05 and SE bias < .1) may be obtained for one-time-point, cross-sectional models with a minimum of N = 90 when FIML is utilized and N = 115 when MI is utilized. As model complexity increases with multiple time points, it will be important for researchers to take into account the estimator and the parameters’ FMI to determine an appropriate increase in sample size. This study’s initial findings suggest that when population values of autoregressive coefficients range from .4 to .9 and cross-lagged paths range from .1 to .4, a 3-form PMD design needs at minimum N = 80 for FIML and N = 115 for MI to fit two-time-point CFAs, and N = 130 for FIML and N = 175 for MI with three-construct mediation models. When estimating the sample size necessary for a study, it will be important for researchers to include a power analysis that reflects the amount and pattern of missingness present due to the PMD approach utilized.
Footnotes
*This article was accepted during Marcel van Aken’s term as Editor-in-Chief.
Funding
This study was supported by grant NSF 1053160 (Wei Wu & Todd D. Little, co-PIs) and by the Center for Research Methods and Data Analysis at the University of Kansas (when Todd D. Little was director; 2009–2013). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the funding agencies. Todd D. Little is now director of the Institute for Measurement, Methodology, Analysis, and Policy at Texas Tech University.
Appendix A
Parameter bias, SE bias, MSE and RE of factor loadings for Model 1.
| N | Parameter bias (FIML) | Parameter bias (MI) | SE bias (FIML) | SE bias (MI) | MSE (FIML) | MSE (MI) | RE (FIML) | RE (MI) |
|---|---|---|---|---|---|---|---|---|
| 60 | 0.015 | 0.025 | 0.046 | 0.091 | 0.049 | 0.045 | 0.825 | 0.960 |
| 65 | 0.013 | 0.024 | 0.054 | 0.119 | 0.044 | 0.040 | 0.842 | 0.955 |
| 70 | 0.013 | 0.023 | 0.052 | 0.095 | 0.042 | 0.038 | 0.841 | 0.945 |
| 75 | 0.011 | 0.021 | 0.049 | 0.099 | 0.039 | 0.036 | 0.837 | 0.954 |
| 80 | 0.012 | 0.023 | 0.047 | 0.081 | 0.037 | 0.035 | 0.863 | 0.935 |
| 85 | 0.011 | 0.021 | 0.046 | 0.077 | 0.035 | 0.032 | 0.844 | 0.946 |
| 90 | 0.010 | 0.023 | 0.046 | 0.074 | 0.033 | 0.031 | 0.844 | 0.899 |
| 95 | 0.009 | 0.021 | 0.050 | 0.073 | 0.031 | 0.030 | 0.874 | 0.932 |
| 100 | 0.010 | 0.020 | 0.046 | 0.076 | 0.030 | 0.028 | 0.854 | 0.924 |
| 105 | 0.009 | 0.020 | 0.048 | 0.057 | 0.028 | 0.027 | 0.876 | 0.926 |
| 110 | 0.008 | 0.019 | 0.046 | 0.059 | 0.027 | 0.026 | 0.856 | 0.891 |
| 115 | 0.007 | 0.019 | 0.046 | 0.054 | 0.026 | 0.025 | 0.883 | 0.945 |
| 120 | 0.008 | 0.019 | 0.047 | 0.054 | 0.025 | 0.024 | 0.857 | 0.905 |
| 125 | 0.008 | 0.019 | 0.046 | 0.063 | 0.024 | 0.022 | 0.885 | 0.911 |
| 130 | 0.007 | 0.019 | 0.053 | 0.060 | 0.023 | 0.022 | 0.874 | 0.890 |
| 135 | 0.008 | 0.018 | 0.048 | 0.054 | 0.022 | 0.021 | 0.880 | 0.927 |
| 140 | 0.008 | 0.019 | 0.043 | 0.050 | 0.021 | 0.021 | 0.870 | 0.910 |
| 145 | 0.007 | 0.019 | 0.037 | 0.048 | 0.020 | 0.020 | 0.873 | 0.886 |
| 150 | 0.007 | 0.018 | 0.046 | 0.050 | 0.020 | 0.019 | 0.867 | 0.893 |
| 155 | 0.006 | 0.018 | 0.039 | 0.047 | 0.019 | 0.018 | 0.849 | 0.897 |
| 160 | 0.006 | 0.019 | 0.043 | 0.048 | 0.019 | 0.018 | 0.883 | 0.911 |
| 180 | 0.007 | 0.019 | 0.042 | 0.044 | 0.017 | 0.016 | 0.879 | 0.899 |
| 200 | 0.006 | 0.018 | 0.043 | 0.046 | 0.015 | 0.014 | 0.890 | 0.905 |
| 220 | 0.005 | 0.018 | 0.043 | 0.045 | 0.014 | 0.013 | 0.862 | 0.878 |
| 240 | 0.005 | 0.018 | 0.040 | 0.043 | 0.012 | 0.012 | 0.874 | 0.881 |
| 260 | 0.005 | 0.018 | 0.041 | 0.039 | 0.011 | 0.011 | 0.846 | 0.870 |
| 280 | 0.005 | 0.018 | 0.042 | 0.041 | 0.011 | 0.010 | 0.845 | 0.856 |
| 300 | 0.004 | 0.018 | 0.046 | 0.044 | 0.010 | 0.010 | 0.838 | 0.848 |

Parameter bias, SE bias, MSE and RE of latent covariances for Model 1.
| N | Parameter bias (FIML) | Parameter bias (MI) | SE bias (FIML) | SE bias (MI) | MSE (FIML) | MSE (MI) | RE (FIML) | RE (MI) |
|---|---|---|---|---|---|---|---|---|
| 60 | 0.061 | 0.141 | 0.084 | 0.100 | 0.059 | 0.069 | 0.859 | 0.916 |
| 65 | 0.053 | 0.124 | 0.080 | 0.087 | 0.054 | 0.060 | 0.843 | 0.829 |
| 70 | 0.048 | 0.095 | 0.100 | 0.105 | 0.053 | 0.058 | 0.821 | 0.777 |
| 75 | 0.045 | 0.083 | 0.084 | 0.078 | 0.047 | 0.050 | 0.862 | 0.848 |
| 80 | 0.048 | 0.091 | 0.084 | 0.089 | 0.043 | 0.048 | 0.864 | 0.846 |
| 85 | 0.045 | 0.068 | 0.080 | 0.076 | 0.041 | 0.044 | 0.865 | 0.849 |
| 90 | 0.051 | 0.077 | 0.076 | 0.080 | 0.038 | 0.042 | 0.855 | 0.803 |
| 95 | 0.048 | 0.070 | 0.068 | 0.078 | 0.036 | 0.040 | 0.859 | 0.786 |
| 100 | 0.046 | 0.074 | 0.071 | 0.072 | 0.034 | 0.037 | 0.845 | 0.818 |
| 105 | 0.046 | 0.065 | 0.066 | 0.074 | 0.032 | 0.035 | 0.856 | 0.797 |
| 110 | 0.036 | 0.054 | 0.064 | 0.075 | 0.030 | 0.034 | 0.838 | 0.778 |
| 115 | 0.037 | 0.051 | 0.068 | 0.076 | 0.029 | 0.032 | 0.856 | 0.788 |
| 120 | 0.038 | 0.054 | 0.050 | 0.060 | 0.027 | 0.030 | 0.867 | 0.797 |
| 125 | 0.033 | 0.049 | 0.047 | 0.060 | 0.025 | 0.029 | 0.871 | 0.781 |
| 130 | 0.031 | 0.048 | 0.051 | 0.058 | 0.025 | 0.028 | 0.843 | 0.767 |
| 135 | 0.027 | 0.042 | 0.043 | 0.055 | 0.023 | 0.026 | 0.871 | 0.791 |
| 140 | 0.030 | 0.042 | 0.047 | 0.057 | 0.023 | 0.025 | 0.855 | 0.786 |
| 145 | 0.029 | 0.040 | 0.053 | 0.059 | 0.022 | 0.025 | 0.865 | 0.801 |
| 150 | 0.030 | 0.044 | 0.048 | 0.053 | 0.021 | 0.023 | 0.865 | 0.807 |
| 155 | 0.031 | 0.048 | 0.041 | 0.047 | 0.020 | 0.022 | 0.887 | 0.831 |
| 160 | 0.031 | 0.045 | 0.040 | 0.048 | 0.019 | 0.022 | 0.886 | 0.815 |
| 180 | 0.030 | 0.041 | 0.041 | 0.050 | 0.017 | 0.019 | 0.867 | 0.790 |
| 200 | 0.029 | 0.043 | 0.036 | 0.040 | 0.015 | 0.017 | 0.882 | 0.806 |
| 220 | 0.027 | 0.044 | 0.043 | 0.051 | 0.014 | 0.016 | 0.879 | 0.813 |
| 240 | 0.023 | 0.036 | 0.049 | 0.049 | 0.012 | 0.014 | 0.886 | 0.813 |
| 260 | 0.023 | 0.040 | 0.042 | 0.048 | 0.012 | 0.013 | 0.851 | 0.790 |
| 280 | 0.022 | 0.038 | 0.047 | 0.047 | 0.011 | 0.012 | 0.869 | 0.790 |
| 300 | 0.019 | 0.035 | 0.037 | 0.044 | 0.010 | 0.011 | 0.864 | 0.800 |
Appendix B
Parameter bias, SE bias, MSE and RE of factor loadings for Model 2.
| N | Parameter bias (FIML) | Parameter bias (MI) | SE bias (FIML) | SE bias (MI) | MSE (FIML) | MSE (MI) | RE (FIML) | RE (MI) |
|---|---|---|---|---|---|---|---|---|
| 60 | 0.025 | 0.040 | 0.086 | 0.099 | 0.036 | 0.037 | 0.859 | 0.890 |
| 65 | 0.019 | 0.037 | 0.073 | 0.082 | 0.034 | 0.034 | 0.867 | 0.872 |
| 70 | 0.015 | 0.036 | 0.072 | 0.074 | 0.032 | 0.032 | 0.846 | 0.876 |
| 75 | 0.013 | 0.035 | 0.069 | 0.065 | 0.029 | 0.030 | 0.851 | 0.865 |
| 80 | 0.012 | 0.033 | 0.060 | 0.059 | 0.027 | 0.028 | 0.858 | 0.861 |
| 85 | 0.012 | 0.034 | 0.060 | 0.057 | 0.025 | 0.026 | 0.846 | 0.850 |
| 90 | 0.011 | 0.031 | 0.062 | 0.054 | 0.024 | 0.025 | 0.845 | 0.842 |
| 95 | 0.011 | 0.030 | 0.055 | 0.050 | 0.022 | 0.023 | 0.842 | 0.831 |
| 100 | 0.009 | 0.028 | 0.052 | 0.048 | 0.021 | 0.022 | 0.858 | 0.843 |
| 105 | 0.009 | 0.028 | 0.054 | 0.046 | 0.020 | 0.021 | 0.839 | 0.820 |
| 110 | 0.009 | 0.027 | 0.055 | 0.048 | 0.019 | 0.020 | 0.847 | 0.836 |
| 115 | 0.009 | 0.026 | 0.052 | 0.046 | 0.018 | 0.019 | 0.846 | 0.839 |
| 120 | 0.009 | 0.024 | 0.052 | 0.048 | 0.017 | 0.018 | 0.858 | 0.840 |
| 125 | 0.008 | 0.026 | 0.051 | 0.045 | 0.017 | 0.017 | 0.871 | 0.859 |
| 130 | 0.008 | 0.025 | 0.054 | 0.049 | 0.016 | 0.017 | 0.861 | 0.845 |
| 135 | 0.007 | 0.024 | 0.049 | 0.045 | 0.015 | 0.016 | 0.858 | 0.847 |
| 140 | 0.007 | 0.024 | 0.050 | 0.048 | 0.015 | 0.015 | 0.849 | 0.820 |
| 145 | 0.006 | 0.022 | 0.050 | 0.047 | 0.014 | 0.015 | 0.864 | 0.851 |
| 150 | 0.006 | 0.022 | 0.050 | 0.047 | 0.014 | 0.014 | 0.858 | 0.849 |
| 155 | 0.006 | 0.021 | 0.047 | 0.044 | 0.013 | 0.013 | 0.861 | 0.846 |
| 160 | 0.006 | 0.016 | 0.048 | 0.048 | 0.013 | 0.014 | 0.863 | 0.832 |
| 180 | 0.006 | 0.014 | 0.045 | 0.045 | 0.011 | 0.012 | 0.848 | 0.824 |
| 200 | 0.005 | 0.013 | 0.041 | 0.042 | 0.010 | 0.011 | 0.857 | 0.818 |
| 220 | 0.006 | 0.012 | 0.040 | 0.042 | 0.009 | 0.010 | 0.856 | 0.818 |
| 240 | 0.005 | 0.011 | 0.039 | 0.041 | 0.008 | 0.009 | 0.858 | 0.828 |
| 260 | 0.006 | 0.011 | 0.040 | 0.042 | 0.007 | 0.008 | 0.858 | 0.814 |
| 280 | 0.006 | 0.010 | 0.041 | 0.043 | 0.007 | 0.007 | 0.851 | 0.804 |
| 300 | 0.005 | 0.009 | 0.039 | 0.040 | 0.006 | 0.007 | 0.852 | 0.825 |

Parameter bias, SE bias, MSE and RE of within-time-point latent covariances for Model 2.
| N | Parameter bias (FIML) | Parameter bias (MI) | SE bias (FIML) | SE bias (MI) | MSE (FIML) | MSE (MI) | RE (FIML) | RE (MI) |
|---|---|---|---|---|---|---|---|---|
| 60 | 0.061 | 0.109 | 0.095 | 0.079 | 0.072 | 0.084 | 0.861 | 0.743 |
| 65 | 0.042 | 0.097 | 0.074 | 0.071 | 0.064 | 0.076 | 0.866 | 0.751 |
| 70 | 0.038 | 0.092 | 0.068 | 0.070 | 0.059 | 0.069 | 0.843 | 0.747 |
| 75 | 0.036 | 0.087 | 0.064 | 0.067 | 0.054 | 0.065 | 0.841 | 0.729 |
| 80 | 0.033 | 0.078 | 0.062 | 0.065 | 0.050 | 0.060 | 0.841 | 0.732 |
| 85 | 0.032 | 0.071 | 0.058 | 0.063 | 0.047 | 0.056 | 0.840 | 0.724 |
| 90 | 0.029 | 0.063 | 0.057 | 0.065 | 0.044 | 0.052 | 0.840 | 0.726 |
| 95 | 0.027 | 0.058 | 0.046 | 0.053 | 0.040 | 0.048 | 0.848 | 0.732 |
| 100 | 0.026 | 0.055 | 0.043 | 0.053 | 0.037 | 0.044 | 0.840 | 0.732 |
| 105 | 0.025 | 0.056 | 0.044 | 0.052 | 0.036 | 0.042 | 0.844 | 0.737 |
| 110 | 0.023 | 0.051 | 0.042 | 0.052 | 0.034 | 0.040 | 0.842 | 0.740 |
| 115 | 0.022 | 0.047 | 0.040 | 0.051 | 0.032 | 0.038 | 0.845 | 0.749 |
| 120 | 0.020 | 0.039 | 0.042 | 0.054 | 0.031 | 0.037 | 0.843 | 0.744 |
| 125 | 0.020 | 0.040 | 0.043 | 0.055 | 0.030 | 0.035 | 0.847 | 0.745 |
| 130 | 0.020 | 0.042 | 0.044 | 0.059 | 0.029 | 0.034 | 0.849 | 0.743 |
| 135 | 0.020 | 0.037 | 0.041 | 0.056 | 0.027 | 0.032 | 0.857 | 0.754 |
| 140 | 0.018 | 0.038 | 0.042 | 0.058 | 0.027 | 0.031 | 0.857 | 0.759 |
| 145 | 0.018 | 0.033 | 0.043 | 0.060 | 0.026 | 0.030 | 0.862 | 0.770 |
| 150 | 0.017 | 0.035 | 0.043 | 0.058 | 0.025 | 0.029 | 0.861 | 0.769 |
| 155 | 0.018 | 0.035 | 0.043 | 0.059 | 0.024 | 0.028 | 0.860 | 0.771 |
| 160 | 0.016 | 0.022 | 0.046 | 0.061 | 0.023 | 0.027 | 0.862 | 0.772 |
| 180 | 0.016 | 0.023 | 0.043 | 0.060 | 0.021 | 0.024 | 0.866 | 0.779 |
| 200 | 0.014 | 0.020 | 0.035 | 0.048 | 0.018 | 0.021 | 0.865 | 0.783 |
| 220 | 0.015 | 0.020 | 0.034 | 0.043 | 0.016 | 0.018 | 0.865 | 0.791 |
| 240 | 0.016 | 0.020 | 0.033 | 0.042 | 0.015 | 0.017 | 0.863 | 0.793 |
| 260 | 0.013 | 0.016 | 0.035 | 0.043 | 0.014 | 0.015 | 0.864 | 0.795 |
| 280 | 0.014 | 0.017 | 0.034 | 0.039 | 0.013 | 0.014 | 0.858 | 0.788 |
| 300 | 0.012 | 0.014 | 0.032 | 0.040 | 0.012 | 0.013 | 0.869 | 0.806 |

Parameter bias, SE bias, MSE and RE of between-time-point latent covariances for Model 2.
| N | Parameter bias (FIML) | Parameter bias (MI) | SE bias (FIML) | SE bias (MI) | MSE (FIML) | MSE (MI) | RE (FIML) | RE (MI) |
|---|---|---|---|---|---|---|---|---|
| 60 | 0.059 | 0.106 | 0.099 | 0.088 | 0.069 | 0.080 | 0.822 | 0.725 |
| 65 | 0.042 | 0.095 | 0.082 | 0.076 | 0.061 | 0.072 | 0.815 | 0.726 |
| 70 | 0.037 | 0.093 | 0.077 | 0.069 | 0.056 | 0.066 | 0.807 | 0.739 |
| 75 | 0.035 | 0.087 | 0.072 | 0.067 | 0.051 | 0.061 | 0.804 | 0.706 |
| 80 | 0.033 | 0.079 | 0.065 | 0.065 | 0.047 | 0.056 | 0.807 | 0.707 |
| 85 | 0.032 | 0.076 | 0.065 | 0.061 | 0.044 | 0.053 | 0.798 | 0.701 |
| 90 | 0.031 | 0.070 | 0.066 | 0.064 | 0.041 | 0.050 | 0.792 | 0.691 |
| 95 | 0.029 | 0.066 | 0.057 | 0.055 | 0.038 | 0.045 | 0.804 | 0.708 |
| 100 | 0.026 | 0.059 | 0.053 | 0.058 | 0.035 | 0.042 | 0.799 | 0.701 |
| 105 | 0.025 | 0.060 | 0.055 | 0.054 | 0.034 | 0.040 | 0.807 | 0.718 |
| 110 | 0.023 | 0.055 | 0.050 | 0.054 | 0.032 | 0.037 | 0.809 | 0.723 |
| 115 | 0.022 | 0.051 | 0.048 | 0.052 | 0.030 | 0.035 | 0.810 | 0.730 |
| 120 | 0.021 | 0.046 | 0.049 | 0.056 | 0.029 | 0.034 | 0.808 | 0.722 |
| 125 | 0.021 | 0.048 | 0.047 | 0.053 | 0.027 | 0.032 | 0.818 | 0.726 |
| 130 | 0.021 | 0.048 | 0.047 | 0.056 | 0.026 | 0.031 | 0.811 | 0.726 |
| 135 | 0.020 | 0.045 | 0.046 | 0.053 | 0.025 | 0.029 | 0.818 | 0.733 |
| 140 | 0.021 | 0.046 | 0.047 | 0.055 | 0.024 | 0.028 | 0.823 | 0.742 |
| 145 | 0.020 | 0.043 | 0.047 | 0.057 | 0.023 | 0.027 | 0.819 | 0.738 |
| 150 | 0.019 | 0.043 | 0.045 | 0.054 | 0.023 | 0.026 | 0.825 | 0.746 |
| 155 | 0.018 | 0.042 | 0.048 | 0.057 | 0.022 | 0.025 | 0.829 | 0.753 |
| 160 | 0.017 | 0.030 | 0.048 | 0.060 | 0.021 | 0.025 | 0.829 | 0.743 |
| 180 | 0.016 | 0.027 | 0.044 | 0.056 | 0.019 | 0.022 | 0.827 | 0.740 |
| 200 | 0.014 | 0.026 | 0.042 | 0.053 | 0.017 | 0.019 | 0.839 | 0.758 |
| 220 | 0.013 | 0.023 | 0.037 | 0.045 | 0.015 | 0.017 | 0.827 | 0.759 |
| 240 | 0.013 | 0.021 | 0.037 | 0.045 | 0.014 | 0.015 | 0.831 | 0.764 |
| 260 | 0.012 | 0.019 | 0.037 | 0.044 | 0.012 | 0.014 | 0.836 | 0.769 |
| 280 | 0.012 | 0.018 | 0.035 | 0.039 | 0.011 | 0.013 | 0.833 | 0.768 |
| 300 | 0.011 | 0.017 | 0.038 | 0.043 | 0.011 | 0.012 | 0.836 | 0.766 |
Appendix C
Parameter bias, SE bias, MSE and RE of factor loadings for Model 3.
| N | Parameter bias (FIML) | Parameter bias (MI) | SE bias (FIML) | SE bias (MI) | MSE (FIML) | MSE (MI) | RE (FIML) | RE (MI) |
|---|---|---|---|---|---|---|---|---|
| 90 | 0.045 | 0.037 | 0.404 | 0.065 | 0.019 | 0.018 | 0.633 | 0.836 |
| 95 | 0.023 | 0.034 | 0.147 | 0.078 | 0.018 | 0.018 | 0.752 | 0.806 |
| 100 | 0.018 | 0.034 | 0.09 | 0.078 | 0.018 | 0.016 | 0.806 | 0.843 |
| 105 | 0.012 | 0.032 | 0.076 | 0.061 | 0.016 | 0.016 | 0.842 | 0.836 |
| 110 | 0.01 | 0.031 | 0.071 | 0.068 | 0.015 | 0.015 | 0.844 | 0.826 |
| 115 | 0.009 | 0.029 | 0.06 | 0.057 | 0.015 | 0.015 | 0.843 | 0.84 |
| 120 | 0.008 | 0.029 | 0.055 | 0.059 | 0.014 | 0.014 | 0.792 | 0.835 |
| 140 | 0.009 | 0.027 | 0.056 | 0.06 | 0.012 | 0.012 | 0.784 | 0.822 |
| 160 | 0.006 | 0.025 | 0.048 | 0.062 | 0.011 | 0.011 | 0.792 | 0.82 |
| 180 | 0.006 | 0.023 | 0.052 | 0.059 | 0.01 | 0.009 | 0.762 | 0.832 |
| 200 | 0.005 | 0.023 | 0.046 | 0.059 | 0.008 | 0.008 | 0.792 | 0.834 |
| 220 | 0.005 | 0.021 | 0.045 | 0.054 | 0.008 | 0.008 | 0.82 | 0.828 |
| 240 | 0.005 | 0.021 | 0.047 | 0.055 | 0.007 | 0.007 | 0.834 | 0.83 |
| 260 | 0.005 | 0.021 | 0.046 | 0.051 | 0.006 | 0.007 | 0.802 | 0.83 |
| 280 | 0.005 | 0.02 | 0.046 | 0.053 | 0.006 | 0.006 | 0.866 | 0.833 |
| 300 | 0.005 | 0.019 | 0.045 | 0.055 | 0.006 | 0.006 | 0.844 | 0.838 |

Parameter bias, SE bias, MSE and RE of autoregressive coefficients for Model 3.
| N | Parameter bias (FIML) | Parameter bias (MI) | SE bias (FIML) | SE bias (MI) | MSE (FIML) | MSE (MI) | RE (FIML) | RE (MI) |
|---|---|---|---|---|---|---|---|---|
| 90 | 0.099 | 0.068 | 0.379 | 0.1 | 0.057 | 0.046 | 0.582 | 0.605 |
| 95 | 0.047 | 0.067 | 0.166 | 0.111 | 0.046 | 0.044 | 0.684 | 0.574 |
| 100 | 0.03 | 0.067 | 0.107 | 0.108 | 0.042 | 0.043 | 0.724 | 0.62 |
| 105 | 0.028 | 0.066 | 0.09 | 0.094 | 0.04 | 0.041 | 0.726 | 0.633 |
| 110 | 0.025 | 0.065 | 0.084 | 0.098 | 0.037 | 0.041 | 0.741 | 0.641 |
| 115 | 0.021 | 0.063 | 0.079 | 0.094 | 0.034 | 0.04 | 0.727 | 0.628 |
| 120 | 0.017 | 0.062 | 0.072 | 0.095 | 0.032 | 0.038 | 0.722 | 0.643 |
| 140 | 0.019 | 0.057 | 0.067 | 0.081 | 0.027 | 0.033 | 0.728 | 0.649 |
| 160 | 0.015 | 0.052 | 0.061 | 0.078 | 0.022 | 0.027 | 0.72 | 0.668 |
| 180 | 0.012 | 0.049 | 0.061 | 0.068 | 0.02 | 0.024 | 0.72 | 0.666 |
| 200 | 0.012 | 0.047 | 0.058 | 0.068 | 0.017 | 0.021 | 0.738 | 0.685 |
| 220 | 0.011 | 0.043 | 0.054 | 0.062 | 0.015 | 0.018 | 0.739 | 0.676 |
| 240 | 0.011 | 0.041 | 0.049 | 0.069 | 0.014 | 0.016 | 0.789 | 0.697 |
| 260 | 0.01 | 0.04 | 0.05 | 0.058 | 0.013 | 0.015 | 0.785 | 0.705 |
| 280 | 0.01 | 0.038 | 0.048 | 0.054 | 0.012 | 0.014 | 0.776 | 0.705 |
| 300 | 0.008 | 0.036 | 0.049 | 0.053 | 0.011 | 0.013 | 0.779 | 0.696 |

Parameter bias, SE bias, MSE and RE of mediation pathways for Model 3.
| N | Parameter bias (FIML) | Parameter bias (MI) | SE bias (FIML) | SE bias (MI) | MSE (FIML) | MSE (MI) | RE (FIML) | RE (MI) |
|---|---|---|---|---|---|---|---|---|
| 90 | 0.344 | 0.068 | 0.343 | 0.107 | 0.081 | 0.06 | 0.597 | 0.568 |
| 95 | 0.149 | 0.069 | 0.169 | 0.103 | 0.064 | 0.058 | 0.668 | 0.519 |
| 100 | 0.089 | 0.069 | 0.104 | 0.12 | 0.055 | 0.056 | 0.709 | 0.567 |
| 105 | 0.078 | 0.067 | 0.116 | 0.103 | 0.053 | 0.052 | 0.721 | 0.579 |
| 110 | 0.06 | 0.075 | 0.089 | 0.124 | 0.047 | 0.054 | 0.715 | 0.571 |
| 115 | 0.063 | 0.066 | 0.085 | 0.097 | 0.043 | 0.053 | 0.721 | 0.581 |
| 120 | 0.055 | 0.063 | 0.075 | 0.092 | 0.04 | 0.05 | 0.713 | 0.572 |
| 140 | 0.05 | 0.059 | 0.065 | 0.079 | 0.033 | 0.041 | 0.699 | 0.577 |
| 160 | 0.038 | 0.055 | 0.056 | 0.072 | 0.027 | 0.033 | 0.708 | 0.604 |
| 180 | 0.032 | 0.044 | 0.052 | 0.069 | 0.023 | 0.028 | 0.702 | 0.609 |
| 200 | 0.03 | 0.046 | 0.051 | 0.06 | 0.02 | 0.024 | 0.702 | 0.619 |
| 220 | 0.024 | 0.042 | 0.05 | 0.051 | 0.018 | 0.021 | 0.717 | 0.636 |
| 240 | 0.033 | 0.037 | 0.052 | 0.058 | 0.016 | 0.019 | 0.739 | 0.644 |
| 260 | 0.025 | 0.035 | 0.048 | 0.052 | 0.015 | 0.018 | 0.737 | 0.642 |
| 280 | 0.03 | 0.031 | 0.051 | 0.055 | 0.014 | 0.016 | 0.714 | 0.657 |
| 300 | 0.025 | 0.03 | 0.045 | 0.053 | 0.013 | 0.015 | 0.73 | 0.659 |
