Abstract
The topic of missing data has been receiving increasing attention, with calls to apply advanced methods of handling missingness to counseling psychology research. The present study sought to assess whether advanced methods of handling item-level missing data performed equivalently to simpler methods in designs similar to those counseling psychologists typically engage in. Results of an initial preliminary analysis, an analysis using real-world data, and a series of simulation studies were used in the present investigation. Results indicated that available case analysis, mean substitution, and multiple imputation had similar results across low levels of missing data, though in data with higher levels of missing data and other problems (e.g., small sample size or scales with weak internal reliability) mean substitution produced inflation of correlation coefficients among items. The present results support the use of available case analysis when dealing with low-level item-level missingness.
The topic of missing data has garnered increasing interest in the past several years, with a small but influential literature base devoted to the topic. Within counseling psychology research, there have been calls for the use of sophisticated and complex methods to handle missing data (Schlomer, Bauman, & Card, 2010). Such calls are perhaps spurred on by advances in computational power and software accessibility that enable analyses that may have taken hours of dedicated mainframe time in the past to now be completed in seconds on a personal computer. Calls for increasing attention to advanced methodology within counseling psychology are encouraging insomuch as they speak to the importance of, and the eagerness of researchers to use, the best available nascent methodologies. However, there is a potential limitation if these complex methods are not necessary in many research contexts. Specifically, advanced methods of handling missing data were originally developed for, and are important to, longitudinal methodology with time-wave-level missing data but are often extrapolated to use with low-level, item-level missingness on multi-item scales (i.e., the type of missingness that most often confronts researchers in counseling psychology).
It is not clear whether these advanced techniques are necessary to deal with missingness that appears primarily in the context of individual items missing from multi-item scales. For example, in Schafer and Graham’s (2002) excellent review of best practices in handling missing data, item-level, scale-level, and time-wave-level missingness are discussed, but throughout much of the article there is little distinction between these categories when describing implications of imputation methods. Furthermore, examples given by Schafer and Graham are almost entirely of scale-level missingness, blurring the differences between levels of missingness. The same blurring is present within the paper by Schlomer et al. (2010), who discuss both item- and scale-level missingness but do not make distinctions between them; indeed, results of their analysis, which focuses on very high levels of scale-level missingness in a three-variable model, are extrapolated to item-level missingness. Thus, readers may be unclear about the impact of different methods of handling missing data at different levels of missingness. Although use of advanced methods with item-level missingness is certainly not incorrect, it may simply be unnecessary to conduct the complex procedures if simple methods perform just as well. Validation of relatively simpler procedures would enable researchers to spend less time mastering a complex technique that may not be necessary to their work (as well as prevent errors if that complex technique is ultimately performed incorrectly). Indeed, Wilkinson and the American Psychological Association Task Force on Statistical Inference (1999) averred that in most circumstances, the “minimally sufficient analysis” (p. 598) should be used in research. 
That is, although researchers should be aware of the most up-to-date methodologies, analysis methods should be chosen to handle the intended research question appropriately, and not to “impress [one’s] readers or to deflect criticism” (p. 598). Thus, there is a need to assess the impact of different methods of handling missing data at the item level.
In this article I address whether a computationally simple method of handling item-level missing data, available item analysis (AIA), functions equivalently to a traditional method (participant mean substitution) and an advanced method (multiple imputation) for item-level missingness. Specifically, I review theoretical bases for using AIA and examine the three methods in (a) an initial preliminary study of missing data replacement, (b) an example using real-world data, and (c) an example using simulation data.
Definitional Issues
Before delving into the topic of handling item-level missing data, clarification of terminology may be useful. First, I explain what is meant by “missingness.” Then, I review levels of missingness and major methods of handling missing data. This information is provided to orient the reader to these issues, and the interested reader may consult some other excellent and detailed summaries for further information, such as Schafer and Graham (2002).
Data can be missing in three ways: missing not at random (MNAR), missing at random (MAR), and missing completely at random (MCAR). The clearest of these definitions is MNAR: In the case of MNAR, some extraneous variable, which itself is not measured, is related to the variable of interest and influences its missingness. Poorly worded or confusing items that are intentionally omitted by participants are an example of MNAR at the item level. For example, if a researcher were conducting a study on college women’s relationship satisfaction, and a scale contained an item such as “I get along well with my boyfriend or husband,” women who are in same-gender relationships or who are not in relationships may omit this item systematically, potentially biasing the responses to the item. If data on the gender and existence of a relationship partner were not assessed in that project (say, on the demographics page), the data would be MNAR because there is no way to connect the reason for the missingness (gender and existence of partner) to the item responses. MNAR at an item level may often be best thought of as an issue in the design of research rather than a statistical issue to be dealt with post hoc, and researchers should carefully plan projects, evaluate scales at the item level, and pilot surveys with diverse testers to avoid it before a study begins.
Data are MAR if missingness is due to some other, observed variable. For example, in the preceding case, data would be MAR if partner gender and relationship status were assessed. It is possible to use the available data on these other variables to estimate the missing data through techniques such as expectation maximization or multiple imputation.
Finally, data are MCAR (a special case of MAR) if missingness is due to a factor completely unrelated to the missing data. An example might be if a participant simply misses an item while completing a survey. Under MAR and MCAR, researchers may assume that missing data are distributed such that a standard missing data technique might be applied (provided other reasonable conditions are met, such as some limit on tolerance for missing data; Schafer & Graham, 2002). Unfortunately, it is never possible to determine with complete certainty whether data are MAR/MCAR (though Little, 1988, developed a test that may help to support or not support MCAR). Researchers typically make an assumption that data might as well be treated as MCAR so long as there is not a clear bias in the missingness (such bias might be evident, for example, if a single item had an abnormally large number of missing values).
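The distinction among the three mechanisms can be made concrete with a small Python sketch. It is offered only as an illustration under stated assumptions and was not part of the analyses reported here; the function names `impose_mcar` and `impose_conditional` and the rate parameters are hypothetical. The point it shows is that MAR and MNAR can arise from the same process and differ only in whether the variable driving the missingness was recorded.

```python
import random


def impose_mcar(values, rate, rng):
    """MCAR: every value has the same chance of being blanked,
    regardless of any other variable (e.g., an item skipped by accident)."""
    return [None if rng.random() < rate else v for v in values]


def impose_conditional(values, driver, rate_if_true, rate_if_false, rng):
    """Blank values with a probability that depends on a driver variable.

    If the driver (e.g., "has no male partner") is recorded in the data set,
    the resulting missingness is MAR; if the driver was never measured,
    the same process yields MNAR.
    """
    result = []
    for v, d in zip(values, driver):
        rate = rate_if_true if d else rate_if_false
        result.append(None if rng.random() < rate else v)
    return result


rng = random.Random(2024)  # fixed seed so the sketch is reproducible
item = [5, 4, 6, 3, 5, 2, 4, 5]
# Hypothetical flag: the item does not apply to this participant
not_applicable = [False, True, False, False, True, False, False, False]

mcar_item = impose_mcar(item, rate=0.05, rng=rng)
driven_item = impose_conditional(item, not_applicable, 0.9, 0.01, rng)
```

In the relationship satisfaction example, `not_applicable` plays the role of partner gender and relationship status: if it is collected on the demographics page the missingness is MAR, and if it is not collected the missingness is MNAR.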
Levels of Missingness
The level at which missingness occurs is important to understanding the concept of missingness and considering strategies for handling it. The broadest level of missingness is exclusion from the study. For example, online studies exclude persons without Internet access, studies in the United States exclude persons in other countries, and studies conducted in English exclude people who do not speak, write, or read English. Like MNAR, missingness at this level is best addressed through design and interpretation of results, and is not a truly statistical issue.
Participants may also be missing data at the scale level (in the present article, scale is used to mean the level at which most analyses of questionnaire data take place, i.e., mean scores across multiple items all intended to measure some common construct. In many cases this would be “subscale”-level missingness, as in the case of a single scale with multiple subscales that are scored separately, such as domains of a Five Factor inventory). Participants may have begun participation but dropped out of the study before finishing and left one or more scales completely missing. This level of missingness may be handled with advanced methods such as multiple imputation that can take into account the available data and correlations among observed variables for all cases.
Finally, there is the level of missingness that most frequently confronts counseling psychology researchers: item-level missingness. At this level, participants have omitted one or more items from one or more multi-item scales in the study without completely missing any scales. It is this level of missingness with which many researchers are often forced to deal, and for which advanced methodologies are now being recommended.
Methods of Handling Missing Data
There are a number of methods of dealing with missing data. The oldest method is likely listwise deletion, or the deletion of participants with any missing data. Although this circumvents problems of imputation of missing data points and may be useful when there is an exceptionally tiny amount of missing data, it may often have a powerfully negative effect on sample size and be potentially biasing in samples with more than trivial missingness (Schafer & Graham, 2002). AIA, the method advocated in this article, is also known as pairwise deletion or pairwise inclusion. AIA is also an older method of handling missing data that involves using the available data for analysis and excluding missing data points only for analyses in which the missing data points would be directly involved. In most research using multiple-item scales, analysis takes place at the scale level, and thus AIA can be used to generate mean scores for scales using the available data without substituting or imputing values. In SPSS, through the “compute” command, the MEAN function is used to generate an AIA scale mean. Participant mean substitution involves substituting the mean of a single participant’s nonmissing items on a scale for missing item values on that particular scale or subscale. Practically, this is typically accomplished by generating the AIA scale mean then substituting this mean line-by-line for missing data on that scale. Mean substitution would produce output very similar to AIA for means (though rounding the substituted mean to fewer decimal places or to the integer would introduce some deviation), and variability (i.e., standard deviations) would be reduced. Multiple imputation involves the generation of multiple data sets with missing value imputations that factor in correlations among variables and random error, and the pooling of these results. This is typically accomplished in a program such as NORM (Schafer, 2000), the multiple imputation add-on module for SPSS, or the MI and MIANALYZE procedures in SAS. Many other methods of imputation exist but are used relatively less often and in any case are generally supplanted by the other methods in terms of either simplicity or lack of bias. Another method worth noting, full information maximum likelihood estimation, is a special method of handling missing data in structural equation modeling applications that, like AIA, does not involve imputation of missing data at all but rather uses the available data to carry out analyses.
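The three simpler methods described above can be sketched in a few lines of Python. This is an illustrative sketch only; the analyses in this article were run in SPSS, and the function names (`aia_scale_mean`, `mean_substitute`, `listwise_complete`) and the use of `None` as the missing value code are assumptions made for the example.

```python
from statistics import mean


def aia_scale_mean(items):
    """AIA scale score: the mean of whatever items the participant answered."""
    observed = [x for x in items if x is not None]
    return mean(observed) if observed else None


def mean_substitute(items, round_to_int=False):
    """Participant mean substitution: fill each missing item with the
    participant's own AIA mean on that scale (optionally rounded)."""
    m = aia_scale_mean(items)
    if m is None:
        return list(items)  # nothing observed, nothing to substitute
    fill = round(m) if round_to_int else m
    return [fill if x is None else x for x in items]


def listwise_complete(rows):
    """Listwise deletion: keep only participants with no missing items."""
    return [r for r in rows if all(x is not None for x in r)]


responses = [4, None, 5, 3, 5]            # one omitted item on a five-item scale
aia = aia_scale_mean(responses)            # mean of the four observed items
unrounded = mean(mean_substitute(responses))
rounded = mean(mean_substitute(responses, round_to_int=True))
```

With an unrounded substitute the scale mean equals the AIA mean exactly, so the two methods can differ only in item-level statistics such as variances and interitem correlations; rounding the substitute to an integer is what introduces a small deviation in the scale mean.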
AIA offers a number of benefits over mean substitution and multiple imputation as applied to item-level missingness. The primary advantage is that imputation may not be at all necessary for most analyses when dealing with item-level missing data on multi-item scales. Variables are typically entered into analyses at the scale level after all item responses have been averaged. Thus, a complete (at the item level) data set is unnecessary. Even analyses that are sometimes thought of as requiring complete item-level data, such as calculation of Cronbach’s alpha or structural equation modeling, do not necessarily require complete data (AIA calculation of Cronbach’s alpha will be discussed shortly, and analyses in the structural equation modeling family, including path analysis, latent variable modeling, and confirmatory factor analysis, may use full information maximum likelihood estimation and bypass the need for any imputation).
Another advantage is that AIA is a simple procedure that can be implemented easily and is not limited by technological savvy of researchers, availability of software, or suitability of the collected data to the analysis technique. Multiple imputation, though useful in many contexts, is complex to learn and more complex to fully understand, and requires learning how to navigate a program such as NORM (Schafer, 2000); even after the technique is mastered, data may still produce issues such as nonpositive definite matrices that complicate or preclude completion of the process.
Also notable is that several scales familiar to many counseling psychology research contexts exist for which the preferred scoring method is AIA. The Objectified Body Consciousness Scale (McKinley & Hyde, 1996) contains “not applicable” responses that are not included in mean score calculations, and missed items on the Lesbian, Gay, and Bisexual Knowledge and History Scale (Worthington, Dillon, & Becker-Schutte, 2005) are not counted in mean scale scores. Thus, some scale authors explicitly allow for the existence of nonresponse in their scales and recommend that AIA be used in creating a mean scale score.
The Present Studies
The present studies undertake an investigation of the differences between AIA, mean substitution, and multiple imputation on artificially constructed data sets simulating single measures (Studies 1 and 3) and on a real data set with multiple measures (Study 2). Participant mean substitution is included in the present analysis because it has been used frequently in the past, and evaluating it may be helpful in understanding the impact of this method in past research. Overall, I expected no meaningful difference to emerge among the methods. Throughout the results, I describe data to the third decimal place rather than the second to avoid confusing substantive differences with rounding.
Study 1
Study 1 used an artificial data set created specifically for this study as a preliminary investigation. The differences in values among the three methods of handling missing data were compared for mean differences, scale variability differences, differences in correlations among the items, and reliability coefficients. I hypothesized that no statistically or practically significant differences would emerge from comparisons and that the effect size of differences would be small.
Method
I constructed the artificial data set using a single run of a Monte Carlo simulation in Mplus (Muthén & Muthén, 2010). Responses to five variables were generated on a multivariate normal distribution then constrained to a 7-point scale (0-6). I constrained items to load onto a latent variable in the simulation to produce correlated item data, though the association was intentionally not strong to create an artificial data set that resembled a scale of, at best, average quality. Two hundred cases were generated. Table 1 provides summary data on the initial data set (before adding missing data) for reference.
Correlations Among the Original Data Set Items and Total Score, and Missing Value Items and Total Score.
Correlations above the diagonal represent correlations for the original data set (before generating missingness); correlations below the diagonal are the missing-value data. For the missing-value data, the number below the correlation is the n for that bivariate relation.
p < .01.
I generated missingness manually by creating a random number matrix of the same dimensions as the data matrix (5 × 200) with cell values between 1 and 100. If a cell value in the random matrix was equal to or less than 5, I replaced the corresponding cell in the original data set with a missing value code. In no instance was a participant allowed to have more than one missing value (thus, each participant was missing either one or no data points on the five-item scale). Ultimately, I replaced 51 values with missing values for as many participants out of a total of 1,000 data points, or 5.1% missingness. Though no systematic analysis of prevalence of missing data has been conducted, this level of missingness approximates a substantial level of item-level missingness; in prior work conducted by me, total item-level missingness has been under 0.2% (e.g., Parent & Moradi, 2010, 2011a, 2011b) in studies with designs similar to those employed by many counseling psychologists.
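The procedure just described can be approximated with the following Python sketch. It is a reconstruction under assumptions rather than the procedure actually used: the text does not state how the limit of one missing value per participant was enforced, so the sketch simply keeps the first flagged cell in each row. The function name `impose_item_missingness` and its parameters are hypothetical.

```python
import random


def impose_item_missingness(data, threshold=5, seed=0):
    """Flag cells whose random draw (1 to 100) is at or below the threshold,
    then blank at most one flagged cell per participant.

    Assumption: when several cells in a row are flagged, only the first
    is set to missing. The source does not specify this detail.
    """
    rng = random.Random(seed)
    result = []
    for row in data:
        draws = [rng.randint(1, 100) for _ in row]
        flagged = [j for j, d in enumerate(draws) if d <= threshold]
        new_row = list(row)
        if flagged:
            new_row[flagged[0]] = None
        result.append(new_row)
    return result


# 200 participants by 5 items, filled with a placeholder response of 3
complete = [[3] * 5 for _ in range(200)]
with_missing = impose_item_missingness(complete, threshold=5, seed=1)
n_missing = sum(v is None for row in with_missing for v in row)
```

Because each participant can lose at most one item, the realized proportion of missing cells can be no higher than 20% on a five-item scale and will usually fall near the threshold percentage.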
The missing-value data set was identical to the AIA data set. Means for the AIA data set (generated as the mean for the number of nonmissing items: If a case had five data points, the mean is the sum of responses divided by five; if four, then divided by four) and standard deviations are presented in Table 1. I generated mean substitution values by calculating the AIA means for each participant, rounding these values to the nearest integer, and substituting them for the missing data points. Means and standard deviations for mean substitution are presented in Table 2. I conducted multiple imputation in SPSS using 1,000 iterations to generate five imputed data sets. In the multiple imputation data set, values were rounded to the nearest integer and constrained to be within the original 0-6 range; means and standard deviations are presented in Table 2. Reported results are for the pooled multiple imputation data set, with the exception of Cronbach’s alpha.
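For readers unfamiliar with pooling, the following Python sketch shows the standard combining rules for multiply imputed estimates (Rubin's rules), together with the rounding and range constraint applied to imputed values. It is a sketch under assumptions: the pooling reported here was carried out by SPSS, and the function names `pool_rubin` and `round_and_clip`, as well as the example numbers, are hypothetical.

```python
import math
from statistics import mean, variance


def round_and_clip(value, low=0, high=6):
    """Round an imputed value to an integer and keep it on the response scale.
    Note that Python's round() sends exact .5 cases to the nearest even integer."""
    return min(high, max(low, round(value)))


def pool_rubin(estimates, std_errors):
    """Combine one estimate per imputed data set using Rubin's rules.

    Pooled estimate = mean of the estimates.
    Total variance  = within-imputation variance
                      + (1 + 1/m) * between-imputation variance.
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean([se ** 2 for se in std_errors])
    between = variance(estimates)  # sample variance, divisor m - 1
    total = within + (1 + 1 / m) * between
    return pooled, math.sqrt(total)


# Hypothetical item means and standard errors from five imputed data sets
est = [3.10, 3.12, 3.08, 3.11, 3.09]
ses = [0.10, 0.10, 0.11, 0.10, 0.10]
pooled_mean, pooled_se = pool_rubin(est, ses)
```

The pooled standard error is never smaller than the square root of the average within-imputation variance, which is how multiple imputation carries the uncertainty about the missing values into the final estimate.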
Correlations Among Mean Substitution and Multiple Imputation Item and Total Scores.
Data for mean substitution are above the diagonal and pooled data for multiple imputation are below the diagonal.
p < .01.
In SPSS, Cronbach’s alpha is calculated with listwise deletion to handle missing data. However, it is possible to generate an AIA Cronbach’s alpha using syntax (see the appendix). The resultant alpha is reported for the AIA data. Similarly, calculation of Cronbach’s alpha in the multiple imputation data set also presented a problem; Cronbach’s alpha does not have a standard error in SPSS multiple imputation and so a pooled estimate could not be generated. Thus, Cronbach’s alpha values for the five imputed data sets are reported. The 95% confidence intervals are also reported for all alphas (using syntax from Iacobucci & Duhachek, 2003).
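One way to obtain an alpha that uses all available item responses is to compute it from a covariance matrix built with pairwise deletion. The Python sketch below illustrates that idea; it is an assumption about how such an alpha can be computed, not a translation of the SPSS syntax in the appendix, and the function names `pairwise_cov` and `cronbach_alpha_pairwise` are hypothetical.

```python
def pairwise_cov(x, y):
    """Sample covariance using only cases observed on both items."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in pairs) / (n - 1)


def cronbach_alpha_pairwise(columns):
    """Cronbach's alpha from a pairwise-deletion covariance matrix.

    alpha = k / (k - 1) * (1 - sum of item variances / sum of all entries),
    where the covariance matrix is k by k and `columns` holds one list
    of responses per item.
    """
    k = len(columns)
    cov = [[pairwise_cov(ci, cj) for cj in columns] for ci in columns]
    item_variances = sum(cov[i][i] for i in range(k))
    total = sum(sum(row) for row in cov)
    return k / (k - 1) * (1 - item_variances / total)


# Three items, five participants, two omitted responses
items = [
    [1, 2, 3, 4, None],
    [1, 2, 3, 4, 5],
    [1, 2, 3, None, 5],
]
alpha = cronbach_alpha_pairwise(items)
```

Under listwise deletion, only the first three participants in this example would contribute; the pairwise version retains every observed response. A pairwise covariance matrix is not guaranteed to be positive semidefinite, so with heavy missingness the resulting alpha should be inspected for implausible values.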
Results
First, I compared differences in mean values for items and total scores across the three data sets. Absolute values of the differences among the scores are reported in Table 3. The largest difference, between the AIA and mean substitution methods on Item 3, was 0.034.
Mean Score Comparisons Across Methods
AIA = available item analysis; Mean = mean substitution; MI = multiple imputation.
Next, differences among variabilities were assessed. Emergent differences were very small, with a mean absolute difference in standard deviations of .035 (SD = .122). The largest observed absolute difference in standard deviations was .085, for the difference between mean substitution and multiple imputation on Item 1.
Next, differences in the correlation matrices were assessed. Absolute values were calculated for the differences between each bivariate pair of data sets (presented in Tables 1 and 2). The largest difference among correlations was .044 on Items 2 and 3 when comparing mean substitution and multiple imputation.
Finally, Cronbach’s alphas were calculated for the three data sets. Alpha for items on the AIA data set (using the aforementioned modification) was .643, 95% CI [.558, .716]; alpha for items on the mean substitution data set was .675, 95% CI [.598, .741]; and alphas for the five multiple imputation data sets were .630, 95% CI [.542, .705]; .634, 95% CI [.547, .708]; .649, 95% CI [.566, .720]; .653, 95% CI [.571, .724]; and .627, 95% CI [.539, .703].
Discussion
In sum, across the artificial data sets, there was no major difference between the methods of missing data replacement in terms of mean values, variability, correlations, or Cronbach’s alpha internal reliability coefficients. These results support the equivalence of missing data methods across these artificial data sets in this preliminary investigation; to build on this finding, another analysis using real data was undertaken.
Study 2
The purpose of Study 2 was to further investigate the effects of the three methods of handling missing data using an existing data collection set. This investigation provides an assessment of the methods using data more akin to what researchers are actually accustomed to using. Once again, I expected that no meaningful differences would emerge.
Method
Data were taken from a larger data set previously collected by the present author. The data are explained in greater detail in Parent and Moradi (2011a). The included data consisted of six scales: the Sociocultural Attitudes Toward Appearance Questionnaire (SATAQ; Heinberg, Thompson, & Stormer, 1995), the Objectified Body Consciousness Scale Surveillance and Shame subscales (OBCS-Surv and OBCS-Shame; McKinley & Hyde, 1996), the Drive for Muscularity Scale (DMS; McCreary & Sasse, 2000), and the Outcome Expectations for Anabolic Androgenic Steroid Use and Intentions for Anabolic Androgenic Steroid Use scales (OE-AAS and I-AAS; Parent & Moradi, 2011a), and 276 cases. There were 33 missing data points distributed over 23 cases. (There are slightly more missing data in this data set than in the published article because missing data on the OBCS-Surv and OBCS-Shame were treated using AIA in the original study, as per their instructions. In the present study they were treated as missing to allow more missing data points with which to assess the impact of those methods. In addition, there were 6 extra participants who were identified as multivariate outliers in the original study whose data were retained for the present analysis, resulting in fit statistics and scale scores differing slightly from the original model.) All of the scales had at least 1 missing data point. AIA, mean substitution, and multiple imputation were all conducted in the same manner as in Study 1.
Analyses are similar to the previous study. However, a replication of the path analysis from the original research is presented to examine the impact of the methods of handling missing data on analyses that might be undertaken by researchers. I present the same statistics as in Study 1 (means, standard deviations, mean comparisons, variabilities, correlation matrices and correlation matrix comparisons, and Cronbach’s alpha reliability coefficients). The path analysis undertaken in Parent and Moradi (2011a) is replicated using each of the three methods of handling missing data, and I present path coefficients and fit indices. Again, for all comparisons I expected no meaningful differences to emerge.
Results and Discussion
First, differences in mean values for items and total scores were compared across the three data sets. Mean scores and absolute values of the differences among the scores are reported in Table 4. The largest difference, between AIA and mean substitution on DMS, was 0.007.
Mean Score Comparisons Across Methods.
AIA = available item analysis; Mean = mean substitution; MI = multiple imputation.
Next, differences among variabilities were assessed. Again, emergent differences were very small, with a mean absolute difference in standard deviations of .003 (SD = .007). The largest observed absolute difference in standard deviations was .020 for the difference between mean substitution and multiple imputation on OBCS-Shame.
Next, differences in the correlation matrices were assessed. Absolute values were calculated for the differences between each bivariate pair of data sets, and the matrices are presented in Table 5. The largest difference among correlations was .002, which occurred for several of the comparisons.
Cronbach’s Alpha for Scales by Missing Data Method.
AIA = available item analysis; Mean = mean substitution; MI = multiple imputation.
Next, Cronbach’s alphas were calculated for the three data sets. Alphas for the items are presented in Table 5 (confidence intervals are omitted because the homogeneity of the results minimizes their utility).
Finally, the original path analyses were replicated. Figure 1 presents the path diagram and Table 6 presents path coefficients and fit indices for each of the three methods of handling missing data.

Final model in Parent & Moradi (2011a).
Comparison of Standardized Path Coefficients (With Standard Errors) and Fit Indices Across Methods.
Note. Values in the Paths columns are the standardized path coefficients and standard errors (in parentheses).
AIA = available item analysis; Mean = mean substitution; MI = multiple imputation; CFI = comparative fit index; RMSEA = root mean square error of approximation (with 95% confidence interval); SRMR = standardized root mean square residual.
Study 3
The purpose of Study 3 was to investigate the methods of handling missing data in a series of simulated data sets that varied by specific parameters. This investigation was intended to provide additional detail on the impact of the three methods of handling missing data as a function of number of items, sample size, association among items, and level of missingness.
Method
Monte Carlo simulations were conducted in R (R Development Core Team, 2008). Data for 1,000 Monte Carlo runs for each condition were generated on multivariate normal distributions with a mean of 4.00 and a standard deviation of 1.25. The data were recategorized to create a 7-point integer response scale (i.e., 0 = values below 1, 1 = values from 1 to 2, etc.). In each run, data sets varied by number of items, sample size, association among items, and level of missingness. Number of items was set to 3, 5, or 10, representing three common scale lengths (3 items being suggested as the minimum to conduct structural equation modeling using latent variable modeling, though such short scales are probably generally unreliable and suboptimal for use as indicators in structural equation modeling; Tabachnick & Fidell, 2007; Worthington & Whittaker, 2006). Sample size was set to either 50 (representing a small-scale study) or 200 (representing the lower conventional bound for structural equation modeling studies; Tabachnick & Fidell, 2007). Associations among items were set to approximately .30 (representing a relatively weak scale) and .50 (representing a standard scale). Level of missingness was set to 1%, 5%, or 10%. The levels chosen for investigation were intended to represent data that might commonly occur in counseling psychology research; a literally infinite number of other levels and combinations of each of the variables is possible, and future work may continue to explore such variables.
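A single simulated data set of the kind described above could be generated as in the following Python sketch. The original simulations were run in R and their code is not reproduced here, so several details are assumptions made for illustration: items are given a common correlation through a single shared factor, continuous scores are floored and then truncated to the 0-6 range, and the design is treated as fully crossed. The names `simulate_items` and `conditions` are hypothetical.

```python
import itertools
import math
import random


def simulate_items(n_cases, n_items, r, seed=0, mean=4.0, sd=1.25):
    """Generate equicorrelated normal item scores and recode them to 0-6.

    Each standardized score is sqrt(r) * common + sqrt(1 - r) * unique,
    which gives every pair of items a correlation of r before recoding.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n_cases):
        common = rng.gauss(0, 1)
        row = []
        for _ in range(n_items):
            z = math.sqrt(r) * common + math.sqrt(1 - r) * rng.gauss(0, 1)
            score = math.floor(mean + sd * z)   # assumption: floor to an integer
            row.append(min(6, max(0, score)))   # assumption: truncate to 0-6
        rows.append(row)
    return rows


# Assumption: the four factors are fully crossed (3 x 2 x 2 x 3 cells)
conditions = list(itertools.product(
    (3, 5, 10),          # number of items
    (50, 200),           # sample size
    (0.30, 0.50),        # association among items
    (0.01, 0.05, 0.10),  # level of missingness
))

data = simulate_items(n_cases=200, n_items=5, r=0.30, seed=1)
```

Recoding to integers and truncating at the ends of the scale attenuates the correlations somewhat, which is consistent with the associations being described as approximately .30 and .50.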
Missingness was imposed quasi-randomly within the R runs, with the limitation that no row could be completely missing (this would be akin to a participant missing the scale entirely). Multiple imputation was conducted within R using the AMELIA program (Honaker, King, & Blackwell, 2011) and the multiple imputation runs were transferred to SPSS version 18 (SPSS Inc., 2010) for pooling as in Studies 1 and 2.
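The constraint that no row be completely missing can be sketched as follows. This is a minimal Python illustration under assumptions, not the R code used in the study: each cell is blanked independently at the target rate, and if an entire row is blanked, one randomly chosen response is restored. The function name `impose_missing` is hypothetical.

```python
import random


def impose_missing(rows, rate, seed=0):
    """Blank cells at the given rate while keeping at least one
    observed item for every participant."""
    rng = random.Random(seed)
    result = []
    for row in rows:
        new_row = [None if rng.random() < rate else v for v in row]
        if all(v is None for v in new_row):
            # Assumption: restore one randomly chosen response so the
            # participant is not missing the scale entirely.
            keep = rng.randrange(len(row))
            new_row[keep] = row[keep]
        result.append(new_row)
    return result


scale = [[4, 5, 3], [2, 2, 6], [5, 4, 4], [3, 3, 1]]
with_gaps = impose_missing(scale, rate=0.10, seed=7)
```

The restoration step matters most for three-item scales at 10% missingness, where a fully blank row is most likely; it also makes the realized missingness rate slightly lower than the nominal rate.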
Results are presented in Tables 7, 8, and 9. I present the means, standard errors, and Cronbach’s alphas with 95% confidence intervals for the data sets.
Monte Carlo Results for Three-Item Simulations.
AIA = available item analysis; Mean = mean substitution; MI = multiple imputation. Correlations for AIA data use pairwise deletion. Alphas for MI are summarized across the five imputation runs. Values in M and SE rows represent item values.
Monte Carlo Results for Five-Item Simulations.
AIA = available item analysis; Mean = mean substitution; MI = multiple imputation. Correlations for AIA data use pairwise deletion. Alphas for MI are summarized across the five imputation runs. Values in M and SE rows represent item values.
Monte Carlo Results for Ten-Item Simulations
AIA = available item analysis; Mean = mean substitution; MI = multiple imputation. Correlations for AIA data use pairwise deletion. Alphas for MI are summarized across the five imputation runs. Values in M and SE rows represent item values.
Results and Discussion
Results of the simulation studies in general support the equivalence of AIA and multiple imputation to handle missing data, even in suboptimal conditions (i.e., low sample size, low associations among items, and small number of items) and in the “worst” condition (i.e., n = 50, associations = weak, number of items = 3). Differences that did arise between AIA and multiple imputation were generally negligible and did not appear to take on a consistent pattern. Mean substitution appeared to fare adequately at low levels of missingness but produced increasingly inflated correlations among items and alphas as missingness increased to 10% and when there were other negative circumstances (a lower n, lower interitem correlations, or fewer items). Ultimately, though, within the “best” data condition (i.e., n = 200, associations = moderate, and number of items = 10), even 10% missing data did not produce notable differences among methods (means, standard errors, and alphas were similar across methods).
General Discussion
The present article was an investigation of the suitability of handling missing data by using AIA rather than conducting imputation methods. Given that research conducted by counseling psychologists typically involves scale- or subscale-level, not item-level, analysis, the present findings indicate that imputation of item-level missing data for most situations encountered by counseling psychologists may not be necessary. Indeed, in investigating the differences between methods of handling missing data, few notable differences were observed. In Study 1, using simulated data, there were no meaningful differences in mean values of items or total scores between methods, no meaningful differences in variances, and negligible differences in the correlation matrices. There were slight differences in Cronbach’s alphas, though none of these differences were significant as evident from examination of the confidence intervals. These findings were replicated in Study 2 using real data; again, no significant differences emerged across any of the analyses. In Study 3, it became apparent that mean substitution produces inflated correlations when there is a substantive amount of missing data along with an additional challenge in the data (e.g., few items, weak interitem correlations, or low sample size). In sum, AIA and multiple imputation demonstrated approximately equivalent performance across the studies and across data conditions. Mean substitution does appear to degrade as a method of handling missing data when missingness is high and there are other nonfavorable conditions in the data.
There have been many recent calls for use of advanced methodology to handle missing data. Much of this literature is based on longitudinal research paradigms and time-wave-level missingness at the scale, subscale, and construct level, but is extrapolated to item-level missingness. Such methodology may not be necessary to handle the type and level of missingness that counseling psychologists most frequently encounter, and the results of the present article suggest that calls for use of advanced methodology in handling low-level, item-level missingness may be overgeneralized. At the very least, researchers may feel more confident in the results of older research that used participant mean substitution, despite some suggestions that such procedures are unacceptable (Schlomer et al., 2010).
Researchers may wish to consider whether any imputation is necessary for their analyses. As mentioned, because most research does not involve item-level analyses, using available data (i.e., AIA) to generate mean scores may be a fine solution in most circumstances and may circumvent problems (bias, complexity) associated with both simple methods of imputation (e.g., mean substitution) and complex methods (e.g., multiple imputation). Analyses that do involve item-level data may also use data sets with missingness: exploratory factor analysis can be run with missing data, and structural equation modeling may be run using full information maximum likelihood estimation, provided that the prevalence of missingness is low, the data are not MNAR, and the pattern of missingness does not disrupt the covariance matrix. The present article also provides syntax for calculating Cronbach’s alpha in SPSS using AIA.
The results of the present article are specific to, and do not extend beyond, item-level missingness at the prevalence common in counseling psychology research. The next level of missingness, scale-level missingness, may require advanced procedures such as multiple imputation. However, although no detailed meta-analysis has been undertaken to examine the extent of scale-level missingness in research, in my experience the extent of missingness at the scale level has been so low that the benefit of running a scale-level imputation procedure would be minimal (perhaps a gain of one or two dozen participants to a sample of several hundred; Parent & Moradi, 2010, 2011a, 2011b). Finally, although the present analyses do not extend to longitudinal analysis, it is important to note that innovations in analysis allow for many longitudinal analyses to be conducted with time-wave level missing data (Duncan, Duncan, & Strycker, 2006) using available data, again circumventing any need to conduct any imputation.
Further research may address some of the implications of this article and advance our understanding of handling missing data. For example, additional Monte Carlo analyses could be conducted to assess the extent of differences between methods of handling missing data as data sets vary in sample size, number of items, correlations among items, and level of missingness (the possible combinations of these variables are effectively infinite, and the present analyses examined only one particular subset of them). Monte Carlo simulations could also be set up with artificial independent and dependent variables, which themselves have differing numbers of items, intercorrelations, and levels of missingness, analyzed using a common method such as regression, to further explore the methods. Archival data sets could also be examined to assess the impact of multiple ways of handling item-level missing data in real-world scenarios.
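A Monte Carlo comparison of the kind described above could be sketched as follows. This is a minimal illustration in Python, not the simulation design used in the present studies: the one-factor data model, the function names (`pearson`, `mean_interitem_r`, `simulate_once`), and the parameter values (200 participants, 5 items, inter-item correlation of .30, 20% missingness completely at random, 100 replications) are all assumptions chosen for illustration.

```python
import random
from itertools import combinations

def pearson(x, y):
    """Pearson r using only cases observed on both variables (pairwise deletion)."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    syy = sum((b - my) ** 2 for _, b in pairs)
    return sxy / (sxx * syy) ** 0.5

def mean_interitem_r(rows):
    """Average correlation over all item pairs; rows are participants."""
    cols = list(zip(*rows))
    rs = [pearson(a, b) for a, b in combinations(cols, 2)]
    return sum(rs) / len(rs)

def simulate_once(n, k, r, p_missing, rng):
    """One replication: returns (available-data r, participant-mean-substitution r)."""
    rows = []
    while len(rows) < n:
        f = rng.gauss(0, 1)  # common factor shared by all items
        row = [r ** 0.5 * f + (1 - r) ** 0.5 * rng.gauss(0, 1) for _ in range(k)]
        row = [None if rng.random() < p_missing else v for v in row]
        if any(v is not None for v in row):  # redraw participants with no data at all
            rows.append(row)
    filled = []
    for row in rows:
        seen = [v for v in row if v is not None]
        m = sum(seen) / len(seen)  # the participant's own mean on observed items
        filled.append([m if v is None else v for v in row])
    return mean_interitem_r(rows), mean_interitem_r(filled)

rng = random.Random(2024)  # fixed seed so the sketch is repeatable
results = [simulate_once(n=200, k=5, r=0.3, p_missing=0.2, rng=rng) for _ in range(100)]
aia_avg = sum(a for a, _ in results) / len(results)
pms_avg = sum(b for _, b in results) / len(results)
print(f"available data: {aia_avg:.3f}  participant mean substitution: {pms_avg:.3f}")
```

Varying `n`, `k`, `r`, and `p_missing` across a grid would give the kind of condition-by-condition comparison suggested above.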
Recommendations
Overall, the present study suggests a number of considerations for counseling psychologists. First, a number of issues related to missing data can be handled before the study even begins. Before collecting data it would behoove researchers to carefully assess scales at the item level and pilot these items with diverse assistants to check for items that may be unanswerable to particular persons (e.g., relationship-related items that contain heterosexist assumptions) to avoid issues of MNAR before they arise. Given the increasing diversity of institutions of higher education, it should not be difficult to recruit a small number of piloters who vary in gender, gender identity/expression, race/ethnicity, sexual orientation, ability status, age, relationship status, or other potentially confounding individual difference characteristics that may affect interpretation of or responses to items. The feedback of these testers can be used to inform scale selection or careful adjustment of items to be inclusive and understandable. As well, researchers may time these piloters to ensure that their survey protocol is not unreasonably long.
Second, after data are collected, researchers may examine their missing data to choose an appropriate level of overall tolerance for missing data (say, 20% missingness on items on any given subscale per participant) and then evaluate whether the analysis really requires imputing missing values or whether AIA may be used instead. Importantly, this decision is not to be made unmindfully. For instance, imagine a researcher conducting a study using three scales; one has 4 items and the other two have 10 items each. Were the researcher to set an arbitrary tolerance of 20%, any participant missing even one response on the first scale (one of four items, or 25%) would be excluded, whereas a participant could miss up to two responses on either 10-item scale and still be retained. This lopsided deletion could bias the results.
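The arithmetic behind the three-scale example can be checked with a short Python sketch. The function names (`missing_fraction`, `within_tolerance`) and the response values are hypothetical and serve only to illustrate how a single fixed tolerance treats short and long scales differently.

```python
def missing_fraction(responses):
    """Share of one participant's items on one scale that are missing (None)."""
    return sum(r is None for r in responses) / len(responses)

def within_tolerance(responses, tolerance=0.20):
    """True if the participant's missingness on this scale is at or below the tolerance."""
    return missing_fraction(responses) <= tolerance

# Hypothetical participant who skipped exactly one item on each scale.
short_scale = [3, None, 4, 2]                    # 4 items, 1 missing = 25%
long_scale = [3, 4, None, 2, 5, 4, 3, 2, 4, 5]   # 10 items, 1 missing = 10%

print(within_tolerance(short_scale))  # False: excluded on the 4-item scale
print(within_tolerance(long_scale))   # True: retained on the 10-item scale
```

A researcher might instead state the tolerance as a minimum number of answered items per scale, chosen scale by scale.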
Third, researchers should clearly and concisely detail the level of missing data in their studies. These data must be specific; Schlomer et al. (2010) recommended that a “statement giving the range of missing data [be provided]. For example, one could say, ‘Missing data ranged from a low of 4% for Attachment Anxiety to a high of 12% for Depression’” (p. 9). But such a statement is unclear: what level of missingness is being described? Does the writer mean that participants missed 4% to 12% of data (as in, e.g., Attachment Anxiety was a 25-item scale and participants missed at most one item)? Or does it mean that of all data points in the n × k matrix for Attachment Anxiety, 4% were missing (which would mean that the percentage is interpretable only by referring to sample size and number of items for each scale, and also omits the maximum amount of missing data by participant)? Instead, I recommend that authors (a) state their tolerance level for missing data by scale or subscale (e.g., “We calculated means for all subscales on which participants gave at least 75% complete data”) and then (b) report the individual missingness rates by scale per data point (i.e., the number of missing values out of all data points on that scale for all participants) and the maximum by participant (e.g., “For Attachment Anxiety, a total of 4 missing data points out of 100 were observed, with no participant missing more than a single data point”). Researchers must also confirm that, before conducting AIA, they manually inspected missing data for obvious patterns, such as abnormally high missing rates for only one or two items. This can be accomplished easily by requesting frequencies output for the items, checking the observed (i.e., not missing) data points for each scale, and ensuring that there are no abnormal spikes in missingness. Although Schlomer et al. (2010) recommended missing data patterns be presented in a table, we typically take researchers at their word about other substantial issues (e.g., meeting normality or homogeneity of variance or linearity assumptions) and so the above seems sufficient.
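The quantities recommended for reporting (missing data points out of all data points on a scale, the maximum missed by any one participant, and per-item missing counts for spotting spikes) can be tallied with a few lines of code. The following Python sketch is illustrative only; the function name `missingness_report` and the small data set are hypothetical.

```python
def missingness_report(rows):
    """rows: one list of item responses per participant for a single scale; None = missing."""
    n_items = len(rows[0])
    total_points = len(rows) * n_items
    missing_points = sum(r is None for row in rows for r in row)
    return {
        "missing_points": missing_points,
        "total_points": total_points,
        "rate": missing_points / total_points,
        # Largest number of items skipped by any single participant.
        "max_by_participant": max(sum(r is None for r in row) for row in rows),
        # Missing count per item, to check for spikes on one or two items.
        "missing_by_item": [sum(row[j] is None for row in rows) for j in range(n_items)],
    }

# Hypothetical 4-item scale answered by 5 participants.
scale = [
    [4, 3, None, 2],
    [5, 5, 4, 4],
    [2, None, 3, 3],
    [3, 3, 3, 4],
    [1, 2, 2, 2],
]
report = missingness_report(scale)
print(report["missing_points"], "of", report["total_points"])  # 2 of 20
print(report["max_by_participant"])                            # 1
print(report["missing_by_item"])                               # [0, 1, 1, 0]
```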
Fourth, when researchers encounter item-level missing data they may choose to consider using AIA rather than using participant mean substitution or multiple imputation, presuming that the missing data can reasonably be assumed not to be MNAR, are at normal levels (e.g., below 10% of all data on each scale, and ideally much lower), and there are no other major complicating concerns (e.g., low sample size, poor internal reliability of scales, scales with fewer than five items). In most circumstances, research does not use item-level responses as variables but rather scale mean scores. For example, most regressions or group difference testing analyses are conducted with scale scores. Thus, AIA means can be generated (e.g., by using the MEAN function in the “compute” command of SPSS) and those are the variables that would be entered into an analysis. Many other analyses take place at the item level but do not require complete data; factor analysis can be run with missing data, and structural equation modeling using item indicators does not need complete data if full information maximum likelihood estimation is used. Cronbach’s alpha is calculated by default using listwise deletion in SPSS; however, it is possible to compute an AIA Cronbach’s alpha using a covariance matrix obtained using pairwise deletion (see the appendix). Thus, researchers should think carefully about whether they actually need a complete data set at the item level to carry out their analyses; if they do not, they may simply use AIA.
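For readers working outside SPSS, an AIA scale mean can be sketched in Python as below. The function name `aia_mean` and the `min_valid` argument are hypothetical; `min_valid` plays a role loosely similar to the minimum-valid-values suffix that SPSS allows on its MEAN function, and the threshold shown is an arbitrary illustration rather than a recommendation.

```python
def aia_mean(responses, min_valid=1):
    """Mean of the items a participant actually answered (None = missing).

    Returns None when fewer than `min_valid` items were answered, so the
    scale score itself is treated as missing for that participant.
    """
    observed = [r for r in responses if r is not None]
    if not observed or len(observed) < min_valid:
        return None
    return sum(observed) / len(observed)

# Hypothetical 4-item scale; require at least 3 answered items.
print(aia_mean([4, None, 2, 3], min_valid=3))     # 3.0
print(aia_mean([4, None, None, 3], min_valid=3))  # None
```

The resulting scale means, not the item responses, would then be entered into a regression or group comparison.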
Fifth, researchers may feel more confident in interpreting studies that have used older methods of handling missing data, such as participant mean substitution, at low-level, item-level missingness, at least in situations where researchers have a sizable sample and reliable scales.
Appendix
* This file contains syntax capable of generating a Cronbach’s alpha internal reliability coefficient for data files with missing data. Presently the syntax is written such that alpha is calculated for one “measure” at a time (i.e., the intended unit of analysis, whether it be scale or subscale).
* Differences may arise between versions of SPSS; the present syntax was verified to work only in SPSS v.18.
* #####BEGIN#####
* This syntax calls up a correlation matrix with covariances also displayed (i.e., the CORRELATIONS command), handling data with pairwise deletion. The raw generated file contains extra information (e.g., Correlations, Means, SDs, Ns). These have to be deleted to create a matrix composed of only covariances. The OMS commands accomplish this. The syntax produces the alpha value and c-bar and v-bar for verification purposes.
* THINGS TO CHANGE:.
* -Replace "var1…var10" with a list of the variables for which you wish to calculate an alpha.
OMS
/select tables
/if commands = ['Correlations'] SUBTYPES = ['Correlations']
/destination format = sav OUTFILE = 'alpha1.sav' .
CORRELATIONS
/VARIABLES= var1 var2 var3 var4 var5 var6 var7 var8 var9 var10
/STATISTICS XPROD
/MISSING=PAIRWISE .
OMSEND .
GET FILE='alpha1.sav'.
SELECT IF (Var2='Covariance').
SAVE OUTFILE='alpha2.sav'
/DROP = Command_ Subtype_ Label_ Var1 Var2 .
GET FILE='alpha2.sav'.
MATRIX .
GET X /FILE=* /VARIABLES ALL.
COMPUTE k = nrow(x) .
COMPUTE SUMVAR=TRACE(X).
COMPUTE SUMCOV=(MSUM(X)-SUMVAR)/2.
COMPUTE VBAR = SUMVAR/k .
COMPUTE CBAR = SUMCOV/((k*k-k)/2) .
PRINT VBAR.
PRINT CBAR.
COMPUTE AIAALPH = (k*CBAR)/(VBAR+(k-1)*CBAR) .
PRINT AIAalph .
END MATRIX .
Note: If copying and pasting the syntax directly from this manuscript as a PDF, there may be formatting issues that need to be resolved, such as accidental copying of header or footer text, or retention of formatting from the article that is not readable in SPSS syntax (e.g., double and single quotation marks may be pasted as curly quotes, which will need to be deleted and retyped).
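The MATRIX block above computes alpha as k × c-bar / (v-bar + (k − 1) × c-bar), where v-bar is the mean item variance and c-bar is the mean inter-item covariance. As a cross-check outside SPSS, the same calculation can be sketched in Python. This is an illustrative translation under stated assumptions, not part of the verified syntax: the function names (`pairwise_cov`, `aia_alpha`) and the example data are hypothetical, and each covariance is assumed to be computed only from participants with valid responses on both items, in keeping with pairwise deletion.

```python
from itertools import combinations

def pairwise_cov(x, y):
    """Sample covariance using only cases observed on both items (None = missing)."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    return sum((a - mx) * (b - my) for a, b in pairs) / (n - 1)

def aia_alpha(items):
    """Cronbach's alpha from pairwise variances and covariances.

    items: one list of responses per item (columns), all of equal length.
    """
    k = len(items)
    # v-bar: mean item variance (an item's covariance with itself).
    vbar = sum(pairwise_cov(col, col) for col in items) / k
    # c-bar: mean covariance over the k(k-1)/2 distinct item pairs.
    cbar = sum(pairwise_cov(a, b) for a, b in combinations(items, 2)) / (k * (k - 1) / 2)
    return (k * cbar) / (vbar + (k - 1) * cbar)

# Hypothetical 3-item scale, 5 participants, one missing response.
items = [
    [1, 2, 3, 4, 5],
    [2, 2, None, 4, 5],
    [1, 3, 3, 5, 4],
]
print(round(aia_alpha(items), 3))
```

With complete data, this formula gives the same value as the usual variance-based formula for Cronbach’s alpha.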
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
