Impact of Outliers Arising From Unintended and Unknowingly Included Subpopulations on the Decisions About the Number of Factors in Exploratory Factor Analysis

Abstract

There is a lack of research on the effects of outliers on the decisions about the number of factors to retain in an exploratory factor analysis, especially for outliers arising from unintended and unknowingly included subpopulations. The purpose of the present research was to investigate how outliers from an unintended and unknowingly included subpopulation affected the decisions about the number of factors to retain using four commonly used methods separately. The results showed that all the decision methods could provide biased results and the number of factors could be inflated, deflated, or remain the same depending on the decision methods used and outlier conditions. The findings also revealed that symmetric outliers did not affect the three principal component analysis–based methods but affected chi-square (ML) sequential tests. Finally, sample size did not play a role in the effect of outliers.

Keywords

exploratory factor analysis, number of factors, outliers

Exploratory factor analysis (EFA) is a widely used statistical technique in the psychosocial, behavioral, and health sciences. However, the impact of outliers on decisions about the number of factors to retain is largely undocumented in the psychometric and methodological literature. Recently, Liu, Zumbo, and Wu (in press) demonstrated the impact of outliers on the decision about the number of factors. Their study focused on outliers that are errors in the data, that is, Liu and Zumbo’s (2007) first category of outlier sources. Liu and colleagues found that this type of outlier inflated, deflated, or had no effect on the number of factors retained, depending on the extent of outlier contamination and on which decision method (e.g., parallel analysis, Kaiser–Guttman’s eigenvalues-greater-than-one, minimum average partial, or sequential chi-square tests) was used.

The purpose of the present article is to continue this line of research and investigate Liu and Zumbo’s (2007) second and third categories of outliers using a probabilistic mixture of distributions. With this purpose in mind, we first discuss the connection between the various sources of outliers and the models used for simulating them in psychometric studies. Next, two studies are reported. The first study demonstrates the impact of these types of outliers on the decision about the number of factors to retain. The second study is a follow-up to the first study, including a focused simulation study of a small correlation matrix and a report on the skewness and kurtosis of variables that were simulated using the same simulation design as in Study 1. The second study provides insight into how the outliers altered the properties of the correlation matrix. Readers interested in a review of the literature as well as a discussion of the need to study outliers in terms of deciding on the number of factors to retain should see Liu et al. (in press).

Various Sources of Outliers and the Models Used in Simulation Studies

As Zumbo and Zimmerman (1993) state, computer simulation (including Monte Carlo simulation) is an empirical method of experimental mathematics that is loosely defined as the mimicking of the rules of a model (in our case, a psychological or psychometric phenomenon) via random processes. The key concept in this definition is the correspondence of the psychometric or psychological process to how it is being mimicked in the simulation. For example, in the case of the study of outliers, it is important that the simulation method matches the source of outliers being considered. Below, we briefly review a taxonomy of sources of outliers and then address the simulation methods that can be used to mimic these outliers.

Sources of Outlier Contamination

In terms of psychometric analysis, Liu and Zumbo (2007) described three categories of possible sources of outliers in item responses, that is, in a univariate distribution of item responses. As noted above, the first category usually refers to “errors” in the data, instantiations of which include errors that occur during data collection, data recording, or data entry. The outliers generated from such sources are obviously illegitimate observations and should, when found, be corrected. This first category of outliers arises from mistakes and is hence specific to a particular data set, so that these outliers are a property of a sample but not of a population. For example, typically it does not make sense to talk about the number of typographical data entry errors in a population; however, it does make sense to talk about the number of such typos in a sample. Because of this characteristic, this type of outlier differs from the other two categories in the outlier generation models used in methodological and simulation studies, namely, deterministic or slippage simulation models.

The second category of outliers refers to the unpredictable measurement-related errors from participants, including guessing and inattentiveness during item responding, which may be caused by fatigue or participants’ lack of interest in participation. Another example of this category is item misresponding, which happens when, for example, participants misunderstand the instructions or the descriptors on the response scale (e.g., Barnette, 1999). Unlike the first category, whose outliers are clearly errors and sample specific, the second category of outliers may, depending on the particular psychological processes in item responding, be considered either (sample-specific) errors or characteristics and propensities of respondents and hence a population characteristic. For example, misunderstanding the item response instructions may reflect either momentary inattention or an inherent inattentiveness of the respondent. The former is a sample-specific error (and hence akin to an error of the kind in the first category), whereas the latter is, by definition, a characteristic of respondents and hence may reflect a subgroup of inattentive respondents in the population of possible item respondents.

Liu and Zumbo’s (2007) third category of outliers occurs when researchers unknowingly recruit some individuals who are not members of the target population, resulting in a subpopulation for whom the measure operates differently than for the target population. Liu and Zumbo described an example of this in the context of self-concept research conducted with a student population wherein some study participants are from Asian countries for which self-concept may be a different construct. There are many examples of this sort of problem as evidenced by the growing number of articles on construct comparability and test adaptation (e.g., Hambleton, Merenda, & Spielberger, 2005).

Outliers from the third category as well as the second category (when they are a characteristic of respondents) reflect an unintended and unknowingly included (henceforth referred to as “unintended”) subgroup in one’s target population and are usually simulated via probability models. Although they represent different psychological phenomena, these outliers behave the same mathematically and hence can be simulated by the same outlier generation models.

Models Used in Simulation Studies

In the statistical literature, one sees reference to three common models for simulating outliers: deterministic, slippage, and mixture models. Deterministic and slippage models are typically used for the first category and for sample-specific errors in the second category, whereas mixture models are typically used for the second category of outliers that are a characteristic of respondents (and, hence, a population characteristic) and for the third category of outliers. Whether the outliers belong to the second or the third category, the mixture model is used to mimic unintended subpopulations. It should be noted, however, that the slippage model can, in particular instances, also be used to model unintended subpopulations, except that the number of outliers would then be fixed from replication to replication.

Deterministic model

The first category of outliers, errors in the data, has been simulated using a deterministic model (Barnett & Lewis, 1994). Because this type of outlier is sample specific, the number of outliers is fixed for a sample and rejection of the null hypothesis of no outliers is deterministically correct, as these outliers are obviously different from the majority of observations (Barnett & Lewis, 1994). One way to simulate outliers using a deterministic model is simply to alter the original data, either by multiplying raw scores by a constant or by adding a constant to them. Examples of this type abound in the literature and include EFA studies by Yuan, Marshall, and Bentler (2002) and Study 1 of Liu et al. (in press). In both examples, outliers were created by multiplying raw scores of one or more variables by a constant (2, 3, 4, or 5) for a certain proportion of subjects in a sample.

Slippage model

Another common strategy for simulating outliers as errors in the data (i.e., the first category of outliers) is the slippage model. As in the deterministic model, the number of outliers in a sample is fixed from replication to replication in a simulation study; with the slippage model, however, these outliers arise from some probability distribution.

The slippage model has been widely discussed and used in the literature (e.g., Anscombe, 1960; Barnett & Lewis, 1978; Dixon, 1950; Liu et al., in press). In its general form, the null model (without outliers) is

$$H : x_{j} \in F \quad (j = 1, 2, \dots, n).$$

The alternative model is

$$\bar{H} : x_{i} \in F \quad (i = 1, 2, \dots, I), \qquad x_{p} \in G \quad (p = 1, 2, \dots, P),$$

where I + P = n; F denotes a target distribution (sometimes called the parent distribution); G denotes a contamination distribution with a different mean and/or variance; n denotes the total number of observations in a sample; I is the number of observations from the target distribution; and P is the number of observations from the contamination distribution. In the null model, all observations are assumed to come from the same population distribution. In the alternative model, a small number of observations are assumed to come from a contamination distribution, and the total number of observations in a sample is the sum of the observations from the target distribution and from the contamination distribution (Balakrishnan & Childs, 2001).
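The defining feature of the slippage model, a fixed number P of contaminated observations in every sample, can be sketched in a few lines. This is a minimal illustration, not the procedure used in any of the cited studies: the function name `slippage_sample`, the choice F = N(0, 1), and the parameter names `mean_shift` and `sd_scale` for G are assumptions made here.

```python
import random

def slippage_sample(n, n_outliers, mean_shift=0.0, sd_scale=1.0, seed=None):
    """Return n (value, is_outlier) pairs under a slippage model.

    Assumed for illustration: F = N(0, 1) and G = N(mean_shift, sd_scale).
    Exactly n_outliers observations (P) come from G in every sample;
    the remaining I = n - P come from F.
    """
    rng = random.Random(seed)
    target = [(rng.gauss(0.0, 1.0), False) for _ in range(n - n_outliers)]
    contam = [(rng.gauss(mean_shift, sd_scale), True) for _ in range(n_outliers)]
    sample = target + contam
    rng.shuffle(sample)  # position in the sample carries no information
    return sample

# The count of outliers is identical in every replication.
counts = [sum(flag for _, flag in slippage_sample(250, 20, 3.0, 1.0, seed=r))
          for r in range(5)]
print(counts)
```

Because P is an argument rather than a random draw, the outlier count never varies across replications, which is the sense in which slippage contamination is fixed.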

Mixture of distributions

One can think of this model intuitively as mixing two different population distributions together. A psychological example may help make this concrete: people from Denmark typically rate their life satisfaction as much higher than people from Hungary (Organization for Economic Co-operation and Development, 2005). If we targeted people in Denmark for an investigation, but unknowingly also recruited a small group of people who just immigrated to Denmark from Hungary, the observations recruited are from a mixture of two populations and responses from Hungarian people might appear as outliers. To mimic this kind of outliers, one would use a mixture of distributions,¹ which has been widely used in the research literature and also is used in the present research.

A mixture of two distributions is a general model, comprising two weighted probability distributions with positive weights that sum up to one (Blischke, 1978). As the weights represent a probability distribution, the mixture is also a probability distribution. The two distributions thus mixed, depending on the parameter values for the mixing, represent different populations. These components of a mixture of distributions can be normal distributions or nonnormal distributions (e.g., Poisson, negative binomial distributions). In the statistical and psychometric research literatures, a mixture of two normal distributions has been frequently used for simulating outliers. One of the most well known and widely used mathematical models is the mixture contamination model—also referred to as the mixed normal distribution, which was introduced by Tukey (1962) and later extended by Huber (1964), Mosteller and Tukey (1968), and Barnett and Lewis (1994). This mixture contamination model is the one used herein. It is generated by including two normal distributions, a target distribution with mean µ and standard deviation σ, N(µ, σ), denoted by F, and a contamination distribution with some values of mean and/or standard deviation different from F, denoted by G.

Given a sample of n independent observations, x_j (j = 1, 2, . . . , n), the majority of the data points, a proportion 1 − p of the sample, follow the target distribution F, whereas a small fraction, p, follows the contamination distribution G. The mixed contamination model is a mixture of F and G. The null model is

$$H : x_{j} \in F(x).$$

The alternative model is

$$\bar{H} : x_{j} \in (1 - p)\,F(x) + p\,G(x), \quad 0 < p < 1/2, \tag{1}$$

where p, the amount of contamination (i.e., the probability that an observation arises from the contamination distribution G), must be less than one half and is often substantially less. If the proportion of outliers approaches one half, outlier treatment methods, such as robust methods, cannot legitimately be applied in practice, and the outliers should instead be modeled as another population. It is important to note that, in a simulation study with, for example, 100 replications, the proportion of the sample from the G distribution is itself a random variable whose expected value is p; that is, the proportion of outliers varies from sample to sample, but on average it will be p. This sample-to-sample variation in the proportion of outliers is documented for our simulation in the Results section (Table 1).
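The randomness of the outlier count under the mixture contamination model can be made concrete with a short sketch. It assumes F = N(0, 1) and G = N(mean_shift, sd_scale); the function name `mixture_sample` and the seeds are hypothetical and chosen only for illustration, and this is not the authors' R code.

```python
import random

def mixture_sample(n, p, mean_shift=0.0, sd_scale=1.0, seed=None):
    """Draw n observations from (1 - p) * F + p * G.

    Assumed for illustration: F = N(0, 1), G = N(mean_shift, sd_scale).
    Each observation comes from G independently with probability p, so
    the number of outliers differs from sample to sample.
    """
    if not 0.0 < p < 0.5:
        raise ValueError("p must satisfy 0 < p < 1/2")
    rng = random.Random(seed)
    values, from_g = [], []
    for _ in range(n):
        is_outlier = rng.random() < p
        if is_outlier:
            values.append(rng.gauss(mean_shift, sd_scale))
        else:
            values.append(rng.gauss(0.0, 1.0))
        from_g.append(is_outlier)
    return values, from_g

# Proportion of outliers in each of 100 replications (n = 250, p = .08).
proportions = []
for rep in range(100):
    _, from_g = mixture_sample(n=250, p=0.08, mean_shift=3.0, sd_scale=1.0, seed=rep)
    proportions.append(sum(from_g) / 250)

print(min(proportions), sum(proportions) / len(proportions), max(proportions))
```

The printed minimum, mean, and maximum play the same role as the corresponding columns of a table of per-sample outlier proportions: the mean sits near p while individual samples scatter around it.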

Table 1.

Documenting Simulations Using Mixture of Distributions: Proportion of Outliers in Each Sample Across 100 Replications

Pc	n	Mean	SD	Minimum	1st Quartile	Median	3rd Quartile	Maximum
0.010	1,000	0.010	0.003	0.003	0.008	0.010	0.012	0.018
	500	0.010	0.004	0.000	0.008	0.010	0.012	0.022
	250	0.010	0.007	0.000	0.004	0.008	0.012	0.028
0.080	1,000	0.080	0.010	0.060	0.073	0.080	0.088	0.106
	500	0.080	0.011	0.054	0.072	0.082	0.088	0.106
	250	0.080	0.019	0.040	0.068	0.076	0.088	0.128
0.150	1,000	0.150	0.010	0.121	0.143	0.150	0.156	0.178
	500	0.150	0.015	0.108	0.140	0.149	0.162	0.184
	250	0.150	0.023	0.104	0.132	0.150	0.168	0.200

Note. Pc = proportion of contamination in the population; n = sample size; SD = standard deviation.

Slippage models and the mixture contamination model share some similarities but differ in fundamental ways. Barnett and Lewis (1994) pointed out that the number of outliers is fixed in a slippage model, so outliers are regarded as fixed contamination, whereas the number of outliers is a random variable in a mixture contamination model, so outliers are regarded as random contamination. It should be noted that a mixture of distributions model is not appropriate for simulating outliers from Liu and Zumbo’s (2007) first category but is appropriate for simulating outliers from their second and third categories. Because the first category of outliers consists of obvious (e.g., typographical) errors that are sample specific, the randomness of a mixture of distributions model does not match the fixed nature of these outliers. Slippage models, however, can be used for this kind of outlier because the number of outliers is fixed in each sample.

There are two contamination conditions: symmetric and asymmetric contamination. The contamination is symmetric if the population is a mixture of N(µ, σ) and N(µ, bσ), where b is a constant greater than one, so that the contamination distribution has a larger standard deviation (SD) than the parent distribution; this is called SD shift in this article. It is worth noting that if b is less than one, the observations from G are inliers rather than outliers, a condition that is not of interest in the present study. The contamination is asymmetric when the population is a mixture of N(µ, σ) and N(µ + a, σ) or of N(µ, σ) and N(µ + a, bσ), where a is a constant and a ≠ 0. The mean and SD of F are usually defined as 0 and 1, respectively, that is, N(0, 1), so any nonzero a shifts the mean of the contamination distribution away from the center of the target distribution (the mean shift) and hence leads to asymmetric contamination. Therefore, a variety of contamination conditions can be generated by varying the three outlier factors, that is, the proportion of contamination, the mean shift, and the SD shift of the contamination distribution.

An example of outliers in a mixed contamination model is given in Figure 1. Figure 1A shows a normal distribution, N(µ = 0, σ = 1). Figure 1B presents a case of symmetric outliers with 15% of outliers, consisting of a parent distribution N(µ = 0, σ = 1) and a contamination distribution N(µ = 0, σ = 3). Outliers are shown as long and heavy tails at each side of the distribution and result in a highly leptokurtic (peaked) distribution. Figure 1C demonstrates a case of asymmetric outliers with 15% of outliers, consisting of a parent distribution N(µ = 0, σ = 1) and a contamination distribution N(µ = 3, σ = 1). Outliers are shown as a heavy tail on one side of the distribution. Figure 1D shows another case of asymmetric outliers with 15% of outliers, consisting of a parent distribution N(µ = 0, σ = 1) and a contamination distribution N(µ = 3, σ = 3). Here the outliers produce both a heavy tail on one side of the distribution and a high peak.
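The shapes described for the three contaminated panels can be checked against the population skewness and excess kurtosis of a two-component normal mixture, which follow in closed form from the component moments. The sketch below is an illustration only; the function name `mixture_shape` is an assumption, and the parameter triples are the contamination settings stated for Figure 1B to 1D.

```python
def mixture_shape(p, mean_shift, sd_scale):
    """Population skewness and excess kurtosis of
    (1 - p) * N(0, 1) + p * N(mean_shift, sd_scale)."""
    components = [(1.0 - p, 0.0, 1.0), (p, mean_shift, sd_scale)]
    mu = sum(w * m for w, m, _ in components)
    m2 = m3 = m4 = 0.0
    for w, m, s in components:
        d = m - mu  # distance of the component mean from the mixture mean
        m2 += w * (s**2 + d**2)
        m3 += w * (d**3 + 3 * d * s**2)
        m4 += w * (d**4 + 6 * d**2 * s**2 + 3 * s**4)
    return m3 / m2**1.5, m4 / m2**2 - 3.0

# (proportion, mean shift, SD of G) for the contaminated panels of Figure 1
panels = {"B": (0.15, 0.0, 3.0), "C": (0.15, 3.0, 1.0), "D": (0.15, 3.0, 3.0)}
for name, args in panels.items():
    skew, kurt = mixture_shape(*args)
    print(f"Figure 1{name}: skewness = {skew:.3f}, excess kurtosis = {kurt:.3f}")
```

For the symmetric case the skewness is zero and only the kurtosis moves, whereas a nonzero mean shift produces positive skewness, matching the one-sided heavy tail described for the asymmetric panels.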

Figure 1.

An example of symmetric and asymmetric outliers (proportion of contamination = 0.15)

Building on the findings of Liu et al. (in press), the purpose of the present research was to investigate how outliers arising from an unintended and unknowingly recruited subpopulation (Liu and Zumbo’s second and third categories of outliers) affected the decisions about the number of factors to retain. Four commonly used methods, and the most commonly used variants thereof, were examined: parallel analysis (PA, using the PCA model and the 95th percentile of the 100 random data sets), Kaiser–Guttman’s (K-G) eigenvalues-greater-than-one rule, minimum average partial (MAP; see Velicer, 1976, for a detailed description of the procedure used herein), and sequential chi-square tests based on maximum likelihood estimation ($χ_{ML}^{2}$). The results of a Monte Carlo simulation study are reported first, in which the outlier conditions were manipulated using five factors (i.e., mean shift, SD shift, proportion of contamination, number of variables with outliers, and sample size). A follow-up study is then presented to provide insight into potential causes of our findings.

Study 1: Investigating the Effects of Outliers Generated Using the Mixture Contamination Model

Method

Study design

A Monte Carlo simulation study was used to investigate the effects of outliers on decisions about the number of factors by the four decision methods. This study systematically varied five factors with 100 replications for each outlier condition (i.e., simulation condition). These five factors are as follows:

Mean shift of a contamination distribution (0, 1.5, 3)

SD shift of a contamination distribution (1, 1.5, 3)

Proportion of contamination (i.e., proportion of the subjects from the contamination distribution; .01, .08, .15)

Sample size (250, 500, 1,000)

Number of variables with outliers (1, 6, 12, 24)

The study design is therefore a 3 × 3 × 3 × 3 × 4 completely crossed factorial design with 324 conditions, which include the no-outlier conditions (i.e., the comparison conditions) that have a mean shift of zero and an SD shift of one.
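The crossing of the five factors can be enumerated directly, which also makes it easy to see how many cells serve as no-outlier comparison conditions. The variable names below are assumptions introduced for illustration; the factor levels are those listed above.

```python
from itertools import product

mean_shifts = (0, 1.5, 3)
sd_shifts = (1, 1.5, 3)
proportions = (0.01, 0.08, 0.15)
sample_sizes = (250, 500, 1000)
vars_with_outliers = (1, 6, 12, 24)

# Every combination of the five factors is one simulation condition.
conditions = list(product(mean_shifts, sd_shifts, proportions,
                          sample_sizes, vars_with_outliers))

# With mean shift 0 and SD shift 1, G coincides with F, so these cells
# contain no outliers regardless of the other three factors.
comparison = [c for c in conditions if c[0] == 0 and c[1] == 1]

replications = 100
print(len(conditions), len(comparison), len(conditions) * replications)
```

The last printed value is the total number of simulated data sets implied by 100 replications per cell.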

To ensure a systematic investigation of outlier effects, the selection of the magnitudes of three factors (mean shift, SD shift, and proportion of contamination), which are the parameters of a typical mixture contamination model, was guided by previous studies (Blair & Higgins, 1980; Liu & Zumbo, 2007; Mosteller & Tukey, 1968; Zumbo & Jennings, 2002). Following these studies, the present study adopted similar values of model parameters with some modifications to fit the purpose of the present study. The number of variables with outliers was also included in the present study because it was shown to be an influential factor in determining the number of factors in Liu et al.’s (in press) study. In addition, sample size was found in the literature to affect the performance of the K-G rule as well as chi-square tests (e.g., Gorsuch, 1983; Hubbard & Allen, 1987; Zwick & Velicer, 1986). Hence, we included sample size as a factor in the present study.

Data generation

In line with the earlier work by Liu and Zumbo (2007), Liu et al. (in press), and the psychometric context of our study, the outliers are induced in the item responses, that is, the marginal distributions. Twenty-four continuous variables were simulated and therefore our findings apply equally to analyses of subscale scores or visual analogue item response data (Liu & Zumbo, 2007).

For the mixture contamination model in Equation (1), both the target and contamination data were generated based on the population correlation matrix from Holzinger and Swineford’s (1939) classic data set. The original data set consists of 24 psychological ability test scores from 301 junior high school students with a four-factor solution recommended by many researchers (e.g., Gorsuch, 1983; Harman, 1976; Liu et al., in press). As in Liu et al.’s studies, a four-factor solution based on maximum likelihood EFA was obtained using Holzinger and Swineford’s data. To give the reader a sense of the factorial solution that generated the implied correlation matrix, using maximum likelihood EFA along with PROMAX rotation, the average interfactor correlation was .46 and ranged from .40 to .55. In addition, the structure coefficients (i.e., the factor loadings) demonstrate some complexity (i.e., not precisely simple structure); however, every factor had at least eight loadings greater than .40, and the interpretation of the factors is in line with Gorsuch (1983). The resulting reproduced correlation matrix (i.e., the implied correlation matrix with “1s” on the diagonal rather than the reproduced communalities) was used as the population correlation matrix in the simulation to generate multivariate normal data sets with specified marginal means and SDs, depending on the experimental condition, that correspond to the target or contamination distribution in Equation (1). Multivariate normal data were generated in R 2.12.1 using a method akin to that of Kaiser and Dickman (1962), except that we used the Cholesky decomposition rather than principal components analysis in the computation. Generating data from a model with a known (prespecified) number of factors allowed us to compare the number of factors obtained from different outlier conditions to a common criterion in the population: four factors.
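The general idea of Cholesky-based generation of correlated normal data with condition-specific marginal means and SDs can be sketched as follows. This is an illustration under stated assumptions, not the authors' R code: the 3 × 3 matrix `R_DEMO` is hypothetical (it is not the Holzinger and Swineford matrix), the function names are invented here, and the choice to contaminate the first `n_contaminated_vars` variables is an assumption, since the text does not say which variables carried outliers.

```python
import math
import random

def cholesky(r):
    """Lower-triangular L with L * L^T = r (r symmetric positive definite)."""
    n = len(r)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(r[i][i] - s)
            else:
                low[i][j] = (r[i][j] - s) / low[j][j]
    return low

def draw_case(low, means, sds, rng):
    """One multivariate normal case with correlation L * L^T and given marginals."""
    z = [rng.gauss(0.0, 1.0) for _ in low]
    x = [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(len(low))]
    return [m + s * xi for m, s, xi in zip(means, sds, x)]

def contaminated_sample(r, n, p, mean_shift, sd_scale, n_contaminated_vars, seed=None):
    """Mixture sample: each case comes from G with probability p, else from F."""
    rng = random.Random(seed)
    low = cholesky(r)
    k = len(r)
    g_means = [mean_shift if j < n_contaminated_vars else 0.0 for j in range(k)]
    g_sds = [sd_scale if j < n_contaminated_vars else 1.0 for j in range(k)]
    data = []
    for _ in range(n):
        if rng.random() < p:
            data.append(draw_case(low, g_means, g_sds, rng))   # contamination G
        else:
            data.append(draw_case(low, [0.0] * k, [1.0] * k, rng))  # target F
    return data

# Hypothetical correlation matrix used only to exercise the functions.
R_DEMO = [[1.0, 0.5, 0.3],
          [0.5, 1.0, 0.4],
          [0.3, 0.4, 1.0]]
data = contaminated_sample(R_DEMO, n=250, p=0.15, mean_shift=3.0,
                           sd_scale=1.0, n_contaminated_vars=2, seed=1)
print(len(data), len(data[0]))
```

Multiplying independent standard normal draws by the Cholesky factor imposes the target correlation structure; the marginal means and SDs are applied afterward so that the same correlation matrix underlies both F and G.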

Outcome variable

In each of the 324 experimental conditions, and for each of the 100 replications, the number of factors to retain for the EFA was determined, separately, by the K-G rule, PA, MAP, and sequential $χ_{ML}^{2}$ tests. The number of factors retained is the dependent variable for each of these four methods, respectively, in this simulation study. It should be noted that, as in Liu et al.’s (in press) Study 2, for each outlier design condition, an average of the number of factors over 100 replications was obtained and hence the number of factors reported might not be a whole number.
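Of the four decision rules, the K-G eigenvalues-greater-than-one rule is the simplest to state in code: count the eigenvalues of the correlation matrix that exceed one. The sketch below is a dependency-free illustration, not the software used in the study; the Jacobi eigenvalue routine, the function names, and the block-structured matrix `R_BLOCK` are all assumptions introduced here. PA, MAP, and the sequential chi-square tests are not sketched.

```python
import math

def jacobi_eigenvalues(matrix, sweeps=50, tol=1e-12):
    """Eigenvalues of a symmetric matrix by cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                # Rotation angle that zeroes the (p, q) element.
                theta = 0.5 * math.atan2(2.0 * a[p][q], a[q][q] - a[p][p])
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # apply the rotation to columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # apply the rotation to rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)

def kaiser_guttman(corr):
    """Number of factors retained: eigenvalues of corr greater than one."""
    return sum(1 for ev in jacobi_eigenvalues(corr) if ev > 1.0)

# Hypothetical matrix: two uncorrelated pairs of variables, r = .6 within pairs.
R_BLOCK = [[1.0, 0.6, 0.0, 0.0],
           [0.6, 1.0, 0.0, 0.0],
           [0.0, 0.0, 1.0, 0.6],
           [0.0, 0.0, 0.6, 1.0]]
print(jacobi_eigenvalues(R_BLOCK), kaiser_guttman(R_BLOCK))
```

In a simulation of the kind described here, such a rule would be applied to the sample correlation matrix of each replication, and the resulting counts averaged within each condition.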

Analysis of the simulation results

Following the data analysis strategy used in Liu and Zumbo (2007) and Liu et al. (in press), five-way ANOVAs (3 × 3 × 3 × 3 × 4) were conducted with the number of factors retained as the dependent variable separately for each of the four decision methods, that is, the K-G rule, PA, MAP, and $χ_{ML}^{2}$ sequential test. Given the large sample size (32,400—i.e., 324 cells in the design with 100 replications per cell), we used eta-square (η²) to orthogonally partition the explained variance obtained from the fixed-effect ANOVA models instead of looking at the statistical significance. An experiment with a sample size of 100 per cell (with an overall sample size of 32,400) results in a statistical power for main effects and interactions approaching one; hence, statistical significance was not useful in interpreting the results because even trivial effects would be statistically significant with that much power. Instead, the proportion of explained variance was used to aid our interpretation of the simulation results, which is like R² in regression analysis. In line with Liu et al. (in press), we used Ferguson’s (2009) minimum effect size of η² of .04 as the criterion to judge the importance of the main effects and interactions. In addition, if interactions appeared in the model, only higher order interactions were interpreted because main effects and lower order interactions are not interpretable in the presence of higher order interactions.

The sequential $χ_{ML}^{2}$ test can fail to yield a decision about the number of factors to retain because of nonconvergence. Nonconvergence can therefore result in unbalanced data for the ANOVA, which can, in turn, distort the orthogonal partition of variance in the outcome variable. See our discussion of the findings in Table 6, below, for details about the small amount of imbalance in the sample sizes and where in the design it occurred. We therefore adopted the Type III sum-of-squares method in SPSS, an often-used method for handling unbalanced data with no missing cells (SPSS Inc., 2009), for the data analysis of the simulation results.

Results

Proportion of outliers in a given sample

As noted earlier, when using the mixture contamination model to simulate outliers, the proportion of outliers in a sample can vary across replications, that is, from sample to sample. To our knowledge, the central tendency and variability of sample-to-sample proportions of outliers have not been documented in simulation studies. To better understand these statistics, we recorded the proportion of outliers across 100 replications for a single variable.

Table 1 lists the central tendency (mean, median) and variability (SD, quartiles, minimum and maximum values) for the proportion of outliers for the various conditions in the current simulation study across the 100 replications. Starting from the far left in Table 1, one can find the population value of the proportion of contamination, the sample size, and then the seven descriptive statistics computed across the 100 replications. One can see that, as expected, the mean is equal to the population value of contamination in every case. However, also as expected, there is variability in the proportion of contamination across the samples, which depends on the sample size and the population proportion of contamination.
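The dependence of this variability on sample size and on the population proportion has a simple theoretical counterpart. Assuming each observation is drawn from G independently with probability p, the number of outliers in a sample is binomial, and the SD of the sample proportion is sqrt(p(1 − p)/n). The sketch below computes that value for the nine rows of Table 1; the function name `proportion_sd` is an assumption, and the empirical SDs in the table, being based on 100 replications, are expected to agree only approximately.

```python
import math

def proportion_sd(p, n):
    """SD of the sample proportion of outliers when the count is Binomial(n, p)."""
    return math.sqrt(p * (1.0 - p) / n)

for p in (0.01, 0.08, 0.15):
    for n in (1000, 500, 250):
        print(f"Pc = {p:.3f}, n = {n:>4}: theoretical SD = {proportion_sd(p, n):.4f}")
```

The formula shows the two trends visible in Table 1: the SD grows as n decreases and, for p below one half, as p increases.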

Results of the simulation study

Tables 2 to 5 present the results for the four decision methods (K-G, MAP, PA, and sequential $χ_{ML}^{2}$ tests), respectively. Figures 2 to 5 show the corresponding highest order interactions for these four methods identified as important factors using η². For the K-G, MAP, and PA methods, the highest order interaction meeting our criterion was the same: mean shift by the number of variables having outliers by the proportion of contamination. Hence, we only interpreted this three-way interaction for the K-G, MAP, and PA methods and not any of the lower order interactions and main effects.

Figure 2.

Graphs for three-way interactions of variables with outliers versus proportion of contamination by three levels of mean shift on the number of factors extracted by the K-G rule

Figure 3.

Graphs for three-way interactions of variables with outliers versus proportion of contamination by three levels of mean shift on the number of factors extracted by MAP approach

Figure 4.

Three-way interactions of variables with outliers versus proportion of contamination by three levels of mean shift on the number of factors extracted by PA approach

Figure 5.

Three-way interactions of variables with outliers versus proportion of contamination by three levels of standard deviation shift on the number of factors decided by the sequential $χ_{ML}^{2}$ test

Table 2 presents the results of the variance decomposition for the K-G rule, and Figure 2 shows the corresponding plot of the three-way interaction. With a mean shift of zero (i.e., symmetric outliers), the number of factors was not affected by outliers, which was also the case for the MAP and PA methods. With mean shifts of 1.5 and 3 (i.e., asymmetric outliers), the change in the number of factors depended on the number of variables having outliers and the proportion of contamination. When the mean shift was 1.5, the number of factors was not affected when only one variable or all 24 variables had outliers, but it was inflated (from 4 up to 5 factors) when 6 or 12 variables had outliers. With a mean shift of 3, the number of factors was not affected when only one variable had outliers, was inflated (from 4 up to 5 factors) when 6 or 12 variables had outliers, and was deflated (from 4 to an average of 2.7 factors) when all 24 variables had outliers. The deflation became more pronounced as the proportion of contamination increased.

Table 2.

Variable Ordering for a Five-Way ANOVA on the Number of Factors Extracted by the K-G Rule

Model	Sum of Squares	Eta-Square	Percentage of R²
vars * mean	1929.155	0.197	25.953
vars	1742.855	0.178	23.447
pc * vars * mean	930.368	0.095	12.516
pc * vars	887.323	0.090	11.937
pc	242.723	0.025	3.265
pc * mean	238.286	0.024	3.206
vars * SD	233.350	0.024	3.139
n	232.309	0.024	3.125
SD	169.119	0.017	2.275
mean	165.230	0.017	2.223
pc * vars * mean * SD	91.428	0.009	1.230
pc * vars * SD	81.851	0.008	1.101
vars * mean * SD	65.865	0.007	0.886
n * SD	56.401	0.006	0.759
n * vars * SD	44.155	0.005	0.594
pc * SD	37.587	0.004	0.506
pc * mean * SD	37.478	0.004	0.504
n * vars * mean	37.212	0.004	0.501
n * vars * mean * SD	30.391	0.003	0.409
n * mean	26.433	0.003	0.356
n * mean * SD	25.753	0.003	0.346
n * pc * vars * mean	24.026	0.002	0.323
n * pc * mean	23.618	0.002	0.318
mean * SD	19.424	0.002	0.261
n * pc * vars	15.457	0.002	0.208
n * vars	14.537	0.001	0.196
n * pc * vars * mean * SD	11.692	0.001	0.157
n * pc * mean * SD	10.674	0.001	0.144
n * pc * vars * SD	3.998	0.000	0.054
n * pc	2.270	0.000	0.031
n * pc * SD	2.220	0.000	0.030
Error	2373.010
Total	581401.000
Corrected total	9806.198

Note. R² = 0.758. ANOVA = analysis of variance; K-G = Kaiser–Guttman rule; mean = mean shift of the contamination distribution; SD = standard deviation shift of the contamination distribution; pc = proportion of contamination in the population; n = sample size; vars = number of variables having outliers. The important main effects and/or interactions, as described in the Methods section, are listed in boldface.
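The columns of Table 2 can be reproduced from the sums of squares. The sketch below uses the first row (vars * mean) together with the error and corrected total rows. The eta-square formula is the usual ratio of an effect's sum of squares to the corrected total; the formula for the percentage column is an inference made here because it matches the reported values, not a definition stated in the text.

```python
# Values copied from Table 2 (K-G rule).
ss_effect = 1929.155           # vars * mean
ss_error = 2373.010
ss_corrected_total = 9806.198

# Eta-square: share of the corrected total variation attributed to the effect.
eta_square = ss_effect / ss_corrected_total

# Model R-square: share of the corrected total not left in the error term.
r_square = 1.0 - ss_error / ss_corrected_total

# Assumed definition of the last column: the effect's share of the
# explained sum of squares, expressed as a percentage.
pct_of_r_square = 100.0 * ss_effect / (ss_corrected_total - ss_error)

print(f"{eta_square:.3f} {r_square:.3f} {pct_of_r_square:.3f}")
```

In a balanced design the effect sums of squares add up to the explained sum of squares, so the percentage column sums to 100; with unbalanced data and Type III sums of squares this need not hold exactly.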

Table 3 presents the results of the variance decomposition for the MAP method, with the corresponding plots in Figure 3. Figure 3 shows that the number of factors was not affected when the mean shift was 0 or 1.5 but was inflated from 4 to 5 when the mean shift increased to 3 and 6 or 12 variables had outliers. The magnitude of the inflation increased with the proportion of contamination. It is worth noting that, with the MAP method, the number of factors retained was not affected when all variables had outliers, unlike with the K-G and PA methods.

Table 3.

Variable Ordering for a Five-Way ANOVA on the Number of Factors Extracted by the MAP Approach

Model	Sum Squares	Eta-Square	Percentage of R²
vars * mean	564.060	0.140	21.940
mean	543.740	0.135	21.149
vars	414.457	0.103	16.121
pc * vars * mean	292.322	0.072	11.370
pc * mean	273.508	0.068	10.638
pc * vars	204.590	0.051	7.958
pc	166.922	0.041	6.493
n	28.671	0.007	1.115
SD	14.232	0.004	0.554
pc * vars * mean * SD	6.649	0.002	0.259
vars * SD	5.894	0.001	0.229
pc * SD	5.838	0.001	0.227
vars * mean * SD	4.605	0.001	0.179
n * pc * vars * mean	4.435	0.001	0.173
n * mean	4.411	0.001	0.172
n * vars * mean	4.298	0.001	0.167
pc * mean * SD	4.121	0.001	0.160
n * pc * mean	4.038	0.001	0.157
n * vars * SD	3.618	0.001	0.141
pc * vars * SD	3.613	0.001	0.141
n * pc * vars * SD	2.343	0.001	0.091
n * SD	2.169	0.001	0.084
n * pc * vars * mean * SD	2.156	0.001	0.084
mean * SD	2.028	0.001	0.079
n * vars	1.838	0.000	0.071
n * pc	1.513	0.000	0.059
n * vars * mean * SD	1.390	0.000	0.054
n * pc * vars	1.061	0.000	0.041
n * pc * SD	1.025	0.000	0.040
n * pc * mean * SD	1.020	0.000	0.040
n * mean * SD	0.234	0.000	0.009
Error	1471.610
Total	542513.000
Corrected total	4042.407

Note. R² = 0.636. ANOVA = analysis of variance; MAP = minimum average partial; mean = mean shift of the contamination distribution; SD = standard deviation shift of the contamination distribution; pc = proportion of contamination in the population; n = sample size; vars = number of variables having outliers. The important main effects and/or interactions, as described in the Methods section, are listed in boldface.
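The Eta-Square and Percentage of R² columns in Table 3 appear consistent with a simple arithmetic rule: each effect's sum of squares divided by the corrected total, and each effect's sum of squares divided by the sum over all 31 effects. The following is a minimal sketch of that arithmetic, not the authors' code; the function name `decompose` is hypothetical, and the rule itself is inferred from the tabled values rather than stated in the text.

```python
# Sums of squares for the 31 effects in Table 3 (MAP), in table order.
effect_ss = [
    564.060, 543.740, 414.457, 292.322, 273.508, 204.590, 166.922, 28.671,
    14.232, 6.649, 5.894, 5.838, 4.605, 4.435, 4.411, 4.298, 4.121, 4.038,
    3.618, 3.613, 2.343, 2.169, 2.156, 2.028, 1.838, 1.513, 1.390, 1.061,
    1.025, 1.020, 0.234,
]
corrected_total = 4042.407  # "Corrected total" row of Table 3


def decompose(effect_ss, corrected_total):
    """Return R², eta-square per effect, and each effect's share of R² (in %).

    Assumed rule (inferred from the table, hypothetical helper):
      eta-square      = SS_effect / SS_corrected_total
      R²              = sum of SS_effect / SS_corrected_total
      percentage of R² = 100 * SS_effect / sum of SS_effect
    """
    model_ss = sum(effect_ss)
    r_squared = model_ss / corrected_total
    eta_square = [ss / corrected_total for ss in effect_ss]
    pct_of_r2 = [100.0 * ss / model_ss for ss in effect_ss]
    return r_squared, eta_square, pct_of_r2


r_squared, eta_square, pct_of_r2 = decompose(effect_ss, corrected_total)
print(f"R-squared: {r_squared:.3f}")
print(f"vars * mean: eta-square = {eta_square[0]:.3f}, "
      f"share of R-squared = {pct_of_r2[0]:.2f}%")
```

Under this reading, the percentage column shows how the explained variance is divided among effects, so it can be large even for a method whose overall R² is modest.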

Table 4 presents the results of the variance decomposition for the PA method, and Figure 4 is the corresponding interaction plot. Like the K-G and MAP methods, the PA method was robust to symmetric outliers. In general, the PA method was accurate in retaining the number of factors in the presence of asymmetric outliers; however, it broke down when all variables had outliers: (a) the number of factors was deflated slightly when the mean shift was 1.5 and deflated dramatically when the mean shift increased to 3, and (b) the magnitude of deflation increased as the proportion of contamination increased.

Table 4.

Variable Ordering for a Five-Way ANOVA on the Number of Factors Extracted by the PA Approach

Model	Sum Squares	Eta-Square	Percentage of R²
vars * mean	2280.285	0.196	23.802
vars	1985.020	0.171	20.720
pc * vars * mean	1069.485	0.092	11.163
mean	1006.276	0.087	10.504
pc * vars	940.891	0.081	9.821
pc * mean	498.573	0.043	5.204
pc	410.573	0.035	4.286
n	234.027	0.020	2.443
n * vars * mean	120.726	0.010	1.260
n * vars	107.233	0.009	1.119
vars * mean * SD	104.751	0.009	1.093
vars * SD	97.287	0.008	1.015
mean * SD	94.225	0.008	0.984
n * pc * vars * mean	84.549	0.007	0.883
SD	80.244	0.007	0.838
n * mean	75.325	0.006	0.786
pc * vars * mean * SD	64.773	0.006	0.676
pc * vars * SD	60.835	0.005	0.635
pc * mean * SD	50.661	0.004	0.529
pc * SD	48.762	0.004	0.509
n * pc * vars	42.428	0.004	0.443
n * pc	41.678	0.004	0.435
n * pc * mean	27.678	0.002	0.289
n * pc * vars * mean * SD	15.749	0.001	0.164
n * pc * mean * SD	10.429	0.001	0.109
n * vars * mean * SD	9.840	0.001	0.103
n * mean * SD	8.414	0.001	0.088
n * pc * vars * SD	7.395	0.001	0.077
n * vars * SD	1.617	0.000	0.017
n * pc * SD	0.393	0.000	0.004
n * SD	0.291	0.000	0.003
Error	2039.620
Total	485484.000
Corrected total	11620.035

Note. R² = 0.824. ANOVA = analysis of variance; PA = parallel analysis; mean = mean shift of the contamination distribution; SD = standard deviation shift of the contamination distribution; pc = proportion of contamination in the population; n = sample size; vars = number of variables having outliers. The important main effects and/or interactions, as described in the Methods section, are listed in boldface.

Unlike the three PCA-based methods, for the sequential $χ_{ML}^{2}$ test the highest order interaction identified as important using η² was SD shift by the number of variables having outliers by the proportion of contamination. Table 5 presents the results of the variance decomposition for the sequential $χ_{ML}^{2}$ test, and Figure 5 shows the corresponding interaction plot. The SD shift played an important role in the interaction, which indicated that symmetric outliers affected the performance of the sequential $χ_{ML}^{2}$ test. When the SD shift was 1 (no shift) or 1.5 (a mild increase in the variability of the contamination distribution), the number of factors retained was either not affected or inflated by a small magnitude. However, the number of factors retained was inflated dramatically when the SD shift increased to 3; in particular, when all variables had outliers, the number of factors retained increased from 4 (baseline) to almost 12.

Table 5.

Variable Ordering for a Five-Way ANOVA on the Number of Factors Decided by the Chi-Square (ML) Test

Model	Sum Squares	Eta-Square	Percentage of R²
vars * SD	27625.757	0.306	32.393
SD	19436.831	0.216	22.791
vars	13616.464	0.151	15.966
pc * vars * SD	7747.121	0.086	9.084
pc	5190.159	0.058	6.086
pc * SD	4856.412	0.054	5.694
pc * vars	4018.700	0.045	4.712
mean	765.987	0.008	0.898
vars * mean	642.238	0.007	0.753
n	214.615	0.002	0.252
n * vars * SD	134.941	0.001	0.158
n * SD	131.215	0.001	0.154
pc * vars * mean	119.643	0.001	0.140
pc * mean	117.121	0.001	0.137
mean * SD	100.804	0.001	0.118
n * vars	99.760	0.001	0.117
n * pc * vars * SD	88.972	0.001	0.104
vars * mean * SD	74.092	0.001	0.087
pc * mean * SD	62.435	0.001	0.073
pc * vars * mean * SD	48.162	0.001	0.056
n * pc * vars * mean	31.810	0.000	0.037
n * pc * mean	29.125	0.000	0.034
n * pc * SD	26.532	0.000	0.031
n * pc	24.936	0.000	0.029
n * pc * vars	22.889	0.000	0.027
n * pc * vars * mean * SD	18.122	0.000	0.021
n * mean	10.558	0.000	0.012
n * vars * mean * SD	8.307	0.000	0.010
n * mean * SD	8.049	0.000	0.009
n * vars * mean	7.478	0.000	0.009
n * pc * mean * SD	5.091	0.000	0.006
Error	8922.025
Total	855330.000
Corrected total	90140.908

Note. R² = .946. ANOVA = analysis of variance; mean = mean shift of the contamination distribution; SD = standard deviation shift of the contamination distribution; pc = proportion of contamination in the population; n = sample size; vars = number of variables having outliers. The important main effects and/or interactions, as described in the Methods section, are listed in boldface.

It should be noted that nonconvergence occurred for the sequential $χ_{ML}^{2}$ tests. In most cells of the experimental design, no replications failed to converge; in the cells where nonconvergence did occur, it affected 1% to 19% of the replications, and it amounted to less than one half of a percentage point overall. As shown in Table 6, the nonconvergence occurred for experimental conditions wherein the SD shift was 3, which, as we saw in Figure 1, involved high kurtosis for the variable with outliers. Within the conditions involving an SD shift of 3, nonconvergence happened when the proportion of contamination was either .08 or .15 and either 12 or 24 variables had outliers, and it was more likely to happen when all 24 variables had outliers. Sample size also seemed to play a role in this finding: the nonconvergence occurred predominantly with sample sizes of 1,000. As in Liu et al. (in press), inspection of the statistical output from the simulation indicated that the nonconvergence problem resulted from a combination of a Heywood case and failure to find a local minimum of the empirical likelihood solution.

Table 6.

Percentage of Nonconvergent Replications With the Sequential Chi-Square (ML) Tests

		SD Shift of 3
		Mean Shift
Pc	Vars	.0	1.5	3.0
.08	12			1 (n = 250)
	24	3 (n = 500)	4 (n = 500)	3 (n = 500)
		8 (n = 1,000)	15 (n = 1,000)	18 (n = 1,000)
.15	24		1 (n = 250)	1 (n = 250)
		3 (n = 500)	4 (n = 500)	10 (n = 500)
		19 (n = 1,000)	17 (n = 1,000)	16 (n = 1,000)

Note. SD = standard deviation; pc = proportion of contamination in the population; vars = number of variables having outliers.

Study 2: Demonstrations of Effects of Outliers on Correlation Matrix and Kurtosis and Skewness of Item Responses

The purpose of Study 2 was to facilitate our understanding of why these decision methods performed differently in the presence of outliers. As Liu et al. (in press) pointed out, correlation matrices are the engine for the PCA-based methods and hence are the input data for them. Furthermore, skewness and kurtosis are related to the performance of the $χ_{ML}^{2}$ test in factor analysis (Boomsma, 1983; Browne, 1984). We therefore include two demonstrations in Study 2: a small, focused simulation to show how outliers distort the correlation matrix and eigenvalues, and a demonstration of the change in kurtosis and skewness in the presence of outliers.

Demonstration 1

Researchers have often ignored the effects of outliers on factor analysis, partly because they believed that a few outliers should not substantially change the correlation matrix and that, as such, a factor analysis should not be affected by outliers. The present small-scale simulation aimed to demonstrate how outliers may distort the properties of a correlation matrix. Following Liu et al.’s (in press) study, we used the original correlation matrix of the first four variables from Holzinger and Swineford’s (1939) classic data as the population correlation matrix for simulating multivariate normal data sets. For demonstration purposes, we included only extreme outlier conditions (mean shift = 3 and/or SD shift = 3), with either two or all four variables having outliers, as well as a no-outlier condition. Across all outlier conditions, the proportion of contamination was 0.15. Because we did not find sample size effects in Study 1, we did not manipulate sample size in this demonstration and used data sets with 100,000 observations so as to have population analogues.

To examine the change in the correlation matrix under outlier conditions, we used the matrix’s condition number to document whether the matrix is ill-conditioned and, if so, to what degree, with a larger condition number indicating a more ill-conditioned matrix. The advantage of the condition number is that it exposes a problem that would otherwise stay hidden: when the correlation matrix is close to being singular, we can still obtain a solution, which disguises the ill-conditioning, but the condition number reflects whether the matrix is ill-conditioned and whether its properties are distorted. The condition number is a product term, $\operatorname{cond}(A) = \lVert A \rVert \times \lVert A^{-1} \rVert$, where $A$ denotes a correlation matrix, $\lVert A \rVert$ denotes the matrix norm, $A^{-1}$ denotes the inverse of the matrix, and $\lVert A^{-1} \rVert$ denotes the matrix norm of the inverse (Watkins, 2010). Given that a correlation matrix is a symmetric square matrix with ones on the major diagonal, $\lVert A \rVert$ is the largest eigenvalue of the correlation matrix, and $\lVert A^{-1} \rVert$ is the largest eigenvalue of its inverse, that is, the reciprocal of the smallest eigenvalue of $A$ (Golub & Van Loan, 1993).
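As a minimal sketch of the condition number described above, assuming the spectral (2-) norm and a positive definite correlation matrix, the value reduces to the ratio of the largest to the smallest eigenvalue. The function name `condition_number` and the equicorrelation matrix used here are hypothetical illustrations; they are not the Holzinger and Swineford values.

```python
import numpy as np


def condition_number(corr):
    """Condition number of a positive definite correlation matrix.

    Assumes the spectral norm, so that ||A|| is the largest eigenvalue of A
    and ||A^-1|| is the reciprocal of the smallest eigenvalue of A.
    """
    eigenvalues = np.linalg.eigvalsh(corr)  # returned in ascending order
    return eigenvalues[-1] / eigenvalues[0]


# Hypothetical 4 x 4 correlation matrix with all off-diagonal entries 0.4.
example_corr = np.full((4, 4), 0.4)
np.fill_diagonal(example_corr, 1.0)

cond_example = condition_number(example_corr)
# Cross-check against NumPy's general-purpose 2-norm condition number.
cond_check = np.linalg.cond(example_corr, 2)

print(f"eigenvalue ratio: {cond_example:.3f}")
print(f"np.linalg.cond:   {cond_check:.3f}")
```

For this equicorrelation matrix the eigenvalues are 2.2 (once) and 0.6 (three times), so both computations give 2.2 / 0.6, about 3.667.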

Table 7 presents the results of the simulation with four rows and seven columns. The first column indicates the outlier conditions (i.e., mean shift and SD shift), the second column shows the resulting correlation matrix with only two variables having outliers, the third and fourth columns are the corresponding condition number and eigenvalues, the fifth column shows the resulting correlation matrix with all four variables having outliers, and the sixth and seventh columns are the corresponding condition number and eigenvalues.

Table 7.

Demonstration of Changes in Correlation Coefficients, Condition Number, and Eigenvalues Using a Four-Variable Data Set in the Presence of Outliers (15%) Compared to the No-Outlier Condition

Note. M = mean; SD = standard deviation. Dashed lines are used to indicate which variables had outliers.

The top row presents the results for the correlation matrix in the no-outlier condition. The second row shows the results for the symmetric outlier condition, with no mean shift and an SD shift of 3. No substantial effects of outliers were found, whether two variables or all variables had outliers: some of the correlation coefficients were deflated to a small degree when two variables had outliers, but the condition number and the magnitude of the eigenvalues were not affected by symmetric outliers.

However, there were dramatic changes for the asymmetric outlier condition with mean shift only (mean shift = 3, SD shift = 1). Echoing the findings of Liu et al. (in press), we found that when two variables had outliers, the correlation coefficient for those two variables was inflated, whereas the remaining correlations in the matrix were either deflated (when they involved one variable with outliers and one without) or unchanged (when they involved only variables without outliers). This complex pattern created an extra factor and resulted in an increase of the condition number from 3.415 (baseline) to 6.481. When all the variables had outliers, the correlation coefficients were all inflated, resulting in the creation of a more dominant (or salient) factor and a large increase in the condition number from 3.415 to 11.363.

For the asymmetric outlier condition with both mean shift and SD shift (mean shift = 3, SD shift = 3), the effects of outliers were reduced to some degree. Compared with the mean shift only condition (mean shift = 3, SD shift = 1), the magnitude of inflation in correlation coefficients became smaller; the condition number dropped to some extent, from 6.481 and 11.363 to 4.465 and 7.093; the second eigenvalue for two variables having outliers was not greater than one anymore (i.e., dropped from 1.009 to 0.955); and the magnitude of the largest eigenvalue for all variables having outliers decreased from 3.047 to 2.666.

The interesting findings here were that symmetric outliers did not affect the correlation matrix, whereas asymmetric outliers, especially in the mean shift only condition, distorted the correlation matrix, which either created an extra factor or led to the appearance of a dominant factor that could reduce the number of factors if there was more than one factor. This helps us understand why mean shift and the number of variables having outliers played important roles for the PCA-based methods whereas SD shift did not. Although they are all PCA-based, the K-G, MAP, and PA methods adopt different procedures, and hence one should not be surprised to find some variation among these methods when determining the number of factors, as was shown in our Study 1.
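The qualitative pattern in Demonstration 1 can be reproduced approximately with a small simulation. The sketch below rests on stated assumptions rather than the authors' code: the population correlation matrix is a hypothetical equicorrelation matrix (0.4), not the Holzinger and Swineford values, and contamination is assumed to rescale and shift a random 15% of rows on the affected variables. The helper names `contaminate` and `summarize` are hypothetical.

```python
import numpy as np


def contaminate(data, affected, proportion, mean_shift, sd_shift, rng):
    """Assumed contamination scheme: for a random subset of rows, rescale the
    affected variables by sd_shift and add mean_shift (hypothetical helper)."""
    out = data.copy()
    rows = np.flatnonzero(rng.random(data.shape[0]) < proportion)
    block = np.ix_(rows, affected)
    out[block] = out[block] * sd_shift + mean_shift
    return out


def summarize(data):
    """Return the correlation matrix, eigenvalues (descending), and the
    condition number computed as largest / smallest eigenvalue."""
    corr = np.corrcoef(data, rowvar=False)
    eigenvalues = np.linalg.eigvalsh(corr)[::-1]
    return corr, eigenvalues, eigenvalues[0] / eigenvalues[-1]


rng = np.random.default_rng(2024)

# Hypothetical population correlation matrix (all correlations 0.4).
pop_corr = np.full((4, 4), 0.4)
np.fill_diagonal(pop_corr, 1.0)
clean = rng.multivariate_normal(np.zeros(4), pop_corr, size=100_000)

# (mean shift, SD shift) for each condition; the first two variables
# are the ones that receive outliers.
conditions = {
    "no outliers": (0.0, 1.0),
    "symmetric": (0.0, 3.0),
    "mean shift only": (3.0, 1.0),
    "mean and SD shift": (3.0, 3.0),
}

results = {}
for label, (mean_shift, sd_shift) in conditions.items():
    data = contaminate(clean, [0, 1], 0.15, mean_shift, sd_shift, rng)
    results[label] = summarize(data)
    corr, eigenvalues, cond = results[label]
    print(f"{label:18s} r12 = {corr[0, 1]:.3f}  r13 = {corr[0, 2]:.3f}  "
          f"condition number = {cond:.3f}")
```

Under these assumptions, the mean shift only condition inflates the correlation between the two contaminated variables, deflates their correlations with the clean variables, and raises the condition number, whereas the symmetric condition leaves the condition number close to its baseline. The specific numbers will differ from Table 7 because the population matrix is different.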

Demonstration 2

Our findings from Study 1 revealed that, for the sequential $χ_{ML}^{2}$ test, the number of factors was inflated with an increase in the magnitude of SD shift, the proportion of contamination, and the number of variables having outliers. Unlike the PCA-based methods, SD shift played an important role, but mean shift did not. In this demonstration, we followed the simulation design in Study 1, but did not manipulate the number of variables having outliers. The purpose was to demonstrate the univariate skewness and kurtosis for a single variable in each outlier condition. Like Demonstration 1, we did not manipulate sample size and hence reported the population analogues—that is, we reported the values of kurtosis and skewness for a data set with 100,000 observations.

Table 8 comprises two parts: the upper part reports kurtosis and the lower part reports skewness.² The kurtosis was inflated when the SD shifted from 1 to 1.5, and greatly inflated when SD shift became 3. Mean shift affected kurtosis to some degree for the proportion of contamination of .01 and .08, but not much for a proportion of contamination of .15. It should be noted that the kurtosis was inflated to 8.62 (mean shift = 3, SD shift = 3) for the proportion of contamination of .08, but was 5.96 for the proportion of contamination of .15. This suggests that a higher level of proportion of contamination (.15) led to less inflation in kurtosis than a lower level (.08). As shown mathematically by Pena and Prieto (2001), symmetric outliers increase the kurtosis and a small proportion of asymmetric outliers also increase kurtosis, but a large proportion of asymmetric outliers can make kurtosis smaller.
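The non-monotonic behavior of kurtosis can also be checked against theoretical mixture moments. The following is a minimal sketch, assuming that a contaminated variable follows a two-component normal mixture, (1 − pc) N(0, 1) + pc N(mean shift, SD shift²); this mixture form is an assumption about the data-generating scheme, and the function name `mixture_shape` is hypothetical.

```python
def mixture_shape(proportion, mean_shift, sd_shift):
    """Skewness and excess kurtosis of the assumed two-component mixture
    (1 - p) * N(0, 1) + p * N(mean_shift, sd_shift**2)."""
    weights = (1.0 - proportion, proportion)
    means = (0.0, mean_shift)
    variances = (1.0, sd_shift ** 2)

    grand_mean = sum(w * m for w, m in zip(weights, means))

    m2 = m3 = m4 = 0.0
    for w, m, v in zip(weights, means, variances):
        d = m - grand_mean  # offset of the component mean from the grand mean
        # Central moments of a normal component taken about the grand mean.
        m2 += w * (v + d ** 2)
        m3 += w * (d ** 3 + 3 * d * v)
        m4 += w * (d ** 4 + 6 * d ** 2 * v + 3 * v ** 2)

    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return skewness, excess_kurtosis


skew_08, kurt_08 = mixture_shape(0.08, 3.0, 3.0)
skew_15, kurt_15 = mixture_shape(0.15, 3.0, 3.0)
print(f"pc = .08: skewness = {skew_08:.2f}, excess kurtosis = {kurt_08:.2f}")
print(f"pc = .15: skewness = {skew_15:.2f}, excess kurtosis = {kurt_15:.2f}")
```

With a mean shift of 3 and an SD shift of 3, the theoretical excess kurtosis under this assumed mixture is roughly 8.3 at a proportion of .08 and roughly 5.8 at .15, which is broadly comparable to, though not identical with, the sample-based values in Table 8 and shows the same reversal.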

Table 8.

Demonstration of Effects of Outliers on Kurtosis and Skewness

		Kurtosis
		Mean
		0.0			1.5			3.0
		SD			SD			SD
		1.0	1.5	3.0	1.0	1.5	3.0	1.0	1.5	3.0
Pc	0.01	−0.02	0.01	1.31	0.01	0.18	2.06	0.61	1.13	4.51
	0.08	0.01	0.37	5.89	0.17	1.05	6.86	1.22	2.57	8.62
	0.15	0.00	0.48	5.27	0.11	1.09	5.68	0.56	1.72	5.96
		Skewness
		Mean
		0.0			1.5			3.0
		SD			SD			SD
		1.0	1.5	3.0	1.0	1.5	3.0	1.0	1.5	3.0
Pc	0.01	0.02	0.02	0.01	0.05	0.10	0.33	0.25	0.34	0.77
	0.08	0.00	−0.01	−0.08	0.17	0.44	1.13	0.78	1.07	1.98
	0.15	0.00	−0.01	−0.03	0.22	0.58	1.24	0.77	1.09	1.90

Note. Mean = mean shift; SD = standard deviation shift; Pc = proportion of contamination.

The lower part of Table 8 shows that, as expected, skewness was inflated when the mean shifted to 1.5 and 3 and the SD shifted to 1.5 and 3. The largest skewness was 1.98, for a mean shift of 3 and an SD shift of 3 with a .08 proportion of contamination. However, the inflation of skewness was not as large as the inflation of kurtosis. Hence, the inflation in kurtosis likely drove the inflation of the Type I error rate of the chi-square (ML) sequential tests in our simulation, in which SD shift was an influential factor. This might explain why SD shift played an important role in the inflation of the number of factors in sequential $χ_{ML}^{2}$ tests.

General Discussion

Common practices for dealing with outliers in data analysis are either to (a) remove or correct them if they are errors or (b) use a robust estimator if one is uncertain about the source of the outliers or about whether outliers are present (e.g., outliers in high-dimensional data are very difficult to detect). However, it should be noted that not all outliers are typographical or data entry (or recording) errors. If outliers arise from unintentionally and unknowingly included subpopulations other than the target population, the outliers are not errors of the first kind described in Liu and Zumbo (2007) but rather, in that sense, legitimate observations that arise from a subpopulation different from the target population in a study. The example we provided earlier of the study of life satisfaction in Denmark demonstrates the subtle issues of unintentionally and unknowingly invoking an assumption of measurement universality with heterogeneous populations involved in a multicultural and globalized assessment environment (Hambleton et al., 2005).

The purpose of the present research was to investigate the effects of outliers, arising from an unintentionally and unknowingly recruited subpopulation, on decisions about the number of factors to retain in an EFA using four decision methods separately. Four important findings are summarized as follows. First, the effects of outliers did not depend on the sample size. This is an important finding because many practitioners believe that having a larger sample size makes them immune to the effects of outliers, which has been shown herein (and elsewhere) not to be the case. Second, the performance of the three PCA-based methods (K-G, MAP, and PA) was not affected by symmetric contamination, whereas the sequential $χ_{ML}^{2}$ tests were affected, with the number of factors retained being inflated. Third, for asymmetric contamination, the number of factors retained was inflated for the MAP method, deflated for the PA method, and either inflated or deflated for the K-G rule, depending on the number of variables having outliers, the proportion of contamination, and the level of mean shift, whereas mean shift did not affect sequential $χ_{ML}^{2}$ tests. Finally, the MAP and PA methods are, in general, more resistant to outliers than the K-G rule and sequential $χ_{ML}^{2}$ tests. However, it should be noted that both MAP and PA are still affected by outliers under certain conditions, so they are not fully resistant to outliers.

The present study, along with the earlier study by Liu et al. (in press), provides a broad picture of the effects of outliers on the decisions about the number of factors to retain in an EFA study. When reading extant literature or conducting an EFA study, readers should be aware that outliers in the item response distributions are likely to have a significant impact on the conclusions, either inflating or deflating the number of factors retained depending on the decision methods used, outlier sources (Liu & Zumbo, 2007), and manifestation of outliers (e.g., asymmetric or symmetric outliers) in the sample. The take-home message in this line of research, however, is still the same: researchers are strongly encouraged to check for outliers and use robust methods in their day-to-day research practice (Huber, 1981; Wilcox, 2010, in press), because not doing so may lead to misleading empirical conclusions.

The present research has high fidelity with real data situations, which provides useful information for applied researchers. However, basing the simulation on real data also brings some limitations: we mimicked only one real data situation, so we did not vary the number of variables or the number of factors, nor did we manipulate different levels of factor loadings and factor correlations. We would encourage future research to investigate these variables when examining the effects of outliers on the decision about the number of factors.

Footnotes

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

The author(s) disclosed receipt of the following support for the research, authorship, and/or publication of this article: Bruno Zumbo wishes to acknowledge support from the Social Sciences and Humanities Research Council of Canada (SSHRC) and the Canadian Institutes of Health Research (CIHR) during the preparation of this work.

Notes

References

Anscombe, F. J. (1960). Rejection of outliers. Technometrics, 2, 123-147.

Balakrishnan & Childs (2001). Outlier. In Hazewinkel (Ed.), Encyclopaedia of mathematics (online version). New York, NY: Springer-Verlag. Retrieved from http://eom.springer.de/

Barnett (1999). Nonattending respondent effects on internal consistency of self-administered surveys: A Monte Carlo simulation study. Educational and Psychological Measurement, 59, 38-46.

Barnett & Lewis (1978). Outliers in statistical data (1st ed.). New York, NY: Wiley.

Barnett & Lewis (1994). Outliers in statistical data (3rd ed.). New York, NY: Wiley.

Blair, R. C., & Higgins, J. J. (1980). The power of t and Wilcoxon statistics: A comparison. Evaluation Review, 4, 645-656.

Blischke, W. S. (1978). Mixtures of distributions. In Kruskal, W. H., & Tanur, J. M. (Eds.), International encyclopedia of statistics (Vol. 1, pp. 174-180). New York, NY: Free Press.

Boomsma (1983). On the robustness of LISREL (maximum likelihood estimation) against small sample size and nonnormality. Amsterdam, Netherlands: Sociometric Research Foundation. (Unpublished doctoral dissertation, University of Groningen, Netherlands)

Browne, M. W. (1984). Asymptotically distribution-free methods for the analysis of covariance structures. British Journal of Mathematical and Statistical Psychology, 37, 62-83.

Dixon, W. J. (1950). Analysis of extreme values. Annals of Mathematical Statistics, 21, 488-506.

Ferguson, C. J. (2009). An effect size primer: A guide for clinicians and researchers. Professional Psychology: Research and Practice, 40, 532-538.

Golub, G. H., & Van Loan, C. F. (1993). Matrix computation. Baltimore, MD: Johns Hopkins University Press.

Gorsuch, R. L. (1983). Factor analysis (2nd ed.). Hillsdale, NJ: Erlbaum.

Hambleton, R. K., Merenda, P. F., & Spielberger, C. D. (2005). Adapting educational and psychological tests for cross-cultural assessment. Mahwah, NJ: Erlbaum.

Harman, H. H. (1976). Modern factor analysis. Chicago, IL: University of Chicago Press.

Holzinger, K. J., & Swineford (1939). A study in factor analysis: The stability of a bi-factor solution (Supplementary Educational Monographs, No. 48). Chicago, IL: University of Chicago.

Hubbard & Allen, S. J. (1987). An empirical comparison of alternative methods for principal components extraction. Journal of Business Research, 15, 173-190.

Huber, P. J. (1964). Robust estimation of a location parameter. Annals of Mathematical Statistics, 35, 73-101.

Huber, P. J. (1981). Robust statistics. New York, NY: Wiley.

Kaiser, H. F., & Dickman (1962). Sample and population score matrices and sample correlation matrices from an arbitrary population correlation matrix. Psychometrika, 27, 179-182.

Liu & Zumbo, B. D. (2007). The impact of outliers on Cronbach’s coefficient alpha estimate of reliability: Visual analogue scales. Educational and Psychological Measurement, 67, 620-634.

Liu, Zumbo, B. D., & Wu, A. D. (in press). A demonstration of the impact of outliers on the decisions about the number of factors in exploratory factor analysis. Educational and Psychological Measurement.

Mosteller & Tukey, J. W. (1968). Data analysis, including statistics. In Lindzey & Aronson (Eds.), Handbook of social psychology (2nd ed., Vol. 2, pp. 80-203). Reading, MA: Addison-Wesley.

Organization for Economic Co-operation and Development. (2005). Society at a glance: OECD social indicators—2005 Edition. Retrieved from http://www.oecd.org/dataoecd/34/13/34542721.xls

Pena & Prieto, F. J. (2001). Multivariate outlier detection and robust covariance matrix estimation. Technometrics, 43, 286-310.

SPSS Inc. (2009). PASW STATISTICS 17.0 command syntax reference. Chicago, IL: Author.

Tukey, J. W. (1962). The future of data analysis. Annals of Mathematical Statistics, 3, 1-67.

Velicer (1976). Determining the number of components from the matrix of partial correlations. Psychometrika, 41, 321-327.

Watkins, D. S. (2010). Fundamentals of matrix computations (3rd ed.). Pullman, MA: Wiley.

Wilcox, R. R. (2010). Fundamentals of modern statistical methods: Substantially improving power and accuracy (2nd ed.). New York, NY: Springer.

Wilcox, R. R. (in press). Modern statistics for the social and behavioral sciences: A practical introduction. New York, NY: Chapman & Hall/CRC Press.

Yuan, K. H., Marshall, L. L., & Bentler, P. M. (2002). A unified approach to exploratory factor analysis with missing data, nonnormal data, and in the presence of outliers. Psychometrika, 67, 95-122.

Zumbo, B. D., & Jennings (2002). The robustness of validity and efficiency of the related samples t-test in the presence of outliers. Psicologica, 23, 415-450.

Zumbo, B. D., & Zimmerman, D. W. (1993). Is the selection of statistical methods governed by level of measurement? Canadian Psychology, 34, 390-400.

Zwick, W. R., & Velicer, W. F. (1986). Comparison of 5 rules for determining the number of components to retain. Psychological Bulletin, 99, 432-442.