Abstract
Latent profile analysis (LPA) is a person-centered method commonly used in organizational research to identify homogeneous subpopulations of employees within a heterogeneous population. However, in the case of nested data structures, such as employees nested in work departments, multilevel techniques are needed. Multilevel LPA (MLPA) enables adequate modeling of subpopulations in hierarchical data sets. MLPA enables investigation of variability in the proportions of Level 1 profiles across Level 2 units, and of Level 2 latent classes based on the proportions of Level 1 latent profiles and Level 1 ratings, and the extent to which covariates drawn from the different hierarchical levels of the data affect the probability of a membership of a particular profile. We demonstrate the use of MLPA by investigating job characteristics profiles based on the job-demand-control-support (JDCS) model using data from 1,958 university employees clustered in 78 work departments. The implications of the results for organizational research are discussed, together with several issues related to the potential of MLPA for wider application.
Keywords
The aim of this article is to demonstrate the use of multilevel latent profile analysis (MLPA) in the organizational sciences, an area in which it has been seldom utilized. A comprehensive description of this method is given by Henry and Muthén (2010) and Vermunt (2003, 2008, 2011; see also Finch & French, 2014), and thus, the statistical content covered is already familiar to methodological experts. For less statistically knowledgeable organizational researchers interested in applying person-centered methods in a multilevel context, this article may serve as a convenient introduction to how this can be done. To this end, we present and illustrate step by step how several types of MLPA can be applied to answer questions in organizational research. Each step is presented from a user perspective. Moreover, in our illustrative example, continuous variables rather than categorical variables are used, as they are more typically studied in organizational research.
We illustrate MLPA in the context of the well-known and widely used job-demand-control-support (JDCS) model (Karasek & Theorell, 1990). This job stress model has been widely utilized in occupational stress research (for reviews, see Häusser, Mojzisch, Niesel, & Schulz-Hardt, 2010; van der Doef & Maes, 1999), including recent longitudinal studies applying a person-centered perspective (Igic et al., 2017; Mauno, Mäkikangas, & Kinnunen, 2016). Despite decades of investigation, very little is known about the extent of the role played by individual and environmental factors in shaping JDCS profiles. Instead of estimating the quantitative amount of variance in job characteristics between work departments, as is typically done in multilevel modelling, MLPA enables qualitatively different working environments to be identified and further investigated. Consequently, from the stress theory perspective, MLPA has the potential to provide answers to various novel research questions, such as whether intergroup (e.g., team, work department, organization) variability exists in within-person job characteristics profiles, whether qualitatively different work environments can be identified, or to what extent experiences of job characteristics are explained by individual (e.g., working hours) and environmental factors (e.g., leadership, climate, resource inadequacy in terms of unit size). Such knowledge could have important practical implications, regarding, for example, whether psychosocial work conditions can best be improved by redesigning jobs, or by treating individuals, or both.
First, we introduce the use of a person-centered approach in the multilevel context in general. We then describe three types of MLPA, namely, Level 2 variations in the size of the Level 1 profiles (also known as the parametric model; Henry & Muthén, 2010); Level 2 profiles (henceforth “classes”) based on the relative frequency of the Level 1 profiles (also known as the nonparametric model; Henry & Muthén, 2010); and Level 2 classes based on Level 1 individual ratings. We then illustrate these MPLA models by investigating JDCS profiles in hierarchical data on university employees (N = 1,958) working in 78 departments.
Person-Centered Approach in the Single Versus Multilevel Context
A person-centered approach focuses on identifying subpopulations of individuals who show different patterns of values for certain criteria variables (Muthén & Muthén, 2000). In addition, the person-centered approach can also be used to identify profiles that differ from one another at the level of the relations between variables (i.e., mixture regression, factor mixture). Compared with a variable-centered approach, which applies a single model to the whole sample to estimate a single set of parameters, a person-centered approach is able to relax the assumption of homogeneity in the population through the estimation of a distinct set of parameters applicable to the various subpopulations present in the sample.
The concept of finite mixture models refers to statistical methods that probabilistically assign individuals into subpopulations (McLachlan & Peel, 2000). Latent class analysis (LCA) and latent profile analysis (LPA) are special cases of finite mixture models that aim to identify naturally occurring latent subpopulations that are homogeneous in relation to selected criteria variables (McLachlan & Peel, 2000). Although LCA is sometimes presented as a general statistical framework for mixture models, it typically refers to identifying subpopulations based on categorical variables, whereas LPA performs the same function using continuous variables (Oberski, 2016). In the present demonstration, we focus on LPA, which aims to identify latent subpopulations presenting different configurations on a set of indicators (McLachlan & Peel, 2000; Sterba, 2013). Consequently, the covariance structure of the observed variables is explained via differences in their mean values between the latent subpopulations rather than via the associations between continuous variables, as is done in variable-centered modeling.
In organizational research, LPA has previously been applied to identify, for example, different employee profiles in relation to the three symptoms of burnout, that is, exhaustion, cynicism, and reduced professional efficacy (for a review, see Mäkikangas & Kinnunen, 2016), or different mind-sets (i.e., affective, normative, and continuance) of workplace commitment (for a review, see Meyer & Morin, 2016). Because LPA assumes observations to be independent of each other, it is not a suitable method for use with hierarchical data sets where individuals are nested within groups (e.g., within units/departments/teams in an organization; Asparouhov & Muthén, 2008; Henry & Muthén, 2010). In addition to violation of the independency assumption of observations, ignoring the presence of nested data structures may lead to inaccuracy in the classification of profiles (e.g., to inaccuracy in the number of latent classes and to biased standard errors; Asparouhov & Muthén, 2008; Chen, Kwok, Luo, & Willson, 2010; Luo & Kwok, 2009; Meyers & Beretvas, 2006; Moerbeek, 2004; Park & Yu, 2016; Vermunt, 2003, 2008). Recent evidence indicates that multilevel modeling is advised in the case of nested data, even if no group-level effects are expected or evident (Bliese, Maltarich, & Hendricks, 2017).
To overcome the obstacles caused by hierarchical data structures, multilevel LPA (MLPA) is needed. The benefit of MLPA is that it enables the different levels of the data to be taken into account in modeling the phenomena of interest, and thus answer research questions which are beyond the scope of single-level LPA. In the organizational sciences, for instance, MLPA enables investigation of the role played by qualitatively different work environments. Examples of such novel questions would be, how do burnout profiles vary across work departments or organizations? Or how can work departments be clustered according to the different proportions of individual burnout profiles? In multilevel modeling, the associations between variables are modeled at two or more levels (e.g., at the level of the individual and at the level of the work department). In variable-centered multilevel modeling the associations of variables are modeled at different levels (e.g., between job characteristics and turnover intentions at different hierarchical levels of the data; Mauno, De Cuyper, Tolvanen, Kinnunen, & Mäkikangas, 2014). In contrast, MLPA explains the observed covariance structure through latent profiles and relates the information to latent profiles at different levels of the data (e.g., to predict the proportions of profiles at different levels).
MLPA can be specified in many ways. Here, we focus on three specific types of MLPA. In the first model, Level 2 variations in the size of Level 1 profiles, the variation in the relative size of single-level profiles across the higher-level units (i.e., Level 2) is modeled. In the second model, Level 2 classes based on the relative frequency of the Level 1 profiles, latent classes of higher-level units (i.e., Level 2) based on the proportions of the single-level latent profiles are investigated. In the third model, Level 2 classes based on Level 1 ratings, latent classes of higher-level units are based on the variables observed at Level 1. In the following, we use the term profile to refer to latent subpopulations at the level of individuals (Level 1) and the term class to refer to subpopulations at the level of work departments (Level 2). Next, we explain in detail the steps to follow in constructing these models.
MLPA: Steps and Equations
Modeling MLPA follows six distinguishable steps: (a) definition of the problem and selection of the variables, (b) preliminary analysis of the data, (c) model specification, (d) model estimation for 1, 2,…, K latent profile (Level 1)/class (Level 2) solutions, (e) choosing the best-fitting model, and (f) investigating the role of potential covariates.
In the first step, the definition of the problem to be solved and selection of the variables are guided by the research target and questions. MLPA, like other person-centered methods, is typically conducted in an exploratory manner, that is, a priori hypotheses for the number or nature of the latent upper-level classes present in the data may not be given (Finch & French, 2014; see also Meyer & Morin, 2016; Morin & Wang, 2016). However, predictions of the characteristics of possible subpopulations can be theoretically anchored, keeping in mind that person-centered analysis is more model- than theory-driven (for exceptions, see Finch & Bronk, 2011; Wang, 2007). Thus, instead of a theory-driven definition of the problem, the theoretical rationale for person-centered studies is typically related to the following questions: Why is there a need for the identification of possible latent subpopulations in relation to the variables of interest? Why can we expect that the identification of these profiles will improve our understanding of the phenomenon under study? Do we expect these profiles to have a practical impact on different covariates?
After defining the problem and selecting the relevant variables, the second step, preliminary examination of the data, follows. This includes, for example, checking for possible outliers (e.g., by using Mahalanobis distance), 1 evaluating the reliability of the variables, and conducting missing data analysis. In implementing MLPA, it is also necessary to calculate intraclass correlation coefficients (ICCs) to see whether and how strongly the responses given by the members of the same subpopulation are related (Bliese et al., 2017). A significant ICC(1), indicating the degree to which individuals within a designated unit (i.e., work department) are nonindependent (Bliese, 2000), is often regarded as an essential precursor for multilevel modeling. However, recent evidence indicates that multilevel models are appropriate even in the case of a low or nonsignificant ICC(1) (Bliese et al., 2017). In our demonstration, ICCs are calculated by dividing the variance between work departments with the total variance (variance within + variance between work departments), which corresponds to ICC(1).
The third step, specification of the model, needs to be well-aligned with the research question and with current knowledge on best practices. In this stage, it is decided which of the model parameters (e.g., means, variances) are to be freely estimated and which are to be fixed/constrained at different levels of the data (Asparouhov & Muthén, 2008; Vermunt, 2011). Simulation studies conducted for different mixture models (i.e., LPA, growth mixture modeling, mixture regression modeling) suggest that to achieve a less biased and proper representation of the data, the variances across profiles should be freely estimated (Diallo, Morin, & Lu, 2016; Enders & Tofighi, 2008; Kim, Lamont, et al., 2016; Peugh & Fan, 2013). However, despite this generic recommendation, it is not always possible to rely on the results of models in which both means and variances are freely estimated. The reason for this is that these models tend more frequently to converge on improper solutions, which is generally taken to indicate that the model might have been overparameterized and that a more parsimonious model should be preferred (Bauer & Curran, 2003; Chen, Bollen, Paxton, Curran, & Kirby, 2001; Diallo et al., 2016). It should be also noted that LPA relies on a conditional independence assumption (i.e., variables within latent profile are uncorrelated). Consequently, covariances within profiles should be estimated only with great caution and on the basis of a strong theoretical justification (see Meyer & Morin, 2016). In MLPA, raw data or grand-mean centering is recommended, as they both provide the same results. To aid interpretation of the results, either grand-mean centering could be applied or the variables could be standardized for the analysis. Group-mean centering, however, should not be used in MPLA because it can substantially change model interpretation (see also Bliese et al., 2017).
To conclude, model specification and tests of alternative models should be conducted after careful reasoning and described in detail. As stated earlier, we present three possible ways to specify MLPA, each of which, depending on the research question, can be implemented separately. However, two of these (Level 2 variations in the size of the Level 1 profiles and Level 2 classes based on the relative frequency of the Level 1 profiles) are based on single-level profiles, and therefore it is recommended to conduct single-level LPA before MLPA.
The model for Level 2 variations in the size of the Level 1 profiles is performed to investigate whether the proportional distributions of the single-level profiles vary across the upper-level units. The model equation with two levels in which latent profiles j = 1, 2,…, J (across individuals i in k = 1,., K variables) appear at Level 1 and the proportions of the latent profiles vary randomly across Level 2 units u is
Probability of latent class j
Mean μj and variance
The model for Level 2 classes based on the relative frequency of the Level 1 profiles investigates whether Level 2 latent classes can be identified based on the proportions of the Level 1 latent profiles. Consequently, the equation for a two-level model in which latent profiles appear in Level 1 and the proportions of the Level 1 latent profiles are regressed on the Level 2 latent classes is
Probability of latent class j
The value of
The equation for the Level 2 classes based on Level 1 ratings with two levels in which latent classes j = 1,2,…, J (across individuals i in k = 1,., K variables) appear at Level 2 is
Probability pj of latent class j is received via latent variable CBj = μj
To sum up, both models, Level 2 classes based on the relative frequency of the Level 1 profiles and Level 2 classes based on Level 1 ratings, can be used where the interest is in classifying upper-level units (in this case, work departments). The main difference between these models is that Level 2 classes based on the relative frequency of the Level 1 profiles investigates latent classes based on within-person profiles, whereas Level 2 classes based on Level 1 ratings examines upper-level classes on the basis of variables observed on the single level. Consequently, model selection depends on whether the research target is to investigate latent subpopulations simultaneously at both levels of the data, or focus solely on upper-level latent classes. In turn, in the model Level 2 variations in the size of the Level 1 profiles, upper-level latent subpopulations are not investigated. Instead, the main research question is to investigate intercontextual variability in the Level 1 profiles, that is, whether the rate of individual (i.e., single-level) experiences vary by higher-level units.
In the fourth step, the estimation of latent profiles (Level 1)/classes (Level 2) for alternative models is conducted, starting from one profile/class solution and continuing to K/B latent profile/class solutions. Because mixed normal distributions of observations are assumed, and typically data appear to be missing, the full information maximum likelihood estimator with robust standard error estimation is needed (Asparouhov & Muthén, 2008). While trying to find the maximum likelihood value, the algorithm may prematurely stop and return a suboptimal set of parameter values, that is, lead to local maxima. To overcome the local maxima problem and find the global maximum (i.e., optimal solution), it is necessary to use several sets of starting values for parameters (Asparouhov & Muthén, 2008; McLachlan & Peel, 2000; see also Li, Harring, & Macready, 2014). It should also be noted that different MLPA models may require different starting value sets.
In the fifth step, the final model is selected. Selection of the optimal number of profiles should be based on the inspection of several statistical information criteria (e.g., Akaike information criterion [AIC], consistent AIC [CAIC], Bayesian information criterion [BIC], and the sample-size-adjusted BIC [aBIC]) and tests (e.g., Lo-Mendell-Rubin likelihood ratio test [LMR] and the bootstrap likelihood-ratio test [BLRT]), the statistical adequacy of the solution (e.g., convergence and absence of negative variance estimates), and inspection of the substantive meaning and theoretical conformity of the solution (McLachlan & Peel, 2000; see also Morin, Morizot, Boudrias, & Madore, 2011). Simulation studies indicate that the BIC, adjusted BIC, consistent AIC, and BLRT tests appear to be particularly effective in choosing the model that best covers the sample’s true parameters on the single level (Henson, Reise, & Kim, 2007; Nylund, Asparouhov, & Muthén, 2007; Tein, Coxe, & Cham, 2013; Tofighi & Enders, 2008; Tolvanen, 2007; Yu & Park, 2014). For deciding the number of upper-level latent classes, the BIC has proved to be the most efficient criterion (Finch & French, 2014; Yu & Park, 2014; Zhang, Zhang, Zhang, & Jiou, 2014). Graphical presentations of alternative solutions also aid the selection of the final model. Moreover, as recommended by Meyer and Morin (2016), the final model results, based on both standardized and raw scores, are also presented graphically.
In the sixth and final step, it is possible to extend MLPA to include single- and upper-level covariates and to investigate their cross-level interaction. According to the latest knowledge, the covariates should not be incorporated into the models until after the final unconditional solution has been identified (Diallo & Lu, 2017; Diallo, Morin, & Lu, 2017; Kim, Vermunt, Bakk, Jaki, & Van Horn, 2016; Nylund-Gibson & Masyn, 2016). Furthermore, to avoid the problem of profile switching, the starts values from the final unconditional MLPA model should be used.
Having provided an overview of the steps of MLPA, we now demonstrate the method using a study example in which job characteristics profiles based on the JDCS model are analyzed with hierarchical data on university employees.
Data Used in the Demonstration
Participants and Procedure
Participants were employees from two multidisciplinary equal-sized universities located in Central Finland. The data were collected in autumn 2008 with an electronic questionnaire that was sent to the email addresses of 4,508 university employees working at least 20 hours per week. In total, 2,137 responses were received after two reminders, yielding a response rate of 48%. Of the participants, 67.1% were women, and the mean sample age was 42.9 (SD = 11.1). As expected, the sample was highly educated: 76% had a master’s degree or higher. The majority (93.8%) worked full-time and, owing to the project nature of research work, 57% had a temporary employment contract. The sample was representative of the population of university workers in these two universities in occupational grouping (i.e., teachers, researchers, administrative and technical staff). However, women and temporary workers were overrepresented among the participants compared to the population from which the sample was drawn (for sample representativeness, see Kinnunen, Mäkikangas, Mauno, Siponen, & Nätti, 2011).
Altogether, 78 work departments, covering 15 faculties and two administrative units, were represented in our data. These 78 departments served as clustering units in the multilevel analyses. Mean departmental size was 24.8 (SD = 20.1, range = 3-115) employees. Information on their unit was missing for 179 participants, and thus the number of participants (Level 1) whose data were used in the analysis was 1,958.
Measures
To assess the job characteristics of JDCS model, we used the QPS Nordic Questionnaire, which has been developed and validated by Lindström et al. (2000). All the items were rated on a 5-point scale (1 = rarely or never, 5 = very often or always). Job demands were operationalized by workload and measured with three items (e.g., “Do you have too much to do?”; Cronbach’s α = .83). Job control was also measured by three items (e.g., “Can you influence decisions that are important for your work?”; Cronbach’s α = .73). Social support from supervisors (e.g., “If needed, is your nearest superior willing to listen to your work-related problems?”; Pearson correlation = .84) and colleagues (e.g., “If needed, can you get support and help with your work from your co-workers?”; Pearson correlation = .73) were each evaluated by two items.
Steps and Results of the MPLA Demonstration
Step 1: Definition of The Problem and Selection of the Variables
The JDCS model (Karasek, 1979; Karasek & Theorell, 1990) suggests that eight types of job can be identified based on three job characteristics (job demands, job control, and social support). The combination of job demands and control is assumed to form four job types, namely, active (high job demands, high job control), passive (low job demands, low job control), high-strain (high job demands and low job control), and low-strain (low job demands and high job control) (Karasek, 1979). The level of social support further specifies each of these four job types as either isolated or collective (Karasek & Theorell, 1990).
Most of the previous studies have focused on the combinations of job demands and job control (i.e., JDC model) by using mean or median splits (for a meta-analysis, see Kivimäki et al., 2012). Therefore, knowledge of the combinations of all three job characteristics without using any cutoff values is lacking. Two studies investigating the JDC model have applied a person-centered approach (Igic et al., 2017; Mauno et al., 2016). However, both studies failed to find all the constellations posited in the JDC model. In one of these, a Finnish 2-year follow-up study, low- (80%) and high-strain (10%) profiles were identified (together with two change profiles in which the level of job control either increased or decreased over time). In the other, a Swiss 10-year longitudinal study, 57% of the employees studied belonged to the low-strain profile and 17% to the active job profile (Igic et al., 2017). Approximately a third of the employees belonged to the profile in which changes were evident in workload and job control. Based on the findings of these two studies, it seems likely that in the present cross-sectional sample of university employees two or three profiles (i.e., low-strain, high-strain, active job) will be identified. However, in the absence of previous person-centered studies utilizing the JDCS model, it is unknown how the facets of social support are likely to combine together with workload and job control. Via LPA it is possible to identify these naturally occurring latent subpopulations.
Previous studies have investigated the JDC model only at the individual level, thus neglecting the possibility that these job characteristics may also be socially shared, signaling that employees who share the same working environment may also share a similar interpretation of these job characteristics (see Mauno et al., 2014; van Yperen & Snijders, 2000). Sharing (also termed emotional contagion; Barsade, 2002) occurs because employees who share the same work context tend to interact with each other as well as share the same organizational policies and practices, implying that their thinking and behavior in relation to work and the organization tend to converge over time (see Peiro, 2001).
Instead of simply estimating the amount of variance in job characteristics between work departments, as in traditional multilevel modeling, we first investigated whether individual JDCS profiles vary across work departments. This intergroup (e.g., team, work department, organization) variability in the within-person job characteristics profiles reveals whether certain, such as high or low strain, constellations are more typical in certain work departments than others. Moreover, it also offers a possibility to identify higher-level classes based on individual JDCS experiences (based either on profiles or observed ratings). Therefore, our second MLPA aims to contribute to the stress literature by identifying and investigating the proportions of healthy versus unhealthy work environments in light of the JDCS model. Investigating upper-level latent classes should also make it possible to answer the generalization problem encountered in person-centered research, that is, the extent to which the profiles found are context-specific or more universal (see Meyer & Morin, 2016; Mäkikangas & Kinnunen, 2016).
It is well known that objective working hours are associated with increased levels of job stress (including stressors such as excessive job demands, insufficient job control, and inadequate social support; Lee, Suh, Kim, & Park, 2017). Also, long working hours at the work department level might serve as a risk factor associated with high-strain jobs and high-strain work departments. Therefore, working hours at the individual and at the department levels were investigated as covariates. Moreover, the average size of the work department was investigated as a covariate, since it is well established that interaction and emotional contagion are more prevalent in smaller than larger work units (Kelly & Barsade, 2001). Emotional contagion may have both positive and negative outcomes, in that average work department size may predict either high- or low-strain constellations.
Step 2: Preliminary Analysis of The Data
Using Mahalanobis distance and corresponding p values, the preliminary analyses revealed the presence of deviant cases. At the p < .05 probability level, 5% of the cases were expected to be statistically significant at random. In our case, the percentage of statistically significant cases was 5.9%. However, in the cases with the smallest p values, further inspection revealed that the observed values of job characteristics were highly similar, suggesting the existence of a subpopulation. Therefore, all these deviant cases were included in the analysis (see also Note 1). The preliminary analyses showed further that the reliabilities of the observed variables were acceptably high (i.e., > .73; see the Measures section). ICCs (together with descriptive information on the variables) are presented in Table 1. As shown, the ICCs were significant for each of the four job characteristics variables, thus demonstrating the existence of meaningful variation between the Level 2 units (i.e., work departments). Significant ICCs mean that as well as varying from one individual to another, perceived job characteristics also varied from one work department to another.
Means (M), Standard Deviations (SD), Intraclass Correlations (ICC), and Correlations Between Study Variables at Level 1 (below the diagonal, N = 1,958 employees) and at Level 2 (above the diagonal, N = 78 departments).
aLevel 1 variable.
*p < .05, **p < .01.
Step 3: Model Specification
We expected to find job characteristics profiles at Level 1 that would further either vary randomly across work departments or form latent classes at the work department level based on either the Level 1 profiles or observed ratings. Consequently, the model specification would correspond to the above-described MLPA models. For this demonstration, we first estimated a single-level LPA, which was further extended to form alternative MLPA models. This enables us to demonstrate the difference between single and multilevel LPAs. We estimated the single-level LPA in two alternative ways: with unequal means and variances, and with unequal means only, across profiles. According to the common practice, covariances between the observed variables were not freely estimated, but instead were fixed to zero according to the assumptions of LPA. In the model of Level 2 variations in the size of the Level 1 profiles, in addition to mean profile size, the variances associated with the variation in profile size were freely estimated at Level 2. In the model of Level 2 classes based on the relative frequency of the Level 1 profiles, the latent class variable at Level 2 was freely estimated. In addition, in this model, the Level 1 profile was regressed on the Level 2 classes, thus estimating coefficients for multinomial regression, that is, statistically significant regression coefficients associated with the Level 1 latent profile proportions for the Level 2 latent classes. In the model of Level 2 classes based on Level 1 ratings, the variances in the Level 1 and Level 2 were freely estimated and only the Level 2 mean values were allowed to differ between the latent classes.
Step 4: Model Estimation
All the analyses were performed using Mplus (Version 7.3). The model parameters were estimated using full information maximum likelihood (FIML) estimation with standard errors that are robust to nonnormality (MLR estimator in Mplus) (Muthén & Muthén, 1998–2014). The MLR estimator is recommended for use in multilevel mixture models, as it is also robust to the nesting of observations into higher units. This method allowed use of all the observations in the data to estimate the model parameters. Starting value sets of 1,000 were used. Based on 50 iterations, the 50 best-fitting models were further iterated to find the likelihood maxima. If the log-likelihood is not replicated in several starting values or if convergence problems appear, the starting value sets should be increased.
Step 5: Choosing and Interpreting the Best-Fitting Model
Single-Level LPA
The LPAs were first estimated with unequal means and variances across the profiles as recommended in the literature (Diallo et al., 2016; Enders & Tofighi, 2008; Kim, Lamont, et al., 2016; Peugh & Fan, 2013; see the appendix for Mplus script; Model 1). This analysis produced a two-profile solution. The solutions with three or more profiles did not converge. Therefore, we next estimated the more parsimonious model, that is, a model where the latent profiles differed in the mean values of job characteristics (see the appendix for Mplus script; Model 1). Figure 1 presents the elbow plot of the information criteria (BIC, aBIC, AIC, and CAIC) for the different profile solutions of the single-level LPA based solely on the mean differences between the profiles. As illustrated, between the one- and two-profile solutions, all the information criteria decreased strongly, but remained at the same level up to the six-profile solution. After a six-profile solution, the information criteria again decreased compared with the previous solutions. When the k and k-1 profile solutions were compared, the p value of the BLRT test was < .001; consequently, in all the comparisons with the k-1 profile solutions, the null hypothesis was rejected.

Elbow plot of information criteria for different Level 1 profile solutions.
Typically, the change (i.e., decrease) in ICs depicted via an elbow plot should be U-shaped, while the point after which the slope flattens out indicates the optimal number of profiles in the data (see, e.g., Morin, Meyer, McInerney, Marsh, & Ganotice, 2015; Wang, Morin, Ryan, & Liu, 2016). The shape of the elbow plot obtained here may reflect the fact that the observed variables were not strictly continuous. Inspection of the profile solutions indicated that the six-profile solution and solutions with more than six profiles contained profiles with very small memberships (i.e., 1-2% of cases). We further investigated the two- to five-solution profiles graphically. It became evident that in the two-profile solution, the profiles differed from each other qualitatively (i.e., the two profiles had distinct shapes). The three- to five-profile solutions contained similarly shaped profiles with small level differences in workload, job control and facets of social support. Thus, these profile-solutions did not reveal new job types, as depicted in the JDCS model. Also, the classification quality of the two-profile solution was high: the average latent profile probabilities for the most likely latent profile membership (AvePP) were .90 and .94. In the three- to five-profile solutions, the AvePPs were much lower, the lowest varying between .74 and .83.
The two-profile solution is shown in Figures 2 (standardized scores) and 3 (raw scores). The first profile contained 64% of the participants, and was characterized by high social support, high levels of job control and low levels of workload. Consequently, this profile was labeled “collective low-strain job.” The second profile comprised 36% of the participants, and was characterized by high levels of workload, low levels of job control and low levels of social support from both supervisor and colleagues. Thus, using the terminology of the JDCS model, this profile was labeled “isolated high-strain job” (Karasek & Theorell, 1990). Previous person-centered research has also identified these low- and high-strain profiles (Mauno et al., 2016), and facets of social support further specified them as collective and isolated, respectively.

Level 1 two-profile solution with equal variances across profiles based on standardized scores.

Level 1 two-profile solution with equal variances across profiles based on raw scores.
Simulation studies suggest that LPAs with unequal variances lead to a less biased and a more accurate representation than models with only freely estimated means (Diallo et al., 2016; Enders & Tofighi, 2008; Kim, Lamont, et al., 2016; Peugh & Fan, 2013). As the estimation with equal variances resulted in a two-profile solution, as shown above, we also interpreted a two-profile solution based on unequal variances across the profiles. According to the BIC, the model with unequal variances (19151.664) was statistically better than the model with equal variances (21210.085). The two-profile solution based on unequal variances is shown in Figure 4. The form of the profiles is the same as in the two-profile solution with equal variances across profiles, but the size of the profiles is noticeably different (64% vs. 41% and 36% vs. 59%). Below, the MLPAs are illustrated based on the two-profile solution with unequal variances across the profiles. 2

Level 1 two-profile solution with unequal variances across profiles based on raw scores.
MLPA
Of the alternative MLPA models, Level 2 variations in the size of the Level 1 profiles was first tested (see the appendix for Mplus script; Model 2). This was done by setting the starting values according to the single-level LPA and the random starting values as zero, thereby ensuring that the Level 1 profiles would remain unchanged (see Mplus script for Model 2). The results of the random effect MLPA showed that the size of the latent profile varied significantly between work departments (variance estimate = 0.322, SE = 0.162, p = .047), suggesting that the typicality of the “collective low-strain job” and “isolated high-strain job” profiles varied from one department to another. Overall, these results indicate that, depending on the university department, the proportional distributions of the two individual job characteristics profiles are different.
The model of Level 2 classes based on the relative frequency of the Level 1 profiles was then estimated (see the appendix for Mplus script; Model 3). We compared the different Level 2 latent class solutions using the BIC criterion (Finch & French, 2014; Yu & Park, 2014; Zhang et al., 2014). The corresponding values for the one- to three-class solutions were 19139.26, 19137.84, and 19473.53. Consequently, the two-class solution was deemed sufficient. These classes are presented graphically in Figure 5. The results showed that the first Level 2 latent class included departments in which the great majority of employees were characterized by the profile “isolated high-strain job” (8% of all work departments); these departments were thus labeled “isolated high-strain departments.” The second Level 2 latent class contained work departments which had relatively equal proportions of “isolated high-strain job” and “collective low-strain job” profiles (92% of all work departments); the Latent class 2 was thus labeled as “mixed strain departments.” In this class solution (two Level 1 profiles, two Level 2 latent classes), the multinomial regression coefficient obtained from regressing the Level 1 latent profiles on the Level 2 latent classes was 2.314 (SE = 0.815), p = .002. This regression coefficient indicates that in Level 2 latent class 1, the probability for the “isolated high strain job” profile is .936 and for the “collective low strain job” profile .064. In Level 2 latent class 2, the corresponding probabilities are .590 and .410.

Level 2 classes based on the relative frequency of the Level 1 profiles.
Finally, a model of Level 2 classes based on Level 1 ratings was estimated (see the appendix for Mplus script; Model 4). In the one-class solution, the BIC was value 22028.68 and in the two-class solution 22037.56. Thus, it seems that no Level 2 latent classes were present in the data when the MLPA was based on Level 1 observations.
Step 6: Investigating the Role of Potential Covariates
In this step, the effects of the covariates and their cross-level interactions were tested in the two of the MLPA models, namely Level 2 variations in the size of the Level 1 profiles and Level 2 classes based on the relative frequency of the Level 1 profiles. As no latent classes were evident in the model of Level 2 classes based on Level 1 ratings, the role of covariates in this model were not investigated. In the tested models, working hours (Level 1), the size of the department, and average departmental working hours (Level 2) were used as covariates (see the appendix for Mplus script; Model 5 and Model 6). To avoid the problem of profile switching, the starting values of the model without covariates were used to ensure that the nature of the Level 1 profiles remained unchanged. The random starts function is only needed for the identification of the profiles themselves, and no longer needed thereafter.
First, covariates were added to the model of Level 2 variations in the size of the Level 1 profiles. The results showed that working hours at Level 1 (standardized estimate = 0.042, SE = 0.007, p < .001) predicted the likelihood of membership of the profile “isolated high-strain job.” Consequently, the longer the hours worked weekly by the individual, the more likely he/she was to be characterized by the “isolated high-strain job” profile. At Level 2, neither the size of the department (standardized estimate = –0.002, SE = 0.003, p = .403) nor average working hours (standardized estimate = –0.049, SE = 0.047, p = .303) predicted the random mean variation. No cross-level interactions were evident for working hours.
In the model of Level 2 classes based on the relative frequency of the Level 1 profiles, the results showed that working hours at Level 1 (standardized estimate = 0.040, SE = 0.007, p < .001) predicted the likelihood of the Level 1 profile “isolated high-strain job.” Neither of the Level 2 covariates predicted membership of the Level 2 latent classes (standardized estimate = 0.045, SE = 0.037, p = .225 for department size; standardized estimate = 0.075, SE = 0.173, p = .666 for average working hours). No evidence was found for cross-level interactions.
Discussion
This study presented and illustrated the use of MLPA, a method that offers an interesting extension of single-level LPA as it allows estimation of the distribution of Level 1 profiles across Level 2 units and the estimation of Level 2 classes based on either single-level profiles or variables. These methods have been successfully applied in various disciplines. For instance, they have been used to analyze categorical variables in the school context in the educational sciences (e.g., Allison, Adlaf, Irving, Schoueri-Mychasiw, & Rehm, 2016), and regional health behaviors in the health sciences (e.g., Erickson, Lenk, Toomey, Nelson, & Jones-Webb, 2016), and even for clustering scientific disciplines according to research output profiles (Mutz, Bornmann, & Daniel, 2013). However, in the organizational sciences the use of multilevel mixture modeling continues to be uncommon, despite data sets that are often hierarchical; for example, employees are nested in work units and teams are nested in organizations. Moreover, in the organizational sciences the variables analyzed are typically continuous rather than categorical, and thus MLPA is a highly advantageous modeling technique in this field of inquiry.
The consequences of not taking the nested structure of the data into account, such as inaccuracy in parameter estimation and biased standard errors due to using the Level 1 model alone, have been widely discussed and demonstrated (Asparouhov & Muthén, 2008; Chen et al., 2010; Luo & Kwok, 2009; Meyers & Beretvas, 2006; Moerbeek, 2004; Park & Yu, 2016; Vermunt, 2003, 2008). Thus, it is essential to recognize whether the data have been collected in a nested context, and to use multilevel modeling even in the case of low ICCs (Bliese et al., 2017). Conversely, aside from avoiding inaccuracy in profile estimation, the existence of a higher-level nested structure offers interesting novel analytical possibilities, as this study has demonstrated.
First, we showed in our demonstration that the distribution of profiles of individual latent job characteristics varied between university departments. That is, in some work departments the “collective low-strain job” profile was more typical, whereas in others the “isolated-high strain job” profile was more typical. Second, we were able to demonstrate in some detail how work departments were clustered according to the proportions of single-level profiles. In our data demonstration, two Level 2 latent classes were identified: “collective low-strain” work departments, composing most (92%) of the departments studied and “isolated high-strain” departments, accounting for the remaining 8%. To conclude, although job characteristics describe individual experiences of job control, demands, and support, a high-strain job profile was more typical in some work departments than others, and the identification of these departments was achieved through the application of MLPA.
Overall, our results indicate that the sample of a university employees studied typically belonged to a high-strain job profile. A novel finding is that the facets of social support did not reveal the additional job types described in the JDCS model (Karasek & Theorell, 1990). Thus, it seems that the original JDC model could be sufficient to characterize and differentiate types of jobs (Karasek, 1979), at least among highly educated university employees. Moreover, and perhaps more interesting, through the application of a multilevel mixture approach, we were able to identify latent classes of “healthy” and “unhealthy” work departments. The unhealthy “isolated high-strain” departments contained nearly all the employees with a high-strain job. This suggests that psychosocial job characteristics are not only individually experienced but may also be linked to collective stress and thus shared with colleagues.
Implications and Recommendations
MLPA importantly extends latent profile identification beyond the level of the individual employee, thereby enlarging the applicability of a person-centered methodology. Moreover, MLPA also enriches the outcomes of multilevel modeling. For example, in our case example, rather than simply estimating the amount of variance in job characteristics between work departments, it was possible to identify qualitatively different work departments. No hypotheses were set for the Level 2 latent classes, and thus they were identified in a purely data-driven way. In the context of multilevel person-centered research it would be advisable to use and combine both individual and organizational theories. Use of multiple theoretical frameworks also enables the possibility to conduct person-centered studies in a theory-driven way, as demonstrated by Wang (2007). Multilevel mixture modeling also has the potential to answer the generalization problem of person-centered research (see Meyer & Morin, 2016; Mäkikangas & Kinnunen, 2016), that is, whether the profiles found are unit-specific or more universal. Via the identification of Level 2 latent classes, it is possible to discover which profile type is predominant (versus underrepresented) across the higher-level units.
Besides offering theoretically innovative and interesting novel research targets, multilevel mixture modeling also provides plenty of modeling extension possibilities. In our MLPA demonstration, it was assumed that the Level 1 profiles are the same across the work departments, and vary only in their relative proportions. However, it is recommended that the Level 1 profile solution be verified when estimating the alternative MLPA models. In the longitudinal context, multilevel mixture models can be used to investigate Level 1 and Level 2 growth trajectories simultaneously (Chen et al., 2010). It is also worth noticing that all the multilevel mixture models can be based on latent factors instead of the sum scores used in this demonstration. However, too much complexity in the model easily induces convergence problems, and hence it might be reasonable to start the modeling procedure with a simpler model and increase the level of complexity within the boundaries of the tested model and sample size.
Overall, multilevel mixture models are highly recommended for analyzing nested data sets where the research interest is in studying latent classes in within-person profiles or their intercontextual variability (see also Henry & Muthén, 2010; Vermunt, 2003, 2008, 2011). One challenge in LPA—as in other types of mixture modeling—is to find the best fit solution, that is, the one showing the highest likelihood. It is typical of this kind of analysis that the iteration process stops at the local maximum likelihood value. Therefore, given the aim of finding the correct number of latent profiles, the use of several random sets of starting values is recommended (see also Meyer & Morin, 2016).
Although MLPA models offer indisputable benefits in modeling latent subpopulations in nested data sets, they also have some limitations. First, as models of these kinds are relatively complex and computationally heavy, they require a large sample size at both levels: a large number of observations at Level 1 and a large number of Level 2 clusters. Large sample sizes are especially required when the latent subpopulations are multifaceted and less distinct (Park & Yu, 2017). For example, the present data included 78 departments, of which only 8% were in the “isolated high-strain” class. An insufficient sample size in mixture modeling can lead to convergence issues, improper solutions, and the inability to identify small but meaningful subpopulations (Berlin, Williams, & Parra, 2014). However, a sufficient sample size is difficult to determine, as it depends, for example, on the size of the model, the distribution of the variables, the amount of missing data, the reliability of the variables, and the strength of the relations among the variables (Muthén & Muthén, 2002). Consequently, further statistical research is needed on the calculation of statistical power and adequate sample size for different MLPA models.
Second, a simplification of the model may be needed, such as in the case of a large number of Level 1 latent profiles whose random means correlate highly at Level 2. In this case, using a common factor to model the different random means at Level 2 may be a solution, as recommended by Henry and Muthén (2010). Third, modeling the Level 2 latent subpopulations and investigating the effects of the Level 2 covariates requires careful planning and theoretical reasoning prior to the data collection. For example, in the present demonstration, the Level 2 covariates were nonsignificant and their number was rather limited.
In sum, multilevel mixture modeling offers a fruitful approach to the analysis of hierarchical data in person-centered research, which is nowadays more commonly undertaken in the organizational sciences. Given the importance attributed to understanding human behavior in organizational settings, this method allows latent classes on different hierarchical levels (e.g., individual, team, and organization) in organizations to be identified at the same time, thereby also disentangling the diverse effects of covariates expressed on the levels of interest. Via studying the effects of various contextual factors on lower- and upper-level classes, this method has potential for broader application in organizational research. Owing to their suitability for studying, in particular, the effects of contextual-level predictors on individual typologies of behavior, we believe that these advanced statistical methods will encourage the formulation of new research questions, lead to new theorizing, and eventually also lead to new answers to issues pertaining to the employee-organization relationship.
Footnotes
Appendix
Acknowledgments
We wish to thank Alexandre J. S. Morin for sharing his expertise and consulting with us on the statistical analyses used in this study.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was supported by grants from the Academy of Finland to Anne Mäkikangas (Grant 258882), Ulla Kinnunen (Grant 124268), and Saija Mauno (Grant 124360).
