Abstract
The Holistic Wellness Assessment (HWA) is a relatively new assessment instrument based on an emergent transdisciplinary model of wellness. This study validated the factor structure identified via exploratory factor analysis (EFA), assessed test–retest reliability, and investigated concurrent validity of the HWA in three separate samples. The hypothesized eight-factor structure was validated via confirmatory factor analysis (CFA), individually for each factor and overall in a multifactor analysis. Test–retest reliability estimates over a 1- to 3-week interval were appropriate for this assessment type. Concurrent validity estimates indicated that the HWA measures were similar to, but not redundant with, wellness constructs found in other wellness instruments, specifically the TestWell® and Wellness Evaluation of Lifestyle, Version S (WEL-S). As young adults are exposed to a broader base of wellness in educational and related contexts, the use of wellness assessments such as the HWA can identify areas of personal need for balance and healthy choice making.
Wellness has emerged from counseling and health sciences as a multidimensional concept that has promise for individual and collective life enhancement. The developmental understanding of the components of wellness has been influenced by medicine, education, sociology, ecology, biology, economics, counseling/psychology, and spirituality (Ardell, 1977; Clark, 1996; Dunn, 1961; Edlin, 1988; Hettler, 1984; Lafferty, 1979; Larson, 1999; Myers & Sweeney, 2008; Renger et al., 2000; Scott, Mayol, & Schreiber, 2014). Roscoe (2009) provided a summary of the construct space of wellness and distilled the various models of wellness into seven common categories: social, emotional, physical, intellectual, spiritual, occupational, and environmental.
The Holistic Wellness Assessment (HWA; Brown & Applegate, 2012, 2013; Brown, Applegate, & Yildiz, 2014) is based on an emergent integrative system and a transdisciplinary model of wellness. Previous research (Brown & Applegate, 2012) utilized exploratory factor analysis (EFA) of a large, young adult sample to explore the dimensionality of the HWA. Results supported eight wellness dimensions: Self-Regard, with 36 items, measures the relationship with self; Self-Awareness and Responsibility, with 36 items, measures how you relate to the world around you; Sustainability, with 30 items, measures the mutual relationship between you and the earth; Relational, with 15 items, measures the perceived effect others have on you; Risk Prevention, also with 15 items, measures how you keep yourself safe; Spirituality, with 7 items, measures one’s spiritual and religious beliefs and practices; Physical, with 15 items, measures one’s exercise and nutritional choices; and, finally, Health Maintenance, with just 5 items, measures one’s relationship with a health care provider, the frequency of maintaining regular medical checkups, and knowledge of one’s family medical history. Previous oblique rotation from the EFA revealed meaningful but moderate interfactor correlations among the eight factors, ranging from −.09 between Sustainability and Relational to .55 between the Self-Regard factor and the Self-Awareness and Responsibility factor (Brown & Applegate, 2012). Five of these factors align with dimensions identified by Roscoe (2009), while three (Self-Regard, Self-Awareness and Responsibility, and Sustainability) represent an integration of several wellness constructs with the inclusion of financial items.
Historically and currently, there are several wellness instruments targeting young adult populations (Rachele, Washington, Cuddihy, Barwais, & McPhail, 2013) from which researchers and practitioners can choose. However, these researchers also point out a consistent limitation in this literature: a noted lack of validation evidence. Factor analysis is a common analytical strategy for both establishing and testing the validity of psychological assessments. However, it is well known that the results of a factor analysis are a function of the set of indicators included in the analysis, such that the only way to ensure a complete understanding of a multidimensional construct space is to examine a comprehensive set of indicators that covers the widest item sampling of the probable underlying domains. This explains why different wellness instruments often factor differently.
The objectives of this study were threefold and were examined in three separate samples. The first objective was to investigate the factor structure of the HWA presented by Brown and Applegate (2012), which was derived from an EFA, using confirmatory factor analysis (CFA) in a model evaluation/building framework rather than a strictly confirmatory (SC) framework (Jöreskog & Sörbom, 1993). Moreover, due to the length of the HWA (191 items), item pruning was considered whenever item overlap was detected within the CFA residual analyses. The second sample was used to examine the test–retest reliability of the HWA. The third sample was used to examine the concurrent validity of the HWA against two prominent wellness assessments, TestWell® (National Wellness Institute, 1992) and WEL-S (Hattie, Myers, & Sweeney, 2004).
Method
Participants and Procedure
Following the approval of the research protocol from the Human Subjects Institutional Review Board, three different independent samples were collected, more or less concurrently, from a large Midwestern regional university. The first sample (N1 = 1,092) was used to examine and refine the factorial structure of the HWA; the second sample (N2 = 60) was collected to estimate the test–retest reliability of the HWA; and the third sample (N3 = 66) was used to examine the convergent construct validity of the HWA.
Sample 1
The first sample recruited participants from multiple sections of courses in Holistic Health, Alcohol and Drug Abuse, and Health and Wellness. Participants were directed to an online portal, Survey Suite©, where they anonymously completed the 191-item HWA, plus nine demographic questions. The HWA was available to the students for up to 4 weeks, after which the data were downloaded for analysis. A total of 1,160 students volunteered to participate in the research by initiating the HWA assessment, and 1,092 of these 1,160 respondents completed the full online assessment.
Sample 2
The second study was undertaken to estimate test–retest reliability of the 191-item HWA. Sixty students were recruited from four different Holistic Health courses at the same university as Sample 1, but from a different semester. Due to the necessity to link participant responses across two assessments, participants completed a paper-and-pencil version of the HWA, where their responses were recorded on a ScanTron© response sheet that was electronically scored during class. The first author attended each classroom session to solicit participation and hand out materials for the study, which included ScanTron© sheets and booklets with the assessment questions. After the students responded to the assessment questions on their ScanTron© sheets, the researcher returned to the classroom and collected the completed ScanTron© sheets. The same administration protocol was used for the second (retest) assessment. The test–retest timeline ranged between 1 and 3 weeks after the initial test was administered. Retest intervals differed due to the instructional needs of each course instructor. All ScanTron© sheets were collected and processed at the university scanning center.
Sample 3
College students were recruited from a single section of a Holistic Health course from the same university as Samples 1 and 2, but from different semesters. The first author attended class, explained the study, and solicited participation for the research project. For students who consented to participate, the study involved the concurrent completion of the HWA, TestWell®, and WEL-S assessments in an online environment, following the same general protocol described in Sample 1. Due to programming limitations, participants were required to complete the three assessments (HWA, TestWell®, and WEL-S) through two separate Internet links. One link included both the HWA and WEL-S assessments, which were counterbalanced for administration order across sequential portal accesses; a second link was used to complete the TestWell®. All counterbalancing was done to minimize assessment order effects.
Table 1 presents general demographics for each sample. As illustrated, there are differences among the three samples. Specifically, for gender, Samples 1 and 3 are statistically similar to each other, χ2(1, N = 1154), p = .4427, but differ from Sample 2, χ2(2, N = 1212), p < .0001. Regarding ethnicity, all three samples differ from each other, χ2(6, N = 1213), p < .0001, and for age grouping, Sample 2 differs from Sample 1, χ2(2, N = 1153), p = .0001, and Sample 3, χ2(2, N = 127), p = .0001, but Samples 1 and 3 do not differ from each other, χ2(2, N = 1158), p = .5339. While these three samples do differ in demographic profile, all were drawn from undergraduate courses in a holistic health curriculum.
Participant Demographics (%) in the Three Samples.
Instrumentation
The HWA is a 191-item instrument that assesses eight factors of wellness as described in Brown and Applegate (2012). Instrument examination can be requested via email to the Rinehart Institute (www.rinehartinstitute.com). Participants responded to HWA items on a 6-point frequency (magnitude) scale ranging from “no/never” to “yes/always.” Time to complete the HWA typically ranged between 20 and 30 min.
The TestWell® (National Wellness Institute, 1992) is a 100-item instrument that measures 10 constructs on a 5-point response scale (never or almost never, occasionally, often, very often, always or almost always). The TestWell® for college students measures 10 dimensions of wellness: Physical Fitness and Nutrition, Medical Self-Care, Emotional Management, Intellectual Wellness, Occupational Wellness, Spirituality and Values, Safety, Environmental Wellness, Social Awareness, and Sexuality and Emotional Awareness. Owen (1999) investigated the TestWell® reliability and validity in a graduate-student population. Based on 185 graduate-student responses, reliability for the TestWell® was computed by using coefficient alpha. Coefficient alpha for the full TestWell® scale was .92, while the split-half reliability was .87. Reliabilities for the individual dimensions ranged from .58 for Safety to .85 for Occupational Wellness. Correlations between the total score of the TestWell® and the subscale scores ranged from .44 to .72.
The WEL-S is a 134-item inventory (5-point scale: strongly disagree to strongly agree) that measures five life tasks associated with the Adlerian counseling conceptualization of wellness: Spirituality, Self Regulation, Work/leisure, Friendship, and Love, as described in the Wheel of Wellness (Myers, Sweeney, & Witmer, 2000; Witmer & Sweeney, 1992). Only 123 items are scored; there are 11 distracter items that are not included in any scale or subscale scoring. The reliability of the WEL-S scales ranges between .61 for Leisure and .89 for Love, and for Total Wellness the alpha coefficient was .84. Validity of the WEL-S (Hattie et al., 2004) was investigated by correlating it with the TestWell®. Correlations ranged between .31 and .74; the total wellness scale score of the WEL-S correlated .77 with the composite score of the TestWell®.
Data Preparation
For the CFA (Sample 1), item-level data from 1,092 participants’ assessments were downloaded from the survey hosting website for analysis. Data screening indicated only a minimal amount of missing data at the item level. Complete item data were obtained from 544 respondents (50%), with 548 HWA protocols containing one or more missing item responses. Of these 548 protocols, 546 contained less than 10% missing item data, with the two remaining protocols missing 18% and 20% of the item data, respectively. Data were analyzed to determine the nature of missingness (McKnight, McKnight, Sidani, & Figueredo, 2007; Rubin, 1976), identified as missing completely at random (MCAR), missing at random (MAR), or not missing at random (NMAR). Evidence from these analyses supported the conclusion that the data minimally met the definition for MAR, and thus missing item data were imputed (SAS, PROC MI). Three imputed data sets were generated from a single Markov Chain Monte Carlo (MCMC) chain with initial starting values estimated from an expectation-maximization (EM) algorithm, from which item-level data were extracted for CFA estimation. Overall imputation efficiency was high (99.8%), suggesting that adding covariates in the imputation step would be of minimal value, and thus all CFA analyses (Mplus V6.11) are based only on the first imputed data set.
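The reported imputation efficiency can be related to the number of imputations through Rubin's relative-efficiency formula, RE = 1 / (1 + λ/m), where λ is the fraction of missing information and m is the number of imputed data sets. The following sketch assumes that this is the quantity summarized as "overall imputation efficiency"; the study reports only the 99.8% figure, not λ, so the back-solved value is illustrative. Function names are hypothetical.

```python
def relative_efficiency(fraction_missing_info: float, n_imputations: int) -> float:
    """Rubin's relative efficiency of m imputations versus infinitely many."""
    return 1.0 / (1.0 + fraction_missing_info / n_imputations)


def implied_fraction_missing_info(efficiency: float, n_imputations: int) -> float:
    """Invert the formula to recover lambda from a reported efficiency."""
    return n_imputations * (1.0 / efficiency - 1.0)


# Values taken from the text: three imputed data sets, 99.8% efficiency.
# Treating 99.8% as Rubin's relative efficiency is an assumption.
lam = implied_fraction_missing_info(0.998, 3)
print(f"implied fraction of missing information: {lam:.4f}")
print(f"round-trip efficiency: {relative_efficiency(lam, 3):.3f}")
```

Under that assumption, the implied fraction of missing information is well below 1%, which is consistent with the decision to analyze only the first imputed data set.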
In Sample 2, 60 of 66 pairs (9% attrition) were collected from participants’ ScanTron© sheets and transferred to a spreadsheet by the university scanning center. Individuals’ responses to the initial test items were compared with their responses to the retest items administered 1 to 3 weeks later.
In Sample 3, of the 100 initial participants, complete data were secured on 66 respondents and item-level data were downloaded from the respective websites for the test validity comparisons. All data files were merged into a subject by item-level data set for validity analysis. Assessment scoring was conducted in accordance with established scoring rules for the TestWell® and WEL-S, and as previously described for the HWA.
Results and Discussion
Factor Structure
Jöreskog and Sörbom (1993) argued that testing of structural equation models follows one of three approaches: SC, alternative models (AM), and model generating (MG). This study used an MG approach instead of an SC approach. The eight subscales of the HWA were separately analyzed via CFA due to the capacity of currently available computers and computation time limitations. Modification indices and content-expert review were used to examine and modify the initial unidimensional model to a model with correlated errors, which was called the modified model. Then, for each pair of items with correlated errors, a decision had to be made whether to keep both items or delete one of them.
Three models were investigated in the CFA analyses: initial (one-factor model for each subscale), modified (individual one-factor models with correlated errors), and final (one-factor model with no correlated errors, after item trimming). The initial model and modified model contained the same set of items, whereas the final model had a smaller number of items due to item trimming. The modified model (containing correlated measurement errors) and the initial model were nested; thus, a Satorra–Bentler chi-square difference test statistic was calculated to test whether the modified model resulted in a statistically significant improvement in model-data fit. Item trimming was based on the nature of the correlated errors and the theoretical alignment to the underlying factor. Following item deletion, models were no longer nested, so the final and modified models were re-estimated with maximum likelihood with robust standard errors (MLR) estimation. The MLR estimator allowed comparison of nonnested models via Akaike information criteria (AIC) and Bayesian information criteria (BIC). Table 2 presents the initial one-factor, modified, and final (trimmed) models for all eight HWA scales individually. The individual CFAs resulted in elimination of 32 (of the original 191) items among the eight HWA scales.
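The information criteria used to compare the modified and final models can be written out directly from a model's log-likelihood and number of free parameters. The sketch below uses the standard definitions of AIC and BIC, plus the sample-size adjusted BIC with N replaced by (N + 2) / 24, which is assumed here to correspond to the "BIC adjusted" column in Table 2. The log-likelihoods and parameter counts are hypothetical placeholders, not values from the study.

```python
import math


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: -2*logL + 2*k."""
    return -2.0 * log_likelihood + 2.0 * n_params


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian information criterion: -2*logL + k*ln(N)."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


def adjusted_bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Sample-size adjusted BIC: -2*logL + k*ln((N + 2) / 24)."""
    return -2.0 * log_likelihood + n_params * math.log((n_obs + 2) / 24.0)


# Hypothetical log-likelihoods and parameter counts for one subscale.
N = 1092  # Sample 1 size, from the text
candidates = {"modified": (-52000.0, 50), "final": (-45000.0, 42)}
for name, (loglik, k) in candidates.items():
    print(name, round(aic(loglik, k), 1), round(bic(loglik, k, N), 1),
          round(adjusted_bic(loglik, k, N), 1))
```

For all three criteria, the model with the smaller value is preferred.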
Improvement of Fit Between Initial and Final Models for Each HWA Subscale Individually (N = 1,092).
Note. WLSMV = weighted least squares means and variance adjusted estimation; MLR = maximum likelihood with robust standard errors estimation; RMSEA, CFI, and TLI were estimated by WLSMV; AIC, BIC, and BIC adjusted were estimated by MLR. HWA = Holistic Wellness Assessment; RMSEA = root mean square error of approximation; CI = confidence interval; CFI = comparative fit index; TLI = Tucker–Lewis index; AIC = Akaike information criteria; BIC = Bayesian information criteria; ΔSBχ2 = Satorra–Bentler chi-square difference test statistic.
p < .05.
Following item trimming, complete CFAs (weighted least squares means and variance adjusted [WLSMV] estimation) of the original and trimmed HWA instruments were estimated. CFA results, based on the initial pool of 191 items, revealed a plausible fit: χ2(17,926, N = 1092) = 42,731.54; root mean square error of approximation (RMSEA) = 0.036; 90% confidence interval (CI) = [0.035, 0.036]; comparative fit index (CFI) = 0.868; and Tucker–Lewis index (TLI) = 0.867. In the trimmed HWA instrument, after deleting 32 items, the 159-item HWA model-data fit improved on the CFI and TLI: χ2(12,374, N = 1092) = 30,824.05, RMSEA = 0.037, 90% CI = [0.036, 0.037], CFI = 0.885, and TLI = 0.883. In both analyses, no restrictions were placed on interfactor correlations (that is, they were freely estimated), and all unique variances were specified as uncorrelated. The overall fit of the full HWA model was very good based on the RMSEA criteria established by Hu and Bentler (1999). However, the CFI and TLI values fell below the level of acceptable model-data fit (0.95) described by Hu and Bentler. The reduced 159-item HWA provides support for the eight-factor structure of wellness developed in the previous EFA. Table 3 depicts the interfactor correlations in the final (trimmed) HWA instrument. As can be seen, most of the interfactor correlations are moderate, with two exceptions: Self-Regard with Self-Awareness and Responsibility, and Self-Regard with Relational. This should not be too surprising in that Self-Regard involves items measuring one’s self, whereas Self-Awareness and Responsibility involves items about how one fits into the world, and Relational involves items about self and others. Thus, all three factors measure aspects of the self with the environment or others.
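The reported degrees of freedom and RMSEA point estimates can be checked with a short calculation. The sketch below assumes a simple-structure model (each item loads on one factor, factor variances fixed to 1, all 28 interfactor correlations free, uncorrelated unique variances) and the common RMSEA point-estimate formula with N − 1 in the denominator; the exact formula used by the estimation software may differ slightly. Function names are illustrative.

```python
import math


def cfa_degrees_of_freedom(n_items: int, n_factors: int) -> int:
    """df for a simple-structure CFA with correlated factors,
    uncorrelated unique variances, and factor variances fixed to 1."""
    moments = n_items * (n_items + 1) // 2           # unique variances and covariances
    loadings = n_items                               # one loading per item
    uniquenesses = n_items                           # one unique variance per item
    factor_corrs = n_factors * (n_factors - 1) // 2  # freely estimated correlations
    return moments - (loadings + uniquenesses + factor_corrs)


def rmsea(chi_square: float, df: int, n_obs: int) -> float:
    """Point estimate: sqrt(max(chi2 - df, 0) / (df * (N - 1)))."""
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n_obs - 1)))


# Chi-square values and N = 1,092 are taken from the text.
for label, items, chi2 in [("191-item", 191, 42731.54), ("159-item", 159, 30824.05)]:
    df = cfa_degrees_of_freedom(items, n_factors=8)
    print(label, "df =", df, "RMSEA =", round(rmsea(chi2, df, 1092), 3))
```

Under these assumptions, the calculation reproduces the reported degrees of freedom (17,926 and 12,374) and RMSEA values (0.036 and 0.037).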
HWA Subscale Interfactor Correlations (N = 1,092).
Note. The number of items used for this table is based on CFA with 159 items. HWA = Holistic Wellness Assessment; RSELF = Self-Regard; SELFAWA = Self-Awareness; SUSTA = Sustainability; RELAT = Relational; RSKPV = Risk Prevention; SPRT = Spirituality; PHYS = Physical Health; HLTH = Health Maintenance; CFA = confirmatory factor analysis.
Hattie et al. (2004) conducted a validation study on the WEL-S and found a similar level of model-data fit (RMSEA = 0.04). Unfortunately, we have not been successful in locating a structural (confirmatory) study of the TestWell®. Our CFA results indicate that the original eight factors derived from an EFA have replicated in a new sample. Moreover, they can be measured with a smaller set of item indicators. The eight-factor structure of the HWA is similar to existing instruments, yet provides a different dimensional formulation of the underlying elements of wellness. We investigated the relationships between the HWA and two popular alternative wellness instruments in Sample 3.
Reliability
Internal consistency of the 159-item HWA was estimated by employing alpha (Cronbach, 1951) and omega (McDonald, 1999) coefficients. The Health Care subscale had omega and alpha coefficients of .71; the remaining reliability coefficients were .83 or higher for both omega and alpha. This indicates that the HWA scales are internally consistent.
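As an illustration of the alpha coefficient, a minimal sketch of Cronbach's formula, α = k / (k − 1) × (1 − Σ item variances / variance of total scores), is given below. The toy response matrix is hypothetical and is not HWA data; coefficient omega requires factor loadings from a fitted model and is not covered here.

```python
from statistics import variance


def cronbach_alpha(responses: list[list[float]]) -> float:
    """Coefficient alpha for a respondents-by-items score matrix."""
    n_items = len(responses[0])
    # Sample variance of each item (column) across respondents.
    item_variances = [variance([row[j] for row in responses]) for j in range(n_items)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    return (n_items / (n_items - 1)) * (1.0 - sum(item_variances) / total_variance)


# Hypothetical toy data: 5 respondents, 3 items on a 6-point scale.
toy = [
    [1, 2, 1],
    [3, 3, 2],
    [4, 4, 4],
    [5, 4, 5],
    [6, 6, 5],
]
print(round(cronbach_alpha(toy), 3))
```

Alpha approaches 1 when items covary strongly and falls toward 0 when they are unrelated.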
Test–retest reliability was estimated by computing the correlations among the HWA scales over a short 1- to 3-week interval. Given the immediate focus of the HWA items, which measure a respondent’s current state, the test–retest reliability estimates are reasonable in magnitude; longer retest intervals may be affected by changes within the respondent. Table 4 presents the test–retest reliability findings. As can be seen from this table, the short-term stability of the HWA scales is quite good, with coefficients ranging from .74 to .89, well within acceptable ranges for assessments of this type. In addition, intraclass correlations were calculated for each HWA item, but are not presented in this article; they are available from the authors upon request.
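A test–retest coefficient of this kind is the correlation between scale scores from the two administrations. The sketch below assumes a Pearson product-moment correlation, which the text does not state explicitly, and uses hypothetical scale scores rather than study data.

```python
import math


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson product-moment correlation between paired scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scale scores for six respondents at test and retest.
test_scores = [4.2, 3.1, 5.0, 2.8, 4.6, 3.9]
retest_scores = [4.0, 3.4, 4.8, 3.0, 4.5, 3.6]
print(round(pearson_r(test_scores, retest_scores), 2))
```

In practice, the same calculation would be repeated for each of the eight HWA scales using the 60 matched pairs from Sample 2.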
Internal Consistency, Test–Retest, and Coefficient Omega Reliability Estimates for the HWA Subtests (N = 60).
Note. This table is based on refined HWA that has 159 items. HWA = Holistic Wellness Assessment.
Stewart, Rowe, and LaLance (2000) reported that test–retest reliability coefficients of the TestWell® ranged from .70 to .81 over a 12-week period when it was administered to a high school sample. The shorter time interval employed in our study may partially explain the higher coefficients for the HWA. In an earlier study by Brown and Applegate (2012) utilizing a slightly different version of the HWA (items and response scale changes), the reported internal consistency reliability of the HWA ranged from .75 to well above .90. Stewart et al. also reported that internal consistency reliability for the TestWell® ranged from .67 to .89. Compared with the TestWell®, the HWA evidenced higher internal consistency and test–retest reliability coefficients.
Convergent Validity
Concurrent validity coefficients were estimated by computing correlations among the subtests of the 159-item HWA, TestWell®, and WEL-S instruments administered concurrently in Sample 3. Table 5 presents concurrent validity coefficients for the HWA and TestWell®, and Table 6 presents those for the HWA and WEL-S instruments. The relevant concurrent validity coefficients indicate that the TestWell® Physical Fitness scale shows convergent validity with the HWA Physical scale, r = .74, whereas the TestWell® Nutrition scale was related to both the HWA Sustainability scale, r = .76, and the HWA Physical scale, r = .75. The TestWell® Self-Care scale correlated with the HWA Risk Prevention scale, r = .47; the TestWell® Environmental Awareness scale correlated with the HWA Sustainability scale, r = .58; and the TestWell® Emotional Management scale correlated with the HWA Self-Regard scale, r = .50. Finally, the two Spirituality scales were moderately correlated, r = .50. There is relatively strong concurrent validity between the TestWell® and HWA Spirituality scales.
Concurrent and Internal Validity Estimates for the HWA and TestWell® (N = 66).
Note. Clear cells depict interscale correlations among the TestWell®, light gray shading cells depict concurrent validity correlations among the HWA and TestWell® scales, and dark gray shading cells depict interscale correlations within the HWA. HWA = Holistic Wellness Assessment.
Concurrent and Internal Validity Estimates for the HWA and WEL-S (N = 66).
Note. WEL-S = Wellness Evaluation of Lifestyle, Version S. Empty cells depict internal correlations among the WEL-S subscales. Gray cells depict concurrent validity correlations among the WEL-S and HWA scales. HWA = Holistic Wellness Assessment.
As can be seen from Table 6, there are numerous validity coefficients between the HWA and WEL-S that are theoretically meaningful but do not achieve a level that would indicate duplication between the constructs. For example, the validity coefficient between WEL-S Spirituality and HWA Spirituality is one of the highest in the table, r = .84, suggesting these two instruments are consistent in their measurement of this construct. Other examples of convergent validity include the correlations between the WEL-S Sense of Worth and the HWA Self-Regard, r = .57, and between the WEL-S Sense of Control and the HWA Self-Regard, r = .51, suggesting the theoretical construct of Self-Regard is broad in its conceptualization. Emotional Response in the WEL-S showed convergent validity with two HWA scales: Self-Regard, r = .60, and Relational, r = .59. Intellectual Wellness in the WEL-S was related to the HWA Self-Awareness and Responsibility scale, r = .40. The WEL-S Exercise and HWA Physical scales were moderately correlated, r = .56. Risk Prevention on the HWA and Self-Care on the WEL-S were also moderately correlated, r = .50. Stress Management in the WEL-S was correlated with the HWA scales of Self-Regard, r = .58, and Self-Awareness and Responsibility, r = .53. Finally, the strongest correlation of the WEL-S Perceived Wellness scale was with the HWA Self-Regard scale, r = .72. Taken together, the HWA shows both practical and theoretically meaningful concurrent validity with the TestWell® and WEL-S subscales.
The concurrent validity evidence for the HWA, relative to the WEL-S and TestWell®, indicated that there was convergent validity among all three assessments. As reported in the previous section, the relation of the HWA to the TestWell® ranged from .36 to .76 for relevant constructs, whereas the HWA’s relation to the WEL-S was evident in most of the constructs, with convergent validity estimates ranging from .43 to .84. These findings support the conclusion that the HWA effectively bridges the scales/constructs embodied in the WEL-S and TestWell®.
Conclusion
The HWA shows promise for potential use in clinical and educational settings, with ongoing validation needed that expands both the setting and population. The HWA is psychometrically sound; however, it is not without limitations. The samples for the study were drawn from a college-age population where the study of wellness is part of their educational experience. The three samples were drawn from different courses in different semesters. Although there was no sample aggregation, there could be systematic differences affecting the observed results. An additional limitation of this study is that we did not model alternative factor structures for the HWA; thus, there may be other solutions that fit the data equally well.
As a standalone assessment, the HWA can be a starting point for client/provider conversation related to awareness, and to the role wellness has in the client’s lifestyle and personal growth. The results of the assessment can also inform and educate the patient on ways to improve their wellness by identifying past behavior patterns related to healthy choices. By using the HWA, patients and providers can work in partnership to address and meet health care objectives, while potentially improving the relationship between the provider and the patient.
As young adults are exposed to a broader base of wellness in educational contexts, the use of the HWA can identify areas of personal need for balance and healthy choice making. The HWA can be used to identify areas of behavior where change may be needed, and can affect lifelong patterns of behavior. Educators can use the HWA as a pre-/posttest to track changes in student attitudes, beliefs, and behaviors that inform both the student and the instructor, as different wellness topics are presented. Although the HWA has desirable psychometric properties and defendable validity coefficients, the process of validation is ongoing. Moreover, the HWA and similar instruments present as structurally complex assessments, and thus there may be theoretical profit in considering higher order structural solutions in further refinement (Hattie et al., 2004).
Footnotes
Authors’ Note
Portions of this article have been presented at the American Educational Research Association annual conference in San Francisco, April 2013, and Philadelphia, April 2014.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
