Abstract
The consequences of traumatic brain injury (TBI) for health-related quality of life (HRQoL) are poorly investigated, and a TBI-specific instrument has not previously been available. The cross-cultural development of a new measure to assess HRQoL after TBI is described here. An international TBI Task Force derived a conceptual model from previous work, constructed an initial item bank of 148 items, and then reduced the item set through two successive multicenter validation studies. The first study, with eight language versions of the QOLIBRI, recruited 1528 participants with TBI, and the second, with six language versions, recruited 921 participants. The data from 795 participants from the second study who had complete Glasgow Coma Scale (GCS) and Glasgow Outcome Scale (GOS) data were used to finalize the instrument. The final version of the QOLIBRI consists of 37 items in six scales (see Appendix). Satisfaction is assessed in the areas of “Cognition,” “Self,” “Daily Life and Autonomy,” and “Social Relationships,” and feeling bothered by “Emotions” and “Physical Problems.” The QOLIBRI scales meet standard psychometric criteria (internal consistency, α = 0.75–0.89; test-retest reliability, rtt = 0.78–0.85). Test-retest reliability (rtt = 0.68–0.87) as well as internal consistency (α = 0.81–0.91) were also good in a subgroup of participants with lower cognitive performance. Although there is one strong HRQoL factor, a six-scale structure explaining additional variance was validated by exploratory and confirmatory factor analyses, and with Rasch modeling. The QOLIBRI is a new cross-culturally developed instrument for assessing HRQoL after TBI that fulfills standard psychometric criteria. It is potentially useful for clinicians and researchers conducting clinical trials, for assessing the impact of rehabilitation or other interventions, and for carrying out epidemiological surveys.
Introduction
The concept of HRQoL refers to the specific effects of health on well-being and functioning (Guyatt et al., 1989; von Steinbüchel, 1995). HRQoL represents a person's perspective on his or her subjective health condition, functioning, and well-being in the domains of physical, psychological (emotional and cognitive), social, and daily life (von Steinbüchel et al., 2005a). The person is viewed as the best expert on his or her QoL, and the measurement of this multidimensional concept is usually accomplished via self-rating; only in cases of severe cognitive impairment is observer rating preferred (Bullinger and von Steinbüchel, 2001).
Since English (1904), however, outcomes of TBI have traditionally been assessed by functional indicators, such as disability recovery (Arango-Lasprilla et al., 2007; Jennett and Bond, 1975; Wilson et al., 1998), health status (Corrigan and Bogner, 2004; Findler et al., 2001; Klonoff et al., 1986; Lippert-Grüner et al., 2007; McCarthy et al., 2006), return to work or productivity (Klonoff et al., 2006; McCrimmon and Oddy, 2006; Shames et al., 2007), psychosocial and social functioning (Arango-Lasprilla et al., 2007; Schönberger et al., 2006; Tomberg et al., 2005), and community participation (Cattelani et al., 2008; Corrigan and Bogner, 2004; Mascialino et al., 2009; Powell et al., 1998). While these outcomes are certainly related to HRQoL in TBI, they do not incorporate the subjective well-being perspective of HRQoL (von Steinbüchel et al., 2005a). The National Institutes of Health (NIH) consensus development panel on rehabilitation of persons with TBI (1999) therefore recommended the use of generic HRQoL measures.
The most frequently administered generic health status measures in TBI have been the SF-36 Health Survey (Andelic et al., 2009; Callahan et al., 2005; Findler et al., 2001; Hawthorne et al., 2009; MacKenzie et al., 2002; Ware et al., 1995, 1993), and the Sickness Impact Profile (Corrigan et al., 1998; Klonoff et al., 1986; Pagulayan et al., 2008, 2006). Both of these instruments show lower subjectively rated health in persons with TBI than in healthy persons. The measurement of life satisfaction in TBI—a component of HRQoL—has been most commonly performed with the Satisfaction with Life Scale (Diener et al., 1985), which shows improvements in satisfaction with life after treatment (Dahlberg et al., 2007), and an association of life satisfaction with healthy and productive lifestyles (Pierce and Hanks, 2006). Importantly, these generic HRQoL instruments allow for general comparative studies between different populations.
In comparison, disease- or condition-specific HRQoL instruments are assumed to be more sensitive to specific health conditions, and therefore allow the collection of more focused and precise information. It is only recently that HRQoL has been recognized as a potentially important outcome variable in TBI (Neugebauer et al., 2002). In TBI, dimensions such as cognition, self-perception, and self-esteem are likely to be particularly important (von Steinbüchel et al., 2005a). Although the negative impact of TBI on HRQoL has been reported as described above, positive factors, such as adaptation to a new life after TBI, have not been evaluated (Dijkers, 2004). This emphasizes the need for complementary assessment of positive aspects of HRQoL in persons after TBI (von Steinbüchel et al., 2005b).
Riemsma and associates (2001) and von Steinbüchel and colleagues (2005b) pointed out potentially important validity problems with generic health status and HRQoL instruments when administered to patients with cognitive impairment. That is, in subgroups of persons after TBI this assessment may be invalid, or at least it has unknown validity and reliability, as the patients may differ in the domains (such as cognition, self-perception, and daily life activities) important to them. Critically, in severely impaired persons, awareness of cognitive and other deficits may be reduced. Measuring a construct such as HRQoL in persons with cognitive deficits via the cognitive domain (self-rated questionnaires) represents a major methodological challenge. One solution is to develop an appropriate patient-based disease-specific questionnaire, and to check whether it has satisfactory psychometric properties in more severely impaired individuals.
As a result of the recommendations of the Trauma Consensus Group (TCG; Neugebauer et al., 2002), a new TBI-specific HRQoL measure, the Quality of Life after Brain Injury instrument in TBI (QOLIBRI) was developed (Fig. 1). This article reports the construction and validation of the questionnaire. Additional background on the development of the QOLIBRI can be found in articles by von Steinbüchel and associates (2005a, 2005b) and Truelle and colleagues (2008). Aspects of the validity of the QOLIBRI are described in the other article in this issue by von Steinbüchel and colleagues (2010). The QOLIBRI shows systematic relationships with the GOSE, the SF-36, and the HADS, and with other clinical variables that indicate good validity.

Fig. 1. Stages in the development of the QOLIBRI (HRQoL, health-related quality of life; TBI, traumatic brain injury; QOLIBRI, Quality of Life After Brain Injury scale). ¹Bullinger et al., 2002. ²von Steinbüchel et al., 2005. ³Truelle et al., 2008.
Methods
Initial development of the QOLIBRI
The conceptual model for the QOLIBRI was developed on the basis of a TBI literature review and consensus meetings of an international consortium (later referred to as “the QOLIBRI Task Force”; hereafter the Group). The Group reviewed measures that could be used to assess subjective experience after TBI for appropriateness, relevance, applicability, and psychometric quality. The following instruments were selected: the QOLBI (Quality of Life of the Traumatic Brain Injured; Tazopoulou et al., 2005), the Profile de la Qualité de la Vie Subjective (Gerin et al., 1989), the BICRO-39 (Brain Injury Community Rehabilitation Outcome Scale; Powell et al., 1998), and the EBIQ (European Brain Injury Questionnaire; Teasdale et al., 1997). The items were pooled into an item bank of 148 items and translated into English if necessary. The initial review revealed substantial commonality in the areas and items covered by the different instruments. Fifteen QOLIBRI Task Force members, including neurosurgeons, neurologists, neuropsychologists, psychologists, and other health care professionals working in neuro-rehabilitation, were involved in the process of selecting 56 of these items (for details see von Steinbüchel et al., 2005b), and allocated them to seven HRQoL domains: physical condition, thinking activities, feelings and emotions, functioning in daily life, relationships and social/leisure activities, current situation, and future prospects. The items were formulated either as satisfaction items (“How satisfied are you with …”) or as bothered items (“How bothered are you with …”). The items were self-rated on a five-point scale (“Not at all/Slightly/Moderately/Quite/Very”). Additional items and open-ended questions were included to assess the relevance of the items to the participants.
The item bank was translated into Danish, Dutch, Finnish, French, German, Italian, and Spanish, using linguistic validation guidelines (Acquadro et al., 1996). Guidelines for cognitive debriefing and language harmonization were followed (von Steinbüchel et al., 2002). The draft instrument was subsequently administered to 1528 persons after TBI across the centers collaborating in the QOLIBRI Task Force. The participants were recruited predominantly as convenience samples in a cross-sectional study.
The aim was to enroll 250 patients per language. Inclusion and exclusion criteria were the same as in the final validation study (see below). Four languages reached this goal (Finnish, French, German, and Italian). Psychometric testing of the items was performed on these cohorts with classical and modern test theory methods. The other languages underwent confirmatory analyses. Following the analyses of 1050 cases with the Glasgow Outcome Scale–Extended (GOSE; Wilson et al., 1998) and the Glasgow Coma Scale (GCS; Teasdale and Jennett, 1974), and subsequent item reduction and refinement of scores, an interim version of the QOLIBRI instrument with 49 items was constructed. We considered it essential to retest, revalidate, and finalize this preliminary instrument in clinical settings.
Development and validation of the final QOLIBRI
Psychometric properties of the preliminary 49-item QOLIBRI version were investigated in a cross-sectional sample designed to cover a heterogeneous range of head injury outcomes. Nine countries with six languages participated in the second validation study: Australia (n = 64), Belgium (n = 33), Finland (n = 171), France (n = 135), Germany (n = 172), Italy (n = 150), the Netherlands (n = 118), the U.K. (n = 41), and the USA (n = 25). Most countries/centers recruited convenience samples from rehabilitation facilities. Australia and Germany randomly sampled outpatients of hospital registries as participants. The inclusion criteria were: a minimum age of 15 years at time of injury, time since trauma between 3 months and 15 years, diagnosis of TBI according to ICD-10, and informed consent. Exclusion criteria were: a recorded GOSE score < 3, a spinal cord injury, the presence of a significant current or pre-injury psychiatric condition or ongoing severe addiction, a diagnosed terminal illness, and inability to understand, cooperate, and answer questions in the respective language. Ethics clearance was obtained by each of the participating centers.
In this second study we aimed to recruit 120–150 participants per language. Patients were stratified by clinical severity of TBI as assessed by the GCS. Test-retest reliability was investigated in a subset of at least 30 participants per language by second administration of the questionnaire after a 2-week interval.
The questionnaires were administered in one of four modes: by self-report (mail), self-report (participant present at the clinic), face-to-face interview, or administration over the telephone.
Measures
The preliminary QOLIBRI version for this second validation study consisted of 49 items, 43 of which were arranged in seven scales, with the remaining six items composed as an overall scale for use as an independent screening instrument (von Steinbüchel et al., in prep.). Four scales assessed satisfaction with “Cognition” (7 items), “Emotion and self-perception” (8), “Activities of daily living (ADLs) and Autonomy” (8), and “Social relationships” (6). Three scales assessed feeling bothered by “Negative feelings” (5), “Restrictions and problems” (4), and “Physical condition” (4). Responses were recorded on a 5-point scale: “Not at all/ Slightly/ Moderately/ Quite/ Very.” The “bothered” items had an additional (filter) category: “Does not apply.” During scoring, the endorsements of this option were coded as “Not at all.” Scale scores were presented on a percentage scale (0–100).
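The scoring rules described above can be illustrated with a minimal sketch. It is not the scoring program used in the study: the names `RESPONSE_CODES` and `scale_score` are hypothetical, and the linear rescaling of the mean item code (1–5) to 0–100 is an assumption, since the text states only that scale scores were presented on a percentage scale. Any recoding of the direction of “bothered” items is left out because it is not described here.

```python
# Hypothetical numeric coding of the 5-point response scale.
RESPONSE_CODES = {
    "Not at all": 1,
    "Slightly": 2,
    "Moderately": 3,
    "Quite": 4,
    "Very": 5,
    # Filter category of the "bothered" items, coded as "Not at all" in scoring.
    "Does not apply": 1,
}


def scale_score(responses):
    """Return a 0-100 scale score from a list of response labels.

    Assumption: the mean item code (1-5) is mapped linearly onto 0-100.
    """
    codes = [RESPONSE_CODES[label] for label in responses]
    mean_code = sum(codes) / len(codes)
    return (mean_code - 1) / 4 * 100
```

Under this assumed transformation, a respondent who answers “Moderately” to every item of a scale receives a score of 50.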
Demographic characteristics of participants were collected, including country, language, gender, age, relationship status, educational attainment, labor force participation, and social participation.
Subjective health status was assessed through administration of the SF-36 Health Survey (version 1; Ware et al., 1993; one country used version 2 with subsequently transformed data; Ware and Kosinski, 1996; Ware, 2000). Two summary scales are reported, the Physical Component Summary (PCS) and the Mental Component Summary (MCS).
Current health conditions were assessed through a list of 28 common health conditions (adapted from the WHOQoL project; von Steinbüchel et al., 2006). Clinical data extracted from participants' medical records included the date of injury, cause of injury, site of major head injury, and the worst GCS score in the first 24 h, classified into minor (13–15), moderate (9–12) and severe (3–8; Rimel et al., 1982). The presence of comorbid health conditions was also recorded (epilepsy, hemiparesis, vision and hearing problems, extracerebral injuries, communication, attention and memory dysfunction, executive function, and affective and behavioral disorders), as were details of participation in rehabilitation and the use of anticonvulsive, psychotropic, and recreational drugs.
Depression and anxiety were assessed with the Hospital Anxiety and Depression Scale (HADS; Zigmond and Snaith, 1983), which was available in all languages. A score > 8 was taken to indicate probable morbidity (Olsson et al., 2005).
Cognitive status was measured in a subsample of participants using either the Mini Mental State Examination (MMSE; Folstein et al., 1975), or the Telephone Interview for Cognitive Status (TICS; Brandt et al., 1988), which was translated into the particular languages. Cut-offs on the TICS and MMSE of 32/33 and 27/28, respectively, were used to define groups with low cognitive performance and normal performance. These cut-offs have been found to be equivalent in an elderly sample, and scores in the lower range are taken as indicating borderline or impaired performance, while the upper range is regarded as normal (Brandt and Folstein, 2003). In a younger group, borderline scores are more likely to be indicative of impairment (Crum et al., 1993).
Disability recovery was assessed by the GOSE (Wilson et al., 1998). Users were trained to administer the GOSE by following the manual published by Wilson and colleagues (1998). The GOSE classification was Severe Disability (3–4), Moderate Disability (5–6), and Good Recovery (7–8).
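The classification rules given in this section (GCS severity, cognitive performance groups, and GOSE categories) can be written down compactly. The sketch below only restates the cut-offs from the text; the function names are hypothetical, and the choice to use the MMSE first when both cognitive scores are supplied is an assumption.

```python
def gcs_severity(worst_gcs_24h):
    """Classify TBI severity from the worst GCS score in the first 24 h."""
    if not 3 <= worst_gcs_24h <= 15:
        raise ValueError("GCS must be between 3 and 15")
    if worst_gcs_24h >= 13:
        return "minor"
    if worst_gcs_24h >= 9:
        return "moderate"
    return "severe"


def gose_category(gose):
    """Classify recovery from a GOSE score (scores below 3 were excluded)."""
    if not 3 <= gose <= 8:
        raise ValueError("GOSE must be between 3 and 8 for included cases")
    if gose <= 4:
        return "Severe Disability"
    if gose <= 6:
        return "Moderate Disability"
    return "Good Recovery"


def low_cognitive_performance(mmse=None, tics=None):
    """Return True if the score is at or below the cut-off (MMSE 27/28, TICS 32/33).

    Assumption: if both scores are given, the MMSE is used.
    """
    if mmse is not None:
        return mmse <= 27
    if tics is not None:
        return tics <= 32
    raise ValueError("either an MMSE or a TICS score is required")
```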
Statistical approaches
To finalize the QOLIBRI scales, metric properties were examined (a) on the item level, and (b) on the scale level, with respect to internal consistency, test-retest reliability, and factor structure.
(a) Item level
We first checked whether responses to items were distributed over the whole range. For item frequency analysis we used the endorsement index devised by the WHOQOL group (1998). Distributions were checked for frequency problems, defined as any two adjacent response categories together accounting for less than 10% of the total number of responses in at least half of the language versions. Floor and/or ceiling effects (>60% of cases at the maximum or the minimum of the scale) were also checked.
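The distribution checks described above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually run: the function names are hypothetical, responses are assumed to be coded 1–5, and “at least half of the language versions” is read as a simple count of language versions showing the problem.

```python
from collections import Counter


def adjacent_category_problem(responses, categories=(1, 2, 3, 4, 5), threshold=0.10):
    """True if any two adjacent categories together hold < threshold of responses."""
    counts = Counter(responses)
    n = len(responses)
    for low, high in zip(categories, categories[1:]):
        if (counts[low] + counts[high]) / n < threshold:
            return True
    return False


def floor_or_ceiling_effect(responses, minimum=1, maximum=5, threshold=0.60):
    """True if more than threshold of responses sit at the scale minimum or maximum."""
    counts = Counter(responses)
    n = len(responses)
    return counts[minimum] / n > threshold or counts[maximum] / n > threshold


def item_fails_endorsement(responses_by_language):
    """True if the adjacent-category problem occurs in at least half of the languages.

    responses_by_language: dict mapping a language label to a list of item responses.
    """
    flagged = sum(adjacent_category_problem(r) for r in responses_by_language.values())
    return flagged >= len(responses_by_language) / 2
```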
Skewness is common in the responses to clinical scales. For example, many people will tend to report satisfactory to good HRQoL on items in a scale, while a very few others may report extremely poor quality of life. Extreme skewness can, however, create problems for analysis using correlation, reducing both the probability that a scale will show strong relationships with other measures and the reliability (or precision of measurement) of the scale. Skewness was therefore checked (conventionally, items with skewness > 1 are considered for removal); however, some moderately skewed items (1.0–1.3) were retained to capture a range of impairment (Schmidt et al., 2006).
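As an illustration of the skewness screen, the sketch below computes the moment-based skewness coefficient and applies the thresholds mentioned in the text. The exact estimator used in the study is not stated, so the population formula used here is an assumption (statistical packages often apply a small-sample correction), as is the use of the absolute value so that skew in either direction is flagged. The function names and labels are hypothetical.

```python
def skewness(values):
    """Moment-based skewness: third central moment / (second central moment ** 1.5)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def skewness_decision(values, conventional=1.0, tolerated=1.3):
    """Label an item by the size of its skewness (direction ignored by assumption)."""
    size = abs(skewness(values))
    if size <= conventional:
        return "keep"
    if size <= tolerated:
        return "review"  # moderately skewed; may be retained on clinical grounds
    return "consider removal"
```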
(b) Scale level
The internal consistency of the scales, which reflects their reliability, was assessed using Cronbach's α, and the fit of individual items to each scale was examined by correlating the item with the total for the other items in the scale. Cronbach's α and corrected item-total correlations (CITCs) were calculated. It is conventionally accepted that CITCs should be over 0.4 (WHOQOL group, 1998). An α of 0.70 is often regarded as the lower boundary of acceptability for measures used in group comparisons (Moosbrugger and Kelava, 2007), and over 0.90 for clinical application to individuals (Bland and Altman, 1997).
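For readers who wish to reproduce these statistics, a minimal sketch of Cronbach's α and the corrected item-total correlation is given below. It uses the standard textbook formulas; the function names and the data layout (one list of scores per item, aligned by respondent) are assumptions made for the example.

```python
from statistics import mean, variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of item-score lists aligned by respondent."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def pearson_r(x, y):
    """Pearson product-moment correlation between two equally long lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def corrected_item_total(items, index):
    """Correlate one item with the sum of the other items in its scale."""
    rest = [sum(s for j, s in enumerate(row) if j != index) for row in zip(*items)]
    return pearson_r(items[index], rest)
```

An item would then be examined more closely if `corrected_item_total` falls below 0.40, and a scale if `cronbach_alpha` falls below 0.70.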
Test-retest reliability is one of the most important measures of reliability for questionnaires. The test-retest reliability of the QOLIBRI scales was assessed using the intra-class correlation coefficient (ICC), calculated between the scale means on two occasions (retested on average 14 days after initial testing). The conventional interpretation of the ICC is that values of 0.40–0.75 are fair to good, and values over 0.75 are excellent (Fleiss, 1986).
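The ICC computation can be sketched as follows. The text does not state which ICC form was used, so the two-way random-effects, absolute-agreement, single-measurement form, often written ICC(2,1), is assumed here purely for illustration. The function names are hypothetical, and the label for values below 0.40 is a placeholder because the text gives no interpretation for that range.

```python
def icc_2_1(test, retest):
    """ICC(2,1) for two administrations; lists are aligned by participant."""
    n = len(test)
    k = 2  # number of occasions
    rows = list(zip(test, retest))
    grand = (sum(test) + sum(retest)) / (n * k)
    row_means = [sum(r) / k for r in rows]
    col_means = [sum(test) / n, sum(retest) / n]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for r in rows for x in r)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def interpret_icc(icc):
    """Apply the conventional bands cited in the text (Fleiss, 1986)."""
    if icc > 0.75:
        return "excellent"
    if icc >= 0.40:
        return "fair to good"
    return "below 0.40"
```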
In addition to correlating adequately with its own scale, it is also important that an item does not correlate similarly with other scales. Statistics comparable to the Multitrait Analysis Program (MAP; Hays et al., 1988) were calculated. The MAP criterion for a definite scaling problem is a corrected item-home scale correlation that is > 2 SEs below the correlation of the item with another scale. A probable scaling problem is indicated by a corrected item-home scale correlation within 2 SEs of the correlation of the item with another scale (Hays et al., 1988).
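The two MAP criteria can be expressed as a simple decision rule. In the sketch below the standard error of a correlation is approximated as 1/√n; this approximation and the function name are assumptions made for the example, and the label “scaling success” for the remaining case follows the usage later in the text.

```python
import math


def map_scaling_check(r_home, r_other, n):
    """Compare an item's home-scale correlation with its correlation with another scale.

    r_home: corrected item-home scale correlation
    r_other: correlation of the item with another scale
    n: sample size; SE of a correlation is approximated as 1 / sqrt(n) (assumption)
    """
    se = 1 / math.sqrt(n)
    difference = r_home - r_other
    if difference < -2 * se:
        return "definite scaling problem"
    if abs(difference) <= 2 * se:
        return "probable scaling problem"
    return "scaling success"
```

With n = 795, two standard errors correspond to about 0.07, so a home-scale correlation of 0.50 against 0.48 for another scale would count as a probable scaling problem under this rule.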
Item response theory approaches are now widely advocated as a test of fundamental measurement for assessing the fit of items to scales. Rasch analysis was carried out using Winsteps 3.66. The data were examined to ensure that items were suitable for Rasch analysis. All categories had 10 or more responses, and no items in the analysis were very skewed (Bond and Fox, 2007). Furthermore the average category responses for all items were in the expected order. Winsteps produces two fit statistics: “infit” weights responses of people whose performance is close to the item value, while “outfit” is sensitive to outlying scores (Bond and Fox, 2007). Deviation of infit from expectation is generally regarded as more important for measurement than outfit deviation. For large sample sizes the mean square is preferred to the Z statistic as a measure of fit, and satisfactory fit is indicated by values between 0.7 and 1.3 (Smith et al., 2008). Values > 1.3 indicate lack of item fit with the unidimensional model, while values < 0.7 suggest items are over-fitting or redundant.
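To make the two fit statistics concrete, the sketch below computes infit and outfit mean squares for a single item from observed responses, model-expected responses, and model variances. It does not fit a Rasch model: the expected values and variances are assumed to come from Rasch software, and the function names are hypothetical. Outfit is the unweighted mean of squared standardized residuals, whereas infit weights squared residuals by the model variance.

```python
def fit_mean_squares(observed, expected, variance):
    """Return (infit, outfit) mean squares for one item across persons."""
    squared_residuals = [(o - e) ** 2 for o, e in zip(observed, expected)]
    # Outfit: mean of squared standardized residuals (sensitive to outliers).
    outfit = sum(r / v for r, v in zip(squared_residuals, variance)) / len(observed)
    # Infit: information-weighted mean square.
    infit = sum(squared_residuals) / sum(variance)
    return infit, outfit


def fit_label(mean_square, lower=0.7, upper=1.3):
    """Label a mean square using the 0.7-1.3 range given in the text."""
    if mean_square > upper:
        return "misfit"
    if mean_square < lower:
        return "overfit"
    return "satisfactory"
```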
To study whether the structure of the questionnaire (specifically the division into separate scales) was justified we used factor analysis. The dimensionality of the final version was examined using principal component analysis (PCA). PCA was performed using both a forced one-factor solution and the six-factor solution that resulted when Kaiser's criterion (eigenvalues > 1) was applied, with oblique rotation (promax method, which assumes correlated scales).
Finally we studied the structure of the questionnaire using confirmatory factor analysis (CFA; using structural equation modeling [SEM] in AMOS 8.0), an approach that allows various statistics for overall fit to be calculated. Good fit indicates that the assumed grouping of items into scales adequately reflects the empirical patterns of relationships between items. Within CFA the observed variables corresponded to the individual items of the QOLIBRI, and the latent variables to the factors that represent the six QOLIBRI subscales. The existence of substantial intercorrelations between the factors, in turn, suggested the existence of a second-order latent variable representing general HRQoL. This type of confirmatory factor analysis is common within HRQoL research, with HRQoL as a higher-order construct, conceptually reflecting the multidimensionality of the construct, and the various underlying factors (with substantial covariance) representing specific dimensions. The analysis was restricted to complete cases after imputing missing values as described below (n = 787). Maximum likelihood was applied for parameter estimation in CFA analysis. Selected indices (CFI, RMSEA, and chi-square statistics) were observed to evaluate the overall fit of the data to the final model of the QOLIBRI. Interpretation of fit statistics was carried out along the cut-off criteria presented by Hu and Bentler (1999).
Results
Descriptives
A total of 921 participants were enrolled. There were 126 cases with missing GCS, which were excluded from subsequent analysis. In the remaining 795 cases, some data were missing to varying degrees. For demographics, 100% of the data were present for gender and age, and 93% for living arrangements, employment status, and relationship status. For the clinical data, 100% were present for GCS and GOSE, 99% for major lesion location and years since injury, 98% for number of comorbid health conditions, and 92% for self-reported health situation.
There were less than 5% missing responses for single QOLIBRI items (e.g., “participation in work,” 4.4%). HADS anxiety and depression scores were present for 99% and SF-36 scores for 96% of the sample. For the QOLIBRI, means were calculated for each scale, and prorated if up to 33% of responses were missing. Missing responses were imputed per participant by substituting for the missing value the scale mean rounded to an integer. Means for the SF-36, the HADS, and the comorbid health conditions list were calculated, using prorating if up to 33% of responses were missing.
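The handling of missing item responses can be sketched as follows. Several details are assumptions made for the example: missing responses are represented by `None`, “up to 33%” is applied strictly (so 2 missing of 6 items, 33.3%, does not qualify), imputation is applied only when the prorating rule is met, and the scale mean is rounded half up. The function names are hypothetical.

```python
def prorated_scale_mean(responses, max_missing_pct=33):
    """Mean of the answered items, or None if too many responses are missing."""
    answered = [r for r in responses if r is not None]
    missing = len(responses) - len(answered)
    # Integer comparison avoids floating-point issues at the threshold.
    if not answered or missing * 100 > max_missing_pct * len(responses):
        return None
    return sum(answered) / len(answered)


def impute_missing(responses, max_missing_pct=33):
    """Replace missing responses with the participant's rounded scale mean."""
    scale_mean = prorated_scale_mean(responses, max_missing_pct)
    if scale_mean is None:
        return list(responses)
    fill = int(scale_mean + 0.5)  # round half up (assumption)
    return [fill if r is None else r for r in responses]
```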
Demographic and clinical characteristics of the final validation study (n = 795) are presented in Table 1. Percentages are given with respect to the number of participants with complete data. As is typical for a TBI sample, there are a greater number of men than women. Within the range covered (from 17–68 years), three age groups were formed (17–30 years, 31–40 years, and 45–68 years) of almost equal size. More than half of the sample was severely injured by GCS criteria, and for half the injury occurred 4 or more years previously. Less than a quarter of the sample was in full-time employment, and only half was currently in a relationship. Over half of the sample was living independently, that is, did not “need help for daily life tasks.” Over half reported four or more comorbid health conditions; in contrast only 28% described themselves as being “unhealthy” at the moment. According to the GOSE, the majority (72%) were disabled by the consequences of their TBI.
Finalizing the QOLIBRI scales and item set
The final resulting QOLIBRI instrument consists of 37 items in four satisfaction scales, “Cognition” (7 items), “Self” (7 items), “Daily Life and Autonomy” (7 items), and “Social Relationships” (6 items), and two bothered scales, “Emotions” (5 items), and “Physical Problems” (5 items).
Item characteristics
The properties of the 43 preliminary QOLIBRI items (resulting from the initial validation and excluding the overall scales items) are shown in Table 2. All items except three met the endorsement criteria (Power et al., 2005), and did so for at least 50% of the language versions. The exceptions were the two bothered items “epileptic seizures” and “problems with smelling/tasting,” and an item concerning satisfaction with the “ability to look after basic personal needs.” These three items were removed.
Notes to Table 2: Marked items were removed on distributional grounds, on internal consistency grounds, or on MAP grounds. Two interim scales were combined into one final scale, “Physical problems.” CITCs and alphas were computed with reference to the final scale structure (i.e., one “Physical problems” scale) and item number. Values in parentheses are for items that entered the analyses of internal consistency but failed the CITC criterion (0.40), and are based on interim scales different from the final scales.
CITC, corrected item-total correlation; SD, standard deviation; TBI, traumatic brain injury; MAP, Multitrait Analysis Program.
At this point, seven items were left to form a second bothered scale “Physical problems.” One item concerning “ongoing legal actions” had a CITC below 0.40 with the interim scale and was excluded. This resulted in the CITC dropping below 0.40 for another item pertaining to “restrictions with driving,” which was also removed. The resulting five-item “Physical problems” scale had satisfactory internal consistency (Cronbach's α = 0.75). Finally, a satisfaction item from the “Self” scale, satisfied with “ability to control emotions,” was excluded. Of the eight initial items in this scale, it had the lowest CITC (0.58), and MAP analysis revealed several substantial correlations > 0.40 with other scales. For the remaining 37 items reliability analysis indicated that all items had CITCs of greater than 0.40, and the majority had CITCs greater than 0.60. On the MAP analysis all items correlated more strongly with their home scale than any other scale, and scaling success for each of these scales was 100%. Two of the retained 37 items had skewness indices slightly above 1 (bothered by “feelings of loneliness,” and “problems with seeing/hearing”), but were kept because of their clinical importance, reasonable scale fits, and ability to differentiate between (strongly impaired) persons in the low HRQoL range.
Internal consistency
Internal consistency was assessed for each scale, and for each language version of the QOLIBRI (Table 3). Cronbach's α ranges from 0.75 (“Physical problems”) to 0.89 (“Cognition” and “Self”). The individual scales thus fulfill criteria for use in research studies, and the total QOLIBRI score provides a reliable assessment at the level of the individual with Cronbach's α of 0.95, ranging from 0.92 (French; n = 147) to 0.97 (English; n = 96).
MMSE, Mini Mental State Examination; TICS, Telephone Interview for Cognitive Status; QOLIBRI, Quality of Life After Brain Injury.
Individual scale scores exceed α = 0.70 for all language versions except for the “Emotions” and “Physical problems” scales of the Dutch version (α = 0.64 and α = 0.69, respectively), and the “Physical problems” scale of the French version (α = 0.64). The results indicate that the QOLIBRI scales generally have good internal consistency. Also, in a subgroup of persons with low cognitive performance (MMSE < 28 or TICS < 33; n = 84) internal consistency was comparable to persons with normal cognitive status (MMSE > 27 or TICS > 32; n = 121); in the former group, the lowest α was 0.81 for the “Physical problems” scale, for which it was 0.76 in the latter group.
Test-retest reliability
Table 4 shows ICCs in the sample of the 381 participants retested after 2 weeks (27 Dutch, 119 German, 56 English, 49 Finnish, and 126 French cases). The mean age in the group retested was 36 years (SD = 12.5), compared to 40 years (SD = 14.0) in those not retested; 52% of the retested group had severe injuries, 11% had moderate injuries, and 37% had minor injuries. For the retested sample as a whole ICCs ranged from 0.78 (“Emotions”) to 0.85 (“Physical Problems”), indicating that all scales show good test-retest reliability. The reliability of the total score was rtt = 0.91. Scores on the MMSE or TICS were available for 181 participants who were retested. Participants were divided into those with low performance (borderline or impaired; n = 84) versus those with scores in the normal range (n = 121). ICCs for these two subgroups are shown in Table 4. In general, most scales showed good reliability in both subgroups. The lowest ICCs were for the “Social relationships” and “Emotions” scales, and the values here are still consistent with good reliability.
MMSE, Mini Mental State Examination; TICS, Telephone Interview for Cognitive Status; QOLIBRI, Quality of Life After Brain Injury; CI, confidence interval; ICC, intra-class correlation.
Test-retest reliability was good for the four language versions for which sufficient numbers of retested participants were available (n ≥ 48). For the QOLIBRI total score, ICCs range from 0.87 (Finland) to 0.91 (France; Table 5). Single-scale test-retest reliability is mostly in the range of 0.75 to 0.80.
QOLIBRI, Quality of Life After Brain Injury; ICC, intra-class correlation.
QOLIBRI scale means on the first and second test are shown in Table 6; differences between the first and second assessment reached significance at the .05 level for the “Cognition” scale and the “Physical Problems” scale. When the changes were calculated as effect sizes using Cohen's d (Cohen, 1988), none exceeded 0.10, indicating very small differences.
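An effect size of this kind can be computed as in the sketch below. The exact variant of Cohen's d applied to the test-retest data is not stated, so the version shown (mean difference divided by the root mean square of the two standard deviations) is an assumption, and the function name is hypothetical.

```python
from statistics import mean, stdev


def cohens_d(first, second):
    """Standardized mean difference between two administrations.

    Assumption: the standardizer is the root mean square of the two SDs.
    """
    pooled_sd = ((stdev(first) ** 2 + stdev(second) ** 2) / 2) ** 0.5
    return (mean(second) - mean(first)) / pooled_sd
```

For example, a one-point shift in the mean of a scale with a standard deviation of 10 gives d = 0.10.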
QOLIBRI, Quality of Life After Brain Injury; SD, standard deviation.
A further approach considers the test-retest reliability of differences between QOLIBRI scale scores. This is largely unrelated to test-retest reliability per se, and it already provides support for the suggested six-factor structure of the QOLIBRI. For each pair of the six subscales, differences between subscale scores were computed for the test and the retest data. Test-retest correlations between pairwise compared scale differences ranged from 0.55 (“Self” × “Emotions”) to 0.75 (“Emotions” × “Physical”). The average test-retest correlation of QOLIBRI scale differences, computed from Fisher-Z-transformed coefficients, was 0.62, which is an extremely high value given that differences between measures cannot be as reliable as the measures themselves. Thus, in a multi-scale approach, the different scales of the QOLIBRI reliably assess information related to the different HRQoL domains; clearly, the variance beyond the g-factor is not only error variance. However, further research is needed to determine whether the QOLIBRI can be used to reliably assess HRQoL profiles at an individual level.
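The difference-score analysis can be sketched in two steps: test-retest correlations of pairwise scale differences, followed by averaging through the Fisher Z transformation. The sketch assumes Pearson correlations and a data layout in which each scale name maps to a list of scores aligned by participant; the function names are hypothetical.

```python
import math
from itertools import combinations


def pearson_r(x, y):
    """Pearson product-moment correlation between two equally long lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def difference_retest_correlations(test, retest):
    """Test-retest correlation of the score difference for every pair of scales.

    test, retest: dicts mapping scale name -> list of scores aligned by participant.
    """
    result = {}
    for a, b in combinations(sorted(test), 2):
        diff_test = [x - y for x, y in zip(test[a], test[b])]
        diff_retest = [x - y for x, y in zip(retest[a], retest[b])]
        result[(a, b)] = pearson_r(diff_test, diff_retest)
    return result


def average_correlation(correlations):
    """Average correlations via Fisher's Z: atanh, arithmetic mean, then tanh."""
    z_values = [math.atanh(r) for r in correlations]
    return math.tanh(sum(z_values) / len(z_values))
```

With six scales, `difference_retest_correlations` returns 15 pairwise coefficients, which can then be passed to `average_correlation`.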
Rasch analyses
Each scale was analyzed in turn using Rasch analysis. Items within each scale are shown in Table 7 ordered by item “difficulty.” Thus, for example, in the “Cognition” scale, participants were most likely to express satisfaction with “finding way about,” and least likely to be satisfied with “remember.” The range covered by items in each scale was fairly narrow, from 0.59 logits for the “Emotions” scale to 1.51 logits for “Social relationships,” indicating that in general items were quite similar in their likelihood of being endorsed positively. It should be borne in mind that item difficulty is calculated at the middle of the rating categories. Rasch analysis of individual QOLIBRI scales (Table 7) showed that infit was in the required range for all items in each of the scales. Rasch analysis thus confirms that items have a satisfactory fit with their home scales. Weaker items are “self-perception,” with an infit value of 0.7 suggesting a certain amount of redundancy, and “run personal finances,” with an outfit value of 1.33, which indicates misfitting outliers in the data.
Items within each scale are ordered by item “difficulty” (i.e., from most likely to least likely to be endorsed positively).
MNSQ, mean square; QOLIBRI, Quality of Life After Brain Injury; TBI, traumatic brain injury.
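To make the infit and outfit mean squares concrete, the following is a minimal sketch under simplifying assumptions. It uses the dichotomous Rasch model, whereas the QOLIBRI items are rated on several categories, and it treats person measures and item difficulty as known. The function names and inputs are hypothetical and do not reproduce the software used in the study.

```python
import math

def rasch_probability(theta, difficulty):
    """Probability of a positive response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

def item_fit(responses, thetas, difficulty):
    """Return (infit MNSQ, outfit MNSQ) for one item.

    responses: 0/1 answers, one per person
    thetas: person measures in logits, same order as responses
    """
    squared_residuals = []
    variances = []
    for x, theta in zip(responses, thetas):
        p = rasch_probability(theta, difficulty)
        squared_residuals.append((x - p) ** 2)  # squared raw residual
        variances.append(p * (1.0 - p))         # model variance of the response
    # Infit: information-weighted, so less sensitive to outlying persons.
    infit = sum(squared_residuals) / sum(variances)
    # Outfit: unweighted mean of squared standardized residuals.
    outfit = sum(r / w for r, w in zip(squared_residuals, variances)) / len(responses)
    return infit, outfit
```

Values near 1 indicate responses about as variable as the model predicts. Values well below 1 point to overly predictable (redundant) responses, and values well above 1 to unexpected responses.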
Rasch analysis was also performed with all items combined to examine whether QOLIBRI items fit a single unidimensional scale. Item difficulty measures ranged from −0.47 to 0.61 logits. PCA of the residuals showed that the Rasch model explained 38.2% of the variance, indicating that a unidimensional model explains only a moderate amount of the variance. The infit values indicated that the majority of QOLIBRI items fitted an overall Rasch dimension. However, there were five items with infit values of 1.3 or more: “partner” (infit = 1.41), “sex life” (infit = 1.30), “other injuries” (infit = 1.30), “pain” (infit = 1.31), and “seeing/hearing” (infit = 1.36). The results of this analysis give moderate support to a unidimensional model, but also indicate that some of the items in the “Social relationships” and “Physical problems” scales have a poor fit with a unidimensional model.
Exploratory factor analyses
The results of two principal components analyses (PCA) are shown in Table 8. Loadings on the first component of a single-factor solution indicate that items in the first three scales generally have a good fit (loadings > 0.6) with a unidimensional HRQoL model. Items in the last three scales have a weaker fit with this single-factor model, and two items (“partner” and “see/hear”) have a poor fit (loading < 0.45). The single-factor PCA is thus consistent with the Rasch analysis conducted on all items combined, and indicates that there is a unidimensional component to the QOLIBRI, primarily based on the items in the first three scales, which are concerned with cognitive function, self-perception, and independent living. The items from the last three scales, with the two exceptions described above, have a moderate fit with this model.
Shown are one forced single-factor solution (column 3) and one six-factor solution (eigenvalues > 1; columns 4–10). Factor loadings > .25 are shown.
QOLIBRI, Quality of Life After Brain Injury; TBI, traumatic brain injury.
The results from the second PCA, for which six factors with eigenvalues > 1 were extracted, on the other hand, nicely confirm the overall structure of the QOLIBRI: all items have the highest loadings on their home scales, and there is relatively little cross-loading > 0.25.
The PCA single-factor solution accounted for only 37% of the variance, while the six factors accounted for 59%. These scales are moderately correlated, as shown in Table 9.
QOLIBRI, Quality of Life After Brain Injury.
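The logic of extracting components with eigenvalues > 1 and comparing the variance explained by one component against the retained set can be sketched as follows. This is an illustration on simulated data, not a reanalysis. The helper names, the two-factor simulation, and all numeric settings are assumptions introduced here.

```python
import numpy as np

def kaiser_pca(data):
    """PCA on the correlation matrix of `data` (rows = respondents, columns = items)."""
    corr = np.corrcoef(data, rowvar=False)
    eigenvalues = np.linalg.eigvalsh(corr)[::-1]  # sorted largest first
    n_retained = int(np.sum(eigenvalues > 1.0))   # eigenvalue > 1 rule
    explained = eigenvalues / eigenvalues.sum()   # share of variance per component
    return eigenvalues, n_retained, explained

def simulate_two_factor_items(n_respondents=500, seed=0):
    """Hypothetical data: six items, three loading on each of two independent factors."""
    rng = np.random.default_rng(seed)
    factors = rng.standard_normal((n_respondents, 2))
    loadings = np.zeros((2, 6))
    loadings[0, :3] = 0.8
    loadings[1, 3:] = 0.8
    noise = 0.6 * rng.standard_normal((n_respondents, 6))
    return factors @ loadings + noise

data = simulate_two_factor_items()
eigenvalues, n_retained, explained = kaiser_pca(data)
print("components retained:", n_retained)
print("share explained by first component:", round(float(explained[0]), 3))
print("share explained by retained components:", round(float(explained[:n_retained].sum()), 3))
```

As in the comparison reported above, the retained multi-component solution accounts for more variance than the first component alone.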
Confirmatory factor analysis (CFA) using structural equation modeling (SEM)
Structural equation modeling (SEM) was used to confirm the structure of the QOLIBRI. The SEM model with six factors showed substantial intercorrelation of latent factors (range of r = 0.469–0.796). Thus, a second-order factor underlying the six original first-level factors was specified to account for these intercorrelations of latent variables, in keeping with conceptual considerations (general HRQoL as a higher-order construct). The final model (Fig. 2) consisted of the six latent variables (factors) on one level, and a “second-order” latent variable (a general HRQoL factor). Fit statistics indicated a reasonable fit with this model (CFI = 0.896, RMSEA = 0.055, χ2 = 2115.96, df = 623, p[χ2] < 0.001). The model meets the RMSEA criterion, but not the CFI criterion, for satisfactory fit (Hu and Bentler, 1999). Combinations of cut-off rules are more appropriate for lower sample sizes (<250), and with larger sample sizes they tend to over-reject models (see Hu and Bentler, 1999). A similar problem arises if chi-square statistics are applied to larger sample sizes. We believe RMSEA is more appropriate than CFI because it compensates for model complexity.

Model of the structure of QOLIBRI items and scales, and path coefficients as revealed by SEM analysis (QOLIBRI, Quality of Life After Brain Injury; SEM, structural equation modeling).
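The reported RMSEA can be related to the reported chi-square statistic with a short sketch. It assumes one common point-estimate formula, sqrt(max(χ2 − df, 0) / (df · (N − 1))), and a sample size of N = 795 taken from the number of participants used to finalize the instrument. Both are assumptions, since the estimator and the exact SEM sample size are not stated in this section.

```python
import math

def rmsea(chi_square, df, n):
    """Point estimate of RMSEA from a model chi-square, its df, and sample size n."""
    # max(..., 0) keeps the estimate at zero when chi-square is below its df.
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))

# Reported fit statistics; n = 795 is an assumption (see lead-in).
print(round(rmsea(2115.96, 623, 795), 3))
```

Under these assumptions the computed value agrees with the reported RMSEA of 0.055. Dividing by the degrees of freedom is what makes the index penalize model complexity.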
Discussion
The parallel, consensual, and cross-cultural development of this new specific instrument for the assessment of HRQoL of persons after TBI represents a distinctive and unusual approach.
The different language samples were generally collected as convenience samples (with two exceptions: Germany and Australia), which is a common strategy in this type of instrument development study (Hawthorne et al., 2006). The wide variation represents an advantage for this type of study because it ensures future applicability to a wide range of patients. The relatively small number per language sample, however, presents a limitation for analyses on the language level.
For use in international multi-centre studies, the content as well as metric equivalence of an instrument needs to be demonstrated (Bullinger et al., 1993; Stucki et al., 1997). Thus the translation of HRQoL instruments requires a standardized process, including translation, back translation, review, cognitive debriefing, and harmonization of the different language translations. We ensured comparability of the different translations by following accepted guidelines concerning cognitive debriefing in each language and harmonization procedures (von Steinbüchel et al., 2002; Acquadro et al., 1996). Cross-cultural development also includes the demonstration of comparable metric properties for the whole sample, as well as for the different languages, in particular with respect to internal consistency, reliability, factor structure, and validity (von Steinbüchel et al., 2010).
The results of the present study indicate favorable psychometric properties of the QOLIBRI. In spite of the variation in demographic and clinical characteristics, internal consistency and test-retest reliability are acceptable to good, both in the total sample and in different language groups. The test-retest reliability of the QOLIBRI is similar to that of the EBIQ (Sopena et al., 2007), a patient-reported outcome measure developed for TBI.
Reliabilities for the individual QOLIBRI scales were also good in the subgroup with poorer cognitive performance. The lowest reliability was recorded for the “Emotions” scale, an area in which some fluctuation might be expected.
A concern with self-report instruments in TBI is the potential lack of insight that may be experienced by those with cognitive impairment. Experience in diseases such as dementia suggests that subjective HRQoL judgments can be obtained reliably, even in people with substantial cognitive impairment (Brod et al., 1999; Novella et al., 2001; Wlodarczyk et al., 2004). The relationship between reliability of reports and cognitive status, however, is an issue that has not been properly addressed in head injury (Riemsma et al., 2001). This study showed that internal consistency (i.e., Cronbach's α) for participants with poor cognitive performance was satisfactory, and test-retest reliability was good for the QOLIBRI. This study, however, did not include detailed neuropsychological testing of cognitive functioning, which presents a topic for future research. On the other hand, the majority of participants were severely injured. We conclude that the reliability of the QOLIBRI is satisfactory to good, even when some cognitive impairment is present, and may be appropriate for the great majority of persons after TBI.
The conceptual model of HRQoL, on which the QOLIBRI descriptive system was based, initially suggested a seven-dimensional model with four “satisfaction” scales and three “bothered” scales; following psychometric examination, the three “bothered” scales were collated into two, yielding a six-dimensional model. The PCA and SEM analyses support this six-factor structure of the QOLIBRI in a second-order latent variable model. Also, Rasch analysis based on the presented results does not support unidimensionality for all the items, and instead suggests a multi-factorial structure. However, there is potential to develop, in future research, a QOLIBRI short form with a reduced number of items and dimensions.
The final descriptive system of the QOLIBRI provides both for an HRQoL profile across six domains of life, and also a total index of HRQoL. The data from the various analyses suggest that where a rich profile description of HRQoL is required, the individual domain scores will provide this. When used in evaluation studies, shifts in individual domains will reflect areas of life where gains consequent to treatment are made. In contrast, where an index of HRQoL is required, this index can be used to assess the impact of treatments on overall HRQoL as measured by QOLIBRI total score.
A limitation of the study was that the number of participants was too small for most language samples to confirm the QOLIBRI factor structure within each language. Examination of the factor structure in some of the larger samples of the first validation study, however, confirms the structure. Rasch analysis suggests a certain amount of redundancy at the item level. A short six-item screening measure is already available with the “Overall” items (psychometric properties will be reported in von Steinbüchel et al., in prep.); the use of the different versions will depend on the research questions to be answered.
The development of the QOLIBRI descriptive system in two large multi-national samples of persons after TBI has resulted in an instrument with good psychometric properties. Specific information on the validity of the scale can be found in the other article on QOLIBRI in this issue (von Steinbüchel et al., 2010). The results of this investigation of the validity of the QOLIBRI show the expected confirmatory patterns of correlations with instruments assessing disability, emotion, and subjective health status, including the GOSE, HADS, and SF-36. A multiple regression analysis showed that the main correlates of the total QOLIBRI score were emotional state (HADS depression and anxiety), amount of help needed, outcome on the GOSE, and number of comorbid health conditions. Together these five variables accounted for 58% of the variance.
The QOLIBRI measures well-being and health-related quality of life from the patient's perspective. The items predominantly concentrate on emotional, cognitive, and psychosocial aspects, and to a lesser extent on physical changes. The questionnaire thus measures satisfaction and distress in areas of life typically affected by traumatic brain injury. The QOLIBRI provides an assessment of quality of life that potentially complements functional outcome measures such as the GOS or GOSE.
Further work will explore the sensitivity and responsiveness of the measure, and further cross-cultural comparisons of scale structures and scores. The psychometric properties of the QOLIBRI suggest that it is a practical and reliable instrument that can be considered for use in studies examining HRQoL after TBI in clinical as well as in research settings (Zitnay et al., 2008).
Acknowledgments
The QOLIBRI Task Force consists of collaborating investigators in the following countries (*Steering Committee, †Methodological Group): Argentina (Armando Basso), Australia (Graeme Hawthorne†), Austria (Stefan Höfer†), Belgium (Christine Croisaux, Andrew Maas*), Brazil (Lucia Braga), China (Wai Poon, Zhang Tong), Denmark (Anne-Lise Christensen), Finland (Jaana Sarajuuri, Sanna Koskinen), France (Philippe Azouvi, Michelle Montreuil, Pierre North, Jean-Luc Truelle*), Germany (Monika Bullinger,*† Henning Gibbons,† Tanja Lischetzke,† Edmund Neugebauer,* Nadine Sasse, Silke Schmidt,† Nicole von Steinbüchel,*† Klaus von Wild,* Wolfgang Woerner†), Greece (Eva Tazopoulou), Indonesia (E. Wahjoepramono), Italy (Rita Formisano), Japan (Yoichi Katayama), Netherlands (Wilbert Bakx, Pieter E. Vos), Poland (Maria Pachalska), Portugal (Sandra Guerreiro), Russia (Boleslav Lichterman), Taiwan (Wen-Ta Chiu), United Kingdom (Richard Greenwood, Jane Powell,* Lindsay Wilson†), United States of America (John DaVanzo, George Zitnay*).
Author Disclosure Statement
No competing financial interests exist.
Appendix: QOLIBRI—Quality of Life After Brain Injury
In the first part of this questionnaire we would like to know
In the second part we would like to know
© The authors of this article, all rights reserved,
For details contact
