Validation of Neuro-QoL and PROMIS Mental Health Patient Reported Outcome Measures in Persons with Huntington Disease

Abstract

Background:

Patient-reported outcomes (PROs) for mental health are important for persons with Huntington disease (HD), who commonly experience symptoms of depression, anxiety, irritability, anger, aggression, and apathy. Given this, there is a need for reliable and valid patient-reported outcome measures of mental health for use as patient-centered outcomes in clinical trials.

Objective:

Thus, the purpose of this study was to establish the psychometric properties (i.e., reliability and validity) of six Neuro-QoL and PROMIS mental health measures to support their clinical utility in persons with HD.

Methods:

294 individuals with premanifest (n = 102) or manifest HD (n = 131 early HD; n = 61 late HD) completed Neuro-QoL/PROMIS measures of Emotional and Behavioral Dyscontrol, Positive Affect and Well-Being, Stigma, Anger, Anxiety, and Depression, legacy measures of self-reported mental health, and clinician-rated assessments of functioning.

Results:

Convergent validity and discriminant validity for the Neuro-QoL and PROMIS measures of Emotional and Behavioral Dyscontrol, Positive Affect and Well-Being, Stigma, Anger, Anxiety, and Depression were supported in persons with HD. Neuro-QoL measures of Anxiety and Depression also demonstrated moderate sensitivity and specificity (i.e., they were able to distinguish between individuals with and without clinically significant anxiety and depression).

Conclusions:

Findings provide psychometric support for the clinical utility of the Neuro-QoL/PROMIS measures of mental health in persons with HD. As such, these measures should be considered for the standardized assessment of health-related quality of life in persons with HD.

Keywords

Neuro-QoL, PROMIS, emotion, mental health, validity, reliability, Huntington disease

Huntington disease (HD) is an autosomal dominant neurodegenerative disease that principally affects the basal ganglia [1]. Age of disease onset is related to the number of CAG triplet repeats in the HD gene. With typical expanded repeat numbers, people gradually begin to develop signs and symptoms of HD over one to two decades, and HD is diagnosed once motor signs are identified unequivocally. HD symptoms include mental health symptoms (e.g., irritability and impulsivity), motor signs (chorea and dystonia), and cognitive decline [2]. Patients may experience problems in all of these domains, or may present with clinically significant symptoms in only one area of function. Between 73% and 98% of individuals with premanifest or manifest HD experience neuropsychiatric symptoms [3–8], including elevated levels of depression, anxiety, irritability, anger, aggression, and apathy [3, 9–28]. These neuropsychiatric symptoms are associated with negative outcomes, including strained interpersonal relationships and poor health-related quality of life. In addition, persons with HD are at increased risk for psychiatric conditions [29–33], as well as increased suicide risk [34–43].

Much of the work examining mental health symptomatology has focused on clinician-rated symptoms, either through examination of comorbid neuropsychiatric diagnoses [12, 22] or a standard clinician-rated behavioral examination (either the short form of the Problem Behavior Assessment [PBA-s] or the Behavioral Exam from the Unified Huntington Disease Rating Scale [UHDRS] [6, 28]). Only a handful of studies have examined how these symptoms impact health-related quality of life (HRQOL) [5, 14]. This is unfortunate given the increasing emphasis on patient-centered outcomes and the importance of patient-reported outcomes (PROs) as relevant and meaningful endpoints for clinical trials [44]. Therefore, the purpose of this study was to examine the clinical utility of using PROs to assess the different mental health symptoms that can impact HRQOL.

As such, we were interested in providing psychometric support for two PRO measurement systems designed to measure key symptoms and health concepts. The Quality of Life in Neurological Disorders (Neuro-QoL) [45, 46] and the Patient Reported Outcomes Measurement Information System (PROMIS) [47, 48] are HRQOL measurement systems that include different item banks that capture self-reported mental health (including the following neuropsychiatric symptoms: Depression, Anxiety, Anger, Stigma, Emotional and Behavioral Dyscontrol, and Positive Affect and Well-Being). Neuro-QoL focuses on the evaluation of mental health constructs in individuals with neurological conditions (i.e., stroke, Parkinson’s disease, multiple sclerosis, child and adult epilepsy, amyotrophic lateral sclerosis, and muscular dystrophy) [45], whereas PROMIS focuses on assessing similar concepts across chronic medical conditions. Both systems offer the advantage of computerized adaptive test (CAT) administration technology, a test method in which each administered item is selected based on the responses to the previous items. CATs allow clinicians and researchers to ascertain a person’s level of functioning with minimal items and maximal precision. In addition, these systems allow for cross-disease comparison (generic item banks) and disease-specific sensitivity (disease-specific item banks). Many identical items (i.e., common data elements; CDEs) are used on both Neuro-QoL and PROMIS to allow “linking” between measures, such that a score on one measure (e.g., Neuro-QoL) can be used to estimate a score on the other (e.g., PROMIS).

The purpose of this paper was to provide data to support the psychometric properties of the Neuro-QoL and PROMIS mental health PROs so that they can be used with confidence to measure mental health in persons with HD. Specifically, we examined the reliability (internal consistency) and validity (convergent, discriminant, and known groups validity) of several Neuro-QoL and PROMIS mental health PROs in persons with or at-risk for HD. We hypothesized that the Neuro-QoL/PROMIS mental health item banks would demonstrate acceptable reliability [i.e., Cronbach’s Alpha >0.70; 49] in this population and that floor and ceiling effects for these measures would be better than other commonly used self-report measures. In addition, convergent validity would be supported in our sample by moderate to high correlations (i.e., ≥0.4) among self-report measures of mental health, and discriminant validity would be supported by negligible to small correlations (i.e., 0.0–0.2) between the mental health PROs and other measures of clinician-rated functioning [50]. We also hypothesized that premanifest HD participants would report better mental health functioning than manifest HD groups, and that early-stage HD participants would report better functioning than late-stage HD participants (i.e., known groups validity); effect sizes for self-reported neuropsychiatric symptoms (i.e., Cohen’s d) were expected to be larger for individuals with more clinician-rated mental health problems. Finally, we hypothesized that the PRO measures of depression and anxiety would demonstrate adequate sensitivity and specificity relative to established cutoff scores on established measures for clinically significant anxiety or depression, as evidenced by AUC values ≥0.70 in other clinical populations [51]. Results will provide evidence for the clinical utility of these PRO measures of mental health in individuals with premanifest and manifest HD.

MATERIALS AND METHODS

Participants

Study participants were individuals with either premanifest (gene-positive status for the HD CAG expansion and no clinical diagnosis) or manifest HD. Inclusion criteria were as follows: participants had to be at least 18 years of age, able to read and understand English, and able to provide informed consent. A convenience sample of persons with HD was recruited from HD treatment centers at the University of Michigan, University of Iowa, University of California-Los Angeles, Indiana University, Johns Hopkins University, Rutgers University, Struthers Parkinson’s Center, and Washington University. Recruitment was also supported by the National Huntington Disease Roster and existing online medical record data capture systems [52]. In addition, several study participants (n = 96) were recruited in conjunction with the Predict-HD research study [53], a nationwide cohort which aimed to examine symptoms in participants with premanifest and early manifest HD. All study procedures were completed in compliance with institutional review board requirements, and informed consent was obtained from each participant prior to the administration of study measures.

Measures

Self-reported HRQOL

We administered six different measures from the Neuro-QoL/PROMIS systems. Specifically, we administered Neuro-QoL measures of Emotional and Behavioral Dyscontrol, Neuro-QoL Positive Affect and Well-Being, Neuro-QoL Stigma, Neuro-QoL Anxiety, Neuro-QoL Depression, and the PROMIS measure of Anger (given that Neuro-QoL does not include a measure of this construct). Details for these measures are provided elsewhere [45, 54]. These measures were designed to evaluate important aspects of HRQOL (including aspects of symptoms and functioning) according to the International Classification of Functioning, Disability, and Health (ICF) framework [55, 56]. Briefly, each measure was administered as a CAT, followed by a fixed 8-item short form (SF; participants completed items in the CAT, followed by any items on the SF that were not already administered as a part of the CAT). This was repeated for each Neuro-QoL/PROMIS measure to allow for the comparison of different administration methods. For CAT administrations, participants completed a minimum of 4 items, and test administration stopped after either a standard error (SE) ≤0.3 was achieved or the participant answered 12 items. Scores for both CATs and SFs are on a T metric (M = 50, SD = 10); higher scores indicate more of the trait that is being assessed. That is, higher scores for Positive Affect and Well-Being indicate better mental health, whereas higher scores on all other PROs indicate poorer mental health. While these measures are commonly recommended as common data elements for several neurological populations (http://www.commondataelements.ninds.nih.gov/ and [57]), such data are not yet available for persons with HD.
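The CAT stopping rule and the T metric described above can be sketched as follows. This is a minimal illustration rather than the Neuro-QoL/PROMIS administration software; the function names and the representation of the standard error as a plain number are assumptions made for the example.

```python
def cat_should_stop(items_answered: int, standard_error: float,
                    min_items: int = 4, max_items: int = 12,
                    se_threshold: float = 0.3) -> bool:
    """Return True when a CAT administration would end under the stated rule.

    Assumed reading of the rule: at least `min_items` items are always given;
    after that, testing stops once SE <= `se_threshold` or once
    `max_items` items have been answered.
    """
    if items_answered < min_items:
        return False
    return standard_error <= se_threshold or items_answered >= max_items


def to_t_score(z: float) -> float:
    """Convert a standardized score (M = 0, SD = 1) to the T metric (M = 50, SD = 10)."""
    return 50 + 10 * z
```

For example, `cat_should_stop(8, 0.45)` is False (precision not yet reached and fewer than 12 items), whereas `cat_should_stop(12, 0.45)` is True.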

Generic self-report legacy measures of HRQOL

We administered three generic measures of HRQOL: the WHODAS 2.0 [55], the RAND-12 Health Status Inventory [HSI; 58], and the EQ5D [59]. The WHODAS 2.0 [55] is a 12-item self-report measure that evaluates HRQOL as it relates to disability across six subdomains: understanding and communication, self-care, mobility, interpersonal relations, work and household roles, and community and civic roles. Total scores range from 0 (highest level of health) to 48 (lowest level of health). The RAND-12 HSI [58] is a 12-item self-report measure that produces a physical health composite (PHC) and a mental health composite (MHC). Scores on the PHC and MHC are on a T metric (M = 50, SD = 10); higher scores indicate better health. The EQ5D [59] is a self-report measure of HRQOL that measures mobility, ability to do self-care, ability to do usual activities, pain, and anxiety/depression. We examined scores on both the Health Scale (which range from 0 [poor health] to 100 [perfect health]) and the Index Value (scores range from 0 [worst health] to 1 [best health]). There are data to support the clinical utility of all three of these measures in persons with clinical conditions, and for the WHODAS 2.0 and the RAND-12 in persons with HD, specifically [60–101].

Legacy self-report measures of mental health

We administered two legacy measures of mental health symptoms, the Patient Health Questionnaire 9 (PHQ-9) [102] and the Generalized Anxiety Disorder-7 (GAD-7) [103]. The PHQ-9 is part of the Patient Health Questionnaire and is a 9-item self-administered questionnaire used for assessing frequency of symptoms for clinical depression. Response categories were Not at all (0), Several days (1), More than half the days (2), and Nearly every day (3). Total scores range from 0 to 27, with cut point scores of 5, 10, 15, and 20 representing mild, moderate, moderately severe, and severe depression [102]. The GAD-7 is a 7-item self-administered patient questionnaire used to measure the severity of symptoms for Generalized Anxiety Disorder, with responses of Not at all (0), Several days (1), More than half the days (2), and Nearly every day (3). Total scores range from 0 to 21, with cut point scores of 5, 10, and 15 representing mild, moderate, and severe anxiety [103]. Given that there are no consensus PRO measures for mental health in persons with HD [https://www.commondataelements.ninds.nih.gov/HD.aspx#tab=Data_Standards and 104], the PHQ-9 and GAD-7 were selected given their broad clinical applicability across different clinical populations [102, 105–128], the presence of established diagnostic cutoff scores, and the relative brevity of administration.
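The scoring rules for the two legacy measures can be written out directly from the description above. The sketch below is illustrative only; the function name and the label used for totals below the mild cut point are assumptions, since the text does not name that band.

```python
# Cut points from the text, listed from most to least severe.
PHQ9_BANDS = [(20, "severe"), (15, "moderately severe"), (10, "moderate"), (5, "mild")]
GAD7_BANDS = [(15, "severe"), (10, "moderate"), (5, "mild")]


def score_questionnaire(responses, n_items, bands):
    """Sum 0-3 item responses and map the total to a severity band."""
    if len(responses) != n_items:
        raise ValueError(f"expected {n_items} responses, got {len(responses)}")
    if any(r not in (0, 1, 2, 3) for r in responses):
        raise ValueError("each response must be 0, 1, 2, or 3")
    total = sum(responses)
    for cut_point, label in bands:
        if total >= cut_point:
            return total, label
    # Hypothetical label: the text does not name scores below the mild cut point.
    return total, "below mild cut point"
```

A respondent answering "Several days" (1) to every PHQ-9 item would total 9, which falls in the mild band; answering "Nearly every day" (3) to every GAD-7 item gives the maximum of 21.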

Clinician rated assessments

For all participants, clinicians completed the Independence Scale, the Total Motor Scale, and the Total Functional Capacity (TFC) measure from the Unified Huntington’s Disease Rating Scale (UHDRS) [129], as well as the Problem Behaviors Assessment (PBA-s) [130]. The Independence Scale requires the clinician to rate the participant’s overall level of independence on a scale of 0 to 100; higher scores reflect better functioning/greater independence. The Total Motor Scale provides a clinician-rated measure of motor functioning. We used the final question on this scale to determine if a participant had premanifest or manifest HD: participants were classified as having manifest HD when the clinician indicated with >99% certainty that motor symptoms were unequivocal signs of manifest HD; otherwise, the participant was classified as having premanifest HD. The TFC is a 5-item clinician-rated measure of day-to-day functioning across the domains of occupation, finances, domestic chores, activities of daily living, and care level. Scores range from 0 (low functioning) to 13 (highest level of functioning), with higher scores indicating better functioning. TFC scores were used to classify participants with an HD diagnosis as either early stage (Stages I–II; sum scores of 7–13) or later stage (Stages III–V; sum scores of 0–6) [131]; staging data were combined due to the relatively small number of participants in the later stages of the disease (i.e., n = 7 in Stage IV and n = 1 in Stage V), as well as to maximize the power for group comparison analyses. The PBA-s [130] provides a clinician-rated assessment of 11 different behaviors (depression, suicidality, anxiety, irritability, anger/aggression, apathy, perseverative thinking, obsessive compulsive behaviors, delusions, hallucinations, and disorientation). Each item is rated for severity (on a scale of 0 [symptom absent] to 4 [severe] with regard to the last 4 weeks) and frequency (on a scale of 0 [never/almost never] to 4 [daily/almost daily for most or all of the day]). We examined scores on individual items, each of which is the product of the frequency and severity ratings for that item (and can range from 0 to 16); higher scores indicate more problems.
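The two clinician-rated scoring rules described above (TFC-based staging and the PBA-s item product score) are simple enough to state as code. The function names below are hypothetical and the sketch only restates the rules given in the text.

```python
def tfc_group(tfc_total: int) -> str:
    """Classify a manifest HD participant by TFC sum score (7-13 early, 0-6 late)."""
    if not 0 <= tfc_total <= 13:
        raise ValueError("TFC sum scores range from 0 to 13")
    return "early" if tfc_total >= 7 else "late"


def pbas_item_score(severity: int, frequency: int) -> int:
    """PBA-s item score: product of severity (0-4) and frequency (0-4), range 0-16."""
    for name, value in (("severity", severity), ("frequency", frequency)):
        if not 0 <= value <= 4:
            raise ValueError(f"{name} must be between 0 and 4")
    return severity * frequency
```

Because the item score is a product, a symptom rated absent on either dimension scores 0 regardless of the other rating.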

Analyses

Reliability

Cronbach’s alphas were calculated to evaluate internal consistency reliability for the SF version of each mental health measure in our sample. These values were compared to the Cronbach’s alphas for the CATs, which were drawn from the Neuro-QoL and PROMIS calibration samples [48, 54]. A critical cut-off of ≥0.70 was considered minimal acceptable reliability [49].
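For readers less familiar with the statistic, Cronbach's alpha for a k-item short form can be computed from the item variances and the variance of the total score. The sketch below uses the standard formula with sample variances; the response matrix is hypothetical illustration data, not study data.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for rows of respondents, each a list of k item responses.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(item_scores[0])
    columns = list(zip(*item_scores))            # one tuple of responses per item
    item_variance_sum = sum(variance(col) for col in columns)
    total_variance = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical responses from five respondents to a 4-item form.
example = [[1, 2, 1, 2], [2, 2, 3, 2], [3, 4, 3, 3], [4, 4, 5, 4], [5, 4, 5, 5]]
alpha = cronbach_alpha(example)
meets_minimum = alpha >= 0.70   # the a priori criterion used in this study
```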

Floor and ceiling effects

The proportion of participants with the lowest or the highest possible score on each of the mental health measures was calculated to establish the floor and ceiling effects of these measures. A priori acceptable rates for floor and ceiling effects were ≤20% [132, 133].
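As an illustration of this calculation, the following sketch computes the percentage of respondents at the minimum and maximum possible scores. The function name and the example scores are hypothetical; the 0 to 27 range corresponds to the PHQ-9 total described earlier.

```python
def floor_ceiling_rates(scores, min_score, max_score):
    """Return (% at lowest possible score, % at highest possible score)."""
    n = len(scores)
    at_minimum = 100 * sum(s == min_score for s in scores) / n
    at_maximum = 100 * sum(s == max_score for s in scores) / n
    return at_minimum, at_maximum


# Hypothetical PHQ-9 totals (possible range 0-27).
at_min, at_max = floor_ceiling_rates([0, 0, 5, 27, 10], min_score=0, max_score=27)
acceptable = at_min <= 20 and at_max <= 20   # a priori criterion of <=20%
```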

Timing data

Due to concerns about test burden in HD participants (especially for participants in the later stages of the disease), median and SD timing data were examined for both CAT and Short Form versions of the mental health measures.

Convergent and discriminant validity

A correlation matrix was used to examine the associations between the Neuro-QoL/PROMIS PROs, other self-report measures of mental health symptoms (PHQ-9 and GAD-7), generic measures of HRQOL (WHODAS, EQ5D, and RAND-12), clinician-rated mental health (PBA-s), and clinician-rated assessments of functioning (UHDRS Motor, Functional Assessment, and TFC) in our sample. Convergent validity was supported by moderate to high correlations (i.e., ≥0.4) between the mental health PRO measures and other self-report measures of mental health [50]. Discriminant validity was supported by small correlations (magnitude ≥0.1) between Neuro-QoL/PROMIS PROs and corresponding clinician-rated measures of emotion (i.e., PBA-s items) and negligible to small correlations (i.e., 0.0–0.2) between the mental health PROs and other measures of clinician-rated functioning [50].
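A minimal sketch of how a single cell of such a correlation matrix could be computed and compared with the stated thresholds is given below. The helper names and the label for values falling between the two thresholds are assumptions; the thresholds themselves (≥0.4 and 0.0–0.2) are those given in the text.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


def classify_correlation(r):
    """Label the magnitude of r against the thresholds stated in the text."""
    magnitude = abs(r)
    if magnitude >= 0.4:
        return "moderate to high"
    if magnitude <= 0.2:
        return "negligible to small"
    return "between thresholds"   # hypothetical label for 0.2 < |r| < 0.4
```

Applied to values reported in Table 3, `classify_correlation(0.704)` (PHQ-9 with the Depression CAT) returns "moderate to high", and `classify_correlation(0.075)` (UHDRS Motor with the Depression CAT) returns "negligible to small".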

Known-groups validity

One-way analysis of variance was used to examine group differences (premanifest, early- and late-HD) for each of the different mental health PROs; Tukey’s post hoc analyses were used to identify significant group differences. Specifically, we hypothesized that premanifest participants would report better mental health than manifest HD groups, and that early-stage HD participants would report better mental health than late-stage HD participants.

Effect sizes

In order to evaluate the relative influence that neuropsychiatric symptom severity (as determined by a median split using the matched “established” measure) has on Neuro-QoL/PROMIS scores, effect sizes (i.e., Cohen’s d) were calculated. Specifically, a median split on the PBA-s Irritability item was used for Neuro-QoL Emotional and Behavioral Dyscontrol, a median split on the WHODAS was used for Neuro-QoL Positive Affect and Well-Being, a median split on the PBA-s Anger/Aggression item was used for PROMIS Anger, a median split on the PBA-s Anxiety item was used for Neuro-QoL Anxiety, and a median split on the PBA-s Depression item was used for Neuro-QoL Depression. Effect sizes were calculated for each of the Neuro-QoL/PROMIS PROs based on comparison of each group relative to the means and standard deviations from the normative samples (Neuro-QoL normative sample: N = 549, M = 50, SD = 10 [46, 54]; PROMIS normative sample: N = 858, M = 50, SD = 10 [48]). Effect sizes were expected to be larger for the groups with more clinician-rated severity.
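One common way to compute Cohen's d for a group against a normative sample is to divide the mean difference by the pooled standard deviation. The text does not state which variant or sign convention was used, so the pooled-SD form and the "group minus norm" direction below are assumptions, and the group values in the example are hypothetical.

```python
from math import sqrt


def cohens_d(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Cohen's d using the pooled standard deviation of two groups.

    Sign convention (assumed): positive when group 1 has the higher mean.
    """
    pooled_variance = ((n_1 - 1) * sd_1 ** 2 + (n_2 - 1) * sd_2 ** 2) / (n_1 + n_2 - 2)
    return (mean_1 - mean_2) / sqrt(pooled_variance)


# Hypothetical high-severity group compared with the Neuro-QoL normative
# sample described in the text (N = 549, M = 50, SD = 10).
d = cohens_d(mean_1=60.0, sd_1=10.0, n_1=100, mean_2=50.0, sd_2=10.0, n_2=549)
```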

Impairment rates

Clinical impairment rates (participants whose scores were >1 SD worse than the Neuro-QoL normative sample mean [N = 549, M = 50, SD = 10; 46, 54] or >1 SD worse than the PROMIS normative sample mean [N = 858; M = 50, SD = 10; 48]) were used to determine if individuals with HD were at greater risk than the general population for mental health problems. In a normal distribution, 16% of the scores would fall more than 1 SD below the mean (i.e., impaired); therefore, we treated impairment rates that exceeded 16% as indicating greater impairment than would be expected compared to demographically comparable, neurologically healthy peers [134].
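The 16% reference value follows from the normal distribution, and the impairment rule itself depends on the direction of scoring. The sketch below shows both; the function name and the `higher_is_worse` flag are assumptions introduced to handle Positive Affect and Well-Being, where lower T scores are worse.

```python
from statistics import NormalDist

# Expected proportion more than 1 SD on the "worse" side of a T metric (M = 50, SD = 10).
expected_tail = 1 - NormalDist(mu=50, sigma=10).cdf(60)   # about 0.159


def impairment_rate(t_scores, higher_is_worse=True, norm_mean=50.0, norm_sd=10.0):
    """Percentage of T scores more than 1 SD worse than the normative mean."""
    if higher_is_worse:
        impaired = [t for t in t_scores if t > norm_mean + norm_sd]
    else:
        impaired = [t for t in t_scores if t < norm_mean - norm_sd]
    return 100 * len(impaired) / len(t_scores)
```

For symptom measures such as Anxiety, a T score above 60 counts as impaired; for Positive Affect and Well-Being, a T score below 40 does.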

Classification accuracy (sensitivity/specificity)

Finally, we conducted logistic regression models to determine the accuracy with which Neuro-QoL Anxiety and Depression scores could discriminate between those individuals with and without anxiety or depression (as determined by the published cutoff scores of ≥5 on the GAD-7 [103] or the PHQ9 [102], for anxiety and depression, respectively). Criterion for likelihood ratios (i.e., sensitivity/[1-sensitivity]) for clinical decision making should be ≥2 [135]. In addition, receiver operating characteristic (ROC) analysis was used to compare diagnostic performance of the mental health PROs. The resulting area under the curve (AUC) values criterion was specified as ≥0.70 [51].
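To make the classification metrics concrete, the sketch below computes sensitivity and specificity at a cutoff, and the AUC as the probability that a randomly chosen case scores higher than a randomly chosen non-case (ties counted as one half). This rank-based calculation is a stand-in chosen for brevity, not the logistic regression used in the study, and the scores and case labels are hypothetical.

```python
def sensitivity_specificity(scores, is_case, cutoff):
    """Sensitivity and specificity when scores >= cutoff are called positive."""
    true_pos = sum(s >= cutoff and c for s, c in zip(scores, is_case))
    false_neg = sum(s < cutoff and c for s, c in zip(scores, is_case))
    true_neg = sum(s < cutoff and not c for s, c in zip(scores, is_case))
    false_pos = sum(s >= cutoff and not c for s, c in zip(scores, is_case))
    return true_pos / (true_pos + false_neg), true_neg / (true_neg + false_pos)


def auc(scores, is_case):
    """Area under the empirical ROC curve via pairwise comparison of cases and non-cases."""
    cases = [s for s, c in zip(scores, is_case) if c]
    non_cases = [s for s, c in zip(scores, is_case) if not c]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in cases for b in non_cases)
    return wins / (len(cases) * len(non_cases))


# Hypothetical T scores; is_case marks respondents at or above the legacy cutoff.
t_scores = [62, 58, 49, 51, 44, 40]
is_case = [True, True, True, False, False, False]
sens, spec = sensitivity_specificity(t_scores, is_case, cutoff=50)
area = auc(t_scores, is_case)
meets_criterion = area >= 0.70   # a priori AUC criterion
```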

RESULTS

Two hundred ninety-four individuals with premanifest (n = 102) or manifest HD (n = 131 early-stage HD and n = 61 late-stage HD; Table 1) completed the mental health PROs as part of a larger study [136]. There were group differences for gender, χ² (2, N = 294) = 7.97, p = 0.02; there were more females than males in the premanifest group relative to the early-stage manifest group. There were also group differences for education, F (2, 290) = 10.97, p < 0.0001; the early-stage HD and late-stage HD groups had slightly less education than the premanifest group. As disease progresses with age, it was not surprising that there were significant group differences in age, F (2, 291) = 26.91, p < 0.0001; the premanifest group was significantly younger than both manifest groups. Groups did not differ on race, χ² (3, N = 294) = 7.35, p = 0.12, or ethnicity, χ² (3, N = 294) = 0.82, p = 0.94.

Table 1

Demographic Information for Huntington disease participants

Variable	Premanifest (N = 102)	Early (N = 131)	Late (N = 61)	All (N = 294)
Age (years)^*
M (SD)	42.67 (13.26)	53.50 (10.85)	52.84 (11.34)	49.61 (12.84)
Gender (%)^*
Female	68.6	50.4	55.7	57.8
Male	31.4	49.6	44.3	42.2
Race (%)
Caucasian	98.0	98.5	93.4	97.3
Other	2.0	0.8	6.6	2.4
Unknown	0.0	0.8	0.0	0.3
Ethnicity (%)
Not Hispanic or Latino	95.1	93.9	95.1	94.6
Hispanic or Latino	1.0	0.8	0.0	0.7
Not Provided	3.9	5.3	4.9	4.8
Education (# of years)^*
M (SD)	16.15 (2.62)	14.87 (2.69)	14.34 (2.51)	15.20 (2.72)
Marital Status (%)
Married	71.6	58.8	67.2	65.0

Note.^* = significant group differences.
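As a worked check of the gender comparison, the chi-square statistic can be recomputed from Table 1. The cell counts below are not reported in the paper; they are reconstructed here by rounding the published percentages against the group sizes, which is an assumption of this sketch. The closed-form p value used is valid only for 2 degrees of freedom.

```python
from math import exp


def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# (group size, % female) from Table 1; counts are reconstructed, not reported.
groups = {"premanifest": (102, 68.6), "early": (131, 50.4), "late": (61, 55.7)}
counts = []
for n_group, pct_female in groups.values():
    females = round(n_group * pct_female / 100)
    counts.append([females, n_group - females])

statistic, df = chi_square(counts)
p_value = exp(-statistic / 2)   # chi-square survival function for df = 2 only
```

Under these reconstructed counts the statistic rounds to 7.97 with p of about 0.02, in line with the values reported in the text.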

Internal consistency

All Cronbach’s alphas in the sample were ≥0.90 for both the CAT and SF versions of the Neuro-QoL/PROMIS measures; values comfortably exceeded the a priori minimum standard for internal consistency reliability (≥0.70; see Table 2).

Table 2

Descriptive information and reliability data for Neuro-QoL/PROMIS and comparator measures

	n	Cronbach’s α	% of the sample with floor effects (high functioning)	% of the sample with ceiling effect (low functioning)	Mean	SD	Administration Times (seconds) Median (SD)
Neuro-QoL/PROMIS Measures
Emotional and Behavioral Dyscontrol CAT	281	0.95^×	12.8	0.0	46.2	10.5	30.0 (38.7)
Emotional and Behavioral Dyscontrol SF	283	0.93	14.6	0.0	46.5	10.0	40.0 (34.1)
Positive Affect and Well-Being CAT^*	281	0.98^×	9.3	0.4	55.3	8.4	28.0 (38.5)
Positive Affect and Well-Being SF^*	281	0.95	12.8	0.7	55.0	7.8	48.5 (40.6)
Stigma CAT	281	0.93^×	16.7	0.0	48.9	8.5	41.0 (38.6)
Stigma SF	283	0.92	35.7	0.0	48.1	8.2	46.0 (36.1)
Anger CAT	160	0.96^×	13.1	0.0	48.6	12.5	39.0 (39.1)
Anger SF	159	0.90	17.0	0.0	48.4	10.7	23.0 (19.9)
Anxiety CAT	160	0.97^×	10.0	0.0	51.2	10.3	30.0 (28.5)
Anxiety SF	159	0.95	15.1	0.0	51.4	9.3	45.0 (35.6)
Depression CAT	160	0.98^×	13.8	0.0	47.8	9.0	26.0 (28.5)
Depression SF	159	0.96	25.8	0.0	47.8	8.5	40.0 (32.7)
COMPARATOR MEASURES
Generic HRQOL Measures
EQ5D Index Score^*∘	293	–	21.2	0.3	0.8	0.2	–
EQ5D Health Scale^*∘	293	–	5.8	0.0	78.3	15.4	–
Rand 12 Physical Health Scale^*	281	0.84	0.0	0.0	46.5	10.4	120.0 (68.3)
Rand 12 Mental Health Scale^*	281	0.84	0.0	0.0	46.5	10.8	128.0 (74.0)
WHODAS	278	0.95	18.7	0.0	22.3	10.6	85.0 (56.5)
Legacy Self-report Measures of Symptoms
Patient Health Questionnaire 9 (PHQ9)	160	0.90	19.4	0.6	6.3	6.4	78.0 (61.7)
General Anxiety Disorder 7-item (GAD7)	160	0.92	31.9	1.3	4.9	5.3	44.0 (39.7)
Clinician-Rated Functioning
UHDRS Motor	261	0.97	6.4	0.0	61.7	276.4	–
UHDRS Functional Assessment	293	0.94	38.6	0.7	20.8	5.5	–
UHDRS Total Functional Capacity	293	0.76	32.4	1.0	9.6	3.5	–
Clinician-Rated Behavioral Status
PBAs Aggressive Behavior	292	–	65.1	1.0	4.3	1.1	–
PBAs Anxiety	292	–	39.7	2.7	3.7	1.2	–
PBAs Depressed Mood	292	–	47.3	1.4	3.9	1.2	–
PBAs Irritability	292	–	50.0	2.7	4.0	1.2	–
PBAs Obsessive Compulsive Behavior	292	–	78.8	2.7	4.6	0.8	–
PBAs Suicide	292	–	86.0	2.7	4.7	0.7	–

Note. CAT = computer adaptive test; SF = short form; HRQOL = health-related quality of life; UHDRS = Unified Huntington’s Disease Rating Scales; PBAs = Problem Behaviors Assessment; ^* = higher scores = better functioning; °the paper form of the EQ5D was used for administration and timing data was not recorded;^× = CAT scores for the HD sample were simulated, and thus reliability coefficients could not be calculated, the provided reliability coefficients are based on the Neuro-QoL [54] or the PROMIS [48] calibration samples; All SF are 8 items except for Positive Affect and Well-Being which was 9 items.

Floor and ceiling effects

The Neuro-QoL/PROMIS PROs were free of floor and ceiling effects (meeting our a priori criterion of ≤20%), with the exception of the Stigma and Depression SFs (which both had floor effects >20%; see Table 2). In all cases, the CATs had lower floor-effect rates than the corresponding SFs.
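The statement above can be checked directly against the floor-effect percentages in Table 2. The sketch below copies those percentages (CAT, SF) and applies the 20% criterion; the variable names are arbitrary.

```python
# Floor-effect percentages from Table 2 as (CAT, SF).
floor_pct = {
    "Emotional and Behavioral Dyscontrol": (12.8, 14.6),
    "Positive Affect and Well-Being": (9.3, 12.8),
    "Stigma": (16.7, 35.7),
    "Anger": (13.1, 17.0),
    "Anxiety": (10.0, 15.1),
    "Depression": (13.8, 25.8),
}

# Administrations whose floor effect exceeds the 20% criterion.
exceeding = []
for measure, (cat_pct, sf_pct) in floor_pct.items():
    if cat_pct > 20:
        exceeding.append(f"{measure} CAT")
    if sf_pct > 20:
        exceeding.append(f"{measure} SF")

# Whether every CAT has a lower floor-effect rate than its short form.
cat_always_lower = all(cat_pct < sf_pct for cat_pct, sf_pct in floor_pct.values())
```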

Timing data

Median administration times for the PROMIS/Neuro-QoL mental health PROs were brief (all ≤48.5 seconds; Table 2).

Convergent and discriminant validity

In general, convergent and discriminant correlations were consistent with the proposed hypotheses, with moderate to large correlations among self-report measures, less robust correlations between the Neuro-QoL/PROMIS PROs and clinician-rated emotion items, and negligible to small correlations with clinician-rated measures of functioning (Table 3). Results were virtually identical for the SFs, thus we only present data for the CATs.

Table 3

Pearson correlations to support convergent and discriminant validity of the new Neuro-QoL/PROMIS mental health PROs

	Emotional and Behavioral Dyscontrol CAT	Positive Affect and Well-Being CAT	Stigma CAT	Anger CAT	Anxiety CAT	Depression CAT
Generic HRQOL Measures
–
EQ5D Index Score^*	–0.338^**	0.389^**	–0.499^**	–0.334^**	–0.371^**	–0.416^**
EQ5D Health Scale^*	–0.301^**	0.408^**	–0.453^**	–0.313^**	–0.312^**	–0.404^**
Rand 12 Physical Health Scale^*	–0.305^**	0.356^**	–0.543^**	–0.299^**	–0.286^**	–0.338^**
Rand 12 Mental Health Scale^*	–0.602^**	0.574^**	–0.499^**	–0.668^**	–0.709^**	–0.709^**
WHODAS	0.441^**	–0.428^**	0.658^**	0.436^**	0.414^**	0.500^**
Patient Health Questionnaire 9 (PHQ9)	0.590^**	–0.552^**	0.685^**	0.640^**	0.641^**	0.704^**
General Anxiety Disorder 7-item (GAD7)	0.659^**	–0.516^**	0.608^**	0.674^**	0.703^**	0.639^**
Clinician-Rated Functioning
–
UHDRS Motor	0.034	–0.079	0.070	0.027	0.024	0.075
UHDRS Functional Assessment	–0.113	0.158^**	–0.375^**	–0.060	–0.096	–0.160^*
UHDRS Total Functional Capacity	–0.143^*	0.183^**	–0.417^**	–0.088	–0.073	–0.157^*
Clinician-Rated Behavioral Status
–
PBAs Aggressive Behavior	–0.377^**	0.240^**	–0.286^**	–0.395^**	–0.287^**	–0.287^**
PBAs Anxiety	–0.447^**	0.368^**	–0.414^**	–0.482^**	–0.554^**	–0.450^**
PBAs Apathy	–0.395^**	0.326^**	–0.389^**	–0.360^**	–0.360^**	–0.413^**
PBAs Depressed Mood	–0.479^**	0.435^**	–0.457^**	–0.551^**	–0.516^**	–0.634^**
PBAs Disoriented Behavior	–0.219^**	0.232^**	–0.293^**	–0.196^*	–0.160^*	–0.250^**
PBAs Irritability	–0.503^**	0.289^**	–0.415^**	–0.553^**	–0.428^**	–0.407^**
PBAs Obsessive Compulsive Behavior	–0.169^**	0.127^*	–0.210^**	–0.256^**	–0.214^**	–0.182^*
PBAs Perseverative Thinking	–0.285^**	0.156^*	–0.340^**	–0.336^**	–0.194^*	–0.229^**
PBAs Suicide	–0.354^**	0.345^**	–0.301^**	–0.436^**	–0.401^**	–0.522^**

Note. PROs = patient-reported outcomes; HRQOL = health-related quality of life; CAT = computer adaptive test; SF = short form; UHDRS = Unified Huntington’s Disease Rating Scales; PBAs = Problem Behaviors Assessment. ^**Correlation is significant at the 0.01 level (2-tailed). ^*Correlation is significant at the 0.05 level (2-tailed).

Known-groups analyses

While there were significant group differences for Positive Affect and Well-Being and Stigma (as hypothesized), we did not see group differences on Anger, Anxiety, or Depression (which was contrary to our hypotheses); see Table 4.

Table 4

Known groups validity for Neuro-QoL/PROMIS mental health computer adaptive tests

CAT	Premanifest				Early				Late				Full Sample	F	p	eta ²
	N	Mean	SD	Impaired %	N	Mean	SD	Impaired %	N	Mean	SD	Impaired %	Impaired %
Emotional & Behavioral Dyscontrol	101	45.96	8.52	5.9	125	45.18	10.71	8.8	55	48.79	12.62	18.2	9.6	2.34	0.10	0.02
Positive Affect & Well-Being^b	101	56.49	6.99	2.0	125	55.51	8.59	2.4	55	52.77	9.78	7.3	3.2	3.61	0.03	0.03
Stigma^a,b,c	101	45.19	6.87	2.0	125	49.88	8.46	8.0	55	53.63	8.54	20.0	8.2	21.71	<.0001	0.14
Anger	49	48.53	9.54	9.6	73	47.75	12.74	11.5	38	50.12	15.34	29.4	13.2	0.44	0.64	0.01
Anxiety	49	51.30	8.83	7.7	73	50.80	10.30	22.5	38	51.90	11.98	39.4	12.4	0.15	0.87	0.00
Depression	49	46.99	6.90	3.8	73	47.56	9.80	27.3	38	49.32	9.86	23.5	12.4	0.76	0.47	0.01

^*a = significant group differences between premanifest and early, b = significant group differences between premanifest and late, c = significant group differences between early and late.
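The one-way ANOVA results in Table 4 can be approximately reproduced from the reported group sizes, means, and standard deviations alone. The sketch below does this for the Stigma CAT; because the table entries are rounded, the recomputed values are expected to be close to, but not exactly equal to, the reported F = 21.71 and eta² = 0.14. The function name is hypothetical.

```python
def anova_from_summary(groups):
    """One-way ANOVA F and eta squared from (n, mean, sd) tuples, one per group."""
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * mean for n, mean, _ in groups) / total_n
    ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    eta_squared = ss_between / (ss_between + ss_within)
    return f_value, eta_squared


# Stigma CAT rows from Table 4: premanifest, early, late.
stigma = [(101, 45.19, 6.87), (125, 49.88, 8.46), (55, 53.63, 8.54)]
f_value, eta_squared = anova_from_summary(stigma)
```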

Effect sizes

Effect sizes are included in Table 5. With the exception of anxiety, those with higher clinician rated mental health problems had larger effect sizes.

Table 5

Cohen’s d effect sizes for Neuro-QoL/PROMIS mental health patient-reported outcome measures

Neuro-QoL/PROMIS Scores	Low Clinician-Rated Severity CAT (SF)	High Clinician-Rated Severity CAT (SF)
Emotional and Behavioral Dyscontrol	0.13 (0.16)	–0.97 (–0.98)
Positive Affect and Wellbeing	0.23 (0.19)	0.98 (1.01)
Anger	0.51 (0.39)	–0.53 (–0.52)
Anxiety	0.65 (0.66)	–0.47 (–0.46)
Depression	0.37 (0.32)	–0.83 (–0.86)

Note. CATs = computer adaptive tests; SF = short form.

Impairment rates

In general, impairment rates for our HD participants were comparable to the general population (general population ≤16%); the only elevated impairment rates were for late-stage HD participants for stigma, anger, anxiety and depression (Table 4).

Classification accuracy (sensitivity/specificity)

ROC results showed that when Neuro-QoL Depression equaled 50.3, the measure demonstrated moderate sensitivity (71.6%) and moderate specificity (85.9%), with an AUC of 0.85 and likelihood ratio of 2.52 for distinguishing between those individuals with and without clinically significant “mild” depression, meeting a priori criterion for clinical decision making. In addition, ROC results showed that when Neuro-QoL Anxiety equaled 55.1, the measure demonstrated moderate sensitivity (82.6%) and moderate specificity (86.6%), with an AUC of 0.91 and likelihood ratio of 4.75 for distinguishing between those individuals with and without clinically significant “mild” anxiety symptoms, also meeting a priori criterion for clinical decision making.
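The ratios reported above follow from the formula given in the Analyses section, sensitivity/(1 − sensitivity), as the short check below shows. For comparison only, the sketch also computes the conventional positive likelihood ratio, sensitivity/(1 − specificity), which is a different quantity and is not reported in the text; the function names are hypothetical.

```python
def ratio_as_defined(sensitivity):
    """Ratio as defined in the Analyses section: sensitivity / (1 - sensitivity)."""
    return sensitivity / (1 - sensitivity)


def conventional_positive_lr(sensitivity, specificity):
    """Conventional positive likelihood ratio (comparison only; not reported in the text)."""
    return sensitivity / (1 - specificity)


depression_ratio = ratio_as_defined(0.716)   # reported sensitivity of 71.6%
anxiety_ratio = ratio_as_defined(0.826)      # reported sensitivity of 82.6%

depression_lr_plus = conventional_positive_lr(0.716, 0.859)
anxiety_lr_plus = conventional_positive_lr(0.826, 0.866)
```

Rounded to two decimals, the first two values are 2.52 and 4.75, matching the reported figures; the conventional ratios are larger (about 5.08 and 6.16), so both measures would also exceed the ≥2 criterion under that definition.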

DISCUSSION

Study findings support the reliability and validity of several Neuro-QoL and PROMIS mental health PROs to assess mental health in individuals with premanifest and manifest HD. With regard to reliability, results provided strong support for the internal consistency reliability of all of the new CAT and SF administrations to assess PRO mental health in persons with HD (i.e., all Cronbach’s α ≥0.90). In addition, these reliability coefficients were either equivalent to, or higher than, those for other generic measures of HRQOL (i.e., higher than internal consistencies for the RAND-12, and equivalent to internal consistencies for the WHODAS). All of the examined Neuro-QoL and PROMIS PRO measures required relatively brief administration times (i.e., <1 minute for both CATs and SFs) and, generally, the Neuro-QoL/PROMIS mental health PROs met acceptable standards for floor and ceiling effects. The exceptions to this were the Stigma and Depression SF administrations, both of which had floor effects. This indicates that these measures lack sensitivity for persons who are experiencing little to no depression or stigma.

Convergent and discriminant validity of the Neuro-QoL/PROMIS mental health PROs were also supported in our sample. Moderate to strong associations between Neuro-QoL/PROMIS mental health PROs and measures of self-reported mental health suggest convergent validity in persons with HD. Discriminant validity was supported by less robust correlations between the Neuro-QoL/PROMIS PROs and clinician-rated assessments of mental health, as well as by small to negligible correlations between the Neuro-QoL/PROMIS PROs and clinician-rated measures of functioning.

Findings for known groups validity were mixed. While Neuro-QoL Stigma differentiated between all three HD groups (with increasing reports of stigma as HD stage progressed), and Neuro-QoL Positive Affect and Well-Being differentiated those with premanifest and late-stage HD (with premanifest participants reporting better functioning than late-stage participants), none of the other PROs (i.e., Depression, Anxiety, or Anger) were able to differentiate between the HD staging groups. While these findings for anxiety, depression, and anger were not consistent with our hypotheses, the literature suggests that HD stage may not be related to anxiety [31, 137] or depression [3, 32]. It is also possible that anosognosia and/or frontal dysfunction, which can occur in HD (especially in the later stages, and for symptoms such as depression, apathy, and anger) [138–140], may have precluded our ability to detect group differences in this study. These findings, in conjunction with the fact that overall impairment rates for the PROs were consistent with impairment rates in the general population, suggest that the absence of group differences in our data does not undermine support for the validity of these measures to assess mental health in persons with HD. In addition, since sensitivity and specificity of the Anxiety and Depression measures met standards for clinical decision making, our findings provide some evidence of these measures’ construct validity to assess mental health in persons with HD. Finally, as expected, effect sizes for the new PROs were higher for participants with worse clinician-rated mental health functioning than for those with better functioning (with the exception of anxiety, where both groups showed moderate effects, but the effects for those with lower rated severity were higher than for those with higher rated severity). There were generally moderate to large effects for individuals with higher clinician-rated severity and small to moderate effects for those with lower clinician-rated severity. Together, these findings support construct validity of these new measures of mental health in HD.

Our findings have implications for the clinical care of mental health problems in persons with HD. Specifically, all of these measures are scored on a T-metric with a mean of 50 and a standard deviation of 10. This means that persons with scores ≥60 are likely experiencing clinically significant symptoms (i.e., their scores are worse than 84% of the general population) and persons with scores ≥70 are almost definitely experiencing clinically significant symptoms (their scores are worse than approximately 97.7% of the general population). Thus, scores on these measures can be used to guide clinical referrals for mental health treatment. In addition, for Neuro-QoL Depression and Anxiety, our findings provide support for the sensitivity and specificity of these measures relative to an established marker of clinically significant depression and anxiety, respectively. As such, persons with scores ≥50.3 on Neuro-QoL Depression and persons with scores ≥55.1 on Neuro-QoL Anxiety would be appropriate for a mental health referral. This is especially important given the elevated rates of psychiatric diagnoses in persons with HD [29–33].
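The relationship between a T-score and its standing in the reference population can be illustrated with a short sketch. It assumes the conventional T-score scaling (mean 50, standard deviation 10) and a normally distributed reference population; the function name is hypothetical and the sketch is not part of the Neuro-QoL or PROMIS scoring software.

```python
import math


def t_score_percentile(t_score: float, mean: float = 50.0, sd: float = 10.0) -> float:
    """Return the percentage of a normal reference population scoring below t_score.

    Assumes the conventional T-metric (mean 50, SD 10) unless overridden.
    """
    if sd <= 0:
        raise ValueError("sd must be positive")
    z = (t_score - mean) / sd
    # Standard normal cumulative distribution via the error function
    return 100.0 * 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


if __name__ == "__main__":
    for score in (50, 60, 70):
        print(score, round(t_score_percentile(score), 1))
```

Under these assumptions, `t_score_percentile(60)` evaluates to roughly 84, which corresponds to the one-standard-deviation threshold described above.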

While this study provides important psychometric support for Neuro-QoL/PROMIS measures of mental health in persons with HD, there are some study-specific limitations. First, this convenience sample may not represent the HD population at large. The participants in this study were recruited from well-established HD clinics, and thus may have better access to mental health services than the general HD population. Our inclusion criteria also required participants to be cognitively capable of providing informed consent, and thus excluded persons with HD who are in the latest stages of the disease. In addition, our sample included more females, and education levels were higher for premanifest participants than for manifest participants, limiting generalizability. Furthermore, while clinician-rated assessments of mental health were administered, these ratings do not correspond to mental health diagnoses. Therefore, although we used established cutoffs for determining those with and without anxiety and depression, these cutoffs were based on self-reported ratings and do not provide information about how well these measures differentiate between individuals with and without clinical diagnoses of anxiety and/or depression. Also, given that there are no consensus measures for assessing self-reported mental health symptoms in HD [104], we elected to use measures that were in common use in other clinical populations, but that did not always have existing psychometric data to support their use in HD. As such, it is possible that our findings slightly over- or underestimate clinical impairment rates in HD. More work is needed to directly examine the relationship between these self-report measures and the actual rates of psychiatric diagnoses in persons with HD. Finally, patient-reported outcomes depend largely on participants’ ability to accurately assess their own symptoms. Anosognosia, or symptom unawareness, has been found to be present even in the prodromal stages of the disease [141], and previous research has shown that cognitive impairment can affect the reliability of PROs, especially in individuals with late-stage HD and associated cognitive declines [142]. Specifically, data have shown that there is increased variability in PRO responses and an associated decrease in psychometric reliability as HD progresses; however, psychometric reliability for the PROs is typically still within acceptable limits (i.e., >0.70 [143]). For those working with individuals with later-stage HD, there are established clinical cutoffs for cognitive scores that can be used to maximize PRO reliability among those with cognitive decline. Regardless, more work needs to be done examining anosognosia and its relationship to PRO reporting in persons with HD.

In summary, the Neuro-QoL/PROMIS mental health PROs are brief, reliable, and valid assessments of emotional functioning and health-related quality of life in HD. Findings support further utilization of these measures by clinicians and researchers. Future work examining change over time will be useful to inform their utility in repeated trials or longitudinal studies. Ultimately, these measures can help fill a significant gap in PRO measurement in individuals with HD.

CONFLICTS OF INTEREST

The authors declare that there is no conflict of interest.

Work on this manuscript was supported by the National Institutes of Health (NIH), National Institute of Neurological Disorders and Stroke (R01NS077946) and the National Center for Advancing Translational Sciences (UL1TR000433). In addition, a portion of this study sample was collected in conjunction with the Predict-HD study. The Predict-HD study was supported by the NIH, National Institute of Neurological Disorders and Stroke (R01NS040068), the NIH, Center for Inherited Disease Research (provided support for sample phenotyping), and the CHDI Foundation (award to the University of Iowa). The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH.

ACKNOWLEDGMENTS

We would like to acknowledge the HDQLIFE Site Investigators and Coordinators: Noelle Carlozzi, Praveen Dayalu, Stephen Schilling, Amy Austin, Matthew Canter, Siera Goodnight, Jennifer Miner, Nicholas Migliore (University of Michigan, Ann Arbor, MI); Jane Paulsen, Nancy Downing, Isabella DeSoriano, Courtney Shadrick, Amanda Miller (University of Iowa, Iowa City, IA); Kimberly Quaid, Melissa Wesson (Indiana University, Indianapolis, IN); Christopher Ross, Gregory Churchill, Mary Jane Ong (Johns Hopkins University, Baltimore, MD); Susan Perlman, Brian Clemente, Aaron Fisher, Gloria Obialisi, Michael Rosco (University of California Los Angeles, Los Angeles, CA); Michael McCormack, Humberto Marin, Allison Dicke (Rutgers University, Piscataway, NJ); Joel Perlmutter, Stacey Barton, Shineeka Smith (Washington University, St. Louis, MO); Martha Nance, Pat Ede (Struthers Parkinson’s Center); Stephen Rao, Anwar Ahmed, Michael Lengen, Lyla Mourany, Christine Reece (Cleveland Clinic Foundation, Cleveland, OH); Michael Geschwind, Joseph Winer (University of California – San Francisco, San Francisco, CA); David Cella, Richard Gershon, Elizabeth Hahn, Jin-Shei Lai (Northwestern University, Chicago, IL).

We would also like to thank the University of Iowa, the Investigators and Coordinators of this study, the study participants, the National Research Roster for Huntington Disease Patients and Families, the Huntington Study Group, and the Huntington’s Disease Society of America. We acknowledge the assistance of Jeffrey D. Long, Hans J. Johnson, Jeremy H. Bockholt, and Roland Zschiegner. We also acknowledge Roger Albin, Kelvin Chou, and Henry Paulsen for their assistance with participant recruitment.

References

Ross

, Aylward

, Wild

, Langbehn

, Long

, Warner

, et al. Huntington disease: Natural history, biomarkers and prospects for therapeutics. Nat Rev Neurol. 2014;10(4):204–16.

Paulsen

. Early detection of Huntington disease. Future Neurol. 2010;5(1):10.2217/fnl.09.78.

Martinez-Horta

, Perez-Perez

, van Duijn

, Fernandez-Bobadilla

, Carceller

, Pagonabarraga

, et al. Neuropsychiatric symptoms are very common in premanifest and early stage Huntington’s disease. Parkinsonism Relat Disord. 2016;25:58–64.

Orth

, Handley

, Schwenke

, Dunnett

, Craufurd

, Ho

, et al. Observing Huntington’s disease: The European Huntington’s Disease Network’s REGISTRY. PLoS Curr. 2010;2:RRN1184.

Paulsen

, Ready

, Hamilton

, Mega

, Cummings

. Neuropsychiatric aspects of Huntington’s disease. J Neurol Neurosurg Psychiatry. 2001;71(3):310–4.

van Duijn

, Craufurd

, Hubers

, Giltay

, Bonelli

, Rickards

, et al. Neuropsychiatric symptoms in a European Huntington’s disease cohort (REGISTRY). J Neurol Neurosurg Psychiatry. 2014;85(12):1411–8.

, Gilbert

, Mason

, Goodman

, Barker

. Health-related quality of life in Huntington’s disease: Which factors matter most?Mov Disord. 2009;24(4):574–8.

Carlozzi

, Tulsky

. Identification of health-related quality of life (HRQOL) issues relevant to individuals with Huntington disease. J Health Psychol. 2013;18(2):212–25.

Berrios

, Wagle

, Markova

, Wagle

, Rosser

, Hodges

. Psychiatric symptoms in neurologically asymptomatic Huntington’s disease gene carriers: A comparison with gene negative at risk subjects. Acta Psychiatr Scand. 2002;105(3):224–30.

10.

Caine

, Hunt

, Weingartner

, Ebert

. Huntington’s dementia. Clinical and neuropsychological features. Arch Gen Psychiatry. 1978;35(3):377–84.

11.

Carlozzi

, Ready

. Health-related quality of life in Huntington’s disease. In: Jenkinson

, Peters

, Bromberg

, editors. Quality of life measurement in neurodegenerative and related conditions. Cambridge: Cambridge University Press; 2011.

12.

Codori

, Slavney

, Rosenblatt

, Brandt

. Prevalence of major depression one year after predictive testing for Huntington’s disease. Genet Test. 2004;8(2):114–9.

13.

De Souza

, Jones

, Rickards

. Validation of self-report depression rating scales in Huntington’s disease. Mov Disord. 2010;25(1):91–6.

14.

Duff

, Paulsen

, Beglinger

, Langbehn

, Stout

, Predict HDIotHSG. Psychiatric symptoms in Huntington’s disease before diagnosis: The predict-HD study. Biol Psychiatry. 2007;62(12):1341–6.

15.

Fisher

, Sewell

, Brown

, Churchyard

. Aggression in Huntington’s disease: A systematic review of rates of aggression and treatment methods. J Huntingtons Dis. 2014;3(4):319–32.

16.

Folstein

, Abbott

, Chase

, Jensen

, Folstein

. The association of affective disorder with Huntington’s disease in a case series and in families. Psychol Med. 1983;13(3):537–42.

17.

Julien

, Thompson

, Wild

, Yardumian

, Snowden

, Turner

, et al. Psychiatric disorders in preclinical Huntington’s disease. J Neurol Neurosurg Psychiatry. 2007;78(9):939–43.

18.

Naarding

, Kremer

, Zitman

. Huntington’s disease: A review of the literature on prevalence and treatment of neuropsychiatric phenomena. Eur Psychiatry. 2001;16(8):439–45.

19.

Naarding

, Janzing

, Eling

, van der Werf

, Kremer

. Apathy is not depression in Huntington’s disease. J Neuropsychiatry Clin Neurosci. 2009;21(3):266–70.

20.

Pflanz

, Besson

, Ebmeier

, Simpson

. The clinical manifestation of mental disorder in Huntington’s disease: A retrospective case record study of disease progression. Acta Psychiatr Scand. 1991;83(1):53–60.

21.

Reedeker

, Bouwens

, Giltay

, Le Mair

, Roos

, van der Mast

, et al. Irritability in Huntington’s disease. Psychiatry Res. 2012;200(2-3):813–8.

22.

Reedeker

, van der Mast

, Giltay

, Kooistra

, Roos

, van Duijn

. Psychiatric disorders in Huntington’s disease: A 2-year follow-up study. Psychosomatics. 2012;53(3):220–9.

23.

Rosenblatt

, Leroi

. Neuropsychiatry of Huntington’s disease and other basal ganglia disorders. Psychosomatics. 2000;41(1):24–30.

24.

Shiwach

. Psychopathology in Huntington’s disease patients. Acta Psychiatr Scand. 1994;90(4):241–6.

25.

Snowden

, Craufurd

, Griffiths

, Thompson

, Neary

. Longitudinal evaluation of cognitive disorder in Huntington’s disease. J Int Neuropsychol Soc. 2001;7(1):33–44.

26.

Thompson

, Harris

, Sollom

, Stopford

, Howard

, Snowden

, et al. Longitudinal evaluation of neuropsychiatric symptoms in Huntington’s disease. J Neuropsychiatry Clin Neurosci. 2012;24(1):53–60.

27.

van Duijn

, Kingma

, van der Mast

. Psychopathology in verified Huntington’s disease gene carriers. J Neuropsychiatry Clin Neurosci. 2007;19(4):441–8.

28.

van Duijn

, Reedeker

, Giltay

, Eindhoven

, Roos

, van der Mast

. Course of irritability, depression and apathy in Huntington’s disease in relation to motor symptoms during a two-year follow-up period. Neurodegener Dis. 2014;13(1):9–16.

29.

Paoli

, Botturi

, Ciammola

, Silani

, Prunas

, Lucchiari

, et al. Neuropsychiatric burden in Huntington’s disease. Brain Sci. 2017;7(6):E67.

30.

Epping

, Kim

, Craufurd

, Brashers-Krug

, Anderson

, McCusker

, et al. Longitudinal psychiatric symptoms in prodromal Huntington’s disease: A decade of data. Am J Psychiatry. 2016;173(2):184–92.

31.

Dale

, van Duijn

. Anxiety in Huntington’s disease. J Neuropsychiatry Clin Neurosci. 2015;27(4):262–71.

32.

Epping

, Mills

, Beglinger

, Fiedorowicz

, Craufurd

, Smith

, et al. Characterization of depression in prodromal Huntington disease in the neurobiological predictors of HD (PREDICT-HD) study. J Psychiatr Res. 2013;47(10):1423–31.

33.

Epping

, Paulsen

. Depression in the early stages of Huntington disease. Neurodegener Dis Manag. 2011;1(5):407–14.

34.

Wesson

, Boileau

, Perlmutter

, Paulsen

, Barton

, McCormack

, et al. Suicidal ideation assessment in individuals with premanifest and manifest Huntington disease. J Huntingtons Dis. 2018;7(3):239–49.

35.

Fiedorowicz

, Mills

, Ruggle

, Langbehn

, Paulsen

, PREDICT-HD Investigators of the Huntington Study Group. Suicidal behavior in prodromal Huntington disease. Neurodegener Dis. 2011;8(6):483–90.

36.

Wetzel

, Gehl

, Dellefave-Castillo

, Schiffman

, Shannon

, Paulsen

, et al. Suicidal ideation in Huntington disease: The role of comorbidity. Psychiatry Res. 2011;188(3):372–6.

37.

Kachian

, Cohen-Zimerman

, Bega

, Gordon

, Grafman

. Suicidal ideation and behavior in Huntington’s disease: Systematic review and recommendations. J Affect Disord. 2019;250:319–29.

38.

Honrath

, Dogan

, Wudarczyk

, Gorlich

, Votinov

, Werner

, et al. Risk factors of suicidal ideation in Huntington’s disease: Literature review and data from Enroll-HD. J Neurol. 2018;265(11):2548–61.

39.

Anderson

, Eberly

, Groves

, Kayson

, Marder

, Young

, et al. Risk factors for suicidal ideation in people at risk for Huntington’s disease. J Huntingtons Dis. 2016;5(4):389–94.

40.

Hubers

, van Duijn

, Roos

, Craufurd

, Rickards

, Bernhard Landwehrmeyer

, et al. Suicidal ideation in a European Huntington’s disease population. J Affect Disord. 2013;151(1):248–58.

41.

Robins Wahlin

. To know or not to know: A review of behaviour and suicidal ideation in preclinical Huntington’s disease. Patient Educ Couns. 2007;65(3):279–87.

42.

Larsson

, Luszcz

, Bui

, Wahlin

. Depression and suicidal ideation after predictive testing for Huntington’s disease: A two-year follow-up study. J Genet Counsel. 2006;15(5):361–74.

43.

Robins Wahlin

, Backman

, Lundin

, Haegermark

, Winblad

, Anvret

. High suicidal ideation in persons testing for Huntington’s disease. Acta Neurol Scand. 2000;102(3):150–61.

44.

U.S. Food and Drug Administration. Clinical Outcome Assessment (COA) Qualification Program. Available from: http://www.fda.gov/Drugs/DevelopmentApprovalProcess/DrugDevelopmentToolsQualificationProgram/ucm284077.htm.

45.

Cella

, Nowinski

, Peterman

, Victorson

, Miller

, Lai

J-S

, et al. The Neurology Quality of Life Measurement (Neuro-QOL) Initiative. Arch Phys Med Rehabil. 2011;92(Suppl 1):S28–S36.

46.

Cella

, Lai

, Nowinski

, Victorson

, Peterman

, Miller

, et al. Neuro-QOL: Brief measures of health-related quality of life for clinical research in neurology. Neurology. 2012;78:1860–7.

47.

Cella

, Yount

, Rothrock

, Gershon

, Cook

, Reeve

, et al. The Patient-Reported Outcomes Measurement Information System (PROMIS): Progress of an NIH Roadmap cooperative group during its first two years. Med Care. 2007;45(5 Suppl 1):S3–11.

48.

Cella

, Riley

, Stone

, Rothrock

, Reeve

, Yount

, et al. The Patient-Reported Outcomes Measurement Information System (PROMIS) developed and tested in its first wave of adult self-reported health outcome item banks: 2005-2008. J Clin Epidemiol. 2010;63:1179–94.

49.

Nunnally

, Bernstein

. Psychometric theory. New York, NY: McGraw-Hill; 1994.

50.

Campbell

, Fiske

. Convergent and discriminant validation by the multitrait-multimethod matrix. Psychol Bull. 1959;56(2):81–105.

51.

Youngstrom

. A primer on receiver operating characteristic analysis and diagnostic efficiency statistics for pediatric psychology: We are ready to ROC. J Pediatr Psychol. 2014;39(2):204–21.

52.

Hanauer

, Mei

, Law

, Khanna

, Zheng

. Supporting information retrieval from electronic health records: A report of University of Michigan’s nine-year experience in developing and using the Electronic Medical Record Search Engine (EMERSE). J Biomed Inform. 2015;55:290–300.

53.

Paulsen

, Hayden

, Stout

, Langbehn

, Aylward

, Ross

, et al. Preparing for preventive clinical trials - The Predict-HD study. Arch Neurol. 2006;63(6):883–90.

54.

Gershon

, Lai

J-S

, Bode

, Choi

, Moy

, Bleck

, et al. Neuro-QOL: Quality of life item banks for adults with neurological disorders: Item development and calibrations based upon clinical and general population testing. Qual Life Res. 2012;21(3):475–86.

55.

Ustun

, Chatterji

, Kostanjsek

, Rehm

, Kennedy

, Epping-Jordan

, et al. Developing the world health organization disability assessment schedule 2.0. Bull World Health Organ. 2010;88(11):815–23.

56.

Rehm

, Ustun

, Saxena

, Nelson

, Chatterji

, Ivis

, et al. On the development and psychometric testing of the WHO screening instrument to assess disablement in the general population. Int J Methods Psychiatr Res. 2006;8(2):110–22.

57.

Grinnon

, Miller

, Marler

, Lu

, Stout

, Odenkirchen

, et al. National Institute of Neurological Disorders and Stroke Common Data Element Project - approach and methods. Clin Trials. 2012;9(3):322–9.

58.

Hays

, Sherbourn

, Mazel

. User’s manual for the Medical Outcomes Study (MOS) core measures of health-related quality of life. Santa Monica, CA: RAND corporation; 1995.

59.

Rabin

, de Charro

. EQ-5D: A measure of health status from the EuroQol Group. Ann Med. 2001;33(5):337–43.

60.

Carlozzi

, Kratz

, Downing

, Goodnight

, Miner

, Migliore

, et al. Validity of the 12-item World Health Organization Disability Assessment Schedule 2.0 (WHODAS 2.0) in individuals with Huntington disease (HD). Qual Life Res. 2015;24(8):1963–71.

61.

Downing

, Kim

, Williams

, Long

, Mills

, Paulsen

. WHODAS 2.0 in prodromal Huntington disease: Measures of functioning in neuropsychiatric disease. Eur J Hum Genet. 2014;22(8):958–63.

62.

Kim

, Long

, Mills

, Downing

, Williams

, Paulsen

, et al. Performance of the 12-item WHODAS 2.0 in prodromal Huntington disease. Eur J Hum Genet. 2015;23(11):1584–7.

63.

Mayrink

, Souza

, Silveira

, Guida

, Costa

, Parpinelli

, et al. Reference ranges of the WHO Disability Assessment Schedule (WHODAS 2.0) score and diagnostic validity of its 12-item version in identifying altered functioning in healthy postpartum women. Int J Gynaecol Obstet. 2018;141(Suppl 1):48–54.

64.

Younus

, Wang

, Yu

, Fang

, Guo

. Reliability and validity of the 12-item WHODAS 2.0 in patients with Kashin-Beck disease. Rheumatol Int. 2017;37(9):1567–73.

65.

Yen

, Hwang

, Liou

, Chiu

, Hsu

, Chi

, et al. Validity and reliability of the Functioning Disability Evaluation Scale-Adult Version based on the WHODAS 2.0–36 items. J Formos Med Assoc. 2014;113(11):839–49.

66.

Kulnik

, Nikoletou

. WHODAS 2.0 in community rehabilitation: A qualitative investigation into the validity of a generic patient-reported measure of disability. Disabil Rehabil. 2014;36(2):146–54.

67.

Wolf

, Tate

, Lannin

, Middleton

, Lane-Brown

, Cameron

. The World Health Organization Disability Assessment Scale, WHODAS II: Reliability and validity in the measurement of activity and participation in a spinal cord injury population. J Rehabil Med. 2012;44(9):747–55.

68.

Kucukdeveci

, Kutlay

, Yildizlar

, Oztuna

, Elhan

, Tennant

. The reliability and validity of the World Health Organization Disability Assessment Schedule (WHODAS-II) in stroke. Disabil Rehabil. 2013;35(3):214–20.

69.

Hernandez

, Garin

, Dima

, Pont

, Marti Pastor

, Alonso

, et al. EuroQol (EQ-5D-5L) validity in assessing the quality of life in adults with asthma: Cross-sectional study. J Med Internet Res. 2019;21(1):e10178.

70.

Bekairy

, Bustami

, Almotairi

, Jarab

, Katheri

, Aldebasi

, et al. Validity and reliability of the Arabic version of the the EuroQOL (EQ-5D). A study from Saudi Arabia. Int J Health Sci. 2018;12(2):16–20.

71.

Bejjani

, Fiore

Jr , Lee

, Kaneva

, Mata

, Ncuti

, et al. Validity of the EuroQol-5 dimensions as a measure of recovery after pulmonary resection. J Surg Res. 2015;194(1):281–8.

72.

Obradovic

, Lal

, Liedgens

. Validity and responsiveness of EuroQol-5 dimension (EQ-5D) versus Short Form-6 dimension (SF-6D) questionnaire in chronic pain. Health Qual Life Outcomes. 2013;11:110.

73.

Slobogean

, Noonan

, O’Brien

. The reliability and validity of the Disabilities of Arm, Shoulder, and Hand, EuroQol-5D, Health Utilities Index, and Short Form-6D outcome instruments in patients with proximal humeral fractures. J Shoulder Elbow Surg. 2010;19(3):342–8.

74.

Adobor

, Rimeslatten

, Keller

, Brox

. Repeatability, reliability, and concurrent validity of the scoliosis research society-22 questionnaire and EuroQol in patients with adolescent idiopathic scoliosis. Spine (Phila Pa 1976). 2010;35(2):206–9.

75.

Haywood

, Garratt

, Lall

, Smith

, Lamb

. EuroQol EQ-5D and condition-specific measures of health outcome in women with urinary incontinence: Reliability, validity and responsiveness. Qual Life Res. 2008;17(3):475–83.

76.

Prieto

, Novick

, Sacristan

, Edgell

, Alonso

, SOHO Study Group. A Rasch model analysis to test the cross-cultural validity of the EuroQoL-5D in the Schizophrenia Outpatient Health Outcomes Study. Acta Psychiatr Scand Suppl. 2003(416):24–9.

77.

, Jacobson

, Frick

, Clark

, Revicki

, Freedberg

, et al. Validity and responsiveness of the euroqol as a measure of health-related quality of life in people enrolled in an AIDS clinical trial. Qual Life Res. 2002;11(3):273–82.

78.

Fransen

, Edmonds

. Reliability and validity of the EuroQol in patients with osteoarthritis of the knee. Rheumatology (Oxford). 1999;38(9):807–13.

79.

Hurst

, Kind

, Ruta

, Hunter

, Stubbings

. Measuring health-related quality of life in rheumatoid arthritis: Validity, responsiveness and reliability of EuroQol (EQ-5D). Br J Rheumatol. 1997;36(5):551–9.

80.

Hurst

, Jobanputra

, Hunter

, Lambert

, Lochhead

, Brown

. Validity of Euroqol–a generic health status instrument–in patients with rheumatoid arthritis. Economic and Health Outcomes Research Group. Br J Rheumatol. 1994;33(7):655–62.

81.

Brazier

, Jones

, Kind

. Testing the validity of the Euroqol and comparing it with the SF-36 health survey questionnaire. Qual Life Res. 1993;2(3):169–80.

82.

Huo

, Guo

, Shenkman

, Muller

. Assessing the reliability of the short form 12 (SF-12) health survey in adults with mental health conditions: A report from the wellness incentive and navigation (WIN) study. Health Qual Life Outcomes. 2018;16(1):34.

83.

Shou

, Ren

, Wang

, Yan

, Cao

, Wang

, et al. Reliability and validity of 12-item Short-Form health survey (SF-12) for the health status of Chinese community elderly population in Xujiahui district of Shanghai. Aging Clin Exp Res. 2016;28(2):339–46.

84.

Bohannon

, Maljanian

, Landes

. Test-retest reliability of short form (SF)-12 component scores of patients with stroke. Int J Rehabil Res. 2004;27(2):149–50.

85.

Luo

, George

, Kakouras

, Edwards

, Pietrobon

, Richardson

, et al. Reliability, validity, and responsiveness of the short form 12-item survey (SF-12) in patients with back pain. Spine (Phila Pa 1976). 2003;28(15):1739–45.

86.

Salyers

, Bosworth

, Swanson

, Lamb-Pagone

, Osher

. Reliability and validity of the SF-12 health survey among people with severe mental illness. Med Care. 2000;38(11):1141–50.

87.

Tawiah

, Al Sayah

, Ohinmaa

, Johnson

. Discriminative validity of the EQ-5D-5L and SF-12 in older adults with arthritis. Health Qual Life Outcomes. 2019;17(1):68.

88.

Conner-Spady

, Marshall

, Bohm

, Dunbar

, Noseworthy

. Comparing the validity and responsiveness of the EQ-5D-5L to the Oxford hip and knee scores and SF-12 in osteoarthritis patients 1 year following total joint replacement. Qual Life Res. 2018;27(5):1311–22.

89.

Patel

, Lester

, Marra

, van der Kop

, Ritvo

, Engel

, et al. The validity of the SF-12 and SF-6D instruments in people living with HIV/AIDS in Kenya. Health Qual Life Outcomes. 2017;15(1):143.

90.

Edwards

, McFadden

, Lanier

, Murtaugh

, Ferucci

, Redwood

, et al. Construct validity of the SF-12 among American Indian and Alaska Native people using two known scoring methods. J Health Care Poor Underserved. 2012;23(3):1123–36.

91.

Chariyalertsak

, Wansom

, Kawichai

, Ruangyuttikarna

, Kemerer

, Wu

. Reliability and validity of Thai versions of the MOS-HIV and SF-12 quality of life questionnaires in people living with HIV/AIDS. Health Qual Life Outcomes. 2011;9:15.

92.

Jakobsson

, Westergren

, Lindskov

, Hagell

. Construct validity of the SF-12 in three different samples. J Eval Clin Pract. 2012;18(3):560–6.

93.

Okonkwo

, Roth

, Pulley

, Howard

. Confirmatory factor analysis of the validity of the SF-12 for persons with and without a history of stroke. Qual Life Res. 2010;19(9):1323–31.

94.

Failde

, Medina

, Ramirez

, Arana

. Construct and criterion validity of the SF-12 health questionnaire in patients with acute myocardial infarction and unstable angina. J Eval Clin Pract. 2010;16(3):569–73.

95.

Larson

, Schlundt

, Patel

, Beard

, Hargreaves

. Validity of the SF-12 for use in a low-income African American community-based research initiative (REACH 2010). Prev Chronic Dis. 2008;5(2):A44.

96.

Globe

, Levin

, Chang

, Mackenzie

, Azen

. Validity of the SF-12 quality of life instrument in patients with retinal diseases. Ophthalmology. 2002;109(10):1793–8.

97.

Jenkinson

, Chandola

, Coulter

, Bruster

. An assessment of the construct validity of the SF-12 summary scores across ethnic groups. J Public Health Med. 2001;23(3):187–94.

98.

Schofield

, Mishra

. Validity of the SF-12 compared with the SF-36 Health Survey in Pilot Studies of the Australian Longitudinal Study on Women’s Health. J Health Psychol. 1998;3(2):259–71.

99.

, Robbins

, Walters

, Kaptoge

, Sahakian

, Barker

. Health-related quality of life in Huntington’s disease: A comparison of two generic instruments, SF-36 and SIP. Mov Disord. 2004;19(11):1341–8.

100.

Feeny

, Farris

, Cote

, Johnson

, Tsuyuki

, Eng

. A cohort study found the RAND-12 and Health Utilities Index Mark 3 demonstrated construct validity in high-risk primary care patients. J Clin Epidemiol. 2005;58(2):138–41.

101.

Maddigan

, Feeny

, Johnson

, Investigators

. Construct validity of the RAND-12 and Health Utilities Index Mark 2 and 3 in type 2 diabetes. Qual Life Res. 2004;13(2):435–48.

102.

Kroenke

, Spitzer

, Williams

. The PHQ- Validity of a brief depression severity measure. J Gen Intern Med. 2001;16(9):606–13.

103.

Spitzer

, Kroenke

, Williams

, Lowe

. A brief measure for assessing generalized anxiety disorder: The GAD-7. Arch Intern Med. 2006;166(10):1092–7.

104.

Mestre

, van Duijn

, Davis

, Bachoud-Levi

, Busse

, Anderson

, et al. Rating scales for behavioral symptoms in Huntington’s disease: Critique and recommendations. Mov Disord. 2016;31(10):1466–78.

105.

Amtmann

, Kim

, Chung

, Bamer

, Askew

, Wu

, et al. Comparing CESD-10, PHQ-9, and PROMIS depression instruments in individuals with multiple sclerosis. Rehabil Psychol. 2014;59(2):220–9.

106.

Bombardier

, Richards

, Krause

, Tulsky

, Tate

. Symptoms of major depression in people with spinal cord injury: Implications for screening. Arch Phys Med Rehabil. 2004;85(11):1749–56.

107.

Hammash

, Hall

, Lennie

, Heo

, Chung

, Lee

, et al. Psychometrics of the PHQ-9 as a measure of depressive symptoms in patients with heart failure. Eur J Cardiovasc Nurs. 2013;12(5):446–53.

108.

Pilkonis

, Yu

, Dodds

, Johnston

, Maihoefer

, Lawrence

. Validation of the depression item bank from the Patient-Reported Outcomes Measurement Information System (PROMIS) in a three-month observational study. J Psychiatr Res. 2014;56:112–9.

109.

Doi

, Ito

, Takebayashi

, Muramatsu

, Horikoshi

. Factorial validity and invariance of the 7-Item Generalized Anxiety Disorder Scale (GAD-7) among populations with and without self-reported psychiatric diagnostic status. Front Psychol. 2018;9:1741.

110.

Zhong

, Gelaye

, Zaslavsky

, Fann

, Rondon

, Sanchez

, et al. Diagnostic validity of the Generalized Anxiety Disorder - 7 (GAD-7) among pregnant women. PLoS One. 2015;10(4):e0125096.

111.

Ruiz

, Zamorano

, Garcia-Campayo

, Pardo

, Freire

, Rejas

. Validity of the GAD-7 scale as an outcome measure of disability in patients with generalized anxiety disorders in primary care. J Affect Disord. 2011;128(3):277–86.

112.

Indu

, Anilkumar

, Vijayakumar

, Kumar

, Sarma

, Remadevi

, et al. Reliability and validity of PHQ-9 when administered by health workers for depression screening among women in primary care. Asian J Psychiatr. 2018;37:10–4.

113.

Erbe

, Eichert

, Rietz

, Ebert

. Interformat reliability of the patient health questionnaire: Validation of the computerized version of the PHQ-9. Internet Interv. 2016;5:1–4.

114.

. Rapid screening of psychological well-being of patients with chronic illness: Reliability and validity test on WHO-5 and PHQ-9 Scales. Depress Res Treat. 2014;2014:239490.

115.

Milette

, Hudson

, Baron

, Thombs

, Canadian Scleroderma Research Group. Comparison of the PHQ-9 and CES-D depression scales in systemic sclerosis: Internal consistency reliability, convergent validity and clinical correlates. Rheumatology (Oxford). 2010;49(4):789–96.

116. Chen, Chiu, Xu, Ma, Jin, Wu, et al. Reliability and validity of the PHQ-9 for screening late-life depression in Chinese primary care. Int J Geriatr Psychiatry. 2010;25(11):1127–33.

117. Poongothai, Pradeepa, Ganesan, Mohan. Reliability and validity of a modified PHQ-9 item inventory (PHQ-12) as a screening instrument for assessing depression in Asian Indians (CURES-65). J Assoc Physicians India. 2009;57:147–52.

118. Monahan, Shacham, Reece, Kroenke, Ong’or, Omollo, et al. Validity/reliability of PHQ-9 and PHQ-2 depression scales among adults living with HIV/AIDS in western Kenya. J Gen Intern Med. 2009;24(2):189–97.

119. McCord, Provost. Construct validity of the PHQ-9 depression screen: Correlations with substantive scales of the MMPI-2-RF. J Clin Psychol Med Settings. 2019. doi: 10.1007/s10880-019-09629-z

120. Rancans, Trapencieris, Ivanovs, Vrublevska. Validity of the PHQ-9 and PHQ-2 to screen for depression in nationwide primary care population in Latvia. Ann Gen Psychiatry. 2018;17:33.

121. Doi, Ito, Takebayashi, Muramatsu, Horikoshi. Factorial validity and invariance of the Patient Health Questionnaire (PHQ)-9 among clinical and non-clinical populations. PLoS One. 2018;13(7):e0199235.

122. Dadfar, Kalibatseva, Lester. Reliability and validity of the Farsi version of the Patient Health Questionnaire-9 (PHQ-9) with Iranian psychiatric outpatients. Trends Psychiatry Psychother. 2018;40(2):144–51.

123. Christensen, Oernboel, Zatzick, Russo. Screening for depression: Rasch analysis of the structural validity of the PHQ-9 in acutely injured trauma survivors. J Psychosom Res. 2017;97:18–22.

124. Arrieta, Aguerrebere, Raviola, Flores, Elliott, Espinosa, et al. Validity and utility of the Patient Health Questionnaire (PHQ)-2 and PHQ-9 for screening and diagnosis of depression in rural Chiapas, Mexico: A cross-sectional study. J Clin Psychol. 2017;73(9):1076–90.

125. Inagaki, Ohtsuki, Yonemoto, Kawashima, Saitoh, Oikawa, et al. Validity of the Patient Health Questionnaire (PHQ)-9 and PHQ-2 in general internal medicine primary care at a Japanese rural hospital: A cross-sectional study. Gen Hosp Psychiatry. 2013;35(6):592–7.

126. Turvey, Sheeran, Dindo, Wakefield, Klein. Validity of the Patient Health Questionnaire, PHQ-9, administered through interactive-voice-response technology. J Telemed Telecare. 2012;18(6):348–51.

127. de Lima Osorio, Vilela Mendes, Crippa, Loureiro. Study of the discriminative validity of the PHQ-9 and PHQ-2 in a sample of Brazilian women in the context of primary health care. Perspect Psychiatr Care. 2009;45(3):216–27.

128. Gjerdingen, Crow, McGovern, Miner, Center. Postpartum depression screening at well-child visits: Validity of a 2-question screen and the PHQ-9. Ann Family Med. 2009;7(1):63–70.

129. Shoulson, Fahn. Huntington disease - Clinical care and evaluation. Neurology. 1979;29(1):1–3.

130. Huntington Study Group. Unified Huntington’s Disease Rating Scale: Reliability and consistency. Mov Disord. 1996;11(2):136–42.

131. Marder, Zhao, Myers, Cudkowicz, Kayson, Kieburtz, et al. Rate of functional decline in Huntington’s disease. Huntington Study Group. Neurology. 2000;54(2):452–8.

132. Andresen. Criteria for assessing the tools of disability outcomes research. Arch Phys Med Rehabil. 2000;81(12 Suppl 2):S15–20.

133. Cramer, Howitt. The Sage dictionary of statistics. Thousand Oaks, CA: Sage; 2004.

134. Heaton, Miller, Taylor, Grant. Revised comprehensive norms for an expanded Halstead-Reitan Battery: Demographically adjusted neuropsychological norms for African American and Caucasian adults. Lutz, FL: Psychological Assessment Resources, Inc.; 2004.

135. Grimes, Schulz. Refining clinical diagnoses with likelihood ratios. Lancet. 2005;365:1500–5.

136. Carlozzi, Schilling, Lai, Paulsen, Hahn, Perlmutter, et al. HDQLIFE: Development and assessment of health-related quality of life in Huntington disease (HD). Qual Life Res. 2016;25(10):2441–55.

137. Dale, Maltby, Shimozaki, Cramp, Rickards, REGISTRY Investigators of the European Huntington’s Disease Network. Disease stage, but not sex, predicts depression and psychological distress in Huntington’s disease: A European population study. J Psychosom Res. 2016;80:17–22.

138. Hoth, Paulsen, Moser, Tranel, Clark, Bechara. Patients with Huntington’s disease have impaired awareness of cognitive, emotional, and functional abilities. J Clin Exp Neuropsychol. 2007;29(4):365–76.

139. Chatterjee, Anderson, Moskowitz, Hauser, Marder. A comparison of self-report and caregiver assessment of depression, apathy, and irritability in Huntington’s disease. J Neuropsychiatry Clin Neurosci. 2005;17(3):378–83.

140. Duff, Paulsen, Beglinger, Langbehn, Wang, Stout, et al. “Frontal” behaviors before the diagnosis of Huntington’s disease and their relationship to markers of disease progression: Evidence of early lack of awareness. J Neuropsychiatry Clin Neurosci. 2010;22(2):196–207.

141. McCusker, Gunn, Epping, Loy, Radford, Griffith, et al. Unawareness of motor phenoconversion in Huntington disease. Neurology. 2013;81(13):1141–7.

142. Carlozzi, Schilling, Kratz, Paulsen, Frank, Stout. Understanding patient-reported outcome measures in Huntington disease: At what point is cognitive impairment related to poor measurement reliability? Qual Life Res. 2018;27(10):2541–55.

143. Cohen. A coefficient of agreement for nominal scales. Educ Psychol Meas. 1960;20(1):10.