Abstract
Background:
The Patient Scar Assessment Questionnaire (PSAQ) was constructed to evaluate the effect of any surgical therapy that results in a linear scar. This study aims to demonstrate the reliability and validity of the Appearance and Consciousness subscales of the PSAQ in patients who underwent thyroidectomy or parathyroidectomy.
Methods:
Patients who underwent a thyroidectomy or parathyroidectomy between 2000 and 2010 were administered the aforementioned subscales of the PSAQ. Each subscale was separately evaluated for its psychometric performance according to established criteria. Acceptability, reliability, and internal validity analyses were conducted.
Results:
A total of 696 patients (mean age=51.6 years) participated in this study. Acceptable Cronbach's alpha values were demonstrated for the Appearance (α=0.79) and Consciousness (α=0.85) subscales. Test-retest reliability analysis also supported reliability for the Appearance (Intraclass Correlation Coefficient [ICC]=0.79) and Consciousness (ICC=0.81) subscales. Individual subscale items' correlations with all subscale scores were acceptable for the Appearance (0.31 to 0.78) and Consciousness (0.23 to 0.81) subscales. Internal validity was supported by evaluating correlations between the global assessment item of each subscale and both summary subscale scores (Appearance: 0.42 to 0.72, Consciousness: 0.66 to 0.67).
Conclusions:
The Appearance and Consciousness subscales of the PSAQ are both reliable and valid for the assessment of a linear scar following thyroid or parathyroid surgery, independent of the minimally invasive approach being used.
Introduction
There is only one study in the literature evaluating patient satisfaction based on scar location and size following thyroid and parathyroid surgery (15). A paired cohort study comparing a conventional thyroidectomy incision (average length 7.58 cm) to the smaller incision (average length 3.36 cm) used for minimal-access parathyroid surgery concluded that there was no difference in subjective scar perception between patients with longer scars and those with smaller scars (15). However, this study was criticized for its small sample size (n=22) and the advanced age of its participants (mean age=70 years), which is not representative of most thyroidectomy patients (16). In general, scar assessment has mainly focused on burn scars, but interest in postsurgical scars is increasing (17).
Recently, a validated patient-reported measure for linear postoperative scars, the Patient Scar Assessment Questionnaire (PSAQ), was introduced (18). The psychometric properties of the PSAQ were demonstrated by Durani et al. in healthy human volunteers, in a scar revision surgical group, after head and neck nevi excision surgery, after varicose vein surgery, and after cardiothoracic surgery (18). Subsequently, a systematic review concluded that such an instrument would be valuable (19). This study aims to examine the validity and reliability of the PSAQ in patients following thyroid and parathyroid surgery.
Materials and Methods
Patient sample and study design
Randomly selected patients who underwent a thyroidectomy or parathyroidectomy between 2000 and 2010 were administered the Appearance, Symptoms and Consciousness subscales of the PSAQ. The patients were asked to respond to the questionnaire items via a telephone interview conducted by two physicians. All patients were Greek Caucasians. Only one center (Department of Surgery, “Hygeia” Hospital, Athens, Greece) was involved in the study and all patients were treated by the same surgical team. The study was approved by the hospital's ethics committee and written informed consent was obtained from all patients. Patients treated for benign or malignant disease were included in the study.
The PSAQ is divided into five distinct domains. Only three of them (Appearance, Symptoms and Consciousness) were used in our interview so as to reduce its duration and increase the response rate. The subscales of PSAQ can be used in isolation without affecting reliability or validity (18). In this study, psychometric assessment of the Symptoms subscale was abandoned because 4 of the 6 items in this subscale have double-stemmed response options. This subscale needs further modification before being applied in surgical groups with linear scars where scar symptoms are rare, as already noted by the creators of the PSAQ (18).
The Appearance subscale contains nine questions, while the Consciousness subscale contains six questions. Each question has four possible responses (scored from 1 to 4), and the item scores are summed to give the total score for each subscale: Appearance (range 9–36) and Consciousness (range 6–24). Each of these two subscales also has a global question, scored out of 5 points for the Appearance subscale and out of 4 points for the Consciousness subscale. Higher scores indicate a worse cosmetic outcome.
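The scoring rule described above can be illustrated with a minimal sketch. The function name `subscale_score` and the constants are hypothetical and are not part of the PSAQ itself; the sketch only assumes what the text states, namely that each item is scored from 1 to 4 and that item scores are summed.

```python
# Minimal sketch of the subscale scoring rule (names are illustrative assumptions).
APPEARANCE_ITEMS = 9      # nine questions, total range 9-36
CONSCIOUSNESS_ITEMS = 6   # six questions, total range 6-24


def subscale_score(responses, n_items):
    """Sum the item responses of one subscale; each response must be 1, 2, 3 or 4."""
    if len(responses) != n_items:
        raise ValueError(f"expected {n_items} responses, got {len(responses)}")
    if any(r not in (1, 2, 3, 4) for r in responses):
        raise ValueError("each response must be an integer from 1 to 4")
    return sum(responses)


# The extreme response patterns reproduce the ranges quoted in the text.
appearance_min = subscale_score([1] * APPEARANCE_ITEMS, APPEARANCE_ITEMS)
appearance_max = subscale_score([4] * APPEARANCE_ITEMS, APPEARANCE_ITEMS)
consciousness_min = subscale_score([1] * CONSCIOUSNESS_ITEMS, CONSCIOUSNESS_ITEMS)
consciousness_max = subscale_score([4] * CONSCIOUSNESS_ITEMS, CONSCIOUSNESS_ITEMS)
```

Because higher scores indicate a worse cosmetic outcome, a respondent who chooses the lowest option on every item receives the minimum total for that subscale.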
Psychometric evaluation of the PSAQ
Each subscale in the PSAQ was separately evaluated for its psychometric performance according to established criteria (Table 1) (20–23). Acceptability analysis, reliability analysis (internal consistency and test-retest reliability), and internal validity analysis (construct validity using within scale analysis) were conducted. External validity, known group differences, and sensitivity analyses were not performed.
Acceptability of PSAQ was assessed by the completeness of data and by the score distributions (23). Missing data of <5% and skewness values between +1 and −1 are indicators of high acceptability.
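As an illustration of the two acceptability criteria, the following sketch computes the proportion of missing responses and the skewness of a list of summary scores. The text does not state which skewness estimator was used; the adjusted Fisher-Pearson sample coefficient below is an assumption, as are the function names and the use of `None` to mark a missing score.

```python
import math


def missing_fraction(scores):
    """Fraction of scores that are missing (missing values are represented as None)."""
    return sum(s is None for s in scores) / len(scores)


def sample_skewness(scores):
    """Adjusted Fisher-Pearson sample skewness (an assumed choice of estimator)."""
    xs = [s for s in scores if s is not None]
    n = len(xs)
    if n < 3:
        raise ValueError("at least three observed scores are required")
    mean = sum(xs) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))
    if sd == 0:
        raise ValueError("skewness is undefined for constant scores")
    return n / ((n - 1) * (n - 2)) * sum(((x - mean) / sd) ** 3 for x in xs)


def is_highly_acceptable(scores):
    """Apply the criteria from the text: <5% missing data and skewness within [-1, +1]."""
    return missing_fraction(scores) < 0.05 and -1 <= sample_skewness(scores) <= 1


# Toy data only, not study data.
symmetric_scores = [9, 10, 11, 12, 13]
right_skewed_scores = [9] * 8 + [30, 36]
```

With these toy inputs, `symmetric_scores` meets both criteria, whereas `right_skewed_scores` fails the skewness criterion because most scores sit at the minimum with a long tail towards higher values.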
Internal consistency was measured with Cronbach's alpha statistic. Values of 0.70 and higher were considered acceptable (21,22). The item-total correlation test was performed to check whether any question (item) was inconsistent with the rest of the subscale. The Spearman correlation coefficient was calculated between the scores of an individual item and the sum of the scores of the remaining items that form the scale. An item was considered to correlate with the total when the Spearman correlation coefficient was higher than 0.20 (21,22).
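The two internal consistency statistics described above can be sketched as follows. This is an illustrative implementation using only the standard library, not the SPSS procedure used in the study; the function names are assumptions, the input is a list of respondents where each respondent is a list of item scores, and ties are handled with average ranks.

```python
import math


def sample_variance(xs):
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)


def cronbach_alpha(rows):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of the total score)."""
    k = len(rows[0])
    item_variances = [sample_variance([r[j] for r in rows]) for j in range(k)]
    total_variance = sample_variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def average_ranks(xs):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for t in range(i, j + 1):
            ranks[order[t]] = shared
        i = j + 1
    return ranks


def pearson(xs, ys):
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def spearman(xs, ys):
    """Spearman correlation computed as the Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


def corrected_item_total(rows):
    """Correlate each item with the sum of the remaining items of the same subscale."""
    k = len(rows[0])
    return [
        spearman([r[j] for r in rows], [sum(r) - r[j] for r in rows])
        for j in range(k)
    ]


# Toy data only: four respondents answering three perfectly consistent items.
toy_rows = [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]
toy_alpha = cronbach_alpha(toy_rows)
toy_item_total = corrected_item_total(toy_rows)
```

In this toy case every item moves in lockstep with the others, so alpha and all corrected item-total correlations reach their maximum of 1. Real questionnaire data would give lower values, which are then compared against the 0.70 and 0.20 thresholds stated in the text.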
The stability of PSAQ was assessed by interviewing 10% of the respondents (70 patients) on two different occasions and examining the correlation between test and retest scores. The subgroup of patients used for test-retest reliability analysis was randomly selected. The repeated PSAQ was completed within 2 months of the initial completed questionnaire. A 2-month interval was selected to ensure that scars would not change significantly over this period and that respondents would not recall their responses from the first assessment. Test-retest reliability analysis was performed and the intraclass correlation coefficient was calculated. Values higher than 0.70 were considered acceptable (20,22). Statistical analysis was performed using SPSS software (version 18.0).
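A test-retest intraclass correlation coefficient can be sketched as below. The text does not specify which ICC form was computed; this example assumes the two-way random-effects, absolute-agreement, single-measurement form, often written ICC(2,1), which is only one of several possible choices. The function name and the toy scores are hypothetical.

```python
def icc_2_1(test, retest):
    """ICC(2,1) for two occasions, computed from two-way ANOVA mean squares.

    Assumed form: (MSR - MSE) / (MSR + (k - 1) * MSE + k * (MSC - MSE) / n),
    where MSR, MSC and MSE are the between-subject, between-occasion and
    residual mean squares, n is the number of subjects and k = 2 occasions.
    """
    if len(test) != len(retest):
        raise ValueError("test and retest must have the same length")
    n, k = len(test), 2
    rows = list(zip(test, retest))
    grand_mean = (sum(test) + sum(retest)) / (n * k)
    row_means = [(a + b) / k for a, b in rows]
    col_means = [sum(test) / n, sum(retest) / n]

    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for r in rows for x in r)
    ss_error = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_error / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Toy data only: identical retest scores, and retest scores shifted by one point.
first_interview = [10, 12, 14, 16]
icc_identical = icc_2_1(first_interview, [10, 12, 14, 16])
icc_shifted = icc_2_1(first_interview, [11, 13, 15, 17])
```

Under the absolute-agreement form, a systematic shift between the two interviews lowers the coefficient even though the ranking of respondents is unchanged, which is why `icc_shifted` falls below `icc_identical`. A consistency-type ICC would not penalize such a shift, so the choice of form affects how a threshold such as 0.70 should be interpreted.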
Results
Sample characteristics
Randomly selected patients who underwent a thyroidectomy or parathyroidectomy between January 2000 and March 2010 in the Department of Surgery at “Hygeia” Hospital, Athens, Greece, were included in this study. From this population we generated a random sample of 900 patients, stratified by gender, age (in 10-year groups), and type of surgery. Two physicians blinded to patients' surgical history and initial diagnosis conducted telephone interviews with a 77% response rate, resulting in a sample of 696 patients.
Of the 696 patients participating in the study, 146 were male and 550 female. The mean age of the subjects at the date of the interview was 51.6 years (range 8–94 years). Twenty-six (3.7%) of the patients had scars <6 months old and 90 (12.9%) had scars less than one year old (Fig. 1a). Of the patients, 517 (74.3%) had been treated with total thyroidectomy, 21 (3%) with partial thyroidectomy, 95 (13.6%) with parathyroidectomy, 5 (0.7%) with robot-assisted techniques, 27 (3.9%) with total thyroidectomy and modified lymph node radical dissection, and 31 (4.5%) with synchronous thyroidectomy and parathyroidectomy. The descriptive statistics of the summary scores of the Appearance and Consciousness subscales of PSAQ are presented in Table 2. The distribution of the summary scores of both subscales is depicted by frequency histograms and boxplots in Figure 2.
Figure 1. Bar charts of the age of the scars at the time of the interview.
CI, confidence interval; SEM, standard error of the mean.
Acceptability analysis
There were no missing data in either subscale. Skewness values of the Appearance and Consciousness subscales were 1.46 and 2.14, respectively (Table 2).
Reliability analysis
Internal consistency
Cronbach's alpha was acceptable for the Appearance (α=0.79) and Consciousness (α=0.85) subscales (Table 3). The individual subscale items' correlations with the subscale score to which they belonged were greater than their correlation with the summary score of the other subscale.
“Levels observed” refer to the range of a subscale score in our sample.
Test-retest reliability
The stability of PSAQ was assessed by re-interviewing 10% of the respondents (70 patients). Four (5.7%) of these patients had scars <6 months old and 10 (14.3%) had scars less than one year old (Fig. 1b). The intraclass correlation coefficients of the Appearance and Consciousness subscales were 0.79 and 0.81, respectively, demonstrating high test-retest reliability of these subscales following thyroid or parathyroid surgery (Table 4).
Internal validity analysis (within-scale analysis)
Internal consistency and item correlations are also indicators of internal validity of the subscales (see above). Individual subscale items' correlations with the Appearance (0.31 to 0.78) and Consciousness (0.23 to 0.81) subscales scores were acceptable.
Internal validity was supported by evaluating correlations between the global assessment item of each subscale and both summary subscale scores (Appearance: 0.42 to 0.72, Consciousness: 0.66 to 0.67).
Subgroup analysis
Acceptability, reliability, and internal validity analyses for the Appearance and Consciousness subscales were also performed in subgroups defined by the surgical approach. Three hundred and eight patients (44.3%) had been treated with a minimally invasive nonendoscopic approach from the neck, 383 (55%) with the conventional open technique, and 5 (0.7%) with a transaxillary robotic technique. The Appearance and Consciousness subscales were found to be both reliable and valid for the assessment of scars following thyroid and parathyroid surgery, independent of the minimally invasive approach being used (data not shown).
Discussion
The aim of this study was to examine the validity and reliability of two subscales of PSAQ, a scar-specific patient-reported outcome measure, following thyroid or parathyroid surgery. The results of this study demonstrate that the Appearance and Consciousness subscales of PSAQ are reliable in patients who have undergone a thyroidectomy or parathyroidectomy in terms of internal consistency and test-retest reliability. In other words, the items comprising the Appearance and Consciousness subscales measure the same construct and Appearance and Consciousness domains of PSAQ proved stable over time in this study's patient group.
The individual subscale items' correlation with the subscale score to which they belonged was greater than their correlation with the summary score of the other subscale. This provides further evidence of internal validity of the Appearance and Consciousness subscales of PSAQ following thyroid and parathyroid surgery. Finally, even though our study had a longer test-retest interval than the study by Durani et al. (18) (2 months vs. 14 days), test-retest reliability analysis was acceptable for both examined subscales.
This is the first study examining the reliability and validity of a postsurgical scar-specific patient-reported outcome measure in the setting of thyroid and parathyroid surgery. The initial validation and reliability analysis of PSAQ by Durani et al. (18) was performed in 667 patients from various surgical settings. In our study, consisting only of patients who had undergone a thyroidectomy and/or a parathyroidectomy, the PSAQ was applied to a large patient sample (n=696) to draw more definitive conclusions for this specific subgroup of patients.
The remodeling phase of skin wound healing can last up to one year (24). Although the stability of PSAQ was assessed over a 2-month period, the effect of this interval on the test-retest reliability analysis depends on the age of the scars. Ten (14.3%) of the subjects used for the stability analysis of PSAQ had scars whose remodeling phase was still active at the time of the interview. This could have altered the subjects' responses between interviews and led to an overestimate or underestimate of the stability and reliability of the PSAQ.
Although our results regarding reliability and validity are generally in agreement with the study by Durani et al. (18), it is important to discuss possible limitations of our study. First, even though clinicians may use the subscales of PSAQ in isolation without affecting reliability or validity (18), our decision to examine only the first two domains of PSAQ meant that we could not calculate the summary score of the whole PSAQ, which consists of five subscales. Second, our study did not examine external validity, known differences between patient groups, or sensitivity over time of the Appearance and Consciousness subscales. Third, test-retest reliability analysis was performed in only 10% (70 patients) of our sample. However, this is frequently done in studies examining the stability of a measuring instrument over time, mainly for feasibility reasons (25,26). Fourth, psychometric assessment of the Symptoms subscale was abandoned because 4 of the 6 items in this subscale have double-stemmed response options. This subscale needs further modification before being applied in surgical groups with linear scars where scar symptoms are rare, as already noted by the creators of the PSAQ (18). For these reasons, we were unable to apply a scoring system to this subscale that would allow any meaningful psychometric assessment. Finally, PSAQ was designed for self-completion by the patient to minimize administrator bias and not for completion via a telephone interview (18,27).
Conclusion
The Appearance and Consciousness subscales of the PSAQ are both reliable and valid for the assessment of a linear scar following thyroid or parathyroid surgery, independent of the minimally invasive approach being used.
Footnotes
Disclosure Statement
The authors declare that no competing financial interests exist.
