Abstract
Background:
The prevention and relief of suffering in palliative care are critical to the well-being and quality of life of patients and families facing life-threatening diseases. Many tools to assess different issues in health care are available, but few are specifically designed to evaluate suffering, which is essential for its prevention, early management, and treatment.
Objective:
The purpose of this review was to identify and describe existing instruments developed to assess suffering in palliative care, as well as to comment on their psychometric properties.
Methods:
A review of articles indexed in MEDLINE, PsycINFO, and SciELO up to June 2011 was conducted. All articles reporting the development, description, or psychometric properties of instruments that assess suffering were included. An assessment of their psychometric quality was performed following a structured checklist.
Results:
Ten instruments that assess suffering were identified. Their main features and psychometric properties are described in order to facilitate the selection of the appropriate one given each patient's context.
Conclusion:
By taking into consideration all features of the assessment instruments under review, the evaluation of suffering can be made easier. A wide and ever expanding range of approaches is now available, which facilitates the selection of the suffering-assessment instrument that is best suited to the needs of the specific patient. One of the challenges ahead will be to further analyze the psychometric properties of some existing instruments.
Introduction
At the end of life, suffering is widespread, 5 leading to significant deterioration in well-being and QOL. 6 Suffering is associated with physical, psychological, social, and spiritual factors that interact with each other,7–11 comprising a unique, complex, and multidimensional experience that needs to be approached in an integrated and individualized fashion.1,2,11–13 According to Cassell, 12 suffering is “a specific state of severe distress associated with events that threaten the intactness of person.” More recently, it has been defined as “a multi-dimensional and dynamic experience of severe stress that occurs when there is a significant threat to the whole person and regulatory processes are insufficient, leading to exhaustion.” 14
A growing interest in the concept of suffering has taken place, in part thanks to the biopsychosocial–spiritual model, 15 which supports the integration of psychosocial and spiritual aspects into medical practice. Moreover, given that the relief of suffering is central to palliative care, its detection and assessment take the highest priority.16,17
The detection of suffering presents a challenge, especially at the end of life when pivotal treatment decisions are required.16–20 Suffering may be related to the severity of disease, and severity of disease can be objectively measured. However, suffering is assessed subjectively.12,14,21 This subjectivity, when juxtaposed with the objectivity of disease severity itself, makes the choice of intervention strategy especially challenging. Suffering assessment must also factor in the progressive deterioration in the patient's willingness to address complex or excruciating issues, and in his or her energy and personal resources. 13
As Bayés22–24 points out, a suffering assessment instrument should be: able to measure subjective elements over a concrete period of time; understandable and replicable; simple and fast to administer; noninvasive to the patient; and unlikely to suggest new problems. Many available questionnaires may facilitate symptom assessment.25–28 However, few instruments have been developed specifically to assess suffering; those that exist are not used regularly, and none of them constitutes a gold standard. To date, the clinical interview is still considered the best way to assess suffering. 29
The present review aims to (1) identify the suffering assessment instruments available in the palliative care context, (2) describe their main features, and (3) comment on their psychometric properties, in order to facilitate selection according to the specific clinical or research context.
Methods
Two independent researchers conducted an extensive search attempting to identify suffering assessment instruments in palliative care. Published articles in English and Spanish were reviewed up to June 2011 in the MEDLINE, PsycINFO, and SciELO databases, using as search terms: “Suffering” (or “Distress”), “Assessment” (or “Instrument” or “Questionnaire” or “Measurement”) and “Palliative Care” (or “End-of-life” or “Terminally ill”).30–33 Furthermore, references in the retrieved articles were screened for relevant articles. Initial inclusion of articles was based on the title and abstract.
Articles that (1) specifically assessed the experience of suffering or associated factors in palliative care using a quantitative measure, (2) reported the development of instruments designed to assess suffering, and (3) attempted to validate or examine the psychometric characteristics of current suffering assessment instruments were included. When an instrument was identified, a new search using its specific keywords was carried out, aiming at finding more articles reporting on its psychometric properties.
Since “distress” is sometimes used when referring to the suffering experience, 13 we decided to include it to increase search sensitivity. However, articles were only included when the experience of suffering was specifically assessed, acknowledging theoretical differences between the concepts. 34 Also, while recognizing the relevance and frequent use in palliative care of instruments such as the Edmonton Symptom Assessment System (ESAS) 28 and the Distress Thermometer 34 or others designed to assess QOL, we decided not to include them given that they are intended to assess different constructs. A checklist for assessing the psychometric quality and ease of use of the instruments was used (Appendix A). This checklist was developed by Bot et al. 35 and based on Lohr et al. 36 and Bombardier and Tugwell. 37 It has already been used in systematic reviews.38–40 The checklist was adapted for the present study to cover not only self-assessment instruments but also interview-based assessment instruments.
The checklist was used to assess the following properties of each instrument:
1. Validity: the degree to which an instrument measures the construct it is intended to measure, including construct validity, content validity, and internal consistency.
2. Reproducibility: the extent to which an instrument yields stable scores over time among respondents who are assumed to remain unchanged on the domains being assessed; reproducibility was assessed by rating reliability and agreement.
3. Responsiveness: the ability of an instrument to detect real or important change over time in the concept being measured, based on the confirmation of predefined hypotheses.
4. Interpretability: the degree to which scores can be interpreted and a qualitative meaning can be assigned to the quantitative scores.
5. Minimum Clinically Important Difference (MCID): the smallest difference in score in the domain of interest that patients perceive as beneficial and that would mandate, in the absence of troublesome side effects and excessive cost, a change in the patient's management. 41
All qualities assessed were rated as positive [+], inadequate [ø], or negative [−]. In case information on an element was insufficient or unavailable, no rating was given [ ]. All positive ratings for each instrument were summed to obtain an overall score for each instrument (0=no positive rating, 12=positive rating for all properties). Table 1 shows ratings for each quality and overall score and Appendix A details rating criteria.
[+]=Good; [ø]=fair; [−]=poor; [ ]=information not available; [NA]=not applicable.
MCID, Minimum Clinically Important Difference; IAS, Initial Assessment of Suffering; PRISM, Pictorial Representation of Illness and Self Measure; MSSE, Mini-Suffering State Examination; SAT, Suffering Assessment Tool; SISC, Structured Interview for Symptoms and Concerns in PC.
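The overall score described above is a simple count of positive ratings. The following minimal sketch illustrates that count; the function name `overall_score`, the property keys, and the example ratings are hypothetical and are not taken from Table 1.

```python
# Symbol used for a positive rating in the legend above.
POSITIVE = "+"


def overall_score(ratings):
    """Return the number of properties rated positive for one instrument.

    `ratings` maps a property name to its rating symbol:
    "+" (positive), "ø" (inadequate), "-" (negative),
    "" (no information), or "NA" (not applicable).
    """
    return sum(1 for symbol in ratings.values() if symbol == POSITIVE)


# Invented ratings for an imaginary instrument (illustration only).
example_ratings = {
    "time_to_administer": "+",
    "ease_of_scoring": "ø",
    "content_validity": "+",
    "internal_consistency": "NA",
    "construct_validity": "-",
    "mcid": "",
}

print(overall_score(example_ratings))  # 2
```

Only positive ratings contribute, so an instrument with missing information and an instrument with negative ratings can receive the same overall score; the per-property ratings remain necessary for interpretation.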
Results
Study selection
Initially, 18,113 results were obtained (Fig. 1). Given the breadth of the search terms and the frequency with which palliative care studies refer to suffering, we expected a large number of initial results, even when recommended filters were used. 42 We found that many articles use the expression “suffering from” when only referring to a patient's diagnosis. After implementing a search filter eliminating that particular expression, 2240 results were obtained. Then, abstracts that did not fit the inclusion criteria were eliminated. Two hundred three suffering-related abstracts were identified. Afterwards, qualitative and case studies, position papers, and articles in which suffering assessment methods were not specified were excluded, leaving a total of 51 articles (Table 1). The material was then classified according to the assessment instrument mentioned. In the final 22 articles that described the development or assessment of psychometric properties, 10 assessment instruments were identified.
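The effect of excluding the expression “suffering from” can be illustrated with a small text filter. This is only a sketch under stated assumptions: the actual filter was applied within the databases and its exact syntax is not reported here, and the function name `mentions_suffering_as_construct` and the sample sentences are hypothetical.

```python
import re

# Matches the diagnostic idiom "suffering from" (any spacing, any case).
DIAGNOSTIC_PHRASE = re.compile(r"\bsuffering\s+from\b", re.IGNORECASE)
# Matches the bare term "suffering".
TERM = re.compile(r"\bsuffering\b", re.IGNORECASE)


def mentions_suffering_as_construct(text):
    """Return True if 'suffering' occurs outside the phrase 'suffering from'.

    Assumption: a record whose only use of the term is 'suffering from'
    refers to a diagnosis and is screened out.
    """
    without_phrase = DIAGNOSTIC_PHRASE.sub(" ", text)
    return bool(TERM.search(without_phrase))


# Invented sentences, not records from the actual search.
records = [
    "Patients suffering from advanced cancer were recruited.",
    "We assessed suffering in patients suffering from advanced cancer.",
]
kept = [r for r in records if mentions_suffering_as_construct(r)]
print(len(kept))  # 1
```

A filter of this kind trades some sensitivity for specificity, which is why the remaining abstracts were still screened manually against the inclusion criteria.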

Fig. 1. Flow chart of the selection process for suffering assessment instruments.
Description of instruments
A summary of the instruments' characteristics is given in Table 2, while Table 3 describes the studies reporting their development or psychometric properties. The first instrument developed, the Initial Assessment of Suffering (IAS), was published in 1987 43 after Cassell's 88 definition of suffering in health care, and all reported instruments follow this definition, although some seem to be more coherent with it than others.
IAS, Initial Assessment of Suffering; AIDS, acquired immune deficiency syndrome; PRISM, Pictorial Representation of Illness and Self Measure; SF-36, SF-36® Health Survey; HADS, Hospital Anxiety and Depression Scale; SOC, Sense of Coherence Scale; SWLS, Satisfaction With Life Scale; MHI, Mental Health Inventory; DSM-IV, Diagnostic and Statistical Manual of Mental Disorders 4th edition; HMQ, Health Monitor Questionnaire; PDIS, Perceived Disease Impact Scale; DLQI, Dermatology Life Quality Index; QoL-CS, Quality of Life-Cancer Survivors questionnaire; CAD-EOLD, Scales for the Evaluation of End-of-Life Care in Advanced Dementia; VAS, visual analogue scales; EORTC QLQ-C30, European Organization for Research and Treatment of Cancer Quality of Life Questionnaire; BP, Brief Pain Inventory; SSSS, Somatic Symptom Severity Scale; CSQ, Coping Strategies Scale; QOL-AD, Quality of Life Alzheimer Disease; DEMQOL, Dementia Quality of Life; SF-12, SF-12® Health Survey; MSSE, Mini-Suffering State Examination; CES-D, Center for Epidemiologic Studies-Depression Scale; SIP, Sickness Impact Profile; BHS, Beck Hopelessness Scale; DFSSQ, Duke-UNC Functional Social Support Questionnaire; MSAS, Memorial Symptom Assessment Scale; MQOL, McGill Quality of Life Questionnaire.
Population and administration
Most instruments were developed to assess suffering in advanced or terminally ill patients (n=8)5,6,43,44,73,79,80,82,85 and are used within a structured or semistructured interview (n=7).5–7,43,44,50,79,80,85 Only the Suffering Scales by Schulz et al. 82 were developed for self-administered use, while the Pictorial Representation of Illness and Self Measure (PRISM) was adapted for self-administered use.58,64 No information on administration was provided for the Suffering Scale. 81
Structure
Regarding the dimensions considered, four instruments explicitly indicate a multidimensional assessment,5,80–82 three instruments assess total suffering6,7,44,50,85–87 and one instrument considers only physical aspects. 73 Two instruments do not explain the number of dimensions under consideration.43,79 Also, almost all instruments examine the patient's subjective perception. The Suffering Scales also include the caregiver's perception of the patient's experience of suffering, 82 and only the Mini-Suffering State Examination (MSSE) is exclusively administered by health care professionals and caregivers, focusing on their judgments about the patient's condition. 73 The instruments vary from 1 item to as many as 69 items long. Three instruments have fewer than 10 items6,7,44,50,85–87; four instruments have between 10 and 20 items5,43,73,81; and three have 20 items or more.79,80,82 Most instruments are available in English,5,43,50,73,79–82 while the PRISM and the Single-item Numeric Rating may be used in any language as they are mostly nonverbal.6,7,50,85–87 The PRISM is the only instrument using a graphic strategy instead of verbal sentences for assessment.
Scoring
Almost all instruments offer a discrete numeric score using either Likert or visual analogue scales (VAS; rating from 0 to 10), and the PRISM provides a continuous numeric score. 50 Only the Perception of Time, by Bayés et al., 44 uses an ordinal scale system.
Psychometric properties of the instruments
The rating of each psychometric property examined in the validation studies conducted for each instrument, summarized as good, inadequate, poor, or unavailable (Appendix A), is presented in Table 1. Also, an overall rating of the psychometric quality is provided. Only two studies included in this review have tested all properties included in the checklist (except for MCID) when possible,50,79 and only the PRISM obtained the maximum possible score, taking into account that internal consistency was not possible to assess due to the intrinsic design of the instrument. 50 It was not possible to examine fully the psychometric properties of the IAS 43 and the Suffering Scale, 81 given that the information was extracted from abstracts as full texts were not available, so it would be inappropriate to derive conclusions regarding their psychometric quality.
Time to administer
Most instruments rated positively in this aspect. Only the Structured Interview for Symptoms and Concerns in PC (SISC) and the State of Suffering (SOS-V) were rated negatively as the interview took more than 45 minutes,79,80 a period of time the researchers considered too long, given the participants' condition. For four of the instruments this information was not provided.5,43,81,82
Ease of scoring
Most instruments were also rated positively in this aspect, except for the Suffering Scales 82 and the SOS-V, 80 which use different rating systems within the instrument and need a simple formula to obtain a total score. Again, information about this aspect was not available for the IAS and the Suffering Scale.
Readability and comprehension
Comprehension was tested and showed good results for PRISM, SOS-V, and SISC. The Suffering Scales, 82 the only self-administered instrument, as well as other instruments5,43–45,73,77,81,82,85 did not provide information about this aspect.
Content validity
The PRISM, Suffering Assessment Tool (SAT), SISC, and SOS-V were scored positively for content validity. The Perception of Time was scored as moderate since only the patient was involved in item selection. The MSSE and the Suffering Scales reported only expert involvement for item selection and thus, were rated negatively. No information on content validity was available for the other instruments.
Internal consistency
This aspect was examined by the analysis of factor structure and/or Cronbach α. The Suffering Scale and the Suffering Scales examined both criteria and were positively rated. For the MSSE and SOS-V, only Cronbach α was obtained, the latter reporting an α<0.70 in one of the subscales; both were therefore rated as moderate. The assessment of this aspect was not applicable for 3 of the 10 instruments due to the use of a single item (Perception of Time, PRISM, Single-item Numeric Rating), and information was not available for 3 instruments (SISC, IAS, and SAT).
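For readers unfamiliar with the α>0.70 criterion, the sketch below computes Cronbach α with the standard formula α = k/(k−1) × (1 − Σ item variances / variance of total scores), where k is the number of items. It is an illustration only: the function name `cronbach_alpha` and the scores are invented, and it does not reproduce the analyses of any reviewed study.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Compute Cronbach's alpha.

    `responses` is a list of respondents, each a list of item scores
    (same number of items for every respondent, at least two items).
    Sample variances (n - 1 denominator) are used throughout.
    """
    k = len(responses[0])
    items = list(zip(*responses))  # one tuple of scores per item
    sum_item_variances = sum(variance(item) for item in items)
    total_variance = variance([sum(person) for person in responses])
    return k / (k - 1) * (1 - sum_item_variances / total_variance)


# Invented scores for five respondents on a three-item subscale.
sample = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 5, 4],
    [3, 3, 4],
]

alpha = cronbach_alpha(sample)
print(alpha > 0.70)  # True
```

The formula requires at least two items, which is why internal consistency is not applicable to single-item instruments.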
Construct validity
This aspect was adequately examined and was rated positively in the PRISM (in all its validation studies included in the review), SISC, SOS-V and Suffering Scales. Construct validity was rated as moderate in the Perception of Time due to the use of inadequate measures (measures not validated for testing hypothesis) and the MSSE (authors used laboratory tests that are not directly related to suffering, but to illness severity or physical deterioration). It was also rated as moderate in the SAT, the Suffering Scale and the Single-Item Numeric Rating because hypotheses were not explicitly formulated in the studies reviewed. Information on construct validity was not available for the IAS. Floor and ceiling effects were only tested for four instruments: the PRISM,50,52,53,58,59 the MSSE, 77 the SISC, 79 and the SOS-V 80 and none were found.
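The floor and ceiling criterion in Appendix A (no more than 15% of respondents at the lowest or highest possible score) can be checked as in the following sketch. The function name `floor_ceiling`, its parameters, and the scores are hypothetical and serve only to illustrate the rule.

```python
def floor_ceiling(scores, lowest, highest, limit=0.15):
    """Report the share of respondents at each extreme of the scale.

    An effect is flagged when more than `limit` (15% by default) of
    respondents obtain the lowest or the highest possible score.
    """
    n = len(scores)
    floor_share = sum(1 for s in scores if s == lowest) / n
    ceiling_share = sum(1 for s in scores if s == highest) / n
    return {
        "floor": floor_share,
        "ceiling": ceiling_share,
        "effect": floor_share > limit or ceiling_share > limit,
    }


# Invented scores on a 0-10 scale (illustration only).
scores = [0, 3, 4, 5, 5, 6, 7, 8, 9, 10]
result = floor_ceiling(scores, lowest=0, highest=10)
print(result["effect"])  # False
```

Here one respondent in ten sits at each extreme (10%), which is within the 15% limit, so no floor or ceiling effect is flagged.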
Reproducibility
This aspect was assessed by rating reliability and agreement. Test–retest reliability was examined for only three instruments: PRISM,48,56 SISC, 74 and the Suffering Scales. 77 All studies used adequate sample sizes (>50) and intraclass correlation coefficients (ICC>0.70). Agreement was examined in PRISM,53,62 MSSE, 73 and SISC 79 and was rated positively. No information about agreement was found for the other instruments.
Responsiveness
The examination of responsiveness was reported in only three of the instruments: Perception of Time, 45 PRISM,63,61 and SISC. 79 A positive rating was given to PRISM and SISC, but Perception of Time was rated as inadequate because an inappropriate measure was used and hypotheses relating to the magnitude of change and/or the relationship to change scores were not presented.
Interpretability and MCID
Seven instruments were positively rated on interpretability: PRISM,50–72 MSSE, 77 SAT, 5 SISC, 79 SOS-V, 80 the Suffering Scales, 82 and the Single-item Numeric Rating. 85 The Perception of Time was rated as inadequate and no information was available for the IAS and the Suffering Scale. None of the studies reported calculations of MCID.
Overall score for psychometric quality and ease of use
The instruments receiving the highest scores were the PRISM and the SISC, both with 11 positive ratings. For the SISC only 1 study reported psychometric properties, while for PRISM, 12 studies were included in the review with scores ranging from 6 to 11. The SOS-V had 6 positive ratings in only one reported study; MSSE had 4 positive ratings in 1 of 2 validation studies available and 5 in the other. The Suffering Scales had 4 positive ratings from a single study, and the same occurred for the Single-item Numeric Rating. The SAT had 3 positive ratings from the 1 study available. The Suffering Scale and the Perception of Time each had 1 study with 2 positive ratings. No positive rating was assigned to the IAS given that the information gathered was insufficient to examine its psychometric quality.
Discussion
Since the redefinition of palliative care, 2 the relief of suffering has come to the fore. With new, systematic studies of this topic, the field is shifting from “end-of-life care” to “palliation of suffering.”9,89 There is a growing body of research, and new assessment instruments have been developed throughout the past decade for screening purposes (a brief, rapid and prospective process using standardized tools in order to determine if a referral for further evaluation is necessary 34 ).
According to the results of this review, 10 instruments to assess suffering are now available for both clinical and research purposes.
PRISM and SISC have the strongest psychometric properties among all instruments included in the review. PRISM, a validated tool, enjoys the greatest acceptance and is regularly used in multiple populations,50–72 aside from being the most conceptually coherent instrument. 50 It allows a nondirective approach, provides a quantitative measure and can be used in patients with verbal or written communication difficulties. 51 However, with PRISM the assessment of factors related to suffering relies mostly on the clinician's interest, and the instrument has not been validated at the end of life or in traditional palliative care contexts.
The Perception of Time and PRISM use nontraditional approaches to assess suffering. The former uses the perception of time as an indirect measure and is better suited to initiating an informal dialogue, while the latter uses a graphic, indirect approach, allowing assessment of total suffering and quantitative measurement.
The more recently developed scales, SISC, SOS-V, and the Suffering Scales,79,80,82 are multidimensional measures that address the various physical, psychological, social, and spiritual spheres, allowing for quantification of total suffering in all its dimensions.
In the case of the SISC, 79 the assessment is conducted through a semistructured interview, which facilitates contact with the patient and delves more deeply into other relevant aspects. Wilson et al.10,79 used a mixed qualitative and quantitative methodology, yielding richer results.
The length of the instruments under review varies substantially. The SOS-V, SISC, and the Suffering Scales have the most items, while SISC and SOS-V take the longest time to administer. In the case of SOS-V, the length of administration may be inappropriate in the context of unbearable suffering and the patient's wish to hasten death. 80 The MSSE,73 the Suffering Scale,81 and SAT 5 all have an average length, more comfortable for gravely ill patients. The PRISM, the Single-item Numeric Rating, and the Perception of Time are the shortest. Single-item numeric ratings are practical and used informally in clinical practice. But although they are considered a sensitive metric for subjective constructs, 90 they are not precise enough to measure psychometric properties.
PRISM 50 and MSSE 73 are the only instruments that allow the assessment of patients with communication difficulties. The MSSE gives an objective measure of a decidedly private and subjective experience, one that does not fit neatly into acknowledged theoretical approaches.1,11,91
Aside from the PRISM and SISC, the biggest challenge for the assessment of suffering lies in the insufficient data on psychometric properties for most instruments. Accordingly, validation studies of existing instruments will be needed, as well as studies that reflect their coherence with existing theoretical models.
Conclusion
Taking into consideration all features of the assessment instruments under review, the evaluation of suffering can be made easier. A wide and ever expanding range of approaches is now available, which facilitates the selection of the instrument that is best suited to the needs of the specific patient. 92 This broad menu of assessments makes it easier to evaluate patients, monitor progress and guide treatment decisions for patients who often face a dismal prognosis. Validated scales not only facilitate the timely screening of suffering in patients and families, but can be used as data points for clinical research in palliative care.
Footnotes
Acknowledgments
Alicia Krikorian thanks Universidad Pontificia Bolivariana for its financial support during the preparation of this manuscript. The authors also thank the anonymous reviewers for their invaluable comments and constructive ideas.
Author Disclosure Statement
No competing financial interests exist.
| Psychometric quality | Definition | Criteria used to rate the psychometric quality |
|---|---|---|
| Time to administer | Time needed to complete the questionnaire | Rating: [+] less than 10 minutes if self-administered or less than 45 minutes if interview was usedᵃ [-] more than 10 minutes for self-administered or more than 45 minutes for interviewᵃ [ ] no information found on time to administer |
| Ease of scoring | Ease of method used to calculate the questionnaire score | Rating: [+] easy: summing up of the items [Ø] moderate: visual analogue scale (VAS) or simple formula [-] difficult: VAS in combination with formula or complex formula [ ] no information found on calculation of score |
| Readability and comprehension | The questionnaire is understandable for all patients | Rating: [+] readability and/or comprehension tested; result was good [-] inadequate readability and/or comprehension [ ] no information found on readability and comprehension |
| Content validity | The extent to which the domain of interest is comprehensively sampled by the items in the questionnaire. | 1) Patients were involved during item selection and/or item reduction. 2) Patients were consulted for reading and comprehension Rating: [+] patients and investigator/expert involved [Ø] patients only [-] no patient involvement [ ] no information found on content validity |
| Internal consistency | The extent to which items in a (sub)scale are intercorrelated; a measure of the homogeneity of a (sub)scale | 1) Factor analysis was applied in order to provide empirical support for the dimensionality of the questionnaire. 2) Cronbach α>0.70 for each dimension/subscale. Rating: [+] adequate design and method; factor analysis supporting the dimension; α>0.70 [Ø] method adequacy indeterminate or no factor analysis used [-] inadequate internal consistency (α<0.70) or dimensions not supported by factor analysis [ ] no information found on internal consistency |
| Construct validity | The extent to which scores on the questionnaire relate to other measures in a manner that is consistent with theoretically derived hypotheses concerning the domains that are measured | 1) Hypotheses were formulated. 2) Results were acceptable in accordance with ≥75% of hypotheses. 3) An adequate measure was used. Rating: [+] adequate design, method, and result [Ø] method adequacy indeterminate [-] adequate design and method and inadequate construct validity [ ] no information found on construct validity |
| Floor and ceiling effects | The questionnaire fails to demonstrate a worse score in patients who clinically deteriorated and an improved score in patients who clinically improved. | 1) Descriptive statistics for the distribution of scores were presented. 2) ≤15% of respondents achieved the highest or lowest possible score. Rating: [+] no floor/ceiling effects [-] >15% at the extremes [ ] no information found on floor and ceiling effects |
| Test–retest reliability | The extent to which the same results are obtained on repeated administrations of the same questionnaire when no change in physical functioning has occurred | 1) Calculation of an intraclass correlation coefficient (ICC); ICC>0.70. 2) Time interval and confidence intervals (or n>50) were presented. Rating: [+] adequate design, method, and ICC>0.70 [Ø] doubtful method [-] inadequate reliability, with adequate design and method [ ] no information found on test–retest reliability |
| Agreement | The ability to produce exactly the same scores with repeated measurements | 1) For evaluative questionnaires, reliability agreement should be assessed. 2) Limits of agreement, κ, or standard error of measurement were presented. Rating: [+] adequate design, method, and result [Ø] method adequacy indeterminate [-] inadequate agreement, with adequate design and method [ ] no information found on agreement |
| Responsiveness | The ability to detect change over time in the concept being measured | 1) For evaluative questionnaires, responsiveness should be assessed. 2) Hypotheses were formulated and results were in agreement with ≥75% of hypotheses. 3) An adequate measure was used (effect size, standardized response mean, comparison with external standard). Rating: [+] adequate design, method, and result [Ø] method adequacy indeterminate [-] inadequate responsiveness with adequate design, method [ ] no information found on responsiveness |
| Interpretability | The degree to which one can assign qualitative meaning to quantitative scores | Authors provided information on the interpretation of scores: 1. Presentation of means and standard deviation (SD) of scores before and after treatment 2. Comparative data on the distribution of scores in relevant subgroups 3. Information on the relationship of scores to well-known functional measures or clinical diagnoses 4. Information on the association between changes in score and patients' global ratings of the magnitude of change they experienced Rating: [+] ≥2 of above types of information was presented [Ø] method adequacy indeterminate or unclear description; 1 type of information was presented [ ] no information found on interpretability |
| Minimum Clinically Important Difference (MCID) | The smallest difference in score in the domain of interest that patients perceive as beneficial and that would mandate a change in a patient's treatment | Information is provided about what (difference in) score would be clinically meaningful. Rating: [+] MCID is presented [ ] no information found on MCID |
ᵃ Adapted for interview-based assessment.
