Abstract
The gold standard for measuring anhedonia is the Snaith-Hamilton Pleasure Scale (SHAPS). To date, there are no validated electronic versions of this questionnaire. We aim to study the equivalence between the traditional paper-and-pencil format and a digital version of the SHAPS. A group of 67 patients completed both SHAPS formats, and differences between formats were assessed. McNemar’s test showed no significant differences between the two systems. The Kappa coefficient was over 40% for most items, and reliability was above 0.8, showing good to excellent levels of internal consistency. Thus, we have demonstrated a close equivalence between paper-and-pencil and electronic SHAPS.
Introduction
Anhedonia, described as the inability to derive pleasure from sensory experiences or social interactions (Chapman et al., 1976), is considered a core symptom of major depressive disorder (MDD) throughout DSM classifications. According to The Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5), there are two definitions of anhedonia: loss of interest in daily activities, and loss of pleasure from daily activities; while both aspects could be correlated, they are not equivalent (American Psychiatric Association, 2013). Additionally, anhedonia plays a role in many other clinical syndromes, such as schizophrenia, anxiety, substance use, and eating disorders, as revealed by epidemiological and psychopharmacological evidence but also by neuroimaging data (Ritsner, 2014).
To date, the gold-standard questionnaire for the measurement of anhedonia is the Snaith-Hamilton Pleasure Scale (SHAPS) (Nakonezny et al., 2015; Snaith et al., 1995). In both the original version and its Spanish translation and validation (Fresán and Berlanga, 2013), SHAPS is a 14-item self-reported inventory that assesses four domains of hedonic experience: interest/hobbies, social interaction, sensory experience, and satisfaction with food and drink. For each item, patients must indicate their level of agreement with the corresponding statement by means of four possible answers: Strongly disagree (4 points), Disagree (3 points), Agree (2 points), and Strongly agree (1 point). Higher scores show an inability to have hedonic experiences, representing greater degrees of anhedonia (Leventhal et al., 2006).
In a world where technology is continuously evolving, telemedicine, otherwise known as e-health, is an emerging field (WHO, 2020), and the same change can be seen in the field of mental health (Firth et al., 2016; Marzano et al., 2015). Of particular interest within e-mental health is the use of electronic psychometric instruments, not only because younger generations, raised in a more technologically advanced world, are more comfortable with electronic formats (Torous et al., 2014) but also because in certain domains such as depression, electronic assessment tools have demonstrated more sensitivity than traditional formats (Torous et al., 2015). Indeed, studies have shown that people are more authentic in online health questionnaires than in face-to-face interviews (Bennett and Glasgow, 2009), especially in sensitive areas such as substance use, traumatic events, or suicide (Barak, 2007; Christensen and Hickie, 2010; Lin et al., 2007).
Current electronic instruments are commonly paper-based questionnaires that have been adapted for use with an electronic device. With regard to the transfer of psychometric instruments from a paper-and-pencil format to a digital medium, some initiatives advocate mandatory validation to demonstrate equivalence between the two formats rather than assuming such an equivalence (Coons et al., 2009). In this regard, the procedure for creating an electronic instrument from an existing paper-and-pencil format resembles the validation and adaptation process for a psychometric instrument when it is used in populations with different characteristics from those for whom it was created (Wild et al., 2009).
Various studies have been published in favor of migrating psychometric instruments for use with digital devices, and electronic self-reported questionnaires have demonstrated satisfactory validity. Three reviews studied the reliability of pencil-and-paper and electronic versions of around 50 psychometric instruments, mainly for anxiety and depression, finding good reliability despite discrepancies in some questionnaires (Alfonsson et al., 2014; Gwaltney et al., 2008; van Ballegooijen et al., 2016).
Though the SHAPS scale has been validated for a variety of languages and different psychiatric and medical conditions, there has been no validation study of the electronic version. Furthermore, no evidence has been found of existing formats other than the pencil-and-paper version.
Our work aims to determine whether the SHAPS electronic version has the same psychometric properties as the traditional paper-and-pencil format. Additionally, we wanted to test the acceptability of SHAPS for electronic devices. We hypothesize that: (1) no differences will be found between the classic and the electronic SHAPS versions in terms of their psychometric factors and (2) the electronic format will have good acceptability.
Methods
Participants and setting
This study forms part of previous research conducted in the Psychiatry Department of Fundación Jiménez Díaz University Hospital in Madrid that aimed to validate the adaptation of the Dimensional Anhedonia Rating Scale (DARS) for the Spanish population (Arrua-Duarte et al., 2019). From July 2016 to February 2017, we recruited 130 patients over 18 years of age with several psychiatric diagnoses. Cases were included if they met the following criteria: age 18 years or older with a diagnosis of depressive, psychotic, adjustment, anxiety, personality, bipolar, or eating disorder according to the DSM-5 criteria (American Psychiatric Association, 2013). Exclusion criteria were active substance abuse, presence of a neurological disease or a decompensated medical condition, illiteracy, or lack of fluency in the Spanish language.
Procedure
Patients were requested to take part in the study during their regular psychiatric appointment or at the end of hospitalization. In the previous study, a total of 130 patients were recruited and completed the paper-and-pencil questionnaire (Arrua-Duarte et al., 2019). The questionnaire was completed anonymously after an appointment with the clinician, and patients were not compensated for participating. When participants returned the completed survey to the researcher, they were given an access code enabling them to fill out the electronic version via the MEmind tool (available on the Apple Store and Google Play) and were given instructions (in person and on a leaflet) on how to download and use the app (Barrigón et al., 2017a) in the second step of the study. The protocol was uploaded to MEmind and was active throughout the following week, allowing participants to complete the survey on any electronic device during this time frame.
Assessment
For the present study, participants completed the SHAPS (Snaith-Hamilton Pleasure Scale) (Fresán and Berlanga, 2013; Snaith et al., 1995) in two versions, that is, the traditional paper-and-pencil system and the electronic version via MEmind (Barrigón et al., 2017a) (Figure 1). Additionally, sociodemographic and clinical data were collected.

Figure 1. SHAPS in MEmind.
Snaith-Hamilton Pleasure Scale
The Snaith-Hamilton Pleasure Scale (SHAPS), including its translation into Spanish, is a 14-item self-administered instrument to measure anhedonic experiences related to interests/hobbies, social interaction, sensory experience, and satisfaction with food and drink. Each item has four different response options: Strongly agree, Agree, Disagree, and Strongly Disagree. “Agree” responses, whether “Strongly agree” or simply “Agree,” receive a score of 0, while “Disagree/Strongly disagree” responses receive a score of 1. Total scores range from 0 to 14, with higher scores indicating higher levels of anhedonia or lower hedonic experiences.
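The dichotomous scoring rule described above can be illustrated with a minimal sketch. The response labels are taken from the text; the function names and the example answers are hypothetical and serve only to show how a total score between 0 and 14 arises.

```python
# Minimal sketch of the dichotomous SHAPS scoring described in the text.
# Function names and the example answers are illustrative assumptions.
AGREE = {"Strongly agree", "Agree"}
DISAGREE = {"Disagree", "Strongly disagree"}
N_ITEMS = 14


def score_item(response: str) -> int:
    """Return 0 for either 'agree' option and 1 for either 'disagree' option."""
    if response in AGREE:
        return 0
    if response in DISAGREE:
        return 1
    raise ValueError(f"Unknown response option: {response!r}")


def shaps_total(responses: list[str]) -> int:
    """Sum the 14 item scores; higher totals indicate more anhedonia."""
    if len(responses) != N_ITEMS:
        raise ValueError(f"Expected {N_ITEMS} responses, got {len(responses)}")
    return sum(score_item(r) for r in responses)


# Hypothetical respondent: 10 'agree' answers and 4 'disagree' answers.
example = ["Agree"] * 10 + ["Disagree"] * 3 + ["Strongly disagree"]
print(shaps_total(example))  # 4
```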
MEmind wellness tracker
We used the MEmind Wellness Tracker tool to administer the scale digitally. MEmind is a collaborative, multiprotocol, and multilingual electronic tool developed by the Psychiatry Department of the Fundación Jiménez Díaz University Hospital. MEmind has two interfaces: one for health-care professionals and another designed for patients; it is accessible via the web or as an app and is compatible with all types of devices that have Internet access (i.e. computers, tablets, smartphones) independently of the operating system used on the device.
Statistical analysis
All statistical analyses were performed with the Statistical Package for the Social Sciences (SPSS), version 22.0, and Stata 15.1/IC. Sociodemographic and clinical characteristics were described as mean and standard deviation values (numerical variables) or percentages (categorical variables). First, differences between paper and electronic scales for each item were assessed using McNemar’s test for matched pairs (McNemar, 1947). Additionally, we evaluated the differences between the sums of the item scores for each scale using the sign test, which evaluates the equality of matched pairs of observations (Snedecor and Cochran, 1989). The Wilcoxon signed-rank test (Wilcoxon, 1945) is the non-parametric version of a paired-samples t-test, used when the difference between two variables is assumed to be ordinal, but not interval-scaled and not normally distributed. However, as the differences between paper- and electronic-scale scores were believed not to be ordinal, but merely classified as positive or negative, we used the sign test instead of the Wilcoxon signed-rank test. We applied a Bonferroni correction coefficient of 14 (the number of SHAPS items compared); accordingly, statistical significance was set at p ⩽ 0.0036 (0.05/14). Finally, scale reliability was assessed by determining the level of agreement between both formats and their internal consistency. In order to obtain agreement between paper-and-pencil and electronic formats, we used the test-retest Kappa coefficient and Landis and Koch’s criteria for interpretation; values less than 0 indicated poor agreement, 0%–20% slight agreement, 21%–40% fair agreement, 41%–60% moderate agreement, 61%–80% substantial agreement, and 81%–100% near-perfect agreement (Landis and Koch, 1977).
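As an illustration of the agreement analysis described above, the following minimal sketch computes Cohen's kappa for one dichotomously scored item answered in both formats and maps the result onto the Landis and Koch bands. The study itself used SPSS and Stata; the function names and the ten paired answers below are hypothetical.

```python
# Minimal sketch: Cohen's kappa for paired ratings of a single item,
# plus the Landis and Koch (1977) interpretation bands quoted in the text.
# Function names and example data are illustrative assumptions.

def cohen_kappa(paper: list[int], electronic: list[int]) -> float:
    """Kappa = (observed - expected agreement) / (1 - expected agreement)."""
    n = len(paper)
    if n == 0 or n != len(electronic):
        raise ValueError("Both formats need the same non-zero number of answers")
    observed = sum(p == e for p, e in zip(paper, electronic)) / n
    categories = set(paper) | set(electronic)
    expected = sum(
        (paper.count(c) / n) * (electronic.count(c) / n) for c in categories
    )
    if expected == 1.0:
        return float("nan")  # kappa is undefined when there is no variation
    return (observed - expected) / (1 - expected)


def landis_koch(kappa: float) -> str:
    """Label a kappa value (as a proportion, not a percentage)."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "near-perfect"


# Hypothetical item scores (0 = agree, 1 = disagree) for ten participants.
paper = [0, 0, 0, 0, 1, 1, 1, 0, 0, 1]
electronic = [0, 0, 0, 1, 1, 1, 0, 0, 0, 1]
kappa = cohen_kappa(paper, electronic)
print(round(kappa, 3), landis_koch(kappa))  # 0.583 moderate
```

In this example the observed agreement is 0.80 and the expected agreement is 0.52, so kappa is 0.28/0.48, which falls in the moderate band.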
Internal consistency was evaluated by Cronbach’s Alpha test, and values less than 0.5 were considered unacceptable, 0.5–0.6 poor, 0.6–0.7 questionable, 0.7–0.8 acceptable, 0.8–0.9 good, and values over 0.9 represented excellent internal consistency according to the criteria of George and Mallery (2016).
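Cronbach's alpha, as used above, can be sketched as follows. This is a minimal standard-library illustration rather than the SPSS or Stata routine used in the study; the function names and the small data set are hypothetical, and the handling of exact band boundaries (for example, whether 0.8 counts as acceptable or good) is an assumption, since the text gives overlapping ranges.

```python
# Minimal sketch of Cronbach's alpha and the George and Mallery (2016)
# bands quoted in the text. Names, data, and boundary handling are assumptions.
from statistics import variance


def cronbach_alpha(rows: list[list[int]]) -> float:
    """rows[i][j] is the score of respondent i on item j."""
    k = len(rows[0])
    if k < 2 or any(len(r) != k for r in rows):
        raise ValueError("Need at least two items and equal-length rows")
    item_variances = sum(variance(col) for col in zip(*rows))
    total_variance = variance([sum(r) for r in rows])
    if total_variance == 0:
        raise ValueError("Total scores have zero variance")
    return k / (k - 1) * (1 - item_variances / total_variance)


def george_mallery(alpha: float) -> str:
    """Label an alpha value; lower bounds are treated as inclusive."""
    if alpha < 0.5:
        return "unacceptable"
    if alpha < 0.6:
        return "poor"
    if alpha < 0.7:
        return "questionable"
    if alpha < 0.8:
        return "acceptable"
    if alpha < 0.9:
        return "good"
    return "excellent"


# Hypothetical data: four respondents, three dichotomous items.
rows = [[1, 1, 1], [0, 0, 0], [1, 1, 0], [0, 0, 1]]
alpha = cronbach_alpha(rows)
print(round(alpha, 2))  # 0.6

# Labels for the two alpha values reported in the Results section.
print(george_mallery(0.85), george_mallery(0.92))  # good excellent
```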
Ethical considerations and data protection
The study was carried out in accordance with the Declaration of Helsinki. It was previously approved by the local ethics committee. Patients were informed of the study goals and provided signed informed consent. Personal identifiers were removed from the datasets to safeguard subject confidentiality. Data protection was guaranteed according to the same Spanish legal standards applied in the previous study (Barrigón et al., 2017a).
Results
Of the total baseline sample of 130 subjects, 67 participants (51.5%) completed both the paper-and-pencil and the electronic SHAPS and thus completed both steps of the study. There were no differences in age or sex between those participants who completed both steps and those who failed to complete the electronic format (first step completed only), whereas those with depression tended not to fill out the electronic format (20.9% of patients with depression filled out the electronic version versus 39.7% who did not, p = 0.014). In both steps, participants were mostly women and had a mean age of 43.9 ± 11.8 years. The most common diagnoses were psychotic and depressive disorders (23.9% and 20.9%, respectively) followed by adjustment and personality disorders (16.4%) (Table 1).
Table 1. Sample description.
Comparison between SHAPS scores for paper-and-pencil and electronic formats
For the paper-and-pencil SHAPS, the median score was 1 (P25 = 0, P75 = 4), and for the electronic format it was 2 (P25 = 0, P75 = 5). When we compared both formats of SHAPS, we observed an overall lower number of positive (or “agree”) responses (18.02%) for the paper-and-pencil SHAPS compared to the electronic SHAPS (25.96%). Specifically, for each of the 14 individual items, the percentage of answers expressing agreement was lower in the paper-and-pencil format (Figure 2).

Figure 2. Agreement percentages per item.
When a McNemar’s test for matched pairs was performed for both formats, the differences found were non-significant (Table 2). Furthermore, in the sign test, we found non-significant differences between formats, with 17 negative signs, 30 positive signs, and 20 ties (p = 0.079).
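Both tests reported above reduce to a two-sided binomial test against a probability of 0.5: the sign test on the counts of positive and negative differences (ties are discarded), and the exact form of McNemar's test on the two discordant-pair counts. The sketch below is a standard-library illustration, not the Stata or SPSS routine used in the study. The sign counts (30 and 17) come from the text, whereas the McNemar counts are hypothetical and the use of the exact rather than the chi-squared form is an assumption.

```python
# Minimal sketch: exact sign test and exact McNemar test via the binomial
# distribution. Function names and the McNemar counts are assumptions.
from math import comb


def two_sided_binomial_p(k: int, n: int) -> float:
    """Exact two-sided p-value for k successes in n trials with p = 0.5."""
    if n == 0:
        return 1.0
    smaller = min(k, n - k)
    tail = sum(comb(n, i) for i in range(smaller + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def sign_test(n_positive: int, n_negative: int) -> float:
    """Sign test on paired differences; ties are excluded by the caller."""
    return two_sided_binomial_p(n_negative, n_positive + n_negative)


def mcnemar_exact(b: int, c: int) -> float:
    """Exact McNemar test; b and c are the two discordant-pair counts."""
    return two_sided_binomial_p(b, b + c)


# Bonferroni-corrected threshold for the 14 item-level comparisons.
BONFERRONI_ALPHA = 0.05 / 14

# Sign test with the counts reported in the text (20 ties are dropped).
p_sign = sign_test(n_positive=30, n_negative=17)
print(round(p_sign, 3))  # 0.079

# Hypothetical item: 3 participants changed one way, 8 the other way.
p_item = mcnemar_exact(3, 8)
print(round(p_item, 3), p_item <= BONFERRONI_ALPHA)  # 0.227 False
```

With the reported counts, the sketch returns the same p-value as the text (0.079) after rounding.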
Table 2. Differences between paper-and-pencil and electronic SHAPS.
Reliability of the SHAPS electronic version
The expected agreement, the observed agreement, and the kappa value for each item can be seen in Table 3. With the exception of “Enjoy being with family or friends,” for which a low kappa score (6.7%) was found, the level of agreement was above moderate for most items.
Table 3. Kappa coefficient and Cronbach’s α for the paper and electronic versions (agreement levels labeled as slight, fair, moderate, or substantial).
The reliability of Cronbach’s α index was 0.85 for the paper-and-pencil format, showing a good level of internal consistency, and 0.92 for the electronic SHAPS, indicating excellent internal consistency.
Discussion
Our results show no significant differences between scores on the classic paper-and-pencil SHAPS and the electronic version. The level of agreement was above moderate for all items except one, and the internal consistency of the electronic SHAPS was even higher than that of the paper-and-pencil format. Therefore, the two versions may be equivalent measurement methods for anhedonia. Although there are no existing electronic versions of SHAPS, this finding is consistent with previous research showing equivalence between paper-and-pencil and electronic formats of different depression-related questionnaires (Alfonsson et al., 2014; van Ballegooijen et al., 2016).
Remarkably, our electronic SHAPS showed excellent internal consistency (Cronbach’s α index = 0.92), even surpassing that of the paper-and-pencil format, which was good (Cronbach’s α index = 0.85) and even higher than the original Spanish validation, which was moderate (Cronbach’s α index = 0.77). This finding is similar to those of previous works, which found high internal consistency for online or electronic versions of GHQ-12, WHO-5, and PHQ-9 (Barrigón et al., 2017b).
The need for studies like ours was argued by Alfonsson et al. (2014); in their review, the authors state that although for most self-report symptom scales in psychiatry there is a high reliability between classical and electronic or online versions, this cannot be generalized to all scales, which means we must study every digital transformation. Furthermore, these conclusions can be linked to the findings of the review of online instruments by Van Ballegooijen et al. (2016), who found at least one online measurement for the most common mental health disorders, though only some had been correctly studied and validated.
Our results did not demonstrate good acceptability of the electronic version, as the number of subjects who were recruited and completed the paper version was 130, while only 67 (51.5%) completed both steps and filled out the electronic version of the SHAPS scale. The limited use of an electronic tool such as the MEmind Wellness Tracker tool is an unexpected finding, as it has been shown that subjects tend to prefer electronic methods to traditional ones (Torous et al., 2014). Here, the age gap might play a role, as previous studies were carried out in healthy participants aged 18–24 years (Torous et al., 2014), whereas the mean age of our subjects was around 44 years; however, we found no significant differences in mean age between subjects who completed both scales and those who completed only the paper-and-pencil one. Furthermore, previous studies have demonstrated high levels of internet access and confidence when using the internet among working-age adults (Cruickshank and MacIntyre, 2018). Other explanations for the general lack of success of electronic scales must be considered, such as the additional steps required to complete the electronic version (downloading the app independently), and the one-week time frame given to patients, both of which may explain the loss of participants. Other causes include forgetting to perform the task or not finding time to download the app and complete the electronic version.
Our study is novel in that it is the first validation of an electronic version of SHAPS. In addition, our sample comprises not only subjects diagnosed with depression but also patients with other psychiatric disorders in which anhedonia might play a relevant role, such as psychotic, anxiety, personality, and bipolar disorders, thus providing a more diverse sample and making it possible to extrapolate our results to psychiatric conditions other than depressive disorders.
Our study has certain limitations. First, our small sample size may have impaired our ability to detect differences. However, as we designed the inclusion criteria to include the most common mental diseases in which anhedonia is included as a symptom, our results have a high level of external validity. Second, our sample not only contains the main age range for these diseases but also includes the principal prototypes of patients receiving care in an outpatient clinic, inpatient unit, or a consultation-liaison unit. A third limitation concerns the SHAPS and the dichotomous scoring of answers. This answer format reduces the range of anhedonia expression that can be reflected on the different items of both versions of the SHAPS scale, and could explain the discrepancy found for the item “Enjoy seeing smiling faces,” making this scale of lesser validity. In light of this issue, new scales such as the Dimensional Anhedonia Rating Scale (DARS) have been developed (Arrua-Duarte et al., 2019; Rizvi et al., 2015).
As the validity of electronic tools like this has been proven, our main challenge is to engage people to use them in real-world conditions, that is, without economic incentive. In our first study, in which we explored the use of the MEmind Wellness Tracker in randomly selected general psychiatric patients, 20% of the people who were given access to MEmind finally used the tool (Barrigón et al., 2017a). This low rate likely indicates that the participants believed the electronic tool was not useful. In the present work, the percentage of individuals who used the electronic tool rose to 50% despite there being no therapeutic benefits or other incentives to participate. Finally, in ongoing studies of suicidal patients (Berrouiguet et al., 2019), the level of engagement with electronic tools is increasing, and more than 80% of patients used the tool (Porras-Segovia et al., 2020). These figures probably represent the perceived usefulness of the tool in a different profile of patients.
The potential uses of electronic tools are vast. Electronic and online assessments are not only a better way for physicians to monitor their patients but also hold advantages for patients, who can be evaluated in the convenience of their own home, thus freeing them to devote their appointment time to solving problems linked to their current state, which can be tested by the psychiatrist via real-time online assessments.
Overall, our findings have proven the equivalence between the classic paper-and-pencil and the electronic format of the SHAPS. The digital revolution will require the transformation of the traditional systems into new tools, which must be well-suited to our digital environment. Despite this emerging scenario, electronic scales cannot be used without appropriate validation to guarantee equivalence between tools.
Footnotes
Declaration of Conflict of Interest
The author(s) declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: Enrique Baca-García designed the MEmind Wellness Tracker. The other authors declare no conflicts of interest.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was partly funded by Instituto de Salud Carlos III (PI16/01852), the American Foundation for Suicide Prevention (LSRG-1-005-16), the Ministerio de Ciencia, Innovación y Universidades (RTI2018-099655-B-I00; TEC2017-92552-EXP), the Comunidad de Madrid (Y2018/TCS-4705, PRACTICO-CM), and by Convocatoria de ayudas para la contratación de investigadores predoctorales e investigadores postdoctorales co-funded by Fondo Social Europeo through Programa Operativo de Empleo Juvenil and Iniciativa de Empleo Juvenil (YEI) (PEJD-2018-PRE/SAL-8417).
Data Availability Statement
The data that support the findings of this study are available from the corresponding author, MLB, upon reasonable request.
