Abstract
Aims:
To generate a short version of a newly developed inventory that adopted the conceptual framework of the Substance Abuse and Mental Health Services Administration (SAMHSA) consensus statement on recovery.
Methods:
Through Rasch analysis, this paper presents how this recovery inventory (SAMHSA-RIC), with its original 111 items, can be reduced to a much shorter version with only 41 items.
Results:
Although internal consistency is slightly lowered because of item reduction, the short version maintains satisfactory and significant correlations with quality of life measures. Overall, the canonical correlation between the scale and WHOQOL-BREF was virtually the same, with only a 0.2% decrease.
Conclusions:
SAMHSA-RIC (short version) has strong potential to become a general tool for evaluating rehabilitative services for persons with persistent and severe mental illness. A validation study of the short version with clinical samples is warranted.
Introduction
The Substance Abuse and Mental Health Services Administration (SAMHSA) statement on recovery presented a comprehensive framework for the consumer-based recovery concept, and signalled the gradual extension of recovery from a pure medical concern of symptom control and restoration of functions to a personal and psychosocial process. The statement identified 10 components as essential to recovery. These components not only covered aspects such as clinical improvement and functional normalization, but also consumers’ subjective experiences of optimism, empowerment, interpersonal support, peer support and stigma reduction (SAMHSA, 2005, 2006). This consumer-based recovery model is considered closer to the lived experience of recovery among many who suffered from persistent and severe mental illness (PSMI). However, the general recovery framework is so all encompassing that a great deal of empirical work still remains to be done regarding its many facets.
Empirical research into recovery can be broadly divided into two different lines: one focusing on the demarcation of recovery stages and the development of inventories to identify these stages; the other, a process model without assumption of stages. For the stages model, Andresen, Oades and Caputi (2003) presented an overall review of existing major studies on recovery stages and developed the five-stage model for recovery based on its conclusions: (1) moratorium; (2) awareness; (3) preparation; (4) rebuilding; and (5) growth. This model encompassed the several three- or four-stage models that had been previously created. However, the model must refer to the variable features of each stage before they become clinically meaningful. A common problem with stage division by cluster analysis is the overlapping features of adjacent stages, which make certain clusters easily absorbed by others. For this probable reason, Andresen, Caputi and Oades’s (2006) later empirical study on stage modelling, which includes measures of mental health, psychological well-being, hope, resilience and recovery, named three major stages instead of five.
The other line of research attempts to test the major recovery components without making assumptions about stages and their associated features. Measures of the major recovery components can be compared directly, and a sensible empirical model can be built and tested. The SAMHSA model of recovery, although comprehensive, requires tremendous effort to operationalize, because many of its building blocks were considered subjective and vague (Bellack, 2006). Chiu, Ho, Lo and Yiu (2010) operationalized the SAMHSA recovery concept and, based on 11 existing psychometric scales or sub-scales, developed a 111-item Chinese Recovery Inventory (SAMHSA-RIC) to measure the recovery of 200 community-residing people with schizophrenia spectrum disorders. These items were found to correlate significantly with health-related quality of life (HRQOL) measures, and subsequent analysis of the best-fit factor structure using structural equation modelling showed that the final structural model explained as much as 81% of the variance in the quality of life measures. HRQOL was chosen in the study because of its popularity in schizophrenia research and its proximity to the term ‘recovery’ from the consumer movement (Ho, Chiu, Lo & Yiu, 2010). The high percentage of variance explained by the structural model helped to reveal the relation between the recovery components and HRQOL, although originally we only had a unified statement for the concept of recovery rather than its linkage with HRQOL. The developed scale did not refer to the Psychosis Recovery Scale (Chen, Tam, Wong, Law & Chiu, 2005) because it has a different notion of recovery (i.e. clinical recovery), nor to existing recovery scales such as the Recovery Assessment Scale (RAS) (Corrigan, Giffort, Rashid, Leary & Okeke, 1999; Corrigan, Salzer, Ralph, Sangster & Keck, 2004; Song & Hsu, 2011), which adopted only part rather than the whole of the SAMHSA recovery framework. For example, the RAS also tapped reliance on others and willingness to seek help.
Back in the local Asian context of a generally under-resourced and understaffed mental health system (Chiu, 2012; Tse, Siu & Kan, 2011) and the overwhelming concern over symptom control in conventional psychiatry, it is not surprising for mental health workers to be unfamiliar with the recovery concept and how these recovery principles could be implemented (Mak, Lam & Yau, 2010). Even though there has been the initial suggestion of the use of user participation and the grooming of leaders from within service users as a means to promote recovery practice (Tse, Cheung, Kan, Ng & Yau, 2012), most discussions are conceptual-based rather than evidence-based. It is fair to say that the recovery concept issue is still quite new in Hong Kong and elsewhere in Asia.
Although some researchers once doubted if the individualistic American concept of recovery could be used for societies in other cultures (Tse, 2004; Yee, 2003), the latest work in this line of research has lent empirical support to the SAMHSA recovery concept (Chiu et al., 2010; Ho et al., 2010). Recovery components like hope, desire for personal agency, frustration with stigmas, and so on, are probably universal experiences during the recovery journey, irrespective of cultural differences.
These empirical studies, though promising, have yet to overcome the various difficulties associated with SAMHSA-RIC’s lengthy list of items. The many items of the inventory incur a considerable administration cost and would probably prevent those with limited memory and concentration from taking part in the study. This possible exclusion may deprive researchers of the opportunity to test their models on lower-functioning persons with PSMI. Moreover, the 111-item inventory is too long to be optimal for frequent clinical use. Therefore, there is both a practical and a research need to develop a shorter version.
Rasch model
In this research, Rasch analysis was conducted to reduce the number of items in SAMHSA-RIC. As responses to the items were recorded on ordinal scales, this study employed the polytomous Rasch model, which is a generalization of the dichotomous Rasch model (RUMM, 2005). The dichotomous Rasch model is considered the simplest item-response theory model, being equivalent to a model with a single item parameter (Meads & Bentall, 2008).
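The polytomous Rasch model assumes that the probability of a given response to an item is a logistic function of the relative distance between the item location and the respondent location on a linear scale. Mathematically, in its standard form, it can be written as:
\[
\Pr\{X_{ni} = x\} = \frac{\exp\left( x(\beta_n - \delta_i) - \sum_{k=1}^{x} \tau_{ki} \right)}{\sum_{x'=0}^{m_i} \exp\left( x'(\beta_n - \delta_i) - \sum_{k=1}^{x'} \tau_{ki} \right)}, \qquad x = 0, 1, \ldots, m_i,
\]
where $\beta_n$ and $\delta_i$ are the locations of the nth person and ith item, respectively; $\tau_{ki}$, $k = 1, 2, \ldots, m_i$, are the thresholds that partition the latent continuum of item i into $m_i + 1$ ordered categories (the empty sum for $x = 0$ is taken as zero); and $X_{ni}$ is the random variable of the item score. For a polytomous Rasch model involving three ordered categories scored 0, 1 and 2, we have:
\[
\Pr\{X_{ni} = 0\} = \frac{1}{\gamma_{ni}}, \qquad
\Pr\{X_{ni} = 1\} = \frac{\exp(\beta_n - \delta_i - \tau_{1i})}{\gamma_{ni}}, \qquad
\Pr\{X_{ni} = 2\} = \frac{\exp\left(2(\beta_n - \delta_i) - \tau_{1i} - \tau_{2i}\right)}{\gamma_{ni}},
\]
where $\gamma_{ni} = 1 + \exp(\beta_n - \delta_i - \tau_{1i}) + \exp\left(2(\beta_n - \delta_i) - \tau_{1i} - \tau_{2i}\right)$ is the normalizing sum.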
One of the characteristics of Rasch analysis is that it locates each respondent and each item on a common item-person map according to the above probabilistic relation between the items’ locations $\delta_i$ (‘difficulty’) and the persons’ locations $\beta_n$ (‘ability’). This enables further analysis of the items’ discriminative power on the respondents by using the item-person map (Figure 1). Moreover, in contrast to the traditional modelling approach, which normally requires tuning the model parameters to fit the data, Rasch analysis requires the data to fit the Rasch model. Rasch analysis provides several statistical indices for assessing the fit of the data to the Rasch model, such as individual items’ χ2 fit statistics, item residual scores, and item–trait interaction χ2 fit statistics. Once the fit of the data has been confirmed, their uni-dimensionality is also taken as confirmed. Hence, the item scores can justifiably be added together as a single comprehensive total score (Bond, 2007).
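The probabilistic relation described above can be made concrete with a short numerical sketch. The function below assumes the standard polytomous Rasch parameterization, in which the log-weight of score x is x(β − δ) minus the sum of the first x thresholds. The function name and the example values are illustrative assumptions and are not taken from the study.

```python
import math


def category_probabilities(beta, delta, thresholds):
    """Return [P(X = 0), ..., P(X = m)] for one person-item pair.

    beta       -- person location (logits)
    delta      -- item location (logits)
    thresholds -- [tau_1, ..., tau_m] for this item
    """
    log_terms = []
    cumulative_tau = 0.0
    for x in range(len(thresholds) + 1):
        if x > 0:
            cumulative_tau += thresholds[x - 1]
        # Log-weight of score x: x * (beta - delta) minus the first x thresholds.
        log_terms.append(x * (beta - delta) - cumulative_tau)

    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_terms)
    weights = [math.exp(t - peak) for t in log_terms]
    total = sum(weights)
    return [w / total for w in weights]


# Illustrative values only: a person 0.5 logits above a three-category item.
probs = category_probabilities(beta=0.5, delta=0.0, thresholds=[-1.0, 1.0])
print([round(p, 3) for p in probs])
```

As the person location moves further above the item location, probability mass shifts towards the highest category, which is the property the item-person map visualizes.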

Figure 1. Person-item map of Adult State Hope Scale items.
Item reduction by means of Rasch analysis is not uncommon in other research fields. In psychology, Meads and Bentall (2008) shortened the 48-item Hypomanic Personality Scale to 20 items, while the Cronbach’s α was maintained at a high level (α = 0.8). Moreover, Dreer et al. (2009) also successfully shortened the 25-item Social Problem Solving Inventory (Revised Scale) to 10 items and uni-dimensionality was also confirmed on the 10-item scale. Similar analysis using the Rasch model can also be found in the field of optometry (Pesudovs, Garamendi, Keeves & Elliott, 2003; Ryan, Court & Margrain, 2008), rehabilitation (Siegert, Tennant & Turner-Stokes, 2010; Vidotto, Carone, Jones, Salini & Bertolotti, 2007) and education (Waugh, 2010). There is a sizeable volume of literature that provides a comprehensive review on the Rasch model and its application (Andrich, 2011; da Rocha, Chachamovich, de Almeida Fleck & Tennant, 2013; Hagquist, Bruce & Gustavsson, 2009), and the model’s increasing popularity demonstrates that Rasch analysis might be an alternative method of item reduction for SAMHSA-RIC on top of the conventional use of factor analysis.
Methods
Sample
A total of 204 eligible participants were recruited from two psychiatric outpatient clinics in Hong Kong. The inclusion criteria were: (1) aged 18–60; and (2) primary diagnosis of ‘schizophrenia’, ‘schizo-affective’ or ‘schizophreniform’ disorder. The exclusion criteria were: (1) inability to communicate in mother tongue (Cantonese); (2) global score less than 4 in the Capacity to Report Subjective Quality of Life (CapQOL) inventory screening assessment (a score of 4 or above indicates the ability to complete QOL measures and give valid and reliable answers (Wong et al., 2005)); and (3) discharged from the psychiatric ward during the 30 days preceding the interview. Written consent was obtained from the participants after they were given a detailed description of the study. Ethics approval of the original study had been obtained from the Hospital Authority Cluster Research Ethics Committees before collecting the data. This Rasch analysis is essentially an extended study, this time examining only the measurement items.
Instruments
Eleven (sub-)scales were included in the Rasch analysis. All of the scales were administered by trained research staff through face-to-face interview. These scales included the Adult State Hope Scale (ASHS) (Snyder et al., 1996); the Recovery Attitude Questionnaire (RAQ-7) (Ralph, Kidder & Philips, 2000); the Health Care Climate Questionnaire (HCCQ), a scale assessing the degree of autonomy support that clients perceive their psychiatrists to provide (Williams, Rodin, Ryan, Grolnick & Deci, 1998); the self-responsibility sub-scale of the Exercise of Self Care Agency scale (ESCA; Kearney & Fleischer, 1979; Riesch & Hauck, 1988); the personal competence sub-scale of the Resilience Scale (RS) (Wagnild & Young, 1993; Bengtsson-Tops, 2004; Rosenfield, 1992); the self-esteem, self-efficacy sub-scale of the Making Decision Empowerment Scale (MDES; Rogers, Chamberlin, Ellison & Crean, 1997); and the alienation, perceived discrimination and social withdrawal sub-scales of the Internalized Stigma of Mental Illness (ISMI) scale (Ritsher, Otilingam & Grajales, 2003).
Holistic recovery was assessed in three aspects: (1) psychosocial symptoms (mind and emotion); (2) social support (community); and (3) spirituality. The frequency of psychosocial symptoms was measured by the 15-item psychosocial sub-scale of the Schizophrenia Quality of Life Scale (SQLS; Wilkinson et al., 2000). Social support was measured by the Multidimensional Scale of Perceived Social Support-Chinese version (MSPSS-C) (Chou, 2000). Spirituality was measured by the World Health Organization Spirituality Religion and Personal Belief Scale Hong Kong version (WHOQOL-SRPB-HK) (WHOQOL SRPB Group, 2006). Among the eight facets of WHOQOL-SRPB-HK, only three were selected for this study: the connectedness to a spiritual being or force, spiritual strength and faith sub-scales. This was because only these three provide a ‘pure’ measure of a respondent’s spirituality; that is, they are not complicated by a respondent’s mental health condition (Moreira-Almeida & Koenig, 2006).
Respondents’ quality of life was measured by the Hong Kong Chinese World Health Organization Quality of Life Measure abbreviated version (WHOQOL-BREF(HK)). The WHOQOL perception among people with schizophrenia has been found to be negatively correlated with psychiatric ratings (Chan, Ungvari, Shek & Leung, 2003; Chan & Yu, 2004). All scales were translated directly from English to Cantonese by an experienced linguist, except the MSPSS-C and WHOQOL-SRPB-HK, which were already in Chinese. The English versions of ISMI, MDES, MS, SQLS and RAQ-7 were validated in psychiatric samples, whereas RS, ESCA, ASHS and HCCQ were validated in healthy samples. The reasons for the choice of scales and the reliability and validity issues have already been reported elsewhere (please refer to Chiu et al., 2010 for details).
General steps of Rasch analysis
The Rasch analysis was conducted using the RUMM 2020 software (http://www.rummlab.com.au) on each of the 11 sub-scales of SAMHSA-RIC individually. The sample of 204 participants was considered adequate because Linacre (1994) showed that a sample size of 27–61 gives 99% confidence that item calibrations are stable within ±1 logit, which is accurate enough for this purpose.
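The 27–61 range can be reproduced with simple arithmetic under two assumptions that are not stated in this paper: that the standard error of an item calibration lies between 2/√N and 3/√N, and that a multiplier of about 2.6 is used for 99% confidence. The sketch below is a worked illustration of that reasoning, not a restatement of Linacre's derivation, and the function name is hypothetical.

```python
def sample_size_range(z, tolerance_logits=1.0):
    """Return (best-case, worst-case) sample sizes for a given stability target.

    Assumption: the standard error (SE) of an item calibration satisfies
    2 / sqrt(N) < SE < 3 / sqrt(N). Requiring z * SE <= tolerance then gives
    N >= (k * z / tolerance) ** 2 for k = 2 (best case) and k = 3 (worst case).
    """
    best_case = (2 * z / tolerance_logits) ** 2
    worst_case = (3 * z / tolerance_logits) ** 2
    return best_case, worst_case


# Assumed multiplier of 2.6 for 99% confidence, tolerance of 1 logit.
low, high = sample_size_range(z=2.6)
print(round(low), round(high))
```

Under these assumptions, the present sample of 204 is well above the worst-case requirement.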
In general, four item-reduction steps were followed throughout the analysis. First, each individual item’s χ2 fit statistic was checked to ensure that the item fitted the Rasch model. Any item with a p-value less than .05 was considered a misfit item and was discarded from the scale. The item residual score, which represents the error in the fit of the data to the model from the perspective of the item (RUMM, 2005), was also checked. As a rule of thumb, items with residuals outside the range of ±2.5 were considered to misfit the Rasch model and were discarded if they also had significant item χ2 fit statistics.
Second, differential item functioning (DIF) by gender was evaluated to ascertain if any item was subject to gender bias. Similar to individual-item χ2 fit statistics, a DIF with a p-value less than .05 was considered to be significant and the item would be discarded.
Third, the item-person maps of the scales were checked in order to ensure that the subjects’ responses were evenly spread out, so that the scales could distinguish between subjects at different levels of recovery.
Finally, item–trait interaction χ2 fit statistics were checked for the items remaining after deletion. Non-significant χ2 statistics indicated the overall fit of the data to the Rasch model, which also implied the uni-dimensionality of the scale. Once the uni-dimensionality had been confirmed, the item scores could justifiably be added together (Bond, 2007).
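The first two screening rules can be summarised as a simple flagging routine. The sketch below is illustrative only: the `ItemFit` fields, the function name and the fit statistics are hypothetical, and in practice the statistics would be read from the output of the Rasch software.

```python
from dataclasses import dataclass

ALPHA = 0.05           # significance level used for the chi-square and DIF tests
RESIDUAL_LIMIT = 2.5   # rule-of-thumb limit on the absolute item residual


@dataclass
class ItemFit:
    name: str
    chi2_p: float     # p-value of the individual item chi-square fit statistic
    residual: float   # item fit residual
    dif_p: float      # p-value of the gender DIF test


def screen(item):
    """Return the reasons (possibly none) for flagging an item for removal."""
    reasons = []
    misfit = item.chi2_p < ALPHA
    if misfit:
        reasons.append("chi-square misfit")
    # An extreme residual counts only when the chi-square test is also significant.
    if misfit and abs(item.residual) > RESIDUAL_LIMIT:
        reasons.append("extreme residual")
    if item.dif_p < ALPHA:
        reasons.append("gender DIF")
    return reasons


# Hypothetical fit statistics, for illustration only (not the study's data).
items = [
    ItemFit("item_a", chi2_p=0.40, residual=0.8, dif_p=0.62),
    ItemFit("item_b", chi2_p=0.01, residual=3.1, dif_p=0.30),
    ItemFit("item_c", chi2_p=0.25, residual=-1.2, dif_p=0.02),
]
for item in items:
    print(item.name, screen(item) or "retain")
```

The output is better read as a list of candidates than as an automatic decision, since the item-person map and the reliability of the remaining items also inform which items are finally removed.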
Result validation
A series of pre-post comparisons was made before and after item reduction in order to ascertain the quality of the shortened SAMHSA-RIC. First, changes in Cronbach’s α, the Person Separation Index and the item–trait interaction χ2 fit statistics were assessed to ensure scale reliability. Second, canonical correlation analysis was conducted between SAMHSA-RIC and WHOQOL-BREF before and after item reduction so that the stability of their relationship could be tested.
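For reference, Cronbach’s α for k items equals k/(k − 1) multiplied by one minus the ratio of the summed item variances to the variance of the total score. A minimal sketch of the pre-post comparison follows; the scores are hypothetical, and the function and variable names are illustrative.

```python
def sample_variance(values):
    """Unbiased sample variance (divides by n - 1)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


def cronbach_alpha(scores):
    """scores: one row per respondent, each row a list of k item scores."""
    k = len(scores[0])
    item_variances = [sample_variance([row[i] for row in scores]) for i in range(k)]
    total_variance = sample_variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def keep_items(scores, indices):
    """Return the score matrix restricted to the retained item columns."""
    return [[row[i] for i in indices] for row in scores]


# Hypothetical responses from four respondents to three items.
full = [[1, 2, 3], [2, 2, 4], [3, 4, 4], [4, 4, 5]]
reduced = keep_items(full, [0, 2])  # pretend the second item was removed

print(round(cronbach_alpha(full), 3), round(cronbach_alpha(reduced), 3))
```

Comparing the two values shows how much internal consistency is given up for a shorter scale.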
Results
An example of item reduction
In order to illustrate the item-reduction process but to save the lengthy repetition on all items, ASHS was chosen to show reduction steps. First of all, the χ2 fit statistics of the six items were checked. As items 2, 5 and 6 had significant p-values of .0083, .0210 and .0006, respectively, they should be removed from the scale. However, after deleting the items, Cronbach’s α dropped from 0.78 to 0.56 and was not considered acceptable. As item 5 had the highest p-value among the three, it was selected to be retained in the scale and Cronbach’s α thereby increased to 0.69. For this set of items (i.e. items 1, 3, 4 and 5), the item residual scores were within the acceptable range, no significant DIF statistics were found, item–trait interaction χ2 fit statistics showed the overall fit of the Rasch model and the items in the person-item map basically covered nearly all responses (Figure 1). Hence, items 1, 3, 4 and 5 formed the new group after item reduction.
The items of ASHS before and after item reduction are listed in Table 1. The original version of ASHS had six items that could be divided into two groups of questions: items 1, 3 and 5 were related to pathways (i.e. belief in one’s capacity to generate routes); and items 2, 4 and 6 were related to agency (i.e. belief in one’s capacity to initiate and sustain actions) (Snyder et al., 1996). The result of item reduction showed that items 2 and 6, both related to agency, were deleted from the scale. Item 4, which is also related to agency, was retained, contrary to our expectation: although the item-person map (Figure 1) suggested that its discriminant power was weak, deleting it would have lowered Cronbach’s α from 0.69 to 0.61, which we considered too low for clinical use.
Table 1. Items of ASHS before and after item reduction.
Number of items reduced
After item reduction by Rasch analysis, the number of items of each sub-scale of SAMHSA-RIC was reduced. Although some only had minor reductions from six to five items, many long scales were shortened to half of their original length. Overall, the Chinese recovery scale was shortened from 111 to 41 items, around just one-third of its original length (Table 2).
Table 2. Number of items in the SAMHSA-RIC before and after item reduction.
Scale reliability
From Table 3 we see that, before item reduction, the item–trait interaction χ2 fit statistics of five sub-scales had p-values less than .05, meaning that the data of those scales did not fit the Rasch model. After item reduction, all p-values were above .05, and hence the reduced scales fit the Rasch model. All of the sub-scales showed some decrease in Cronbach’s α. However, the final α values were still within the acceptable range (α > 0.6) and suitable for clinical use (Moss et al., 1998). In addition, some of the sub-scales, such as WHOQOL-SRPB, maintained a high level of internal consistency (α = 0.89). The result for the Person Separation Index was similar to that for Cronbach’s α, with all values also above 0.6. Such a result is expected because the Person Separation Index is considered an analogue of Cronbach’s α (Curt, 2007; Piquero, Macintosh & Hickman, 2002).
Table 3. Scale reliability indices of the SAMHSA-RIC before and after item reduction.
Canonical correlation analysis between SAMHSA-RIC and WHOQOL-BREF
Table 4 shows the result of canonical correlation analysis between SAMHSA-RIC and WHOQOL-BREF before and after item reduction. The overall canonical correlation, which is the correlation between the canonical variates of SAMHSA-RIC and WHOQOL-BREF, was virtually the same, with only a 0.2% decrease. As for the cross-canonical loadings, which are the correlations between a sub-scale score and the opposite scale’s canonical variate (e.g. the correlation between the sub-score of hope in SAMHSA-RIC and the canonical variate of WHOQOL-BREF), seven out of the 16 were strengthened. Of the nine weakened loadings, only three belonging to SAMHSA-RIC had a decrease larger than 5% after item reduction. They represented holistic well-being/social support, empowerment and holistic well-being/spirituality, which had decreases of 14.8%, 11.3% and 7.8%, respectively. We considered the decreases to be acceptable because the magnitude of the three loadings after item reduction was still above the standard acceptable threshold of 0.3.
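The decision rule applied to the loadings (relative change, a 5% cut-off for notable decreases, and a floor of 0.3) can be written out as a short sketch. The loadings used here are hypothetical and do not reproduce Table 4; the function name and labels are illustrative.

```python
def review_loading(before, after, max_drop=0.05, floor=0.3):
    """Classify the change in a cross-canonical loading after item reduction.

    Returns (relative_change, label), where relative_change is computed on
    the magnitudes of the loadings.
    """
    change = (abs(after) - abs(before)) / abs(before)
    if change >= 0:
        label = "strengthened"
    elif -change <= max_drop:
        label = "minor decrease"
    elif abs(after) >= floor:
        label = "larger decrease, still above floor"
    else:
        label = "below floor"
    return change, label


# Hypothetical (before, after) loadings, for illustration only.
examples = {"sub-scale A": (0.50, 0.55), "sub-scale B": (0.50, 0.49),
            "sub-scale C": (0.50, 0.43), "sub-scale D": (0.32, 0.25)}
for name, (before, after) in examples.items():
    change, label = review_loading(before, after)
    print(f"{name}: {change:+.1%} ({label})")
```

Only loadings in the last category would call the shortened sub-scale into question under this rule.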
Table 4. Cross-canonical correlation analysis between SAMHSA-RIC and WHOQOL-BREF before and after item reduction.
Discussion
The study successfully shortened the 111-item SAMHSA-RIC to 41 items while maintaining the scope of the original scale, with the trade-off of a slightly lowered internal consistency because of the significant reduction in the number of questions. This demonstrated, similar to Meads and Bentall’s (2008) study, that although many original scales have good internal consistency, item reduction would inevitably reduce the level of internal consistency. Nevertheless, the result of canonical correlation analysis provides a more comprehensive picture on the effect of the item reduction. The magnitude of the canonical correlation after item reduction was virtually the same as before. Generally, around half of the cross-canonical loadings obtained were stronger than the original scale. In other words, for the short version some sub-scales have a slightly increased loading with the opposite scale’s canonical variate, while some sub-scales have a slightly decreased loading. This balanced picture has provided us with better confidence that the short version’s efficacy will not be significantly reduced. Removing some of the less effective items actually not only caused less of a burden for the respondents, but also added strength to some sub-scales like hope, person-centred recovery, self-responsibility and personal strength in terms of the relations with the WHOQOL-BREF’s canonical variate. Selecting items (by deletion) and putting them under the SAMHSA recovery framework may render some of the question items less effective than others in measuring all the facets of recovery that the original full scale intended to measure. However, the remaining items that survived the deletion now have a new mission, which is to measure the recovery concept rather than individual traits.
Limitations
One limitation of this research concerns uni-dimensionality (Falissard, 1999). Because the 11 sub-scales of SAMHSA-RIC were originally independent scales that may be used for other purposes, the number of response options differed between scales (ranging from four to eight). This difference in the number of response options made fitting the whole scale to the Rasch model as a single block unfeasible, and the uni-dimensionality of the whole scale therefore could not be tested. The use of a single total score to represent the extent of recovery, although technically feasible, would not be justifiable. It will be meaningful only when the response range of all items has been aligned to the same scoring options.
Another limitation is related to the nature of the sample. The participants were recruited consecutively rather than sampled at random from those suffering from schizophrenia. Their ages covered a wide range, from 20 to 60, with a mean of 42. The sample characteristics are likely to differ considerably from those of a younger cohort, who in general have had a shorter delay before first consultation and treatment and a better treatment outcome in terms of symptoms and disruption. This study may therefore represent those with more chronic illness, and its findings might not be applicable to younger groups. Unfortunately, the spread of the age range and the resulting small number of subjects per group do not allow us to analyse age effects reliably.
Conclusion
In connection with the limitation on uni-dimensionality, one future research direction is to align the number of response options across all sub-scales and re-analyse them with the Rasch model. Once uni-dimensionality has been confirmed by Rasch analysis, the item scores can justifiably be added together, and a comprehensive total score representing the overall recovery stage can then be obtained. It is also necessary to test, in a subsequent validation study, whether a straightforward single total score is feasible. The total score would not only be useful for clinicians, but would also serve as a direct reference index for persons with PSMI to understand their recovery progress. At the current stage, however, we suggest calculating separate scores for each of the 11 sub-scales, as overall uni-dimensionality has not been confirmed.
To make the survey a valid and handy tool to measure consumer-oriented recovery (Bellack, 2006), a validation study is warranted to establish its divergent, convergent and predictive validity. If proven valid, a user-friendly tool like SAMHSA-RIC would allow many health care and social care programmes to be evaluated beyond the traditional outcome indicators of symptom control and hospital readmissions, and possibly serve as an important criterion of psychiatric rehabilitation (Schrank & Slade, 2007).
Footnotes
Funding
This research received no specific grant from any funding agency in the public, commercial or not-for-profit sectors.
