Abstract
By the late 1970s it was clear that cognitive and behavioral therapies were promising alternatives to antidepressant medications for treatment of depressed outpatients. One such model of therapy, Social Skills Training, was developed by Michel Hersen and his colleagues specifically for treatment of depressed women. Professor Hersen and his colleagues obtained funding from the National Institute of Mental Health to conduct the first well-controlled randomized trial of this intervention, contrasting Social Skills Training, in combination with either placebo or active amitriptyline, against two active standards: amitriptyline alone and time-limited, psychodynamic psychotherapy in combination with placebo. The results of this study suggested that Social Skills Training (plus placebo) was at least as effective as amitriptyline alone or psychodynamic psychotherapy (plus placebo), with superior mode-specific effects on measures of social skill. The current narrative, which provides an autobiographical perspective of four critical years (1980-1984) in the early career of the author that were intertwined with the conduct and completion of this clinical trial, is an homage to Professor Hersen’s talents as a supervisor, researcher, and mentor.
I came to the Western Psychiatric Institute and Clinic (WPIC) of the University of Pittsburgh Medical Center in 1979 to begin residency training in general psychiatry. I chose the WPIC program specifically because in that era it was a veritable promised land for a fledgling behavior therapist. Although there were in fact a number of strong residency programs, no other Department of Psychiatry offered such a strong cadre of federally funded behaviorally oriented researchers. As my own interests gravitated toward mood disorders, it was my privilege to have the opportunity to work with Michel Hersen, who was conducting a National Institute of Mental Health (NIMH)-funded study of social skills training for the treatment of women with major depressive disorder in collaboration with colleagues Alan Bellack and Jonathan Himmelhoch. This was an exciting time for the development of behavioral and cognitive-behavioral therapies for depression, and the results of several published studies had suggested that these therapies could be at least as effective as antidepressant medications. The opportunity to learn how to formulate, design, and execute this type of “cutting-edge” comparative outcome research was exactly what I was looking for in Pittsburgh. I will always be grateful to Michel for giving me the opportunity to get started in this process so early in my career.
The First Year
Learning how to conduct research is usually not a high priority for psychiatric residents. In fact, at WPIC, there was no opportunity to conduct research in the residency training program curriculum until the fourth year. Rather, residency training is focused on learning the intricacies of diagnosing psychiatric conditions, mastering psychopharmacology, and conducting psychotherapy. Residents typically begin to learn psychotherapy after completion of internship. Our initial psychotherapy training at WPIC included a didactic course that covered interviewing skills and the importance of delivering core therapeutic qualities, like empathic listening, as well as an introduction to the principles of psychodynamic psychotherapy, which was the orientation of my first psychotherapy supervisor. Because I had learned some of the basic strategies of behavior therapy prior to medical school, I was able to move at a somewhat accelerated pace and Michel agreed to be my second therapy supervisor. At our first meeting, Michel gave me a copy of the treatment manual for social skills training that was being used in their ongoing clinical trial (Bellack, Hersen, & Himmelhoch, 1980) as well as a paper describing the findings of their pilot project (Wells, Hersen, Bellack, & Himmelhoch, 1979). We met weekly for supervision from July 1980 through June 1981.
Michel was an engaged and attentive supervisor and never seemed distracted by his other academic demands; only later did I learn to appreciate what valuable qualities these were for a supervisor. Michel was also a pragmatic, honest, and directive psychotherapy supervisor, who could be blunt when the circumstances justified it. Unlike my psychodynamic psychotherapy supervisor, who was filled with creative insights about what might be happening in the mental life of my patients and within our therapeutic relationships, Michel asked me to audiotape the sessions, and we listened to these tapes during supervision sessions and talked about what was actually happening. Michel would point out what was going well and what seemed to be missing the point, and on occasion we would use role-playing to practice, rehearse, and shape more competent therapist behaviors.
For the first 6 or so months, Michel endured my not altogether successful attempts to blend this relatively structured intervention within a longer term treatment model in which I was also handling the pharmacotherapy. Our first case together was what could euphemistically be called “a therapeutic challenge”: a woman (let’s call her Mrs. K) who was slow to recover from a severe episode of depression that had required hospitalization earlier that year. I had been her resident during her inpatient admission and had already seen her as an outpatient for several months when Michel began as the supervisor. Though I thought that her medication management was well in hand, it was clear that I was adrift with the therapy. Mrs. K was in the midst of an unwanted divorce; the depressive episode had begun when she had learned that her husband was having an affair with his secretary. Mrs. K very much liked to talk about the injustice of her predicament and her belief that anyone in her circumstance would have similar emotional problems. At times, it seemed like Mrs. K could talk nonstop for the whole hour. Michel observed that I was inadvertently rewarding her complaining behaviors and recognized that, whatever subjective benefits Mrs. K was obtaining from emotional ventilation, a nondirective therapy would be unlikely to help hasten her recovery. Over the next several months, Michel helped me to try to provide structure, set more measurable goals, and try to maintain a therapeutic focus on teaching Mrs. K social skills training as an alternate way of understanding the factors that were maintaining the depressive symptoms and to try out other ways of behaving that might be more productive. In retrospect, what was happening in supervision mirrored what was happening in therapy. However, while I was motivated to be a good supervisee, Mrs. K never fully “bought in” to this way of participating in therapy and, whenever possible, she gravitated toward long monologues.
Despite this tension in the process of therapy, Mrs. K did make some progress overall. Nevertheless, she never became asymptomatic and our therapeutic struggles came to an abrupt end when she developed an unequivocal manic episode (her first). Thereafter, she received shorter sessions with medication management as the predominant focus of her treatment.
During the latter part of the year, Michel supervised a second case, a young college student with a milder depression who was a better fit for the social skills training. I followed the protocol and my patient got better relatively quickly. As is also the case when learning to prescribe medications, it is a very gratifying experience to see someone get better quickly as an apparent consequence of the treatment that you are delivering and these kinds of experiences can instill considerable therapeutic optimism and confidence in the intervention. Although I must confess to subsequently embracing a more cognitive-behavioral approach as my preferred method of therapy (as it turned out, Maria Kovacs, one of Aaron T. Beck’s early students, was my next psychotherapy supervisor), to this day I remain convinced of the value of straightforward behavioral interventions, which today are grouped together as behavioral activation strategies.
Joining the Team
Toward the end of our year of supervision, Michel and I talked less about conducting therapy and more about studying the effects of therapy. Michel correctly identified that I was more interested in becoming a treatment researcher than a “master therapist,” and he gave me a prepublication copy of a paper describing the interim results of their randomized controlled trial (RCT; Bellack, Hersen, & Himmelhoch, 1981). It was exciting to see the project that I had heard about for several years coming to fruition, particularly since the interim analysis suggested that social skills training might have superior antidepressant effects compared with amitriptyline. As that project was winding toward completion, Michel asked me to give the paper a careful reading and to let him know my thoughts on ways to improve the analysis and presentation of the final results of the project. The following year, Michel received a grant from the NIMH to foster research training, and in 1982 I was honored to be selected to be in the inaugural class of this postdoctoral fellowship program. This invaluable experience enabled me to devote about one quarter of my time over the next 2 years to working with Michel on the final analyses and papers from this project.
Description of the Study
It was immediately clear that this study was “state of the art” for comparative efficacy research in the early 1980s. It was an RCT and enrolled a relatively large sample of 120 women who met criteria for a principal diagnosis of primary unipolar depression according to the so-called Washington University criteria. The trial had two stages: a 12-week acute phase to assess the initial effects of therapy and a 6-month “maintenance” phase to evaluate the durability of these effects. Outcomes were assessed by two venerable measures of depression: the 24-item Hamilton Depression Rating Scale (HAM-D), which was administered by an independent evaluator, and the self-report Beck Depression Inventory (BDI). In addition to examining change in depressive symptoms, rates of “significant improvement” were compared. For the purposes of this study, this was defined as a final score of 10 or less on both the HAM-D and the BDI. Although this particular definition did not gain wide acceptance in comparative treatment research, it certainly approximates the contemporary definition of remission. In addition, a standardized behavioral test of social skill was obtained to determine whether social skills training had a specific, theoretically relevant effect.
The primary goal of the study was to ascertain the efficacy of social skills training, both alone and in combination with the tricyclic antidepressant (TCA) amitriptyline. Social skills training was conducted by experienced PhD-level behavior therapists according to the manual developed by Michel and his colleagues. The study employed two active comparators: amitriptyline monotherapy and time-limited dynamic psychotherapy. To equate expectations for drug effects across the four treatment arms, all patients attended visits at the Affective Disorders Clinic and saw a senior nurse clinician for management sessions; patients randomized to the two psychosocial “monotherapy” conditions also received (double blind) inert pill placebos. The amitriptyline monotherapy condition thus served as the active standard of comparison against which social skills training, both as the single active intervention and in combination with amitriptyline, was compared. The second comparison group, time-limited dynamic psychotherapy, was likewise provided by highly experienced PhD-level psychotherapists who were theoretically committed to this model of intervention. The inclusion of a time-limited dynamic psychotherapy arm in the study design permitted an assessment of the mode-specific effects of social skills training by ensuring that patients received the same amount of contact with an expert psychotherapist. In retrospect, the only hints of an allegiance bias in the study are (a) the lack of representation of the psychodynamic therapists in the authorship of the report and (b) the absence of a mode-specific measure of insight or conflict resolution.
The validity of the primary contrasts was of course contingent upon delivery of adequate courses of pharmacotherapy and time-limited dynamic psychotherapy. Pharmacotherapy was supervised by Jonathan Himmelhoch, a psychiatrist who was an expert psychopharmacologist. Amitriptyline was initiated at 50 mg and titrated upward over the first 4 weeks of the protocol to a maximum dose of 300 mg/day if tolerated. The adequacy of amitriptyline therapy during the trial was supported by the mean doses (202 mg/day and 178 mg/day, respectively, for the monotherapy and combination conditions), which to this day are respectable outpatient doses for this venerable TCA. No such index of therapeutic integrity was available for the psychotherapy condition, although it did appear that session attendance and the likelihood of completing a full course of treatment were very similar in the two psychosocial monotherapy conditions.
The Main Findings
As reported in the main outcome article (Hersen, Bellack, Himmelhoch, & Thase, 1984), analyses of covariance of the depression measures indicated that there were no significant differences among the four treatments on the BDI or HAM-D at either the 6-week or 12-week assessments. In fact, participants in all four treatment conditions experienced large and clinically meaningful reductions in levels of depressive symptoms. Rates of significant improvement, reported first for the intent-to-treat sample (rates for completers are reported in brackets), were as follows: social skills training plus placebo, 49% [64%]; social skills training plus amitriptyline, 30% [43%]; dynamic psychotherapy plus placebo, 32% [45%]; and amitriptyline, 23% [43%]. During the 6-month maintenance phase, there was no evidence that one treatment had more durable effects than the others: symptomatic improvements were typically maintained.
As reported in a second article summarizing key secondary outcomes (Bellack, Hersen, & Himmelhoch, 1983), the depressed women who received social skills training did experience significantly greater improvements on the behavioral test of assertiveness skills as compared with the participants who were randomized to receive amitriptyline or psychodynamic therapy plus placebo. Thus, although social skills training did not have statistically significantly greater antidepressant effects in this study, it did have a significant mode-specific effect.
Examining the Potentially Moderating Effects of Endogenous Depression
At the time that this study was conducted, it was commonly believed that depression occurred in two prototypic forms: a more severe and biologically mediated form (i.e., endogenous depression) and a less severe, more psychologically mediated form (i.e., neurotic or nonendogenous depression). It was also widely assumed that TCAs such as amitriptyline would be more effective for patients with endogenous depression, whereas interventions such as social skills training and time-limited dynamic psychotherapy would be more effective for patients with nonendogenous depressions. My particular contribution to this research project was to try to test this secondary hypothesis.
This task was not as easy as it might seem: The diagnostic criteria that were used for this study did not include subtyping for endogenous or neurotic depressions. Fortuitously, the year before I began to work with the data from this project, my new psychotherapy supervisor had published an article that used a subset of items from the HAM-D to approximate the diagnosis of endogenous depression (Kovacs, Rush, Beck, & Hollon, 1981) to conduct secondary analyses of an earlier study of Beck’s model of cognitive therapy (Rush, Beck, Kovacs, & Hollon, 1977). The specific items included in this subscale had face validity, and it had the added credibility of having been suggested by a noted psychopharmacology researcher (Donald F. Klein). Nevertheless, its psychometric characteristics had not been properly evaluated and it had not been validated against clinical diagnoses of the endogenous subtype. I spent the next year working on these tasks, which necessitated applying the Research Diagnostic Criteria (RDC) and the Diagnostic and Statistical Manual of Mental Disorders (3rd ed.; DSM-III; American Psychiatric Association, 1980) criteria for melancholia to the case records of the 120 study participants. The results of the validation of what came to be called the Hamilton Endogenomorphy Subscale (HES) were published the following year (Thase, Hersen, Bellack, Himmelhoch, & Kupfer, 1983). We found that the distribution of HES scores was bimodal and that patients with high “endogenomorphy” scores (nearly 60% of the sample) were also very likely to be classified as either probable or definite endogenous depression according to RDC or melancholia according to the DSM-III criteria.
However, only a small proportion of the study participants, about 25%, met criteria for both definite endogenous depression and melancholia, which suggested that although many of the study participants had a number of endogenous symptoms, the study group as a whole was perhaps not ideal to test the hypothesis that social skills training was an effective treatment for endogenous depression.
I spent the next year trying to make lemonade out of this very large basket of lemons. There were many interesting trends in the data, but I was frequently thwarted by the effects of dwindling subsample sizes on the probability of finding a statistically significant difference and, for the most part, I remained oblivious to the insidious effects of alpha inflation. Fortunately, most of these post hoc analyses of various patient subgroups never saw the light of day in the published literature. Nevertheless, the results of these analyses were generally consistent with the main findings of the study: The patients who received social skills training experienced considerable symptom improvement whether or not they had high HES scores or met criteria for probable or definite endogenous depression, and the addition of amitriptyline to social skills training did not improve outcomes in the more severely depressed patients compared with the less severe patients (Thase, 1983). Given the methodological limitations of our approach to classifying endogenous depression (whether using a recently validated subscale score or retrospectively determined subtype diagnoses), we decided not to try to publish a separate paper on the relationship of treatment outcome to various classifications of endogenous depression. Instead, we chose to publish a paper on the outcomes of the four patients from the social skills training monotherapy group with the most unequivocal diagnoses of endogenous depression as a small standardized case series (Thase, Hersen, Bellack, Himmelhoch, Kornblith, & Greenwald, 1984). Three of these four patients responded to social skills training plus placebo, with two meeting criteria for significant improvement. Interestingly, both of the significantly improved patients experienced a depressive relapse during the first 2 years after completing the treatment protocol.
The View From 30 Years Later
The comparative study of social skills training crafted by Michel Hersen, Alan Bellack, and Jonathan Himmelhoch was a landmark clinical trial and should have established this form of therapy as one of the interventions of greatest interest in the next generation of psychotherapy research. The study was particularly distinguished by the use of two active comparison groups and by the use of pill placebo to equate contact with staff at the Affective Disorders Clinic. Other noteworthy features included the use of blinded independent evaluations of the primary outcome measures, “true believers” to deliver all of the therapies, and a validated in vivo test of mode-specific benefit. As things turned out, the competing renewal application for this project, which focused on the comparison of social skills training against an upgraded pharmacotherapy condition, did not receive federal funding and there were no further large-scale studies of this form of psychotherapy. In this regard, social skills training shared the same fate as other more behaviorally focused interventions, whereas Beck’s model of cognitive therapy and Klerman and Weissman’s interpersonal psychotherapy were selected for further study in the large-scale, multicenter NIMH Treatment of Depression Collaborative Research Program (TDCRP). I always thought that it was somewhat ironic that the two principal psychotherapists for the time-limited psychotherapy arm of our study (Stan Imber and Paul Pilkonis) led the Pittsburgh site of the TDCRP study.
Before writing this reflective piece, it had been at least 20 years since I last read the main outcome article (Hersen et al., 1984). During that interval, I have read many, many other comparative outcome studies, including some to which I have contributed and several that I helped to lead. When I look at the table summarizing the significant improvement rates, I am struck by the approximately 20% advantage favoring the social skills training plus placebo group over the amitriptyline alone and dynamic psychotherapy plus placebo control groups. In 2012, such a difference in remission rates would certainly be viewed as a clinically significant difference if it were statistically significant. Was this an example of what is now called a Type II error? Statistical power was a novel concept when this study was designed in the late 1970s, and from a contemporary perspective all of the comparative treatment studies of this era were underpowered to test their primary aims. What if the study had been twice as large or if Michel and his colleagues had marshaled their resources and allocated the 120 subjects between only two arms? Such wistful questions fly in the face of harsh realities: a 240-patient study would not have been feasible at a single site and a simple two-group comparison of social skills training and either amitriptyline or time-limited dynamic therapy alone would have been a fundamentally different study and one that might not have passed the test of peer review for NIMH funding.
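The power question raised above can be made concrete with a rough calculation. The sketch below is illustrative only and is not an analysis from the original trial. It uses the standard normal approximation for a two-sided comparison of two independent proportions, treats the observed intent-to-treat rates (49% vs. 23%) as if they were the true rates, and assumes equal allocation of 30 patients per arm (120 divided across four arms), which the report summarized here does not state. The function name `power_two_proportions` is hypothetical.

```python
from statistics import NormalDist


def power_two_proportions(p1, p2, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided z-test for two independent proportions.

    Normal approximation; the negligible probability of rejecting in the
    wrong direction is ignored. Assumes equal group sizes (an assumption).
    """
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    # Standard error under the null hypothesis (pooled proportion).
    p_bar = (p1 + p2) / 2
    se_null = (2 * p_bar * (1 - p_bar) / n_per_arm) ** 0.5
    # Standard error under the alternative (unpooled).
    se_alt = (p1 * (1 - p1) / n_per_arm + p2 * (1 - p2) / n_per_arm) ** 0.5
    return norm.cdf((abs(p1 - p2) - z_crit * se_null) / se_alt)


if __name__ == "__main__":
    # Assumed true rates: the observed intent-to-treat "significant improvement"
    # rates for social skills training plus placebo and amitriptyline alone.
    for n in (30, 60):
        print(n, round(power_two_proportions(0.49, 0.23, n), 2))
```

Under these assumptions, power is only a little better than a coin toss with 30 patients per arm and reaches the conventional 80% threshold only when the number per arm is doubled, which corresponds to either of the two hypothetical designs mentioned above (a 240-patient four-arm study or 120 patients split between two arms).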
Looking back, the relatively poor showing of amitriptyline, particularly when combined with social skills training, continues to cast doubts about the “assay sensitivity” of the study for TCA response, which—in the absence of a pill placebo alone condition—will remain a moot point. As noted earlier, only about a quarter of the sample met criteria for definite endogenous depression, and although the HAM-D scores at pretreatment (ranging from 21.5 to 25.5 across treatment groups) seem respectably high at first glance, one must keep in mind that the 24-item version of the HAM-D used in this study results in about a 40% inflation in scores. Thus, approximately one half of the study participants would have been considered to have “less severe” depression by contemporary standards. Results of some of the secondary analyses pertaining to attrition did suggest that this patient group had problems tolerating amitriptyline therapy (Last, Thase, Hersen, Bellack, & Himmelhoch, 1985; Thase, Last, Hersen, Bellack, & Himmelhoch, 1984).
At the time the study was conceived, it was widely considered that, as the efficacy of the TCAs had been demonstrated unequivocally, it was unnecessary (and perhaps ethically questionable) to include a placebo-only condition in a comparative efficacy design. Hence, the decision of Michel and his colleagues to include the time-limited psychotherapy control group instead of a pill placebo alone control condition makes good sense in historical context. However, with respect to the use of active comparators, the concept of statistical power is again relevant: Although the comparative studies of this era were underpowered to detect superiority, they had simply no chance of demonstrating noninferiority. Thus, with an intent-to-treat rate of “significant response” of only 23% in the amitriptyline arm, one cannot conclude if the treatments were comparably effective or similarly ineffective. This conundrum underscores the wisdom of the decision to include a pill placebo control group in the NIMH TDCRP (Elkin et al., 1989) and in several subsequent comparative studies using newer generation therapies (DeRubeis et al., 2005; Dimidjian et al., 2006).
Final Thoughts
Although my day-to-day work with Michel ended for most intents and purposes in 1984, it is no small coincidence that I stayed in Pittsburgh after completing the fellowship. Michel and I have stayed in touch over the years and I have occasionally contributed a chapter to one of his edited books or reviewed a paper for one of his journals. I learned things from my work with Michel that have shaped my career and informed the direction of my research across three decades. Importantly, I learned that if there was a neurobiological boundary delimiting psychotherapy response in depression, it was unlikely to be found by simply using cross-sectional clinical assessments of symptom severity. Subsequently, our team has examined whether evidence of neurobiological dysfunction at pretreatment, such as changes in excretion of the stress hormone cortisol (Thase, Dubé, et al., 1996), sleep electrophysiology (Thase, Simons, & Reynolds, 1996), or functional magnetic resonance imaging and pupillary responses to words (Siegle, Carter, & Thase, 2006; Siegle, Steinhauer, Friedman, Thompson, & Thase, 2011) might identify patients with a greater or lesser likelihood of responding to cognitive-behavior therapy. Whether these findings actually identify depressed patients who are simply more difficult to treat or who require more biologically targeted interventions remains an unanswered question.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
