Biomarkers in DSM-5: Lost in translation

DSM Digest

There are no biomarkers in the Diagnostic and Statistical Manual of Mental Disorders (DSM)-5, though their inclusion was a goal to which the planners aspired (Kupfer et al., 2002). What happened?

The main impediment was an unrealistic expectation: seeking certainty in biomarkers, yet tolerating the generally weak discriminative performance of definitional signs and symptoms. A misunderstanding clearly persists about how biomarkers are used in medicine. Most laboratory tests are probabilistic, not pathognomonic, markers of disease. They can assist in ruling in or ruling out diagnoses. Few have exceptional sensitivity and specificity, however, and, even if they do, their predictive value depends critically on disease prevalence. Phenylketonuria, with a prevalence of 1 in 10,000 births, is a case in point: even though neonatal screening tests for this condition have excellent sensitivity (100%) and specificity (99.95%), most positive test results are false positives (Galen and Gambino, 1975). Expressed in Bayesian terms, the prior probability matters. This same principle applies when making diagnoses, whether relying on symptoms or biomarkers. As the various DSMs have demonstrated, less than perfectly discriminative symptoms can be of some use in diagnosis. Likewise, less than perfect biomarkers can be useful in guiding diagnosis and prognosis. The glucose tolerance test and the electroencephalogram are good examples (DECODE Study Group, 2001; Goodin and Aminoff, 1984).
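The phenylketonuria arithmetic can be checked with a short Python sketch. It uses only the prevalence, sensitivity, and specificity quoted above; the function name is illustrative and does not come from any cited source.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Probability of disease given a positive test, by Bayes' theorem."""
    true_positives = prevalence * sensitivity
    false_positives = (1.0 - prevalence) * (1.0 - specificity)
    return true_positives / (true_positives + false_positives)


# Figures quoted in the text for neonatal phenylketonuria screening.
ppv = positive_predictive_value(
    prevalence=1 / 10_000,
    sensitivity=1.00,
    specificity=0.9995,
)
print(f"Positive predictive value: {ppv:.3f}")
print(f"Share of positives that are false: {1 - ppv:.3f}")
```

With these inputs roughly one positive result in six is a true positive, so about five in six are false positives despite the near-perfect test characteristics.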

Some key principles apply. First, an iterative approach is essential (Carroll, 1989). Failure to understand this principle dooms efforts to develop biomarkers. We know that current, symptom-based definitions of psychiatric disorders are just placeholders, limited by etiologic heterogeneity and unimpressive reliability or stability, so it is illogical to treat these as unquestioned gold standards. Second, longitudinal studies are needed to complement cross-sectional studies. There are several reports of apparently false-positive biomarker results being reclassified as true positives in light of subsequent diagnostic change (Carroll, 1989). That is a form of iteration. Third, making a diagnosis is not the same as defining a disorder: diagnoses are casewise probability estimates, subject to revision as new information arrives. That new information may be independent of the defining DSM criteria. For example, clinically salient features such as family history and course of illness are excluded from DSM criteria, even though they are among the originally proposed validators of psychiatric diagnosis (Robins and Guze, 1970). In practice, establishing a family history of recurrent depression or suicide can shift the probability of melancholic depression from 50% in an ambiguous case to 80%, which is the usual treatment threshold (Carroll, 2012). Biomarkers operate in the same way. The incremental confidence they provide is case specific, and the resulting change of diagnostic certainty depends on the prior probability of the diagnosis for that patient. Diagnostic use of electroencephalography (EEG) in suspected epilepsy illustrates these principles nicely (Carroll, 1989; Goodin and Aminoff, 1984). Bayes’ theorem provides the statistical basis of this iterative differential diagnostic process (Bianchi and Alexander, 2006).
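The family-history example can be restated in the odds form of Bayes' theorem (posterior odds = prior odds × likelihood ratio). The sketch below is illustrative only. The likelihood ratio is not reported in the text; it is back-calculated from the quoted shift from 50% to 80%. The second patient, with a 20% prior, is a hypothetical added to show that the same evidence produces a different final probability when the prior differs.

```python
def to_odds(probability):
    """Convert a probability to odds in favour."""
    return probability / (1.0 - probability)


def to_probability(odds):
    """Convert odds in favour back to a probability."""
    return odds / (1.0 + odds)


# Shift quoted in the text: 50% before, 80% after a positive family history.
prior, posterior = 0.50, 0.80

# Likelihood ratio implied by that shift (an inference, not a reported value).
implied_lr = to_odds(posterior) / to_odds(prior)
print(f"Implied likelihood ratio: {implied_lr:.1f}")

# Hypothetical second patient with a lower prior probability of 20%.
hypothetical_prior = 0.20
updated = to_probability(to_odds(hypothetical_prior) * implied_lr)
print(f"Same evidence, 20% prior -> {updated:.2f}")
```

Under these assumptions the implied likelihood ratio is 4, and the same finding moves the hypothetical 20% prior only to 50%, which is below the 80% threshold mentioned above.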

A case example will demonstrate the Bayesian application of biomarkers in psychiatric diagnosis. A disheveled elderly gentleman presents with depressive, psychotic, and cognitive features of uncertain duration. The referring physician suggests psychotic depression with mood-related cognitive change. The emergency room physician suggests schizophrenia with negative symptoms and demoralization. The inpatient chief resident wants to rule out dementia with apathy and delusions. The attending psychiatrist estimates the probabilities of these differential diagnoses at 65%, 20%, and 15%, respectively. These prior probabilities are shown in Table 1. A dexamethasone suppression test (DST) (Carroll et al., 1981) is now performed. The typical rates of positive and negative DSTs for each candidate diagnosis are also shown in Table 1. These are termed conditional probabilities. After a positive DST result is found, Bayes’ theorem updates the probabilities of each candidate diagnosis in light of the test result and of the prior and conditional probabilities. These updates are shown as revised or posterior probabilities. The probability of psychotic depression increases from 65% to 82%, and the probability of each alternative diagnosis decreases. This result occurs despite the fact that the specificity of the DST is far from perfect: it is 80% for schizophrenia and only 65% for dementia. In general, the weaker the specificity of the test (and most are indeed weak), the more the prior probability matters. If a negative test result were obtained, then the revised diagnostic probabilities would be as Table 1 also shows: psychotic depression would still be the most likely diagnosis. These observations demonstrate that the DST (like most other, similar tests) serves primarily to rule in rather than to rule out the leading candidate diagnosis.
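The update described above can be sketched in a few lines of Python. The priors and DST rates are those given in Table 1; the function and variable names are illustrative. The sketch assumes that the three candidate diagnoses are mutually exclusive and exhaustive, as the case example does implicitly.

```python
def bayes_update(priors, likelihoods):
    """Return posterior probabilities for mutually exclusive diagnoses.

    priors:      diagnosis -> probability before the test result
    likelihoods: diagnosis -> probability of the observed result given it
    """
    joint = {dx: priors[dx] * likelihoods[dx] for dx in priors}
    total = sum(joint.values())  # overall probability of the observed result
    return {dx: value / total for dx, value in joint.items()}


# Values taken from Table 1.
priors = {"psychotic depression": 0.65, "schizophrenia": 0.20, "dementia": 0.15}
dst_positive = {"psychotic depression": 0.65, "schizophrenia": 0.20, "dementia": 0.35}
dst_negative = {dx: 1.0 - rate for dx, rate in dst_positive.items()}

after_positive = bayes_update(priors, dst_positive)
after_negative = bayes_update(priors, dst_negative)

for label, result in (("DST +", after_positive), ("DST -", after_negative)):
    summary = ", ".join(f"{dx}: {p:.2f}" for dx, p in result.items())
    print(f"{label}: {summary}")
```

Rounded to two decimals, the output reproduces the posterior rows of Table 1 for both the positive and the negative DST result.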

Table 1.

Case illustration of Bayesian application of biomarkers (DST then REM latency).

Possible diagnoses	Psychotic depression	Schizophrenia	Dementia
Prior probabilities (clinical differential diagnosis)	0.65	0.20	0.15
DST conditional probabilities
Test positive rate expected (DST +)	0.65	0.20	0.35
Test negative rate expected (DST –)	0.35	0.80	0.65
Posterior (revised) probabilities
Test positive result (DST +)	0.82	0.08	0.10
Test negative result (DST –)	0.47	0.33	0.20
New prior probabilities (after positive DST)	0.82	0.08	0.10
Sleep EEG conditional probabilities
Test positive rate expected (short REM latency)	0.65	0.20	0.10
Test negative rate expected (normal REM latency)	0.35	0.80	0.90
Posterior (revised) probabilities
Test positive result (short REM latency)	0.95	0.03	0.02
Test negative result (normal REM latency)	0.65	0.15	0.20

DST: dexamethasone suppression test; EEG: electroencephalography; REM: rapid eye movement.

Sequential testing can further improve diagnostic confidence. Here the DST was followed by a sleep EEG study, which was positive for short rapid eye movement (REM) latency. Now we use the previously calculated posterior probabilities as new priors in further updating the probabilities (another example of iterative reasoning). As Table 1 shows, the sleep EEG resembles the DST in sensitivity and specificity, yet the updated diagnostic probabilities after two positive results will be 0.95, 0.03, and 0.02 (Table 1): diagnostic confidence climbs toward an asymptote (65% → 82% → 95%). The new test has again been useful for ruling in (not ruling out) the leading diagnostic formulation. These computations can be done online, for instance, through: statpages.org/bayes.html. Other combinations of outcomes also can be examined, and candidate biomarkers for other disorders can be assessed in a like manner.
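The sequential calculation can be sketched as a loop in which each posterior becomes the next prior. The rates are those in Table 1; the names are illustrative. Chaining the updates this way assumes that the two tests are conditionally independent given the diagnosis, which the worked example implies but does not state. The sketch carries the unrounded posterior forward, whereas Table 1 restarts from rounded values; the results agree to two decimals.

```python
def bayes_update(priors, likelihoods):
    """Posterior probabilities for mutually exclusive candidate diagnoses."""
    joint = {dx: priors[dx] * likelihoods[dx] for dx in priors}
    total = sum(joint.values())
    return {dx: value / total for dx, value in joint.items()}


# Clinical priors and positive-result rates from Table 1.
priors = {"psychotic depression": 0.65, "schizophrenia": 0.20, "dementia": 0.15}
positive_results = [
    ("DST +", {"psychotic depression": 0.65, "schizophrenia": 0.20, "dementia": 0.35}),
    ("short REM latency", {"psychotic depression": 0.65, "schizophrenia": 0.20, "dementia": 0.10}),
]

current = dict(priors)
trajectory = [current["psychotic depression"]]
for test_name, likelihoods in positive_results:
    current = bayes_update(current, likelihoods)  # posterior becomes new prior
    trajectory.append(current["psychotic depression"])
    print(test_name, {dx: round(p, 2) for dx, p in current.items()})

print(" -> ".join(f"{p:.0%}" for p in trajectory))
```

Replacing either entry with the corresponding negative-result rates (one minus each positive rate) gives the other outcome combinations mentioned in the text.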

This example illustrates the reality of clinical uncertainty, confirming Spitzer’s dictum that ‘The use of specified criteria does not … eliminate clinical judgment …’ (Spitzer et al., 1978). Indeed, the various DSMs are not designed for making diagnoses, much less differential diagnoses. The DSMs serve, rather, to document that diagnoses made under conditions of clinical uncertainty conform to a minimum symptom profile (Carroll, 2012). The current DSMs are innumerate in not providing a Bayesian approach to differential diagnosis. Instead, they merely stipulate signs and symptoms in disjunctive format (the so-called Chinese menu approach), without stating sensitivity and specificity measures. It is possible to improve the confidence of DSM-III diagnoses by considering specific combinations of symptoms when their discriminative performances are known (Widiger et al., 1984), but this approach was not adopted in later DSMs.

A frequent reservation about the Bayesian approach to diagnosis is that the judgment of casewise prior probabilities will vary, perhaps widely, from one clinician or unit to another. Such variance may simply reflect the reality that some clinicians are more astute than others (Wolf et al., 1985). An urgent goal of research will be to establish and maintain cumulative large data sets so that such variance is minimized (Goodman, 2009). A Bayes-informed approach to symptoms, biomarkers, family history, and course of illness would be a welcome advance to improve diagnostic confidence in DSM-6. Haste the day.

Footnotes

Acknowledgements

The author thanks John CS Breitner, MD, who helpfully reviewed early drafts of this manuscript. No duplicate submissions were made to other journals.

Funding

This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.

Declaration of interest

The author declares no competing financial or professional interests. This commentary was commissioned by the Editor-in-Chief.

References

Bianchi and Alexander (2006) Evidence based diagnosis: Does the language fit the theory? British Medical Journal 333: 442–445.

Carroll (1989) Diagnostic validity and laboratory studies: Rules of the game. In: Robins and Barrett (eds) The Validity of Psychiatric Diagnosis. New York: Raven Press, pp. 229–245.

Carroll (2012) Bringing back melancholia. Bipolar Disorders 14: 1–5.

Carroll, Feinberg, Greden, et al. (1981) A specific laboratory test for the diagnosis of melancholia. Archives of General Psychiatry 38: 15–22.

DECODE Study Group (2001) Glucose tolerance and cardiovascular mortality: Comparison of fasting and 2-hour diagnostic criteria. Archives of Internal Medicine 161: 397–404.

Galen and Gambino (1975) Beyond Normality: The Predictive Value and Efficiency of Medical Diagnoses. New York: Wiley.

Goodin and Aminoff (1984) Does the interictal EEG have a role in the diagnosis of epilepsy? Lancet i: 837–839.

Goodman (2009) Building a Bayesian bridge from evidence to guidelines. Archives of Internal Medicine 169: 1436–1437.

Kupfer, First and Regier (2002) A Research Agenda for DSM-V. Arlington, VA: American Psychiatric Association.

Robins and Guze (1970) Establishment of diagnostic validity in psychiatric illness: Its application to schizophrenia. American Journal of Psychiatry 126: 983–987.

Spitzer, Endicott and Robins (1978) Research diagnostic criteria: Rationale and reliability. Archives of General Psychiatry 35: 773–782.

Widiger, Hurt, Frances, et al. (1984) Diagnostic efficiency and DSM-III. Archives of General Psychiatry 41: 1005–1012.

Wolf, Gruppen and Billi (1985) Differential diagnosis and the competing-hypotheses heuristic. Journal of the American Medical Association 253: 2858–2862.