Brief Scales to Detect Postpartum Depression and Anxiety Symptoms

Abstract

Background:

Depressive and anxiety disorders in the postpartum period cause significant suffering for women. State public health officials across the country use the Centers for Disease Control and Prevention (CDC)-sponsored Pregnancy Risk Assessment Monitoring System (PRAMS) to assess health behaviors and conditions, including depression and anxiety, that occur around the time of pregnancy. The purpose of the present study was to validate two to three items that could be included on the PRAMS questionnaire to detect depression and anxiety among postpartum women in a surveillance system.

Methods:

A comprehensive set of 16 depression and anxiety items was developed and tested in a final sample of 1077 postpartum women, 353 of whom completed Structured Clinical Interview for DSM-IV (SCID) interviews to determine the presence of a major depressive episode (MDE) and generalized anxiety disorder (GAD). Regression analyses reduced candidate items to 5 each for MDE and GAD. Responses were scored on a 5-point scale ranging from never (1) to always (5), and 2-item and 3-item combinations of these items were examined for their psychometric properties as indicators of MDE and GAD.

Results:

Item sets varied in their psychometric properties. The combination of depressed mood, felt hopeless, and slowed down > 9 (out of a possible total of 15) yielded the highest positive predictive value (PPV=60%) and estimated MDE prevalence most accurately (24.3% vs. 25.4% true prevalence). The combination of felt panicky, felt restless, and problems sleeping > 9 estimated GAD prevalence most accurately (20.2% vs. 15.7% true prevalence) and had high specificity (86%).

Conclusions:

Depression and anxiety can be detected using very few items, which makes assessment feasible in surveillance systems, such as PRAMS, and in primary care settings that have severe limits on time for depression and anxiety screening.

Introduction

Depressive and anxiety disorders are disabling conditions that are common among women of childbearing age.^1,2 Postpartum depression (PPD) in particular is a serious mental health problem.¹ The period prevalence of depression over the first 3 months postpartum is approximately 19% for major and minor depression and 7% for major depression alone.³ PPD may persist for many months, and even after successful treatment, many women experience relapse or recurrence.^4,5 Deleterious effects extend to the offspring and may cause delays in socioemotional and cognitive development and increased risk for internalizing and externalizing disorders and major depression.^6,7

Anxiety disorders in the perinatal period are less studied than depression, but it is becoming clear that they are as prevalent as major depression. Generalized anxiety disorder (GAD) has a prevalence rate of about 7% at 6 months postpartum; it is frequently comorbid with depression and leads to significant social impairment.²

Consequences of PPD and generalized anxiety have led state public health departments to survey their prevalence to provide accurate and contemporary information for public health planning. The Pregnancy Risk Assessment Monitoring System⁸ (PRAMS) is one of the primary tools used by state public health departments in the United States to assess risk factors for adverse maternal and infant outcomes. PRAMS, initiated in 1987, is an ongoing state-based and population-based surveillance system designed to monitor selected self-reported maternal behaviors and experiences that occur before, during, and after pregnancy among women who deliver a liveborn infant. PRAMS is administered by the Centers for Disease Control and Prevention (CDC) National Center for Chronic Disease Prevention and Health Promotion, Division of Reproductive Health, in collaboration with state health departments (www.cdc.gov/prams). The project supports the activities of the CDC's Safe Motherhood Initiative, which aims to reduce infant low birth weight and mortality. PRAMS provides a rich data source that can be used to inform, plan, and evaluate programs; to direct policy decisions; and to monitor trends in maternal behaviors.

Since its inception, PRAMS has expanded from 6 to 40 states and New York City. Collectively, PRAMS represents approximately 78% of all live births in the United States. Prior to 2009, there were no questions that assessed PPD or anxiety in the PRAMS Core Questionnaire (www.cdc.gov/prams/Questionnaire.htm), the questions required to be used by all participating states. Some states included optional depression questions based on a two-question screen.^9,10 Because the states use data from PRAMS to determine need for maternal and child health services, it is critical that they have access to reliable and valid indicators of depression and anxiety among postpartum women.

The goals of this study were to develop and evaluate 2–3 self-reported items assessing depression and 2–3 self-reported items assessing generalized anxiety that would provide the most accurate estimate of the prevalence of these conditions among postpartum women.

Materials and Methods

The research consisted of three phases, and phase III is our focus. Briefly, in phase I (M.W. O'Hara, unpublished observations), a total of 44 candidate items were developed by experts in the assessment of PPD and anxiety (the authors, M.W.O., S.S., D.W.). The aim was to develop a diverse array of items that together provided a reasonably comprehensive assessment of the full range of depressive and generalized anxiety symptoms. Focus groups consisting of postpartum women were conducted to assess the format and meaningfulness of all items.

In phase II, a convenience sample of 1123 postpartum women completed a questionnaire containing questions on 20 candidate items and several additional validated screening instruments assessing depression and anxiety. Convergent and discriminant analyses led to the refinement and reduction from 20 items to 16 candidate items; specifically, 6 items were dropped and 2 new items were added to improve coverage of anxiety.

This report focuses on phase III, during which the sensitivity, specificity, positive predictive value (PPV), and estimated prevalence of combinations of items were assessed to identify the items that produced the most accurate measures of depression and generalized anxiety among postpartum women. All research procedures reported here were approved by the University of Iowa Institutional Review Board, and all subjects provided informed consent before their participation in this research.

Participants and procedure

The study sample included 1077 postpartum women recruited through two means. Of the 1077 study participants, 885 women gave birth between December 2004 and July 2007 and were recruited through infant birth records. Research staff mailed invitations and questionnaires to all women who delivered a baby in four rural and urban counties in Iowa. Women completed the questionnaire, on average, 21 weeks postpartum (range 1–56 weeks after delivery). To increase the racial diversity of the sample, a convenience sample of 192 nonwhite postpartum women was recruited in the same manner at maternal and child health centers in Iowa and Michigan.

To assess a major depressive episode (MDE) and GAD, a subset (n=475) of the 1077 women was invited to participate in clinical interviews by phone. Those invited included all minority women who completed questionnaires, all women who scored ≥ 13 (indicating probable depression) on the Edinburgh Postnatal Depression Scale (EPDS),¹² and every fifth woman scoring < 13 on the EPDS. A total of 353 of the 475 eligible women (74%) completed clinical assessments administered by master's level research assistants. The clinical assessment included the MDE and GAD modules of the Structured Clinical Interview for DSM-IV (SCID)¹³ and the Hamilton Rating Scale for Depression (HRSD).¹⁴

Measures

The self-administered questionnaire included demographic questions (age, race/ethnicity, education, income, marital status), the 16 candidate items, three depression scales, and one anxiety scale. The three depression scales were the Beck Depression Inventory (BDI),¹⁵ the General Depression Scale of the Inventory of Depression and Anxiety Symptoms (IDAS-GD),^16,17 and the EPDS.¹² The anxiety scale was the Beck Anxiety Inventory (BAI).¹⁸ The interview-based measures included the HRSD¹⁴ and the SCID.¹³ All the depression measures and the anxiety measure have been validated for use with postpartum women. For each of the 16 candidate items (Appendix, supplemental material available online at www.liebertpub.com/jwh), women were asked: How often have you felt or experienced things this way since your baby was born? They gave their answer on a 5-point scale ranging from never (1) to always (5). Sample items include: I have felt down, depressed, or hopeless, and I have had little energy.

The BDI¹⁵ consists of 21 items, each of which contains four descriptive statements (scored 0–3) that reflect increasing levels of severity of each symptom. For each item, respondents choose the option that best characterizes how they have been feeling during the past week, including today. The BAI¹⁸ assesses 21 affective and somatic symptoms of anxiety on a 4-point scale. For the purposes of this study, the 6-item Subjective subscale (BAI-Subj) was used because its items most clearly represent general anxiety symptoms.¹⁸ Respondents indicate to what extent they have been bothered by each symptom during the past week, including today.

The EPDS¹² has 10 items, each of which contains four descriptive statements (scored 0–3) that reflect increasing levels of severity of each symptom. Respondents are asked to indicate the answer that comes closest to describing how they have been feeling in the past 7 days. It has shown good reliability and validity across a large number of studies.¹² The IDAS-GD^16,17 contains 20 items that ask about depression symptoms over the past 2 weeks and are rated on a 5-point scale (1, not at all, to 5, extremely).

The HRSD¹⁴ contains 17 items with 3-point scales (0–2) to 5-point scales (0–4) and covers experiences in the past week. Each scale point is associated with a descriptive statement reflecting increasing severity. Twenty randomly selected cases were used to examine interrater reliability, which was excellent (intraclass correlation=0.99). The SCID¹³ MDE and GAD modules were used in this study. The interrater reliabilities based on 20 randomly selected cases for the MDE and GAD modules were kappa=0.80 and kappa=1.00, respectively.

A depression composite for self-report measures was computed by converting the BDI, EPDS, and IDAS-GD severity scores to z-scores, adding them, and dividing by 3.
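The composite calculation described above can be sketched as follows. This is a minimal illustration, not the authors' SPSS procedure: the function names and the example scores are hypothetical, and the use of the sample standard deviation (n−1) for standardization is an assumption.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize scores to mean 0 and SD 1 (sample SD; an assumption)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def depression_composite(bdi, epds, idas_gd):
    """Average the three z-scored scale totals for each respondent."""
    standardized = [z_scores(bdi), z_scores(epds), z_scores(idas_gd)]
    return [sum(triple) / 3 for triple in zip(*standardized)]


# Hypothetical scale totals for four respondents (not study data).
bdi = [4, 10, 18, 30]
epds = [2, 7, 13, 20]
idas_gd = [25, 38, 55, 70]

composite = depression_composite(bdi, epds, idas_gd)
```

Because each scale is standardized before averaging, the three instruments contribute equally to the composite despite their different score ranges.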

Statistical analyses

For the outcomes MDE (SCID-based), HRSD, and the depression composite, backward and forward stepwise regressions were used to identify 4–5 candidate items to evaluate further. Initially, backward stepwise regression was used to identify which of the 16 items were independently associated at a significance level of p≤0.001 with each of the three depression outcomes. Because the final measure needed to be very brief, forward stepwise regressions were undertaken with the items identified in the backward regressions to identify 4–5 candidate items that accounted for the most variance in each outcome (MDE [SCID-based], the HRSD, and the depression composite). The same approach was taken with GAD and the BAI. Once the smaller set of items was identified, prevalence estimates, sensitivity, specificity, PPV, and Youden's J¹⁹ were calculated for a variety of item sets (separately for depression and anxiety) for the sample of women who completed the SCID interview. Youden's J is a statistic that was developed to provide a measure of overall performance of a screening test and to allow comparisons between tests. As a consequence, we were able to compare the performance of the various combinations of 2 and 3 items as indicators of MDE and GAD in our sample. Finally, receiver operating characteristic (ROC) analyses were undertaken, which yielded area under the curve (AUC) for each set of items. As a comparison to the performance of the candidate items, sensitivity, specificity, and PPV for various thresholds for the EPDS, BDI, and the IDAS-GD scales also were calculated. ROC analyses were undertaken for these scales as well. All analyses were conducted using SPSS version 19.
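The performance measures named above follow from a 2×2 cross-classification of screen result against SCID diagnosis. The sketch below uses the standard definitions (Youden's J is sensitivity plus specificity minus 1); the function name and the counts are hypothetical and are not taken from the study data.

```python
def screening_metrics(tp, fp, fn, tn):
    """Compute screening statistics from 2x2 counts.

    tp: screen positive, diagnosis present
    fp: screen positive, diagnosis absent
    fn: screen negative, diagnosis present
    tn: screen negative, diagnosis absent
    """
    n = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "youden_j": sensitivity + specificity - 1,
        # Share of the sample that screens positive.
        "estimated_prevalence": (tp + fp) / n,
        # Share of the sample with the interview-based diagnosis.
        "true_prevalence": (tp + fn) / n,
    }


# Hypothetical counts for illustration only.
m = screening_metrics(tp=60, fp=40, fn=30, tn=270)
```

With these illustrative counts, the screen flags 25% of the sample while 22.5% carry the diagnosis, which shows how false positives and false negatives can partly offset each other in a prevalence estimate.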

Results

Demographic characteristics of the samples are reported in Table 1. In both the entire sample and the interview subsample, the participants were largely well educated, married, and Caucasian, and approximately 43%–46% had delivered their first child.

Table 1.

Demographic Characteristics of Total Sample and Interviewed Subsample

Variable	Entire sample n=1077 M (SD)/%	Interview subsample n=353 M (SD)/%
Age, years	27.9 (5.3)	27.3 (5.4)
Education, years	14.7 (2.5)	14.3 (2.5)
Married, %	70.5	64.5
Spouse/partner age, years	30.3 (6.2)	29.9 (6.6)
Spouse/partner education, years	14.5 (2.8)	14.1 (2.9)
Ethnicity, %
Caucasian	84.9	68.5
African American	4.9	9.7
Hispanic	4.6	10.9
Asian/Pacific Islander	2.1	3.2
American Indian/Alaskan Native	0.7	2.0
Other	2.8	5.7
Employed, %	61.5	55.1
Spouse/partner employed, %	90.2	86.3
Income level, %
$19,999 or less	19.4	24.6
$20,000–$39,999	24.7	28.5
$40,000–$69,999	26.7	25.5
$70,000 and above	24.7	14.3
Refused/missing	4.5	7.1
Primiparous, %	46.0	43.4
Currently breastfeeding, %	44.4	41.5
Time since delivery, weeks	20.9 (12.8)	18.3 (12.3)
Number of children	1.9 (1.0)	1.9 (1.1)
Previous miscarriage, %	23.1	23.9
Previous pregnancy termination, %	9.8	10.5

M, mean; SD, standard deviation.

Means and standard deviations (SD) of depression and anxiety scales and candidate PRAMS items are reported in Table 2. Rates of moderate to severe depression, based on the EPDS, BDI, and IDAS-GD ranged from 11% to 16%. Approximately 16% of women reported at least moderate levels of anxiety on the BAI-Subj. The candidate items with the highest prevalence included felt overwhelmed, low energy, slowed down, problems sleeping, and felt tense. The least reported symptoms were self-harm, felt fearful, felt hopeless, felt panicky, loss of interest, and poor appetite.

Table 2.

Descriptive Statistics for Depression and Anxiety Scales, Depression and Anxiety Diagnoses, and Pregnancy Risk Assessment Monitoring System Candidate Items

Instrument and scale	M	SD	Range	%>Cutoff ^a
EPDS	6.99	5.24	0–25	15.8
IDAS-GD	38.84	12.44	20–83	13.3
BDI	10.00	7.45	0–49	11.2
Major depressive episode (SCID)				25.4
BAI-Subj	3.09	3.54	0–18	15.6
Generalized anxiety disorder (SCID)				15.7
PRAMS candidate items
Depressed mood	2.65	0.86	1–5	14.1
Loss of interest	2.11	0.90	1–5	7.8
Low energy	2.86	0.96	1–5	24.5
Felt restless	2.26	1.02	1–5	12.1
Slowed down	2.68	1.06	1–5	23.2
Felt guilty	2.17	1.11	1–5	13.6
Poor concentration	2.35	1.04	1–5	14.2
Felt hopeless	1.72	0.97	1–5	7.1
Self-harm	1.16	0.49	1–5	0.8
Poor appetite	1.86	1.01	1–5	7.9
Problems sleeping	2.41	1.17	1–5	19.6
Felt panicky	1.79	1.00	1–5	7.5
Felt fearful	1.83	0.98	1–5	6.9
Felt tense	2.52	1.10	1–5	19.3
Worrying	2.28	1.18	1–5	17.8
Felt overwhelmed	2.93	1.04	1–5	29.5

^a Thresholds for cutoff scores are Edinburgh Postnatal Depression Scale (EPDS)>12; Beck Depression Inventory (BDI)>18; Beck Anxiety Inventory-Subjective Subscale (BAI-Subj)>6; Inventory of Depression and Anxiety Symptoms-General Depression Scale (IDAS-GD)>54. These levels represent a moderate severity of depression or anxiety. For Pregnancy Risk Assessment Monitoring System (PRAMS) candidate items, the % represents a report that the symptom was experienced often or always (4 or 5 on 5-point scale) during the postpartum period.

SCID, Structured Clinical Interview for DSM-IV.

Results from the different regression models predicting depression identified 5 items (depressed mood, felt overwhelmed, felt restless, felt hopeless, and slowed down) that were independently associated with at least one depression outcome (p<0.01) (Table 3). In addition, loss of interest was retained in the smaller set of items because it has been included in the PRAMS questionnaire in the past as a state optional question (along with depressed mood). The item self-harm was not included for ethical reasons because providing timely and appropriate follow-up and care is not possible in a surveillance system. Five candidate items were independently associated with GAD or the BAI-Subj (p<0.01): felt panicky, felt restless, felt fearful, problems sleeping, and worrying (Table 3).

Table 3.

Pregnancy Risk Assessment Monitoring System Candidate Items Accounting for Most Variance in Depression and Anxiety Outcome Measures, from Regression Analyses

Outcome	First predictor	Second predictor	Third predictor	Fourth predictor
Regression based on interviewed subsample
MDE	Depressed mood	Overwhelmed	Felt restless	Hopelessness
	R²Δ=0.21	R²Δ=0.03	R²Δ=0.02	R²Δ=0.01^a
HRSD	Felt restless	Depressed mood	Self-harm	Slowed down
	R²Δ=0.28	R²Δ=0.07	R²Δ=0.03	R²Δ=0.03
Depression composite	Felt hopeless	Slowed down	Worrying	Depressed mood
	R²Δ=0.59	R²Δ=0.10	R²Δ=0.05	R²Δ=0.02
GAD	Felt panicky	Felt restless	Problems sleeping	Worry
	R²Δ=0.14	R²Δ=0.02	R²Δ=0.01^b	R²Δ=0.01^c
BAI-Subj	Worrying	Felt fearful	Felt restless	Felt hopeless
	R²Δ=0.41	R²Δ=0.09	R²Δ=0.05	R²Δ=0.02
Regressions based on entire sample
Depression composite	Felt hopeless	Worrying	Slowed down	Loss of interest
	R²Δ=0.49	R²Δ=0.10	R²Δ=0.05	R²Δ=0.03
BAI-Subj	Worrying	Felt fearful	Felt hopeless	Felt restless
	R²Δ=0.37	R²Δ=0.08	R²Δ=0.04	R²Δ=0.02

HRSD, Hamilton Rating Scale for Depression.

^a Significance of this term is p=0.089.

^b Significance of this term is p=0.033.

^c Significance of this term is p=0.102.

R ²Δ=R square change for each step. For logistic regressions (SCID-based major depressive episode [MDE] and generalized anxiety disorder [GAD] diagnosis), the Cox and Snell R ² was used. All regression models were significant at p<0.001. With three exceptions, all R ²Δ were significant, p<0.01.
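For readers unfamiliar with the Cox and Snell statistic, its standard textbook definition is given below; the source does not state the formula, so the notation is an assumption. Here $L_0$ is the likelihood of the intercept-only model, $L_k$ is the likelihood of the model after step $k$, and $n$ is the sample size.

```latex
R^2_{CS}(k) = 1 - \left( \frac{L_0}{L_k} \right)^{2/n},
\qquad
R^2\Delta_k = R^2_{CS}(k) - R^2_{CS}(k-1)
```

Under this definition the statistic has a maximum below 1 for a binary outcome, so the R²Δ values for MDE and GAD are not directly comparable in size to those from the linear regressions on the HRSD, the depression composite, and the BAI-Subj.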

When examining different combinations of items (Table 4), there were no significant differences in the statistical measure Youden's J, as all 95% confidence intervals (CI) overlapped (data not shown). In addition, there were no significant differences among item sets with respect to AUC; all item sets had an AUC of at least 0.800, which indicates excellent performance.²⁰ Differences were found in sensitivity, specificity, and PPV among the different combinations of items. For example, a combined score > 9 for depressed mood, felt hopeless, and slowed down produced the highest values for specificity (87%) and PPV (60%), and the estimated prevalence (24.3%) came close to the true prevalence of MDE (25.4%). A combined score of > 6 for depressed mood, felt hopeless, and slowed down produced the highest sensitivity (95%), but this cutoff score substantially overestimated prevalence (62%). The two depression items included in the PRAMS standard questionnaire from 2004 to 2008, depressed mood and loss of interest at the level of often (4) or always (5), yielded 62% sensitivity, 83% specificity, a PPV of 55%, and an estimated prevalence of 28.8%.
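The cutoff rule described above (sum the item responses and flag totals above a threshold) can be sketched as follows. The function name, the dictionary keys, and the example respondent are hypothetical; only the 1–5 response range, the three items, and the > 9 cutoff come from the text.

```python
def screen_positive(responses, items, cutoff):
    """Return True when the summed item responses exceed the cutoff.

    responses: mapping of item name to a response from 1 (never) to 5 (always)
    items: names of the items in the set being scored
    cutoff: the set screens positive when the total is strictly greater
    """
    for item in items:
        if not 1 <= responses[item] <= 5:
            raise ValueError(f"{item} must be scored 1-5")
    return sum(responses[item] for item in items) > cutoff


# Hypothetical key names for the three-item depression set.
MDE_ITEMS = ("depressed_mood", "felt_hopeless", "slowed_down")

# Hypothetical respondent: total of 10 exceeds the cutoff of 9.
respondent = {"depressed_mood": 4, "felt_hopeless": 3, "slowed_down": 3}
flag = screen_positive(respondent, MDE_ITEMS, cutoff=9)
```

A respondent answering "sometimes" (3) to all three items totals 9 and does not screen positive, because the rule requires a total strictly greater than the cutoff.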

Table 4.

Performance Characteristics of Selected Cutoff Points for Two-Item and Three-Item Sets of Pregnancy Risk Assessment Monitoring System Candidate Items and Depression Measures in Relation to Structured Clinical Interview for DSM-IV-Based Diagnoses of Major Depressive Episode (True Prevalence in Dataset=25.4%)

Item sets/Cutoffs	AUC	Youden's J	Sensitivity	Specificity	PPV	Prevalence
PRAMS standard
Dep or LI>3		0.451	62	83	55	28.8
Two-item combinations of depressed mood with loss of interest, felt hopeless, felt restless, and slowed down
Dep+LI>5	0.802	0.464	75	71	47	40.3
Dep+LI>6		0.433	58	85	57	25.6
Dep+Hope>4	0.815	0.491	87	62	44	50.9
Dep+Hope>5		0.461	68	78	51	33.8
Dep+Rest>5	0.818	0.542	84	70	49	43.6
Dep+Rest>6		0.436	58	86	58	25.4
Dep+Slow>5	0.800	0.456	90	56	41	55.8
Dep+Slow>6		0.423	66	76	49	34.4
Three-item combinations of depressed mood with loss of interest, felt hopeless, felt restless, and slowed down
Dep+LI+Hope>7	0.819	0.497	80	70	48	42.5
Dep+LI+Hope>8		0.492	67	82	56	30.3
Dep+LI+Rest>7	0.823	0.499	86	64	43	49.1
Dep+LI+Rest>9		0.432	58	85	57	25.7
Dep+Hope+Rest>7	0.829	0.507	82	69	47	44.1
Dep+Hope+Rest>8		0.529	72	81	57	32.2
Dep+LI+Slow>7	0.805	0.433	87	56	40	55.2
Dep+LI+Slow>9		0.443	64	81	53	30.6
Dep+Hope+Slow>6	0.821	0.443	95	49	39	62.4
Dep+Hope+Slow>7		0.446	84	60	42	50.9
Dep+Hope+Slow>9		0.436	57	87	60	24.3
Dep+Rest+Slow>8	0.819	0.495	82	68	46	44.9
Dep+Rest+Slow>9		0.461	66	80	53	31.6
Dep+Rest+Panic>7	0.823	0.518	83	69	48	44.3
Dep+Rest+Panic>8		0.498	70	79	54	33.3
Depression measures
EPDS>9	0.810	0.339	81	66	45	46.3
EPDS>11		0.340	74	73	49	39.4
EPDS>12		0.343	67	81	55	31.4
BDI>10	0.830	0.442	81	63	43	48.6
BDI>11		0.480	79	69	47	43.5
BDI>12		0.517	79	73	50	40.8
BDI>14		0.494	69	81	56	32.0
IDAS-GD>38	0.818	0.518	93	55	41	57.3
IDAS-GD>43		0.480	82	65	44	46.5
IDAS-GD>50		0.445	64	80	52	31.1

AUC, area under the curve; Dep, depressed mood; Hope, hopelessness; LI, loss of interest; Panic, felt panicky; PPV, positive predictive value; Rest, restlessness; Slow, slowed down.

As a comparison to the performance of the candidate items, various thresholds for the EPDS, BDI, and IDAS-GD scales are reported in Table 4. The highest sensitivity (93%) but highest estimated prevalence (57% compared to the true prevalence of 25.4%) among the scales occurred at an IDAS-GD threshold of > 38. The highest specificity (81%–82%), highest PPV (55%–56%), and closest estimated prevalence (31%–32%) to the true prevalence (25%) were observed when the EPDS threshold was > 12 or the BDI threshold was > 14.

With respect to GAD, there were no significant differences among item sets in the ROC analyses, which ranged from 0.780 to 0.830 (acceptable to excellent), or using different thresholds within item sets in terms of Youden's J. However, threshold levels and combinations of items influenced sensitivity, specificity, and PPV. Two sets of items (a 2-item and a 3-item combination) yielded essentially the same performance with respect to specificity and PPV. Both sets of items included felt panicky and problems sleeping. The 3-item set also included felt restless. Using a threshold of panic and sleep > 6 yielded 87% specificity and 42% PPV and a prevalence of 20%, which compared relatively well to the true prevalence of 16%. Using a threshold of > 4, felt panicky and problems sleeping yielded a sensitivity of 86% but a prevalence of 46%, almost three times the true prevalence. Using a threshold of > 5, felt panicky and felt restless yielded a good balance of relatively high sensitivity (75%) and specificity (77%) but a prevalence of 30.4%, about double the true prevalence of GAD. Using a BAI-Subj > 4 threshold yielded a sensitivity of 76% and a prevalence of 36%. Using a threshold of BAI-Subj > 6 yielded an 82% specificity, a 35% PPV, and a prevalence of 23%.

The item felt restless was common to well-performing 2-item sets for identifying MDE (depressed mood and felt restless) and GAD (felt panicky and felt restless). The combination of depressed mood, felt restless, and felt panicky had relatively good performance in identifying both MDE and GAD (Tables 4 and 5).

Table 5.

Performance Characteristics of Selected Cutoff Points for Two-Item and Three-Item Sets of Pregnancy Risk Assessment Monitoring System Candidate Items and an Anxiety Measure in Relation to Structured Clinical Interview for DSM-IV-Based Diagnoses of Generalized Anxiety Disorder (True Prevalence in Dataset=15.7%)

Item sets/Cutoffs	AUC	Youden's J	Sensitivity	Specificity	PPV	Prevalence
Two-item combinations of felt panicky with felt restless, worrying, felt fearful, and problems sleeping
Panic+Rest>4	0.796	0.514	87	65	30	42.9
Panic+Rest>5		0.525	75	77	37	30.4
Panic+Worry>4	0.816	0.478	88	59	28	47.8
Panic+Worry>5		0.504	75	75	35	32.2
Panic+Worry>6		0.418	56	86	41	20.3
Panic+Fear>3	0.796	0.430	87	56	26	50.0
Panic+Fear>4		0.490	77	72	33	35.3
Panic+Fear>5		0.447	63	81	37	25.4
Panic+Sleep>4	0.813	0.472	86	61	28	46.1
Panic+Sleep>5		0.509	75	76	36	31.2
Panic+Sleep>6		0.416	55	87	42	19.6
Three-item combinations of felt panicky with felt restless, worrying, felt fearful, and problems sleeping
Panic+Rest+Worry>7	0.828	0.514	85	67	31	41.0
Panic+Rest+Worry>8		0.514	76	76	36	31.4
Panic+Rest+Worry>9		0.448	60	85	42	21.5
Panic+Rest+Fear>6	0.820	0.512	85	67	31	41.2
Panic+Rest+Fear>8		0.490	65	84	41	23.8
Panic+Worry+Fear>7	0.817	0.555	83	73	35	35.5
Panic+Worry+Fear>8		0.479	67	81	38	26.6
Panic+Rest+Sleep>7	0.830	0.564	86	70	34	38.3
Panic+Rest+Sleep>8		0.554	78	77	37	31.3
Panic+Rest+Sleep>9		0.431	57	86	42	20.2
Panic+Worry+Sleep>7	0.826	0.466	82	64	29	42.7
Panic+Worry+Sleep>8		0.517	76	75	35	32.5
Panic+Worry+Sleep>9		0.518	69	83	42	24.6
Panic+Fear+Sleep>6	0.819	0.489	88	61	28	46.6
Panic+Fear+Sleep>7		0.498	74	73	33	34.1
Panic+Fear+Sleep>8		0.409	61	80	35	25.9
Dep+Rest+Panic>8	0.819	0.555	80	75	36	33.1
Dep+Rest+Panic>9		0.343	49	85	37	19.8
Beck Anxiety Inventory-Subjective Subscale^a
BAI-Subj>4	0.780	0.471	76	71	31	35.8
BAI-Subj>6		0.385	56	82	35	23.2

^a Except for BAI-Subj, only two-item and three-item combinations for which Youden's J≥0.400 are displayed.

Fear, Felt fearful; Sleep, problems sleeping; Worry, worrying.

Discussion

In this study, we sought to develop two to three questions to estimate depression prevalence and separate questions to estimate generalized anxiety prevalence for use on PRAMS, a state-based surveillance system of postpartum women. The ideal set of questions would have high sensitivity, specificity, and PPV. Because the prevalence of depression and anxiety is relatively low (even though clinically significant), high specificity is particularly important to achieve high positive predictive values and precise estimates of prevalence. This is particularly important because prevalence estimates for PPD and generalized anxiety will drive public expenditures in states that use the PRAMS.
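The dependence of PPV and of the screen-based prevalence estimate on specificity can be illustrated with Bayes' rule. This is a sketch under stated assumptions: the function name is hypothetical, and the inputs are the rounded sensitivity and specificity values from Table 4 together with the true MDE prevalence of 25.4%, so results match the table only approximately.

```python
def ppv_and_positive_rate(sensitivity, specificity, prevalence):
    """Return (PPV, share screening positive) for a given true prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    positive_rate = true_pos + false_pos
    return true_pos / positive_rate, positive_rate


# Depressed mood + felt hopeless + slowed down, cutoff > 9 (Table 4, rounded).
ppv_high_spec, rate_high_spec = ppv_and_positive_rate(0.57, 0.87, 0.254)

# Same items, cutoff > 6: higher sensitivity, much lower specificity.
ppv_low_spec, rate_low_spec = ppv_and_positive_rate(0.95, 0.49, 0.254)
```

The higher cutoff gives a PPV of about 0.60 with about 24% screening positive, whereas the lower cutoff gives a PPV of about 0.39 with about 62% screening positive, which is the overestimation pattern described in the Results.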

The findings of this research led to several recommendations. We found that a number of combinations of 2–3 items performed as well as or better than existing scales with higher numbers of questions with respect to PPV and estimating true prevalence. For example, using the 3 items, depressed mood, felt hopeless, and slowed down, with a cutoff for scores > 9 yielded the highest specificity and PPV, closely estimated the true prevalence, and had higher specificity and PPV than the EPDS, the BDI, and the IDAS. These features are very important in a surveillance context, such as that represented by PRAMS.

Sensitivity was higher for all the previously validated measures (EPDS, BDI, and IDAS) than the shorter scales developed through this research. Consequently, longer measures, such as the EPDS, BDI, and IDAS, may be preferable to use in clinical settings (obstetrics-gynecology, family medicine), where a two-stage approach is feasible, first, to identify potentially depressed women and then to provide a more intensive assessment of those who screen positive. With respect to anxiety detection, several combinations of items performed equally well. With respect to PPV and estimation of true prevalence, the 2-item combination of felt panicky and problems sleeping performed very well. The 2-item combination of felt panicky and felt restless had a good balance of sensitivity and specificity.

Finally, one set of 3 items (depressed mood, felt restless, and felt panicky) performed reasonably well in identifying both depressed and anxious women. The items depressed mood and felt panicky were the prime predictors of MDE and GAD, respectively, and felt restless was a good indicator of both. This combination of these 3 items could be considered in clinical screening and surveillance contexts in which there is a severe constraint on number of items but a need to identify women at risk for both depressive and anxiety disorders.

There are limitations to the work reported here. Subjects represented a convenience sample; however, the interviewed subsample of women was quite diverse with respect to race and ethnicity. As a consequence, the findings of this research should have some generalizability to populations living outside of the State of Iowa. Nevertheless, it will be important to cross-validate the performance of the item sets and their thresholds for use both in surveillance, as in the case of PRAMS, and in clinical screening in primary care.

There was not always a good match between the time frame of the candidate items (since your baby was born) and the SCID assessments (past month), raising the possibility that time since delivery may modify the association between responses to candidate items and MDE and GAD diagnoses. This possibility was examined, and the association between time since delivery and a diagnosis of MDE or GAD was very weak (r=0.06 in both cases, p>0.29). This result reflects in part the well-known phenomenon that individuals who complete mood questionnaires meant to reflect extended periods of time are influenced considerably by their current mood state.^21,22 The implication of this finding in the context of PRAMS is that episodes of depression that end long before women complete the PRAMS questionnaire may be missed. However, PRAMS participants, on average, complete the survey within 3–4 months of childbirth, which lessens the possibility that clinically significant episodes of MDE occurring early in the postpartum period will be missed. It would be valuable in future research to document precisely the timing of episodes of major depression and generalized anxiety in the postpartum period relative to the timing of the administration of the PRAMS questionnaire.

Maternal depression and anxiety, particularly in the postpartum period, are significant public health issues. The impacts of these disorders on women's health and well-being are well documented,^1,2 as are the long-term negative effects for infants exposed to maternal depression.⁷ It is important that state public health officials have tools to determine the prevalence of depression and anxiety among mothers of infants. The most common mechanism for health surveillance among women who have recently given birth is PRAMS. The findings of our study have provided, for the first time, performance measures for a set of items that reflect depression and anxiety among women who have recently given birth.

The findings of our study also support the use of 2-item and 3-item screening scales to identify women at risk for postpartum depressive and anxiety disorders in primary care settings when there is a premium on time, such that the use of longer scales is not feasible. The item sets tested here have an advantage relative to the commonly used EPDS¹² in that they can be completed more quickly and do not contain British idioms that may not be familiar to many American women.

Acknowledgments

We acknowledge the financial support of the Centers for Disease Control and Prevention (MM-0822, S. Stuart, PI). We thank Sarah Mott, B.S., for her assistance in data management.

Disclosure Statement

No competing financial interests exist.

References

1. O'Hara. Postpartum depression: What we know. J Clin Psychol, 2009; 65:1258–1269.

2. Wenzel. Anxiety disorders in childbearing women: Diagnosis and treatment. Washington, DC: APA Books, 2011.

3. Gavin, Gaynes, Lohr, Meltzer-Brody, Gartlehner, Swinson. Perinatal depression: A systematic review of prevalence and incidence. Obstet Gynecol, 2005; 106:1071–1083.

4. O'Hara, Stuart, Gorman, Wenzel. Efficacy of interpersonal psychotherapy for postpartum depression. Arch Gen Psychiatry, 2000; 57:1039–1045.

5. Nylen, O'Hara, Brock, Moel, Gorman, Stuart. Predictors of the longitudinal course of postpartum depression following interpersonal psychotherapy. J Consult Clin Psychol, 2010; 78:757–763.

6. Goodman. Depression in mothers. Annu Rev Clin Psychol, 2007; 3:107–135.

7. Goodman, Brand. Infants of depressed mothers: Vulnerabilities, risk factors, and protective factors for the later development of psychopathology. Zeanah Jr. Handbook of infant mental health, 3rd ed. New York, NY: Guilford Press, 2009; 153–170.

8. Shulman, Gilbert, Lansky. The Pregnancy Risk Assessment Monitoring System (PRAMS): Current methods and evaluation of 2001 response rates. Public Health Rep, 2006; 121:74–83.

9. Gjerdingen, Crow, McGovern, Miner, Center. Postpartum depression screening at well-child visits: Validity of a 2-question screen and the PHQ-9. Ann Fam Med, 2009; 7:763–770.

10. Whooley, Avins, Miranda, Browner. Case-finding instruments for depression: Two questions are as good as many. J Gen Intern Med, 1997; 12:439–445.

11. Clark, Watson. Constructing validity: Basic issues in scale development. Psychol Assess, 1995; 7:309–319.

12. Cox, Holden. Perinatal mental health: A guide to the Edinburgh Postnatal Depression Scale (EPDS). London: Gaskell, 2003.

13. First, Spitzer, Gibbon, Williams JBW. Structured clinical interview for DSM-IV axis I disorders, research version, non-patient edition (SCID-I/NP). New York: Biometrics Research, New York State Psychiatric Institute, 1997.

14. Hamilton. Development of a rating scale for primary depressive illness. Br J Soc Clin Psychol, 1967; 6:278–296.

15. Beck, Ward, Mendelson, Mock, Erbaugh. An inventory for measuring depression. Arch Gen Psychiatry, 1961; 4:561–569.

16. Watson, O'Hara, Simms, et al. Development and validation of the Inventory of Depression and Anxiety Symptoms (IDAS). Psychol Assess, 2007; 19:253–268.

17. Watson, O'Hara, Chmielewski, et al. Further validation of the IDAS: Evidence of convergent, discriminant, criterion, and incremental validity. Psychol Assess, 2008; 20:248–259.

18. Beck, Steer. Beck Anxiety Inventory manual. San Antonio, TX: Psychological Corporation, Harcourt Brace, 1993.

19. Youden. Index for rating diagnostic tests. Cancer, 1950; 3:32–35.

20. Hosmer, Lemeshow. Applied logistic regression, 2nd ed. New York: Wiley Interscience, 2000.

21. Chmielewski, Watson. What is being assessed and why it matters: The impact of transient error on trait research. J Pers Soc Psychol, 2009; 97:186–202.

22. Schwarz, Clore. Mood, misattribution, and judgments of well-being: Informative and directive functions of affective states. J Pers Soc Psychol, 1983; 45:513–523.
