Abstract
The current study explored associations between two potentially invalidating self-report styles detected by the Validity scales of the Minnesota Multiphasic Personality Inventory–2–Restructured Form (MMPI-2-RF), over-reporting and under-reporting, and scores on the MMPI-2-RF substantive scales, as well as eight collateral self-report measures administered either at the same time or within 1 to 10 days of MMPI-2-RF administration. Analyses were conducted with data provided by college students, male prisoners, and male psychiatric outpatients from a Veterans Administration facility. Results indicated that if either an over- or under-reporting response style was suggested by the MMPI-2-RF Validity scales, scores on the majority of the MMPI-2-RF substantive scales, as well as a number of collateral measures, were significantly affected in all three groups in the expected directions. Test takers who were identified as potentially engaging in an over- or under-reporting response style by the MMPI-2-RF Validity scales appeared to approach extra-test measures similarly regardless of when these measures were administered in relation to the MMPI-2-RF. Limitations and suggestions for future study are discussed.
It has long been recognized that the validity of self-report–based personality assessment is limited by the degree to which the individual assessed accurately communicates his or her inner experiences and perceptions. Whether this communication occurs via interview or psychological testing, issues such as limited insight or purposeful distortion can lead to inaccurate results on personality inventories. Unlike interview data, which rely on the interviewer’s subjective assessment of the veracity of the interviewee’s self-report, in personality-based psychological testing, two potentially biasing response styles, Noncontent-Based and Content-Based Invalid Responding (CBIR), can be assessed through empirically validated objective means (i.e., scales designed to measure such response styles; Ben-Porath, 2013). Individuals who engage in Noncontent-Based Invalid Responding (NCIR) fail to respond or provide random or fixed responses to test items regardless of their content. CBIR occurs when the test taker portrays him or herself as functioning better (under-reporting) or worse (over-reporting) than would be indicated by an objective assessment of his or her functioning.
Given the potential costs of misinterpreting psychological test results (e.g., in outpatient and/or inpatient settings where an individual is prescribed medication or forensic settings where an individual is deemed incompetent to proceed to trial because of psychological difficulties), once NCIR has been ruled out, being able to effectively detect CBIR is of paramount importance. Three questions need to be considered when selecting a psychological test and considering its ability to detect CBIR. First, does the instrument include scales designed to assess CBIR? If so, have these scales been empirically validated? Finally, does any CBIR they detect generalize to other psychological tests taken at or near the same time?
With regard to the first two questions just listed, the Minnesota Multiphasic Personality Inventory–2–Restructured Form (MMPI-2-RF; Ben-Porath & Tellegen, 2008; Tellegen & Ben-Porath, 2008/2011) contains well-validated measures of both NCIR and CBIR (Ben-Porath, 2013). In terms of NCIR, the MMPI-2-RF retains the use of the Cannot Say Scale (CNS) Index, which is a count of omitted or double-answered items. The MMPI-2-RF also includes restructured versions of the MMPI-2 Variable Response Inconsistency (VRIN) and True Response Inconsistency (TRIN) scales, VRIN-r and TRIN-r, which assess random and fixed responding, respectively. In terms of CBIR, the MMPI-2-RF contains five primary scales designed to detect over-reporting. Three of the CBIR over-reporting indicators are revised versions of MMPI-2 scales, including Infrequent Responses (F-r), Infrequent Psychopathology Responses (FP-r), and Symptom Validity (FBS-r). Infrequent Somatic Responses (FS) and the Response Bias Scale (RBS) were added to the MMPI-2-RF to assess the report of unusual somatic symptoms and noncredible memory complaints, respectively. The MMPI-2-RF also contains two revised MMPI-2 scales designed to detect under-reporting, Uncommon Virtues (L-r), and Adjustment Validity (K-r).
A number of studies have supported the ability of the over-reporting scales of the MMPI-2-RF to detect invalid response styles. For example, two studies (Marion, Sellbom, & Bagby, 2011; Sellbom & Bagby, 2010) reported that the MMPI-2-RF over-reporting scales were able to differentiate between college students coached to feign mental illness and psychiatric patients. Furthermore, Wygant et al. (2011) reported that the MMPI-2-RF over-reporting scales were able to accurately identify individuals undergoing compensation-seeking evaluations who were classified as probable/definite malingerers based on structured criteria for malingered neurocognitive dysfunction (Slick, Hopp, Strauss, & Thompson, 1997) or malingered pain-related disability (Bianchini, Greve, & Glynn, 2005). All the studies just mentioned demonstrated a large effect size for the ability of the MMPI-2-RF CBIR over-reporting scales in differentiating between groups.
With respect to under-reporting, Sellbom and Bagby (2008) reported that L-r and K-r were able to differentiate between both college students and psychiatric patients instructed to under-report problems and individuals instructed to take the test under standard instructions. Sellbom and Bagby (2008) concluded that their results support the utility of the MMPI-2-RF under-reporting indicators and noted that the effect sizes they obtained were similar to the large effect sizes reported in Baer and Miller’s (2002) meta-analysis of the MMPI-2 measures of under-reporting, from which L-r and K-r were derived. Data reported in the MMPI-2-RF Technical Manual indicate that scores on these scales are commonly elevated in samples of individuals tested under circumstances that are likely to motivate under-reporting (e.g., personnel screening and child custody litigation).
With respect to the third question listed earlier concerning the generalizability of MMPI-2-RF CBIR findings to other measures, only limited MMPI-2 research has been conducted to date. Garcia, Franklin, and Chambliss (2010) examined whether invalid response styles detected by the MMPI-2 Validity scales were correlated with both MMPI-2 substantive scale scores and conjointly administered collateral measures in a sample of veterans diagnosed with posttraumatic stress disorder (PTSD). Garcia et al. (2010) demonstrated that scores on MMPI-2 Clinical scales and conjointly administered measures of depression and PTSD symptoms were significantly higher for individuals with elevated scores on the F (Infrequency) scale. Although this study had some methodological shortcomings that may have artificially increased the detected over-reporting rate (i.e., using nonstandard cut scores for some validity scales), overall, results indicated that MMPI-2-detected over-reporting generalized to the other conjointly administered measures.
Forbey and Lee (2011) attempted to further clarify the association between under- and over-reporting response style scores on the MMPI-2 and conjointly administered measures of social, behavioral, and psychological difficulties. The authors used data provided by 1,112 undergraduates who took the MMPI-2 and a number of collateral measures together as part of a larger study. After excluding NCIR individuals, three groups of participants were identified based on their Validity scale scores: CBIR over-reporters (CBIR-OR), CBIR under-reporters (CBIR-UR), and “within normal limits” (WNL) responders on the validity scales. In terms of MMPI-2 substantive scales, results mirrored those of previous coaching and simulation research, which has long established that scores on MMPI-2 substantive scales vary substantially as a function of validity scale scores. More important, with respect to the conjointly administered collateral measures, Forbey and Lee (2011) found that 27 of 35 collateral total and/or subscale scores in the CBIR-OR group were significantly different from those of the WNL responders (with small to large median effect sizes), and 15 of 35 collateral total and/or subscale scores were significantly different in the CBIR-UR group when compared with the WNL responders (with medium to large median effect sizes). Overall, these results indicated that individuals who were identified as potentially engaging in CBIR-OR by the MMPI-2 Validity scales reported significantly higher levels of psychopathology on the MMPI-2 substantive scales and on conjointly administered collateral measures when compared with individuals with WNL validity scores. Individuals identified as potentially engaging in CBIR-UR by the MMPI-2 Validity scales generally reported substantially lower (than the WNL group) levels of psychopathology on both the MMPI-2 substantive scales and collateral measures administered at the same time.
The study by Forbey and Lee (2011) indicated that MMPI-2-detected CBIR-OR and CBIR-UR likely reflect a more generalized response style that affects conjointly administered collateral measures. This finding is of critical importance because the MMPI-2 or MMPI-2-RF is often administered as part of a battery that includes other tests that lack empirically validated CBIR measures. Being able to rely on MMPI Validity scale findings to guide interpretation of these additional measures may increase their utility considerably. However, Forbey and Lee’s (2011) study was limited in generalizability, as it only included college students who were administered all the measures conjointly (i.e., on the same occasion). Furthermore, this study used the MMPI-2, rather than the MMPI-2-RF, though use of the MMPI-2-RF may confer some advantages, including the ability to examine additional over-reporting scales and the reduced item overlap among the validity scales (in particular the removal of the item overlap among the under- and over-reporting scales that existed on the MMPI-2). Moreover, because it is approximately 40% shorter than the MMPI-2, the MMPI-2-RF is more likely to be administered along with other self-report measures.
The current study was conducted to build on Forbey and Lee’s (2011) findings. The goals of the study were to examine associations between CBIR, as detected by MMPI-2-RF Validity scales and scores on the substantive scales of the test (to serve as an overall procedural evaluation of the methodology) and, more important, to examine associations between the CBIR indicators and conjointly administered self-report measures (which lack such validity indicators) in different populations and during varying time frames. Specifically, using a criterion groups design, we examined MMPI-2-RF and criterion measure scale score differences between subgroups of individuals who were classified as engaging in CBIR-OR or CBIR-UR and individuals who produced interpretable (i.e., WNL on all the validity scales) MMPI-2-RF protocols in male and female college students, males undergoing intake to a correctional facility, and male Veterans Administration mental health outpatients.
Based on previous research, we hypothesized that when compared with individuals classified as responding candidly, individuals engaging in CBIR-OR or CBIR-UR would score differently on MMPI-2-RF substantive scales and collateral measures administered conjointly or within a specified amount of time. Specifically, we hypothesized that individuals identified as potentially engaging in CBIR-OR based on MMPI-2-RF Validity scale cut scores (i.e., elevations on F-r, FP-r, FS, FBS-r, and/or RBS) would score higher than those assigned to the WNL group on substantive measures. We further hypothesized that individuals who were identified as potentially engaging in CBIR-UR based on their cut scores on MMPI-2-RF-specific Validity scales (i.e., elevations on L-r and/or K-r) would score lower than those in the WNL group on substantive measures.
Method
Participants
Three archival samples of participants were used in the current study, with data originally being collected from these participants as part of a larger series of unpublished studies examining the comparability between conventional and several modularized 1 versions of the MMPI-2–computerized adaptive version (MMPI-2-CA; Forbey & Ben-Porath, 2007). The first sample, composed of college participants, included 1,194 (487 men and 707 women) undergraduate students from a Midwestern U.S. university enrolled in Introductory Psychology classes. The second sample, made up of correctional participants, included 632 male inmates from a large Midwestern U.S. intake correctional facility who volunteered to participate in the original study. The third sample included 181 men who were recruited from an outpatient psychiatric care Veterans Affairs facility at a large tertiary care medical center.
To reduce error variance in our analyses, individuals in each of the samples who provided NCIR MMPI-2-RF profiles, defined by CNS scores ≥15 and/or T scores ≥80 on VRIN-r or TRIN-r based on the recommendations in the MMPI-2-RF administration manual (Ben-Porath & Tellegen, 2008), were removed from the current study. This resulted in a total of 129 (10.8%) college, 19 (3.0%) correctional, and 17 (9.4%) psychiatric participants being excluded from analyses. Among the college participants, no statistically significant differences between included and excluded participants were evident in terms of age or ethnicity/racial group. However, a statistically significant difference of a small effect size (Cohen, 1988) was evident for gender, χ2(1) = 26.986, p < .001, Φ = .15, with men being slightly more likely to produce NCIR MMPI-2-RF profiles. Among the correctional and psychiatric participants, no statistically significant differences between included and excluded participants were evident in terms of age. However, a statistically significant difference of a small effect size (Cohen, 1988) was evident for ethnicity/racial group in both sets of participants, χ2(2) = 8.526, p < .05, Φ = .12 and χ2(2) = 14.474, p < .001, Φ = .28, respectively, with minority participants being slightly more likely to produce NCIR MMPI-2-RF profiles.
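The screening rule and the Φ values above can be illustrated with a short Python sketch. It is only an illustration, not the authors' analysis code: the function names are hypothetical, Φ is computed as the square root of χ2/N, and taking N to be the pre-exclusion sample size (1,194, 632, and 181) is an assumption that happens to reproduce the reported values.

```python
import math

def is_ncir(cns_raw, vrin_r_t, trin_r_t):
    """Return True if a protocol meets the NCIR exclusion rule described above.

    cns_raw is the Cannot Say raw count; vrin_r_t and trin_r_t are T scores.
    """
    return cns_raw >= 15 or vrin_r_t >= 80 or trin_r_t >= 80

def phi_from_chi2(chi2, n):
    """Phi effect size for a chi-square statistic based on n participants."""
    return math.sqrt(chi2 / n)

# Reported chi-square values paired with the pre-exclusion sample sizes
# (using these sample sizes as N is an assumption of this sketch).
reported = {
    "college, gender": (26.986, 1194),
    "correctional, ethnicity": (8.526, 632),
    "psychiatric, ethnicity": (14.474, 181),
}
for label, (chi2, n) in reported.items():
    print(f"{label}: phi = {phi_from_chi2(chi2, n):.2f}")
```

Under these assumptions the three printed values round to .15, .12, and .28, matching the effect sizes reported in the text.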
After NCIR exclusions, the final sample of 1,065 college participants included 407 men and 658 women (age range = 18-48 years, M = 19.60 years, SD = 3.14) who were Caucasian (n = 947, 88.9%), African American (n = 84, 7.9%), or of another or unidentified ethnicity (n = 34, 3.2%). The final sample of correctional participants included 613 men (age range = 18-66 years, M = 32.37 years, SD = 9.94) who were Caucasian (n = 310, 50.6%), African American (n = 173, 28.2%), or of another or unidentified ethnicity (n = 130, 21.2%). The final sample of psychiatric participants included 164 men (age range = 26-85 years, M = 55.44 years, SD = 11.90) who were Caucasian (n = 138, 84.7%), African American (n = 21, 12.9%), or of another or unidentified ethnicity (n = 5, 3.1%).
Measures
Minnesota Multiphasic Personality Inventory–2–Restructured Form
Revised from the MMPI-2 (Butcher et al., 2001), the MMPI-2-RF (Ben-Porath & Tellegen, 2008; Tellegen & Ben-Porath, 2008/2011) is a 338-item, true/false, self-report inventory that assesses an individual’s psychological functioning in a number of domains (i.e., personality, psychopathology, and social/behavioral functioning). The MMPI-2-RF Technical Manual (Tellegen & Ben-Porath, 2008/2011) provides extensive evidence supporting reliability and validity for this instrument. For the current study, the MMPI-2-RF was rescored from a conventional administration of the MMPI-2 (see Procedure section below). Tellegen and Ben-Porath (2008/2011) and Van der Heijden, Egger, and Derksen (2010) have demonstrated that MMPI-2-RF scale scores generated from an MMPI-2 administration are interchangeable with those generated from the MMPI-2-RF booklet.
Collateral measures
Eight self-report-based collateral measures, originally selected for the purpose of examining the comparative validity of modularized versions of the MMPI-2-CA and conventional administrations of the MMPI-2 with conceptually relevant personality and psychopathology constructs, were used in the current study. Depending on setting, the collateral measures were either divided into two administration packets (i.e., the college and psychiatric participants) or administered in a single packet (i.e., the correctional participants). All eight collateral measures were administered via paper and pencil. Each of the eight self-report collateral measures is described below.
Barratt Impulsivity Scale–Version 10
The Barratt Impulsivity Scale (BIS; Barratt, 1985) is a 34-item measure of impulsivity rated on a 4-point Likert-type scale (1 = rarely/never, 4 = almost always/always) that provides a total score as well as scores on three dimensions of impulsivity: Nonplanning (12 items), Motor (11 items), and Cognitive (11 items). In the current study, only the total scale score was used, which had an estimated internal consistency (α) of .80 for college, .87 for correctional, and .86 for psychiatric participants.
Beck Depression Inventory
The Beck Depression Inventory (BDI; Beck, Ward, Mendelson, Mock, & Erbaugh, 1961) is a 21-item measure of depressive symptomatology rated on a 4-point Likert-type scale, with higher ratings generally indicating higher levels of psychological distress. In the current study, estimated internal consistency (α) for the BDI was .88 for the college, .89 for the correctional, and .93 for the psychiatric participants.
Drug Abuse Screening Test
The Drug Abuse Screening Test (DAST; Skinner, 1982) is a 20-item measure rated dichotomously (i.e., yes or no) that assesses an individual’s self-reported consumption of prescription, over-the-counter, and other illicit drugs. In the current study, internal consistency (α) for the DAST was estimated to be .81 for the college, .93 for the correctional, and .94 for the psychiatric participants.
Magical Ideation Scale
The Magical Ideation Scale (MIS; Eckblad & Chapman, 1983) is a 30-item true/false inventory that assesses beliefs about causal relations between events that are unconventional and which are commonly associated with thought disorders. In the current study, estimated internal consistency (α) for the MIS was .80 for the college, .81 for the correctional, and .86 for the psychiatric participants.
Michigan Alcohol Screening Test
The Michigan Alcohol Screening Test (MAST; Selzer, 1971) is a 24-item measure rated dichotomously (i.e., yes or no) that assesses an individual’s self-reported level of problematic alcohol consumption. In the current study, internal consistency (α) for the MAST was estimated to be .68 for the college, .80 for the correctional, and .84 for the psychiatric participants.
Perceptual Aberration Scale
The Perceptual Aberration Scale (PAS; Chapman, Chapman, & Raulin, 1978) is a 35-item true/false inventory that assesses physical and other perceptual distortions related to thought disorders. In the current study, internal consistency (α) for the PAS was estimated to be .96 for the college, .97 for the correctional, and .92 for the psychiatric participants.
Screener for Somatoform Disorders
The Screener for Somatoform Disorders (SSD; Janca et al., 1995) is a 12-item measure designed to assess diffuse somatic complaints related to somatoform disorders as defined by the International Classification of Diseases (ICD-10; World Health Organization, 1992) and the fourth edition of the American Psychiatric Association’s (APA) Diagnostic and Statistical Manual of Mental Disorders (DSM-IV; APA, 1994). In the current study, internal consistency (α) for the SSD was estimated to be .80 for the college, .84 for the correctional, and .86 for the psychiatric participants.
State-Trait Personality Inventory
The State-Trait Personality Inventory (STPI; Spielberger, 1979) is a 60-item measure of anxiety and anger rated on a 4-point Likert-type scale (1 = almost never, 4 = almost always). The current study relied on a modified 10-item version of the instrument in which only the Trait Anger items were administered. In the current study, the estimated internal consistency (α) for the Trait Anger subscale was .69 for the college, .88 for the correctional, and .88 for the psychiatric participants.
Procedure
As indicated, all participants in the current study were selected from three larger unpublished studies examining various modularized versions of the computerized adaptive version of the MMPI-2 (i.e., the MMPI-2-CA). Although the measures used in each study were consistent across settings, because of various institutional considerations/constraints, each of the data collection sites had slightly different data collection procedures. However, all participants were assessed in accordance with procedures approved by the institutional review board at the facilities where the data were collected, received compensation in line with those procedures dictated by the facilities and their respective institutional review boards, and were free to withdraw their participation at any time.
For the college participants, after agreeing to participate, individuals were assigned to complete the MMPI-2, as well as one of two packets of selected collateral measures, and then return in 1 week to complete another MMPI-2 administration and the remaining collateral measures. For the psychiatric participants, letters were sent prior to initial clinic appointments introducing the study and individuals were asked to call and set up an appointment to participate in the study. However, the actual data collection in the psychiatric participants was similar to that of the college participants, with the exception that the time frame for the second MMPI-2 administration ranged from 6 to 10 days. For both college and psychiatric participants, administration of the MMPI-2 and collateral measures was counterbalanced across the two time frames (i.e., either the adaptive or conventional version of the MMPI-2 was administered followed by the collateral measures or vice versa).
For the correctional participants, all individuals completed a standard paper and pencil audio-taped version of the MMPI-2 on intake to the correctional facility as part of standard institutional screening procedures. On completion of the paper-and-pencil audio MMPI-2, participants were recruited for participation in the larger study. Those who volunteered for the study were required to have a sixth-grade reading level, which was verified using facility records. 2 Administration of the collateral measures (as well as either a computerized conventional or adaptive MMPI-2) occurred in a group format typically between 1 and 5 days after administration of the intake MMPI-2, with 87.8% (N = 555) of participants completing the collateral measures 1 day after completing the initial MMPI-2. The administration order of the second MMPI-2 and collateral measures was counterbalanced.
In the current study, across the samples, only the first conventional MMPI-2 administered was used in the rescoring of the MMPI-2 to the MMPI-2-RF. In the case of the correctional participants, the rescored MMPI-2 selected for use in the current study was always the first MMPI-2 administered during the original intake procedure. For the college and psychiatric participants, the rescored MMPI-2 used in the current study could have been administered either approximately a week before or after the administration of some of the collateral measures because of counterbalancing of administration formats. Across settings, all collateral measures were administered via a Latin square design and were considered invalid for the current study’s analyses if 10% or more of items were not answered.
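The missing-item rule for the collateral measures can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name and the use of None to mark an unanswered item are hypothetical, and the source does not describe how the rule was implemented.

```python
def is_measure_usable(responses):
    """Return True when fewer than 10% of a measure's items are unanswered.

    responses is a hypothetical list with None marking an unanswered item.
    Integer arithmetic avoids floating-point issues at exactly 10%.
    """
    if not responses:
        return False
    missing = sum(1 for r in responses if r is None)
    return missing * 10 < len(responses)

# A 20-item measure with 2 blanks reaches exactly 10% and is treated as invalid.
print(is_measure_usable([1] * 18 + [None] * 2))   # False
# A 21-item measure with 2 blanks stays below 10% and is retained.
print(is_measure_usable([1] * 19 + [None] * 2))   # True
```

The strict inequality reflects the wording "10% or more" in the text: a protocol sitting exactly at the threshold is excluded.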
Response Style Group Assignment
After removing NCIR MMPI-2-RF profiles (as described in the Participants section above), participants in each sample were assigned to one of three groups: CBIR-OR, CBIR-UR, and WNL reporting on the validity scales, based on their T scores on the standard MMPI-2-RF Validity scales. The MMPI-2-RF Validity scale T score cutoffs used for group assignment in the current study were consistent with those indicated in the MMPI-2-RF Manual for Administration, Scoring, and Interpretation (Ben-Porath & Tellegen, 2008). CBIR-OR was identified by T scores of 100 or greater on the Infrequent Responses (F-r), Infrequent Somatic Responses (FS), Symptom Validity (FBS-r), or Response Bias (RBS) Scales, and/or 80 or more on the Infrequent Psychopathology Responses (FP-r) Scale. CBIR-UR was identified by T scores of 70 or greater on the Uncommon Virtues (L-r) scale and/or an Adjustment Validity (K-r) scale score of 66 or higher. In addition, no individuals in the CBIR-OR group could have elevations on the under-reporting scales and vice versa. Finally, the WNL group consisted of individuals who had no elevations on any of the over-reporting and/or under-reporting MMPI-2-RF scales.
Using this group assignment procedure for the 1,065 included college participants, 100 were assigned to CBIR-OR, 59 were assigned to CBIR-UR, 903 were assigned to WNL, and 3 were removed from subsequent analyses because of having elevations on both CBIR-OR and CBIR-UR Scales. For the 613 included correctional participants, 39 were assigned to CBIR-OR, 163 were assigned to CBIR-UR, 406 were assigned to WNL, and 5 were removed from subsequent analyses because of having elevations on both CBIR-OR and CBIR-UR Scales. For the 164 included psychiatric participants, 44 were assigned to CBIR-OR, 16 were assigned to CBIR-UR, and 104 were assigned to WNL.
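The group assignment procedure can be summarized in a brief Python sketch. The cutoffs are those stated in the text; the function name, the dictionary layout, and the example T scores are hypothetical, and the sketch assumes NCIR protocols have already been removed.

```python
# T score cutoffs as stated in the text (scale labels follow the text).
OVER_REPORTING_CUTOFFS = {"F-r": 100, "FS": 100, "FBS-r": 100, "RBS": 100, "FP-r": 80}
UNDER_REPORTING_CUTOFFS = {"L-r": 70, "K-r": 66}

def assign_response_style_group(t_scores):
    """Classify one NCIR-screened protocol from its Validity scale T scores.

    t_scores is a hypothetical dict mapping scale labels to T scores.
    """
    over = any(t_scores[s] >= c for s, c in OVER_REPORTING_CUTOFFS.items())
    under = any(t_scores[s] >= c for s, c in UNDER_REPORTING_CUTOFFS.items())
    if over and under:
        return "excluded"  # elevations on both sets of scales
    if over:
        return "CBIR-OR"
    if under:
        return "CBIR-UR"
    return "WNL"

# Made-up protocol: only FP-r reaches its cutoff, so the result is CBIR-OR.
example = {"F-r": 88, "FP-r": 81, "FS": 74, "FBS-r": 66, "RBS": 70, "L-r": 52, "K-r": 45}
print(assign_response_style_group(example))
```

Returning a separate "excluded" label mirrors the handling of the 3 college and 5 correctional participants who had elevations on both over- and under-reporting scales.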
Comparative demographic analyses in group assignments revealed a few statistically significant differences for each of the samples. For college participants, no differences emerged in terms of gender or ethnic/racial identity. However, a statistically significant difference of a small effect size (Cohen, 1988) was demonstrated for age, F(2, 1059) = 3.409, p < .033, η2 = .01. Post hoc comparisons for age using Tukey’s HSD test indicated that the mean age in the CBIR-UR group (M = 20.61 years, SD = 5.275) was significantly older than that in the WNL (M = 19.57 years, SD = 3.052) and CBIR-OR (M = 19.35 years, SD = 2.037) groups. For both correctional and psychiatric participants, no differences were found between groups in terms of age or ethnicity.
Data Analyses
To examine potential differences in MMPI-2-RF scale scores between the three response style groups that were created, a series of t tests were computed comparing mean T scores on MMPI-2-RF substantive scales for each of the three samples of participants separately. To reduce potential Type I error, a Bonferroni correction was applied and the critical alpha for analyses within each of these three samples was set to .001 (.05/42). Cohen’s d (1988) effect sizes, with .3, .5, and .8 reflecting small, medium, and large effects, respectively, for the differences between means were reported for all analyses, regardless of significance.
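As an illustration of the two quantities described above, the following Python sketch computes a Bonferroni-adjusted alpha and Cohen's d. The pooled-standard-deviation form of d is an assumption (the text does not state which variant was used), the function names are hypothetical, and the T scores in the example are invented for demonstration only.

```python
import math
from statistics import mean, variance

def bonferroni_alpha(family_alpha, n_tests):
    """Per-comparison alpha after a Bonferroni correction."""
    return family_alpha / n_tests

def cohens_d(group_a, group_b):
    """Cohen's d using the pooled standard deviation (assumed variant)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)

# 42 substantive scale comparisons per sample, as described in the text.
print(f"critical alpha = {bonferroni_alpha(0.05, 42):.4f}")

# Invented T scores for one scale, purely for illustration.
cbir_or_scores = [70, 75, 80, 85]
wnl_scores = [50, 55, 60, 65]
print(f"d = {cohens_d(cbir_or_scores, wnl_scores):.2f}")
```

A positive d here means the first group scored higher than the second, which is the direction hypothesized for CBIR-OR versus WNL comparisons.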
To examine potential differences on collateral measure scores between response style groups, a series of t tests were computed comparing mean scores on the collateral measures for each of the three samples. As each of the eight collateral measures has unique scale score ranges, means, and standard deviations, the descriptive statistics reported for the collateral measure analyses were converted (within each group) to a z-score metric to facilitate interpretation of the results. As with the MMPI-2-RF substantive scale analyses, to reduce potential Type I error, a Bonferroni correction was applied, setting the critical alpha for these analyses at .006 (.05/8) for each of the samples. Effect sizes (d; Cohen, 1988) were reported for all analyses, regardless of significance.
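The z-score conversion can be sketched in Python as shown below. The text does not specify the reference mean and standard deviation beyond "within each group", so standardizing a list of raw scores against its own mean and sample standard deviation is an assumption of this sketch; the function name and the raw scores are hypothetical.

```python
from statistics import mean, stdev

def to_z_scores(raw_scores):
    """Standardize raw scores against their own mean and sample SD."""
    m = mean(raw_scores)
    s = stdev(raw_scores)
    return [(x - m) / s for x in raw_scores]

# Invented raw totals on one collateral measure, for illustration only.
raw_totals = [10, 12, 14, 16, 18]
z_totals = to_z_scores(raw_totals)
print([round(z, 2) for z in z_totals])
```

After conversion, scores from measures with different ranges share a common metric with a mean of 0 and a standard deviation of 1, which is what makes the group means comparable across the eight measures.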
Results
The first set of analyses explored the differences on substantive scale scores between the WNL and CBIR-OR response style groups, as well as between the WNL and CBIR-UR response style groups, across the three samples. Tables 1, 2, and 3 contain the results of these analyses for the college, correctional, and psychiatric participants, respectively.
Table 1. Minnesota Multiphasic Personality Inventory–2 (MMPI-2) Scale Mean T Score Differences Between MMPI-2 Within Normal Limits (WNL) Versus Content-Responsive Invalidity: Over-Reporting (CBIR-OR) and Content-Responsive Invalidity: Under-Reporting (CBIR-UR): College Participants.
Note. ES = effect size; Higher Order: EID = Emotional/Internalizing Dysfunction, THD = Thought Dysfunction, BXD = Behavioral/Externalizing Dysfunction; Restructured Clinical (RC): RCd = Demoralization, RC1 = Somatic Complaints, RC2 = Low Positive Emotions, RC3 = Cynicism, RC4 = Antisocial Behavior, RC6 = Ideas of Persecution, RC7 = Dysfunctional Negative Emotions, RC8 = Aberrant Experiences, RC9 = Hypomanic Activation; Somatic/Cognitive: MLS = Malaise, GIC = Gastrointestinal Complaints, HPC = Head Pain Complaints, NUC = Neurological Complaints, COG = Cognitive Complaints; Internalizing Scales: SUI = Suicidal/Death Ideation, HLP = Helplessness/Hopelessness, SFD = Self-Doubt, NFC = Inefficacy, STW = Stress/Worry, AXY = Anxiety, ANP = Anger Proneness, BRF = Behavior-Restricting Fears, MSF = Multiple Specific Fears; Externalizing: JCP = Juvenile Conduct Problems, SUB = Substance Abuse, AGG = Aggression, ACT = Activation; Interpersonal Scales: FML = Family Problems, IPP = Interpersonal Passivity, SAV = Social Avoidance, SHY = Shyness, DSF = Disaffiliativeness; Interest Scales: AES = Aesthetic-Literary Interests, MEC = Mechanical-Physical Interests; Personality Psychopathology Five (PSY-5): AGGR-r = Aggressiveness-Revised, PSYC-r = Psychoticism–Revised, DISC-r = Disconstraint–Revised, NEGE-r = Negative Emotionality/Neuroticism–Revised, INTR-r = Introversion/Low Positive Emotionality–Revised.
Table 2. Minnesota Multiphasic Personality Inventory–2 (MMPI-2) Scale Mean T Score Differences Between MMPI-2 Within Normal Limits (WNL) Versus Content-Responsive Invalidity: Over-Reporting (CBIR-OR) and Content-Responsive Invalidity: Under-Reporting (CBIR-UR): Correctional Participants.
Note. ES = effect size; Higher Order: EID = Emotional/Internalizing Dysfunction, THD = Thought Dysfunction, BXD = Behavioral/Externalizing Dysfunction; Restructured Clinical (RC): RCd = Demoralization, RC1 = Somatic Complaints, RC2 = Low Positive Emotions, RC3 = Cynicism, RC4 = Antisocial Behavior, RC6 = Ideas of Persecution, RC7 = Dysfunctional Negative Emotions, RC8 = Aberrant Experiences, RC9 = Hypomanic Activation; Somatic/Cognitive: MLS = Malaise, GIC = Gastrointestinal Complaints, HPC = Head Pain Complaints, NUC = Neurological Complaints, COG = Cognitive Complaints; Internalizing Scales: SUI = Suicidal/Death Ideation, HLP = Helplessness/Hopelessness, SFD = Self-Doubt, NFC = Inefficacy, STW = Stress/Worry, AXY = Anxiety, ANP = Anger Proneness, BRF = Behavior-Restricting Fears, MSF = Multiple Specific Fears; Externalizing: JCP = Juvenile Conduct Problems, SUB = Substance Abuse, AGG = Aggression, ACT = Activation; Interpersonal Scales: FML = Family Problems, IPP = Interpersonal Passivity, SAV = Social Avoidance, SHY = Shyness, DSF = Disaffiliativeness; Interest Scales: AES = Aesthetic-Literary Interests, MEC = Mechanical-Physical Interests; Personality Psychopathology Five (PSY-5): AGGR-r = Aggressiveness–Revised, PSYC-r = Psychoticism–Revised, DISC-r = Disconstraint–Revised, NEGE-r = Negative Emotionality/Neuroticism–Revised, INTR-r = Introversion/Low Positive Emotionality–Revised.
Table 3. Minnesota Multiphasic Personality Inventory–2 (MMPI-2) Scale Mean T Score Differences Between MMPI-2 Within Normal Limits (WNL) Versus Content-Responsive Invalidity: Over-Reporting (CBIR-OR) and Content-Responsive Invalidity: Under-Reporting (CBIR-UR): Psychiatric Participants.
After a Bonferroni correction, t-test results for the 42 comparisons of MMPI-2-RF substantive scale scores for the CBIR-OR and WNL groups indicated that 36 scale scores for college participants, 36 scale scores for correctional participants, and 35 scale scores for psychiatric participants were significantly different. In all of these cases, mean MMPI-2-RF substantive scale scores for the CBIR-OR groups were significantly higher than those for the WNL groups. Median effect sizes for the CBIR-OR and WNL mean scale score comparisons for the Higher Order (H-O) scales were 1.05 for college participants (range = 0.91 to 1.57), 2.08 for correctional participants (range = 0.63 to 2.46), and 1.56 for the psychiatric participants (range = 0.75 to 1.86). Median effect sizes for the CBIR-OR and WNL mean scale score comparisons for the Restructured Clinical scales were 1.07 for college participants (range = 0.71 to 1.49), 2.07 for correctional participants (range = 0.72 to 2.34), and 1.37 for the psychiatric participants (range = 0.82 to 2.01). For the Specific Problems and Interest scales, median effect sizes for the CBIR-OR and WNL scale score comparisons were 0.70 for college participants (range = 0.10 to 1.45), 1.41 for correctional participants (range = −0.31 to 2.99), and 1.07 for the psychiatric participants (range = −0.05 to 1.87). Finally, median effect sizes for the CBIR-OR and WNL comparisons for scores on the PSY-5-r scales were 0.59 for college participants (range = 0.24 to 1.64), 1.26 for correctional participants (range = −0.42 to 2.49), and 0.55 for the psychiatric participants (range = 0.24 to 1.85). For the CBIR-OR and WNL effect size analyses, nonsignificant results were included in median calculations.
For the comparison of mean MMPI-2-RF substantive scale scores between CBIR-UR and WNL response style groups for each of the three samples, after a Bonferroni correction, t-test results for the 42 comparisons indicated that 30 scale scores for college participants, 32 scale scores for correctional participants, and 16 scale scores for psychiatric participants were significantly different. When significant differences existed, mean scores for the CBIR-UR groups were significantly lower than the WNL responders’ mean scores. Median effect sizes for the CBIR-UR and WNL mean scale score comparisons for the H-O scales were −0.65 for college participants (range = −0.50 to −1.04), −0.96 for correctional participants (range = −0.49 to −1.09), and −0.74 for the psychiatric participants (range = −0.57 to −1.63). Median effect sizes for the CBIR-UR and WNL mean scale score comparisons for the Restructured Clinical scales were −0.61 for college participants (range = −0.28 to −1.13), −0.73 for correctional participants (range = −0.39 to −1.05), and −0.92 for the psychiatric participants (range = −0.37 to −1.42). For the Specific Problems and Interest scales, median effect sizes for the CBIR-UR and WNL scale score comparisons were −0.51 for college participants (range = 0.20 to −0.97), −0.54 for correctional participants (range = 0.24 to −0.85), and −0.73 for the psychiatric participants (range = 0.35 to −1.23). Finally, median effect sizes for the CBIR-UR and WNL comparisons for scores on the PSY-5-r scales were −0.43 for college participants (range = 0.14 to −1.13), −0.50 for correctional participants (range = −0.07 to −0.97), and −0.45 for the psychiatric participants (range = 0.04 to −1.39). As with the CBIR-OR analyses, for the CBIR-UR and WNL effect size analyses nonsignificant results were included in median calculations.
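The quantities summarized in the two preceding paragraphs can be illustrated with a minimal Python sketch. It rests on assumptions not stated in the text: a familywise alpha of .05 for the Bonferroni correction, Cohen's d computed with a pooled standard deviation, and the sign convention of invalid-responding group minus WNL group (consistent with the positive over-reporting and negative under-reporting values reported above). The function names are hypothetical and the numbers in the usage lines are made up for illustration only.

```python
from statistics import mean, median, variance


def bonferroni_alpha(familywise_alpha: float, n_comparisons: int) -> float:
    """Per-comparison significance threshold under a Bonferroni correction."""
    return familywise_alpha / n_comparisons


def cohens_d(cbir_scores, wnl_scores) -> float:
    """Standardized mean difference (CBIR minus WNL) using a pooled SD.

    Assumption: the article does not say which variant of the effect size
    was used; the pooled-SD form is shown here as one common choice.
    """
    n_a, n_b = len(cbir_scores), len(wnl_scores)
    pooled_var = (
        (n_a - 1) * variance(cbir_scores) + (n_b - 1) * variance(wnl_scores)
    ) / (n_a + n_b - 2)
    return (mean(cbir_scores) - mean(wnl_scores)) / pooled_var ** 0.5


def median_effect_size(effect_sizes) -> float:
    """Median over all comparisons, significant or not, as in the text."""
    return median(effect_sizes)


# Illustrative usage with made-up T scores (not data from the study).
alpha_per_test = bonferroni_alpha(0.05, 42)  # about 0.0012
d = cohens_d([2, 4, 6], [1, 3, 5])           # 0.5 for these toy values
```

Including nonsignificant comparisons in the median, as the text describes, keeps the summary from being inflated by selecting only the largest differences.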
Tables 4, 5, and 6 report the scores on the collateral measures for the CBIR-OR and WNL response style groups, as well as CBIR-UR and WNL response style groups for the college, correctional, and psychiatric participants, respectively. Within the tables, total scores for collateral measures are reported in z scores, and the measures are divided into three broad categorizations (i.e., externalizing, internalizing, and thought dysfunction) based on the general content and/or the construct examined by each collateral measure.
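As a brief illustration of the z-score metric mentioned above, the following Python sketch standardizes a list of raw total scores. It assumes standardization against the mean and sample standard deviation of the scores supplied; the text does not specify the reference distribution that was actually used, and the function name is hypothetical.

```python
from statistics import mean, stdev


def to_z_scores(raw_scores):
    """Convert raw total scores to z scores (mean 0, SD 1).

    Assumption: scores are standardized within the list passed in,
    using the sample standard deviation.
    """
    m = mean(raw_scores)
    s = stdev(raw_scores)
    return [(x - m) / s for x in raw_scores]


# Illustrative usage with made-up raw scores.
z = to_z_scores([1, 2, 3])
```

Placing every collateral measure on a common metric is what allows mean differences to be compared across instruments with different raw score ranges.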
Comparison of Collateral Measure Mean Scores for Within Normal Limits (WNL) Group With Content-Based Invalid Responding: Over-Reporting (CBIR-OR) and Content-Based Invalid Responding: Under-Reporting (CBIR-UR): College Participants.
Comparison of Collateral Measure Mean Scores for Within Normal Limits (WNL) Group With Content-Based Invalid Responding: Over-Reporting (CBIR-OR) and Content-Based Invalid Responding: Under-Reporting (CBIR-UR): Correctional Participants.
Comparison of Collateral Measure Mean Scores for Within Normal Limits (WNL) Group With Content-Based Invalid Responding: Over-Reporting (CBIR-OR) and Content-Based Invalid Responding: Under-Reporting (CBIR-UR): Psychiatric Participants.
After applying a Bonferroni correction, the CBIR-OR response style groups demonstrated significantly different mean scores on the majority of the eight measures when compared with the WNL groups. For the college participants, all the externalizing, internalizing, and thought dysfunction measures differed significantly compared with the WNL group (median effect size [d] = −0.91, range = −0.54 to −1.33). For the correctional participants, two of four externalizing, all internalizing, and all thought dysfunction measures differed significantly compared with the WNL group (median effect size [d] = −1.12, range = −0.20 to −2.00). Finally, for the psychiatric participants, three of four externalizing, both internalizing, and both thought dysfunction measures were significantly different (median effect size [d] = −1.15, range = −0.27 to −1.84). The median effect sizes include all results (i.e., significant and nonsignificant) in their calculations. For the scales that demonstrated a statistically significant difference across the three samples, inspection of the mean scores indicated the CBIR-OR group reported increased negative functioning on the criterion measures when compared with the WNL group.
Finally, with respect to the CBIR-UR groups, after applying a Bonferroni correction, the CBIR-UR response style groups demonstrated significantly different mean scores on a number of the eight measures when compared with the WNL groups. For the college participants, one of four externalizing, all internalizing, and one of two thought dysfunction measures differed significantly compared with the WNL group (median effect size [d] = 0.42, range = 0.19 to 0.68). For the correctional participants, all externalizing, all internalizing, but no thought dysfunction measures differed significantly compared with the WNL group (median effect size [d] = 0.42, range = 0.05 to 0.87). Finally, for the psychiatric participants, two of four externalizing, one of two internalizing, and no thought dysfunction measures were significantly different (median effect size [d] = 0.64, range = 0.10 to 1.24). The median effect sizes include all results (i.e., significant and nonsignificant) in their calculations. For the scales that demonstrated a statistically significant difference, inspection of the mean scores indicated the CBIR-UR groups reported increased positive functioning or fewer experiences of psychopathology-related symptoms on the criterion measures compared with the WNL groups across the three samples.
Discussion
The current study examined associations between CBIR response styles, as determined by MMPI-2-RF Validity scale cut scores, and scores on the MMPI-2-RF substantive scales and collateral measures in three groups of participants. As hypothesized, results indicated that individuals identified as potentially engaging in CBIR differed significantly on MMPI-2-RF and collateral measures of dysfunction from participants who responded to the MMPI-2-RF in a valid manner. Specifically, individuals identified as potentially over-reporting were significantly more likely to report experiencing higher levels of social, behavioral, and psychological difficulties on both the MMPI-2-RF substantive scales and non-MMPI-2-RF collateral measures. Those identified as possibly engaging in an under-reporting response style were more likely to report having fewer social, behavioral, and psychological difficulties, as indicated by their scores on both MMPI-2-RF substantive scales and collateral measures, though the effect sizes of these differences were not as large as those demonstrated for over-reporting. One exception to these trends was that a smaller number of MMPI-2-RF substantive scale scores were significantly lower for the under-reporting group in the psychiatric sample. This likely reflects the reduced power available for analyses conducted with this smaller sample.
Most germane to the purpose of the current study, individuals identified as engaging in a potential CBIR response style by MMPI-2-RF Validity scales demonstrated numerous statistically significant differences on the eight self-report collateral measures when compared with individuals believed to be responding honestly. This result was demonstrated for collateral measures that were administered either at the same time as the MMPI-2-RF or within a range of 1 to 10 days prior to or afterward (depending on the participant sample). For the CBIR-OR groups, all eight measures for the college participants, six of eight measures for the correctional participants, and seven of eight measures for the psychiatric participants were significantly different compared with the WNL group. The pattern of differences observed for these groups indicated that MMPI-2-RF-detected over-reporters endorsed significantly higher levels of internalizing, externalizing, and/or thought dysfunction on the collateral measures. Conversely, for the CBIR-UR groups, of the eight analyzed collateral measures, scores on four measures in the college participants, six measures in the correctional participants, and three measures in the psychiatric participants reflected significantly lower levels of externalizing, internalizing, and/or thought dysfunction when compared with the WNL group.
The use of three quite different samples, in which the collateral measures were administered at different times relative to the MMPI-2-RF, points to a robust finding: invalid responding as detected by the MMPI-2-RF validity indicators generalizes to other measures administered along with the test. As noted earlier, this finding has important implications for clinical and correctional assessments in which the MMPI-2-RF is administered as part of a battery of tests that includes measures lacking validity scales. The interpretive cautions that validity scale results indicate for scores on the MMPI-2-RF substantive scales generalize to other measures and should be applied to their interpretation as well.
Although the current study has a number of strengths (use of “naturally occurring” rather than simulated invalid responding, replication in three distinct settings, and varying time frames for the administration of the collateral measures), a number of limitations must be acknowledged as well. Foremost among these is our broad classification of over-reporting or under-reporting based on any elevation on a single validity indicator. In practice, interpretive guidelines for the MMPI-2-RF indicate a need to consider scores on some validity indicators conjointly. For the present investigation, we opted not to do so in order to first explore the generalizability of validity scale findings at the broadest level. This approach led to roughly 9% of college, 6% of correctional, and 27% of psychiatric participants being identified as potentially engaging in CBIR-OR. Conversely, roughly 6% of college, 27% of correctional, and 10% of psychiatric participants were identified as potentially engaging in CBIR-UR. In examining the comparative CBIR-OR percentages, it is possible that some individuals in the psychiatric group who had legitimate psychological distress were misidentified as over-reporting because they had an elevated score on F-r. On the other hand, when examining the percentages of the CBIR-UR classifications across samples, it is less likely that the discrepancy between the correctional group and the other groups reflected genuinely greater virtue or a genuinely higher level of psychological adjustment in the correctional participants. Furthermore, in a very limited number of cases, individuals were identified as members of both groups. This finding most likely was the result of using MMPI-2-RF cut scores that, for several of the scales, were slightly below those the manual associates with more definitive invalidity.
In any event, although the overall results of the current study are similar to previous findings for the MMPI-2 and MMPI-2-RF substantive scales and collateral measures, future studies might consider using a different method of CBIR group assignment, such as symptom validity testing in combination with MMPI-2-RF Validity scale scores.
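The broad assignment rule described in this limitation (any elevation on a single validity indicator) can be sketched in Python as follows. The cut scores below are hypothetical placeholders chosen only for illustration; they are not the values used in the study or given in the MMPI-2-RF manual, and the sketch covers only the validity scales named in this discussion. The function and variable names are likewise hypothetical.

```python
# Hypothetical T-score cut points, for illustration only.
# These are NOT the cut scores used in the study or stated in the manual.
OVER_REPORTING_CUTS = {"F-r": 100, "Fp-r": 80, "Fs": 90}
UNDER_REPORTING_CUTS = {"L-r": 70, "K-r": 65}


def classify_response_style(t_scores: dict) -> set:
    """Assign response-style labels from validity scale T scores.

    A single elevated indicator is enough to assign a label, so one
    protocol can receive both CBIR-OR and CBIR-UR, mirroring the small
    number of dual classifications noted in the text.
    """
    labels = set()
    if any(t_scores.get(scale, 0) >= cut
           for scale, cut in OVER_REPORTING_CUTS.items()):
        labels.add("CBIR-OR")
    if any(t_scores.get(scale, 0) >= cut
           for scale, cut in UNDER_REPORTING_CUTS.items()):
        labels.add("CBIR-UR")
    # No elevation on any indicator: within normal limits.
    return labels or {"WNL"}


# Illustrative usage with made-up T scores.
example = classify_response_style({"F-r": 120, "L-r": 52, "K-r": 48})
```

A conjoint rule of the kind the interpretive guidelines recommend would replace the two `any(...)` checks with conditions that weigh several indicators together, which is one way future work could narrow the groups.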
A second limitation of the current study involved the inability to examine the potential differences in substantive and collateral measure scores that specific MMPI-2-RF over- and/or under-reporting scales might suggest, due to relatively small sample sizes for the CBIR-OR and CBIR-UR groups. For example, individuals who have elevated scores on Fp-r (a measure of over-reporting severe psychopathology) might have substantially different scores on the MMPI-2-RF substantive scales and/or collateral measures compared with individuals who have elevated scores on Fs (a measure of over-reporting somatic problems). Similarly, different under-reporting strategies (as reflected by L-r and K-r) might lead to differential patterns on both the MMPI-2-RF substantive scales and collateral measures. We encourage researchers with access to considerably larger databases to apply the methodology of the current study to further explore the impact of more specifically delineated response styles.
A third limitation of the current study is that we focused only on self-report collateral measures to explore the impact of CBIR response styles. Therefore, we were not able to examine the potential impact on information gathered via other means. Future studies should examine whether other forms of client-provided information (e.g., structured and unstructured interviews) might be affected by response biases detected by the MMPI-2-RF validity indicators. A final limitation of the current study is that only men were examined in the psychiatric and correctional groups. Future studies should attempt to include women from these populations as well as to explore the impact of CBIR styles in additional populations with both genders (e.g., other legal settings, employee screening settings).
The limitations just noted notwithstanding, our findings suggest that if the MMPI-2-RF Validity scale scores identify an individual as potentially engaging in an exaggerated or over-reporting response style, scores on collateral measures administered either conjointly or within a brief period of time are also likely to reflect an attempt to present oneself in an overly negative or psychologically dysfunctional manner. Although the results were not as strong as those for the CBIR-OR analyses, individuals suspected of under-reporting on the MMPI-2-RF are also likely to suppress scores on collateral measures, reflecting an attempt to present oneself in an overly positive or psychologically healthy fashion. These findings have both clinical and research implications. Clinically, when the MMPI-2-RF is administered as part of a battery of instruments that includes other self-report scales that lack validity indicators, the current study suggests that the cautions indicated for MMPI-2-RF interpretation should also be considered when interpreting these collateral measures. In addition, the current study suggests that researchers who rely on self-report measures that lack validity scales should consider adding instruments with such scales, such as the MMPI-2-RF, to their designs.
Footnotes
Declaration of Conflicting Interests
The authors declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: Yossef S. Ben-Porath is a paid consultant to the MMPI publisher, the University of Minnesota, and distributor, Pearson. As coauthor of the MMPI-2-RF, Dr. Ben-Porath receives royalties on sales of the test.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported in part by a grant to Yossef S. Ben-Porath from the University of Minnesota Press, publisher of the MMPI-2.
