Abstract
The present study aimed to differentiate pedophilic child sex offenders (CSOs) from nonoffending controls (CTLs), as well as contact from noncontact CSOs. For this purpose, we investigated 21 contact CSOs, 20 noncontact CSOs (child pornography offenders), as well as 21 CTLs on neuropsychological test measures and indirect test measures of sexual interest. Multiple logistic regression models showed that three parameters of indirect tests and two neuropsychological test parameters allowed the differentiation of CSOs from CTLs with a maximum accuracy of 87%. The profile of contact and noncontact CSOs was remarkably similar and the optimal model for this group differentiation had a maximum accuracy of 66%, with slightly increased levels of risk-taking behavior and greater susceptibility for perceptual interference in contact CSOs than in noncontact CSOs. The findings suggest that standardized, objective methods can support the assessment of sexual offenders against children in forensic psychiatry and legal psychology.
Introduction
Pedophilia is a clinical diagnosis for adults who are recurrently and strongly sexually attracted to prepubescent children. The fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (5th ed.; DSM-5; American Psychiatric Association [APA], 2013) differentiates between pedophilic sexual interest and pedophilic disorder. The latter applies if the affected person suffers from feelings of guilt, shame, or anxiety, if the person is otherwise adversely affected, or if the person has acted sexually against minors.
Sexual offenses against children can also be related to traits other than pedophilia, such as psychopathy (Porter et al., 2000; Turner et al., 2014), low intelligence (Lett et al., 2018), or poor response inhibition (Kärgel et al., 2017; Massau et al., 2017), as well as to states, such as alcohol intoxication (Rada, 1976). Furthermore, situational factors, such as access to children, locations to abuse them, as well as the trust and compliance of potential victims may increase the risk of committing sexual offenses against children (Holt & Massey, 2012). Nevertheless, pedophilia has to be considered the most severe risk factor for sexual offenses against children, with an estimated 40%–50% of adult child sex offenders (CSOs) and more than 60% of child pornography offenders being pedophilic (Seto, 2004; Seto et al., 2006).
Research on pedophilia is confronted with the problem that only a few pedophiles would be expected to voluntarily disclose their sexual attraction toward prepubescent children. Even nonoffending pedophilic men are severely stigmatized and are at risk of fierce discrimination (Jahnke et al., 2015). Pedophiles may deny their sexual interest in children—not just because they fear negative social consequences, but also because this interest might conflict with their own self-concept and set of values. Many convicted CSOs deny their offense or minimize their responsibility (Barbaree, 1991; Marshall, 1994). Aside from interviews and questionnaires, such as the Multiphasic Sex Inventory (MSI, Mackaronis et al., 2011; Nichols & Molinder, 2001) or the Explicit Sexual Interest Questionnaire (ESIQ, Banse et al., 2010), how can a forensic psychiatrist or psychologist reveal evidence for the presence of pedophilic interests when these are denied by the patient?
In North America, penile plethysmography (PPG), also labeled as phallometry, is commonly used to investigate the reaction to potentially erotic stimuli—with the aim of determining sexual interest in children. PPG measures blood flow to the penis in reaction to erotic stimuli. It has been shown that this method has fairly good sensitivity and high specificity for identifying pedophiles (Cantor & McPhail, 2015). Moreover, PPG has been used in Canada in hundreds of court decisions sentencing convicted sex offenders (Purcell et al., 2015). Despite its frequent use, phallometry has been criticized for lack of standardization (Marshall & Fernandez, 2003; Purcell et al., 2015), reliability (Marshall & Fernandez, 2003), and temporal stability (Mokros & Habermeyer, 2016; Müller et al., 2014). Moreover, erectile reactions can be voluntarily suppressed (McAnulty & Adams, 1991; Quinsey & Chaplin, 1988), which further questions the validity of the method. In most European countries, ethical concerns are the major reason why PPG is rarely used (Babchishin et al., 2013; Harris et al., 1999).
A less intrusive approach for assessing sexual preferences is to record blood oxygen level dependent (BOLD) responses in fMRI when participants are exposed to sexual stimuli. This method does not record the peripheral physiological response to such stimuli, but the neurocognitive correlates of processing sexual stimuli. Early fMRI studies revealed that pedophiles show increased BOLD responses in the amygdala to images of children, as compared with controls (CTLs) (Sartorius et al., 2008). Moreover, pedophiles show reduced activity to adult sexual stimuli in the hypothalamus and lateral prefrontal cortex, as compared with CTLs (Walter et al., 2007). A previous fMRI study allowed the discrimination of pedophilic from nonpedophilic participants with very high accuracy on the basis of their BOLD responses to sexual stimuli (nude persons, genitals, Ponseti et al., 2012) or faces (Ponseti et al., 2014, 2016). However, these promising findings were obtained in relatively small samples and, therefore, need further validation. Moreover, fMRI recordings and data analysis require the necessary technical equipment, considerable specialized knowledge, and time resources, which makes this technique expensive.
In contrast, so-called indirect test procedures of sexual preference require neither extensive technical equipment nor extensive training. Indirect tests are behavioral, usually computer-administered tests that seek to tackle functions of automated response selection and behavioral control. Automated response selection is fast, effortless, and cannot be easily verbalized (Shiffrin & Schneider, 1977). A classical example for automated response selection is semantic priming, that is when responses are more rapid to targets that were preceded by an associated cue (e.g., “bread” and “butter”) as compared with targets preceded by an unassociated cue (e.g., “bread” and “nurse,” Meyer & Schvaneveldt, 1971). Such a priming task would qualify as an indirect test for semantic associations because the associations are inferred from the reaction time pattern and not directly asked for. For participants, it is difficult to identify the critical information provided by the testing procedure. In pedophilia research, a prominent example of indirect testing is measuring viewing time (VT). In VT tests, participants are instructed to rate the sexual attractiveness of visual stimuli. For participants, it appears that the ratings of pictures provide the critical information, whereas, in fact, the investigator’s primary interest is the VT for categories of visual stimuli, with longer VTs for a particular category indicating stronger sexual interest (Imhoff et al., 2010; Quinsey et al., 1996; Zamansky, 1956). A recent meta-analysis showed that VT allows differentiation of CSOs and nonoffenders, but also of CSOs and other kinds of (sexual) offenders (Schmidt et al., 2017). For VTs, there is some convergent validity with PPG and self-report measures, suggesting that increased VTs to pictures of children relative to VTs to pictures of adults indeed reflect pedophilic interests (Schmidt et al., 2017).
The Implicit Association Test (IAT), developed by Greenwald et al. (1998), represents another example of indirect tests. The IAT measures the strength of automatic associations between a category (e.g., adult vs. child) and an attribute (e.g., sexy vs. nonsexy) by recording the response speed in a sorting task (Babchishin et al., 2013). Responses are presumed to be faster when the category and attribute are strongly associated in memory and mapped onto the same response (congruent condition) compared with when the category and attribute require the same response, but are only weakly associated (incongruent condition, Healy et al., 2015). It is critical that different category–attribute pairings represent the congruent condition in teleiophilic individuals (individuals sexually attracted to adults) and pedophilic individuals. In their meta-analysis, Babchishin et al. (2013) showed that IAT scores make it possible to distinguish between CSOs and nonoffenders, but with a slightly lower discriminative power than VTs. IAT measures are consistently related to VTs and self-reports, but evidence for convergent validity with PPG is unfortunately still lacking (Babchishin et al., 2013).
Another kind of indirect test relies on automatic attentional shifts toward sexually attractive objects (Singer, 1984). The Choice Reaction Time (CRT) task represents a prominent example of this kind of test, which uses sexual stimuli as distractors to determine sexual preference. In the CRT task, participants have to identify as quickly as possible the location of a visual target that is presented together with a sexual stimulus. The task takes advantage of the effect that sexually attractive imagery induces a sexual content-related delay to subsequent cognitive processes (Dombert et al., 2017; Geer & Bellard, 1996; Gress & Laws, 2009; Mokros et al., 2010). The relative preference for certain kinds of sexual stimuli is estimated by measuring the time it takes a participant to release attention from the image to perform the primary target detection task (for a different example of attention-based indirect tests, see Jordan et al., 2016).
Aside from using test procedures for identifying pedophilic interests, neuropsychological and neuroimaging studies have been designed to identify personality, cognitive, and neuroanatomical factors contributing to the development of pedophilia. Such studies on pedophilic men investigated their impairments of cognitive control functions (Habermeyer et al., 2013; Schiffer & Vonlaufen, 2011; Suchy et al., 2009), structural brain abnormalities (Cantor et al., 2008; Schiffer et al., 2007), lower IQ, and a higher rate of left-handedness (Blanchard et al., 2007; Cantor et al., 2004), lower processing speed (Suchy et al., 2009), or personality profiles (Cohen et al., 2002; Wilson & Cox, 1983). Some of these findings, such as the structural brain abnormalities, lower IQ, and a higher rate of left-handedness in pedophiles, support the notion that pedophilia might represent a neurodevelopmental disorder. In contrast, impairments in self-regulation have been considered as risk factor for sexual offenses and recidivism in general (Hanson & Harris, 2001; Ward & Beech, 2006). Recent studies have shown that pedophiles with a history of child sexual offenses exhibit executive dysfunctions and poorer intelligence, as compared with pedophiles without such a history (Kärgel et al., 2017; Lett et al., 2018; Massau et al., 2017), which suggests that these deficits are also risk factors for child sexual offenses in pedophiles.
Likewise, one might assume that contact CSOs and noncontact CSOs (“child pornography offenders”) represent two distinct groups of offenders that vary in their neuropsychological profile and pedophile interest: A large percentage of CSOs commit both contact as well as noncontact offenses, with more than half of the child pornography offenders also admitting contact sexual offenses (Dombert et al., 2016; Seto et al., 2011). This rate also implies, however, that a substantial portion of child pornography offenders do not commit contact offenses. This finding has spawned a debate on whether these child pornography-exclusive offenders represent a distinct group of sexual offenders (Babchishin et al., 2011). A more recent meta-analysis of Babchishin et al. (2015) suggested that child pornography-exclusive offenders show more sexual deviance (pedophilia) than contact CSOs. There are still very few studies, though, that actually compared contact CSOs and child pornography offenders, in particular by means of indirect tests. As one rare exception, Schmidt et al. (2014) found no differences between extrafamilial contact CSOs and noncontact CSOs in an aggregated measure of VTs and IAT scores, and tentatively suggested that the two groups show little variance in their sexual preference for children (see also Hempel et al., 2013; Roche et al., 2012). As yet, no study has directly compared contact CSOs and noncontact CSOs by both indirect and neuropsychological tests. Such combined testing would provide knowledge about the extent to which child pornography-exclusive offenders are a distinct group of sexual offenders, as proposed by Babchishin et al. (2015).
In the current study, we investigated pedophilic contact CSOs, noncontact CSOs with no history of child sexual assaults, and a nonoffending heterosexual control group. All participants took part in four indirect tests (VT, IAT, CRT task, and a semantic misattribution task) and underwent neuropsychological testing. Parameters were extracted from both kinds of testing and entered into multiple logistic regression analyses. In this way, we sought to reveal a sparse set of measurable and easily accessible indicators of pedophilic CSOs in general (contact and noncontact CSOs vs. controls) and of contact CSOs in particular (contact vs. noncontact CSOs). One working hypothesis was that the combination of indirect and neuropsychological test parameters would allow optimal identification of pedophilic CSOs versus nonoffending CTLs (Habermeyer et al., 2013; Schiffer & Vonlaufen, 2011; Suchy et al., 2009). More specifically, we presumed that neuropsychological tests alone would already allow fairly good identification of contact CSOs versus noncontact CSOs, with contact CSOs being characterized in particular by poorer self-regulation than noncontact CSOs (Babchishin et al., 2011). The data reported herein were collected within the framework of the Basel Measurable Indicators of Pedophilic Sex Offenders (MIPS) study.
Method
Participants
Data on three groups of adult male individuals were obtained: contact CSOs who had either been convicted of or had admitted to a contact sexual offense against a child (N = 21); noncontact CSOs who had been convicted of or admitted accessing, storing, or producing sexual material depicting children (N = 20); and healthy nonoffender CTLs (N = 21). CSOs were recruited among the outpatients and, to a lesser extent, among the inpatients of forensic-psychiatric hospitals in Switzerland and Germany. CTLs were recruited by advertisements in two local newspapers. All study participants received CHF 400 as reimbursement for participation.
Inclusion criteria for all participants were male gender, between 18 and 55 years of age, an IQ of 70 or above, and a sufficient command of written and spoken German, as well as unrestricted legal capacity to consent. For contact CSOs, a diagnosis of pedophilic disorder was mandatory, as well as a low risk for reoffending due to security precautions. The latter was based on a legal assessment of the participating CSOs. Only heterosexual noncontact CSOs and CTLs were included, to limit sample heterogeneity. Due to substantial recruitment difficulties, we could not apply this inclusion criterion for contact CSOs.
For all participants, exclusion criteria were acute diagnoses of psychiatric Axis I disorders according to the criteria of the Diagnostic and Statistical Manual of Mental Disorders (4th ed., text rev.; DSM-IV-TR; APA, 2000), physical health conditions (e.g., severe head trauma, other neurological conditions or systemic diseases possibly affecting a valid cognitive and clinical assessment), the use of medication potentially impairing performance in the tests conducted, substance abuse/dependency, and anti-androgenic therapy. Finally, for CTLs, a sexual preference for children and a previous conviction for sexual or violent offenses were incompatible with study participation. To counteract the potential risk of cognitive response bias, participants were not fully informed about the study aims. Instead, a cover story was used, focusing on male social gender roles. Participants were informed about the actual study aims after the administration of the indirect tests on the second day of testing (Supplementary material Table S1).
The study was conducted in accordance with the Declaration of Helsinki (2001) and approved by the local Ethics Committee (EK: 256/12). Written informed consent was obtained from all participants. All offenders were informed that all study data were stored anonymously and, thus, could not have any repercussions for their court case.
Materials and Procedures
Neuropsychological measures
A neuropsychological testing battery was administered to assess various cognitive functions (for a complete list of neuropsychological tests, see Supplementary material Table S2). For the purpose of the current study, we restricted the analysis to eight variables derived from seven tests, each quantifying conceptually different cognitive domains:
(1) Fluid intelligence (“IQF,” ability for abstract reasoning, measured by Scale 3 of the German intelligence test Leistungsprüfsystem [LPS], Horn, 1983);
(2) Crystallized intelligence (“IQC,” education-dependent, semantic knowledge, measured by the total score in the Mehrfachwahl-Wortschatz-Test Version B [MWT-B], Lehrl, 1999);
(3) Alerting (attention function: ability to achieve and to maintain an alert state; measured by the decrease of reaction times by presenting unspecific visual cues in the Attention Network Task [ANT], Fan et al., 2002);
(4) Orienting (attention function: ability to select specific information from sensory input; measured by the decrease of reaction times by presenting visual spatial cues in the ANT);
(5) Risk taking (willingness for taking risks, measured by the Cambridge Gambling Task [CGT], Rogers, 1999);
(6) Resistance to interference (“Stroop,” executive control function: suppression of overlearned behavior when instructed; measured by a German version of the Stroop Color and Word Test, Bäumler, 1985);
(7) Episodic memory (ability to encode and to retrieve recently encountered information; measured by the long-delay free recall performance obtained in the German version of California Verbal Learning Test [CVLT], Niemann et al., 2008);
(8) Working memory (WM) errors (failures in the ability to temporarily hold and use information for processing; measured by the number of omissions in a two-back task). In the two-back task, 99 digits from 1 to 9 were presented with an interstimulus interval of 3,000 ms (33 targets).
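The omission count in item (8) can be illustrated with a minimal sketch. It assumes that a target is any digit identical to the digit shown two positions earlier and that a button press is recorded per trial; the function name and the data layout are hypothetical and are not taken from the study software.

```python
def two_back_omissions(digits, responses):
    """Count missed targets in a two-back task.

    digits: sequence of presented digits.
    responses: sequence of booleans, True where the participant pressed.
    Assumption: a trial is a target when its digit equals the digit
    presented two positions earlier.
    """
    omissions = 0
    for i in range(2, len(digits)):
        is_target = digits[i] == digits[i - 2]
        if is_target and not responses[i]:
            omissions += 1
    return omissions


# Toy sequence with three targets (positions 2, 4, and 7); one is missed.
digits = [3, 7, 3, 5, 3, 9, 9, 9]
responses = [False, False, True, False, False, False, False, True]
print(two_back_omissions(digits, responses))
```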
Indirect tests
Four indirect tests on sexual preference were conducted:
(1) IAT: The IAT allows the estimation of the strength of implicit attitudes by comparing carry-over effects when shifting concept/response contingencies from congruent to incongruent conditions (for teleiophilic individuals, when a button originally assigned to both the concept “adult” and attribute “sexual” is used for the concept “child” and attribute “sexual”). In the current study, the concept of maturation (adult vs. child) was represented by five Not-Real-People (NRP) pictures (Pacific Behavioral Assessment Corporation, 2004) of children (Tanner Stage 1, “T1”) and adults (Tanner Stage 5, “T5”) of either gender, all clad in swimwear. The set of attributes was subdivided into five sexual and five neutral ones (for details see Supplementary materials).
There were five blocks: In the first block, all 20 NRP images (50% female) had to be classified by pressing a button with the left or right index finger as “adult” or “child” (k = 20 trials). In the second block, all 10 attributes had to be classified by pressing a button as “sexual” or “nonsexual.” Each attribute was presented twice (k = 20 trials). In the third block, trials of the first two blocks were presented in alternating order, with the same task instruction. Responses for “adult” and “sexual” were mapped onto the same response (k = 120 trials). For teleiophilic individuals, this represented the congruent condition. In the fourth block, the same task and material as in the first block (images) was used, but the response sides were swapped (k = 20 trials). In the fifth block, trials of the first two blocks were presented again in alternating order, with the same task instruction as in the third block, except that responses for “adult” and “sexual” now required different responses (k = 120 trials). For teleiophilic individuals, this represented the incongruent condition. The concepts/attributes were presented and scored according to the improved algorithm of Greenwald et al. (2003). The resulting outcome measure yielded a single latency-based effect size (dIAT), which was presumed to indicate both the degree and direction of the sexual preference for either children or adults.
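The latency-based effect size can be sketched as follows. This is a simplified illustration of the core idea behind the Greenwald et al. (2003) scoring (difference of block means divided by the standard deviation of all trials from both blocks); it omits the trial exclusions and error penalties of the full algorithm, and the function name is hypothetical.

```python
from statistics import mean, stdev


def iat_d(congruent_rts, incongruent_rts):
    """Simplified IAT effect size.

    congruent_rts: latencies (ms) from the block in which "adult" and
        "sexual" share a response (block 3 in the text).
    incongruent_rts: latencies (ms) from the block in which they require
        different responses (block 5 in the text).
    A positive value means slower responses in the incongruent block.
    """
    inclusive_sd = stdev(list(congruent_rts) + list(incongruent_rts))
    return (mean(incongruent_rts) - mean(congruent_rts)) / inclusive_sd


# Toy latencies: slower responses in the incongruent block give d > 0.
print(iat_d([600, 650, 700], [800, 850, 900]) > 0)
```

With this sign convention, a positive value corresponds to a stronger adult/sexual association, which matches the interpretation of positive effect sizes given for the indirect tests.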
(2) VT: In the VT paradigm, participants were requested to rate NRP images of a balanced NRP picture set on a 6-point Likert-type scale from “highly sexually and aesthetically attractive” to “highly sexually and aesthetically unattractive.” All 80 female and male NRP images (T1 to T5), either in bathing suits or nude, were presented. NRP images were displayed until the participants rated them and another 1,000 ms passed. There was no time limit for providing the rating. After a delay of 1,500 ms (blank screen), a new image was shown. Each image was presented only once. To adjust for the influence of gender on the outcome measure, we used the maximum median values in VT separately for Tanner Stages T1–T3 and T4 and T5, irrespective of the gender of the NRP individuals depicted (Blanchard et al., 2001; Mokros et al., 2013). As the critical measure, we extracted the effect size for longer VTs for T4 and T5, as compared with VTs for T1–T3: $d_{VT} = (\max VT_{T4,T5} - \max VT_{T1\text{–}T3}) / SD_{pooled}$, where $SD_{pooled}$ is the pooled SD of $\max VT_{T4,T5}$ and $\max VT_{T1\text{–}T3}$. The attractiveness ratings were not used as predictors.
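A minimal sketch of this effect size follows. It assumes that viewing times are grouped by sex and Tanner stage, that the cell with the largest median is selected within each age band, and that the pooled SD is computed from the trial-level values of the two selected cells. The text does not spell out the pooling, so that part in particular is an assumption, as are all names used here.

```python
from math import sqrt
from statistics import median, variance

CHILD_STAGES = (1, 2, 3)   # Tanner Stages T1-T3
ADULT_STAGES = (4, 5)      # Tanner Stages T4 and T5


def max_median_cell(vt, stages):
    """Return the (sex, stage) key with the largest median VT in a band."""
    cells = [key for key in vt if key[1] in stages]
    return max(cells, key=lambda key: median(vt[key]))


def d_vt(vt):
    """Effect size for longer VTs to adult than to child images.

    vt: dict mapping (sex, tanner_stage) to a list of viewing times.
    Assumption: pooled SD is taken over the two selected cells.
    """
    adult = vt[max_median_cell(vt, ADULT_STAGES)]
    child = vt[max_median_cell(vt, CHILD_STAGES)]
    pooled_var = (
        (len(adult) - 1) * variance(adult) + (len(child) - 1) * variance(child)
    ) / (len(adult) + len(child) - 2)
    return (median(adult) - median(child)) / sqrt(pooled_var)


# Toy data (seconds): longest viewing times for female T5 images.
example = {
    ("f", 5): [3.0, 3.4, 3.8],
    ("m", 5): [1.0, 1.2, 1.4],
    ("f", 1): [1.0, 1.1, 1.2],
    ("m", 2): [0.9, 1.0, 1.1],
}
print(d_vt(example) > 0)
```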
(3) Semantic Misattribution Procedure (SMP): The SMP makes use of conceptual priming effects on the subsequent semantic evaluation of Chinese ideographs that are generally meaningless for Europeans (Blaison et al., 2012; Murphy & Zajonc, 1993). The capacity of the SMP to draw inferences on sexual preference has been demonstrated in previous studies by its application to groups of homo- and heterosexual individuals (Imhoff et al., 2011). We adopted the approach described by Imhoff et al. (2011) and modified the SMP by varying the temporal masking conditions of the prime stimulus to obtain subliminal and supraliminal prime trials (for details see Supplementary material). As prime stimuli, we selected 12 different NRP pictures, showing nude male and female individuals of Tanner Stages 1, 3, and 5.
Each trial consisted of a shortly presented prime with backward masking, followed by the presentation of a Chinese ideograph. Participants were instructed to ignore the prime and to respond to the Chinese ideograph only. By pressing a designated button, participants indicated whether a particular ideograph presumably had a sexual or nonsexual meaning. After the participant’s response, a new trial commenced. We hypothesized that the Chinese ideographs would be more likely to be assessed as having a sexual meaning after presenting a preferred sexual prime. That is, for controls, sexual attribution would be more likely after primes of Tanner Stage 5 than after primes of Tanner Stage 1 or 3, whereas pedophile sex offenders would show the opposite pattern. Analogously to the analysis of VTs (Mokros et al., 2013), we selected either female or male primes, depending on which of the two led to more sexual meaning ratings for the ideograph, for the Tanner Stages T1 and T3, as well as T5 separately. The critical measure we extracted was the difference of the likelihood for providing a sexual attribution (“rsp.rate”) after a Tanner Stage 5 prime and after Tanner Stage 1 or 3 primes ($d_{SMP} = \max \text{rsp.rate}_{T5} - \max \text{rsp.rate}_{T1,T3}$), separately for subliminal and supraliminal presentations.
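The difference score can be sketched as below, assuming that responses are tallied per prime sex and Tanner stage as a pair of counts (sexual attributions, trials). The data layout and function name are hypothetical, and the function would be applied once to the subliminal and once to the supraliminal trials.

```python
def d_smp(counts):
    """Difference in sexual-attribution rates after adult vs. child primes.

    counts: dict mapping (sex, tanner_stage) to (n_sexual, n_trials).
    For each age band, the prime sex with the higher rate is used.
    """
    rate = {key: n_sexual / n_trials
            for key, (n_sexual, n_trials) in counts.items()}
    adult = max(r for (sex, stage), r in rate.items() if stage == 5)
    child = max(r for (sex, stage), r in rate.items() if stage in (1, 3))
    return adult - child


# Toy counts: 12/20 sexual attributions after female T5 primes,
# at most 7/20 after any T1 or T3 prime.
counts = {
    ("f", 5): (12, 20), ("m", 5): (6, 20),
    ("f", 1): (5, 20), ("m", 1): (4, 20),
    ("f", 3): (7, 20), ("m", 3): (5, 20),
}
print(round(d_smp(counts), 2))
```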
(4) CRT task: Female and male NRP images of all Tanner stages, either in bathing suits or nude, were presented at the center of a screen. Participants were instructed to press one of five buttons when an orange dot appeared in one of the four corners or the center of the image, corresponding to the position of the dot (Mokros et al., 2010; Wright & Adams, 1994). Images were displayed until the participants pressed a response button (for further technical details see Supplementary material). The RTs were averaged for all combinations of sex by Tanner stage, ignoring the dressed/nude factor. Analogously to the VT procedure, the factor sex was eliminated by choosing the RT values of the sex with longer RTs for each Tanner stage. We hypothesized that RTs should be delayed for preferred sexual stimuli and, thus, CTLs would show RT delay for (female) T4 and T5 stimuli as distractors, as compared with T1–T3 stimuli, whereas pedophile sex offenders would show the opposite behavior. As for VTs, we used the maximum median values in RT separately for Tanner Stages T1–T3 and T4 and T5, irrespective of the gender of a depicted NRP individual (Mokros et al., 2013). The extracted critical measure was the effect size for longer RTs to the dots presented with T4 and T5 distractors, as compared with RTs with T1–T3 distractors: $d_{CRT} = (\max RT_{T4,T5} - \max RT_{T1\text{–}T3}) / SD_{pooled}$, where $SD_{pooled}$ is the pooled SD of $\max RT_{T4,T5}$ and $\max RT_{T1\text{–}T3}$.
Statistical Classification
The aim of our statistical classification was twofold: (a) we sought to identify predictors for pedophilic versus teleiophilic sexual preference by comparing contact and noncontact CSOs with controls; (b) we sought to identify characteristics that would differentiate between contact versus noncontact offenders within CSOs. Both analyses aimed at identifying a sparse set of variables that would allow optimal differentiation between groups. For this purpose, we used a multiple logistic regression analysis, with five variables from the indirect test procedures (dIAT, dVT, dCRT, and dSMP, the latter for subliminal and supraliminal cues separately), as well as the eight variables from neuropsychological testing described above (IQF, IQC, Alerting, Orienting, Risk taking, Stroop, Episodic memory, and WM errors) as predictors, and group membership as dependent variable. To control for a potential influence of age on the findings, predictors showing a correlation with age (dVT, IQC) were adjusted for the impact of age within each group prior to the model selection process. Aside from age, the impact of handedness as measured by the Edinburgh Handedness Inventory (Oldfield, 1971) on predictors was checked by univariate regression analyses before running the multiple logistic regression analysis. Handedness did not modulate any of the predictors. The inclusion of age and handedness as additional predictors led to the identification of optimal models with the same predictors as reported below. For the sake of clarity and brevity, we report only the results with the age-adjusted predictors dVT and IQC. All descriptive tables, however, show unadjusted values.
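The text does not specify how the age adjustment was carried out. One common approach, shown here purely as an assumed illustration, is to remove the linear effect of age by ordinary least squares within each group while keeping the group mean; the function name is hypothetical.

```python
from statistics import mean


def adjust_for_age(values, ages):
    """Remove the linear effect of age from a predictor within one group.

    Assumption: simple OLS regression of the predictor on age; the fitted
    age trend is subtracted so that the group mean is preserved.
    """
    mean_age, mean_value = mean(ages), mean(values)
    sxx = sum((a - mean_age) ** 2 for a in ages)
    sxy = sum((a - mean_age) * (v - mean_value) for a, v in zip(ages, values))
    slope = sxy / sxx
    return [v - slope * (a - mean_age) for a, v in zip(ages, values)]


# Toy data for one group: the predictor rises with age before adjustment.
ages = [20, 30, 40, 50]
values = [1.0, 2.0, 2.5, 4.0]
adjusted = adjust_for_age(values, ages)
print([round(v, 3) for v in adjusted])
```

After this step the adjusted predictor is uncorrelated with age within the group, so age can no longer drive its contribution to the classification.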
For both group classification analyses, the nested models for all possible linear combinations of the 13 predictors (8,191 in total) were fitted by means of a multiple logistic regression analysis and the Bayesian Information Criterion (BIC) was calculated for each model. The BIC quantifies the trade-off between model fit and model complexity. It penalizes increments in model fit that are gained by including more predictors, and thus protects against overfitting (Burnham & Anderson, 2004). The minimal BIC in a group of possible models served as criterion for selecting optimal models. To assess the contribution of indirect tests and neuropsychological tests, the model with minimal BIC was identified: (a) for models exclusively based on predictors from indirect tests (“indirect measures-only model”); (b) for models exclusively based on the neuropsychological predictors (“neuropsychological measures-only model”); and (c) for models including both types of predictors (“combined model”). To evaluate the prediction performance of the models, the following parameters were derived via receiver operating characteristic (ROC) analyses: area under the ROC curve (AUC), as well as the classification accuracy (acc), when weighting sensitivity and specificity equally (“Newton criterion”). AUC values > 0.8 indicate large effect sizes (Rice & Harris, 2005). The classification accuracy varies when either the sensitivity is optimized (with some resulting loss of specificity) or specificity is optimized (with some resulting loss of sensitivity).
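The all-subsets search can be sketched as follows. With 13 predictors there are 2^13 − 1 = 8,191 non-empty subsets, and each is scored with BIC = k ln(n) − 2 ln(L), where k is the number of estimated parameters, n the number of observations, and L the maximized likelihood. The predictor labels and the `fit_loglik` callback are hypothetical placeholders; the actual model fitting (bias-reduced logistic regression in the study) is not reproduced here.

```python
from itertools import combinations
from math import log

# Hypothetical labels for the 13 predictors named in the text.
PREDICTORS = [
    "dIAT", "dVT", "dCRT", "dSMP_sub", "dSMP_supra",
    "IQF", "IQC", "Alerting", "Orienting", "RiskTaking",
    "Stroop", "EpisodicMemory", "WMErrors",
]


def all_subsets(names):
    """Yield every non-empty subset of the predictor names."""
    for size in range(1, len(names) + 1):
        yield from combinations(names, size)


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion; smaller is better."""
    return n_params * log(n_obs) - 2.0 * log_likelihood


def select_min_bic(names, n_obs, fit_loglik):
    """Return (bic, subset) for the subset with minimal BIC.

    fit_loglik: hypothetical callback that fits a logistic regression on
    the given subset and returns its maximized log-likelihood.
    """
    best = None
    for subset in all_subsets(names):
        n_params = len(subset) + 1  # one slope per predictor plus intercept
        score = bic(fit_loglik(subset), n_params, n_obs)
        if best is None or score < best[0]:
            best = (score, subset)
    return best


print(sum(1 for _ in all_subsets(PREDICTORS)))  # number of candidate models
```

Restricting `names` to the five indirect measures or the eight neuropsychological measures would give the two measures-only searches described above.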
Statistical and numerical analyses were performed by using the R Environment for Statistical Computing Version 3.5.1 (R Core Team, 2015). Univariate group comparisons were performed using Kruskal–Wallis tests for ranked data and Fisher’s exact test for count data. The descriptive statistics and univariate group comparisons are presented first in the Results section in order to allow an evaluation of the group differences underlying the subsequent group classification. The R package compareGroups version 3.4.0 was used for univariate comparisons and sample descriptives. For multiple logistic regression modeling, the package brglm version 0.6.1 was used, because it implements a bias-reduction method for binomial-response generalized linear models following the approach of Firth (1993), which provides an improvement of computational stability over the classical approach of maximum likelihood estimation (Badi, 2017). The ROC analysis was performed using the package ROCR version 1.0-7. Standardized regression coefficients for the logistic regression models were derived following the variance-based approach of Menard (1995), using pseudo R2 values (McFadden, 1974). The standardized regression coefficients of the reported optimal models can be found in Supplementary Material (Table S3).
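The two ROC-derived quantities can be illustrated with a small sketch that does not depend on ROCR. The AUC is computed as the proportion of offender/control pairs in which the offender receives the higher model score (ties count one half), and the cut-off is the score that maximizes the mean of sensitivity and specificity, which is equivalent to maximizing sensitivity + specificity − 1. Function names and the toy scores are hypothetical.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison of scores."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def best_equal_weight_cutoff(scores_pos, scores_neg):
    """Cut-off that weights sensitivity and specificity equally.

    Returns (balanced_accuracy, cutoff, sensitivity, specificity); a case
    is classified as positive when its score is >= cutoff. On ties, the
    lowest such cut-off is kept.
    """
    best = None
    for cut in sorted(set(scores_pos) | set(scores_neg)):
        sens = sum(s >= cut for s in scores_pos) / len(scores_pos)
        spec = sum(s < cut for s in scores_neg) / len(scores_neg)
        balanced = (sens + spec) / 2
        if best is None or balanced > best[0]:
            best = (balanced, cut, sens, spec)
    return best


# Toy predicted probabilities for four offenders and four controls.
pos = [0.9, 0.8, 0.7, 0.4]
neg = [0.6, 0.3, 0.2, 0.1]
print(auc(pos, neg))
print(best_equal_weight_cutoff(pos, neg))
```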
Results
Sample Characteristics
The three samples were similar in age, education, job prestige, and percentage of Swiss nationals, as well as in verbal and nonverbal intelligence (Table 1). The handedness of the combined sample of CSOs did not differ from CTLs (Kruskal–Wallis test χ2 = 0.30, p = .582), but within the CSOs there was a higher rate of right-handedness in noncontact CSOs than in contact CSOs (Kruskal–Wallis test χ2 = 7.38, p = .007).
Demographic Characteristics of the Three Study Samples (Mean Values and Standard Deviations, SDs).
Note. CTLs = controls; CSOs = child sex offenders. p values were obtained by calculating Kruskal–Wallis test for rank data or Fisher’s Exact Test for count data.
Job prestige was defined as occupation translated into an occupational prestige score (Featherman & Stevens, 1982).
Handedness was determined by Edinburgh Handedness Scale (Oldfield, 1971).
Classification of CSOs Versus CTLs
To determine variables that allowed the correct classification of participants with pedophilic versus teleiophilic preferences, the combined CSO samples were contrasted with the sample of CTLs. Descriptive data of the two samples and the results of the univariate comparisons between these samples are presented in Table 2.
CTLs and CSOs: Indirect and Neuropsychological Test Data.
Note. CTLs = controls; CSOs = child sex offenders; IAT = Implicit Association Test; SMP = semantic misattribution procedure; CRT = choice reaction time; IQF = fluid intelligence; IQC = crystallized intelligence; WM = working memory.
The mean values and standard deviations (SD) of variables entered into the group classification of (contact and noncontact) CSOs versus CTLs. Data were compared by Kruskal–Wallis tests between the two samples. H-statistics and effect sizes (d) for these univariate contrasts are provided. The indirect test measures (dVT, dIAT, dCRT, dSMP) refer to how adult- versus child-related stimuli were processed in these tests. For the SMP, we differentiated between supraliminal and subliminal cue presentation (dSMP Supra and dSMP Sub). The neuropsychological variables were presumed to reflect fluid intelligence (“IQF”), crystallized intelligence (“IQC”), alerting, orienting, risk taking, resistance to interference (“Stroop”), episodic memory, and working memory (WM) errors.
For all indirect tests, a positive effect size (d) indicates sexual preferences for adult stimuli; a negative value indicates a preference for child stimuli. CSOs and CTLs differed in three indirect tests (VT, CRT task, IAT) in the expected direction: CSOs showed effect sizes close to zero in these tests, which means they had similar VTs for pictures of children and adults, they were similarly distracted by pictures of children and adults in the CRT task, and in the IAT the switch between congruent and incongruent condition had little impact on their RTs. Relative to CSOs, CTLs showed longer VTs for pictures of adults than for pictures of children, were more distracted by pictures of adults than by pictures of children in the CRT task, and the switch from the congruent to the incongruent condition had a pronounced impact on their RTs in the IAT. Group differences in the SMP were not significant.
The neuropsychological profile showed some minor group differences: CSOs committed significantly more WM errors than CTLs, but the two samples did not differ in other tests. The ROC curves of the group classification analyses are illustrated in Figure 1 and the results of prediction performance parameters are summarized in Table 3 for each of the three models.

Figure 1. ROC curves for the classification of CSOs versus CTLs.
Table 3. Performance Parameters of the Optimal Classification Models.
Note. AUC = area under the ROC curve; CSOs = child sex offenders; CTLs = controls.
The optimal classification models for differentiating child sex offenders (CSOs) and controls (CTLs), as well as noncontact CSOs (ncCSOs) and contact CSOs (cCSOs). For ncCSOs versus cCSOs, the optimal combined model did not add any improvement over the neuropsychological measures-only model and is, therefore, not reported. Accuracy measures were obtained for the Youden criterion (equal weighting of sensitivity and specificity) and adjusted for group size. The true positive rate corresponds to the sensitivity and the true negative rate to the specificity at this cut-off point.
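The selection of a cut-off point with equal weighting of sensitivity and specificity can be sketched as follows. This minimal Python example maximizes the Youden index J = sensitivity + specificity − 1 over all observed scores. The function name and the example data are hypothetical, and reading "adjusted for group size" as balanced accuracy (the mean of sensitivity and specificity) is an assumption.

```python
def youden_cutoff(scores, labels):
    """Return (cutoff, sensitivity, specificity, balanced_accuracy) maximizing J.

    scores: predicted probabilities; labels: 1 = positive class, 0 = negative.
    A case is classified as positive when its score is >= cutoff.
    If several cut-offs share the maximum J, the lowest one is returned.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for cut in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cut)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cut)
        sens, spec = tp / n_pos, tn / n_neg
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, cut, sens, spec)
    _, cut, sens, spec = best
    # Balanced accuracy is one way of adjusting for unequal group sizes (assumed).
    return cut, sens, spec, (sens + spec) / 2


# Hypothetical predicted probabilities (not study data).
example_scores = [0.1, 0.3, 0.35, 0.6, 0.4, 0.7, 0.8, 0.9]
example_labels = [0, 0, 0, 0, 1, 1, 1, 1]
result = youden_cutoff(example_scores, example_labels)
# result == (0.4, 1.0, 0.75, 0.875)
```

Here, the sensitivity corresponds to the true positive rate and the specificity to the true negative rate at the selected cut-off point.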
Indirect measures-only model: The logistic regression procedure on the basis of indirect tests identified an optimal model of three variables (BIC = 67.06) that classified sexual preference (CSOs vs. CTLs) with an AUC of 0.88. These three predictors were (a) the effect size of the CRT task (dCRT), (b) the age-adjusted effect size of VT (dVT), and (c) the IAT effect size (dIAT). The indicators are listed in descending order of their relative contributions to the classification (Supplementary material Table S3). Information on the test reliabilities of the indirect measures, as assessed in the current study, can be found in Supplementary materials as well.
Neuropsychological measures-only model: The optimal model from the neuropsychological test domain (BIC = 80.14), comprising (a) age-adjusted crystallized intelligence (IQC), (b) WM errors, and (c) Orienting, did not perform as well as the indirect measures-only model and resulted in an AUC = 0.78. In particular, the specificity was relatively poor. CSOs committed more WM errors, but surprisingly benefited more strongly from orienting spatial cues than CTLs and scored somewhat higher in the test for crystallized IQ.
Combined model: For models including predictors from both indirect and neuropsychological tests, the one with the lowest BIC of 66.46 was found to be largely a combination of the identified optimal models of the respective predictor domains and included: (a) dCRT, (b) dVT, (c) dIAT, (d) Orienting, and (e) WM errors. Age-adjusted IQC was no longer part of the model and WM errors reverted to being the least influential predictor. By the inclusion of the two neuropsychological tests, the AUC increased to 0.92, as compared with 0.88 for the optimal indirect measures-only model. The ROC analysis performed on the optimal combined model showed a classification accuracy of 0.87, when weighting sensitivity and specificity equally.
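The BIC-based comparison of the three models can be illustrated with a short Python sketch. The function `bic` uses the standard definition BIC = k · ln(n) − 2 · ln(L), where k is the number of estimated parameters, n the number of observations, and L the maximized likelihood. The function name and the log-likelihood passed in the example call are hypothetical; the three BIC values are those reported above.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion; lower values indicate the preferred model."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


# BIC values as reported for the CSOs versus CTLs classification.
reported_bic = {
    "indirect measures only": 67.06,
    "neuropsychological measures only": 80.14,
    "combined": 66.46,
}
best_model = min(reported_bic, key=reported_bic.get)  # "combined"

# Hypothetical call: three predictors plus intercept, 62 participants,
# and an invented log-likelihood of -20.0 (not a value from the study).
example_bic = bic(log_likelihood=-20.0, n_params=4, n_obs=62)
```

Because the penalty term grows with the number of parameters, an additional predictor lowers the BIC only if it improves the log-likelihood by more than ln(n)/2.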
Classification of Contact Versus Noncontact CSOs
The second analysis aimed at identifying variables for correctly classifying contact CSOs within the offender group. Group characteristics regarding the predictor variables are displayed in Table 4. The two samples differed in two of the eight neuropsychological variables, but not in indirect tests. Univariate comparisons showed that contact CSOs took higher risks in the Cambridge Gambling Task (Risk taking) and were more susceptible to interference in the Stroop task (Stroop) than noncontact CSOs. The ROC curves of the group classification analyses are illustrated in Figure 2 and the results of prediction performance parameters are summarized in Table 3.
Table 4. Contact Versus Noncontact CSOs: Indirect and Neuropsychological Test Data.
Note. CTLs = controls; CSOs = child sex offenders; IAT = Implicit Association Test; SMP = semantic misattribution procedure; CRT = choice reaction time; IQF = fluid intelligence; IQC = crystallized intelligence; WM = working memory.
The mean and SD values of predictor variables eligible for classifying noncontact CSOs versus contact CSOs. Data were compared by Kruskal–Wallis tests between the two samples. H-statistics and effect sizes (d) for these univariate contrasts are provided. The two samples varied only in two neuropsychological variables (Risk taking and Stroop).

Figure 2. ROC curve for the selected model (solid line) discriminating noncontact CSOs from contact CSOs by the single predictor CGTRisk (Risk taking in the Cambridge Gambling Task).
Indirect measures-only model: In the multiple regression analysis, the BIC-minimized model (BIC = 62.58) of the indirect test domain included only a single predictor, namely the effect size of the CRT task (dCRT). As apparent from Table 4, this predictor did not significantly differ between contact and noncontact CSOs in the univariate test. Accordingly, the AUC achieved in the ROC analysis was comparatively low (AUC = 0.56), suggesting that neither indirect tests in general nor the CRT task in particular were suited for the differentiation of contact versus noncontact CSOs.
Neuropsychological measures-only model: For the domain of neuropsychological tests, again a model with a single predictor, namely Risk taking, was selected (BIC = 57.35). With an AUC of 0.71, it reached a strong effect size (Rice & Harris, 2005). However, the classification accuracy was relatively low.
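To relate AUC values to conventional effect size benchmarks, a common conversion can be sketched in Python. It assumes normally distributed scores with equal variances in both groups, in which case AUC = Φ(d / √2), with Φ denoting the standard normal distribution function. The function names are hypothetical, and the distributional assumption is not verified for the present data.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, SD 1)


def auc_from_d(d):
    """AUC implied by a standardized mean difference d (equal-variance normal model)."""
    return _STD_NORMAL.cdf(d / sqrt(2))


def d_from_auc(auc):
    """Standardized mean difference implied by an AUC (inverse of auc_from_d)."""
    return sqrt(2) * _STD_NORMAL.inv_cdf(auc)


# AUC values implied by the conventional benchmarks d = 0.2, 0.5, and 0.8.
benchmark_aucs = {d: auc_from_d(d) for d in (0.2, 0.5, 0.8)}

# d values implied by the AUCs of the two single-predictor models above.
d_risk_taking = d_from_auc(0.71)  # about 0.78
d_crt = d_from_auc(0.56)
```

Under these assumptions, an AUC of 0.71 corresponds to a d of about 0.78, which lies close to the conventional threshold of 0.8 for a large effect.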
Combined model: No model with combined predictors from both indirect and neuropsychological test measures resulted in a lower BIC value than the neuropsychological measures-only model. Thus, the optimal combined model corresponds to the neuropsychological model and is not reported due to this redundancy.
Discussion
The current study was conducted to identify the potential of both indirect measures of sexual interest and general neuropsychological parameters to distinguish pedophilic CSOs from nonoffending controls, as well as contact CSOs from noncontact CSOs (i.e., men adjudicated for child pornography). The results of the ROC analysis show that pedophilic sex offenders could be identified on the basis of indirect and neuropsychological test data with high accuracy (AUC = 0.92). Both kinds of measurements (i.e., indirect measures and neuropsychological tests) contributed to the overall rate of correct classifications. Importantly, indirect tests alone already allowed a relatively accurate differentiation of CSOs from CTLs (AUC = 0.88). In contrast, the differentiation between contact CSOs and noncontact CSOs was less accurate and was solely due to neuropsychological tests.
Differentiating CSOs Versus CTLs
Three indirect test measures contributed to the correct classification of pedophilic CSOs: the effect size measures of the VT, the IAT, and the CRT task. The inclusion of both the VT and IAT effect size is in line with previous studies that concluded that VT and IAT measures are highly promising for differentiating CSOs from nonoffenders or non-sex offenders (Babchishin et al., 2013; Mokros et al., 2013; Schmidt et al., 2017). The inclusion of the CRT task effect size as a further indirect test measure corroborates studies of Mokros et al. (2010) and Dombert et al. (2015) that have shown a differential RT pattern in the CRT task between CSOs and non-sex offenders. However, the CRT effect might dissipate with repetitive blocks within an experiment and, thus, could be sensitive to repeated exposure to the stimulus material (Santtila et al., 2009). This sensitivity might explain previous null findings, as reported by Rönspies et al. (2015) for instance, when using the CRT task for differentiating hetero- and homosexual men.
In contrast to the three indirect test measures that revealed significant group differences, the effect size of the SMP did not show a significant group effect in univariate statistics and did not contribute to group classification. Imhoff et al. (2011) differentiated participants with heterosexual preference from those with homosexual preference on the basis of the SMP. The authors also showed that the frequency of sexual attributions of the Chinese ideograph increased with the Tanner stage of the cue. In the current study, however, the SMP response patterns did not vary between CSOs and controls, and the Tanner stage of the cue had no influence on the likelihood of sexual attributions either. This lack of a statistically significant impact of Tanner stage on response behavior suggests that our SMP design may have been flawed, very likely by the short presentation times for the Chinese ideographs that we had introduced. Given this limitation, it remains unclear whether SMP measures could further improve the differentiation of pedophilic CSOs and CTLs if the ideographs were presented according to the original timing and without backward-masking (Imhoff et al., 2011).
To sum up, the findings indicate that indirect tests are promising in assessing pedophilic sexual interest. Moreover, the results suggest that it is more reasonable to use a range of such tests for this purpose rather than just singular tests, as also recommended by others (Banse et al., 2010; Ó Ciardha & Gormley, 2013; van Leeuwen et al., 2013). The use of different tests increases the reliability of the assessment and the difficulty for participants to manipulate the results of such tests. The diagnostic value of indirect tests might be further improved by the refinement and standardization of existing indirect tests, including the collection of normative data, as well as by the development of new indirect test procedures.
As expected, the classification of CSOs versus CTLs improved when neuropsychological tests were added as predictors to the classification model. Two neuropsychological measures contributed to the optimal classification model: a higher number of working memory (WM) errors in CSOs, indicating WM deficits, and a greater benefit of CSOs from orienting spatial cues in the ANT (Orienting), as compared with controls. The observed deficits in WM are in line with a previous report (Massau et al., 2017), whereas orienting has, to the best of our knowledge, not been studied in pedophilia previously. The likely reason for the observed group difference in orienting was that the control group showed only a small benefit from spatial cues: at 27 ms, it was clearly below the orienting effect seen for healthy individuals in other studies (e.g., Fan et al., 2002: 51 ± 21 ms), whereas the orienting effect in CSOs, at 41 ms, was close to this previously reported value.
Neuropsychological tests are of course not designed to identify pedophilic interests but to evaluate individual cognitive abilities. Nevertheless, neurodevelopmental considerations suggest that brain structural and neurocognitive alterations might co-develop with pedophilia (Cantor et al., 2004; Tenbergen et al., 2015). A lower IQ and an increased ratio of left-handers in pedophilic men as compared with teleiophilic men, as well as gray and white matter changes in pedophilic individuals have been considered as evidence for such neurodevelopmental disturbance (Blanchard et al., 2007; Cantor et al., 2004; Schiffer et al., 2007). Furthermore, impairments of executive functions have been described in pedophilia (Habermeyer et al., 2013; Schiffer & Vonlaufen, 2011; Suchy et al., 2009).
Thus, on the basis of a neurodevelopmental account of pedophilia, one would expect that pedophilic men generally tend to show some cognitive deficits and brain structural alterations. For the current sample, however, neuropsychological variables alone did not allow as accurate a classification as predictors from indirect tests, which suggests that pedophilia is not necessarily accompanied by neurocognitive deficits. Schiffer and Vonlaufen (2011) argued that some of the cognitive deficits appeared to be associated with criminality or violence rather than pedophilia, as the verbal memory deficits were more pronounced in nonpedophilic than in pedophilic child molesters. Even more importantly, Kärgel et al. (2017), Massau et al. (2017), and Lett et al. (2018) observed that deficits in executive functions and IQ were primarily present in pedophilic CSOs and largely absent in pedophilic men with no such history of offenses. This finding suggests that there is not necessarily a strong association between pedophilia and neurocognitive deficits, but rather between prior offense behavior and neurocognitive deficits.
Even though all pedophilic individuals in the current study had a history of child sexual offenses, their neuropsychological profile was remarkably unremarkable, so to speak. The participating CSOs showed very few significant cognitive alterations, as compared with controls. One could of course argue that some of the null findings in the neuropsychological testing were due to the small sample sizes and the resulting lack of statistical power. But the IQ, in particular, was virtually the same in CSOs and controls. This suggests that a lower average IQ in pedophilic CSOs cannot necessarily be taken for granted. A lower IQ in pedophilia has repeatedly been reported (Blanchard et al., 2007; Cantor et al., 2004). However, even Cantor et al. (2004) concluded that the association between low IQ and pedophilia is far too weak to allow this measure to be used as a diagnostic indicator. This apparently also applies to other neuropsychological markers, as in our study the classification accuracy for differentiating CSOs and CTLs was just marginally improved when neuropsychological measures were added as predictors. The absence of neuropsychological differences between CSOs and CTLs might partially be associated with the low levels of antisociality in our sample (Stoll et al., 2019), as increased levels of antisociality have, for example, been associated with decreased levels of spatial intelligence (de Tribolet-Hardy et al., 2014).
Differentiating Contact CSOs and Noncontact CSOs
Interestingly, only one neuropsychological indicator, but none of the indirect measures contributed to an optimized correct classification between contact CSOs and noncontact CSOs. Even though pedophilic disorder was an inclusion criterion for the contact offenders and not for the noncontact sample, the indirect test parameters did not contribute to the differentiation of the two samples, suggesting that the pedophilic interest did not vary between the two offender groups. Our finding corroborates the view of Seto et al. (2006), who suggested that the consumption of child pornography itself is a strong indicator of pedophilia. In our study, child pornography offenders had no record of sexual contact offenses. But in general, child pornography offenders do often have prior sexual contact offenses (Seto & Eke, 2005). Child pornography offenders with prior contact sexual offenses also have an increased risk for contact sexual re-offenses as compared with child pornography offenders without such priors (Endrass et al., 2009; Seto & Eke, 2005).
Merdian et al. (2009) suggested that noncontact CSOs on average attained higher education levels, including higher computer literacy, as compared with contact offenders, due to the fact that child pornography is usually accessed from websites (via downloads or streaming) or via the exchange of such material in internet-based child pornography communities. Moreover, noncontact CSOs might exhibit less cognitive distortion than contact CSOs (Merdian et al., 2014). In line with the assumption that contact and noncontact offenders share many similarities but also differ in some psychological dimensions, we found that noncontact CSOs showed superior performance in some, yet only a few, cognitive domains. In particular, they were less prone to interference and were less risk-taking than contact CSOs. The accuracy with which the two samples could be classified was, however, far lower than the differentiation of CSOs from controls, underlining the fact that there were no striking differences between the two CSO samples. This also applies to personality factors, with the two CSO samples having similarly increased levels of neuroticism and decreased levels of conscientiousness, as compared with CTLs (Boillat et al., 2017). Moreover, the levels of antisociality did not differ between the two CSO samples and were only modestly greater than in CTLs (Stoll et al., 2019), even though other studies have found that contact CSOs usually show higher levels of antisociality than child pornography offenders (Babchishin et al., 2018).
In their meta-analysis, Seto et al. (2011) proposed that there appears to be a subgroup of online-exclusive offenders with a relatively low risk of committing contact offenses. In this context, Merdian et al. (2013) suggested that a conceptual distinction between fantasy-driven and contact-driven child pornography offenders might be useful. Merdian et al. (2018) proposed that contact-driven child pornography offenders might be more comparable with contact CSOs, in particular in their cognitive distortions, whereas fantasy-driven child pornography offenders might be characterized by intimacy deficits. For the purpose of the current study, we did not differentiate between subtypes of child pornography offenders, also due to the small sample size. Based on the differences observed between contact and noncontact CSOs within the present study, one might speculate that factors contributing to contact child sexual offenses of pedophilic men include their willingness to take risks and their greater susceptibility to interference. The latter might indicate that contact CSOs are worse than noncontact CSOs in coping with the dilemma of having sexual desires that eventually lead to offenses and victimization of children.
Limitations and Outlook
The comparatively small sample size represents a major limitation of the current study. It needs to be stated that the recruitment of pedophilic participants for the purpose of the current study was very difficult and time-consuming. Aside from candidates not being suitable for participation or unwilling to participate, treatment programs for pedophilic individuals were often not sought to the expected extent or were sometimes announced but not offered, due to the lack of trained therapists. The classification models reported herein do depend on distributional sample characteristics. Our study did not include CSOs with high risk for recidivism who might show more distinctive features than low-risk CSOs. The study did not include nonoffending pedophiles who were reported to show fewer neuropsychological deficits than offending pedophiles (Kärgel et al., 2017; Lett et al., 2018; Massau et al., 2017). With the comparatively small sample size, the influence of many confounding factors, such as drug history, medication status, comorbidities, life events, homosexual versus heterosexual orientation, and so on could not be addressed sufficiently. Moreover, smaller group effects were likely to remain undetected due to the lack of statistical power. Their detection would have required a larger sample size. Thus, a replication of the current findings in a larger sample that also covers high-risk pedophilic CSOs and pedophiles with no history of child sex offenses would be most desirable.
Conclusion
The findings suggest that indirect tests can support forensic psychiatrists and psychologists in their assessment of sexual offenders against children. Indirect test parameters, such as slower responses in the CRT, longer VTs, or lower IAT scores, suggest the presence of pedophilic interest, whereas certain neuropsychological test measures, such as increased risk taking, might be more related to the tendency to commit child sexual offenses. Even though the neuropsychological profile of contact and noncontact CSOs varies to some extent, the indirect test measures did not allow differentiation of the two samples, suggesting that the pedophilic interest in the two offender groups was similar. To implement standardized, indirect testing for the assessment of pedophilia, guidelines should be formulated and standard procedures should be defined. Moreover, for defined standard procedures, normative data need to be obtained.
Supplemental Material
Supplement_V1.4 – Supplemental material for Indirect and Neuropsychological Indicators of Pedophilia
Supplemental material, Supplement_V1.4 for Indirect and Neuropsychological Indicators of Pedophilia by Timm Rosburg, Marlon O. Pflueger, Andreas Mokros, Coralie Boillat, Gunnar Deuring, Thorsten Spielmann and Marc Graf in Sexual Abuse: A Journal of Research and Treatment
Footnotes
Acknowledgements
We greatly appreciate the assistance of several staff members of the Forensic Department at the University Psychiatric Clinics Basel in data collection: Barbara Buser, Jacqueline Dijkstra, Dr. Patrick Lemoine, Sophie Müller-Siemens, Franziska Prinz, Nina Rüegg, Maria Schmidlin, Matthias Stutz, and Michael Weber (in alphabetic order). Moreover, we are very thankful for the support and assistance by practicing psychiatrists and psychologists in the recruitment of study participants. We greatly appreciate the comments of three anonymous reviewers on this study and thank Dr. Rodney Yeates for language editing. We take responsibility for the integrity of the data, the accuracy of the data analyses, and have made every effort to avoid inflating statistically significant results.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The MIPS study was funded by the Federal Office of Justice (Switzerland).
Supplemental Material
Supplemental material for this article is available online.
References