Abstract
There has been significant interest in examining the developmental factors that predispose individuals to chronic criminal offending. This body of research has identified some social-environmental risk factors as potentially important. At the same time, the research producing these results has generally failed to employ genetically sensitive research designs, thereby potentially generating biased parameter estimates. The current study addresses this gap in the literature by using both a standard social science methodology (SSSM) and two separate genetically informative research designs to examine whether parent, teacher, and peer risk factors are associated with four maladaptive outcomes: arrests, low IQ, reduced self-control, and a combined measure of the “truly disadvantaged.” Analysis of twin pairs drawn from the National Longitudinal Study of Adolescent to Adult Health revealed that the SSSMs produced upwardly biased estimates of the impact of social-environmental influences on each of the four outcomes. Once genetic factors were controlled, the effect of social-environmental risk was reduced (but remained significant in certain cases). We conclude by discussing these findings in the context of criminal justice policy and their implications for future criminological research.
The criminal offenders who pose the greatest risk to society are those who are characterized as being career criminals or life-course-persistent offenders (DeLisi, 2005; Farrington, 2003). For these offenders, their antisocial behaviors begin to emerge early in the life course, they tend to engage in a wide array of serious criminal acts in adolescence and adulthood, and they are unlikely to desist from their criminal actions (Moffitt, 1993). Over the duration of their criminal career, they tend to accrue lengthy criminal records with multiple contacts with the criminal justice system. Recent estimates suggest that the most violent of these offenders may cost taxpayers millions of dollars annually (DeLisi et al., 2010). These numbers are staggering and, perhaps, even more importantly, do not account for physical injuries and emotional trauma that are inflicted on the victims and their families. No matter how the toll on society is measured, these life-long criminal offenders represent a key threat to public safety and security.
The ability to prospectively identify causal pathways to chronic, criminal offending is thus a quintessential goal of the criminal justice system and of criminological research (Cullen, 2011; Farrington, 2003). To the extent that it becomes possible to predict these types of offending patterns, the criminal justice system will be in a better position to effectively deal with these offenders. Whether it is through interventions implemented early in life or via more individually tailored rehabilitation programs in adulthood, understanding the etiology of chronic offenders is at the heart of a more effective criminal justice system. Unfortunately, despite concerted efforts, the accuracy with which it is possible to identify these offenders is less-than-ideal (Weisburd & Piquero, 2008).
The research that has shown promise, however, focuses on environmental risk factors that have some predictive ability in the identification of chronic offenders. According to results from this body of research, there are a host of social-environmental risk factors that likely have causal influences on serious, chronic offending. To illustrate, criminological research has revealed that living in a crime-ridden neighborhood, being raised by cold and withdrawn parents, being reared in poverty, and associating with delinquent peers are all key contributors to the development of chronic offending (Sampson & Laub, 1993; Warr, 2002). While most criminologists stop short of stating definitively that these risk factors are causes, they use language that strongly suggests this is the case. Moreover, based on the findings from these studies, criminologists frequently advocate for programs and policies that would only make sense and that would only be effective if these risk factors were causal agents (e.g., Cullen, 2011; Wasserman & Miller, 1998).
There are two key limitations, however, with the existing criminological research that has examined the development of chronic offending. First, the outcome measure used in these studies is typically operationalized as some measure of criminal offending. While this is certainly the anchor of being a chronic offender, research has underscored the fact that the highest-rate offenders tend to be characterized as being saturated with multiple risk factors (Huizinga & Jakob-Chien, 1998). These offenders, for instance, tend to be typified by antisocial personality traits, relationship problems with their family members, difficulties at school, and associations with antisocial peers (J. P. Wright, Tibbetts, & Daigle, 2015). In addition, they have financial difficulties, reside in disadvantaged social environments, and live a criminalistic and parasitic lifestyle (R. T. Wright & Decker, 1997). Any attempt to fully understand and identify chronic offenders, therefore, requires a much more comprehensive outcome measure than one that focuses exclusively on criminal offending (e.g., arrests). While these risk factors might be viewed as predictor variables of criminal offending, empirical research has revealed that this is not necessarily the case (J. P. Wright, Beaver, & Gibson, 2010). Instead, it appears as though these criminogenic risk factors emerge from a common etiology or are products of criminal offending. As a result, when the focus is on chronic offenders, it is arguably more appropriate to create a comprehensive measure that includes not just criminal offending, but also the other criminogenic risk factors that covary with it. That is to say, chronic offending is probably best characterized as stemming from lifestyles of the “truly disadvantaged” (to borrow the term from Wilson, 1987).
The second main limitation of this body of criminological research is that there is an exclusive focus on examining social-environmental risk factors, particularly family, school, and parenting influences, as causes of chronic offending (e.g., Hirschi, 1969; Sampson & Laub, 1993; Warr, 2002). Even among those criminologists who recognize genetic factors might be involved, those influences are rarely taken seriously by the broader criminological community (Beaver, 2013). What this necessarily means is that research exploring the causal routes to career criminality does so by using methods and designs that are incapable of controlling for genetic confounders (e.g., Piquero, Farrington, & Blumstein, 2007). The end result is that the knowledge base about the causes and correlates of career-criminal offending produced by criminologists has failed to consider that such findings could be partially biased and the result of selecting research designs that are unable to rule out genetic confounding (Barnes, Boutwell, et al., 2014; J. R. Harris, 1995, 1998; Pinker, 2002; Rowe, 1994).
The current study
Against this backdrop, there are two key goals of the current study. First, instead of focusing only on criminal behavior as an outcome, we also include some other individual-level risk factors, such as measures tapping reduced levels of self-control and low IQ, to create a measure of the “truly disadvantaged,” a term borrowed (albeit in a very different context) from Wilson’s influential work (1987). In doing so, we take into account a more global measure of career criminals, something that criminologists have argued for previously (J. P. Wright et al., 2010). Second, we examine some of the main social-environmental risk factors that have been identified in previous research as being consistently related to career criminals. Rather than simply estimating the effects of these risk factors without controlling for genetics, we opt to employ a two-pronged approach, wherein we first examine the effects of these risk factors without controlling for genetic influences and then we re-estimate these effects of the risk factors after controlling for genetic influences. Doing so allows us to compare our results with those generated in other studies and to provide some suggestive evidence of what likely would have occurred in previous studies had researchers used genetically informative research designs.
Method
Data
Data for this analysis were drawn from the National Longitudinal Study of Adolescent to Adult Health (Add Health; K. M. Harris et al., 2009). The Add Health data have been described at length elsewhere (K. M. Harris et al., 2009) and the twin subsample—which was relied upon for the present study—has also been described in detail (K. M. Harris et al., 2006, 2013). Briefly, the Add Health is a nationally representative, prospective longitudinal study of American youth who were enrolled in middle school or high school during the mid-1990s. Sampling began at the school-level and included more than 90,000 respondents from over 100 schools nationwide. A subsample of more than 20,000 students was selected and administered an in-depth survey questionnaire in the privacy of their own home. These interviews are known as the wave 1 interview, and most respondents were between 12 and 19 years of age at that time. Approximately 1 year after the wave 1 interview, another round of surveys was administered (wave 2). Nearly 5 years after the wave 2 interviews, the third wave of data collection took place. At this point, most of the respondents had reached young adulthood—most were between 18 and 26 years old—so the survey items were updated to reflect the aging cohort. Finally, a fourth wave of survey data was collected approximately 6 years after the third wave was completed. Most respondents were between 24 and 32 years old during the fourth wave of interviews. Thus, the Add Health data cover approximately 13 years of development between the adolescence and young adulthood phase of the life course. Each interview asked the respondent to report on a host of social, physical, and psychological domains, making the Add Health one of the most comprehensive data sources available.
One feature of the Add Health data collection is critical to point out. Specifically, twins were oversampled and included in the longitudinal component of the study with certainty (K. M. Harris et al., 2006, 2013). Any respondent who self-identified as a twin was automatically selected for the longitudinal portion of the study. Likewise, their co-twin was included in the study with certainty. In order to maintain the integrity of the longitudinal sample, Add Health researchers placed a high priority on locating and interviewing twin pairs during each of the follow-up waves. Thus, there is relatively little loss of information due to study attrition. Overall, and before removing cases due to item non-response, the Add Health included n = 289 MZ (monozygotic) twin pairs (n = 578 twins; n = 141 male pairs; n = 141 female pairs) and n = 450 DZ (dizygotic) twin pairs (n = 900 twins; n = 132 same-sex male pairs; n = 113 same-sex female pairs; and n = 196 opposite-sex pairs). After eliminating cases with missing data due to item non-response, analytic sample sizes ranged from n = 330 to n = 449 MZ twins and from n = 459 to n = 621 DZ twins.
Measures
Dependent variables
Arrest wave 4
During the wave 4 interview, all respondents were asked whether they had ever been arrested. This measure taps into the individual’s lifetime arrest history. Respondents were asked to select (the interview was carried out with audio computer-assisted interviewing techniques) “no” (coded 0) or “yes” (coded 1). There were 73 respondents who did not receive this question because they were interviewed in prison. For the purposes of this analysis, all prisoners were coded as 1. Descriptive statistics are provided in Table 1.
Table 1. Descriptive statistics for the study variables.
IQ wave 3
All respondents were administered the Peabody Picture Vocabulary Test (PVT) during the wave 3 interview. The PVT is best described as a verbal IQ test, which means that it taps into the verbal domains of intelligence. PVT scores were standardized so that the mean of the entire wave 3 sample was 100, the median was 102, and the standard deviation was 15. As shown in Table 1, the twin subsample did not substantively depart from this pattern (see also Barnes & Boutwell, 2013).
Self-control wave 4
During the wave 4 interview, respondents were asked a series of questions about their personality. For example, respondents were asked how much they agreed or disagreed (all items were originally coded on a Likert scale that ranged between 1 and 5, and larger values indicated lower self-control) with statements such as “I have frequent mood swings,” “I get angry easily,” “I am relaxed most of the time,” and “When making a decision, I go with my ‘gut feeling’ and don’t think much about the consequences of each alternative.” In addition to these items, respondents were asked whether they had ever been told by a doctor, nurse, or other health care provider that they have or had ADD or ADHD (coded so that 1 = no and 5 = yes to remain consistent with the Likert scale items). In all, 17 questions appeared to tap into respondent self-control. Exploratory factor analysis indicated that a single latent factor accounted for 80% of the variance in the 17 items. As a result, the scale was generated by summing the responses to the 17 items (α = 0.784). Next, the resulting scale was reverse-coded so that larger values indicated higher levels of self-control. As shown in Table 1, the mean of the scale was 33.847, the median value was 35, and the range of observed values was between 1 and 57.
Truly disadvantaged
An indicator of whether the individual was “truly disadvantaged” was created by referencing his or her scores on the three dependent variables described previously. Specifically, any respondent who reported having been arrested (i.e., was coded as a 1), who was below the median IQ, and who was below the median self-control was coded as being “truly disadvantaged” (= 1). All other respondents with non-missing data on each of the three variables (arrest, IQ, self-control) were coded as 0. It is important to note that we relied on the median values for the IQ and self-control variables because of the statistical properties that this statistic provides (i.e., that approximately 50% of the sampled cases will fall above/below that value).
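The coding rule above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name, the use of None for missing data, and the example respondents are all assumptions, and the medians (102 for IQ, 35 for self-control) are simply the values reported in the text.

```python
def truly_disadvantaged(arrest, iq, self_control, iq_median, sc_median):
    """Return 1 if arrested AND below both medians, 0 otherwise.

    Returns None when any of the three inputs is missing, mirroring the
    requirement of non-missing data on all three variables.
    """
    if arrest is None or iq is None or self_control is None:
        return None
    below_both = iq < iq_median and self_control < sc_median
    return 1 if (arrest == 1 and below_both) else 0


# Hypothetical respondents: (arrest, IQ, self-control)
respondents = [(1, 95, 30), (1, 110, 30), (0, 95, 30), (1, 95, None)]
flags = [
    truly_disadvantaged(a, iq, sc, iq_median=102, sc_median=35)
    for a, iq, sc in respondents
]
print(flags)  # [1, 0, 0, None]
```

Only the first hypothetical respondent meets all three conditions; the last is left uncoded because self-control is missing.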
Genetic liability variables
A measure of genetic liability was developed for each of the dependent variables discussed above because genetic liability is expected to be trait specific. That is to say genetic liability for trait development will vary from phenotype to phenotype and it is not assumed that a gene that loads on trait A will also impact on trait B. Thus, we developed four genetic liability measures: one for arrest, one for IQ, one for self-control, and one for the truly disadvantaged indicator. The genetic liability scores were created by drawing on two pieces of information: 1) the genetic relatedness score of each twin and 2) the co-twin’s score on the phenotype of focus (Note that the data were double-entered so each twin appeared once as the focal twin and again as the co-twin. Standard errors produced by the regression models discussed later were corrected to account for the non-independence of data [more on this in what follows]). MZ twins share 100% of their DNA so any genetic liability for arrest will be present in both twins from an MZ pair. DZ twins share, on average, 50% of their distinguishing DNA meaning that one-half of the genetic liability for arrest will be present in both twins (on average) from a DZ twin pair. Thus, MZ twins whose co-twin was not arrested have the lowest genetic liability for arrest (coded as 0), DZ twins whose co-twin was not arrested have the next lowest liability for arrest (coded as 1), DZ twins whose co-twin was arrested have a higher liability of arrest than the previous two groups (coded as 2), but MZ twins whose co-twin was arrested have the highest genetic liability of any group (coded as 3). Thus, an ordinal measure tapping into genetic liability for arrest was constructed where 0 = lowest genetic liability for arrest and 3 = highest genetic liability for arrest.
A similar strategy was followed to create the genetic liability scores for IQ, for self-control, and for the truly disadvantaged indicator. One complication had to be overcome for the genetic liability scores for IQ and self-control. Specifically, the genetic liability score is well suited for nominal traits that are either present or not (e.g., arrest). When the phenotype is a quantitative trait, such as IQ and self-control, this classification scheme becomes untenable. As a result, we created the genetic liability scores for IQ and for self-control by assessing whether the co-twin was above or below the median. Respondents who scored below the median on IQ were considered “affected” and, therefore, their co-twin was coded as having higher liability for being below the median. For instance, MZ twins whose co-twin was above the median were coded 0 and MZ twins whose co-twin was below the median were coded 3. The exact same process was carried out when creating the genetic liability score for self-control; respondents with self-control scores below the median were coded as being “affected.” This measurement strategy for tapping into genetic liability has been used by researchers studying a variety of personality and behavioral phenotypes (Boutwell, Franklin, Barnes, & Beaver, 2011; Jaffee et al., 2005).
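The ordinal liability scheme described above can be written out as a small lookup. The sketch below is illustrative only: the function names, the "MZ"/"DZ" labels, and the dictionary layout for double-entered rows are assumptions rather than anything taken from the Add Health files.

```python
# Ordinal genetic liability codes keyed by (zygosity, co-twin affected):
# 0 = lowest liability, 3 = highest liability.
LIABILITY_CODES = {
    ("MZ", False): 0,
    ("DZ", False): 1,
    ("DZ", True): 2,
    ("MZ", True): 3,
}


def genetic_liability(zygosity, cotwin_affected):
    """Return the 0-3 liability score for the focal twin."""
    return LIABILITY_CODES[(zygosity, bool(cotwin_affected))]


def affected_by_median(score, median):
    """For quantitative traits, 'affected' means scoring below the median."""
    return score < median


def double_enter(pair_id, zygosity, affected_a, affected_b):
    """Each twin appears once as the focal twin, scored from the co-twin."""
    return [
        {"pair": pair_id, "twin": "A",
         "liability": genetic_liability(zygosity, affected_b)},
        {"pair": pair_id, "twin": "B",
         "liability": genetic_liability(zygosity, affected_a)},
    ]


# Hypothetical DZ pair in which only twin A scored below the IQ median of 102.
rows = double_enter(1, "DZ", affected_by_median(95, 102), affected_by_median(110, 102))
print([r["liability"] for r in rows])  # [1, 2]
```

Twin A receives the lower score because twin A's co-twin is unaffected, while twin B receives the higher score because twin B's co-twin is affected.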
Social risk factors
Teacher attachment wave 1
Attachment to one’s teachers was assessed during the wave 1 interview with a single item measure that asked the student how much they feel their teachers care about them. Responses were given on a scale ranging between “not at all” (coded 1) and “very much” (coded 5). Respondents with lower scores are expected to be at greater risk of developing negative outcomes later in life (Hirschi, 1969).
Parental aspirations for success wave 1
During the wave 1 interview, respondents were prompted with the following question: “On a scale of 1 to 5, where 1 is low and 5 is high, how disappointed would your [mother/adoptive mother/stepmother/foster mother] be if you did not graduate from high school?” A second question was asked, but this time the reference was college rather than high school. Combining information from these two questions resulted in a measure of maternal aspirations for success. Importantly, the exact same information was assessed for father figures (i.e., father/adoptive father/stepfather/foster father), meaning a measure of paternal aspirations for success was also available. The parental aspirations for success variable was generated by averaging the maternal measure with the paternal measure. In cases where only one measure was available, the parental measure reflects the single-parent score. The bivariate correlation between the maternal and paternal measures was r = 0.76 (p < .05).
Parental attachment wave 1
Respondents were asked to report how close they felt to their mother/adoptive mother/stepmother/foster mother. Responses ranged between “not at all” (coded as 1) and “very much” (coded as 5). A second question asked the respondent how much his/her mother figure cared about him/her. Again, responses ranged between “not at all” (coded as 1) and “very much” (coded as 5). Combining these two items together resulted in a measure of maternal attachment. These same questions were asked about the respondent’s father figure (father/adoptive father/stepfather/foster father), so a paternal attachment measure was also created. In order to generate the parental attachment variable, we averaged the scores on the maternal attachment variable with those on the paternal attachment variable. In cases where only one measure was available, the parental measure reflects the single-parent score. The bivariate correlation between the maternal and paternal measures was r = 0.40 (p < .05).
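Both parental measures (aspirations and attachment) follow the same rule: average the maternal and paternal scores, and fall back to the single available score when only one parent figure was reported. A minimal sketch of that rule, with a hypothetical function name and None standing in for a missing score:

```python
def parental_score(maternal, paternal):
    """Average the maternal and paternal scores; use the single
    available score when the other is missing (None)."""
    available = [s for s in (maternal, paternal) if s is not None]
    if not available:
        return None  # neither parent figure reported
    return sum(available) / len(available)


print(parental_score(8, 6))        # 7.0
print(parental_score(8, None))     # 8.0
print(parental_score(None, None))  # None
```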
Peer drug use wave 1
All wave 1 respondents were asked to report on the drug-using behaviors of their three best friends. Specifically, participants were asked, “Of your 3 best friends, how many smoke at least one cigarette a day?” A second question referenced alcohol use in the past week and a third question referenced marijuana use in the last week. Each of the three questions was coded so that 0 = no friends, 1 = one friend, 2 = two friends, and 3 = three friends. Combining the three items together by summation resulted in a scale of peer drug use, where higher values indicated a greater level of peer drug use (α = 0.746).
Control variables
Control variables for the respondent’s age (coded in years), sex (0 = female, 1 = male), and race were included in the analysis. Race was captured by two dummy variables identifying the respondent as White (coded as 1) or Black (coded as 1). All others were coded as 0 for both variables and, therefore, served as the reference group.
Analysis plan
The analysis unfolded in a series of three steps. First, a standard social science methodology (SSSM) was employed. An SSSM is any method or statistical model that does not include controls for genetic influences (J. R. Harris, 1998). In the present context, the SSSM was estimated with a multiple regression model. The regression model was a logistic regression model when the dependent variable was coded dichotomously (i.e., arrest and truly disadvantaged). The ordinary least squares (OLS) model was estimated for the quantitative dependent variables (i.e., IQ and self-control). The SSSM can be expressed algebraically as:
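One form consistent with the terms described in the text is sketched below. The variable labels (TeachAttach, ParAsp, ParAttach, PeerDrug, X) are shorthand assumed here for illustration, and for the dichotomous outcomes the left-hand side would be the log odds of the outcome rather than y itself.

```latex
y_{ij} = b_0
       + b_1\,\mathrm{TeachAttach}_{ij}
       + b_2\,\mathrm{ParAsp}_{ij}
       + b_3\,\mathrm{ParAttach}_{ij}
       + b_4\,\mathrm{PeerDrug}_{ij}
       + b_t X_{t,ij} + \dots + b_T X_{T,ij}
       + e_{ij}
```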
There are several points to note from the SSSM. First, note that the subscript i identifies the individual twin of reference and j identifies the family to which the twin is linked. Second, the SSSM is a between-effects estimator, meaning it will estimate the impact of the variables on the right-hand side of the equation as deviations from the mean score of y for all respondents ij in the analysis. Third, the impact of the various social risk variables (i.e., teacher attachment, parental aspirations for success, parental attachment, and peer drug use) will be estimated by the coefficients labeled b1 through b4. The twin-level controls, which are captured by the coefficients bt through bT, are the age, sex, and race of the twin. The last thing to note is that the SSSM does not include any specific control for genetic influences. Instead, any genetic influences on y are captured by the error term eij. It is important to note that the standard errors estimated in the SSSM were corrected for the clustering of twins within families.
Since genetic influences appear as part of the error term in the SSSM, there is a non-zero probability that some (or all) of the parameter estimates (i.e., the bs) are confounded due to omitted variable bias (Barnes, Boutwell, et al., 2014; Turkheimer, 2000). One potential strategy for avoiding such bias is to include a measure of genetic influences on y as a control variable in the regression model. The second step to the analysis was, therefore, to estimate an augmented version of the SSSM, this time including the relevant genetic liability variable:
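A form consistent with this description adds one term to the SSSM. In the sketch below, the coefficient label b_G and the variable labels are assumptions made for illustration; everything else carries over from the SSSM unchanged.

```latex
y_{ij} = b_0
       + b_1\,\mathrm{TeachAttach}_{ij}
       + b_2\,\mathrm{ParAsp}_{ij}
       + b_3\,\mathrm{ParAttach}_{ij}
       + b_4\,\mathrm{PeerDrug}_{ij}
       + b_G\,\mathrm{GeneticLiability}_{ij}
       + b_t X_{t,ij} + \dots + b_T X_{T,ij}
       + e_{ij}
```

Because genetic liability now enters as a regressor, whatever share of genetic influence it captures no longer sits in the error term.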
These regression models will be informative because they will directly tap into the genetic liability of the dependent variable under focus. Most importantly, estimating the above model will allow one to observe whether the effect of the social risk factors on y is affected by the inclusion of the genetic liability variable. Evidence of genetic confounding will emerge if the impact of one (or more) of the social risk factors weakens once the genetic liability variable is included. As with the SSSM, the standard errors gleaned from these models were corrected due to the clustering of twins within families.
Including the genetic liability variable(s) is a useful strategy for ruling out genetic confounding. Yet, there is reason to suspect the genetic liability variables will not completely account for all of the genetic influences on y. The logic is straightforward: genetic influences on y are likely to be multifaceted, meaning it is unlikely a simple ordinal variable of genetic “risk” will fully account for the complexity of such influences. With this in mind, we introduce the final step to the analysis.
The third step to the analysis was to estimate the fixed effects (FE) regression model (see generally, Allison, 2009; Wooldridge, 2013). In a general sense, the FE model is a within-group estimator. Thus, the FE model allows one to estimate the impact of an independent variable on deviations from the group mean of y. In the context of the present analysis, the groups under examination were families of MZ twins. Recall that MZ twins share 100% of their DNA and they share much of their environment, so any similarities between two twins from a twin pair may be caused by genetic influences and shared environmental influences. The other side of this point is that MZ twins can differ only as a result of differential environmental exposures (i.e., the non-shared environment), somatic mutation, or error (e.g., measurement error). Thus, if one were to study the differences between MZ twins of the same twin pair, these differences would be free from the confounding influences of genetic and shared environmental influences. This is precisely what the FE model does. In short, the FE model analyzes group-mean (i.e., twin-pair) deviations, which will effectively control away genetic influences and shared environmental influences when MZ twins make up the estimation sample.
The FE equation can be shown to be a special case of the SSSM that was presented above. Specifically, note that b0 from the SSSM can be decomposed into two parts: a twin-level intercept µij and a family-level intercept µj. The same decomposition can be carried out for the error term, which is composed of a twin-level error term eij and a family-level error term ej. Inserting these twin- and family-level terms into the SSSM and then taking the first-difference with respect to the MZ twin unit results in the following model:
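A form consistent with the decomposition just described is sketched below, where Δ denotes the twin 1 minus twin 2 difference within family j. The Δ notation and the variable labels are assumptions made for illustration. The family-level terms µj and ej are common to both twins and cancel in the difference, as do controls that are identical within MZ pairs (age, sex, and race).

```latex
\Delta y_{j} = \Delta\mu_{j}
             + b_1\,\Delta\mathrm{TeachAttach}_{j}
             + b_2\,\Delta\mathrm{ParAsp}_{j}
             + b_3\,\Delta\mathrm{ParAttach}_{j}
             + b_4\,\Delta\mathrm{PeerDrug}_{j}
             + \Delta e_{j},
\qquad \Delta x_{j} = x_{1j} - x_{2j}
```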
where every item on the right-hand side of the equation reflects the within-twin pair difference (i.e., twin 1–twin 2) of the social risk factor variables. Thus, each of the parameter estimates captures the effect of within-twin pair differences of the social risk factors on within-twin pair differences in y. Note that standard errors are corrected for the clustering of twins within families by design in the fixed effects model.
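The logic of within-pair differencing can be illustrated with a toy example. The sketch below is not the authors' estimation code: it uses a single predictor, made-up data, and hypothetical function names. The outcome is built as a family-level effect plus 2 times x, and the family effect is deliberately correlated with x. A pooled regression then overstates the slope, while the regression on within-pair differences recovers it.

```python
from statistics import mean


def ols_slope(x, y):
    """Slope from a one-predictor OLS regression with an intercept."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy / sxx


# Hypothetical MZ pairs: (family effect, x for twin 1, x for twin 2).
# The family effect stands in for shared genes and shared environment.
families = [(0, 1, 2), (10, 3, 5), (20, 6, 5)]
TRUE_SLOPE = 2

pooled_x, pooled_y, diff_x, diff_y = [], [], [], []
for family_effect, x1, x2 in families:
    y1 = family_effect + TRUE_SLOPE * x1
    y2 = family_effect + TRUE_SLOPE * x2
    pooled_x += [x1, x2]
    pooled_y += [y1, y2]
    diff_x.append(x1 - x2)  # twin 1 minus twin 2
    diff_y.append(y1 - y2)  # the family effect cancels here

pooled_slope = ols_slope(pooled_x, pooled_y)
fe_slope = ols_slope(diff_x, diff_y)
print(round(pooled_slope, 2))  # 6.14, inflated by the family effect
print(round(fe_slope, 6))      # 2.0, the slope used to build the data
```

The pooled estimate is inflated because families with higher x also have a higher family-level effect; differencing within pairs removes that confound.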
The FE model has been used extensively with samples of twin and sibling pairs to estimate the impact of environmental sources of phenotypic variance while controlling for genetic and shared environmental confounds (Barnes & Meldrum, 2015; Burt, Donnellan, Humbad, McGue, & Iacono, 2010; Connolly & Beaver, 2015; D’Onofrio et al., 2010; Ellingson, Goodnight, Van Hulle, Waldman, & D’Onofrio, 2014; Goodnight et al., 2012). 1
Results
Table 2 provides the parameter estimates that were gleaned from the three models discussed in the previous section. Note that Table 2 is broken into three sections. The first section—which displays the results from three models—analyzed the arrest indicator as the dependent variable. The second section—which also displays the results from three models—analyzed IQ as the dependent variable, and the third section analyzed self-control as the dependent variable.
Table 2. The impact of social risk factors on y before and after controlling for genetic influences.
Note. Controls for demographic characteristics were included in the standard social science methodology (SSSM) models and a dummy variable for twin 1 was included in the fixed effects (FE) model. All of these control variables have been omitted from the table for parsimony.
*p < .05; **p < .01 (two-tailed).
Turning to the first set of models analyzing arrest as the dependent variable, we see that the first model is labeled “SSSM: No Gene Control.” This model corresponds to the first regression equation discussed in the previous section. The results from this model indicate that three of the four social risk factors were statistically significant predictors of arrest probability. For instance, the teacher attachment variable was estimated to have a negative impact on arrest probability, revealing that individuals who reported feeling that their teachers cared for them more during the wave 1 interview were less likely to report an arrest during the wave 4 interview. The effect of teacher attachment was −.189, meaning the log odds of arrest were reduced by .189 units for every one-unit increase in teacher attachment. This value is easier to interpret if it is transformed into an odds ratio: e^(−.189) = 0.828. The odds ratio reveals that each one-unit increase in teacher attachment corresponds to a (0.828 − 1) × 100 = −17.20% change in the odds of an arrest. In other words, the odds of an arrest are decreased by 17% for each one-unit increase in teacher attachment. This is a fairly sizeable effect, which might suggest that practical reductions in arrest risk can be achieved if youth are taught to bond more strongly with their teachers in middle and/or high school. A similar substantive conclusion can, tentatively, be drawn in terms of parental aspirations for success. The model reveals that youth who perceive that their parents have higher aspirations for their success are less likely to report an arrest in adulthood. Finally, the level of drug use in the respondent’s peer group was a strong, positive predictor of arrest probability. Although this set of findings accords with many a priori expectations, it is important to keep in mind that the SSSM was used to generate the parameter estimates. Thus, these parameter estimate values may be biased due to omitted genetic factors.
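The conversion from a logistic regression coefficient to an odds ratio and a percent change in odds can be checked directly. The short sketch below uses the teacher attachment coefficient reported above (−.189); the function names are illustrative.

```python
import math


def odds_ratio(log_odds_coefficient):
    """Exponentiate a logistic regression coefficient."""
    return math.exp(log_odds_coefficient)


def percent_change_in_odds(log_odds_coefficient):
    """Percent change in the odds for a one-unit increase in the predictor."""
    return (odds_ratio(log_odds_coefficient) - 1) * 100


b_teacher = -0.189
print(round(odds_ratio(b_teacher), 3))              # 0.828
print(round(percent_change_in_odds(b_teacher), 1))  # -17.2
```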
The second model, labeled “Gene Control,” using arrest as the dependent variable, included the genetic liability variable as a control variable. Other than the addition of the genetic liability variable, this model was identical to the previous model. This means that the parameter estimates gleaned from the second model are comparable to those from the first model. If the effect of any (or all) of the social risk factors weakens from model 1 to model 2, then there may be reason to suspect omitted variable bias influenced the former set of results. As can be seen in Table 2, this is precisely the pattern of results that emerged. Once the genetic liability variable was included, teacher attachment was no longer a statistically significant predictor of arrest. Moreover, the impact of parental aspirations weakened from model 1 to model 2. The same was true for the peer drug use variable. Overall, these results suggest that genetic confounding may have influenced the parameter estimates from the SSSM, but note that two of the three predictors remained statistically significant even after accounting for genetic liability. This pattern of findings may indicate that these social risk factors are, in fact, related to arrest risk. Alternatively, it may be the case that the genetic liability variable accounts for some but not all of the genetic confounding. If that were the case, we might expect to see the social risk factors retain a portion of their effect even after the genetic liability variable had been included.
The third model using arrest history as the dependent variable was estimated with the fixed effects (FE) equation. 2 As was noted in the previous section, the FE model will rule out the impact of genetic and shared environmental influences on trait variance, leaving only the variance that arises from within-twin-pair differences. When this model was estimated, the only social risk factor to retain a statistically significant influence was the peer drug use index. Specifically, the results from this analysis indicated that twins who reported more drug use in the peer group were more likely to report an arrest compared to their co-twin. It is important to reiterate that the FE model analyzes MZ twin differences and, therefore, it removes any factors that work to make MZ twins more similar to one another. Moreover, because the FE model is restricted to MZ twins, the coefficients presented in model 3 are not directly comparable to those from models 1 and 2 due to the different composition of the analytic samples. With this point in mind, we analyzed whether the parameter estimates from model 1 were moderated by zygosity. There was no evidence of moderation by zygosity (global test for equality of all coefficients and constant, χ²(df = 5) = 8.64, p = .13), suggesting that limiting the FE analysis to MZ twins has not systematically affected the FE coefficient estimates. 3
Moving to the next set of analyses, the second section of Table 2, we see that only one of the social risk factors was a statistically significant predictor of IQ scores. Specifically, parental aspirations for success was positively associated with IQ scores in the SSSM and in the Gene Control model. But, when the FE model was estimated, the influence of parental aspirations was no longer statistically significant.⁴ There was no evidence of moderation by zygosity (global test for equality of all coefficients and constant, χ²(5) = 4.46, p = .48), suggesting that limiting the FE analysis to MZ twins has not systematically affected the FE coefficient estimates.
The third set of analyses, where self-control was the dependent variable, revealed that teacher attachment and parental attachment were statistically significant predictors of self-control levels in the SSSM. Only teacher attachment remained statistically significant once the genetic liability variable was controlled. Notably, teacher attachment remained a statistically significant predictor of self-control levels in the FE model. This finding suggests that the twin who reported a higher level of attachment to his/her teachers tended to have higher levels of self-control than his/her co-twin. The magnitude of the effect was modest. Each one-unit increase in teacher attachment (which ranged between 1 and 5) corresponded to a 1.213-point increase in self-control (about 1/7 of a standard deviation). There was no evidence of moderation by zygosity (global test for equality of all coefficients and constant, χ²(5) = 1.82, p = .87), suggesting that limiting the FE analysis to MZ twins has not systematically affected the FE coefficient estimates.
The final set of analyses is presented in Table 3. Recall that an indicator of whether the respondent was “truly disadvantaged” was created by referencing the participant's arrest history, his/her IQ with respect to the median, and his/her level of self-control with respect to the median. This variable was coded dichotomously, so a logistic regression model was estimated for the SSSM and the Gene Control model. In a general sense, the substantive findings from the SSSM mirrored those seen in the previous table: teacher attachment, parental aspirations, and peer drug use were statistically significant predictors of being identified as truly disadvantaged. Each of these variables was associated with the truly disadvantaged outcome in the expected direction. For instance, respondents who reported more drug use in their peer group had greater odds of being identified as truly disadvantaged. The second model included the genetic liability variable. This set of findings departs from those seen earlier because the findings are only modestly affected by the inclusion of genetic liability. Perhaps most interesting was that the FE model revealed that parental aspirations and peer drug use maintained statistically significant influences as predictors of one’s truly disadvantaged status. The coefficients from this model are best interpreted as linear deviations from the twin-pair mean probability of being identified as truly disadvantaged (see generally, Wooldridge, 2013). For example, a twin who scored one point higher on parental aspirations (which ranged between 2 and 10) than the twin-pair mean tended to have a .027 reduction (again, relative to the twin pair's mean) in the probability of being truly disadvantaged. In other words, the twin who perceived greater levels of parental aspirations was less likely to be identified as truly disadvantaged compared to his/her co-twin.⁵ There was no evidence of moderation by zygosity (global test for equality of all coefficients and constant, χ²(5) = 3.44, p = .63), suggesting that limiting the FE analysis to MZ twins has not systematically affected the FE coefficient estimates.
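The interpretation of the FE coefficient as a deviation from the twin-pair mean can be made concrete with a short calculation. The sketch below uses the reported coefficient of -.027 for parental aspirations; the two aspiration scores (9 and 7) are hypothetical values picked for illustration, and the function name is ours.

```python
def predicted_deviations(coef, x_twin_a, x_twin_b):
    """Predicted deviations from the pair-mean probability for each twin,
    plus the gap between them (twin A minus twin B), given a linear FE
    coefficient and the two twins' scores on the predictor."""
    pair_mean = (x_twin_a + x_twin_b) / 2
    dev_a = coef * (x_twin_a - pair_mean)
    dev_b = coef * (x_twin_b - pair_mean)
    return dev_a, dev_b, dev_a - dev_b

# Reported FE coefficient for parental aspirations (truly disadvantaged model).
COEF_PARENTAL_ASPIRATIONS = -0.027

if __name__ == "__main__":
    # Hypothetical co-twins scoring 9 and 7 on the 2-10 aspirations scale.
    dev_a, dev_b, gap = predicted_deviations(COEF_PARENTAL_ASPIRATIONS, 9, 7)
    print(f"twin A deviation from pair mean: {dev_a:+.3f}")
    print(f"twin B deviation from pair mean: {dev_b:+.3f}")
    print(f"gap between co-twins:            {gap:+.3f}")
```

In this hypothetical pair each twin sits one point from the pair mean, so twin A's predicted probability is .027 below the pair mean and twin B's is .027 above it, for a gap of .054 between the co-twins. The calculation ignores the twin 1 dummy included in the FE models.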
Table 3. The impact of social risk factors on being identified as “truly disadvantaged” before and after controlling for genetic influences.
Note. Controls for demographic characteristics were included in the standard social science methodology (SSSM) models and a dummy variable for twin 1 was included in the fixed effects (FE) model. All of these control variables have been omitted from the table for parsimony.
*p < .05; **p < .01 (two-tailed).
Discussion
There has been considerable interest in examining the developmental pathways to serious, violent, and chronic offending. This research has uncovered some criminogenic environmental factors that might account for some of the variance in career criminality. Much of this research, however, is likely misspecified because it fails to employ genetically sensitive research designs that are capable of removing genetic influences. The current study addressed this possibility by examining whether family, school, and peer influences were predictive of arrests, IQ, and self-control. We also estimated the effects of these environmental influences on a combined measure, termed the “truly disadvantaged.” The results of our statistical models revealed two broad findings.
First, and consistent with much of the existing criminological research, the results from the SSSMs revealed statistically significant and somewhat consistent effects of the teacher, parent, and peer measures across the outcome measures. Indeed, in the SSSM predicting the truly disadvantaged, all but one of the environmental risk measures (parental attachment) emerged as statistically significant. The second key finding from the analyses comes from the genetically informative models that indicate that the SSSMs produce upwardly biased environmental effects owing to unmeasured genetic influences. Specifically, we controlled for genetic influences in two distinct ways. In the first, we included a genetic liability covariate. For these statistical models, the environmental effects were attenuated across almost all models, though a number of statistically significant environmental effects remained. In the second genetically informative approach, fixed effects models were used. For these statistical models, almost all of the environmental effects, except for peer drug use, dropped from statistical significance.
Based on these somewhat disparate findings from three different statistical approaches, we are left to speculate which one produces the most accurate parameter estimates for environmental risk factors. As the findings of the current study and others have revealed (Harden, Mendle, Hill, Turkheimer, & Emery, 2008; Harden et al., 2007; J. R. Harris, 1998; J. P. Wright & Beaver, 2005), the SSSMs clearly are the most biased, with so-called environmental estimates including the influences produced by genetic variance (Barnes, Boutwell, et al., 2014; Cleveland, Beekman, & Zheng, 2011). As a result, we fall in line with other scholars who argue that SSSMs should be abandoned when estimating environmental effects (particularly those found within the family) on phenotypic variance (J. R. Harris, 1995, 1998; Rowe, 1994). Still, we are left to speculate as to why the two genetically sensitive models produced somewhat different findings. Although these are perhaps not the only reasons for the difference, we note that the genetic liability model likely provides a liberal estimate of the effect of the social risk factors on y because it only captures a small amount of the true genetic variance. The fixed effects model, in contrast, is probably the most conservative since it rules out most sources of genetic effects (rGEs, GxEs) as well as shared environmental influences (and thus, rGEs between A and C). As a result, it is probably best to think of the environmental effects that were generated in the fixed effects model as the lower bound estimates of the effect of x on y. In short, the “true” estimate of the effect of x on y is likely to be somewhere between the estimate gleaned from the “Gene Control” model and the estimate gleaned from the “Fixed Effects” model (see Table 2 and Table 3). Recall also that the FE models relied solely on MZ twins. This could have affected the parameter estimates, but our analysis of zygosity as a moderator variable for the parameter estimates in model 1 (for all of the dependent variables) strongly suggests this was not the case.
Keep in mind that in most criminological studies, the results of the analyses would have ended with the estimation and presentation of the findings from the SSSMs (Barnes, Boutwell, et al., 2014; J. P. Wright & Beaver, 2005). All conclusions, implications, and policy recommendations would then be based on the findings generated from the SSSMs. However, as our study indicates, these findings are biased and, in some cases, completely at odds with the findings generated in the genetically informative models. This is particularly salient when it comes to the development of policies designed to reduce crime and the overall criminal population. If the results of the SSSMs are accepted at face value, then programs and policies may be created that are based on the results of studies that have produced nothing more than spurious associations. If these spurious associations are marketed as causal associations, then monies and other valuable resources will be channeled into policies and programs that are destined for failure as they rest on nothing more than methodological and statistical artifacts.
Although our study offers some insight into the bias in environmental research that fails to account for genetic influences, there are a number of limitations that need to be addressed in follow-up studies. First, the peer drug use variable is reported on by the target respondent. Recent evidence suggests this type of measure might be prone to projection bias (Young, Rebellon, Barnes, & Weerman, 2014), and behavioral genetic research suggests that simultaneously controlling for projection bias (by using a peer-reported measure of peer behavior) and genetic influences seriously weakens the association between peer behavior and respondent behavior (TenEyck & Barnes, 2015). Second, and somewhat related to the above point, our measure of teacher attachment relied on a single-item indicator that had a restricted range of variance. This may have impacted the results if the limited variance on teacher attachment attenuated the covariance between it and the dependent variable(s). Third, the analyses were conducted on a sample of twin pairs, which necessarily raises questions regarding the generalizability of the results. While replication studies that include other types of sibling pairs would be helpful in addressing this issue, we should note that a recent analysis revealed no significant differences between twins and non-twins in the Add Health data for most of the measures employed in the current study (Barnes & Boutwell, 2013). Last, we focused on a relatively narrow set of parent, teacher, and peer measures. It would be interesting to explore whether other environmental measures would result in a similar pattern of findings as detected here.
During the past couple of decades, there has been tremendous growth in the understanding of how, and in what ways, genetic and environmental influences impact variation in behavioral phenotypes. Most fields of study recognize, understand, and accept the likelihood that genetic influences account for a significant proportion of such phenotypic variance. The field of criminology, however, appears to be one of the main exceptions (Beaver, 2013; Barnes, Wright, et al., 2014; J. P. Wright, Barnes, et al., 2015). As demonstrated in the current study, this is a serious oversight because failing to recognize even the possibility that genetic influences might matter means that criminological research will continue to be produced without controlling for genetic influences (Cleveland et al., 2011). The end result will be a biased knowledge base that is less likely to provide accurate insight into the development of public policy. Perhaps this is one of the main reasons that most programs are not very effective when it comes to reducing and preventing crime or treating offenders (Beaver, 2013).
Footnotes
Acknowledgement
This research uses data from Add Health, a program project directed by Kathleen Mullan Harris and designed by J. Richard Udry, Peter S. Bearman, and Kathleen Mullan Harris at the University of North Carolina at Chapel Hill, and funded by grant P01-HD31921 from the Eunice Kennedy Shriver National Institute of Child Health and Human Development, with cooperative funding from 23 other federal agencies and foundations. Special acknowledgment is due Ronald R. Rindfuss and Barbara Entwisle for assistance in the original design. Information on how to obtain the Add Health data files is available on the Add Health website.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article. No direct support was received from grant P01-HD31921 for this analysis.
