Abstract
The aim of this study was to examine the effectiveness of good behaviour bonds in reducing re-offending. Data on 19,478 individuals who received a good behaviour bond under section 9 of the Crimes (Sentencing Procedure) Act 1999 (NSW) for their principal offence were examined. Propensity score matching techniques were used to match offenders who received a bond of less than 24 months with offenders who received a bond of 24 months or more. These two matched groups were then compared on time to first new offence. After matching offenders on a large range of factors, time to reconviction was longer for offenders placed on bonds 24 months or longer compared with offenders placed on shorter bonds. A significant effect of bond length on re-offending was apparent for both supervised and unsupervised orders. The evidence presented here tentatively suggests long bonds are more effective in reducing re-offending than short bonds.
Introduction
With increasing prison numbers in most western countries (Australian Bureau of Statistics (ABS), 2009) and evidence suggesting that custody fails to deter further offending (see Nagin et al., 2009), there has been a renewed interest in the effectiveness of non-custodial sanctions, particularly the effectiveness of community-based sentences.
The most widely used alternative to prison in Australia is a good behaviour bond (ABS, 2013; hereafter simply referred to as a ‘bond’; for a recent discussion of the advantages and disadvantages of such orders, see New South Wales (NSW) Sentencing Council, 2011). In 2011, bonds accounted for 15% of all penalties imposed by Australian adult courts (unpublished ABS data, received 26 July 2012). Broadly speaking, a bond is an order requiring an offender to be of good behaviour, and abide by certain conditions (see further discussion below). If the offender fails to comply with the order, he or she can be re-sentenced for the original offence (as well as punished for any new offences).
In NSW, bonds may be imposed under sections 9, 10 or 12 of the Crimes (Sentencing Procedure) Act 1999 (NSW) (CSPA). Bonds imposed under section 9 permit a court, upon convicting an offender, to make an order directing an offender to enter into a good behaviour bond, instead of imposing a sentence of imprisonment. The maximum term of the bond is five years (CSPA: s9 (2)). By contrast, bonds imposed under sections 10 and 12 are not to exceed two years. Section 10 bonds can be imposed where the court has found a person guilty of an offence but not entered a conviction, while section 12 bonds arise following conviction and the imposition of a suspended sentence.
Table 1. Legislative maximum for bonds ordered with or without conviction by jurisdiction.
Only two Australian studies have examined the effectiveness of bonds in reducing re-offending. Weatherburn and Trimboli (2008) examined the re-conviction rates of 12,838 male offenders, 4432 of whom had been placed on a supervised bond, with the remainder placed on an unsupervised bond. They found no effect of supervision on risk of re-offending. Weatherburn and Bartels (2008) examined the re-conviction rates of 6356 offenders, 4957 of whom had been given a supervised bond and 1399 of whom had been given a suspended sentence. Again, no difference emerged in rates of reconviction between the two groups.
One possible explanation for the apparent ineffectiveness of bonds in reducing re-offending is that many bonds are comparatively short. Fifty percent of the bonds imposed in NSW courts, for example, are 12 or fewer months in duration (unpublished data for 2012, NSW Bureau of Crime Statistics and Research (BOCSAR)). Arguably, short bonds provide little incentive for offenders who need it to seek or obtain drug or psychiatric treatment. For those who are able to postpone further offending, or who only offend intermittently anyway, they also provide little in the way of an effective deterrent. The punishment for non-compliance may be severe, but the perceived risk of apprehension for many offenders may be quite low.
If long periods of conditional release are more effective than short periods in reducing the risk of further offending, it would be of interest to know whether the underlying mechanism is one of deterrence or rehabilitation. Offenders placed on a bond without any requirement for supervision are arguably less likely than offenders placed under supervision to receive treatment or some other form of rehabilitative support. If we observe a reduction in the risk of re-offending among offenders given a bond without supervision, then we have reason to believe that at least part of the mechanism producing the effect is one of deterrence.
Considering the popularity of bonds as a sentencing option, it is somewhat surprising to find that no-one to date appears to have examined the effect of bond length on re-offending. The purpose of this article, therefore, is to report the results of a study designed to examine the effect of bond length on risk of re-conviction. We use propensity score matching (PSM) to compare matched pairs of offenders, both of whom were equally likely to receive a long bond, but only one of whom actually received such a bond (the other receiving a short bond). Separate analyses are carried out for supervised versus unsupervised bonds.
Method
Research strategy
PSM attempts to address selection bias by matching treatment and comparison group subjects in terms of factors thought to influence selection into treatment. It has two distinct advantages over conventional statistical approaches to estimating the effect of treatment on a defined outcome, while controlling for other factors likely to influence the outcome. The first is that it makes no assumption about the functional form of the relationship between the dependent and independent variables. The second is that, properly implemented, it guarantees that treatment and control subjects are matched in terms of observable factors (Rosenbaum and Rubin, 1983).
The propensity score is the conditional probability of receiving the treatment, given a set of observed covariates. This score is typically estimated by constructing a logistic regression model predicting treatment status using a range of covariates related to both treatment allocation and outcome. An important concept in the use of PSM is covariate balance, which is said to exist when treatment and comparison groups are identical (within the limits of chance) in terms of the factors predicting entry into treatment. For example, if 10% of treatment subjects are Indigenous, we expect 10% of comparison group subjects to be Indigenous as well.
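As an illustrative sketch only (not the code used in this study), the estimation step can be expressed in Python. The covariate names, the synthetic data and the helper `fit_propensity_scores` are hypothetical, and the logistic model is fitted here with a plain Newton-Raphson routine rather than any particular statistical package.

```python
import numpy as np

def fit_propensity_scores(X, treated, n_iter=25):
    """Estimate P(treated = 1 | X) from a logistic regression fitted by Newton-Raphson."""
    X = np.asarray(X, dtype=float)
    t = np.asarray(treated, dtype=float)
    design = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
    beta = np.zeros(design.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-design @ beta))
        gradient = design.T @ (t - p)
        hessian = design.T @ (design * (p * (1.0 - p))[:, None])
        beta = beta + np.linalg.solve(hessian, gradient)
    return 1.0 / (1.0 + np.exp(-design @ beta))

# Synthetic illustration (assumed data, not the study cohort)
rng = np.random.default_rng(0)
n = 2000
prior_convictions = rng.poisson(2.0, size=n)
concurrent_offences = rng.poisson(1.0, size=n)
true_logit = -1.5 + 0.2 * prior_convictions + 0.4 * concurrent_offences
long_bond = rng.random(n) < 1.0 / (1.0 + np.exp(-true_logit))

covariates = np.column_stack([prior_convictions, concurrent_offences])
scores = fit_propensity_scores(covariates, long_bond)
```

Each element of `scores` is the estimated probability that the corresponding (synthetic) offender receives a long bond, given the two covariates.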
The extent to which balance is achieved by matching on propensity scores can be assessed using the standardised bias (SB; Rosenbaum and Rubin, 1985). For each covariate, the SB is defined as ‘the difference in the sample means of the treated and matched control sub-samples as a percentage of the square root of the average of the sample variances in both groups’ (Caliendo and Kopeinig, 2005: 15). Standardising the difference in means in this way allows variables on different scales to be compared. If the SBs for all explanatory covariates are sufficiently small, we can be confident that the treated and control groups are balanced with regard to the measured covariates included in the propensity score model.
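The quoted definition can be written as a short function. This is a minimal sketch: the function name and example values are hypothetical, and the use of the unbiased sample variance (`ddof=1`) is an assumption, since the definition refers only to "sample variances".

```python
import numpy as np

def standardised_bias(x_treated, x_control):
    """Difference in means as a percentage of the root of the average variance."""
    x_treated = np.asarray(x_treated, dtype=float)
    x_control = np.asarray(x_control, dtype=float)
    # Assumption: unbiased sample variances (ddof=1) in both groups
    average_variance = (x_treated.var(ddof=1) + x_control.var(ddof=1)) / 2.0
    return 100.0 * (x_treated.mean() - x_control.mean()) / np.sqrt(average_variance)

# Hypothetical binary covariate (e.g. 1 = supervised order) in each group
treated_values = [1, 0, 1, 1, 0, 1, 0, 1]
control_values = [0, 0, 1, 0, 0, 1, 0, 0]
sb = standardised_bias(treated_values, control_values)
```

A covariate would then be flagged as unbalanced when the absolute value of `sb` exceeds whichever threshold is adopted.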
To test the robustness of our findings, we employ another propensity score method known as inverse probability of treatment weighting (IPTW) (see Austin, 2010; Rosenbaum, 1987). Like PSM, the first step in IPTW is to estimate the conditional probability of receiving a treatment using a set of baseline covariates. However, rather than matching subjects on the basis of the resultant propensity score, the IPTW method uses the propensity score to weight each subject by the inverse of the probability of treatment. This, Austin (2010: 2139) maintains, ‘results in a synthetic population in which treatment assignment is independent of measured baseline covariates’.
In IPTW, a weighting is calculated for all offenders in the sample (not just those for whom a counterfactual case can be identified) and treatment effect estimates are adjusted accordingly. The performance of the IPTW method has been shown to be superior to other propensity score methods (Austin, 2007), particularly where the sample size is large and there is reasonable overlap in the propensity score distributions of the treatment and control groups (Ukoumunne et al., 2010). However, unlike PSM, in IPTW there are no means by which the adequacy of the propensity score model can be assessed.
It is important to recognise that, although PSM avoids some of the assumptions typically made in the context of conventional regression studies, it also makes assumptions of its own. The most important of these assumptions is ‘strong ignorability’ (see Shadish, 2012). Strong ignorability requires that treatment assignment and potential outcomes (from treatment assignment) are conditionally independent, given observed covariates. Put simply, PSM assumes that, conditional on the observed covariates, the allocation of cases to treatment and comparison groups is random. Covariate balance is not sufficient evidence to claim that the strong ignorability assumption is met. It merely tells us whether the observed covariates are balanced between the treatment and control groups.
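In the notation commonly used for this assumption (the symbols are introduced here for illustration and do not appear elsewhere in this paper), let T indicate treatment, X the observed covariates, and Y(1) and Y(0) the potential outcomes with and without treatment. Strong ignorability, as defined by Rosenbaum and Rubin (1983), combines conditional independence with a requirement that every subject has some chance of receiving either treatment:

```latex
\bigl(Y(0),\, Y(1)\bigr) \;\perp\; T \,\mid\, X,
\qquad
0 < \Pr(T = 1 \mid X) < 1 .
```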
In the present case, we have some reason for confidence that the assumption of strong ignorability is met. Our covariates include most if not all of the legal factors which courts are obliged to take into account when deciding what penalty to impose (e.g. plea, length of prior criminal record, type of prior criminal record, offence seriousness, number of concurrent offences, number of counts of the principal offence, prior penalties). They also include a number of other non-legal factors that have been found in past Australian studies to influence the sentence imposed, including age, sex, level of disadvantage and remoteness of residential address (Lulham et al., 2009).
Data
The data for this study were extracted from the BOCSAR re-offending database (ROD; see Hua and Fitzgerald, 2006). This database contains records of all persons appearing before the NSW courts charged with a criminal offence since 1994. It includes both information about the charge (e.g. offence type, concurrent offences, plea, outcome and penalty) and information about the offender (e.g. age, gender, last postcode and race). Each court record is linked, thus allowing individual offenders to be tracked over time.
Prior studies examining the deterrent effects of imprisonment sometimes restrict themselves to offenders being imprisoned for the first time (e.g. Nieuwbeerta et al., 2009) in order to control for any effects prior imprisonment might have on the response to the current (index) episode of imprisonment. This restricts the extent to which findings can be generalised to offenders who have previously been sentenced to imprisonment. Rather than restrict ourselves to offenders receiving their first bond, our study sample consists of all offenders who received a section 9 bond for their principal offence in the NSW Local Court in 2008, regardless of whether they had previously received a bond. To adjust for any effects of past bonds on recidivism, we included control variables indicating whether an offender had previously received or breached a bond (see below).
Only one record for each person was included in the analysis. If a person was convicted and placed on a section 9 bond more than once in 2008, the first appearance was selected as the index court appearance. The cut-off date for inclusion in the study was 31 December 2008. This allowed all offenders to be followed up for three years (i.e. until 31 December 2011) after their index court appearance.
Independent variable
The key independent variable of interest in this research was the length of bond issued at the index court appearance. A long bond was defined as a supervised or unsupervised bond which was 24 months or longer in duration. This definition was based on the distribution of the penalty value variable and the bi-variate relationship between sentence length and likelihood of re-offending. The length of bonds is not uniformly distributed, but tends to be grouped into discrete categories (i.e. 6, 9, 12, 18, 24, 36 months). The 24-month cut-off for long bonds represents the 75th percentile of this distribution. Classification of offenders into these subgroups resulted in the largest unadjusted difference in re-offending likelihood when short and long order lengths were compared.
Outcome variables
The outcome in this study is time to first new offence, defined as any proven offence committed within three years of the index court appearance. Here, we use time to next proven offence as a proxy measure of the underlying rate of re-offending, with shorter times indicating higher rates. Time to first new offence equated to the number of days that elapsed between the offender receiving a bond (i.e. index appearance date) and the date of the first subsequent proven offence. In cases where no proven offences were recorded during the observation period, the time between the index court appearance and the end of the three-year follow-up period was calculated. Time to first new offence was adjusted for any time spent in custody during the follow-up period to account for incapacitation effects. Offenders who did not have any time in the community to re-offend after adjusting for time spent in custody were excluded from the analysis.
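The construction of the outcome can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually applied to the ROD data: the function name is hypothetical, the three-year window is approximated as 1,095 days, and `custody_days` is assumed to count only days in custody before the first new offence or the end of follow-up.

```python
from datetime import date

FOLLOW_UP_DAYS = 3 * 365  # assumption: three-year window approximated as 1,095 days

def time_to_first_new_offence(index_date, first_offence_date=None, custody_days=0):
    """Return (days_free, reoffended), or None if the offender had no time at risk."""
    if first_offence_date is None:
        elapsed, reoffended = FOLLOW_UP_DAYS, False  # censored at end of follow-up
    else:
        elapsed = (first_offence_date - index_date).days
        reoffended = elapsed <= FOLLOW_UP_DAYS
        elapsed = min(elapsed, FOLLOW_UP_DAYS)
    days_free = elapsed - custody_days  # remove time spent in custody
    if days_free <= 0:
        return None  # no time in the community: excluded from the analysis
    return days_free, reoffended

# Hypothetical offenders
reoffender = time_to_first_new_offence(date(2008, 3, 1), date(2008, 9, 1), custody_days=30)
non_reoffender = time_to_first_new_offence(date(2008, 3, 1))
excluded = time_to_first_new_offence(date(2008, 3, 1), custody_days=FOLLOW_UP_DAYS)
```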
Explanatory variables
There is no general consensus among researchers on the variables that should be included as controls in an analysis of the impact of penalties on the risk of re-offending. In their comprehensive review of the impact of penalties on re-offending, however, Nagin et al. (2009) argue that the minimum set of controls should include prior record, conviction offence type, age, race and sex. In the present study, all these factors, with the exception of race, were included as controls.
Race was not included because almost 20% of offenders in the study cohort had missing data on this variable. Instead, the ABS Socio-Economic Indexes for Areas (SEIFA) measure known as the Index of Relative Socio-economic Disadvantage (IRSD) (ABS, 2001) was used as a proxy, on the assumption that the vast majority of Indigenous Australians coming before the NSW courts reside in areas of severe socio-economic disadvantage. It should be noted, in this context, that the SEIFA scale used for this study (ABS, 2001) actually used Indigenous status as one of the component measures of socio-economic disadvantage. Bi-variate analyses using the current data showed a very strong relationship between race and socio-economic disadvantage, which was particularly evident in the upper and lower quartiles of the SEIFA index.
Table 2. Covariates included in propensity score model.
Propensity score methods
The PSMATCH2 module in Stata/MP was used to conduct the PSM (Leuven and Sianesi, 2003). This analysis involved several steps, which are detailed below.
Firstly, all the explanatory variables described above were regressed against a dichotomous variable indicating whether the offender received a long bond for their principal offence at the index court appearance. Secondly, offenders given long bonds (i.e. treated offenders) were matched with offenders given short bonds (i.e. untreated offenders) based on the estimated probabilities derived from this model (i.e. propensity scores). One-to-one nearest neighbour matching with no replacement and a caliper of 0.001 was used here. Because no replacement was specified, an untreated offender could be matched only once with a treated offender.
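The matching rule described above can be illustrated with a simple greedy routine. This is a sketch of the general technique, not a reproduction of the PSMATCH2 implementation: the function name and example scores are hypothetical, and treated cases are processed in the order supplied, which is an assumption that can affect which pairs are formed.

```python
import numpy as np

def match_nearest_neighbour(ps_treated, ps_control, caliper=0.001):
    """Greedy one-to-one nearest neighbour matching without replacement."""
    ps_treated = np.asarray(ps_treated, dtype=float)
    ps_control = np.asarray(ps_control, dtype=float)
    available = np.ones(len(ps_control), dtype=bool)
    pairs = []
    for i, score in enumerate(ps_treated):
        if not available.any():
            break  # no untreated cases left to match
        distance = np.abs(ps_control - score)
        distance[~available] = np.inf  # already-used controls cannot be reused
        j = int(np.argmin(distance))
        if distance[j] <= caliper:  # accept only matches within the caliper
            pairs.append((i, j))
            available[j] = False
    return pairs

# Hypothetical propensity scores
treated_scores = [0.30, 0.30, 0.55, 0.90]
control_scores = [0.3003, 0.5502, 0.2995, 0.70]
pairs = match_nearest_neighbour(treated_scores, control_scores)
```

In this example the fourth treated case remains unmatched because its nearest available control lies well outside the caliper, which mirrors how a strict caliper trades sample size for closeness of match.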
The third step was to compare treatment and control groups to assess whether they differed significantly on any of the variables used to predict the propensity scores (that is, to check covariate balance; balance on observed covariates is consistent with, but does not by itself establish, the conditional independence assumption). The estimated SB was used for this purpose. Two SBs were estimated for each covariate: one before matching (unadjusted SB) and one after matching (adjusted SB). If the matching procedure was successful, we would expect the adjusted SBs for all the covariates in the propensity score model to have an absolute value of less than 20 (Apel and Sweeten, 2010).
Where covariate balance was demonstrated, Cox proportional hazards regression was used to compare the time to first new offence for the matched treatment and control groups. This method makes better use of the available follow-up data than fixing a follow-up period and then measuring the proportion who re-offended in that period. A further advantage of using Cox regression in PSM is that the vce(cluster) option in Stata/MP can be used to account for any correlation within matched pairs when calculating standard errors for regression coefficients (Austin, 2007). For the adjusted Cox regression models, all explanatory variables were considered for inclusion, but only those which were significant at the 0.05 level were included in the final models.
As mentioned above, a second propensity score method (IPTW) was used to test the robustness of the findings from the matching analysis. In this secondary analysis, propensity scores were estimated from the same logistic regression model described above (i.e. the model estimated for the matching analysis) and then these propensity scores were used to calculate weights for all offenders in the treatment and control groups. The weighting applied to the treatment group (i.e. offenders given a long bond) was the inverse of the propensity score. The weighting applied to the control group (i.e. offenders given a short bond) was the inverse of one minus the propensity score. The pweight option in Stata/MP was used to weight cases in the Cox regression re-offending models. This analysis is referred to as the ‘weighted analysis’ in the relevant sections of this paper.
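The weighting rule just described can be sketched in a few lines. The function name and example values are hypothetical, and the sketch assumes every propensity score lies strictly between 0 and 1.

```python
import numpy as np

def iptw_weights(treated, propensity):
    """Weight treated cases by 1/e and untreated cases by 1/(1 - e)."""
    treated = np.asarray(treated, dtype=bool)
    propensity = np.asarray(propensity, dtype=float)
    return np.where(treated, 1.0 / propensity, 1.0 / (1.0 - propensity))

# Hypothetical cases: 1 = long bond, 0 = short bond
weights = iptw_weights([1, 0, 1, 0], [0.25, 0.25, 0.80, 0.80])
```

Cases whose observed bond length was unlikely given their covariates (a long bond with a score of 0.25, or a short bond with a score of 0.80) receive the largest weights.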
Results
Descriptive statistics
Table 3. Demographic, index offence and prior offending characteristics by length of order.
NS: no significant differences (at the 0.05 level) between the two groups on this covariate.
Examination of the index offence characteristics would suggest that offenders given long bonds were those who were charged with more serious offences. Compared with offenders given short bonds, the long bond group, on average, had more concurrent offences, were more likely to be legally represented and were more likely to be charged with an offence with a higher Median Sentence Ranking (MSR; MacKinnell et al., 2010), to receive an order with supervision and to have three or more offence counts.
Longer bonds were also more likely for specific offence types, with a greater proportion of offenders convicted of burglary, fraud, drink-driving or traffic offences receiving a bond of two years or more.
There was, however, no clear pattern of findings to suggest longer bonds were reserved for offenders with more extensive prior offending histories. A large proportion of offenders had multiple prior convictions, had previously been placed on a bond and had prior episodes of imprisonment, but these variables did not vary significantly by the length of the order imposed for the index offence. The exception was that offenders given long bonds were more likely to have previously been found guilty of a drink-driving or other traffic offence, while offenders given short bonds were more likely to have previously been found guilty of a drug offence.
Table 4. Re-offending outcomes for short and long bond groups, unmatched (n = 19,478).
PSM: Long versus short bonds
Propensity scores were generated from a logistic regression model predicting whether or not an offender received a bond of 24 months or more at the index court appearance. Using these propensity scores and one-to-one matching with no replacement, 4608 offenders given long bonds were matched with 4608 offenders given short bonds. This means that 94% of the original sample of long-bond cases was matched with a comparable short-bond case.
The propensity score model based on the unmatched sample (n = 18,705) significantly predicted whether an offender received a bond of 24 months or more (pseudo R² = 0.063, Likelihood ratio chi-square p value < 0.001). This suggests that there were systematic differences between the unmatched treatment and comparison cases. The model using the matched sample did not significantly predict group membership (pseudo R² = 0.001, Likelihood ratio chi-square p value = 0.999). In other words, once treatment and comparison cases were matched, none of the covariates used in the matching process predicted group membership. This is as it should be (Sianesi, 2004).
Covariate balance checks were also carried out on the matched sample. Figure 1 presents the SB scores for all the covariates included in the propensity score model before and after the two groups were matched. Using a conservative criterion of |SB| >10, this figure shows that, prior to matching, a total of nine variables were unbalanced. SBs for these nine variables ranged from 24.6 (bond with supervision) to −23.3 (offence severity). After matching, none of the covariates were unbalanced. The variable with the largest |SB| after matching was offence counts (4 or more counts of the index offence) and the SB for this variable was well below the balance threshold (SB = 3.0). These data provide good evidence that the treated and untreated groups were adequately matched on the baseline covariates.
Figure 1. Standardised bias (SB) levels for each variable for the unmatched and matched samples.
Re-offending: Long versus short good behaviour bonds
Table 5. Time to first new offence for short and long bond groups by propensity score method.
Standard errors have been adjusted to account for matched nature of the data.
Adjusted for demographic, offence and prior offending variables.
These analyses show that the significant relationship between sentence length and re-offending found for the unmatched sample was still apparent after offenders were matched using propensity score techniques. The hazard ratio associated with treatment group was 0.908 (95% confidence interval 0.853, 0.967) and significant (p value = 0.003). This indicates that treated offenders (those given long bonds) were about 9% less likely to re-offend at any given time compared with untreated offenders (those given short bonds). The hazard ratio associated with the treatment group variable remained significant even after adjusting for relevant covariates in the re-offending models (hazard ratio = 0.917, p value = 0.008). Supplementary analyses using IPTW produced an equivalent pattern of findings with significant effects of bond length on time to re-offend, with and without adjustment for other covariates. The results from these latter analyses are summarised in Table 5 under the heading ‘weighted analysis’.
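As a rough consistency check on the figures reported for the matched analysis, the standard error of the log hazard ratio can be recovered from the confidence interval and used to reproduce the p value. This sketch assumes a normal approximation on the log scale with a 1.96 multiplier; it is not how the original estimates were produced.

```python
import math

# Values reported in the text for the unadjusted matched analysis
hazard_ratio, ci_lower, ci_upper = 0.908, 0.853, 0.967

log_hr = math.log(hazard_ratio)
# Assumption: the 95% interval is symmetric on the log scale (Wald interval)
se_log_hr = (math.log(ci_upper) - math.log(ci_lower)) / (2 * 1.96)
z = log_hr / se_log_hr
p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided p value, normal approximation
relative_reduction = 1 - hazard_ratio  # proportional reduction in the hazard

print(f"z = {z:.2f}, p = {p_value:.3f}, reduction in hazard = {relative_reduction:.1%}")
```

Under these assumptions the recovered p value rounds to the reported 0.003, and the implied reduction in the hazard of re-offending is 1 − 0.908 ≈ 0.092.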
To give a more intuitive sense of the size of the effect of long bonds on reconviction, we regressed the log odds of reconviction against a variable indicating whether the offender received a long or short bond, and adjusted for relevant covariates. The effect of bond length on the probability of reconviction was then obtained by fixing the values of all control variables at their average or modal values. The results are shown in Figure 2 below. The overall effect of long bonds was to reduce the risk of reconviction by 3 percentage points (from 0.31 to 0.28).
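Expressed relative to the predicted risk for short bonds, the two probabilities reported above imply the following proportional reduction (a simple worked calculation from the rounded figures, so the result is approximate):

```latex
\frac{0.31 - 0.28}{0.31} \approx 0.097 ,
```

that is, a relative reduction in the risk of reconviction of a little under 10%.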
Figure 2. Probability of reconviction by length of bond.
PSM: Supervised and unsupervised bonds
Further analyses were undertaken in order to examine whether supervision makes a difference to the effect of bond length on re-offending outcomes. For this analysis, two sub-samples of offenders were considered: those who had received a bond (1) with a requirement for probation and parole supervision and (2) without a supervision requirement.
Once again, offenders were matched using one-to-one nearest neighbour matching with no replacement, and the propensity score model predicting whether an offender received a long bond included all explanatory variables. If all covariates were balanced after matching offenders on their propensity scores, the time to first new offence was estimated (with and without adjustment for covariates).
Matches were identified for a total of 1767 offenders given long supervised bonds (i.e. 91% of cases). The propensity score model based on the unmatched supervised bond sample (n = 5807) significantly predicted bond length (pseudo R² = 0.046, Likelihood ratio chi-square p value < 0.001), but the model using the matched supervised bond sample did not (pseudo R² = 0.001, Likelihood ratio chi-square p value > 0.999). Prior to matching, a total of 12 variables had SBs above the balance threshold, but after matching none of the covariates were unbalanced. Together these data indicate that the treated and untreated supervised bonds groups were adequately matched on the observable covariates.
Matches were identified for a total of 2798 offenders (i.e. 94% of cases) given long unsupervised bonds. The propensity score model based on the unmatched unsupervised bond sample (n = 12,898) significantly predicted bond length (pseudo R² = 0.061, Likelihood ratio chi-square p value < 0.001), but the model using the matched unsupervised bond sample did not (pseudo R² = 0.001, Likelihood ratio chi-square p value > 0.999). Prior to matching, a total of nine variables had SBs above the balance threshold, but after matching, none of the covariates were unbalanced, indicating that the matched samples were adequately balanced.
Re-offending: Supervised and unsupervised bonds
Table 6. Time to first new offence for short and long bond groups by order type and propensity score method.
Standard errors have been adjusted to account for matched nature of the data.
Adjusted for demographic, offence and prior offending variables.
Discussion
The main aim of the current study was to examine the effect of order length on re-offending among offenders placed on good behaviour bonds. A secondary aim was to determine whether supervision moderated the effects of order length. The evidence presented here shows that after matching offenders on a large range of factors, the time to reconviction in the three-year period following imposition of a bond was longer for those on bonds of 24 months and longer. Supervision made no difference to this result, inasmuch as the time to the next conviction was longer both for those on long supervised and unsupervised bonds.
These findings lend support to the hypothesis that offenders placed on long bonds are less likely to re-offend than offenders placed on short bonds. The fact that the effect is present for unsupervised as well as supervised bonds tentatively suggests that at least some of the reduction in risk of re-offending arises from the deterrent effect of a long bond rather than any rehabilitative effect. Although the absolute reduction in offending associated with long bonds appears small (three percentage points), the percentage reduction in the risk of re-offending in our study is around 10%. This is comparable to the reduction in re-offending found in connection with a number of rehabilitation programs, including drug courts (Aos et al., 2006).
We cannot rule out the possibility that our results are due to pre-existing differences between the treatment and comparison groups. There are, however, four considerations which militate against this possibility. Firstly, there was considerable overlap in the propensity score distributions for both sets of analyses. Secondly, the matching threshold was strict (i.e. propensity scores had to be within 0.001 of each other to be regarded as matched). Thirdly, we were able to match a high proportion of cases in both sets of our analyses. Finally, the variables included in the treatment allocation model were very comprehensive.
Notwithstanding these considerations, there is one important caveat surrounding our findings: although we have controlled for a large number of factors known to influence bond length and reconviction, it is always possible that some omitted variable is responsible for the observed relationship between length and re-offending. Moreover, supplementary analyses show that if this type of selection bias does exist, then the treatment effect reported here is not particularly robust. Sensitivity analyses using the Rosenbaum method (see for example Caliendo et al., 2005) show that if there were an omitted variable with an odds ratio of 1.10 for treatment selection, which was also highly correlated with reconviction, then including this unobserved covariate in the matching analysis would have rendered the difference between the long and short bond groups no longer significant. However, Duwe (2010) notes two important limitations of the Rosenbaum bounds test. Firstly, the test does not indicate whether omitted variable bias is a problem for the current analysis but only ‘how large the hidden bias would need to be to nullify the estimated treatment effect’ (Duwe, 2010: 77). Secondly, the test is very conservative, because it assumes that the hypothetical unobserved covariate is a near perfect predictor of the outcome being measured. In the current case, it is highly unlikely we have excluded a variable from our analysis which not only significantly impacts on the legal decision to impose a long bond, but is also very highly correlated with the outcome variable (reconviction).
The question obviously arises as to what policy implications flow from our results. Given the possibly fragile effects examined here, it would be imprudent to use them as the sole basis for policy reform. At the same time, there does not seem to be any reason in principle why courts should not impose long bonds (i.e. 24 months or more) when they are appropriate in the circumstances of the case. In the case of R v Dawes [2004] NSWCCA 363, the NSW Court of Criminal Appeal rejected a Crown appeal against a five-year bond imposed on a mother who pleaded guilty to the manslaughter of her autistic son. A five-year bond was also recently imposed for manslaughter in the Victorian Supreme Court (DPP v Smith [2012] VSC 314). In addition, it is worth noting that there is considerable jurisdictional variation in the law surrounding bonds of this nature. In Western Australia, bonds comparable to section 9 bonds are capped at a maximum of two years, while the legislative maximum in South Australia is three years (see Table 1); this is a notable disparity with the majority of other jurisdictions (NSW, Vic, Tas, NT, Cth). If state and territory legislatures were persuaded to take a uniform approach to bonds and this resulted in an increase in long bonds, it would create the conditions for a natural experiment. That is, it would make it possible to examine rates of re-offending before and after the change, while controlling for any differences in the characteristics of offenders receiving bonds before and after the change. This would provide a much stronger test of the effectiveness of long bonds on risk of re-offending than the methodology we employed in our study.
There are two other potentially fruitful lines of research worth pursuing in this area. One is to survey those who are placed on a bond to obtain a better understanding of how bonds are viewed by recipients. Because nothing is known about the perceived risks of apprehension for breaching a (long versus short) bond or the frequency with which those placed on bonds receive treatment and/or support, research of this kind would shed light on the potential of bonds to reduce the risk of further offending. The second option is to conduct a study of rates of offending immediately prior to, during and following termination of a bond. That would give us a clearer picture of whether bonds exert their effects via deterrence or rehabilitation.
Acknowledgements
The authors thank Wai-Yin Wan, Nadine Smith and Steve Moffatt for their valuable advice on the methodology used for this paper and two anonymous reviewers for their comments on an earlier version of this paper.
Declaration of conflicting interest
None declared.
Funding
This work was supported by the Criminology Research Council (grant number CRG 02/11-12, 2012).
