Abstract
Estimates of the effect of employment on women’s risk of partner violence in cross-sectional studies are subject to potential “self-selection bias.” Women’s personal choice of whether to pursue employment or not may create fundamental differences between the group of women who are employed and those who are not employed that standard regression methods cannot account for even after adjusting for confounding. The aim of this study is to demonstrate the utility of propensity score matching (PSM), a technique used widely in econometrics, to address this bias in cross-sectional studies. We use PSM to estimate an unbiased effect-size of women’s employment on their risk of experiencing partner violence in urban and rural Tanzania using data from the 2010 Tanzania Demographic and Health Survey (DHS). Three different measures of women’s employment were analyzed: whether they had engaged in any productive work outside of the home in the past year, whether they received payment in cash for this productive work, and whether their employment was stable. Women who worked outside of the home were significantly different from those who did not. In both urban and rural Tanzania, women’s risk of violence appears higher among women who worked in the past year than among those who did not, even after using PSM to account for underlying differences in these two groups of women. Being paid in cash reversed this effect in rural areas whereas stability of employment reduced this risk in urban centers. The estimated size of effect varied by type of matching estimator, but the direction of the association remained largely consistent. This study’s findings suggest substantial self-selection into employment. PSM methods, by compensating for this bias, appear to be a useful tool for estimating the relationship between women’s employment and partner violence in cross-sectional studies.
Background
Researchers are increasingly interested in the potentially empowering effect of employment on women’s vulnerability to partner violence in low- and middle-income countries (LMIC). A systematic review found that in some LMIC settings, women’s employment was associated with higher prevalence of partner violence, while in others, it had a protective effect (Vyas & Watts, 2009). Virtually all of the reviewed studies analyzed cross-sectional data using standard regression methods that compare the experience of partner violence among women who are employed versus those who are not. A key limitation in such studies, however, is that women “self-select” into employment by virtue of their choice to pursue it or not. This means that employed women could be systematically different from those not employed—a fact that could bias any effort to estimate the “effect-size” of the association between employment and risk of partner violence. Propensity score matching (PSM), a technique widely used in econometrics, can account for such differences by matching employed and unemployed women on a series of background factors that predict their likelihood of being employed. In the absence of a well-controlled randomized study, PSM can yield unbiased estimates of effect, unaffected by selection bias.
The overall aim of this paper is twofold: to provide a primer on PSM for violence researchers who frequently face the issue of uncorrected selection bias in cross-sectional studies and to apply PSM techniques to calculate an unbiased estimate of the effect of employment on women’s risk of partner violence in urban and rural Tanzania. The employment-partner violence example provides both a real-life illustration of PSM applied to violence data and contributes new information on the long-standing debate on the role of economic factors in women’s risk of partner violence.
Self-Selection Bias: An Introduction
PSM is an analytical technique useful for managing selection bias in observational studies, that is, where the comparison group is outside the control of the researcher. Bias resulting from self-selection is commonly explained in terms of the counterfactual framework—the schema underlying all intervention studies. Normally, in experimental studies, individuals are randomized into either receiving the intervention (frequently called the “treatment”) or not. Theoretically, to obtain a purely unbiased effect-estimate, the outcomes would be observed for each individual under both intervention and non-intervention conditions, and the resulting difference in outcomes averaged across the sample. This is known as the average treatment effect (ATE). More commonly, the effect-size of interest is the average treatment effect on the treated (ATT)—that is, the average effect of the intervention on those who received the intervention (Blundell & Costa Dias, 2008). This requires a non-exposed scenario (a counterfactual) for the group that received the intervention.
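The two estimands can be written compactly in potential-outcomes notation. The symbols below are introduced here for illustration only and are not taken from the original text: D = 1 denotes exposure (here, employment), and Y^1 and Y^0 denote an individual's outcome with and without exposure.

```latex
\begin{align}
\text{ATE} &= E\left[\,Y^{1} - Y^{0}\,\right] \\
\text{ATT} &= E\left[\,Y^{1} - Y^{0} \mid D = 1\,\right]
            = E\left[\,Y^{1} \mid D = 1\,\right] - E\left[\,Y^{0} \mid D = 1\,\right]
\end{align}
```

The last term, the expected non-exposed outcome among those who were in fact exposed, is the counterfactual: it is never observed directly and has to be approximated from other data.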
In cross-sectional studies, a pure counterfactual scenario is not available, and researchers must rely on substitute data that approximate the missing counterfactual. Standard regression techniques use the population that has not been exposed to the intervention as their counterfactual comparison group. If the exposed and comparison groups are comparable in terms of their non-exposed outcomes, then the estimated ATT is unbiased. By adjusting for all known confounders, standard regression methods assume that observed exposed individuals are compared with alike non-exposed individuals (Blundell & Costa Dias, 2008; Oakes & Johnson, 2006). Standard regression methods, however, still fail to account for the fact that these groups may have been drawn from different distributions and that there exists “structural confounding”—a difference that cannot be overcome by inclusion of variables or increases in sample size (Oakes & Johnson, 2006).
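One way to see why comparability of the two groups matters is to decompose the naive comparison of observed group means. In illustrative potential-outcomes notation (D = 1 for exposed, D = 0 for non-exposed, Y^1 and Y^0 for outcomes with and without exposure; these symbols are assumptions of this sketch, not notation used by the authors), the observed difference splits into the ATT plus a selection term:

```latex
\begin{align}
E\left[Y^{1} \mid D = 1\right] - E\left[Y^{0} \mid D = 0\right]
  &= \underbrace{E\left[Y^{1} \mid D = 1\right] - E\left[Y^{0} \mid D = 1\right]}_{\text{ATT}}
   + \underbrace{E\left[Y^{0} \mid D = 1\right] - E\left[Y^{0} \mid D = 0\right]}_{\text{selection bias}}
\end{align}
```

The selection term is zero only when the exposed and comparison groups would have had the same average outcome in the absence of exposure, which is the condition described above for an unbiased ATT.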
The counterfactual framework highlights a limitation inherent in all observational studies. Because of the way individuals are recruited into the study, there is no true comparison group to represent the counterfactual condition—that is, what would have happened among those exposed to the intervention had they not been exposed (Angrist & Pischke, 2009; Blundell & Costa Dias, 2008; Oakes & Johnson, 2006). While in our example, there is no intervention per se, the principle that underlies the counterfactual framework still applies. Even after adjusting for all known confounders, differences in women’s pre-employment likelihood of entering employment may persist between those who became employed and those who did not.
Propensity Score Matching
Matching methods aim to emulate an experimental design and, therefore, can help address the issue of self-selection bias (Blundell & Costa Dias, 2008; Heckman, Ichimura, & Todd, 1998). From a set of identified pre-intervention factors, X, a non-exposed comparison group is selected so that the distribution of X in this group is similar to the distribution of X in the exposed group. Theoretically, conditioning on X ensures that the outcome is independent of exposure and that the only remaining relevant difference between the two groups is the exposure to the intervention. This is known as the conditional independence assumption (CIA).
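A common formalization of the CIA, in the version that is sufficient for estimating the ATT, is sketched below. The notation is illustrative and not taken from the original text: D is the exposure indicator, Y^0 is the outcome without exposure, and X is the set of pre-intervention factors.

```latex
\begin{align}
Y^{0} \perp D \mid X
\quad\Longrightarrow\quad
E\left[Y^{0} \mid D = 1, X\right] = E\left[Y^{0} \mid D = 0, X\right]
\end{align}
```

Under this assumption, the observed outcomes of non-exposed individuals with the same X can stand in for the missing counterfactual outcomes of exposed individuals.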
A practical issue is how to match individuals on the range of variables that distinguish exposed and non-exposed respondents prior to analysis of the intervention. Early matching methods involved listing all possible combinations of pre-intervention variables in a contingency table and then matching exposed and comparison units that fall within the same cell. With potentially many factors to match on, this approach rapidly becomes unwieldy. PSM resolves this issue by using probability models, most commonly probit or logit, to derive a single variable (the propensity score) that captures the probability that a respondent will be exposed to the intervention (i.e., employed). If the CIA holds for X, then it also holds for the propensity score P(X), the probability of exposure given X (Blundell & Costa Dias, 2008; Heckman et al., 1998). Each exposed individual is then matched with one or more non-exposed individuals based on the degree of similarity in the estimated propensity score. Matches are only possible when the propensity scores of exposed and non-exposed units overlap in an “area of common support,” and individuals who fall outside of this range are discarded from the analysis. The area of common support, however, narrows the more accurately the probability model predicts exposure, and a perfect-fit model results in no possible matches as all exposed respondents are assigned a propensity score value of 1 and all non-exposed individuals are assigned a propensity score value of 0. Therefore, the set of pre-intervention factors X should not perfectly predict the probability of exposure but should be sufficiently close (Caliendo & Kopeinig, 2005; Khandker, Koolwal, & Samad, 2010).
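The overlap requirement can be made concrete with a small sketch. The example below assumes that propensity scores have already been estimated and applies a simple minimum-maximum rule for the area of common support. The function names, the rule itself, and the scores are hypothetical choices for illustration, not the procedure used in this analysis.

```python
def common_support(exposed_scores, comparison_scores):
    """Return (low, high) bounds of the overlap between the two score
    ranges, or None when the ranges do not overlap (min-max rule)."""
    low = max(min(exposed_scores), min(comparison_scores))
    high = min(max(exposed_scores), max(comparison_scores))
    if low > high:
        return None  # no overlap, so no matches are possible
    return low, high


def trim_to_support(scores, bounds):
    """Keep only the scores that fall inside the common-support bounds."""
    low, high = bounds
    return [s for s in scores if low <= s <= high]


# Hypothetical propensity scores, for illustration only.
exposed = [0.35, 0.50, 0.62, 0.80, 0.91]
comparison = [0.10, 0.22, 0.40, 0.55, 0.78]

bounds = common_support(exposed, comparison)
kept_exposed = trim_to_support(exposed, bounds)
kept_comparison = trim_to_support(comparison, bounds)

# A model that predicts exposure perfectly assigns 1 to every exposed unit
# and 0 to every comparison unit, which leaves no overlap at all.
perfect = common_support([1.0, 1.0], [0.0, 0.0])
```

In this toy example, the two highest-scoring exposed units and the two lowest-scoring comparison units fall outside the overlap and would be discarded, while the perfectly predictive case returns no common support.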
Several PSM algorithms exist that differ in their identification of the comparison group and the weights they assign to individuals within the group (Caliendo & Kopeinig, 2005; Khandker et al., 2010; Leuven & Sianesi, 2003). The most common PSM estimators are as follows:
Nearest neighbor: Exposed individuals are matched to one or more individuals in the comparison group who have the closest propensity score. Matching can be with replacement (individuals in the comparison group can be used more than once) or without replacement. Matching with replacement can increase the quality of matches and decrease bias; however, the variance of the estimator increases if many comparison cases are discarded.
Kernel: Exposed individuals are matched with a weighted average of individuals in the comparison group, with greater weight given to individuals who have a closer propensity score. Because kernel matching uses more observations than nearest neighbor matching, the variance of the estimator is lower, although including more distant comparison cases can increase bias.
Radius: This involves imposing a restriction (caliper) on how far away comparison matches can be. A smaller caliper reduces “poor” matches; however, if only a few matches are found, then the variance of the estimator increases.
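The three estimators above can be sketched in a few lines. The code below is a minimal illustration under stated assumptions, not the PSMATCH2 implementation: each unit is a hypothetical (propensity score, outcome) pair, nearest neighbor matching is one-to-one with replacement, and the kernel is Gaussian with a bandwidth chosen arbitrarily for the example.

```python
import math


def att_nearest_neighbor(exposed, comparison):
    """One-to-one nearest neighbor matching with replacement."""
    diffs = []
    for p, y in exposed:
        # Comparison unit whose propensity score is closest to p.
        _, y0 = min(comparison, key=lambda c: abs(c[0] - p))
        diffs.append(y - y0)
    return sum(diffs) / len(diffs)


def att_radius(exposed, comparison, caliper):
    """Radius matching: average all comparison units within the caliper.
    Returns (ATT, number of exposed units that found a match)."""
    diffs = []
    for p, y in exposed:
        matches = [y0 for q, y0 in comparison if abs(q - p) <= caliper]
        if matches:  # exposed units with no match are dropped
            diffs.append(y - sum(matches) / len(matches))
    if not diffs:
        return None, 0
    return sum(diffs) / len(diffs), len(diffs)


def att_kernel(exposed, comparison, bandwidth):
    """Kernel matching: every comparison unit contributes, weighted by a
    Gaussian kernel of the distance in propensity score."""
    diffs = []
    for p, y in exposed:
        weights = [math.exp(-0.5 * ((q - p) / bandwidth) ** 2)
                   for q, _ in comparison]
        y0 = sum(w * c[1] for w, c in zip(weights, comparison)) / sum(weights)
        diffs.append(y - y0)
    return sum(diffs) / len(diffs)


# Hypothetical (propensity score, outcome) pairs; outcome 1 = violence reported.
exposed = [(0.30, 1), (0.60, 0), (0.90, 1)]
comparison = [(0.28, 0), (0.35, 1), (0.61, 0), (0.70, 0)]

nn_att = att_nearest_neighbor(exposed, comparison)
wide_att, wide_n = att_radius(exposed, comparison, caliper=0.06)
tight_att, tight_n = att_radius(exposed, comparison, caliper=0.001)
kernel_att = att_kernel(exposed, comparison, bandwidth=0.06)
```

With these toy values, the wider caliper matches two of the three exposed units, and the tighter caliper matches none, which illustrates how a restrictive caliper shrinks the sample over which the effect is estimated.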
The balance of this paper applies these techniques to the issue of women’s employment and its impact on their risk of partner violence in urban and rural Tanzania. The goal is to contribute to the literature on women’s employment and partner violence and to illustrate how to use PSM to reduce selection bias in future violence studies.
Method
Data were drawn from the 2010 Tanzania DHS that was conducted by the Tanzania National Bureau of Statistics with assistance from ICF Macro as part of its MEASURE DHS program (National Bureau of Statistics and ICF Macro, 2011). The Tanzania DHS is a nationally representative survey that gathers information on women’s sexual and reproductive health, indicators of children’s health, socioeconomic characteristics, and experience of domestic violence (National Bureau of Statistics and ICF Macro, 2011). From a total sample of 10,300 households, all women aged 15 to 49 were individually interviewed yielding a sample size of 10,139, including women in settings designated as urban and women in settings designated as rural. The domestic violence module was administered to a subsample of 5,688 women where one eligible (ever partnered) woman per household was randomly selected for interview.
The 2010 Tanzania DHS uses a modified version of the Conflict Tactics Scale (CTS), an established tool used to capture information on a range of violent acts (Straus, 1979, 1996). Women were asked a series of behaviorally specific questions about their experiences of emotional, physical, and sexual violence. Experience of partner violence was defined as an affirmative response to having experienced, in the past 12 months, at least one of seven acts of physical violence (push, slap, arm twisting or hair pulling, punch, kick, choked or burned, or threatened or attacked with a weapon) and/or at least one of two acts of sexual partner violence (physically forced to have sexual intercourse by one’s partner or forced to perform any sexual acts they did not want) (National Bureau of Statistics and ICF Macro, 2011).
Three dichotomous indicators of women’s employment were developed from questions about their work. The first was whether or not respondents had done any productive work outside of the home (i.e., other than housework) in the last 12 months, including selling things, small business, or work on the family farm. Women who reported that they had worked in the past year were asked about whether they were paid in cash or in kind and whether they worked throughout the year (i.e., had stable work), worked seasonally, or only once in a while. The second and third measures of women’s economic status (method of compensation and continuity of work) were a subset of the first measure.
Data Analysis
All analyses were performed using STATA v10.0. Population estimates were derived using the “svyset” command applying survey weights that adjusted for strata and clustering, and PSM was applied using the “PSMATCH2” command (Cameron & Trivedi, 2010; Heeringa, West, & Berglund, 2010; Leuven & Sianesi, 2003). Survey weights were not used to estimate the propensity scores because the scores were used only to match employed with not employed women; the weights were instead applied when estimating the population-level effect-size, together with 95% confidence intervals, of women’s employment on their risk of partner violence (Leuven & Sianesi, 2003; Zanutto, 2006). Our analyses focused on the subpopulation of women who were currently partnered (married or cohabiting) at the time of the survey, and we compared all women who experienced partner violence in the past 12 months with all women who had never experienced emotional, physical, or sexual partner violence (total Tanzania, N = 4,487; urban Tanzania, N = 986; rural Tanzania, N = 3,519).
Matching estimators perform well when individuals reside in the same local labor market (Dehejia & Wahba, 2002; Heckman et al., 1998; Smith & Todd, 2001). Therefore, propensity scores for urban and rural Tanzania were estimated separately and for each employment indicator. Matching estimators also require a comprehensive set of factors (covariates) that predict women’s employment and that satisfy the balancing property, a diagnostic used to confirm that employed and not employed women with similar propensity scores also have a similar distribution of covariates and are therefore adequate for matching. For each group (employed and not employed), women are ranked based on their propensity score, and the distribution of covariates within a specified stratum of the propensity score is tested for statistical difference. A significantly different result in any stratum for any covariate is a violation of the balancing property. Theoretical and empirical literature points to several factors that are likely to influence women’s choice of whether or not to pursue employment, including age, educational attainment, marital status, presence of children, household socioeconomic status, and partner occupational status (Dougherty, 2007; Morrison & Orlando, 1999). Starting with the most parsimonious model including only women’s age, we added additional factors that satisfied the balancing property. The final model used for all three measures of employment included age, age squared (to account for a non-linear relationship between age and employment), marital status (married vs. living together not married [cohabiting]), years of education, number of children living in the household, household socioeconomic status, and partner years of education. Table 1 presents the results of probit models showing how each factor contributed to the propensity scores.
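The balancing check described above can be sketched as follows. This is a simplified illustration, not the routine used in the analysis: units are ranked by propensity score, cut into equal-sized strata, and a single covariate is compared between groups within each stratum using a Welch t statistic. The 1.96 cut-off is a large-sample approximation adopted here as an assumption, and the data and function names are hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch t statistic for the difference in means of two samples."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        return 0.0 if mean(a) == mean(b) else math.inf
    return (mean(a) - mean(b)) / se


def balance_by_stratum(units, n_strata, threshold=1.96):
    """units: list of (propensity score, exposed flag, covariate value).
    Returns one (stratum index, t statistic, balanced?) tuple per stratum;
    t and the flag are None when a group has fewer than two units."""
    ranked = sorted(units, key=lambda u: u[0])
    size = math.ceil(len(ranked) / n_strata)
    report = []
    for k in range(n_strata):
        block = ranked[k * size:(k + 1) * size]
        exposed = [x for _, d, x in block if d == 1]
        comparison = [x for _, d, x in block if d == 0]
        if len(exposed) < 2 or len(comparison) < 2:
            report.append((k, None, None))
            continue
        t = welch_t(exposed, comparison)
        report.append((k, t, abs(t) < threshold))
    return report


# Hypothetical data: (propensity score, employed flag, age in years).
units = [
    (0.10, 0, 22), (0.12, 1, 23), (0.15, 0, 24), (0.18, 1, 21),
    (0.60, 0, 30), (0.62, 1, 41), (0.65, 0, 31), (0.70, 1, 43),
]

report = balance_by_stratum(units, n_strata=2)
```

In this toy data set, age is balanced in the lower stratum but differs sharply between groups in the upper stratum, which would count as a violation of the balancing property and prompt a change to the propensity score model.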
With the exception of being married in urban areas and household socioeconomic status in both urban and rural areas, most of the factors strongly predicted women’s employment status. The pseudo-R² (a measure of goodness of fit) for the models ranged between .026 and .096.
Table 1. Probit Estimates of Employment Indicators and Pre-Intervention Covariates.
*p < .05. **p < .01. ***p < .001.
Figures 1 to 6 show the distribution of the estimated propensity scores for employed and not employed women. The distributions of the two groups are superimposed (gray bars reflecting the employed women and the clear bars the not employed women), and the height of the bars measures the frequency of women within each propensity score range. In both settings, when considering whether or not women worked in the past year, there are far more women who had worked compared with women who had not worked. In addition, in urban Tanzania, there are more women who were paid in cash rather than in kind or not working, although the difference is less dramatic than whether women had worked or not. This may have implications for the nearest neighbor without replacement matching estimator because many women who had worked would not be matched.

Figure 1. Tanzania urban: Worked in past 12 months.
Figure 2. Tanzania urban: Worked for cash.
Figure 3. Tanzania urban: Stability of work.
Figure 4. Tanzania rural: Worked in past 12 months.
Figure 5. Tanzania rural: Worked for cash.
Figure 6. Tanzania rural: Stability of work.
In urban areas, there are roughly the same number of women in stable employment as there are women in occasional work or not working. In rural areas, for both the paid-in-cash and employment-stability measures, there are almost twice as many not employed women as employed women. For all employment measures, there are more employed women than not employed women at higher levels of the propensity score, and this may have implications for both the nearest neighbor without replacement and the radius-matching estimator. As the caliper becomes smaller, the few employed women with the highest propensity scores may not be matched.
In our analysis, we examine the effect-size of the three measures of employment on partner violence using the following three matching estimators: nearest neighbor with replacement, kernel, and radius using three caliper levels, r = 0.0001, r = 0.001, and r = 0.01. In the nearest neighbor matching method, not employed cases that fall outside the area of common support are discarded, and the effect-size is estimated between all employed women and the remaining not employed cases. In the kernel matching method, a weighting mechanism is applied so that each employed woman is matched with all not employed women in estimating the effect-size, but some not employed women are given a weight greater than one and other not employed women are given a weight less than one. In the radius matching method, only those employed women are included in estimating the effect-size where a not employed woman or women can be found within a specified range of employed women’s propensity score. These strategies differ from standard regression techniques where all not employed women would have been included as the comparison group and given equal weight.
Results
Table 2 presents population characteristics by different employment indicators in both urban and rural Tanzania. In both settings and for the three employment indicators, most of the characteristics were significantly different between employed and not employed women. This highlights the potential bias introduced when estimating measures of effect from data drawn without the benefits of randomization. Women who worked in the past year, were paid in cash, or were in stable employment were generally significantly older than women who had not worked, were not paid or were paid in kind, or were in occasional employment (only the result for worked in the past year in rural areas was not significant). Women who were paid in cash or were in stable employment had significantly more years of education and were more likely to come from higher socioeconomic status households, whereas women who reported that they had worked in the past year were more likely to have come from lower socioeconomic households. In addition, in urban areas, women who had worked in the past year had a higher average number of children. In rural areas, women who were married were significantly less likely to have worked in the past year or to have been paid cash. By contrast, women with fewer children were more likely to have worked in the past year.
Table 2. Population Characteristics of Currently Partnered Women by Employed and Not Employed Status (Includes Women Who Experienced Partner Violence in the Past 12 Months and Women Who Never Experienced Emotional, Physical, or Sexual Partner Violence).
Differences in subpopulation means and proportions were tested using t-tests (continuous variables) and Pearson chi-square tests (categorical variables), respectively.
Table 3 presents population estimates of past year physical and/or sexual partner violence and the difference in prevalence by employment measure. The findings show that partner violence is prevalent in both urban and rural Tanzania. Partner violence is significantly more prevalent among women who worked in the past year compared to women who did not work, with a difference between the two groups of 9.5% (urban) and 10.0% (rural). By contrast, partner violence was lower among women who were paid in cash compared with women who were not working/not paid/paid in kind—a finding that was significant only in rural areas. In both settings, partner violence was lower among women who were in stable versus seasonal or occasional employment, significantly so in urban areas.
Table 3. Population Prevalence of Partner Violence Among Currently Partnered Women (Includes Women Who Experienced Partner Violence in the Past 12 Months and Women Who Never Experienced Emotional, Physical, or Sexual Partner Violence).
Descriptive statistics of the propensity scores confirm that the area of common support is wide for all three employment measures in both settings, with only a few not employed women falling outside the boundary (results not shown). For all three employment measures, the mean propensity score for employed women is significantly higher (p < .001) than the mean propensity score for not employed women, that is, the probability of being employed is, on average, lower for women who are not employed than for women who are employed.
Table 4 shows the estimated effect-size (difference in past year physical and/or sexual partner violence) between employed and not employed women using each matching method. In both settings, the estimated population prevalence of past year partner violence was, for all but one matching estimate, significantly higher among women who worked in the past year compared with women who had not. For example, in the urban setting, with the exception of the radius (r = 0.0001) matching estimate, the effect-size ranged between 16.9% (nearest neighbor) and 18.6% (radius, r = 0.01). For the radius-matching estimator (r = 0.0001), the population size was substantially reduced, as was the effect-size (2.5%), which was not significant.
Table 4. Population Weighted Average Effect-Size Estimates of Women’s Employment Indicators on Past Year Physical and/or Sexual Partner Violence Among Currently Partnered Women (Includes Women Who Experienced Partner Violence in the Past 12 Months and Women Who Never Experienced Emotional, Physical or Sexual Partner Violence).
In urban Tanzania, compared with working in the past year, the effect of being paid cash was less strong but still positive and statistically significant for most matching estimates. The only non-significant result was associated with the radius-matching estimate (r = 0.0001) that yielded an effect-size of 9.6%. For other matching estimates, the significant effect-size ranged between 6.3% (kernel) and 10.9% (radius, r = 0.001). In rural Tanzania, all of the matching estimates yielded negative effect-sizes implying the prevalence of past year partner violence was lower among women who were paid in cash compared with women who were not; however, none of the results were significant. The effect-size ranged between −1.7% (nearest neighbor) and −5.4% (radius, r = 0.0001).
Women’s stable employment resulted in varied effect-sizes in urban areas. While the matching estimates generally resulted in negative effect-sizes, they ranged between 0.8% (kernel) and −9.1% (radius, r = 0.0001). By contrast, in rural areas, stable employment yielded positive mean differences for all matching estimates, and the effect-size ranged between 3.6% (radius, r = 0.001 and r = 0.01) and 6.3% (nearest neighbor). However, in both settings, none of the matching estimates were significant.
Conclusion
In this article, we use PSM as a means to overcome self-selection bias in an effort to estimate the effect-size between different measures of women’s employment and their risk of partner violence in non-experimental studies. The merit of PSM lies in its ability to match an employed woman with a woman or women who are not employed but who share the same profile of characteristics that predict self-selection into employment. We demonstrate that Tanzanian women who work outside of the home are significantly different from those who do not, suggesting self-selection into employment. In both urban and rural areas, women’s risk of partner violence appears higher among women who worked in the past year than among those who did not, even after using PSM to account for underlying differences in these two groups of women. Being paid in cash reversed this effect in rural areas whereas stability of employment reduced this risk in urban centers.
There are, however, several limitations associated with using PSM methods that need to be highlighted. The first is that if exposed individuals cannot be matched, the generalizability of the findings can be compromised, and the estimated effect-size applies only to the matched subgroup, which makes it harder to interpret. In our study, the area of common support included all employed women, and so this was generally only an issue when we used the radius PSM estimator with a restrictive caliper (r = 0.0001). At higher values of the propensity scores, there were fewer not employed women to match employed women with, and this potentially contributed to the wider confidence intervals observed for the nearest neighbor estimations, as fewer not employed matches are used to estimate the effect-size. A second limitation is that PSM methods also risk biased estimates if the available covariates are not adequate to yield a propensity score that adequately predicts the underlying exposure (in our case, the likelihood to seek employment). A limitation of probability models, however, is that they often yield low goodness-of-fit statistics, as is evident in this and other studies (Morrison & Orlando, 2004). A third limitation of PSM is that it does not account for unobserved characteristics that influence exposure, and therefore, establishing the causal relationship can still remain unresolved if there is no clear causal pathway. This is particularly an issue in our analysis where the pathway between women’s employment status and partner violence is complex.
In the absence of an experimental-benchmark estimate, the validity of our results from using PSM is difficult to assess. However, the findings shed interesting light on how different aspects of women’s work are associated with their experience of partner violence. The majority of Tanzanian women are engaged in productive work outside of the home, and on average, working within the last year is associated with an increased risk of women experiencing partner violence at the population level. This risk is higher than that documented by the differences presented in Table 3. However, among those in rural Tanzania who have recently worked, women receiving cash had lower risk compared to women working for no or in-kind payment or who were not working at all. In urban Tanzania, working for cash was not protective whereas employment stability did reduce women’s vulnerability to partner violence among those women who worked. When considering that in urban Tanzania almost all women in stable employment were paid cash, this finding may suggest that it is the combination of stable, paid work that is more likely to protect women from partner violence.
The findings from this study confirm existing evidence that the effect of women’s work operates differently in different contexts. A study in Bangladesh found higher violence among women who earned an income in rural Bangladesh but no significant relationship in urban Bangladesh, and a multi-country study found that working for cash was associated with higher violence in India, the Dominican Republic, and Nicaragua, but with lower violence in Egypt (Kishor & Johnson, 2004; Naved & Persson, 2005). To date, one study has focused on regularity of work as a proxy for stable employment and found it to be associated with lower lifetime violence in India (Panda & Agarwal, 2005).
Exactly why productive work outside the home increases women’s overall risk of partner violence is still an open question. The effect could reflect the added stress on household dynamics as women’s work is often in addition to their domestic responsibilities yet expectations of women as wives and mothers remain high (Narayan, Chambers, Shah, & Petesch, 2000). Intriguingly, the lower effect among women receiving a cash payment in rural Tanzania supports the assertion that monetary resources enhance women’s status within the household (Tauchen & Witte, 1995; Tauchen, Witte, & Long, 1991). In addition, the majority of cash-paid women in rural Tanzania are engaged in agricultural work—a type of work that may be an extension of their existing productive activities and, therefore, does not fall outside of established norms. By contrast, in urban Tanzania, where gender roles are in greater flux, women taking on non-traditional jobs may represent more of a challenge to the gendered division of labor. In such a situation, cash alone may not provide women with the opportunity to leave abusive husbands and the added security of a regular income is required to secure their status within the household.
This study used PSM methods to estimate an unbiased effect-size between women’s employment status and partner violence. Despite the limitations of PSM, the method is a useful tool for reducing bias in observational studies. Its application, combined with other evaluation methods (e.g., difference-in-differences, for use in longitudinal or repeated cross-sectional studies), provides new avenues for violence researchers to probe the relationship between key variables and partner violence.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
