Abstract
A wide literature uses date of birth as an instrument to study the causal effects of educational attainment. This paper shows how parents delaying their children’s initial enrollment in kindergarten, a practice known as redshirting, can make estimates obtained through this identification framework all but impossible to interpret. A latent index model is used to illustrate how the monotonicity assumption in this framework is violated if redshirting decisions are made in a setting of essential heterogeneity. Empirical evidence is presented from the Early Childhood Longitudinal Study, Kindergarten Class (ECLS-K) data set that favors this scenario; redshirting is common and heterogeneity in the treatment effect of educational attainment is likely a factor in parents' redshirting decisions.
1. Introduction
Social scientists have devoted a great deal of attention to understanding the effects of educational attainment on a range of outcomes. These effects weigh heavily in many policy decisions, such as whether to subsidize education programs for General Equivalency Diploma (GED) certification (Cameron & Heckman, 1993), how much to invest in preventing students from dropping out of school (Dearden, Emmerson, Frayne, & Meghir, 2009; Oreopoulos, 2007), and at what age children should be eligible to enter school (Aliprantis, 2010) and the labor market (Deming & Dynarski, 2008). More generally, it is important to understand the effects of education when designing interventions to improve outcomes, especially those focusing on health (McCrary & Royer, 2011), early childhood (Heckman, Moon, Pinto, Savelyev, & Yavitz, 2010), labor market skills (Heckman, Lalonde, & Smith, 1999), earnings (Card, 1999), and housing (Sanbonmatsu, Kling, Duncan, & Brooks-Gunn, 2006). However, since educational attainment is chosen endogenously by individuals, it is difficult to identify its causal effects (Card, 2001).
One widely used approach to identifying causal effects of educational attainment uses quarter of birth as an instrument for educational attainment, a literature that began with the seminal work of Angrist and Krueger (1991). This identification strategy uses the naturally occurring variation in birth dates together with schools' entrance cutoff dates to assign different levels of education to children of the same age. This framework has since been used in many settings, but in its original setting it is combined with compulsory schooling laws that prohibit students from dropping out of school before a specific age. Since these compulsory schooling laws apply to students' ages, otherwise similar children are legally able to withdraw from school with differing levels of educational attainment. The crucial identifying assumption of monotonicity in this framework is that quarter or date of birth shifts the educational attainment of every affected child in the same direction.
The contribution of this paper is to show that parents delaying their children’s initial enrollment in kindergarten, a practice known as redshirting, makes it all but impossible to interpret estimates of the effects of educational attainment when date or quarter of birth is used as an instrument for educational attainment. Theoretical evidence is presented that redshirting creates violations of the monotonicity assumption necessary to identify many of the causal effects of educational attainment estimated in the literature. The paper also presents empirical evidence from the Early Childhood Longitudinal Study, Kindergarten Class of 1998–1999 (ECLS-K) data set indicating not only that redshirting is common but that heterogeneity in the treatment effect of educational attainment is likely a factor in parents' redshirting decisions. The paper discusses in detail exactly how the interpretation of the estimator breaks down when this evidence is considered. Despite the scrutiny already given to this identification strategy, date of birth has been and continues to be used as an instrument for the Local Average Treatment Effect (LATE) or Average Causal Response (ACR) of educational attainment in a wide variety of applications. The novelty of this paper is to highlight the distinct methodological problem redshirting creates when date of birth is used as an instrument for educational attainment, an important factor when considering the results from this literature.
The result presented in this paper is pertinent to the wider discussions about the role of theory in empirical microeconomics (Keane, 2010; Heckman, 2010; Imbens, 2010), and is especially relevant to discussions about the interpretation of estimates generated by natural experiments (Rosenzweig & Wolpin, 2000). One line of research on these topics by Heckman and coauthors (Heckman, Urzúa, & Vytlacil, 2006; Heckman & Urzúa, 2010) emphasizes that while recent developments in the instrumental variables (IV) literature allow for responses to treatment to be heterogeneous, the monotonicity assumption in these models restricts the choice into treatment from being similarly heterogeneous. The way parents choose to redshirt their children violates this assumption, a scenario Heckman et al. (2006) refer to as essential heterogeneity. This example accentuates the importance of understanding the relationship between the Rubin Causal Model developed in the statistics literature and the Roy Model developed in the economics literature (Heckman, 2005; Sobel, 2005), especially as it relates to the joint modeling of outcome and choice equations (Heckman, 2010).
The paper is organized as follows: Sections 2 and 3 discuss the identifying assumptions of several causal treatment effects within a canonical framework. Section 4 presents the popular application of this framework using date of birth as an instrument for educational attainment to estimate causal effects of schooling. Section 4 also demonstrates how redshirting violates the identifying assumption of monotonicity, and Section 5 examines data from the ECLS-K data set illustrating the empirical magnitude of this problem. Section 6 goes into detail about how the interpretation of estimates obtained by this identification scheme is affected by redshirting and also presents a very brief overview of the literature affected by this issue. Section 7 concludes.
2. Identifying Treatment Effects Using Randomization
2.1. The Average Treatment Effect (ATE)
Consider a standard framework for studying causal treatment effects (Holland, 1986; Rubin, 1974). Let
3. Identifying Treatment Effects Using Instrumental Variables
When the researcher does not control the treatment individuals receive, one strategy for identifying treatment effects is to search for an instrumental variable. Define Assumption 1-i: Assumption 1-ii:
Comparing the outcome variable
3.1. Constant Treatment Effect
Consider a version of Assumption 2 where the researcher assumes a constant treatment effect: Assumption 2a:
When Assumptions 1 and 2a hold,
3.2. The Average Treatment Effect for the Treated (ATT)
A researcher might also have reason to believe that there is some value of the instrument, Assumption 2b: There exists
In this case, Equation 2 becomes:
3.3. The LATE
A final approach, originally proposed in Imbens and Angrist (1994), is to make a monotonicity assumption. The assumption of monotonicity is that if the instrument induces changes in treatment, these changes must be in the same direction for all individuals. This assumption allows for treatment effect heterogeneity and also allows for some individuals to receive treatment at all values of the instrument. Under the assumption of monotonicity, all of the individuals affected by the instrument are either caused to “switch-in” or else to “switch-out” of treatment. Specifically: Assumption 2c: For all possible values of
Define the LATE to be the average causal effect of treatment for those whose treatment status is affected by the instrument. If
3.4. Multiple Treatments and the ACR
Now consider a scenario in which the instrumental variable is still dichotomous, but individuals may receive three treatment intensities: 1-i:
while Assumption 2 is the same as necessary to identify the LATE (i.e., 2c). Angrist and Imbens (1995) proved that if Assumptions 1 and 2 are true, and
4. Date of Birth as an Instrument for Educational Attainment
We now consider the widely used application of the LATE and ACR that is the focus of this paper: using date of birth to identify causal effects of educational attainment. In the United States, children are eligible to begin kindergarten if they turn 5 before a specific entrance cutoff date. To continue with the previous framework in which the instrumental variable is dichotomous, consider only those children born in the quarter before (

Figure 1. Instrument, Treatment, and Attainment by Birthday.
Further assume there exists a latent index:
4.1. Heterogeneous Treatment Effects Satisfying the Monotonicity Assumption
The assumption of monotonicity is that those individuals affected by the instrument must all be affected in the same direction. In terms of our model, this assumption is that either
4.2. Heterogeneous Treatment Effects Violating the Monotonicity Assumption: The Case of Redshirting
Parents and schools often choose to redshirt children, that is, to delay their initial enrollment in kindergarten. Thus, it is more realistic to consider a model in which the parents of type H children do not redshirt their children, while children of type L are redshirted. This may be captured in the context of our model by letting
Redshirting creates violations of the monotonicity assumption, Assumption 2c. When
Figure 1 helps to illustrate that in this case the latent index in Equation 7 and the treatment assignment rule given by Equation 6 yield
The next Section presents empirical evidence that redshirting is prevalent and that it is appropriate to apply the specified model of essential heterogeneity to the process of redshirting in the data set examined. Together with the theoretical considerations just presented, this empirical evidence complicates the interpretation of estimates of the LATE or ACR of educational attainment obtained when using date of birth as an instrument for educational attainment. A detailed example illustrating these complications is considered in Section 6.
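As a purely illustrative sketch, and not the latent index model specified in this Section, the following Python function computes the Wald estimand for a binary instrument and a binary treatment when some share of the population responds to the instrument in the opposite direction from everyone else. The function name, the type shares, and the treatment effects are hypothetical values chosen for illustration. The decomposition used is the standard one for a population of compliers and defiers: the reduced form equals the compliers' share times their average effect minus the defiers' share times their average effect, and the first stage equals the difference in shares.

```python
from fractions import Fraction


def wald_estimand(share_compliers, effect_compliers,
                  share_defiers, effect_defiers):
    """Population Wald estimand with compliers and defiers.

    All arguments are hypothetical illustration values, not estimates
    taken from the ECLS-K or from any paper in the literature.
    """
    # Reduced form: E[Y | Z = 1] - E[Y | Z = 0].
    # Compliers gain their effect; defiers lose theirs.
    reduced_form = (share_compliers * effect_compliers
                    - share_defiers * effect_defiers)
    # First stage: E[D | Z = 1] - E[D | Z = 0].
    first_stage = share_compliers - share_defiers
    if first_stage == 0:
        raise ValueError("first stage is zero; the Wald estimand is undefined")
    return reduced_form / first_stage


# Hypothetical population: 80% compliers with a treatment effect of 1.
compliers = Fraction(4, 5)

# Monotonicity holds (no defiers): the estimand is the compliers' effect.
print(wald_estimand(compliers, 1, 0, 3))

# Monotonicity fails: 20% defiers whose treatment effect is 3.
print(wald_estimand(compliers, 1, Fraction(1, 5), 3))

# Monotonicity fails: 20% defiers whose treatment effect is 5.
print(wald_estimand(compliers, 1, Fraction(1, 5), 5))
```

With these hypothetical values the estimand equals the compliers' effect of 1 when there are no defiers, falls to 1/3 when the defiers' effect is 3, and becomes -1/3 when the defiers' effect is 5, even though every individual treatment effect in the example is positive. The estimand is then no longer a weighted average of any group's treatment effects.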
5. Empirical Evidence Regarding the Violation of Monotonicity
5.1. Data
Data are used from the ECLS-K data set. The ECLS-K is a nationally representative sample of 22,666 children enrolled in 1,277 schools who started kindergarten in the fall of 1998. Data were collected during the fall and the spring of kindergarten (1998–1999), the fall and spring of first grade (1999–2000), the spring of third grade (2002), fifth grade (2004), and eighth grade (2007) from the children, their parents/guardians, teachers, and school administrators.
5.1.1. Variables
Following the terminology in Bedard and Dhuey (2006), we refer to the relative age at which a child would be observed if they entered kindergarten when first eligible as assigned relative age, and the child’s actual age relative to their school’s cutoff date as observed relative age. Figure 2 shows this relative age measured in months. For example, consider a child who lives in a state where the entrance cutoff age is exactly 5 years old at the start of the school year. Then a child who is 5 years and 3 months old at the start of the school year when first eligible to enroll is in the relative age group

Figure 2. Relative Age Groups and Entrance Age.
In order to assign children in the ECLS-K to these relative age cohorts, the ECLS-K public data file was used to obtain data on respondents' exact birth date, as well as school-level entrance cutoff dates. All variables represented as calendar dates were first converted to a daily time line in which day 1 is January 1, 1990. After all time-related variables were first constructed using this time line, these daily variables were divided by 365 to create annual variables. The yearly variables were then multiplied by 12 in order to create variables measured in months. A child’s relative age
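The unit conversions described in this paragraph can be sketched in Python as follows. Only the origin of the time line (day 1 is January 1, 1990) and the divide-by-365, multiply-by-12 steps come from the text. The function names, the example birth date, and the example cutoff date are hypothetical, and the final helper is an illustration of the arithmetic rather than the paper's definition of relative age.

```python
from datetime import date

# Day 1 of the daily time line is January 1, 1990, as stated in the text.
ORIGIN = date(1990, 1, 1)


def to_day_index(calendar_date):
    """Convert a calendar date to its position on the daily time line."""
    return (calendar_date - ORIGIN).days + 1


def days_to_months(days):
    """Convert days to years (divide by 365), then to months (times 12)."""
    years = days / 365
    return years * 12


def age_in_months(birth_date, reference_date):
    """Hypothetical helper: age in months at a reference date."""
    elapsed_days = to_day_index(reference_date) - to_day_index(birth_date)
    return days_to_months(elapsed_days)


# Hypothetical child and cutoff date, chosen only for illustration.
example_birth = date(1993, 6, 1)
example_cutoff = date(1998, 9, 1)
print(round(age_in_months(example_birth, example_cutoff), 2))
```

Dividing by 365 ignores leap days, so ages computed this way drift slightly from calendar months; the sketch keeps that approximation because it is the one the text describes.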
5.2. Empirical Evidence That Redshirting Is Prevalent
Table 1 shows the distribution of observations in the ECLS-K in each relative age group when using school-level entrance cutoff dates, including children repeating kindergarten. Table 2 shows the same data but for the sample including only first-time kindergarteners. If we assume parents' decision rule for determining observed entry age does not change over time, cutoff dates stayed the same between 1997 and 1998, and that any seasonal patterns in number of births are repeated every year, then we may use Tables 1 and 2 to estimate the percentage of children in each relative age group who enter early, when first eligible, or after redshirting. These estimates are presented in Tables 1 and 2. Tables 3 and 4 show these estimates aggregated to the level of quarters.
Cohorts of the ECLS-K (By Month)
Cohorts of the ECLS-K (By Month)
Examining Tables 2 and 4, note that 27% of children who turned 5 within 1 month of their school’s cutoff date are redshirted, as are 19% of children who turned 5 within one quarter of their school’s cutoff date. The percent of children delayed in school by month and quarter rises to 31% and 23%, respectively, if we include children who are held back after starting school (Tables 1 and 3). These figures suggest that the scenario described in Section 4.2 is empirically large, with a conservative estimate of

CDFs of Attainment at Age 6 Conditional on Z.
5.3. Empirical Evidence of Essential Heterogeneity
To investigate the relationship between redshirting and treatment effect heterogeneity, Tables 5 through 7 present descriptive statistics of children in the groups from : those in
Cohorts of the ECLS-K (By Quarter)
Cohorts of the ECLS-K (By Quarter)
Race
Gender
Household Characteristics
This evidence from the ECLS-K shows that redshirting patterns are different for a specific group of children, but the model of essential heterogeneity in Section 4.2 requires that redshirters are affected differently by educational attainment than other children. Since we never observe the counterfactual of redshirters entering on time, it is difficult to conceive of conclusive evidence that there are, or are not, differences in the effects of educational attainment between redshirters and non-redshirters. The current evidence on the impacts of redshirting examines outcomes only after children have been redshirted (Graue & DiPierna, 2000).
However, there is empirical evidence that strongly suggests treatment effect heterogeneity between redshirters and non-redshirters. First, parents redshirt children based on perceived treatment effect heterogeneity. Although there is no clear definition of the word “readiness” (Ackerman & Barnett, 2005), the fact that parents and schools use some measure of readiness, however imprecise (Stipek, 2002), means that parents clearly choose to delay their children’s entry into kindergarten based on perceived heterogeneity in the effects of educational attainment (Graue, 1993). Second, there is evidence of heterogeneity in the effect of educational attainment on earnings (Chernozhukov & Hansen, 2006). Finally, there is ample evidence of heterogeneity in the effects of many educational interventions over the demographic variables characterizing redshirters. For example, there is evidence that income (Blau, 1999), home inputs such as the number of books at home (Todd & Wolpin, 2007), mother’s time at home (Datcher-Loury, 1988), mother’s educational attainment (Murnane, Maynard, & Ohls, 1981), maternal employment (Bernal & Keane, 2010), gender (Dee, 2007; Hastings, Kane, & Staiger, 2006), and race (Currie & Thomas, 1995; Dee, 2004b; Garces, Thomas, & Currie, 2002; Hanushek, Kain, & Rivkin, 2004; Krueger, 1999) all play important roles in the effects of education interventions. While inconclusive, this empirical evidence points in favor of the model of essential heterogeneity specified in Section 4.
6. Example: Angrist and Krueger (1991)
Redshirting was likely not prevalent among males in the United States born between 1930 and 1959, the sample studied in Angrist and Krueger (1991) (henceforth AK). However, AK introduces the seminal framework for the instrument being discussed, and understanding how redshirting would have affected its estimates helps to illustrate the problems redshirting creates for newer samples in which redshirting is prevalent. Consider the Wald estimates obtained in AK. Let
Now consider the group of individuals who respond to the instrument, and assume in the case of AK that these individuals would all drop out at the age when first eligible. Returning to the latent index in Section 4, consider what happens if 20% of children are redshirted, being of type

Solutions.
6.1. Implications for the Literature
The preceding example illustrates that parameters of interest may be unidentified when quarter or date of birth is used as an instrument for educational attainment. The implications of redshirting for parameter estimates in the literature will depend on the nature of redshirting in the sample being studied, as well as the exact way redshirting interacts with the compulsory schooling laws being used. Nevertheless, there is a large literature for which redshirting might be a relevant concern, as compulsory schooling laws have been used to estimate a wide range of parameters. A sample of these parameters includes the effects of schooling on AFQT scores (Cascio & Lewis, 2006; Neal & Johnson, 1996), civic participation (Dee, 2004a; Milligan, Moretti, & Oreopoulos, 2004), criminal activity (Lochner & Moretti, 2004), mortality (Lleras-Muney, 2005), happiness (Oreopoulos, 2007), and general health outcomes (Adams, 2002); the effects of maternal education on infant health (McCrary & Royer, 2011) and fertility decisions (Black, Devereux, & Salvanes, 2004); the effect of parents' educational attainment on children’s educational outcomes (Oreopoulos, Page, & Stevens, 2006); the magnitude of human capital externalities (Acemoglu & Angrist, 2000); and the effects of kindergarten entrance age on educational outcomes (Bedard & Dhuey, 2006; Datar, 2006; Elder & Lubotsky, 2008; McEwan & Shapiro, 2008). It should also be noted that although the Regression Discontinuity Designs (RDDs) discussed in the literature such as Hahn, Todd, and Van der Klaauw (2001) and Imbens and Lemieux (2008) are for binary treatments, redshirting also has implications for the appropriate application of RDDs.
7. Conclusion
Beginning with the seminal work of Angrist and Krueger (1991), a wide literature has sought to estimate the effects of educational attainment using quarter or date of birth as an instrument for educational attainment. In this paper, we have provided theoretical and empirical evidence that parents delaying their children’s initial enrollment in kindergarten, a practice known as redshirting, makes it all but impossible to interpret estimates of the effects of educational attainment using this identification framework. Theoretical evidence is presented that redshirting creates violations of the monotonicity assumption necessary to identify many of the causal effects of educational attainment estimated in the literature. Empirical evidence from the ECLS-K data set demonstrated that redshirting is common and that a model of essential heterogeneity is likely appropriate for the redshirting decisions of children in the ECLS-K.
The result presented in this paper contributes to the wider discussions about the role of theory in empirical microeconomics, as well as the relationship between econometrics and statistics. More specifically, a careful investigation of the complications introduced by redshirting showed that estimates of the effect of educational attainment may become all but impossible to interpret in a model of essential heterogeneity. This scenario resulted in a breakdown of the IV framework in which we were simply unable to identify treatment parameters. This finding has important implications for the literature using date of birth as an instrument for the LATE or ACR of educational attainment.
Acknowledgments
The author would like to thank Ken Wolpin, Petra Todd, Alan Krueger, Dylan Small, Becka Maynard, Matt White, Michela Tincani, Tim Dunne, and two anonymous referees for helpful comments. The research reported here was supported by the Institute of Education Science, U.S. Department of Education, through Grant R305C050041-05 to the University of Pennsylvania. The views stated herein are those of the author and are not necessarily those of the Federal Reserve Bank of Cleveland, the Board of Governors of the Federal Reserve System, or the U.S. Department of Education.
