Abstract
The Florida College and Career Readiness Initiative (FCCRI) was a statewide policy requiring college readiness testing and participation in college readiness courses for high school students. We used regression discontinuity to compare outcomes for students scoring just above and below test score cutoffs for assignment to FCCRI. We also examined impacts for students from a wider range of academic performance by using a before-after regression analysis to compare outcomes for targeted students before and after their schools implemented the FCCRI. The FCCRI increased the likelihood of enrolling in nondevelopmental courses for some targeted students, although results differ by academic performance. However, smaller differences in the likelihood of passing nondevelopmental courses suggest that some students were not prepared for these courses.
This study examines the impact of legislation enacted by Florida for a statewide program known as the Florida College and Career Readiness Initiative (FCCRI), which was intended to reduce the need for developmental education. The FCCRI consisted of testing Grade 11 students to determine their college readiness and offering math and English college readiness and success (CRS) courses in Grade 12 for students who did not test college ready the year before. We begin by reviewing the literature on the transition from high school to postsecondary education and the impact of similar policies in other states. Next, we provide an overview of the state policy context in Florida and changes to the policy over time. Then we describe the data and two different methods used to estimate program impacts. We used a sharp intent-to-treat (ITT) regression discontinuity (RD) design to compare outcomes for students scoring just above and below test score cutoffs for assignment to the FCCRI. 1 We also examined the impact of offering the FCCRI to students from a wider range of academic performance levels by using regression analysis to compare outcomes for targeted students before and after the schools implemented the FCCRI. We find that the FCCRI increased the likelihood of enrolling in nondevelopmental courses for some targeted students who seamlessly enrolled in college, although the results differ based on student performance on standardized assessments. However, smaller differences between the treatment and control groups in the likelihood of passing nondevelopmental courses suggest that some students may not be prepared for these courses. We conclude with implications for researchers and policymakers to consider for similar programs in other states and directions for future research.
Literature Review
Developmental education is necessary because many high school graduates do not have the requisite skills to complete college-level courses (e.g., Achieve, 2016; Boser & Burd, 2009; Strong American Schools, 2008). These students often do not recognize that they lack the preparation necessary to complete for-credit college courses and that they will be required to enroll in developmental education. The gap in understanding college readiness can be traced to the governance divide between K–12 and postsecondary institutions as well as to complexities in procedures that are meant to alert students to their college readiness. Often high schools and colleges within a state use different college readiness tests—even when states attempt to align college readiness testing at high schools and colleges, the resulting policy is often fragmented or incomplete. When college readiness testing is not consistent between the high school and college levels, students do not receive a clear message about their college readiness. Better alignment would not only help students prepare appropriately for postsecondary education but might also assist teachers by providing focused and centralized college readiness standards.
High school curriculum also plays an important role in preparing students for college. Students who participate in rigorous high school classes are more likely to persist at a postsecondary institution, remain enrolled at their initial institution, and pursue a bachelor’s degree track if they transfer institutions (Attewell & Domina, 2008; Long, Conger, & Iatarola, 2012). Furthermore, postsecondary enrollment correlates more closely with high school curriculum than with high school test scores or class rank (Adelman, 2006). These findings suggest that policy initiatives such as the FCCRI that are designed to boost enrollment in more rigorous high school courses may improve postsecondary outcomes.
There is limited evidence on the effectiveness of programs like the FCCRI. Kurlaender, Jackson, Grodsky, and Howell (2016) examined a portion of California’s Early Assessment Program, which was designed to bridge the gap between secondary and postsecondary institutions. The program included a voluntary assessment of college readiness in Grade 11 similar to the FCCRI’s. Using a difference-in-difference approach, the researchers found that the college readiness assessment slightly reduced the need for developmental education in math and English, with the largest impact for students near the readiness cutoff.
Hurwitz, Smith, Niu, and Howell (2015) assessed a policy in Maine that required students to take the SAT in Grade 11. The policy, designed to meet No Child Left Behind’s accountability requirements by providing an indication of student and school performance, had the additional benefit of helping students meet admissions requirements at many 4-year colleges. Using difference-in-difference methodology, the authors found that mandatory testing raised college-going rates by 2 to 3 percentage points. However, the authors did not look at the impact on developmental education participation rates.
To our knowledge, only one other study has examined the impact of transition courses. Pheatt, Trimble, and Barnett (2016) studied West Virginia’s intervention, which required schools to offer a math transition course to Grade 12 students who scored below mastery on the state’s standardized exam in Grade 11, although course enrollment was voluntary. They found that the math transition course had a negative impact on the likelihood of passing a college gateway course. The researchers hypothesized that the courses may not have been aligned with college readiness testing and that students may have taken the transition course in place of more rigorous courses.
Florida Policy Overview
In 2008–2009, Florida introduced its own statewide initiative, the FCCRI, to improve the state’s college and career readiness rate. This differed from other college readiness initiatives, such as summer bridge programs, in that it provided students with earlier information about their level of college readiness and focused solely on improving the academic skills needed for college. The FCCRI was also geared toward a much broader group of students than other readiness interventions, which tend to be more narrowly targeted toward college-bound students from disadvantaged backgrounds.
In Grade 10, all students took the Florida Comprehensive Assessment Test (FCAT). The FCAT assesses basic skills learned through Grade 10, but does not assess higher-level skills learned later in high school and needed for college success. Students who scored in the midrange of the FCAT were targeted for college readiness testing in Grade 11. Students scoring above this range were assumed to be college ready, whereas those below were considered at risk of not graduating from high school and were required to take a course to help them pass the FCAT. Students who did not score above the readiness threshold on the college readiness assessment in Grade 11 were assigned to CRS courses in Grade 12. Offered in math and English, CRS courses were intended to inform students of their college readiness status and help them develop the skills needed to score college ready and succeed in college courses. The FCCRI differed from other statewide initiatives in that Florida was the only state where community colleges and high schools administered the same test (Achieve, 2012).
Implementation of the FCCRI began during the 2008–2009 school year under Senate Bill 1908 (2008). Under the initial policy, herein referred to as the voluntary FCCRI, students could choose whether or not to take the College Placement Test (CPT) in Grade 11 and CRS courses in Grade 12. Starting in 2011–2012, the FCCRI became mandatory for targeted students under House Bill 1255 (2011). The updated policy, herein referred to as the mandatory FCCRI, made two further changes. First, a new assessment, the Postsecondary Education Readiness Test (PERT), replaced the CPT. Second, high schools were required to administer the PERT directly, rather than partnering with colleges to administer the CPT. There were three cohorts under the voluntary FCCRI (V1 through V3) and two cohorts under the mandatory FCCRI (M1 and M2).
New legislation in 2015 eliminated the requirements for common placement testing and CRS courses, and participation in both components became voluntary at both the student and school levels beginning in the 2015–2016 school year. These changes occurred after the cohorts for our evaluation completed high school and do not affect the impact analyses.
Description of Treatment and Counterfactual Conditions
The Florida Department of Education (FLDOE) set standards for CRS courses that defined the topics to be covered. However, districts, high schools, and teachers had considerable discretion in how they implemented these courses and in the curricular materials that were used. As a result, even though all CRS courses shared the same label, implementation differed across schools and even within schools. Even so, many math CRS courses took the form of a review of Algebra I and II. English CRS courses tended to resemble a traditional English IV course, although many teachers indicated that they placed more emphasis on integrating ACT or PERT preparation and study skills into the CRS course (Mokher et al., 2014). Schools tended to assign teachers with strong credentials to these courses; CRS teachers were more likely to have graduate degrees and more years of experience compared to the statewide population of teachers (Mokher et al., 2013).
During the first 4 years of the FCCRI (the time frame for this analysis), approximately 84% of students were targeted for college readiness testing in math and 57% in English; thus, most of the student population was affected by this initiative. The state did not impose sanctions for noncompliance, so not all schools participated. Roughly 60% of schools offered college readiness testing during the voluntary FCCRI, rising to 96% during the first mandatory year. Fewer than 50% of schools offered CRS math courses and fewer than 25% offered CRS English courses in the voluntary period; both rates rose to over 90% when the policy became mandatory.
Nontargeted students also participated in CRS courses for several reasons. At some schools, students who did not take the PERT were placed in CRS courses. Additionally, some districts eliminated standard- or honors-level Grade 12 English courses, making the CRS course the default English course for many students. In some cases, administrators placed nontargeted students in CRS courses when the students were struggling in more advanced classes or needed an additional credit to graduate.
Course-taking by voluntary cohorts presents evidence of the counterfactual. In math, lower- to mid-performing targeted students primarily shifted away from “other regular math” courses (which include courses such as Financial Applications, Analytic Geometry, and Liberal Arts Math) and Algebra II. Higher-performing targeted students also shifted away from “other regular math” as well as from “other advanced math” courses (which include courses such as AP Statistics and Honors Probability). In English, targeted students, particularly those who were higher performing, were more likely to take regular English courses (including CRS) and less likely to take honors-level English courses. This suggests that the FCCRI could have induced students on the margin into less rigorous course work, which would in turn shape the estimated effect of the FCCRI for these students.
Research Questions
The purpose of this study is to examine the impact of the FCCRI on student outcomes at the end of high school and during the 1st year of college. The research questions are as follows:
1. What is the impact of the FCCRI on outcomes by the 1st year of college for
(a) students just below the upper FCAT threshold for assignment to college readiness testing (highest-performing targeted students) compared to students just above the threshold?
(b) students just above the lower FCAT threshold for assignment to college readiness testing (lowest-performing targeted students) compared to students just below the threshold?
(c) students just below the PERT threshold for assignment to CRS courses compared to students just above the threshold?
2. What is the impact of the FCCRI once schools switch from low to high compliance for all students with eligible FCAT scores (not just those near the thresholds)?
3. How does the impact of the FCCRI differ among all eligible students by baseline achievement and the level of initial college course taken?
Data and Sample
Data Sources
Our primary data source for both analyses is student-level records from the Florida K–20 Education Data Warehouse. These data follow all Florida public school students from Grade 10 for as long as they remain in Florida’s public education system and are supplemented with school-level variables from the National Center for Education Statistics’ Elementary/Secondary Information System and reports produced by FLDOE. We omitted students who transferred to an out-of-state, private, or home school or withdrew from school for medical reasons in Grade 11 or who did not have a school enrollment record in Grade 12.
RD Sample
The RD analysis includes only cohort M1 because the voluntary cohorts had very low participation rates in both college readiness testing and CRS courses, with little variation at test score cutoffs. We combined test score data with information on race-ethnicity, gender, free or reduced-price lunch (FRPL) status, English language learner (ELL) status, Grade 10 grade point average (GPA), and college outcomes. Casewise deletion of students with missing values gives n = 145,580 in math and n = 145,754 in English among all students who took the FCAT in Grade 10. 2
We examine two different FCAT cutoffs using two different quasi-experiments and samples. For example, few students near the low FCAT margin score college ready on the PERT, whereas nearly all of those near the high FCAT margin do. The former group is therefore less likely to succeed in college but more likely to enroll in CRS course work. As our Grade 11 PERT sample consists of students targeted based on FCAT scores, it represents a third quasi-experiment and sample with achievement levels bounded by those of the two FCAT samples. However, students near this cutoff have above-average FCAT scores among targeted students. Our Grade 12 PERT sample (a fourth quasi-experiment using a fourth sample) was targeted based on FCAT scores, scored below college ready on the PERT, and then took a CRS course; it therefore tends to be lower achieving than the overall PERT sample. Additionally, because PERT retesting is not required under the FCCRI, there may be both observed and unobserved differences between students who choose to retest and those who do not. Although these differences are modest, they suggest that retesters are more likely to come from lower in the grade distribution than single-testers, perhaps because those at the top of the grade distribution have concordance scores on another assessment (see Supplementary Table S1 in the online version of the article). 3 These differences in composition at each cutoff do not invalidate our estimates at any given cutoff; however, we cannot separately identify the effect of different policies at each cutoff from composition effects at each cutoff.
Table 1. Variable Overview
Note. DE = developmental education; FCAT = Florida Comprehensive Assessment Test. For college course outcomes, if no course was taken during the 1st year, dual enrollment and Advanced Placement courses are considered.
Pretreatment achievement variables are used only as control variables in the before-after regression analysis. We also included squared terms for both FCAT subjects and an interaction term between the intent to treat and the FCAT score for the subject of interest (e.g., for the math estimates, we interact the intent to treat with the math FCAT score).
Analyses using Grade 11 PERT values as the running variable use only students targeted for PERT testing (n = 89,225 for math, n = 51,383 in English under casewise deletion). 4 The modal student in these samples comes from the middle of the FCAT targeting range. Students near the bottom of the targeted FCAT range may be less likely to have postsecondary plans and may therefore have little incentive to comply with assignment to take the PERT, whereas students near the top of the range may be more likely to have concordance scores on the SAT or ACT that exempt them from PERT testing. Analyses using Grade 12 PERT values as the running variable use only students targeted for PERT testing who did not score college ready on the Grade 11 PERT, complied with assignment to a CRS course, and retook the PERT in Grade 12 (n = 29,264 for math, n = 10,043 in English under casewise deletion). Figure 1 shows how students progress through the steps of the FCCRI, with sample sizes (without casewise deletion) at each stage.

Figure 1. Number and percentage of students at each stage of the Florida College and Career Readiness Initiative progression. Results are for cohort M1.
For the college course-taking outcomes, we limit the analyses to the subsample of targeted students who are seamless college enrollees, so the results are not diluted by students who do not attend college. This should not bias our estimates because the FCCRI did not have an impact on high school graduation or seamless college enrollment. Supplementary Tables S2 and S3 in the online version of the article provide descriptive statistics of the characteristics of students in the math and English subsamples.
Before-After Regression Sample
Our sample for the regression analysis started with all targeted students from cohorts V2 through M1. We then dropped students who were not part of the treatment or comparison groups. The analytical sample was reduced further because of missing data. Models were estimated separately for students targeted in math and English. We refer to this as the full sample of students, and it is used for the high school diploma and seamless college enrollment outcomes. For other postsecondary outcomes, we restricted the sample to students who seamlessly enrolled in college. The full sample was 147,302 in math and 157,646 in English, and the seamless enrollment subsample was 69,718 in math and 76,772 in English.
To determine whether the full sample was representative of the population from which it was drawn, we compared student- and school-level characteristics based on whether the student attended a school that was in or out of the before-after analysis. First, we compared student-level characteristics of all targeted students statewide to those of students who attended schools included in the before-after regression analysis (Tables A1 and A2). Students attending schools in the before-after regression sample have slightly higher achievement than students in schools outside the analytic sample. There are also differences in demographic characteristics, but most are small in magnitude. Second, we compared the characteristics of schools that switched from low to high compliance to those that did not and found that most of the mean differences were also small in magnitude (Table A4). The largest differences were in school locale, as the in-sample group had more schools located in suburbs and towns and the out-of-sample group had more schools located in cities and rural areas. These differences in urbanicity may also contribute to the small differences in student achievement, as suburban schools (which are overrepresented in our sample) tend to have higher achievement levels (e.g., Lleras, 2008).
We also compared student and school characteristics in the treatment group versus the comparison group (Tables A5 and A6, respectively). Almost all of the baseline student-level characteristics have a standardized mean difference that is less than 0.05. The largest standardized mean difference was −0.088, for FRPL status. Even though differences between the treatment and comparison group are small, we included all baseline student characteristics as control variables in our regression analysis to ensure that we controlled for any small differences between the treatment and comparison groups. Differences in school characteristics are also small in magnitude. The largest difference is in the percentage of FRPL students, as students in the treatment group attend schools with roughly 5% more FRPL students as compared to students in the comparison group.
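As an illustration of the balance check described above, the following minimal sketch computes a standardized mean difference as the difference in group means divided by the pooled standard deviation. The function names and the pooled-SD formula are assumptions made for illustration; the authors' exact calculation (for example, any small-sample or binary-variable adjustment) is not stated in the text.

```python
import math


def standardized_mean_difference(treated, comparison):
    """Difference in group means divided by the pooled standard deviation.

    Assumption: a plain pooled-SD formula, without any small-sample or
    dichotomous-outcome adjustment.
    """
    n_t, n_c = len(treated), len(comparison)
    mean_t = sum(treated) / n_t
    mean_c = sum(comparison) / n_c
    # Sample variances use an (n - 1) denominator.
    var_t = sum((x - mean_t) ** 2 for x in treated) / (n_t - 1)
    var_c = sum((x - mean_c) ** 2 for x in comparison) / (n_c - 1)
    pooled_sd = math.sqrt(
        ((n_t - 1) * var_t + (n_c - 1) * var_c) / (n_t + n_c - 2)
    )
    return (mean_t - mean_c) / pooled_sd


def exceeds_threshold(smd, threshold=0.25):
    """True when the absolute difference is larger than the chosen threshold."""
    return abs(smd) > threshold
```

Under this sketch, a value such as −0.088 would fall inside a 0.05-to-0.25 band: larger than the 0.05 level mentioned above, yet small in absolute terms.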
Method
This study uses two methodologies to examine the FCCRI’s effects on student outcomes. First, we use RD analysis to estimate the FCCRI’s impact on students scoring just above or below thresholds for participation in the FCCRI (Research Question 1). The RD analysis uses data for the first cohort of students under the mandatory FCCRI (M1). Second, we use regression analysis with a before-after design and school-level fixed effects to analyze the impact for students from a broader range of baseline achievement (Research Questions 2 and 3). The before-after analysis explores changes over time using data from M1 as well as two cohorts of students under the voluntary FCCRI (V2 and V3). The first voluntary cohort, V1, is excluded because it was subject to different math requirements for high school graduation than the other cohorts.
Outcome Variables
At the high school level, we examined the probability that students graduated with any type of diploma. At the college level, outcomes included enrollment during the fall semester immediately following the cohort’s on-time graduation (seamless college enrollment) and taking or passing nondevelopmental math or English courses in the 1st year of college. Transition courses may be the most relevant outcome in math because the college readiness cutoff on the PERT is the score at which students are recommended for the transition course rather than developmental courses. The regression analysis also includes a categorical outcome for the level of the first course a student takes in college (no course, lower-level developmental education, upper-level developmental education, transitional [math only], or degree credit). Math is unique in offering a transition course (intermediate algebra) that counts for elective credit but does not count toward math graduation requirements. Table 1 has a list of outcomes and definitions.
RD
RD analysis is used when assignment to a policy treatment is determined by whether a continuously valued variable (the “running variable”) has crossed a predetermined cutoff; Lee and Lemieux (2010) provide a comprehensive review of its features and requirements. RD analysis has a strong causal interpretation, provided the data meet strict validity requirements; Lee and Lemieux also list its applications by education economists up to that point, showing how it has been applied to a variety of contexts, outcomes, and treatments. Here, RD analysis is used to determine the impact of being assigned to take the PERT in Grade 11, to enroll in a CRS course in Grade 12, and to be placed in developmental course work prior to college enrollment (since students scoring college ready on the PERT in Grade 12 were exempt from enrolling in developmental education). State policy did not require students to retake the PERT in Grade 12; three districts (Flagler, Gulf, and Hamilton) used the Grade 12 PERT as a component of CRS course grades, but incentives and/or requirements for retesting were often determined at the school or teacher levels. Students who retook the PERT were slightly more likely to be female (in math only), minority, non-native English speakers, and/or economically disadvantaged, but these differences all had standardized mean differences of 0.11 or less, well below the level of 0.25 at which the What Works Clearinghouse (WWC; 2015) maintains that baseline equivalence is violated.
RD analysis is useful because it isolates the impact of the policy being analyzed without capturing extraneous factors. If assignment to the treatment group is the only thing that changes noticeably at the cutoff for treatment, any difference in student outcomes should be attributable to that treatment. The main drawback of RD is that its results apply only around the cutoff for treatment and are not generalizable to the full sample of students. RD analysis cannot separately analyze individual components of treatment—for example, it cannot differentiate between a true null result and offsetting positive and negative component effects. One case in which this latter result might occur is if the benefits from a well-designed CRS course are canceled out by discouragement from being labeled “not college ready.” Dougherty (2015) finds in an RD design that African American students may be particularly susceptible to discouragement effects and/or stereotype threat from labels; future work will explore heterogeneous effects of the FCCRI among student subgroups.
Table 2 shows how FCAT performance levels are used to assign students to college readiness testing (in the subject they were targeted in) and how the PERT is used in Grade 11 to assign students to CRS courses and in Grade 12 (and onward) to assign students to developmental education course work in college. Although the FCAT and PERT assessments group students into broad categories, students also receive scaled scores (between 100 and 500 on the FCAT and between 50 and 150 on the PERT) that function as nearly continuous measures of student achievement. Using these scores, it is possible to compare students on either side of a cutoff who have extremely similar profiles and differ primarily in their assignment to treatment.
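The comparison of students on either side of a cutoff can be sketched in a few lines. The example below is a simplified illustration, not the authors' estimation code: it fits a separate least-squares line on each side of the cutoff using only scores within a hand-chosen bandwidth (a uniform kernel) and reports the gap between the two fitted values at the cutoff. The function names, the `treated_below` flag, and the convention that a score exactly at the cutoff counts as "above" are all assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations on each side")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct scores on each side")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def sharp_rd_estimate(scores, outcomes, cutoff, bandwidth, treated_below=True):
    """Local linear sharp RD estimate with a uniform kernel.

    Scores are centered at the cutoff, so each fitted intercept is the
    predicted outcome at the cutoff itself. Returns treated minus untreated.
    """
    below_x, below_y, above_x, above_y = [], [], [], []
    for score, outcome in zip(scores, outcomes):
        centered = score - cutoff
        if abs(centered) > bandwidth:
            continue  # outside the estimation window
        if centered < 0:
            below_x.append(centered)
            below_y.append(outcome)
        else:
            # Assumption: a score exactly at the cutoff is on the "above" side.
            above_x.append(centered)
            above_y.append(outcome)
    below_at_cutoff, _ = fit_line(below_x, below_y)
    above_at_cutoff, _ = fit_line(above_x, above_y)
    if treated_below:
        return below_at_cutoff - above_at_cutoff
    return above_at_cutoff - below_at_cutoff
```

Setting `treated_below=True` matches cutoffs where lower scorers are assigned to treatment (the upper FCAT threshold and the PERT threshold), and `treated_below=False` matches the lower FCAT threshold, where students just above the cutoff are targeted.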
Table 2. Assignment to PERT, CRS Courses, and Remedial Course Work
Note. FCAT = Florida Comprehensive Assessment Test; PERT = Postsecondary Education Readiness Test; CRS = college readiness and success.
As a score of 300 or higher on each section of the FCAT is required to graduate from high school, the FCAT is the higher-stakes exam of the two; students have an incentive to perform well on the PERT only if they plan to attend a postsecondary institution or if they care strongly about the courses they take in high school. However, as the samples of students near FCAT graduation requirements and near PERT college readiness benchmarks are likely to be quite different, different students might view either exam as higher stakes in their particular case.
Our estimates use a sharp RD design, which is modeled in a local linear framework as
where Yi is an outcome of interest for individual i,
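For orientation, a generic textbook form of a local linear sharp RD specification is sketched below. It is consistent with the description in this section but is not necessarily the authors' exact model. Apart from $Y_i$, every symbol is an assumption introduced here: $S_i$ is the running variable (an FCAT or PERT scaled score), $c$ is the cutoff, $D_i$ indicates that student $i$ falls on the treated side of the cutoff, and $h$ is the bandwidth.

```latex
\[
Y_i = \alpha + \tau D_i + \beta\,(S_i - c) + \gamma\,D_i\,(S_i - c) + \varepsilon_i,
\qquad |S_i - c| \le h
\]
```

In this form, $\tau$ is the intent-to-treat effect at the cutoff, and the interaction term lets the slope of the outcome with respect to the score differ on the two sides.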
The Grade 12 PERT is not technically part of the FCCRI, as students on both sides of the college readiness cutoff have been targeted for college readiness testing and taken a CRS course; however, including this assessment in the analysis has two advantages. First, it allows us to contrast the effects of targeting students for college readiness testing and course work against the effects of a known policy with a predictable effect on student outcomes. Second, it allows us to assess the difference between college readiness and college success—to see how many of the students whom CRS courses prepare for college (in the sense of scoring college ready) are capable of passing college-level course work. It does not, however, provide any information about the impact of the FCCRI, as students in both the treatment and control groups participated in the intervention.
Before-After Regression Analysis
To supplement the RD analysis, we used a before-after regression analysis with school-level fixed effects to assess the FCCRI’s effects for students from a broader range of achievement levels than the RD design. This is important because the likelihood that students will be successful varies by students’ academic achievement, and the FCCRI’s ability to help students achieve college readiness likely varies with students’ pretreatment achievement levels. For this analysis, variation over time in school-level FCCRI compliance rates was used as an exogenous predictor of treatment take-up. This yielded an analytical sample where treatment take-up was conditionally independent from the outcomes of interest.
We calculated the school-level compliance rate as the proportion of Grade 12 students who enrolled in a CRS course after being targeted by the FCAT and not scoring college ready on the PERT or CPT. We defined the treatment group as students who were targeted by the FCAT and attended a high school with at least a 50% compliance rate (a high-compliance school). The comparison group consisted of students who were targeted by the FCAT and attended a school with less than a 5% compliance rate (a low-compliance school). We considered high-compliance schools to have implemented the FCCRI and low-compliance schools not to have implemented it; from this, we obtained a treated group and a comparison group. We limited the analytical sample to students in schools that were categorized as both low and high compliance at some point between V2 and M1 (these low- to high-compliance schools form the in-sample group) because we were concerned that schools that were always low compliance or always high compliance might differ in ways related to the outcomes. For almost all of the schools that switch from low to high compliance, the switch to being in the treatment group reflects an exogenous change (the FCCRI becoming mandatory) as opposed to some other change in the school. Additionally, the inclusion of school-level dummy variables addresses any time-invariant, unobservable differences in schools. Essentially, this is a before-after analysis, in which we compared student outcomes before and after schools implemented the FCCRI.
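The sample construction described above can be sketched as follows. The record fields (`targeted`, `college_ready`, `took_crs`), the function names, and the "middle" label for schools between the two thresholds are hypothetical and introduced only for illustration; the 50% and 5% thresholds come from the text. The sketch checks only that a school was classified as both low and high compliance across cohorts, without enforcing the order of the switch.

```python
from collections import defaultdict

HIGH_COMPLIANCE = 0.50  # at least 50% of eligible students took a CRS course
LOW_COMPLIANCE = 0.05   # fewer than 5% of eligible students took a CRS course


def compliance_rate(students):
    """Share of eligible Grade 12 students who enrolled in a CRS course.

    Eligible means targeted by the FCAT and not college ready on the
    PERT or CPT. Returns None when a school has no eligible students.
    """
    eligible = [s for s in students if s["targeted"] and not s["college_ready"]]
    if not eligible:
        return None
    return sum(1 for s in eligible if s["took_crs"]) / len(eligible)


def classify(rate):
    """Label a school-by-cohort compliance rate."""
    if rate is None:
        return "unclassified"
    if rate >= HIGH_COMPLIANCE:
        return "high"
    if rate < LOW_COMPLIANCE:
        return "low"
    return "middle"  # neither treatment nor comparison


def switcher_schools(rates):
    """Schools classified as both low and high compliance across cohorts.

    `rates` maps (school, cohort) pairs to compliance rates.
    """
    seen = defaultdict(set)
    for (school, _cohort), rate in rates.items():
        seen[school].add(classify(rate))
    return {school for school, labels in seen.items() if {"low", "high"} <= labels}
```

Students in a switcher school would then belong to the comparison group in the cohorts where the school is labeled "low" and to the treatment group in the cohorts where it is labeled "high".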
One limitation of this approach is that the results are not generalizable to schools that do not switch from low to high compliance; however, a large proportion of schools is included in the sample (46% in math and 68% in reading). Another limitation is that during the 1st year of college for cohort M1, Senate Bill 1720 (S.B. 1908, Fla. Stat. § 1008.30; 2008) changed the laws on developmental course taking. Under this bill, beginning in the spring semester of 2013 (the second semester for cohort M1’s seamless enrollees), developmental courses were no longer required for recent high school graduates. This change could induce students in cohort M1 to hold off on required developmental courses in the fall because these courses would not be required in the spring. This could incorrectly make the FCCRI appear to reduce developmental course taking. However, instances of students deferring developmental education in this way are likely uncommon, as the new policy was not well known prior to its implementation. Additionally, other researchers found little change since fall 2011 in developmental course-taking rates in Florida until the fall semester of 2014 (Hu et al., 2016); this semester is not included in our evaluation. Thus, this policy change is not a substantial concern for our analysis but might become an issue for later cohorts or for cohort M1’s long-term outcomes. We are not aware of any other changes over time that would threaten the validity of the results for the cohorts examined in our analyses.
We used multiple regression analysis to estimate the ITT effect of the FCCRI for all targeted students and for the subsample of students who seamlessly enrolled in college. For binary variables, we used a logit model with maximum likelihood estimation (MLE). For the categorical outcome, we used a multinomial logit model with MLE. The treatment effect was obtained by estimating
The outcome (Yij) is a function of the treatment status Ti, a vector of control variables shown in Table 1 (Xi), a school-level fixed effect
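A binary-outcome specification consistent with the description above can be written as follows. The notation beyond $Y_{ij}$, $T_i$, and $X_i$ is assumed here for illustration: $\Lambda$ denotes the logistic function, $\gamma_j$ the fixed effect for school $j$, and $\beta_0$, $\beta_1$, and $\boldsymbol{\beta}_2$ the coefficients to be estimated.

```latex
\[
\Pr(Y_{ij} = 1) = \Lambda\!\left(\beta_0 + \beta_1 T_i + X_i \boldsymbol{\beta}_2 + \gamma_j\right),
\qquad
\Lambda(z) = \frac{1}{1 + e^{-z}}
\]
```

Under this form, the reported ITT effect corresponds to the difference in predicted probabilities between $T_i = 1$ and $T_i = 0$, holding the other terms fixed.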
RD Validity
The WWC (2015) has three criteria necessary for a valid RD design; sharp RD studies must meet four further sets of criteria to meet WWC evidence standards. To qualify as an RD study, policy treatment must be based on a running variable; this accurately describes the FCCRI. The running variable must be ordinal, with a sufficient number of unique values; the FCAT and PERT are ordinally scored, with a large range of possible values above and below each cutoff point. Finally, no other policies may be implemented at the same cutoff value. This is certainly true on a statewide level for the FCAT, as there are no other statewide policies uniformly affecting students at either cutoff. It may not be true at more finely grained levels, as some schools might institute interventions for students in FCAT Level 1 (deemed at risk for dropping out of high school); however, there is no clear indication of this in our estimates. 6 The high school graduation cutoff is located sufficiently far from proficiency-level cutoffs that bandwidth selectors should avoid any confounding effects. The Grade 11 PERT is a more challenging case, as the college readiness cutoff is used to simultaneously inform students whether they are required to take CRS course work and whether they will be exempt from developmental course work in college (if they attend). When taken in Grade 12, the PERT affects placement into college developmental course work but should not affect high school course selection.
The first validity requirement for RD is that the running variable must be immune to manipulation. Manipulation requires that students know the cutoffs for treatment, have incentives to change their running variable values, and have the means to do so. As the FCAT and PERT are administered and scored by independent contractors with no known incentive to modify scores or treatment statuses, manipulation within a single exam sitting is unlikely.
Nonrandom retesting may be a greater threat to validity, however. Students may retake both the FCAT (if they score below a scaled score of 300, which is required for graduation) and the PERT; this could lead to selective retesting on the two assessments to avoid FCCRI requirements. If so, our estimates might capture the effects of student motivation as much as of the FCCRI. Furthermore, some students may benefit more than others from retesting due to additional home or school supports (see, e.g., Papay, Murnane, & Willett, 2010). To avoid this, we used students’ initial Grade 11 PERT scores on each section; whereas students’ highest scores are more likely to affect their placement into or out of CRS courses, initial scores cannot be manipulated through selective retesting. Because our data do not include FCAT dates, we instead use students’ lowest FCAT scores as proxies for their initial scores.
Further evidence on manipulation via retesting is presented in McCrary density tests in Figure 2, which show the number of students at each FCAT or PERT score (McCrary, 2008). Bunching just above or below any of the policy cutoffs would indicate that students are systematically working to avoid a particular policy outcome. 7 The clearest signs of bunching are at FCAT scores of 300, which represent the cutoff for high school graduation. However, this cutoff is not associated with targeting for the FCCRI and is sufficiently far from the FCAT Level 2 cutoff that students’ efforts to graduate should not be mistaken for attempts to clear the FCAT Level 2 threshold.

Figure 2. McCrary density tests. FCAT = Florida Comprehensive Assessment Test; PERT = Postsecondary Education Readiness Test. Results are for cohort M1. Circles on the graphs show the number of test takers at each given score; each circle represents one possible score. The reading FCAT has a large number of scores obtained by 10 or fewer students, very likely representing scaled scores that raw scores do not easily map to. The fitted lines represent best-fit quartic polynomials, with bandwidths selected to maximize R². Solid vertical lines on FCAT graphs represent cutoffs for proficiency levels, and dashed vertical lines represent the cutoff for high school graduation. Solid vertical lines on PERT graphs represent the college readiness cutoff.
A more troubling case of bunching is at the cutoff for college readiness in the Grade 11 PERT, where there appears to be substantial bunching in the number of students who score college ready in reading. Although some of this discontinuity may be due to wide variations in pass rates just to the left of the college readiness cutoff, estimates using the Grade 11 reading PERT should be taken with caution.
The second validity requirement is that there cannot be excessive attrition overall or by treatment status. Within a narrow bandwidth of all cutoffs, both average and differential attrition are sufficiently low to avoid introducing substantial bias. Overall attrition in both subjects is less than 10% at all cutoffs (under 5% when the low FCAT cutoff is omitted), and the difference between treatment and control groups is less than 5 percentage points at all cutoffs (4% for the low FCAT margin in reading and 1% or less for all other cutoffs). Given an overall attrition rate of 10%, WWC standards permit up to 6.3 percentage points of differential attrition under the most stringent set of criteria. As a result, our data are well within acceptable boundaries for attrition.
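The two attrition quantities discussed above can be computed as in the following sketch. The function name and the counts in the example are hypothetical and are not taken from the study's data.

```python
def attrition_rates(n_treat_start, n_treat_end, n_ctrl_start, n_ctrl_end):
    """Return (overall, differential) attrition as proportions.

    Overall attrition pools both groups; differential attrition is the
    absolute gap between the two groups' own attrition rates.
    """
    lost_treat = n_treat_start - n_treat_end
    lost_ctrl = n_ctrl_start - n_ctrl_end
    overall = (lost_treat + lost_ctrl) / (n_treat_start + n_ctrl_start)
    differential = abs(lost_treat / n_treat_start - lost_ctrl / n_ctrl_start)
    return overall, differential


# Hypothetical counts: 1,000 students per side, 60 and 40 lost to follow-up.
overall, differential = attrition_rates(1000, 940, 1000, 960)
print(f"overall = {overall:.3f}, differential = {differential:.3f}")
```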
The third RD validity requirement is that outcome variables be continuous everywhere but at the policy cutoff and that the outcome variable does not have any unexplainable discontinuities. The first approach for examining this is to check for baseline equivalence. We check FCAT scores, cumulative high school GPA, and FRPL status and find effect sizes ranging from 0.000 to 0.094 standard deviations, with the majority less than 0.05 (Table A7). Because no effect sizes were greater than 0.25, these tests do not invalidate our estimates. However, because some effect sizes were greater than 0.05, our estimates control for a full set of covariates in Table 1. Removing covariates does not significantly change our estimates; although some magnitudes change, only one specification has a different significance level when covariates are removed (results without covariates are available upon request).
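The baseline equivalence check can be illustrated with a standardized mean difference and the two thresholds named above (0.05 and 0.25). This is a simplified sketch: the pooled-standard-deviation formula and the function names are assumptions, and a binary covariate such as FRPL status may call for a different effect size measure than the one shown here.

```python
from math import sqrt
from statistics import mean, variance


def standardized_difference(treated, comparison):
    """Difference in means divided by the pooled standard deviation."""
    n_t, n_c = len(treated), len(comparison)
    pooled_var = (
        (n_t - 1) * variance(treated) + (n_c - 1) * variance(comparison)
    ) / (n_t + n_c - 2)
    return (mean(treated) - mean(comparison)) / sqrt(pooled_var)


def equivalence_category(effect_size):
    """Apply the 0.05 and 0.25 thresholds to an effect size."""
    size = abs(effect_size)
    if size > 0.25:
        return "not equivalent"
    if size > 0.05:
        return "equivalent with covariate adjustment"
    return "equivalent"


# The largest baseline difference reported in the text is 0.094 SD.
print(equivalence_category(0.094))
```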
The second approach for examining the continuity of the outcome–running variable relationship is to demonstrate that outcome variables are continuous or explainably discontinuous away from any cutoffs. Explainable discontinuities arise at the FCAT cutoff for high school graduation and the PERT cutoff for degree credit course work in math. Discontinuities caused by either of these cases are well explained and sufficiently far from relevant cutoffs that bandwidth selectors will avoid them if needed.
The final validity requirement is that RD estimation must use an appropriate functional form and/or bandwidth. We use local linear estimation, with bandwidths selected using a cross-validation method initially presented by Imbens and Lemieux (2008) and further explicated by Lee and Lemieux (2010). Bandwidth selection uses a similar principle as the RD analysis itself but applied to a broader range of values. RD analysis, at its crux, examines the difference between the predicted value of an outcome variable at the cutoff value and the actual (average) value at that cutoff. Cross-validation applies this logic to each point within a particular range of the cutoff in order to determine the bandwidth that contains the least amount of statistical noise.
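The logic of local linear estimation and cross-validated bandwidth choice can be sketched as below. This is a simplified illustration rather than the estimator used in the study: it uses equal weights within the bandwidth, omits covariates, applies cross-validation to every score rather than only to scores near the cutoff, and all function names and the noiseless example data are hypothetical.

```python
def fit_line(points):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:  # all x equal: fall back to the mean outcome
        return mean_y, 0.0
    slope = sum((x - mean_x) * (y - mean_y) for x, y in points) / sxx
    return mean_y - slope * mean_x, slope


def rd_estimate(data, cutoff, bandwidth):
    """Gap at the cutoff between lines fitted separately on each side.

    Returns right-side minus left-side; the sign convention depends on
    which side of the cutoff is treated. Assumes both sides have data.
    """
    left = [(x - cutoff, y) for x, y in data if cutoff - bandwidth <= x < cutoff]
    right = [(x - cutoff, y) for x, y in data if cutoff <= x <= cutoff + bandwidth]
    intercept_left, _ = fit_line(left)
    intercept_right, _ = fit_line(right)
    return intercept_right - intercept_left


def cv_error(data, cutoff, bandwidth):
    """Mean squared error from predicting each point as if it were a boundary.

    Points left of the cutoff are predicted from data further left; points
    at or right of the cutoff are predicted from data further right.
    """
    errors = []
    for x0, y0 in data:
        if x0 < cutoff:
            window = [(x - x0, y) for x, y in data if x0 - bandwidth <= x < x0]
        else:
            window = [(x - x0, y) for x, y in data if x0 < x <= x0 + bandwidth]
        if len(window) < 2:
            continue  # too little data to fit a line
        predicted, _ = fit_line(window)  # intercept is the prediction at x0
        errors.append((y0 - predicted) ** 2)
    return sum(errors) / len(errors) if errors else float("inf")


def choose_bandwidth(data, cutoff, candidates):
    """Pick the candidate bandwidth with the smallest cross-validation error."""
    return min(candidates, key=lambda h: cv_error(data, cutoff, h))


# Hypothetical noiseless data with a jump of 0.3 at a cutoff of 100.
example = [
    (x, 0.2 + 0.01 * (x - 100) + (0.3 if x >= 100 else 0.0))
    for x in range(80, 121)
]
candidates = list(range(5, 21))  # the text considers 5 to 20 points
best = choose_bandwidth(example, 100, candidates)
print(best, rd_estimate(example, 100, best))
```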
To ensure that our bandwidth selector is not confounded by the high school graduation cutoff (near the low FCAT margin) or the cutoff for degree credit eligibility (near the PERT college readiness cutoff), we use two strategies as robustness checks. The first strategy is to cap bandwidth selectors so that bandwidths cannot contain the second cutoff value. The second strategy is to use an estimator designed by Calonico, Cattaneo, and Titiunik (2014) without providing any boundaries to the bandwidth (our main strategy considers bandwidths between 5 and 20 points on the assessment in question). This estimator converged only for the FCAT analyses and only without covariates but selected quite large bandwidths, going far beyond the other cutoff values. Neither set of estimators produced large, statistically significant results (available upon request). 8 Our bandwidth selector therefore does not appear to be influenced by known policy cutoffs near those that we are analyzing. Furthermore, to be certain that variation in optimally selected bandwidths does not impact our results, we run all specifications using fixed bandwidths of 5, 10, and 20 points, as well as using cross-validated bandwidths under a local quadratic regression, and include the results in the online version of the article (Supplementary Tables S8–S11); these results are broadly similar to those presented in the Results section.
Results
RD Results
We present estimates for the RD analysis that cover the following sets of effects:
The ITT with PERT testing, based on FCAT score (using both upper- and lower-level cutoffs).
The ITT with CRS courses, based on Grade 11 PERT score, conditional on being targeted based on FCAT scores.
The ITT with placement into college developmental course work, based on Grade 12 PERT score, conditional on being targeted based on FCAT scores, scoring below college ready on the Grade 11 PERT, and enrolling in a CRS course.
Conditional statements for both the Grade 11 and Grade 12 PERT apply to all results, which are shown in Table 3. Each row in the table shows an assessment (and cutoff, when the assessment in question is the FCAT), and each column refers to a different outcome. Readers should recall that each cutoff represents a distinct quasi-experiment and that differences in samples across experiments may help in interpreting some results.
Table 3. Regression Discontinuity Estimated Intent-to-Treat Effect, Local Linear Regression With Optimal Bandwidths
Note. FCAT = Florida Comprehensive Assessment Test; bw = bandwidth; PERT = Postsecondary Education Readiness Test. Results are for cohort M1. Results shown are for local linear regression with bandwidths optimally determined on a regression-by-regression basis via cross-validation. Numbers reported are the difference in predicted probabilities across treatment status. Standard errors are shown in parentheses, and bandwidths are listed in brackets. Regressions control for gender, free or reduced-price lunch status, race-ethnicity, English language learner status, native-English speaker status, disability status, gifted/talented status, cumulative grade point average as of Grade 10, and district indicators. N varies by specification; for FCAT, N ranges from 4,607 to 32,810; for Grade 11 PERT, N ranges from 22,835 to 67,630; for Grade 12 PERT, N ranges from 2,246 to 25,552.
*p < .10. **p < .05. ***p < .01.
Columns 1 and 2 show that no stage of the FCCRI had a statistically significant effect on high school graduation or on seamless college enrollment. Since the FCCRI did not impact either factor, we focus the remainder of both the RD and before-after analysis on the subset of students who seamlessly enrolled in college. Graphical analysis of the remaining outcomes is shown in Figure 3. On each graph that makes up Figure 3, the x-axis represents the range of scores on the exam in question (with solid lines representing the cutoffs between FCAT proficiency levels and the PERT college readiness level and dashed lines representing the FCAT high school graduation requirement), and the y-axis represents the percentage of students meeting the outcome in question. Markers on each graph represent the average outcome value at each FCAT or PERT score, and the lines on either side of each cutoff being evaluated are local linearizations based on average outcome values. The difference in local linearizations at the cutoff being evaluated is equivalent to the point estimate of a local linear RD regression with no covariates.

Figure 3. Regression discontinuity estimated intent-to-treat effect of treatment for enrolling in and passing nondevelopmental courses. FCAT = Florida Comprehensive Assessment Test; PERT = Postsecondary Education Readiness Test. Markers on the graphs show the average outcome level at each score; each marker represents one possible score. Bolded best-fit lines represent local linear approximations, with bandwidths selected to maximize R². Solid vertical lines on FCAT graphs represent cutoffs for proficiency levels, while dashed vertical lines represent the cutoff for high school graduation. Solid vertical lines on PERT graphs represent the college readiness cutoff. Results are for cohort M1.
In math, FCCRI targeting had no impact on whether students enrolled in (column 3) or passed (column 4) nondevelopmental courses at either FCAT cutoff. The bottom half of Table 3, on the other hand, shows a coefficient of 0.0425 for students targeted at the low FCAT cutoff in English, meaning that these students were 4.3 percentage points (from a baseline of 59.2%) more likely to enroll in a nondevelopmental English course but no more likely to pass. This suggests that those students who were pushed to take nondevelopmental English may not have been prepared to do so. At the high FCAT cutoff, students who were targeted for the FCCRI were no more likely to enroll in or pass a nondevelopmental English course.
Students targeted for CRS courses in math at the college readiness cutoff on the Grade 11 PERT were no more likely than nontargeted students to enroll in or pass a nondevelopmental math course. Students at the college readiness cutoff in English were 2.3 percentage points (from a baseline of 79.1%) less likely to enroll in a nondevelopmental English course but no less likely to pass.
For comparison, college readiness on the Grade 12 PERT had a large impact on student course taking. Students at the college readiness cutoff in math who were exempted from developmental course work were 30.6 percentage points (from a baseline of 45.9%) more likely to take a nondevelopmental course than students who were not exempted. However, these students were only 18.1 percentage points (from a baseline of 27.9%) more likely to pass, meaning that over 40% of students induced to take nondevelopmental course work were unable to pass. In English, students at the college readiness cutoff were 16.8 percentage points (from a baseline of 54.4%) more likely to enroll in nondevelopmental course work and 13.5 percentage points (from a baseline of 40.6%) more likely to pass; nearly 20% of students induced to enroll in nondevelopmental course work were therefore unable to pass.
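The "over 40%" and "nearly 20%" figures follow from the enrollment and passing effects reported above. The short sketch below reproduces the arithmetic; the function name is hypothetical, and the calculation assumes that every student induced to pass was also induced to enroll.

```python
def share_not_passing(enroll_effect_pp, pass_effect_pp):
    """Share of students induced to enroll who did not pass the course.

    Both arguments are RD effects in percentage points.
    """
    return (enroll_effect_pp - pass_effect_pp) / enroll_effect_pp


math_share = share_not_passing(30.6, 18.1)     # (30.6 - 18.1) / 30.6
english_share = share_not_passing(16.8, 13.5)  # (16.8 - 13.5) / 16.8
print(f"math: {math_share:.1%}, English: {english_share:.1%}")
```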
One important finding is that students targeted for CRS courses performed comparably to students who were already college ready in Grade 11. Based on performance in Grade 11, all students who were below college ready had scores corresponding to developmental courses, whereas all students who were above college ready were exempt from developmental education. Thus, if the students’ performance remained the same between Grade 11 and college enrollment, we would expect those scoring just below college ready to be much more likely to be placed into developmental education courses. Yet in both subjects, there was little to no difference in the likelihood of passing nondevelopmental courses. This stands in contrast to the results for the Grade 12 PERT, where students just below college ready are significantly less likely to enroll in and pass nondevelopmental courses than students just above college ready in Grade 12. However, it is not possible to know how many students in CRS courses would have enrolled in and passed nondevelopmental courses if they had taken another course in Grade 12 instead.
Results from the Grade 12 PERT help put the null results elsewhere into context. Seamless college enrollees who were targeted for college readiness testing, did not score college ready, took a college readiness course, and then passed the PERT in Grade 12 went on to enroll in for-credit courses at a rate very similar to that of seamless college enrollees who passed the PERT on their first attempt. Although we cannot separately identify the effects of discouragement from being labeled “not college ready,” any psychological effects from taking a course that advertises college readiness, and any human capital gained from the course, it is nonetheless clear that the Grade 12 PERT is the main factor affecting many students’ decisions to enroll in developmental education or in a for-credit course. Because the RD construction implies that students narrowly on either side of the college readiness cutoff in Grade 12 are of nearly identical ability, we also conclude that the Grade 12 PERT is likely underplacing approximately 18% of students near the math cutoff and 14% near the reading cutoff whom we might expect to pass. (Due to the RD construction, we cannot say whether this rate would apply to students far from the cutoff. Future work will investigate this topic more thoroughly.) However, students who score college ready in Grade 12 pass for-credit courses in both subjects at slightly lower rates than those who score college ready in Grade 11. This is to be expected, as students in our Grade 12 sample are likely to be somewhat less proficient than those in our Grade 11 sample.
These results for students assigned to CRS courses are particularly striking when considering the counterfactual conditions. On the Grade 11 PERT, students who barely scored college ready were much more likely than those who did not to enroll in advanced course work, such as honors and college credit–level courses during Grade 12, whereas those who scored just below this cutoff were much more likely to enroll in standard-level course work, including CRS courses. This suggests that students assigned to CRS courses were being compared with a very high standard and actually fared quite well by demonstrating similar performance in college.
Before-After Regression Results
We were concerned the RD analysis might not pick up on the full treatment effect because of its limited analytical sample. Thus, we supplemented the RD results with the before-after analysis in three ways. First, the before-after analysis is capable of including students in the middle of the targeted range of students, as opposed to including only students near the cutoffs. Thus, we estimated the same set of outcomes as the RD analysis to see how the results change when we include a wider range of students (Table 4). As a sensitivity analysis, we also restricted the treatment group to include students from M1 only because the RD analysis is limited to M1. The estimates from this sensitivity analysis were all within 0.001 of the main estimates, so the differences between the RD and before-after estimates are likely from differences in the range of baseline achievement included in the sample as opposed to differences in which cohorts are included in the treatment group.
Table 4. Before-After Regression Estimated Marginal Effect of Treatment
Note. Numbers reported are the difference in predicted probabilities across intent-to-treat status. Standard errors are shown in parentheses. Results are for cohorts V2, V3, and M1. Models followed a logit specification and included student background characteristics, pretreatment achievement, and a school-level fixed effect as regressors.
All students in the sample must meet the definition of either the treatment or comparison group as discussed in the Method section. Targeting is determined by the Grade 10 Florida Comprehensive Assessment Test. The sample size for all targeted students is 147,302 in math and 157,646 in English, and for the subsample of students who seamlessly enroll in college, the sample size is 69,718 in math and 76,772 in English.
*p < .10. **p < .05. ***p < .01.
Other than the RD results that relied on the Grade 12 PERT cutoff, the results from the before-after analysis point toward a larger treatment effect for the outcomes of enrollment in and passing nondevelopmental courses. All results for enrollment in and passing nondevelopmental courses are positive and significant in both math and English, whereas most of these estimates were insignificant in the RD analysis.
Second, because the before-after analysis includes a range of students in the treatment group, we were able to include a breakdown of the results by baseline achievement, as measured by the continuous variable for Grade 10 FCAT score, to determine if the average marginal effect is hiding variation in the treatment effect across baseline achievement. The results are presented in Figure 4, which contains predicted probabilities of enrolling in each course level by treatment status and baseline achievement. The difference between the two lines is the marginal effect of the ITT. There does appear to be variation in the impact across baseline achievement. Figure 4 shows that the FCCRI had a larger impact for students in the lower and middle range of baseline achievement. The largest difference across treatment status in math was for students near the FCAT Level 3 cutoff. In English, the largest difference was for students at the bottom of the targeted range.
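The way a marginal effect is read off two predicted-probability lines can be sketched as follows. The coefficients are hypothetical values chosen only to illustrate the calculation; they are not estimates from this study, and the model omits the covariates and school fixed effects used in the analysis.

```python
from math import exp

# Hypothetical logit coefficients, for illustration only.
COEF = {"intercept": -15.0, "treatment": 0.4, "fcat": 0.05}


def logistic(z):
    """Map a linear predictor to a probability."""
    return 1.0 / (1.0 + exp(-z))


def predicted_probability(treated, fcat_score, coef=COEF):
    """Predicted probability of the outcome for one treatment status and score."""
    z = coef["intercept"] + coef["treatment"] * treated + coef["fcat"] * fcat_score
    return logistic(z)


def marginal_effect(fcat_score, coef=COEF):
    """Vertical gap between the treated and comparison lines at one score."""
    return predicted_probability(1, fcat_score, coef) - predicted_probability(
        0, fcat_score, coef
    )


for score in (300, 330, 360):
    print(score, round(marginal_effect(score), 3))
```

Because the logistic function is nonlinear, the gap between the two lines changes with the baseline score even when the treatment coefficient is constant.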

Figure 4. Predicted probability of enrolling in and passing nondevelopmental courses by baseline achievement. FCAT = Florida Comprehensive Assessment Test. Dashed lines are drawn between FCAT levels. Results are for the seamless college enrollee subsample in cohorts V2, V3, and M1, with N = 69,718 in math and N = 76,772 in English. Models followed a logit specification and included student background characteristics, pretreatment achievement, and a school-level fixed effect as regressors.
Third, the before-after analysis allowed us to consider how enrollment in nondevelopmental courses changed across the full spectrum of course levels through the use of a multinomial outcome, as opposed to simply comparing nondevelopmental and developmental course taking. We present results in Figures 5 and 6 by plotting predicted probabilities of enrolling in each course level by treatment status and baseline achievement, where the difference between the two lines is the marginal effect of the ITT. The multinomial results indicate that the FCCRI reduced enrollment in both lower- and upper-level developmental education in math. The impact was especially large for students at the lower end of the targeted range (FCAT Level 2), where the treatment group was up to 12.6 percentage points less likely to enroll in lower-level developmental education courses. Among treated students in the middle of the targeted range (FCAT Level 3), there was a 10.7-percentage-point increase in the likelihood of enrolling in a nondevelopmental course, with most of the increase occurring in the transition course. These students moved away from both lower- and upper-level developmental education courses at similar rates. Developmental course enrollment was also lower for the treatment group in English, although differences were smaller than in math. The main change seems to be in moving lower-performing students from upper-level developmental education to degree credit courses, as treated students had up to a 5.8-percentage-point decrease in the likelihood of enrolling in upper-level developmental courses. There was little to no difference for higher-performing targeted students.
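Predicted probabilities under a multinomial logit can be sketched as below. The course-level labels and linear predictor values are hypothetical and serve only to show the calculation; they are not estimates from this study.

```python
from math import exp

# Hypothetical set of course levels for illustration.
LEVELS = ["lower-level DE", "upper-level DE", "nondevelopmental"]


def course_level_probabilities(linear_predictors):
    """Multinomial logit probabilities from one linear predictor per level."""
    top = max(linear_predictors)  # subtract the maximum for numerical stability
    weights = [exp(value - top) for value in linear_predictors]
    total = sum(weights)
    return [weight / total for weight in weights]


# Hypothetical linear predictors for a comparison and a treated student.
comparison = course_level_probabilities([0.0, 0.3, 0.5])
treated = course_level_probabilities([0.0, 0.2, 0.9])

# The marginal effect for each level is the gap between the two probabilities.
effects = [t - c for t, c in zip(treated, comparison)]
for level, effect in zip(LEVELS, effects):
    print(f"{level}: {effect:+.3f}")
```

Because the probabilities for each student sum to one, the marginal effects across course levels sum to zero: any increase in nondevelopmental enrollment is matched by decreases at other levels.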

Figure 5. Predicted probability of enrolling in each course level, math. FCAT = Florida Comprehensive Assessment Test; DE = developmental education. Dashed lines are drawn between FCAT levels. Results are for the seamless college enrollee subsample in cohorts V2, V3, and M1, with N = 69,718 in math and N = 76,772 in English. Models followed a multinomial logit specification and included student background characteristics, pretreatment achievement, and a school-level fixed effect as regressors.

Figure 6. Predicted probability of enrolling in each course level, English. FCAT = Florida Comprehensive Assessment Test; DE = developmental education. Dashed lines are drawn between FCAT levels. Results are for the seamless college enrollee subsample in cohorts V2, V3, and M1, with N = 69,718 in math and N = 76,772 in English. Models followed a multinomial logit specification and included student background characteristics, pretreatment achievement, and a school-level fixed effect as regressors.
Overall, the results from the before-after analysis provided stronger evidence of a positive impact of the FCCRI than the RD analysis. The results of the two lines of analysis likely differ because (a) the RD analysis reduced the analytical sample to students near cutoff points and (b) the treatment rates differ across the RD and before-after samples, with the before-after sample having larger differences in treatment rates between the treatment and comparison groups (refer to Supplementary Table S12 in the online version of the journal for a summary of the treatment rates by sample).
Discussion
This study sought to examine the impact of the FCCRI on student outcomes by the 1st year of college and found mixed results. The FCCRI increased the likelihood of enrolling in nondevelopmental courses for some targeted students, although results differed by academic performance. However, smaller differences in the likelihood of passing nondevelopmental courses suggest that some students were not prepared for these courses. Additionally, impacts were estimated after only the 1st year of implementation of the mandatory FCCRI, and program impacts may have continued to grow as schools learned how to better deliver CRS courses over time.
In 2015, the legislature ended the requirement that high schools offer college readiness testing and CRS courses. There is no simple answer as to whether this was a good decision. Some higher-performing students may be better off without the requirement to take CRS courses if they would otherwise be taking more rigorous courses. Yet other mid-performing students may be harmed by the lack of access to college readiness testing and CRS courses. Separate legislation in 2014 exempts recent high school graduates from taking the PERT and enrolling in developmental education courses at state colleges. This means students would have no test scores to indicate their level of college readiness in high school or college, and many students might once again enroll in nondevelopmental courses without knowing they are not college ready.
Implications
This evaluation has important implications for similar programs in other states, as initiatives like the FCCRI have gained popularity over time. A national scan conducted in 2012 found that 29 states offer transitional math and English courses during the senior year of high school for students who have not previously met college readiness benchmarks, although some of these are local policies rather than statewide policies like the FCCRI (Barnett, Fay, Bork, & Weiss, 2013).
Our findings suggest that initiatives like the FCCRI may not improve high school graduation rates or college enrollment. The FCCRI had no effect on these outcomes at any level of performance on the FCAT or PERT, rejecting our hypothesis that the FCCRI might encourage students to complete high school and continue to postsecondary education by showing them that they can obtain the skills needed for college. These findings corroborate feedback from educators that the FCCRI seems to be most effective for students who want to attend college but are not quite college ready and least effective for students who are disengaged from school and lack realistic postsecondary goals (Lansing, Ahearn, Rosenbaum, Mokher, & Jacobson, 2017; Mokher et al., 2014). As it may not be practical to limit CRS courses to students who indicate they are college-bound in Grade 11, particularly because high school students’ college intent is subject to change, the findings suggest that states looking to improve high school graduation and college enrollment rates should consider other policies.
However, the FCCRI may reduce enrollment in developmental education among students who do attend college, particularly lower- to mid-performing students. This is similar to findings in California, where students who participated in the statewide Early Assessment Program had a lower probability of needing developmental education courses at California State University, particularly those at the margins of remediation risk (Kurlaender et al., 2016). Other states may want to consider this type of initiative if they have similar goals.
Our study also has implications about the types of students who should be targeted by such policies. The FCCRI had different impacts based on students’ prior academic achievement. The initiative appears to have most helped students who were neither too far behind to catch up in a single year nor so advanced that they were already college ready. Other studies have found that targeting higher-performing students may even be harmful—an evaluation of a similar policy in West Virginia found negative effects, which may be attributed to students taking transition courses at the expense of more rigorous high school courses (Pheatt et al., 2016). States should collect feedback from educators, look at their own data to identify the students who are most likely to benefit, and use this information when devising eligibility criteria.
Finally, given state policymakers’ interest in improving college readiness, more effort should be focused on finding ways for high schools and colleges to work together. Although FLDOE advised high schools to work with local community colleges to develop the curriculum for the CRS courses, there is very little evidence of secondary–postsecondary collaboration around the CRS courses or college readiness more broadly (Mokher & Jacobson, 2017). To support students who are not yet college ready, states should work to ensure that students have early information about their level of college readiness, appropriate courses in high school to better prepare them for college-level work, and effective remediation options and supports in college.
Directions for Future Research
This study also provides several directions for future research, which we are currently investigating. First, what are the impacts of the FCCRI on longer-term outcomes, such as college persistence and degree completion? If more students enroll directly into nondevelopmental courses but pass rates decline, this could have negative implications if students become discouraged and drop out of college or if their time to degree increases. Second, how do impacts differ by student and school characteristics? We may expect to find variation in impacts across schools, and the FCCRI may also change achievement gaps in postsecondary outcomes across student subgroups, such as FRPL status. Future research could also use additional methods to extrapolate the RD impacts away from the cutoffs to further explore variation by student achievement (e.g., Angrist & Rokkanen, 2015; Tang, Cook, Kisbu-Sakarya, Hock, & Chiang, 2017). Third, what impact did the FCCRI have on student course taking in Grade 12? We are examining the extent to which students took CRS courses at the expense of more rigorous high school courses and how these results differ by student achievement. Last, given that recent high school graduates are no longer required to take the PERT or enroll in developmental education courses in college, is information from students’ high school records a sufficient substitute for PERT? We will examine the extent to which students are misplaced into college-level courses and compare how high school data perform relative to the PERT in predicting student success in college.
Acknowledgements
We thank the staff of the Florida Department of Education for their support of the project and for providing the data required for our analyses. We thank other members of the research team, including Louis Jacobson and James Rosenbaum, as well as the members of our Technical Working Group—Robert LaLonde, David Figlio, Stephen Raudenbush, and Jeffrey Smith—for helpful comments and suggestions.
Authors’ Note
This study was conducted under an institutional review board (IRB) approval from Western IRB (WIRB Protocol No. 20121640).
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The Institute of Education Sciences, U.S. Department of Education, supported this research through Grant R305E120010 to CNA. The report represents the best opinion of CNA at the time of issue and does not represent the views of the Institute or the U.S. Department of Education.
Authors
Christine G. Mokher is an associate professor of higher education at Florida State University and a senior research scientist at CNA. Her research examines state and local policies focused on college and career readiness and success, with a particular emphasis on student transitions from secondary to postsecondary education.
Daniel M. Leeds is a research analyst in the education division at CNA. His research interests include economics of education, labor economics, and demography.
Julie C. Harris is a research analyst in the education division at CNA. Her research interests include college readiness, English language learners, law, education finance, and issues surrounding school choice.
References
