Abstract
Background:
This article reports on the Future to Discover Project—a Canadian randomized controlled trial of two high school interventions—where data on key postsecondary enrollment outcomes were collected for two phases. During the initial phase, outcomes were recorded from administrative data and follow-up surveys. During the later phase, data came from administrative records only.
Objectives:
The article provides analyses that are informative about the consequences of a change from administrative-only data to survey-only data (and vice versa) for the estimation of impacts.
Results:
The change from administrative-only to survey-only data tended to produce apparent drops in postsecondary enrollment rates that varied by subgroup and education outcome. Nonetheless, levels and significance of impact with respect to postsecondary enrollment remained relatively stable.
Conclusions:
The findings of the article provide evidence that estimating education program impacts in the context of a randomized experiment can be relatively robust to the data sources chosen. They suggest that internal validity and conclusions for policy need not be affected by changing data sources even when the change produces marked changes in levels of the outcome of interest observed.
Introduction
It is not uncommon for researchers in social evaluations to rely on multiple data sources—whether these sources are collected specifically for the project or obtained from existing data—in measuring outcomes of interest (e.g., Dynarski & Wood, 1997; Ford et al., 2014a; Gyarmati, de Raaf, Nicholson, Kyte, & MacInnis, 2008; Kemple, 2001; Long, Gueron, Wood, Fisher, & Fellerath, 1996; Michalopoulos et al., 2002; Moore, 1997; Silverberg, Bergeron, Haimson, & Nagatoshi, 1996). This offers many advantages including expanding the set of observations, contrasting results, and validating interpretations. For multiyear projects, it is also common for the data sources to change over the years. The education evaluation reported here transitioned from data sources composed of a combination of administrative data sets and follow-up surveys to administrative data sets only. The change in data sources raises several questions that are addressed in the context of this article including: What are the consequences of the change in data sources? Do impacts found statistically significant when survey data were available remain significant when they are not?
While evaluations using multiple data sources have received considerable attention in studies of earnings (e.g., Barnow & Greenberg, 2015), they have received less attention in education. 1
One predecessor is Kemple (2001) in his analysis of the impact of Career Academies. He compared experimental impact estimates derived from a post-high school survey to those drawn from school records. He found “the pattern of impacts generally similar” between samples defined as low and medium risk with respect to dropping out of high school, with only minor differences in most cases. However, the dropout rate for control group students was higher in the administrative sample than in the survey sample, while the on-time graduation rate was substantially lower. Kemple attributed both discrepancies to control group dropouts being underrepresented in the survey.
Other pertinent lessons emerge from research on educational measures outside the context of evaluation. Bakker, Linder, and van Roon (2008) develop a useful typology for sources of error deriving educational attainment from survey versus administrative data. They conclude that the validity of administrative data for a measurement purpose ultimately depends on the alignment of interest between the administrative goal and the measurement one. As one example, student registrations may be more accurately captured by a government agency than their exam results, since the former carries more direct financial implications. Scott-Clayton and Wen (2018) compare survey based to simulated state administrative data estimates of the returns to a college degree in the United States and find upward and downward biases in administrative-sourced estimates that approximately cancel each other out. Lawson, O’Connor, Rouse, Street, and Kulneva (2010) draw attention to nonresponse bias in particular as an important source of error in education surveys. The prior literature would thus suggest taking into account differential survey response rates.
The Future to Discover Project is a large-scale Canadian randomized controlled trial designed to estimate impacts that result when 5,400 incoming Grade 10 students are offered (a) 3 years of enhanced career education programming (the Explore Your Horizons [EYH] intervention), (b) a guarantee of a generous student aid grant (the Learning Accounts [LA] intervention), (c) both, or (d) neither. 2 Findings up to and including the sixth year following recruitment were based on combined survey and administrative data. Analysis at Year 6 identified many significant impacts attributable to the interventions on access to postsecondary education. The project has since been extended to assess longer term outcomes in the 7th through 10th years following recruitment. Unlike the first 6 years, however, data are only being collected from administrative sources.
This article aims to learn whether the different data sources caused the estimated impacts to differ. It estimates outcome measures and impacts as determined by survey or administrative data and focuses on enrollment in different types of postsecondary education. It presents a sensitivity analysis for the year prior to the switch in data sources (Year 6), the last point at which the consequences of using different data sources can be examined. The primary challenge is that—as is the case in this article—administrative records for education in Canada are limited in both scope (types of education covered) and geographic reach. 3 If administrative records cannot be located for a study sample member, his or her educational outcome is not known for certain and researchers then typically infer nonparticipation in the outcome of interest. By contrast, surveys record outcomes regardless of participation in the outcome and service provider. This has direct consequences for the levels of the outcome of interest that are observed and may or may not affect the direction and size of the impacts attributed to the interventions under test.
The analysis shows that a change in data sources from administrative-only data to survey-only data in evaluating Future to Discover tended to produce apparent drops in postsecondary enrollment rates that varied by the subgroup of interest and education outcome (i.e., postsecondary, university only, college only). 4 This finding is consistent with the fact that the data indicating college participation have more conflicts in which administrative enrollment is not supported by survey enrollment. Nonetheless, the levels and significance of impact with respect to postsecondary enrollment remained relatively stable. Furthermore, the results from t tests show that, for the vast majority of subgroups, the impacts estimated with administrative-only data are not statistically different from the ones estimated with survey-only data. 5
The article therefore provides evidence that estimating education program impacts in the context of a randomized experiment can be relatively robust to the data sources chosen. It suggests that internal validity and conclusions for policy need not be affected by changing data source, even when it produces marked changes in levels of the outcome of interest observed. Interestingly, this is a counterexample to other findings in the recent literature, which conclude that impact estimates are often divergent depending on data source (see Moore, Perez-Johnson & Santillano, 2018; Yang & Hendra, 2018; for a review, see Barnow & Greenberg, 2015). Two main explanations for this difference can be advanced. First, these papers review employment outcomes, while the current study focuses on enrollment in education. This implies differences not only in the type of administrative data sets available in the studies but also in the motivation for answering survey questionnaires. Second, they review evaluations of programs in the United States and United Kingdom, while the study here is based on Canadian data: Important differences in the impacts of similar education interventions exist in these countries as shown by previous work (e.g., Ford et al., 2014a). The earlier Ford et al. study concluded that the Advancement Via Individual Determination (AVID) program—repeatedly reported as successful for “middle-achieving” students in U.S. high schools in studies such as Guthrie and Guthrie (2002) and Huerta, Watt, and Reyes (2011)—did not offer “middle-achieving” students in British Columbia any additional advantages with respect to postsecondary education compared to those already offered to them within their school system.
The article is organized as follows: The next section presents a brief overview of the Future to Discover Project. Section 3 then discusses the methodology of the article. Section 4 analyzes the sensitivity of impact findings to data sources. Section 5 concludes the article.
The Future to Discover Project
This section describes the Future to Discover Project’s context, interventions, outcomes of interest, and why the project presents a unique opportunity to assess the sensitivity of impact estimates to changes in data sources.
Overview
Future to Discover is a large demonstration project funded by the Canada Millennium Scholarship Foundation and the provincial governments of Manitoba and New Brunswick that has involved 5,429 students at 51 high schools since 2004. It aims to develop evidence about what works to increase access to postsecondary education, particularly for lower income students, and those whose parents have little or no postsecondary experience. 6 Because Future to Discover data collection ceased in Manitoba in 2011, and to facilitate the exposition, the rest of the article discusses New Brunswick outcomes only and focuses on students eligible for both interventions (2,124 students).
Future to Discover tested the effectiveness of two interventions designed to help students overcome certain barriers to postsecondary education, namely, lack of career clarity, misinformation about postsecondary education’s costs and benefits, and lack of financial resources. The two interventions were tested separately and in combination to estimate whether they would increase access to postsecondary education. The impact of Future to Discover’s interventions is measured using a rigorous random assignment of students that is described in the Method section (Methodology for Estimating and Comparing Impacts subsection). Data on outcomes—to be discussed in Future to Discover and the Discussion on Impact Sensitivity subsection—were collected from follow-up surveys and administrative data.
Brief Description of the Interventions
The details of the project’s implementation are described in Social Research and Demonstration Corporation reports (SRDC, 2007, 2009).
EYH
EYH was a career education intervention composed of six integrated components offered over 3 years of programming, through Grades 10, 11, and 12 of high school. It was intended to facilitate participants’ development of their own postsecondary plans, based on their passions and interests. It engaged parents as allies and existing postsecondary students as role models, providing enhanced career education beginning in Grade 10.
LA
LA was a financial incentive for high school students from families with annual household incomes below the provincial median (the exact threshold depended on family size). 7 An assumption underlying the development of LA was that lower income students anticipate having inadequate financial resources to pay for their postsecondary education, particularly university and college. LA participants who attended high school until graduation and who successfully enrolled in a postsecondary education program (recognized by Canada Student Loans) would receive up to a maximum of $8,000 CDN over 2 years to subsidize their postsecondary education expenses. 8 , 9
Research samples
For the sake of comparison, the study is restricted to students who were income eligible for both interventions (i.e., students from families below provincial median income). These students were eligible for assignment to one of the four following groups:
a control group that would not receive any interventions,
a group that would receive EYH only,
a group that would receive LA only,
a group that would receive both interventions combined.
The Method section provides a complete explanation of the methodological approach guiding the construction of the research sample.
Outcomes of Interest
The article presents three types of outcomes:
Enrollment in a university (equivalent to a U.S. 4-year college) is defined as being enrolled at a university in a (typically, 4-year) program leading to a degree, certificate, or diploma at the bachelor’s degree level or higher. This includes a teaching certificate; bachelor’s degrees (e.g., BA, BSc, BEd, BE, LLB); any certificate above a bachelor’s; master’s degrees (e.g., MA, MSc, MBA); degrees in medicine, dentistry, veterinary medicine, or optometry; doctorate or postdoctorate programs; and professional association diploma, certificate or license (e.g., accounting, banking, insurance). University enrollment also includes being enrolled at a college in a program that leads to a bachelor’s degree. Enrollments are analyzed cumulatively for the entire period (Year 1 to Year 6 of the project).
Enrollment in a college is defined as being enrolled in a community college or technical institute in a program leading to a degree, certificate, or diploma below a bachelor’s degree level, excluding any programs that would normally last 5 weeks or less and apprenticeship programs. College enrollment includes Collège d’Enseignement Général et Professionnel (CEGEP, a type of public preuniversity, postsecondary education collegiate institution), university transfer programs, certificate or diploma programs in cosmetology, business administration, radiology, certificate of bricklaying, and so on. College enrollment also includes being enrolled at a university in a program that leads to a diploma or certificate below a bachelor’s degree, excluding any programs that would normally last 5 weeks or less. Enrollments are analyzed cumulatively for the entire period (Year 1 to Year 6 of the project).
Enrollment in postsecondary education denotes enrollment by academic year in any university, college institution, private or vocational institute, or registered apprenticeship program. Enrollments are analyzed cumulatively for the entire period (Year 1 to Year 6 of the project).
Future to Discover and the Discussion on Impact Sensitivity
The Future to Discover Project offers a valuable opportunity to discuss the sensitivity of education impact estimates to the range of data sources used. 10 The reliance on different combinations of administrative and survey data at different phases of the project offers a case study with lessons that can serve researchers contemplating whether to use survey or administrative data.
Multiple data sources
Future to Discover relies on multiple sources of data. First, it utilizes postsecondary administrative data files. This includes college enrollment data from New Brunswick Community College, Collège Communautaire du Nouveau-Brunswick, and New Brunswick College of Craft and Design, which are responsible for all community college campuses in New Brunswick. Administrative data files also include university enrollment across three provinces (Nova Scotia, Prince Edward Island, and New Brunswick) obtained from the Maritime Provinces Higher Education Commission.
A second source of data is the Future to Discover 66-month student survey. In some instances, students could not be contacted directly, in which case the study uses data from a proxy survey of parents or guardians. 11
The issue: Validating a switch to “admin-only” outcome data sources
The data on key postsecondary enrollment outcomes for Future to Discover were collected for two phases. During the initial phase, findings up to the sixth year following recruitment were based on combined survey and administrative data. The project has since been extended to assess longer term outcomes in the 7th through 10th years following recruitment. Unlike the first 6 years, however, data from the seventh year onward have only been collected from administrative sources. The change in data sources raises challenges and questions since administrative and survey data have several differences, but it also raises the opportunity to address them through the comparison of results obtained using each type of data.
Distinctions between administrative and survey data sets in the context of Future to Discover
Scope of detection
Although administrative data contain accurate information on postsecondary enrollment, they can be somewhat incomplete, depending on the outcome of interest. The primary challenge in the context of Future to Discover is that administrative records are available only for those who participate in the outcome of interest within the scope of detection by a particular service provider, and it was impossible to obtain the participation of all service providers. 12 In other words, administrative records are limited in their geographic reach (in this case, Maritime Provinces for university, New Brunswick for college). 13 If administrative records cannot be located for a study sample member, his or her educational outcome cannot be known for sure and researchers will typically infer nonparticipation in the outcome of interest. By contrast, surveys record outcomes regardless of participation in the outcome and service provider. The availability of surveys can therefore complement an administrative data set—as is the case for Future to Discover (Coding of the Outcome Variables subsection)—which has direct consequences on the levels of the outcome of interest observed and may or may not affect the scale of impacts attributable to the interventions under test.
Coverage of outcomes
Another potential distinction between sources in education evaluations is their institutional coverage. While the administrative data sets cover universities and colleges only, the survey data contain additional types of education. Specifically, the survey also covers enrollment at a private or vocational institute (programs leading to a diploma or certificate excluding programs that would normally last 5 weeks or less and apprenticeship programs) and registration as an apprentice (training in a trade leading to a journey-person certificate). 14 These types of postsecondary education (PSE) enrollment account for about 15% of the sample (tables 4.2, 5.2, and 6.2 in Ford et al., 2012). The Future to Discover survey’s coverage of more broadly defined outcomes represents a source of divergence that must be taken into account when interpreting the results. Such divergence might not arise had the survey been written differently.
Nonresponse bias, recall error, and costs
Because of their larger scope and coverage, survey data can supplement education administrative data but have important limitations that include nonresponse bias, recall error, and costs. The response rate for Future to Discover was 67.84%. Nonresponse bias is a critical issue that can affect the reliability of survey results if, for instance, some subgroups tend to respond more (or less) than others. 15 However, nonresponse biases did not appear to be important in Future to Discover (Ford et al., 2012; SRDC, 2007), based on comparisons of the baseline characteristics of experimental groups and later differential survey attrition against baseline characteristics. Furthermore, years after the intervention, survey attrition is likely to be an additional concern. In either case, survey nonresponse implies survey nonrespondents’ outcomes can only be known from administrative data.
Recall error can also affect the accuracy of survey results. However, in the context of this study, this is less of a concern since enrollment—the main outcome here—is unlikely to be subject to considerable recall error by survey respondents. Also, enrollment is unlikely to be overlooked by institutions maintaining administrative records, given its financial importance to them.
Finally, a major limitation of surveys—especially in educational evaluations—is their cost. Indeed, obtaining participation of young adults is particularly demanding in terms of time and monitoring since they are a mobile population not necessarily motivated to participate in evaluation studies. For Future to Discover, the costs of the student and proxy surveys of parents and guardians could not be supported after the sixth year of the project, and the use of surveys had to be abandoned.
Method
Coding of the Outcome Variables
The different data source configurations call for clarity in the coding of the outcome variables. With regard to a particular type of educational enrollment (say, university), six circumstances can arise from all combinations of the two administrative data outcomes (enrolled or not enrolled) and the three survey data outcomes (enrolled, not enrolled, or no survey data).
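The six circumstances can be enumerated mechanically. The short Python sketch below is illustrative only: the state names and the descriptive labels returned by `classify` are assumptions made for this example and are not the coding scheme used in the study.

```python
from itertools import product

# Possible states per source (names are illustrative, not the study's codes).
ADMIN_STATES = ("enrolled", "not enrolled")
SURVEY_STATES = ("enrolled", "not enrolled", "no survey data")


def classify(admin: str, survey: str) -> str:
    """Attach a descriptive label to one (admin, survey) combination."""
    if survey == "no survey data":
        return "administrative data only"
    if admin == survey:
        return "sources agree"
    if survey == "enrolled":
        return "survey enrollment not supported by administrative records"
    return "administrative enrollment not reported in survey"


# Two administrative outcomes x three survey outcomes = six circumstances.
cases = [(a, s, classify(a, s)) for a, s in product(ADMIN_STATES, SURVEY_STATES)]

for admin, survey, label in cases:
    print(f"admin={admin:<13} survey={survey:<15} -> {label}")
```

Two of the six cells are conflicts between sources, two are agreements, and two rely on administrative data alone because no survey response exists.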
The tabulation of these cases for each educational outcome is presented in Table 1. There were 2,124 students in the study among whom about two thirds responded to surveys. 16 Nonresponse did not appear to introduce a systematic difference between the treatment and control groups. A detailed discussion of the survey rates can be found in Ford et al. (2012).
Table 1. Cross-Tabulations of Administrative and Survey Data.
Note. Percentages of the total number of records (2,124) are shown in parentheses. Italicized values indicate breakdowns within the subgroup of “All survey respondents”.
Across all panels of the table, survey and administrative records tend to be consistent with each other: When survey respondents say they are enrolled, they tend to be recorded as enrolled in administrative databases and vice versa. Specifically, among survey respondents, more than three quarters fall into that category (77% of the respondents for the postsecondary outcome, 95% for university, and 86% for college). Furthermore, across all education outcomes, the majority of nonrespondents are not recorded in administrative databases (74% of the nonrespondents for the PSE outcome, 88% for both university and college).
Nevertheless, some students have conflicting enrollment statuses that show enrollment for survey data that are not supported by administrative records or vice versa (15% of the entire sample for the postsecondary outcome, 3% for university, and 9% for college). Unsurprisingly, the most common conflicting case consists of survey enrollment not supported by administrative records (12% of the entire sample for the postsecondary outcome, 2% for university, and 4% for college). This case can be explained by the larger coverage of survey data: Students enrolled outside the Maritime Provinces, enrolled at a private or vocational institute, or enrolled at an apprenticeship program would not be recorded in the administrative databases of the study.
The other conflicting case, in which students appear as enrolled in the administrative databases but not in the survey, accounts for a smaller percentage of the sample (3% of the entire sample for the postsecondary outcome, 1% for university, 6% for college). 17 One can develop two plausible explanations for these odd cases. First, respondents (especially in the case of the proxy survey of parents and guardians) may simply have recorded inaccurate information. 18 Second, students who dropped out of university or college early may not have reported enrollment if they thought they did not stay long enough. However, such enrollment would have been captured by administrative databases, hence the inconsistency. A third explanation, administrative error in falsely recording postsecondary enrollment, seems implausible, given the student–institution interactions involved in generating a postsecondary record.
The patterns in Table 1 suggest that the survey data provide information that essentially confirms or supplements administrative records. This motivates the coding of enrollment described in Table 2. The coding is specific to each data source, so that the information conveyed by the source is kept. For example, the coding reflects the fact that the administrative data set may not be consistent with the survey data set and vice versa.
Table 2. Coding for the Analysis.
Methodology for Estimating and Comparing Impacts
To interpret the results appropriately, some prior understanding of the methodology for estimating impacts is also beneficial. 19
Method design
The core of the approach consists of randomly assigning students to program groups that receive one or both Future to Discover interventions (EYH, LA, or both) or to a control group. Since chance determines who is offered the program, differences in outcomes can be attributed causally to the offer of the intervention, eliminating any competing explanations that might normally arise due to preexisting differences between groups that receive different programs. This means all impact estimates represent the effect of the intention-to-treat, rather than the effect of treatment-on-the-treated. This is an important point when considering policy implications, but it is constant across data sources and so has no direct consequence for this paper’s consideration of the effect of switching data sources.
Random assignment
The random assignment of participants was undertaken by the Social Research and Demonstration Corporation (SRDC) using a computer program, following recruitment. 20 It is important to keep in mind that it is the offer of the intervention that was randomly assigned, not the treatment itself.
Given the number of research groups in New Brunswick, the assignment of students was one of the most complex ever used in a Canadian demonstration project. The process had to satisfy a number of requirements simultaneously, including creating an analytically useful sample, complying with the initial targets for participation in each group, maintaining feasible and comparable class sizes for EYH within each school, and staying within the budget allotted for follow-up surveys with participants. The proportions assigned to different groups reflected the need to make the interventions feasible for implementation and to ensure that the various research groups had control groups of equivalent size and characteristics.
The demographic and socioeconomic characteristics of the students recruited for the Future to Discover Project were not statistically different across the groups to be compared (SRDC, 2007; tables 4.23 and A4.1 through A4.14). It was found that the interventions had been implemented with high fidelity and thus given a fair test.
Impact estimation
To take into account the chance differences between groups arising from the random assignment, a regression adjustment strategy has been adopted. As a result, differences between program and control group outcomes can be reliably attributed to the offer of the intervention.
Specifically, ordinary least squares models were estimated, where the outcome is regressed on a treatment variable (a variable indicating whether the student was offered the intervention or not), and several other covariates collected in the baseline survey (prior to random assignment): number of children/adults in home, work status of the “signing” parent (the parent who signs the consent form), family income, gender of signing parent, age of student and signing parent, student gender, student disability indicators (difficulty seeing, hearing, learning; physical/mental condition or health problem), ethnicity indicators (White and Aboriginal indicators), average mark in Grade 9 (indicator for 80% or more), parents’ highest level of education, parental importance attached to the child obtaining a postsecondary education (indicator for “very important”), parental aspirations of the child’s educational attainment, an indicator of any barriers to the child reaching parents’ expectation, an indicator of the student ever working, and high school “fixed effects.” From this regression model, predicted outcomes are generated for two groups: students in the program group and students in the control group. In each case, predicted outcomes are calculated for the case of a student possessing all of the mean values of the covariates (“the average program group member” is compared to “the average control group member”).
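The regression adjustment can be illustrated with a minimal Python sketch. It is not the project's estimation code: the function name `adjusted_means`, the two synthetic covariates, and the simulated data are assumptions for illustration, and the sketch evaluates both predictions at the overall sample means of the covariates and omits the high school fixed effects.

```python
import numpy as np


def adjusted_means(y, treat, X):
    """OLS of y on an intercept, a 0-1 treatment dummy, and covariates.

    Returns the predicted outcome for the program group and for the control
    group, both evaluated at the sample means of the covariates, together
    with the treatment coefficient (the regression-adjusted impact).
    """
    y = np.asarray(y, dtype=float)
    treat = np.asarray(treat, dtype=float)
    X = np.asarray(X, dtype=float)
    design = np.column_stack([np.ones(len(y)), treat, X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    x_bar = X.mean(axis=0)
    control = beta[0] + x_bar @ beta[2:]   # prediction with treat = 0
    program = control + beta[1]            # prediction with treat = 1
    return program, control, beta[1]


# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 2))
treat = rng.integers(0, 2, size=n).astype(float)
y = (rng.random(n) < 0.40 + 0.10 * treat).astype(float)  # 0-1 enrollment

program, control, impact = adjusted_means(y, treat, X)
print(f"program={program:.3f} control={control:.3f} impact={impact:.3f}")
```

Because both predictions use the same covariate values, their difference is exactly the coefficient on the treatment dummy.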
Along with predicted outcomes and impacts, the tables described in the Sensitivity of Impact Findings to Data Sources in Future to Discover subsection also show standard errors clustered at the school level and calculated by means of bootstrap. 21
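One common way to obtain school-clustered bootstrap standard errors is to resample whole schools with replacement and re-estimate the impact on each resample. The Python sketch below shows that generic approach under stated assumptions: the function names, the number of replications, and the synthetic data are hypothetical, school fixed effects are omitted, and the exact bootstrap procedure used in the project may differ.

```python
import numpy as np


def ols_impact(y, treat, X):
    """Coefficient on the treatment dummy from an OLS with covariates."""
    design = np.column_stack([np.ones(len(y)), treat, X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta[1]


def cluster_bootstrap_se(y, treat, X, school, n_boot=200, seed=0):
    """Standard error of the impact from resampling whole schools."""
    rng = np.random.default_rng(seed)
    schools = np.unique(school)
    rows_by_school = {s: np.flatnonzero(school == s) for s in schools}
    draws = []
    for _ in range(n_boot):
        # Draw schools with replacement, keeping all students of each draw.
        sampled = rng.choice(schools, size=len(schools), replace=True)
        idx = np.concatenate([rows_by_school[s] for s in sampled])
        draws.append(ols_impact(y[idx], treat[idx], X[idx]))
    return float(np.std(draws, ddof=1))


# Synthetic data: 20 schools of 30 students each, for illustration only.
rng = np.random.default_rng(1)
school = np.repeat(np.arange(20), 30)
treat = rng.integers(0, 2, size=school.size).astype(float)
X = rng.normal(size=(school.size, 2))
school_effect = rng.normal(scale=0.1, size=20)[school]
y = (rng.random(school.size) < 0.40 + 0.10 * treat + school_effect).astype(float)

se = cluster_bootstrap_se(y, treat, X, school)
print(f"bootstrap SE of the impact: {se:.4f}")
```

Resampling schools rather than individual students preserves any within-school correlation of outcomes in each replication.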
Comparison between sources of data
Estimates of impacts could go up or down moving between the two data sources, but in theory, the ability to detect impacts should not be affected. There was no a priori expectation during the study design phase that the programs tested would generate differential impacts on dimensions (like type of postsecondary education institution) that align with differences between the data sources. The new interventions under test were designed very deliberately to promote all types of postsecondary education equally. However, it could be argued that Explore Your Horizons and Learning Accounts can theoretically lead to out-of-province postsecondary education and therefore be captured in survey but not in administrative data; this is not observed in practice. Indeed, the students in the study come from low-income families (the LA-eligible sample) who are less likely to leave their home province. 22 The article tests whether differences in impacts exist by statistically comparing the impacts calculated using administrative data against the impacts calculated using survey data.
Specifically, the following is implemented for each subgroup of interest (described in the Subgroups of Interest subsection). First, for each individual i, the enrollment status under administrative data (KAi) and survey data (KSi) is determined. Second, the administrative data and survey data are linked at the individual level, so that it is possible to calculate the difference Di = KSi − KAi. Third, the t tests are carried out in the same process as the impact (program vs. control) estimation described in the Impact Estimation subsection. This is possible because of the mathematical equivalence between the mean difference of estimated impacts between administrative data and survey data and the estimated mean “impact” on the difference of enrollment by data source (i.e., mean Di). Indeed, we have:
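In the notation of this subsection, the equivalence can be written out as follows. The group labels P (program) and C (control), the bar for a group mean, and the symbols for the two impact estimates are assumptions introduced here for illustration; superscripts A and S correspond to KAi and KSi.

```latex
\begin{aligned}
\hat{\Delta}^{S} - \hat{\Delta}^{A}
  &= \left(\bar{K}^{S}_{P} - \bar{K}^{S}_{C}\right)
   - \left(\bar{K}^{A}_{P} - \bar{K}^{A}_{C}\right) \\
  &= \left(\bar{K}^{S}_{P} - \bar{K}^{A}_{P}\right)
   - \left(\bar{K}^{S}_{C} - \bar{K}^{A}_{C}\right) \\
  &= \bar{D}_{P} - \bar{D}_{C},
  \qquad \text{where } D_i = K^{S}_i - K^{A}_i .
\end{aligned}
```

The same rearrangement holds for regression-adjusted means, because ordinary least squares coefficients are linear in the dependent variable.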
It follows that, in practice, the statistics associated with the impacts in the Sensitivity of Impact Findings to Data Sources in Future to Discover section are calculated using three separate ordinary least squares regressions, with KAi, KSi, and Di as the dependent variables, and a 0–1 dummy indicator of program and covariates as independent variables. The estimated coefficients of the program group dummy for the regression of KAi, KSi, and Di are the estimated mean impact from administrative data, estimated mean impact from survey data, and the estimated difference of impacts between administrative data and survey data, respectively. Once the standard errors are obtained, they are used to compute the Student t test for each estimate. The actual process, the list of covariates, and the bootstrap procedures used to compute the standard errors are described in the Impact Estimation subsection.
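The three-regression procedure can be checked numerically. The Python sketch below uses simulated data and hypothetical names (`program_coefficient`, `k_admin`, `k_survey`, and three arbitrary covariates); it is an illustration of the identity under those assumptions, not the project's code. It confirms that the program coefficient in the regression of Di equals the survey impact minus the administrative impact when all three regressions share the same regressors.

```python
import numpy as np


def program_coefficient(outcome, program, covariates):
    """Coefficient on the 0-1 program dummy from an OLS with covariates."""
    design = np.column_stack([np.ones(len(outcome)), program, covariates])
    beta, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    return beta[1]


# Simulated linked records, for illustration only.
rng = np.random.default_rng(42)
n = 400
program = rng.integers(0, 2, size=n).astype(float)
covariates = rng.normal(size=(n, 3))
k_admin = (rng.random(n) < 0.45 + 0.08 * program).astype(float)   # KAi
k_survey = (rng.random(n) < 0.55 + 0.10 * program).astype(float)  # KSi
d = k_survey - k_admin                                            # Di

impact_admin = program_coefficient(k_admin, program, covariates)
impact_survey = program_coefficient(k_survey, program, covariates)
impact_diff = program_coefficient(d, program, covariates)

# The two quantities agree up to floating-point rounding.
print(impact_survey - impact_admin, impact_diff)
```

The practical benefit of the third regression is that it delivers a standard error for the difference in impacts directly, which the two separate regressions do not.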
Subgroups of interest
The project seeks to determine the impacts of the interventions on students most likely to need additional support to access postsecondary education. These were identified at the outset as those whose families have lower incomes and whose parents have little or no experience of postsecondary education. Specifically, the results of the report are broken down across the following subgroups: (a) the Lower-Income and Lower parental Education (LILE) subgroup: among lower income families (with incomes at or below the provincial median), the distinguishing feature of this group is lower parental education, which is defined as neither parent holding a postsecondary diploma, certificate, or degree requiring 2 or more years of study; (b) the “First-Generation” Families (FGF) subgroup, which comprises students whose parents have no postsecondary experience at all (i.e., the highest education level of both parents at baseline was “high school or less”) and which, among lower income families, is a subgroup of the LILE group; (c) boys and girls subgroups; and (d) Francophone and Anglophone subgroups, since linguistic differences often translate into socioeconomic differences in Canada.
The results are organized by linguistic group of the students’ high school. Although the divide between Anglophones and Francophones is important throughout Canada, an officially bilingual country, New Brunswick is notably the only officially bilingual Canadian province. There are two separate education systems in the province for Francophone and Anglophone students; the former is about half the size of the latter. Accordingly, Future to Discover was partitioned by linguistic sector and random assignment was undertaken independently by sector. Sampling frames, rates, and assignment fractions were constructed within sector. For precision in interpretation, these structural features underlying the study are respected in analysis and presentation of results.
Sensitivity of Impact Findings to Data Sources in Future to Discover
The impacts of EYH and LA and their combination have been described in detail in Ford et al. (2012) and Ford, Grékou, Kwakye, and Nicholson (2014b). The reports found that these interventions had strong and statistically significant positive impacts on participation in different types of postsecondary programs (university, college, or all types of postsecondary education combined, as defined in the Outcomes of Interest subsection). Both interventions have statistically significant positive impacts on enrollment in postsecondary education, but with differences. Specifically, EYH has statistically significant positive impacts on enrollment in postsecondary education and university but not on enrollment in college, while LA on its own has positive impacts on enrollment in college. When the interventions are combined, the results largely follow the pattern for EYH. In addition, the results were statistically significant for several subgroups: Francophone students, LILE, girls, and, for LA only, boys.
The remainder of this section studies the sensitivity of outcomes to changes in data sources.
Presentation of the Tables
The full set of results (outcomes and impacts) for each subgroup of interest (all pooled together, LILE, FGF, non-FGF, boys, and girls) is shown in Tables A1–A6 in the Appendix, respectively. Each table is organized by linguistic group (Francophone and Anglophone) and reports the results relating to each education level for the three interventions, EYH (top panel), LA (middle panel) and their combination (bottom panel). Each panel reports the results obtained with administrative-only data and survey-only data. 23
As discussed in the Method section, the predicted outcomes are calculated based on the mean characteristics of each subgroup of interest. They therefore cannot be compared across groups, since such a comparison would combine true differences in outcomes with differences in group characteristics. 24 However, the influence of group characteristics on outcomes is removed in the predicted impacts, so readers interested in making comparisons across groups should compare predicted impacts.
Patterns
The results suggest that estimating education program impacts in the context of a randomized experiment can be relatively robust to the data sources chosen. In this case, internal validity and conclusions for policy are not affected by changing the data source, even when doing so produces marked changes in the observed levels of the outcome of interest.
To summarize the findings, the change in data source from administrative-only to survey-only data tends to produce apparent drops in enrollment rates across all outcomes and interventions except for postsecondary education under LA, for which the enrollment rates tended to increase (“ever enrolled in PSE” in the middle panel in Tables A1–A6). Nonetheless, these changes were unrelated to assignment of treatment or control status, and therefore did not appreciably affect the magnitude of the impact estimates. Levels and significance of impact with respect to all education outcomes remained relatively stable.
Changes in enrollment rates
The enrollment rates for the program and control groups are reported in Tables A1–A6 (first and second columns for Francophones, fourth and fifth columns for Anglophones). They show that enrollment rates were affected by the change from administrative to survey data. Across all subgroups of interest, the largest negative changes always occurred with EYH (top panel) or the combined intervention (bottom panel). In particular, college enrollment among Francophone participants under these interventions tended to show the most negative changes. By contrast, the largest increases across all subgroups tended to occur with LA, specifically for Anglophone students’ enrollment in postsecondary education, which tended to show the most positive changes.
Interestingly, the drops in university enrollment were often smaller than those in college enrollment. This can be explained by the fact that the data indicating college participation (Table 1) have more conflicts in which administrative enrollment is not supported by survey enrollment (see Note 17).
By subgroup of interest, the changes in enrollment levels ranged from −11 to +5 percentage points for all subgroups pooled together (program or control group in Table A1), from −9 to +7 percentage points for LILE (Table A2), from −12 to +6 percentage points for FGF (Table A3), from −13 to +11 percentage points for non-FGF (Table A4), from −19 to +6 percentage points for boys (Table A5), and from −10 to +9 percentage points for girls (Table A6).
Changes in impacts
The changes in levels of impacts resulting from the change from administrative data to survey data are reported in the third column for Francophones and the sixth column for Anglophones in Tables A1–A6. The tables show the range of the changes in levels of impacts across subgroups. Across all subgroups, the largest negative changes occurred with EYH (top panel) or the combined intervention (bottom panel). In particular, impacts on postsecondary education for Francophone participants under EYH tended to show the most negative changes. This result contrasts with impacts on postsecondary education for Anglophone participants under LA, which tended to show the largest increases.
By subgroup of interest, the changes in impacts ranged from about −5 to +6 percentage points for all subgroups pooled together (Table A1), from about −7 to +5 percentage points for LILE (Table A2), from about −8 to +5 percentage points for FGF (Table A3), from about −4 to +10 percentage points for non-FGF (Table A4), from about −9 to +12 percentage points for boys (Table A5), and from about −7 to +4 percentage points for girls (Table A6).
Despite these changes, the statistical significance of the impacts remained fairly stable for both linguistic groups (Table 3). When moving from administrative to survey data, the majority of impacts remained at the same level of significance, and significance disappeared (i.e., went from “significant” at the 10%, 5%, or 1% level to “nonsignificant”) or appeared (i.e., went from “nonsignificant” to “significant”) for only a few estimates (17 out of 108). Interestingly, LA tended to produce most of the changes in statistical significance levels (13 changes, all of which were increases). It is possible that LA, since it offers financial assistance, helped more students pursue postsecondary education outside the Atlantic region. Such students can be captured by survey data but not by the available administrative data, which potentially explains the changes in the statistical significance of impacts. The differences between the impacts themselves are tested statistically in the next subsection.
Table 3. Number of Impact Estimates for Which the Level of Significance Disappears, Appears, Decreases, or Increases When Switching From Administrative to Survey Data (Francophones and Anglophones Combined).
Note. “Appears” and “disappears” denote cases where impacts move within or outside the 10% cutoff when comparing/going from the administrative-only to the survey-only data. “Decreases,” “remains the same,” and “increases” denote whether the level of significance changed in any of the respective cases. Significant impacts that “appear” are included in the “increases” and significant impacts that “disappear” are included in the “decreases.” LILE = Lower-Income and Lower parental Education group; FGF = First-Generation Families (i.e., Parents with high school or less); Non-FGF = parents with any postsecondary education; EYH = Explore Your Horizons; LA = Learning Accounts; EYH + LA = combination of the two.
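The categories defined in the note to Table 3 amount to a simple classification rule, sketched below. This is an illustration rather than the authors' tabulation code: the function name `classify_change` is hypothetical, and the mapping of stars to significance levels (`*` = 10%, `**` = 5%, `***` = 1%) is assumed from common convention. The example inputs are the star patterns of three Francophone rows in the girls table in the Appendix.

```python
def classify_change(admin_stars: str, survey_stars: str) -> list[str]:
    """Classify how significance changes from administrative to survey data.

    Each argument is the string of stars attached to an impact estimate
    ("" = not significant at 10%; "*", "**", "***" assumed to mean 10%, 5%, 1%).
    """
    a, s = len(admin_stars), len(survey_stars)
    if s > a:
        labels = ["increases"]
    elif s < a:
        labels = ["decreases"]
    else:
        labels = ["remains the same"]
    # "Appears" and "disappears" are special cases of increases and decreases:
    # the estimate crosses the 10% cutoff in one direction or the other.
    if a == 0 and s > 0:
        labels.append("appears")
    if a > 0 and s == 0:
        labels.append("disappears")
    return labels

# Francophone girls, EYH, ever enrolled in PSE: ** under admin, * under survey.
print(classify_change("**", "*"))
# Francophone girls, EYH + LA, college: ** under admin, no stars under survey.
print(classify_change("**", ""))
# Francophone girls, LA, ever enrolled in PSE: *** under both sources.
print(classify_change("***", "***"))
```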
Are the Impacts From Administrative Data Statistically Different From the Impacts From Survey Data?
In this subsection, the impacts derived from the administrative and survey data are compared by means of t tests, as described in Comparison Between Sources of Data subsection. Specifically, for each subgroup (e.g., Francophone LILE), each intervention (e.g., EYH), and each education level (e.g., college), a t test determines whether the difference between the impacts obtained from the two types of data is statistically different from zero. The results from the two data sources statistically differ at the 10% level for only 6 cases out of 108 (rows labeled “Change” in Tables A1–A6). 25 Hence, the survey and administrative data impacts are generally not statistically different. This result confirms the analysis from above by showing that the results tend to be similar across data sources.
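As a worked illustration of this test, the sketch below computes the test statistic for two “Change” rows of the girls table in the Appendix (EYH, ever enrolled in PSE). It assumes a normal approximation for the two-sided p-value, which is a simplification introduced here; the article itself uses Student t tests with bootstrapped standard errors, and the helper name `t_and_p` is hypothetical.

```python
import math

def t_and_p(estimate: float, se: float) -> tuple[float, float]:
    """Return the t statistic and a two-sided p-value (normal approximation)."""
    t = estimate / se
    p = math.erfc(abs(t) / math.sqrt(2.0))
    return t, p

# Change in impact (survey minus administrative) and its standard error,
# in percentage points, for girls under EYH, "ever enrolled in PSE".
t_fr, p_fr = t_and_p(-1.94, 6.69)   # Francophone girls
t_en, p_en = t_and_p(-4.68, 6.04)   # Anglophone girls

for label, t, p in [("Francophone", t_fr, p_fr), ("Anglophone", t_en, p_en)]:
    verdict = "differs" if p < 0.10 else "does not differ"
    print(f"{label}: t = {t:.2f}, p = {p:.2f}, {verdict} at the 10% level")
```

In both cases the change in impact is well within one standard error of zero, so neither difference is significant at the 10% level, in line with the unstarred “Change” entries in the table.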
However, the results do suggest that the change in data sources mostly affected the results for Anglophones: all six cases that statistically differ across data sources concern Anglophones. Specifically, under the combined intervention, Anglophone participants’ impacts with respect to postsecondary education (bottom panel in Table A1) and Anglophone FGF participants’ impacts with respect to university (bottom panel in Table A3) are statistically different across types of data. This also applies to Anglophone non-FGF participants’ impacts with respect to postsecondary education under LA (middle panel in Table A4) and to Anglophone boys’ impacts with respect to postsecondary education under all interventions (Table A5).
These results for Anglophones are consistent with the observation in Ford et al. (2012) that they were more likely than Francophones to attend private colleges or vocational institutes and enter apprenticeships (tables 4.2, 5.2, and 6.2). These types of postsecondary education are not covered by the administrative data.
Conclusion
The Future to Discover project is a large-scale Canadian experiment designed to estimate impacts when 5,400 incoming Grade 10 students are offered (a) 3 years of enhanced career education programming (EYH), (b) a guarantee of a generous student aid grant (LA), (c) both, or (d) neither. Findings up to the sixth year following recruitment were based on combined survey and administrative data. Analysis at Year 6 identified many statistically significant impacts on access to postsecondary education attributable to the interventions. The project has since been extended to assess longer term outcomes in the 7th through 10th years following recruitment. Unlike the first 6 years, however, data for the seventh year onward have only been collected from administrative sources.
The primary challenge is that—as was the case in this article—administrative and survey data in education can have different coverage in terms of both scope (types of education covered) and geographic reach. In particular, administrative records in education are available only for those who participate in the outcome of interest within scope of detection by a particular service provider and it is sometimes impossible to obtain the participation of all service providers. In other words, administrative records can be limited in both scope and geographic reach. If administrative records cannot be located for a study sample member, his or her educational outcome is not known and researchers typically infer nonparticipation in the outcome of interest. By contrast, surveys record outcomes regardless of participation and service provider. This has direct consequences for the levels of the outcome of interest observed and may or may not affect the scale or direction of impacts attributable to the interventions under test.
The change in data sources from administrative-only to survey-only data tended to produce apparent drops in postsecondary enrollment rates that varied by subgroup. Nonetheless, levels and significance of impacts with respect to postsecondary enrollments as a whole, university enrollments, and college enrollments remained relatively stable. Furthermore, t tests showed that the differences in impacts based on the administrative-only data and survey-only data are overwhelmingly not statistically different from zero.
The findings of the article provide evidence that estimating education program impacts in the context of a randomized experiment can be relatively robust to the data sources chosen. They suggest that internal validity and conclusions for policy need not be affected by changing data sources, even when the change produces marked changes in the observed levels of the outcome of interest. In the study presented here, the outcome of interest (education program enrollment) is unlikely to be affected by survey recall error. This is in line with findings in Scott-Clayton and Wen (2018). Together, these two studies may provide important counterexamples to other findings that impact estimates often diverge between data sources (Barnow & Greenberg, 2015; Dorsett, Hendra, & Robins, 2018; Moore, Perez-Johnson & Santillano, 2018; Yang & Hendra, 2018).
Footnotes
Appendix
Table A6. Girls.
| Intervention | Education Outcome | Data Source | Francophone: Treated Group | Francophone: Control Group | Francophone: Impact (SE) | Anglophone: Treated Group | Anglophone: Control Group | Anglophone: Impact (SE) |
|---|---|---|---|---|---|---|---|---|
| Explore Your Horizons | Ever enrolled in PSE | Administrative-only data (1) | 58.28 | 44.00 | 14.28 (6.51)** | 39.25 | 40.84 | −1.59 (6.10) |
| | | Survey-only data (2) | 60.98 | 48.64 | 12.34 (6.44)* | 37.81 | 44.08 | −6.27 (5.84) |
| | | Change (=2−1) | 2.70 | 4.64 | −1.94 (6.69) | −1.44 | 3.24 | −4.68 (6.04) |
| | University | Administrative-only data (1) | 39.02 | 22.50 | 16.52 (5.42)*** | 27.67 | 27.95 | −0.28 (5.44) |
| | | Survey-only data (2) | 34.53 | 20.56 | 13.96 (5.16)*** | 21.48 | 21.76 | −0.28 (4.74) |
| | | Change (=2−1) | −4.49 | −1.93 | −2.56 (3.74) | −6.19 | −6.18 | 0.00 (3.85) |
| | College | Administrative-only data (1) | 24.89 | 25.93 | −1.04 (6.08) | 14.44 | 15.00 | −0.57 (4.85) |
| | | Survey-only data (2) | 23.21 | 21.17 | 2.04 (5.75) | 7.40 | 11.67 | −4.27 (3.74) |
| | | Change (=2−1) | −1.68 | −4.76 | 3.08 (5.38) | −7.04 | −3.34 | −3.70 (4.06) |
| | Sample size | | | | | | | |
| Learning Accounts | Ever enrolled in PSE | Administrative-only data (1) | 62.08 | 43.94 | 18.13 (5.71)*** | 41.99 | 39.38 | 2.61 (5.24) |
| | | Survey-only data (2) | 70.78 | 49.80 | 20.98 (5.77)*** | 46.69 | 42.65 | 4.04 (5.29) |
| | | Change (=2−1) | 8.70 | 5.86 | 2.85 (6.15) | 4.70 | 3.27 | 1.43 (5.82) |
| | University | Administrative-only data (1) | 34.44 | 24.46 | 9.97 (4.68)** | 25.85 | 26.82 | −0.98 (4.88) |
| | | Survey-only data (2) | 34.18 | 22.69 | 11.49 (4.53)** | 23.25 | 20.55 | 2.70 (4.27) |
| | | Change (=2−1) | −0.26 | −1.77 | 1.51 (3.07) | −2.60 | −6.27 | |
| | College | Administrative-only data (1) | 35.57 | 23.39 | 12.18 (5.55)** | 19.79 | 14.47 | 5.32 (4.50) |
| | | Survey-only data (2) | 32.23 | 20.50 | 11.73 (5.55)** | 14.35 | 11.29 | 3.05 (4.16) |
| | | Change (=2−1) | −3.33 | −2.89 | −0.45 (5.37) | −5.45 | −3.18 | −2.27 (4.15) |
| | Sample size | | | | | | | |
| Explore Your Horizons + Learning Accounts | Ever enrolled in PSE | Administrative-only data (1) | 63.95 | 42.92 | 21.03 (6.23)*** | 34.10 | 39.96 | −5.86 (4.87) |
| | | Survey-only data (2) | 62.65 | 48.11 | 14.54 (6.25)** | 40.17 | 43.06 | −2.89 (5.13) |
| | | Change (=2−1) | −1.30 | 5.19 | −6.49 (6.20) | 6.08 | 3.10 | 2.97 (5.59) |
| | University | Administrative-only data (1) | 34.15 | 23.21 | 10.94 (5.12)** | 26.90 | 26.33 | 0.57 (4.81) |
| | | Survey-only data (2) | 29.82 | 21.03 | 8.79 (4.91)* | 22.19 | 20.97 | 1.22 (4.35) |
| | | Change (=2−1) | −4.32 | −2.18 | −2.14 (3.42) | −4.70 | −5.35 | 0.65 (3.45) |
| | College | Administrative-only data (1) | 38.46 | 24.06 | 14.40 (5.88)** | 11.30 | 14.90 | −3.60 (4.13) |
| | | Survey-only data (2) | 28.77 | 21.30 | 7.46 (5.55) | 6.17 | 12.17 | −5.99 (3.81) |
| | | Change (=2−1) | −9.69 | −2.76 | | −5.12 | −2.73 | −2.39 (3.86) |
| | Sample size | | | | | | | |
Note. Survey and administrative data at Year 6. The largest negative and positive changes are highlighted for enrollments and underlined for impacts.
Acknowledgments
The authors wish to thank Dave Greenberg, Burt Barnow, Jacob Klerman, and participants of the 2014 APPAM Conference panel on Survey Data Versus Administrative Data in Evaluation, as well as four anonymous referees, for valuable comments.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The project described in this article was supported financially by the Canada Millennium Scholarship Foundation and the Government of New Brunswick.
