Abstract
The authors investigate the accuracy of young women’s retrospective reporting on their first substantial employment in three major, nationally representative U.S. surveys, examining hypotheses that longer recall duration, employment histories with lower salience and higher complexity, and an absence of “anchoring” biographical details will adversely affect reporting accuracy. The authors compare retrospective reports to benchmark panel survey estimates for the same cohorts. Sociodemographic groups—notably non-Hispanic white women and women with college-educated mothers—whose early employment histories at these ages are in aggregate more complex (multiple jobs) and lower in salience (more part-time jobs) are more likely to omit the occurrence of their first substantial job or employment and to misreport their first job or employment as occurring at an older age. Also, retrospective reports are skewed toward overreporting longer, therefore more salient, later jobs over shorter, earlier jobs. The relatively small magnitudes of differences, however, indicate that the retrospective questions nevertheless capture these summary indicators of first substantial employment reasonably accurately. Moreover, these differences are especially small for groups of women who are more likely to experience labor-market disadvantage and for women with early births.
1. Introduction
Sociologists and social demographers have devoted considerable attention to the accuracy of reporting on first-experienced events in domains including births, cohabitation and marriage, and sexual activity (Joyner et al. 2012; Kahn, Kalsbeek, and Hofferth 1988; Lauritsen and Swicegood 1997; Peters 1988; Wu, Martin, and Long 2001). The accuracy of data on these topics is critically important to life-course research, for which timing and sequencing of events are central concepts and the occurrence of key events signals transition into different life stages (Shanahan 2000). We are unaware of any previous efforts to evaluate the accuracy of questions on first significant employment or job experience, despite its importance in the early adult life-course. Through early employment, young adults develop skills, amass human capital, adopt workplace norms, and develop preferences for future work roles (Mortimer, Harley, and Aronson 1999). Early employment experiences predict the stability of employment later in the life-course (Alon, Donohoe, and Tienda 2001). The importance of first substantial employment as a life-course event is elevated in the current U.S. context, in which secure, stable jobs are increasingly less common (Kalleberg 2011).
In the present study, we evaluate the accuracy of reporting about the first occurrence of substantial employment among young women in three major nationally representative U.S. surveys: the 2006–2010 National Survey of Family Growth (NSFG; National Center for Health Statistics 2014), the Survey of Income and Program Participation (SIPP; U.S. Census Bureau 2014), and the National Longitudinal Study of Adolescent to Adult Health (Add Health; Harris 2009). Previous studies have evaluated the accuracy of reporting on employment status, employment transitions, hours worked, or number of jobs in a given period in the past (Belli, Bilgen, and Al Baghal 2013; Belli, Shay, and Stafford 2001; Bowers and Horvath 1984; Duncan and Hill 1985; Evans and Leighton 1995; Freedman et al. 1988; Horvath 1982; Jacobs 2002; Jürges 2007; Levine 1993; Kyyra and Wilke 2014; Manzoni, Luijkx, and Muffels 2011; Manzoni et al. 2010; Mathiowetz and Duncan 1988; Morgenstern and Barrett 1974; Pierret 2001; Sayles, Belli, and Serrano 2010; Sudman and Bradburn 1973). To our knowledge, no study has evaluated reporting on the timing of first employment.
Most studies evaluating retrospective reporting on employment have used contemporaneous panel-survey responses about a particular event as a source of verification against reports from the same respondent who later reported about the same event retrospectively (Belli et al. 2001, 2013; Dex and McCulloch 1998; Jacobs 2002; Jürges 2007; Kyyra and Wilke 2014; Levine 1993; Pierret 2001; Sayles et al. 2010). A second approach is to link administrative records with survey data. The only studies that we are aware of that used linked administrative data on employment did so with administrative data from a single U.S. employer linked to retrospective survey reports from a sample of its employees (Duncan and Hill 1985; Mathiowetz and Duncan 1988). Linked administrative data have, however, been used to evaluate survey reporting on earnings (Bricker and Engelhardt 2008; Kapteyn and Ypma 2007; Kim and Tamborini 2014) and, in countries with register data, on unemployment (Pina-Sanchez, Koskinen, and Plewis 2014).
A third, less common approach is to use independent benchmark survey data to evaluate retrospective reports from a different survey. Manzoni et al. (2011) used panel data (the German Socio-Economic Panel) to evaluate memory bias in retrospective reports in a cross-sectional survey (the German Life History Study) that covered the same period. Our study similarly uses an independent, annual panel survey, the National Longitudinal Survey of Youth, 1997 cohort (NLSY97; Bureau of Labor Statistics 2014), as the benchmark data source. We use the NLSY97 to evaluate retrospective questions on first substantial employment across three U.S. surveys.
The benchmark-survey approach results in two major contributions to the evaluation of retrospective survey questions. First, it allows the evaluation of surveys for which there is no independent source of verification, whether in earlier contemporaneous responses by the same individual in a panel survey or in linked administrative data. There are no independent sources of verification of first substantial employment for respondents in the three surveys we examine. Second, external survey data may provide additional information on sociodemographic subgroup differences in the events reported. This information can then be used to predict which subgroups may be more susceptible to inaccurate responses to questions and thereby to introduce biases to any subsequent analyses that draw on subgroup differences. In the present study, we identify differences between sociodemographic groups in the likelihood of having had two or more jobs in a year, having had two or more jobs in a month, and whether the first job of six months or more was part-time versus full-time. We then treat these measures as indicators of group-level employment-history complexity and salience and, on the basis of the evidence they offer, use them to generate predictions that less economically advantaged groups will more accurately report the occurrence and timing of their first substantial employment. Our study’s empirical findings support these predictions across the three evaluated surveys.
These two major methodological contributions from using independent survey data as a benchmark source are especially relevant to countries in which, at any given period, multiple nationally representative surveys are fielded. Estimates from these multiple surveys may be viewed as having been generated from independent samples of a common population. On the basis of survey theory, researchers can make informed, a priori evaluations of which of these multiple surveys’ questions is expected to provide the more accurate information and therefore can serve as the benchmark data source. In particular, survey theory tells us that reports obtained from questions about very recent events will be more accurate than reports on events from longer ago.
In Section 1.1 of this paper, we introduce key concepts and findings from theory on the accuracy of survey reporting. In Section 2 we describe how we compare retrospectively reported estimates of the occurrence and timing of first substantial employment from the Add Health, NSFG, and SIPP 2004 and 2008 panels to estimates from the benchmark NLSY97. We discuss the rationale for this method of assessing retrospective-reporting error, and its limitations. In Section 3 we present the results of our evaluation of the retrospective reports of first employment. In Section 4 we discuss implications of our results for survey methodologists and social scientists.
1.1. Theory on Survey Reporting Accuracy
At least four major factors have been identified as accounting for inaccuracies in respondents’ reporting of life-course events (see, e.g., Schaeffer and Presser 2003). Respondents are more likely to remember and report accurately on the occurrence and characteristics of events that are salient to them (e.g., a particular job) or on salient categories of events (e.g., their employment histories overall). Events that are rarer, and have greater social and economic consequences for the respondent, whether positive or negative (Linton 2000), are typically more salient to them.
Whereas more salient events are easier for respondents to remember, topics situated within more complex patterns of events are harder for them to remember (Sudman, Bradburn, and Schwarz 1996). Complexity of employment histories can be characterized by a respondent’s having multiple and layered jobs, with many starts and stops of jobs, overlapping of jobs, and spells of both full-time and part-time employment. Mathiowetz and Duncan (1988) found that respondents with more, shorter spells of unemployment within a period of time (i.e., multiple start and stop dates) reported their unemployment less accurately than respondents with single, longer spells of unemployment or no unemployment. Freedman et al. (1988) found a higher prevalence of errors in reporting part-time employment than in reporting full-time employment. Belli and colleagues (Belli et al. 2001, 2013; Sayles et al. 2010) found substantial inaccuracies in respondent reporting of hours and weeks worked and of transitions in and out of employment and unemployment.
The length of the period a respondent is asked to recall presents a cognitive challenge to the respondent’s recall abilities (Sudman et al. 1996). Longer recall duration can result not only in underreporting of the occurrence of discrete events but also in oversimplification of reporting on sequences of events (Dex and McCulloch 1998; Jürges 2007; Manzoni 2012; Manzoni et al. 2011; Pierret 2001). For example, Manzoni et al. (2010) and Manzoni et al. (2011) found a “smoothing” effect of reporting on complex employment after extended recall time had elapsed, with, respectively, oversimplified reports of employment careers in later versus earlier reports and oversimplified reports of employment transitions in retrospective reports versus panel surveys. Among errors of recall, telescoping is the term used to describe the process of misreporting the timing of the event (Gaskell, Wright, and O’Muircheartaigh 2000). In “backward telescoping,” a respondent reports an event as having occurred earlier than it actually did. In “forward telescoping,” he or she reports an event as having occurred later than it actually did. Cognitive ability is also theorized to aid recall accuracy. Although we are not aware of tests of the role of cognitive ability specifically in the employment domain, Wu et al. (2001) found that teenage girls with higher scores on cognitive tests, in particular higher verbal ability, reported more accurately the timing of their first sexual intercourse.
The use of individuals’ life events as cues to anchor their recall of the timing of other events can also play a role in reporting accuracy. Individuals who can recall a topic in relation to a personally significant event, and those who are prompted to report on whether an event occurred before or after a shared, publicly significant event, report more accurately on event timing than those with no such anchoring details (Loftus and Marburger 1983). Manzoni (2012) found that women with children report on their employment transitions more accurately than women without children, although Manzoni did not find this same effect for married versus unmarried women.
Individuals’ sociodemographic characteristics—such as women’s birth cohort, race/ethnicity, family formation behavior, and socioeconomic status—may also affect reporting accuracy. Exploring this is important in the U.S. context, in which racial and socioeconomic inequalities may lead to large differences in early working-life experience by race and education (Bernhardt et al. 2001; Kalleberg 2011). These differences in work histories may in turn lead to differences in the salience and complexity of employment patterns among members of different sociodemographic groups, with implications for these groups’ aggregate reporting accuracy. Mathiowetz and Duncan (1988) found more accurate reporting of unemployment spells among individuals with higher education levels, but they found that this association disappeared after controlling for variables representing the salience and complexity of the unemployment spells being recalled.
2. Data and Methods
Using the NLSY97 as the standard for comparison, we evaluate the accuracy of women’s retrospective survey reporting on the date of their first substantial employment in retrospective questions from the cross-sectional 2006–2010 NSFG, from Wave 1 of the 2004 and 2008 panels of the SIPP, and from Wave 4 of Add Health, conducted in 2008–2009. Table 1 summarizes the characteristics of the samples, interviews, and outcomes we analyze. The SIPP, NSFG, and Add Health ask retrospective questions about first employment differently (see the Appendix, available online, for additional details on question wording). In the SIPP, first substantial employment is defined as a first job, irrespective of whether full-time or part-time, of six months or more in duration. In the NSFG, it is defined as a first period of full-time work of six months or more (not necessarily all at the same job). In Add Health, it is defined as a first period of full-time work of any duration, undertaken while not a student, and not including summer jobs. For brevity, we refer to these three distinct operationalizations of first employment collectively as “first substantial employment.”
Table 1. Summarized Characteristics of the Evaluated Surveys and of the NLSY97 Benchmark Survey
Note: Add Health = National Longitudinal Study of Adolescent to Adult Health; NLSY97 = National Longitudinal Survey of Youth, 1997 cohort; NSFG = National Survey of Family Growth; SIPP = Survey of Income and Program Participation.
NLSY97 respondents are interviewed annually and are sequentially asked about events in their different life domains (Pierret et al. 2007). Pierret (2001) showed that, for the 1979 cohort of the National Longitudinal Survey of Youth, which has a comparable interview structure to the NLSY97, annual reporting of employment yields substantially more accurate reports of numbers of employers than does reporting in biennial interviews. When they were 14 years of age, NLSY97 respondents were asked about all employment activities since they turned 14. In subsequent interviews, they were asked about all employment in the previous year. Respondents who had had jobs at the previous interview were prompted to report on whether and when those previous jobs ended, as well as on current jobs. We construct our benchmark measures of occurrence and timing of first substantial employment in the NLSY97 from all jobs that respondents held after they turned 16, including jobs that began before they turned 16 but continued while they were aged 16 or older. Sixteen is the age at which the U.S. Fair Labor Standards Act no longer sets limits on the number of hours an individual can work (U.S. Department of Labor 2015), which we consider to be a prerequisite for our definition of “substantial” work. The level of detail in the NLSY97, with start and end dates and hours worked for all jobs, allows us to match each of the NSFG, SIPP, and Add Health first-employment definitions and to match to Add Health’s additional information on number of jobs and duration of first full-time job.
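The matching of a detailed job history to the three survey definitions can be sketched as follows. This is a minimal illustration, not the authors' code: the JobSpell record, the 35-hour full-time threshold, the month-index dating, and the rule that NSFG "periods" are contiguous or overlapping full-time spells are all assumptions introduced here, and the age-16 restriction described above is not applied.

```python
from dataclasses import dataclass
from typing import List, Optional

FULL_TIME_HOURS = 35  # assumed cutoff; each survey's own threshold may differ
MIN_MONTHS = 6        # "six months or more" in the SIPP and NSFG definitions


@dataclass
class JobSpell:
    """Hypothetical job record; months are counted from an arbitrary origin."""
    start: int                 # first month worked (inclusive)
    end: int                   # last month worked (inclusive)
    hours_per_week: float
    while_student: bool = False
    summer_job: bool = False

    @property
    def months(self) -> int:
        return self.end - self.start + 1


def first_sipp(spells: List[JobSpell]) -> Optional[int]:
    """SIPP-style: first single job, any hours, lasting six months or more."""
    return min((s.start for s in spells if s.months >= MIN_MONTHS), default=None)


def first_add_health(spells: List[JobSpell]) -> Optional[int]:
    """Add Health-style: first full-time job of any length, not while a
    student and not a summer job."""
    return min(
        (s.start for s in spells
         if s.hours_per_week >= FULL_TIME_HOURS
         and not s.while_student and not s.summer_job),
        default=None,
    )


def first_nsfg(spells: List[JobSpell]) -> Optional[int]:
    """NSFG-style: first run of full-time work of six months or more,
    allowing the run to span several back-to-back jobs (assumed rule)."""
    full_time = sorted((s.start, s.end) for s in spells
                       if s.hours_per_week >= FULL_TIME_HOURS)
    run_start = run_end = None
    for start, end in full_time:
        if run_start is None or start > run_end + 1:
            run_start, run_end = start, end   # a gap starts a new run
        else:
            run_end = max(run_end, end)       # extend the current run
        if run_end - run_start + 1 >= MIN_MONTHS:
            return run_start
    return None


# Made-up history: a short part-time job, two back-to-back full-time jobs,
# then a long part-time job.
history = [JobSpell(0, 2, 20), JobSpell(3, 5, 40),
           JobSpell(6, 9, 40), JobSpell(12, 20, 15)]
```

For this invented history the SIPP-style rule picks the long part-time job, whereas the NSFG-style and Add Health-style rules both pick the first full-time job, which illustrates why the three operationalizations are compared with separately constructed benchmark measures.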
Our population of interest is women born in the United States between 1980 and 1984 who are non-Hispanic white, non-Hispanic black, and Hispanic of any race. For brevity, we refer to these three groups below as white, black, and Hispanic, respectively. We include in our study only women born in the United States, because the foreign-born population sampled in 1994 for Add Health and 1997 for the NLSY97 differs from that sampled in the mid- to late 2000s for the NSFG and the SIPP. We examine 1980–1984 birth cohorts in the NSFG and SIPP because women born in these years can be found in both these two surveys and the NLSY97. The sample design of the Add Health restricts our comparisons with the NLSY97 to women born between 1980 and 1982. For more details on the analytic samples, see the online Appendix.
To measure the occurrence of first employment, we examine whether a woman had a first substantial job or employment experience by the end of calendar year 2002 (SIPP and NSFG) or by her age at the end of 2002 (Add Health). We choose 2002 as a cutoff point both because it represents the latest year reported on fully by all members of the 2004 SIPP panel (jobs starting in 2002 are followed long enough to have potentially “lasted 6 straight months or more”) and because it serves as a meaningful marker of “early employment” for all five birth cohorts. Women born between 1980 and 1984 were aged 18 to 22 at the end of 2002; those born between 1980 and 1982 were aged 20 to 22.
To measure the timing of first employment, we examine the respondent’s age at the beginning of her first reported substantial employment, among those respondents who report having had substantial employment experiences by their retrospective interviews in 2008–2010. In 2008–2010, women born between 1980 and 1984 ranged in age from 24 to 30; those born between 1980 and 1982 ranged in age from 26 to 30.
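The age ranges quoted for the two cutoffs follow from simple calendar arithmetic. The short sketch below uses a helper name, age_attained, chosen here for illustration, and treats age as the age reached by the end of the calendar year.

```python
def age_attained(birth_year: int, calendar_year: int) -> int:
    """Age reached by the end of calendar_year."""
    return calendar_year - birth_year


# Ages at the end of 2002 for each single-year birth cohort, 1980-1984.
ages_2002 = {b: age_attained(b, 2002) for b in range(1980, 1985)}

# Youngest and oldest ages attained across the 2008-2010 interview years.
range_1980_1984 = (age_attained(1984, 2008), age_attained(1980, 2010))
range_1980_1982 = (age_attained(1982, 2008), age_attained(1980, 2010))
```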
To create the NLSY97 sample to measure whether a first substantial job was held by 2002, we include NLSY97 respondents who were interviewed every year, with no attrition or skipped years, through an interview that covered December 2002 (for SIPP and NSFG comparisons) or that covered the month when the respondent reached the age she would be at the end of 2002 (for Add Health comparisons). When the outcome variable is the respondent’s age at her first reported substantial employment, we observe NLSY97 respondents through an interview that covers the last interview date of the comparison survey.
Attrition reduces our sample sizes for the Add Health and SIPP, and for the benchmark NLSY97. In the Add Health, 16.3 percent of the original group of non-Hispanic white, non-Hispanic black, and Hispanic women of any race who were born in the United States between 1980 and 1982 and were interviewed in Wave 1 dropped out of the survey prior to Wave 4. In each SIPP panel, we require observation up to Wave 2, four months after Wave 1. Between Waves 1 and 2, 16.3 percent and 16.8 percent of cases were lost to attrition, respectively, in the SIPP 2004 and 2008 panels.
Attrition varies in the NLSY97 according to the comparison for which it is used. To create each NLSY97 sample for comparison with SIPP, NSFG, or Add Health on the respective measure of first substantial employment by 2002, we include NLSY97 respondents who were interviewed every year, with no attrition or skipped years, through either a 2003-wave interview or a 2002-wave interview that took place during or after December. In doing so, we lose to attrition 21.6 percent of the sample of white, black, and Hispanic women born in the United States between 1980 and 1984. To create the NLSY97 sample for comparison with Add Health on the measure of whether a first substantial job was held by the respondent’s age at the end of 2002, we include NLSY97 respondents who were interviewed every year, with no attrition or skipped years, through either a 2004-wave interview or a 2003-wave interview that took place during or after December 2003. In doing so, we lose 21.4 percent of the sample of white, black, and Hispanic women born in the United States between 1980 and 1982.
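The continuous-observation rule can be expressed as a small eligibility check. This is a sketch under stated assumptions: the data layout (a mapping from survey round to the year and month of the interview) and the function name are hypothetical, and 1997 is taken as the first round.

```python
from typing import Dict, Tuple

YearMonth = Tuple[int, int]  # (year, month) of the interview


def continuously_observed(interviews: Dict[int, YearMonth],
                          cutoff: YearMonth = (2002, 12),
                          first_round: int = 1997) -> bool:
    """True if the respondent was interviewed in every round, with no skipped
    rounds, up to a round whose interview took place in or after the cutoff."""
    round_year = first_round
    while round_year in interviews:
        if interviews[round_year] >= cutoff:  # tuples compare year, then month
            return True
        round_year += 1
    return False  # a round was missed before the cutoff month was covered


# Made-up respondents keyed by survey round.
base = {r: (r, 6) for r in range(1997, 2002)}
late_2002 = {**base, 2002: (2002, 12)}                  # 2002 round in December
via_2003 = {**base, 2002: (2002, 11), 2003: (2003, 11)}  # covered by 2003 round
dropped = {**base, 2002: (2002, 11)}                    # no 2003 round
skipped = {r: d for r, d in via_2003.items() if r != 2000}
```

Changing cutoff to (2003, 12) gives the analogous check for the Add Health comparison sample described above.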
When the outcome variable is timing of first reported substantial employment, we observe NLSY97 respondents until the last relevant interview date of the comparison survey. For the SIPP 2008 comparison, we include NLSY97 respondents who were interviewed continuously at every wave up through an interview covering December 2008, including both 2008 and 2009 interviews. For the NSFG comparison, we include NLSY97 respondents who were interviewed continuously at every wave up through an interview covering June 2010, including 2010 interviews only. For the Add Health comparison, we include NLSY97 respondents who were interviewed continuously at every wave up through an interview covering February 2009, including both 2009 and 2010 interviews. In excluding respondents not interviewed continuously through December 2008 in the SIPP comparison, we lose to attrition an additional 17.1 percent of NLSY97 women (33.9 percent cumulative attrition). In excluding respondents not interviewed continuously through June 2010 in the NSFG comparison, we lose to attrition an additional 18.3 percent of NLSY97 women (39.9 percent cumulative attrition). In excluding respondents not interviewed continuously through February 2009 in the Add Health comparison, we lose to attrition an additional 14.6 percent of NLSY97 women (36.0 percent cumulative attrition). We discuss possible biases due to attrition in the NLSY97 in the “Results” section.
2.1. Analyses
We evaluate the effects on reporting accuracy of recall duration, anchoring family formation events, and employment-history salience and complexity. We evaluate the effects of recall duration primarily by comparing reports about our two dimensions of first substantial employment from the retrospective questions in the SIPP 2004 and 2008, NSFG, and Add Health to panel reports in the benchmark NLSY97. For example, a first full-time job occurring in 2001 would be reported in 2008 or 2009 by an Add Health respondent, whereas it would be reported on in 2001 or 2002 by an NLSY97 respondent (see again Table 1 for the interview years of each evaluated survey). The SIPP allows recall-duration effects to be assessed additionally by comparing estimates from the same question asked in the initial wave of the 2004 panel versus in the initial wave of the 2008 panel. The respondent’s year of birth is a further measure of recall duration: a respondent with an earlier year of birth will likely have a longer time since the beginning of her employment history than a respondent born later. We evaluate the effects of anchoring by interpreting having married or given birth as biographical anchors relative to the timing of first substantial employment.
To measure employment-history salience and complexity, we rely mostly on aggregate differences by sociodemographic group. We use the NLSY97 to identify these differences. We estimate proportions with two or more jobs of any type in a given year, two or more part-time jobs in a given month, and whether the respondent’s first job was full-time or part-time. We interpret a larger number of jobs in a given year or month as indicating higher complexity, and we interpret full-time jobs as being more salient than part-time jobs. We estimate these statistics by single-year birth cohort, race/ethnicity, mother’s education, and by family formation by 2002, anticipating that these group-level differences in employment-history salience and complexity will induce group-level differences in accuracy of reporting on first substantial employment.
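A weighted group-level proportion of this kind can be computed as in the sketch below. The record layout, field names, and toy values are invented for illustration; they are not NLSY97 variables or estimates.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping


def weighted_share(records: Iterable[Mapping], group_key: str,
                   flag_key: str, weight_key: str = "weight") -> Dict[str, float]:
    """Weighted proportion of respondents in each group with flag_key true."""
    flagged = defaultdict(float)
    total = defaultdict(float)
    for r in records:
        total[r[group_key]] += r[weight_key]
        if r[flag_key]:
            flagged[r[group_key]] += r[weight_key]
    return {g: flagged[g] / total[g] for g in total}


# Toy data: the flag marks two or more jobs of any type in a given year.
toy = [
    {"group": "A", "weight": 2.0, "multi_job_year": True},
    {"group": "A", "weight": 1.0, "multi_job_year": False},
    {"group": "B", "weight": 1.0, "multi_job_year": False},
    {"group": "B", "weight": 3.0, "multi_job_year": True},
]
shares = weighted_share(toy, "group", "multi_job_year")
```

A higher share for a group would be read, in the terms used above, as indicating a more complex aggregate employment history.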
In the Add Health only, we are also able to include direct measures of the complexity and salience of an individual’s employment history. These are the length of her first reported full-time job and the number of jobs of at least 10 hr per week lasting at least nine weeks that she reports having had between 2001 and February 2009. We interpret the length of the first reported full-time job as a measure of salience, with longer jobs indicating greater salience. We interpret the number of jobs as a measure of complexity, with more jobs reported indicating greater complexity.
In multivariate analyses, we separately pool the NLSY97 with the NSFG, the NLSY97 with the SIPP, and the NLSY97 with the Add Health. We conduct two regression analyses with each pooled data file. The first regression analysis is a logistic regression in which the outcome variable is whether the respondent reported attaining first substantial employment by 2002, between the ages of 18 and 22. In this analysis, we include both women who reported having a substantial job by 2002 and those who did not. In each pooled sample, we test for differences in the retrospectively reported first substantial employment outcome measures (i.e., as reported in the NSFG, SIPP, or Add Health) relative to those same measures derived from the benchmark NLSY97 panel reports. We include as a covariate a variable denoting whether the respondent is drawn from the retrospectively reporting survey (SIPP, NSFG, or Add Health). We interact this retrospective-survey variable with our sociodemographic covariates, which include single-year birth cohort, race/ethnicity, and whether a woman ever gave birth or ever married by the end of calendar year 2002 (SIPP and NSFG) or by the age she attained in December 2002 (Add Health). In the regressions contrasting the NSFG and Add Health versus NLSY97, we also include mother’s educational attainment as a covariate. (No comparable mother’s-education variable is available in the SIPP.) We assess in this first regression whether retrospectively reporting women were more or less likely to report having had a first substantial job by 2002. Included in the “zeros” of this outcome variable are women who were employed but had either forgotten the job, misremembered its length, or forgotten that it had occurred already by 2002. Recall error will be indicated by retrospective reporters’ (in the SIPP, NSFG, or Add Health) lesser likelihood of reporting a first substantial employment by 2002 relative to panel reporters in the NLSY97.
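To see what the retrospective-survey main effect and its interactions measure, consider the simplest case of one binary group variable and no other covariates. In that saturated case the logit coefficients reduce to contrasts of log odds, as the sketch below shows. The proportions are invented for illustration, and the full models described above include more covariates, so their coefficients will not reduce to these simple contrasts.

```python
import math
from typing import Dict, Tuple


def logit(p: float) -> float:
    return math.log(p / (1.0 - p))


def saturated_contrasts(p: Dict[Tuple[str, str], float]) -> Tuple[float, float]:
    """Return (main effect, interaction) on the log-odds scale.

    p maps (survey, group) to the proportion reporting first substantial
    employment by the cutoff, with survey in {"bench", "retro"} and group
    in {"ref", "other"}.
    """
    main = logit(p[("retro", "ref")]) - logit(p[("bench", "ref")])
    other_gap = logit(p[("retro", "other")]) - logit(p[("bench", "other")])
    return main, other_gap - main


# Invented proportions: the reference group underreports in the retrospective
# survey; the other group reports the same proportion in both surveys.
toy_p = {("bench", "ref"): 0.80, ("retro", "ref"): 0.75,
         ("bench", "other"): 0.70, ("retro", "other"): 0.70}
main_effect, interaction = saturated_contrasts(toy_p)
```

Here the negative main effect indicates underreporting in the reference group, and the positive interaction of equal size indicates that the other group shows no such underreporting.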
Our second regression analysis is a linear regression in which the outcome variable is the respondent’s age at her first reported substantial employment. In this analysis, worse reporting among retrospectively reporting respondents will be reflected by an older age at the start of their reported first substantial employment. We limit this sample to women who reported having experienced first substantial employment at any time by 2008–2010, thereby excluding women from the 2004 SIPP panel. We include the same sociodemographic covariates as in the logistic regression of first job or employment by 2002. In the Add Health/NLSY97 comparison of women’s reporting of age at first full-time employment, we also include as covariates the length of the job the respondent said was her first full-time job and the number of jobs the respondent reported having had of at least 10 hr per week that lasted at least nine weeks, between 2001 and the year prior to the interview date. We use the sample weights of the respective surveys throughout our analyses to account for differences in the sample designs and oversampling plans. See the online Appendix for additional detail on weighting.
Because we use an independent benchmark survey, not an alternative source of verification on a given surveyed individual’s retrospective responses, our estimates of response error are aggregate measures over all members belonging to a sociodemographic group. Our main measures of retrospective reporting accuracy versus inaccuracy are regression coefficients for the “retrospective survey” main effect and for interactions of the sociodemographic-group variables with “retrospective survey.” These coefficients are interpretable respectively as conditional-mean estimates of retrospective-question bias for the reference sociodemographic category in the regression model and as mean sociodemographic differences in this bias. These estimates of retrospective-reporting bias are additionally subject to sampling error present in both the retrospective and benchmark surveys, and this sampling error is represented by the standard errors of the regression coefficients.
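One way to write the two pooled models is given below. The notation is introduced here for illustration and is not the authors' own: Y_i indicates first substantial employment by 2002, A_i is age at first reported substantial employment, R_i equals 1 for respondents from the retrospective survey and 0 for NLSY97 respondents, and x_i is the vector of sociodemographic indicators with reference categories omitted. The additional Add Health covariates are left out, and the sketch assumes that the interactions enter both models in the same way.

```latex
\begin{align}
\operatorname{logit}\,\Pr(Y_i = 1)
  &= \alpha + \beta R_i + \mathbf{x}_i^{\top}\boldsymbol{\gamma}
     + R_i\,\mathbf{x}_i^{\top}\boldsymbol{\delta}, \\
\mathrm{E}[A_i]
  &= \alpha^{*} + \beta^{*} R_i + \mathbf{x}_i^{\top}\boldsymbol{\gamma}^{*}
     + R_i\,\mathbf{x}_i^{\top}\boldsymbol{\delta}^{*}.
\end{align}
```

In this notation, \beta and \beta^{*} correspond to the retrospective-question bias for the reference sociodemographic category, and the elements of \boldsymbol{\delta} and \boldsymbol{\delta}^{*} correspond to group differences in that bias.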
Unlike an evaluation using an alternative source of verification on a given surveyed individual’s retrospective responses, our method of comparison with an independent benchmark survey does not provide for evaluation of reporting variance, except as generated by differences in conditional-mean bias across subgroups. This distinction matters mainly for our estimates of bias in reporting the timing (less for the occurrence) of first substantial employment. We do not attempt to distinguish, for example, a uniformly later reporting of first substantial employment in the evaluated survey (mean-increasing reporting bias) from an average over some “too early” but more “too late” reports of first substantial employment (mean-increasing and variance-increasing reporting bias).
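The limitation can be illustrated with two invented sets of individual reporting errors, in years of reported age, that share the same mean but differ in spread. An aggregate comparison of means with a benchmark survey sees only the common mean shift.

```python
import statistics

# Invented errors (reported age minus true age, in years).
uniform_errors = [1.0, 1.0, 1.0, 1.0]   # every report is one year too late
mixed_errors = [-1.0, 3.0, -1.0, 3.0]   # some too early, more weight too late

mean_uniform = statistics.mean(uniform_errors)
mean_mixed = statistics.mean(mixed_errors)
var_uniform = statistics.pvariance(uniform_errors)
var_mixed = statistics.pvariance(mixed_errors)
```

Both patterns raise the mean reported age by one year, so a difference in means between surveys cannot tell them apart. Only the second pattern also increases the variance of the reports.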
Systematic error in the reporting of first-employment timing that results from forward telescoping would tend to reduce variance in retrospective reporting of first substantial employment by making those reports closer on average to the survey-report interview. We discuss and evaluate (in the “Results” section) the potential for forward telescoping to confound our interpretation of underreported occurrence of first substantial employment by 2002. Finally, nonrandom attrition in the NLSY97 is a potential threat to our “retrospective-question bias” interpretation of differences between the evaluated retrospective survey and the benchmark NLSY97 survey estimates of first-employment occurrence and timing. We address this attrition issue with additional analyses, described in Section 3.
3. Results
The focus of our results is on the occurrence and timing of first substantial employment. The presentation of our results is organized as follows. We begin with comparisons (Table 2) of the sociodemographic characteristics of the same birth cohorts observed in the NLSY97 and the three evaluated surveys: the NSFG, SIPP, and Add Health. These comparisons establish the NLSY97 and the evaluated surveys as reasonably representing the same populations. We follow this with statistics from the benchmark NLSY97 data describing how complexity of employment histories and experience of high-salience events vary by these sociodemographic characteristics (Table 3). This serves to generate predictions about which groups may be most challenged in reporting accurately their first substantial employment in the three evaluated surveys. We next present overall differences between the NLSY97 and the evaluated surveys with respect to occurrence and timing of first substantial employment (Table 4). Multivariate results on each of the outcome measures follow, for occurrence of first substantial employment period by 2002 (Table 5), and for age at first substantial employment (Table 6). These multivariate analyses allow us to test predictions about which groups will have experienced more recall error.
Table 2. Descriptive Statistics, Women Born in the United States, 1980–1984 or 1980–1982
Sources: National Longitudinal Survey of Youth 1997 (NLSY97); National Survey of Family Growth (NSFG), 2006–2010; Survey of Income and Program Participation (SIPP), 2004 and 2008 panels; and National Longitudinal Study of Adolescent to Adult Health (Add Health), Waves 1 to 4.
Note: All estimates are weighted. Chi-square p value indicates the statistical significance of difference in the distribution of retrospective surveys from the comparable distribution in NLSY97.
Included NLSY97 respondents born between 1980 and 1984 were interviewed at every survey round up to and including an interview covering the entire calendar year 2002.
Included NLSY97 respondents born between 1980 and 1982 were interviewed at every survey round up to and including an interview covering the entire calendar year 2003.
The family formation histories of Add Health respondents and NLSY97 respondents in the parallel sample are coded as ever having given birth, or ever having married, at or before the respondent’s age at the end of 2002.
Sample includes respondents with a valid value on each of the variables included in the relevant logistic regression model. NLSY97 percentages reported for 1980 to 1984 are for respondents with valid values on the variables included in the SIPP (2004 and 2008 panels) regression analysis, except for mother’s education, in which case they represent respondents with a valid value on the variables included in the NSFG regression analysis.
Table 3. Complexity of Employment History Experienced up to End of Calendar Year 2002, by Year of Birth, Race/Ethnicity, Mother’s Education, and Family Demographics, among Women Born in the United States between 1980 and 1984
Source: Annual panel reports for respondents interviewed at every wave between 1997 and end of calendar year 2002 in the National Longitudinal Survey of Youth 1997. All percentages are weighted.
Among women with jobs of six months or more by the end of 2002.
Among women with any employment by the end of 2002.
Chi-square p value is for each group versus the reference category.
Reference category for the relevant χ2 test.
Table 4. Reporting of First Substantial Job or Employment Timing and Characteristics, Women Born in the United States between 1980 and 1984
Sources: National Longitudinal Survey of Youth 1997 (NLSY97); National Survey of Family Growth (NSFG), 2006–2010; Survey of Income and Program Participation (SIPP), 2004 and 2008 panels; and National Longitudinal Study of Adolescent to Adult Health (Add Health), Waves 1 to 4. NLSY97 respondents included in the “by 2002” measures were interviewed at every survey round up to and including an interview covering the entire calendar year 2002; for age at first employment measures, NLSY97 respondents were interviewed at every round up through an interview that includes reporting on the date noted in the table.
Note: Chi-square p value indicates the statistical significance of the difference in the distribution of retrospective question responses from the comparable distribution in NLSY97. Estimates are weighted.
Spell of continuous full-time employment of six months or more.
Any job of six months or more, either full-time or part-time.
Worked full-time at least 35 hr per week while not primarily a student by age at end of 2002, not including summer jobs.
Sample includes respondents with valid values on each of the variables included in the relevant regression model; see Tables 5 and 6.
Spell of continuous full-time employment of six months or more, starting any time six or more months before survey interview. The NLSY97 sample is of respondents interviewed in the parallel calendar year.
Includes all white, black, and Hispanic women with information on employment. The NLSY97 sample size presented for the years 2003 to 2006 is the average of the sample sizes observed from the end of 2003 to the end of 2006, which were depleted by attrition. The sample sizes were 2,932 in 2003, 2,912 in 2004, 2,749 in 2005, and 2,741 in 2006.
Jobs of at least 10 hr per week and lasting nine weeks or more, between 2001 and February 2009.
Jobs held while not primarily a student and not including summer jobs, by February 2009.
*p < .05, **p < .01, ***p < .001, difference from NLSY97.
Table 5. Logistic Regression for Reporting First Substantial Employment by the End of 2002, Women Born in the United States between 1980 and 1984
Sources: National Survey of Family Growth (NSFG), 2006–2010; Survey of Income and Program Participation (SIPP), 2004 and 2008 panels; National Longitudinal Study of Adolescent to Adult Health (Add Health), Waves 1 to 4; National Longitudinal Survey of Youth 1997 (NLSY97) respondents interviewed at every wave through the end of calendar year 2002 in the NSFG and SIPP comparisons and through the end of 2003 in the Add Health comparison
Note: The family formation histories of Add Health respondents and NLSY97 respondents in the parallel sample are coded as ever having given birth or ever having married at or before the respondent’s age at the end of 2002. In the Add Health/NLSY97 comparison, jobs are limited, per the Add Health questionnaire, to those undertaken while not primarily a student and do not include summer jobs. SIPP 2008 panel respondents are statistically significantly less likely to report a first job than SIPP 2004 panel respondents (p = .009). “Retrospective survey” refers to either the SIPP, the NSFG, or Add Health. Regressions are weighted.
p < .10. *p < .05. **p < .01. ***p < .001.
Table 6. Linear Regression (Ordinary Least Squares) for Age at First Substantial Employment among Women Born in the United States, 1980 to 1984, with First Substantial Employment by 2008 to 2010
Sources: Survey of Income and Program Participation (SIPP), 2008 panel; National Survey of Family Growth (NSFG), 2006–2010, 2008 to 2010 interviews; National Longitudinal Study of Adolescent to Adult Health (Add Health), Waves 1 to 4; National Longitudinal Survey of Youth 1997 respondents interviewed at every wave through the end of calendar year 2008 in the SIPP comparison, through June 2010 in the NSFG comparison, and through February 2009 in the Add Health comparison
Note: In Add Health, age at first job is asked directly of the respondent; for this comparison in the NLSY97, we calculated age at first job as the respondent’s age in the starting month of her first reported job. In the SIPP/NLSY97 comparison, we calculated age at first job as year of first reported job minus year of birth. In the NSFG/NLSY97 comparison, we calculated the respondent’s age in the first month of her full-time employment spell of six months or more using her month and year of birth. The family formation histories of Add Health respondents and NLSY97 respondents in the parallel sample are coded as ever having given birth, or ever having married at or before the respondent’s age at the end of 2002. “Retrospective survey” refers to the SIPP, NSFG, or Add Health. Regressions are weighted.
p < .10. *p < .05. **p < .01. ***p < .001.
Table 2 shows the composition of the analytic samples. The NSFG and SIPP 2004 and 2008 panels do not differ significantly from the NLSY97 on distributions of women by birth year. The Add Health distribution is skewed toward older cohorts, however, because of its school-based sampling design. The NSFG and Add Health do not differ significantly from the NLSY97 with respect to race/ethnicity. Respondents in the 2004 and 2008 panels of the SIPP, however, include somewhat higher proportions of Hispanic women and lower proportions of black women. Overall, women in the NSFG are somewhat more likely to have college-educated mothers than women in the NLSY97, whereas the opposite is true for women in Add Health. Women in the SIPP 2004 and 2008 panels are more likely to have ever married by the end of calendar year 2002 than are women in the NLSY97, whereas neither the NSFG nor Add Health differs statistically significantly from the NLSY97 in this respect. None of the three evaluated surveys differs statistically significantly from the NLSY97 with respect to the numbers of women with a birth by 2002. Overall, we conclude that the evaluated surveys are sufficiently similar to the NLSY97 in their sociodemographic distributions to justify the assumption that they represent the same populations.
We next use the NLSY97’s panel-reporting detail to compare the complexity and salience of employment histories by these same sociodemographic dimensions (see Table 3). The results are consistent with a scenario in which more-advantaged women delay first substantial employment and consequently have early employment experiences that are both lower in salience and higher in complexity. On salience, white women and women whose mothers have bachelor’s degrees had the lowest percentages (32 percent and 24 percent, respectively) of first jobs of six months or longer that were full-time. Among black and Hispanic women, 39 percent and 38 percent of their first jobs of six months or longer were full-time jobs, and among those whose mothers did not graduate from high school, 41 percent of their first jobs of six months or longer were full-time jobs. The more education a woman’s mother had, the lower was the likelihood that the woman’s first job of six months or longer was a full-time job.
Concerning complexity of early employment experiences, white women and women whose mothers have bachelor’s degrees had the highest percentage of years with two or more jobs of any kind (45 percent in both cases) and the highest percentages of employed months with two or more part-time jobs (10 percent and 12 percent, respectively). Black and Hispanic women had, respectively, only 32 percent and 33 percent of years with two or more jobs of any kind and only 5 percent and 6 percent of their employed months with two or more part-time jobs. The less education a woman’s mother had, the lower was the percentage of the woman’s years with two or more jobs and the lower was the percentage of her employed months with two or more part-time jobs. We therefore expect that black and Hispanic women, and women whose mothers had relatively lower educational attainment, will report their first employment more accurately.
Women who engaged in early family formation similarly had early employment experiences higher in salience and lower in complexity. On salience, women who had already given birth by 2002 or who had already married by 2002 were more likely to have had their first jobs of six months or longer be full-time jobs (47 percent and 50 percent, respectively, compared with 31 percent and 32 percent for women without early birth or early marriage). On complexity, women who had ever given birth by the end of 2002 had only 36 percent of their years with two or more jobs, and 4 percent of their employed months with two or more part-time jobs, compared with 43 percent of years and 10 percent of months for women without early birth. Women who had ever married by the end of 2002 had only 5 percent of their employed months with two or more part-time jobs, though they were no less likely than never-married women to have two or more jobs per year. We therefore expect that women who had early family formation experiences will report their first employment more accurately.
In Table 4, we compare the evaluated surveys to the NLSY97 on the two main outcome variables, which are occurrence and timing of first substantial employment. Consistent with longer recall duration inducing underreporting, women in all three retrospective-reporting surveys were less likely to have reported having a first substantial job or employment by the end of calendar year 2002 than women in the NLSY97 (see the top rows of panel A). The differences, however, are in all cases quite small. Women in the SIPP 2004 and 2008 panels were, respectively, 2.9 and 4.9 percentage points less likely to have reported having had a first job of six months or more by 2002 than women in the NLSY97. The shortfall relative to the NLSY97 is thus 2.0 percentage points larger in the SIPP 2008 panel than in the SIPP 2004 panel, which is in the direction of longer recall inducing greater underreporting, but this difference was not statistically significant (p = .11). Women in the NSFG were 3.5 percentage points less likely to have reported a full-time employment spell of six months or more by 2002 than women in the NLSY97. Women in Add Health were 2.9 percentage points less likely to have reported a first full-time, nonsummer job undertaken while not primarily a student than women in the NLSY97, but this difference is significant only at the .10 level.
Consistent with longer recall duration inducing the forgetting of earlier jobs, retrospective reports from the SIPP, NSFG, and Add Health exhibit higher proportions of respondents reporting older ages at their first substantial employment than in the NLSY97 (see panel B). Between 8 and 9 percentage points fewer SIPP and NSFG women reported retrospectively that their first substantial employment occurred when they were aged 18 to 21, and 5 to 6 percentage points more reported that it began when they were aged 22 to 24, compared with the NLSY97. In Add Health, 9 percentage points fewer women reported first substantial jobs at age 17 or younger than in the NLSY97 (12.7 percent vs. 21.4 percent).
An alternative to the interpretation that the observed discrepancies in occurrence and timing of first substantial employment between the evaluated surveys and the NLSY97 indicate forgetting of jobs in the evaluated surveys is that they instead indicate respondents’ “forward telescoping” (Gaskell et al. 2000) these jobs. That is, respondents may have inaccurately reported remembered first substantial employment as starting at a later age or year than when it actually began. We provide two sets of analyses that present evidence on differentiating between recall error due to forgetting versus recall error due to forward telescoping. First, we use the NSFG’s month-of-interview variable to redefine the occurrence of a first spell of full-time employment of six or more months to that of a spell that began any time up to six months before the respondent was interviewed. This rules out forward telescoping as a possible explanation, as there is no longer any period to which the recalled job could be forward telescoped. We conducted this test for NSFG reports of first substantial employment by six months before an interview, for interviews occurring between 2007 and 2009 to cover three full interview years. Results are presented in the lower set of rows in panel A of Table 4. As in our focal comparison (reporting on any full-time employment of six months or more up to the end of calendar year 2002), respondents in the NSFG were again less likely than respondents in the NLSY97 to report having experienced a full-time employment spell of six or more months’ duration (83.1 percent in the NSFG vs. 84.9 percent in the NLSY97) by six months prior to the time of interview. This difference is of a smaller magnitude than for the difference in reporting on full-time employment of six or more months’ duration by 2002 (46.1 percent in the NSFG vs. 49.6 percent in the NLSY97), and it is not statistically significant. 
The results of this comparison leave room, therefore, for an interpretation that the difference between the NSFG and NLSY97 on the percentage with any full-time employment of six months or more by 2002 is due to either forgetting or forward telescoping.
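The two NSFG versus NLSY97 gaps discussed above can be reproduced from the percentages quoted in the text. The following minimal Python sketch performs only that arithmetic; the function and variable names are ours, and the sketch says nothing about statistical significance, which depends on the weighted samples underlying Table 4.

```python
def gap_in_points(benchmark_pct, retrospective_pct):
    """Percentage-point shortfall of the retrospective survey relative to the benchmark."""
    return round(benchmark_pct - retrospective_pct, 1)

# Percentages quoted in the text (NLSY97 benchmark first, NSFG second).
gap_by_2002 = gap_in_points(49.6, 46.1)
gap_six_months_before_interview = gap_in_points(84.9, 83.1)

print(gap_by_2002)
print(gap_six_months_before_interview)
```

The shortfall is smaller when the reporting window extends to six months before the interview, which is the pattern the text describes as leaving room for either forgetting or forward telescoping.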
We also test for forgetting versus forward telescoping in retrospective reports more directly, using the Add Health Wave 4 reports. We compare Add Health reports with the NLSY97 on two dimensions: number of jobs of at least 10 hours per week that lasted nine or more weeks that respondents experienced between 2001 and 2008/2009 (i.e., up to Wave 4) and length of first full-time job while not primarily a student and not including summer jobs (see panel C of Table 4). Because these measures cover up until the time of the Wave 4 interview, any forward telescoping that occurred would not affect their validity (although backward telescoping might still be a concern). The results show that substantially fewer jobs were reported retrospectively in the Add Health’s Wave 4 than in the NLSY97’s annual panels. Many more Add Health respondents reported having had two or fewer jobs between 2001 and 2009 (35.3 percent vs. 19.5 percent), whereas far fewer Add Health respondents reported six to nine jobs (10.9 percent vs. 28.9 percent). There is also evidence of forgetting shorter duration jobs in Add Health. Consistent with longer jobs being more salient, and therefore more likely to be recalled as being first jobs, Add Health respondents reported substantially lower percentages of first full-time jobs that lasted two months or less, or between three and five months, and higher percentages of first full-time jobs that lasted six months or more, relative to NLSY97 respondents. Two thirds of first full-time jobs reported retrospectively in the Add Health were of at least six months in duration, whereas just under half of first full-time jobs in the NLSY97 were of a duration of at least six months. We interpret this as evidence that Add Health respondents tended to forget earlier, less salient (i.e., shorter duration) first full-time jobs. Thus forgetting does seem to play a substantial role that cannot be accounted for by forward telescoping.
A further potential alternative explanation for the observed discrepancies in occurrence and timing of first substantial employment between the evaluated surveys and the NLSY97 is attrition in the NLSY97. Previous research has documented differences in the labor market behavior of those who do and do not attrite from longitudinal surveys (Zabel 1998). Selective attrition in the NLSY97 may be an issue of particular concern if nonattriting NLSY97 individuals have more stable work histories, and therefore likely higher prevalence of substantial employment, than attriting individuals. To test for possible biasing effects of attrition in the NLSY97, we compared annual cumulative NLSY97 panel reports in 2003 to 2006 with one-time retrospective reports from the SIPP 2008 panel. Specifically, we compared differences from the NLSY97 when, in place of 2002, we used 2003, 2004, 2005, and 2006 as years by which any job of six months or more had occurred. The number of NLSY97 sample members still present in this 1980–1984 birth cohort gradually falls from 3,145 in 2002 to 2,714 in 2006. If the NLSY97 were increasingly unrepresentative because of attrition, we would expect there to be a changing pattern of difference from the SIPP 2008 estimate for later years. In particular, if NLSY97 attritors tended to be those with less stable employment histories, we would expect the gap between the NLSY97 and the SIPP estimates of any substantial employment to widen between the 2002 and 2006 years by which a job of 6 months or more had been experienced. If attrition were not a biasing factor, we would expect to see no systematic pattern of change in the deficit of the SIPP compared with the NLSY97 over those years. Results are again presented in panel A of Table 4. They do not indicate that attrition is a biasing factor. 
We find a consistent pattern of lower percentages reporting having had a first job of six or more months in the SIPP than in the NLSY97 and no pattern of either an increasing or diminishing gap with increasing cumulative attrition in the NLSY97. In results not presented here, we also estimated a logistic regression of attrition from the NLSY97 before 2003 on the same five sociodemographic regressor variables used in the main multivariate models of first substantial employment occurrence and first substantial employment timing that we present below. We found offsetting directions of likelihood of attrition on variables associated with economic disadvantage. Black women and women who had an early birth (i.e., by 2002) were less likely to have attrited, whereas women whose mothers had no more than a high-school education were more likely to have attrited before 2003.
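The logic of the attrition check described above, comparing the SIPP shortfall across successive cutoff years, can be sketched in Python. This is an illustration under stated assumptions, not the authors' procedure: the function name is ours, the percentages are hypothetical placeholders rather than the Table 4 estimates, and a simple least-squares slope stands in for whatever inspection of the pattern was actually performed.

```python
def gap_trend(years, benchmark_pct, retrospective_pct):
    """Return per-year gaps (benchmark minus retrospective) and their least-squares slope."""
    gaps = [b - r for b, r in zip(benchmark_pct, retrospective_pct)]
    n = len(years)
    mean_year = sum(years) / n
    mean_gap = sum(gaps) / n
    sxy = sum((y - mean_year) * (g - mean_gap) for y, g in zip(years, gaps))
    sxx = sum((y - mean_year) ** 2 for y in years)
    return gaps, sxy / sxx

# Hypothetical percentages, for illustration only (NOT the Table 4 estimates).
years = [2002, 2003, 2004, 2005, 2006]
nlsy97 = [60.0, 70.0, 78.0, 84.0, 88.0]
sipp = [56.0, 66.0, 74.0, 80.0, 84.0]

gaps, slope = gap_trend(years, nlsy97, sipp)
print(gaps)   # per-year shortfall of the retrospective survey
print(slope)  # a slope near zero means the gap neither widens nor narrows
```

Under the attrition-bias scenario described in the text, the gaps would widen as the cutoff year advances, giving a clearly positive slope; a flat series is what one would expect if attrition were not a biasing factor.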
We conduct our main tests of retrospective-recall accuracy with regressions pooling data from each of the three retrospective-report surveys (SIPP, NSFG, and Add Health) with the panel-report survey (NLSY97). Results of bivariate statistical tests of the regression outcome variable across predictor variables, comparing each retrospective survey with the NLSY97, are reported in the online Appendix. Table 5 shows the results of logistic regressions estimating the probability of reporting first substantial employment before the end of calendar year 2002 in retrospective reports versus the NLSY97. In these regressions, the main-effect coefficient for each of the sociodemographic variables represents the reporting of that particular sociodemographic group in the reference survey, the NLSY97. We interpret these panel reports as best representing the actual behavior of these groups. The coefficients of interest for testing recall-accuracy are those that indicate differences in the outcome variable between retrospective reports (in the NSFG, SIPP, or Add Health) and the NLSY97. A statistically significant negative coefficient on the retrospective-survey main effect indicates underreporting of first substantial employment by 2002 among respondents who are members of the reference-category groups of the sociodemographic variables in that particular survey. In the SIPP versus NLSY97 comparison, consistent with an adverse recall-duration effect on recall accuracy, women in the 2004 and 2008 SIPP were significantly less likely to report a first six-month job by 2002 than women in the NLSY97, and women in the 2008 SIPP panel were significantly less likely to report such a job than women in the 2004 SIPP panel (p = .009, results not shown). Similarly, women in the NSFG were significantly less likely to report a first six-month job by 2002 than women in the NLSY97. 
The coefficient for being an Add Health respondent is also negative, but it is not statistically significant (p = .15).
Given these negative coefficients for retrospective-survey main effects, we interpret a positive coefficient for the interaction of a sociodemographic-group category with “retrospective-survey” as indicating less underreporting among that group. We interpret a negative interaction coefficient as indicating more underreporting among that group than for the reference-category sociodemographic group. Consistent with adverse recall-duration effects, women from later cohorts were statistically significantly more accurate in reporting their first substantial employment than were women from earlier cohorts across all three evaluated surveys.
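The way a main effect and an interaction combine in these pooled models can be illustrated with a short Python sketch. The coefficient values are hypothetical placeholders, not estimates from Table 5, and the function name is ours; the sketch assumes a single retrospective-survey indicator interacted with a single group indicator.

```python
import math

def survey_effect_for_group(b_survey, b_interaction, in_group):
    """Log-odds difference (retrospective survey vs. NLSY97) for one respondent type.

    For the reference category the difference is the survey main effect alone;
    for the other group the interaction coefficient is added to it.
    """
    return b_survey + (b_interaction if in_group else 0.0)

# Hypothetical coefficients, for illustration only (NOT taken from Table 5).
b_survey = -0.30       # retrospective-survey main effect (reference group)
b_interaction = 0.25   # group x retrospective-survey interaction

ref_effect = survey_effect_for_group(b_survey, b_interaction, in_group=False)
grp_effect = survey_effect_for_group(b_survey, b_interaction, in_group=True)

# Odds ratios of reporting first substantial employment, retrospective vs. panel.
print(math.exp(ref_effect))
print(math.exp(grp_effect))
```

With a negative main effect and a positive interaction, the group's odds ratio lies closer to 1 than the reference group's, which is the pattern read in the text as less underreporting for that group.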
Consistent with favorable effects on reporting of higher employment salience and lower complexity, black women in both the SIPP and NSFG, but not in Add Health, were more accurate in reporting first substantial employment by 2002 relative to white women. Hispanic women in the NSFG and Add Health were also more accurate in reporting their first substantial employment. Consistent with anticipated salience, complexity, and anchoring effects, women who had ever given birth by 2002 in the SIPP and Add Health were more accurate in reporting their first substantial employment. Women in Add Health who had never married, however, reported less accurately.
Table 6 shows the results of our linear regression model of age at the start of first reported substantial employment by 2008–2010, among women who were aged 24 to 30 and had had any substantial employment by that time. The main-effect coefficient for each of the sociodemographic variables represents the change in age in the benchmark survey, the NLSY97. For each sociodemographic group, a positive coefficient represents an older age at first substantial employment, and a negative coefficient represents a younger age at first substantial employment.
Again, the coefficients of interest for testing recall accuracy are those for the retrospective survey (SIPP, NSFG, or Add Health). Consistently across the three retrospective surveys, respondents report first substantial employment as occurring on average half a year older than NLSY97 respondents with otherwise identical observed characteristics. We interpret this as consistent with an adverse recall-duration effect. Given that these retrospective-survey main-effect coefficients are positive, a statistically significant negative coefficient for the interaction of a covariate with “retrospective survey” indicates that retrospective reports from this group are more accurate than those from retrospective-survey respondents who are members of the reference category on the covariate. Consistent with Hispanic women’s employment histories having higher salience and lower complexity, their reported age at first substantial employment in the NSFG and Add Health was younger (i.e., more accurate) than that of white respondents with otherwise identical observable characteristics. Similarly, women in Add Health and the NSFG whose mothers had relatively lower education were more accurate reporters of age at first substantial employment than were women whose mothers had a bachelor’s degree. The age at which SIPP and NSFG respondents who had a birth by 2002 reported that their first substantial employment began was on average 0.7 to 0.9 years younger than that reported by respondents without a birth but with otherwise identical observed characteristics. Given our earlier findings on differences in employment histories between women with and without an early first birth (higher salience and lower complexity on average for women with an early first birth), these estimates of more accurate reporting among women with a first birth are consistent with salience and complexity effects. They may also be due to the anchoring effect on employment reporting of having had a birth event. 
Consistent with shorter recall duration, the age at which Add Health respondents born in 1982 reported that their first substantial employment began was on average half a year younger than that reported by women born in 1980.
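One way to write the pooled specification described above is given below. The notation is ours rather than the authors': R_i is an assumed indicator for respondent i belonging to the retrospective survey rather than the NLSY97, and X_{ik} are the sociodemographic covariates.

```latex
\mathrm{Age}_i = \beta_0 + \beta_R R_i + \sum_k \beta_k X_{ik}
               + \sum_k \gamma_k \,(R_i \times X_{ik}) + \varepsilon_i
```

In this form, \beta_R is the retrospective-versus-panel difference in reported age for the reference categories, and \beta_R + \gamma_k is the corresponding difference for group k. A negative \gamma_k alongside a positive \beta_R therefore brings that group's retrospective reports closer to the benchmark, which is how the text reads such interactions as more accurate reporting.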
In the Add Health versus NLSY97 comparison, Table 6 shows two individual-level indicators of employment history salience and complexity: number of jobs since 2001 and duration of first full-time job (for distributions of these variables, see again Table 4). Although we expected that reporting would be worse among those women with a larger number of jobs, there was no statistically significant relationship with age at first full-time job found for women who reported having more versus fewer jobs between 2001 and 2008/2009. Consistent with a salience effect, however, among Add Health women who said their first job lasted three to five months, the age at which they reported those purported first jobs began was on average two thirds of a year younger than for those who reported that their first jobs lasted two months or less, relative to NLSY97 respondents. Among Add Health women who said their first jobs lasted six months or more, the age at which they reported that those purported first jobs began was instead almost one and a half years older than those who reported first jobs of two months or less. Theory on the relationship between topic salience and reporting accuracy indicates that more salient events are often falsely reported as having happened more recently than they actually did because they “loom large” in the respondent’s memory (Loftus and Marburger 1983). By extension, in the present study, having had more salient jobs may not necessarily contribute to more accurate overall reporting on first employment. Respondents might rather give the longest, most salient jobs greater emphasis in their reports by misreporting these more salient jobs as having been first jobs. 
Our results suggest that among Add Health respondents who reported that jobs of less than six months were their first full-time jobs, those who reported first jobs of three to five months’ duration remembered the timing of their jobs better (on average, reporting them as beginning at younger ages) than those who reported first jobs of less than two months. Hence, respondents with first jobs of three to five months retrospectively reported their age at the start of these longer short first jobs more accurately on average because of the jobs’ greater salience. On the other hand, when Add Health respondents reported even more salient jobs of six months or more as being their first jobs, they were more likely on average to report those jobs as starting at older ages, suggesting that some of the Add Health respondents had in fact forgotten earlier, shorter jobs.
4. Discussion
In this study, we have examined the accuracy of retrospective survey reporting on first substantial employment among young women born in the United States between 1980 and 1984, in three major nationally representative surveys: the SIPP, the NSFG, and Add Health. Previous studies have evaluated retrospective reporting of first demographic events (e.g., Lauritsen and Swicegood 1997; Peters 1988; Wu et al. 2001) and first health events (e.g., Knauper et al. 1999; Yoshihama et al. 2002). We know of no previous study that looked specifically at the accuracy of retrospective reporting on first employment. We derived our hypotheses about the accuracy of respondents’ recall of the occurrence and timing of first substantial employment from three main sources: (1) general survey recall theory, (2) studies that evaluate accuracy of recall of employment at a specific time or over some particular period in the past, and (3) evidence of sociodemographic differences in complexity and salience of early employment experiences that we obtained from our benchmark NLSY97 data. This last body of evidence, from the NLSY97, gives us reason to expect that reporting on a first substantial employment’s occurrence and timing may not be an easy recall task. In none of the sociodemographic groups we studied did a majority have a first job of six months or longer that was full-time. For young black and Hispanic women, and for young women whose mothers had no more than a high school education, about 60 percent of their first jobs of six months or longer were part-time. For young white women and for young women whose mothers had a college education, two thirds to three quarters of their first jobs of six months or longer were part-time. In addition, almost half of the early employed years among white women and women whose mothers had a bachelor’s degree included two or more jobs. 
This was also true for about a third of the first employed years of black and Hispanic women and women whose mothers had less than a high school diploma. These statistics suggest that job sequences of high complexity and low salience were the early employment experiences of many respondents.
Our results show discrepancies between retrospective and panel reports that are consistent with expectations from general theory of survey recall accuracy related to the length of the recall period, the salience and complexity of women’s employment histories, and the presence or absence of time-anchoring biographical details in respondents’ lives (Sudman et al. 1996; Tourangeau, Rips, and Rasinski 2000; Schaeffer and Presser 2003). This theory, and previous research on unemployment and employment recall (Jürges 2007; Manzoni 2012; Manzoni et al. 2010; Mathiowetz and Duncan 1988; Pierret 2001), indicates that a longer recall period would lead to poorer reporting of both employment and unemployment. Consistent with this expectation, we found that women who must look back over longer periods of time to remember their first substantial employment reported less accurately on the occurrence and timing of their first substantial employment. This holds for women reporting retrospectively overall, for women in the 2008 versus the 2004 SIPP panel, and for women from earlier versus later birth cohorts.
Survey-recall theory additionally asserts that more salient events will be better recalled but that as the complexity of the full event history increases, accuracy of recall with respect to any single event will diminish. Supportive evidence has previously been offered by Freedman et al. (1988) with respect to part-time employment and by Belli et al. (2013) with respect to the reporting of hours and weeks of employment and of transitions into and out of employment. We compared sociodemographic groups that we knew, from our analyses of the benchmark NLSY97 data, differed on salience and complexity dimensions of their early employment sequences. Consistent with theory and with previous evidence on employment reported for a given period, we found that women from sociodemographic groups whose employment histories are characterized by higher salience and lower complexity (including black and Hispanic women, women whose mothers had lower education, and women with early births) reported more accurately either the occurrence or timing of their first substantial employment. These findings on sociodemographic differences in reporting accuracy are substantively opposite to, but theoretically consistent with, the work of Mathiowetz and Duncan (1988). They found that more-educated individuals reported more accurately on their unemployment spells, but that this was because more-educated individuals’ spell sequences were low in complexity. We found that more-educated individuals reported less accurately on their first substantial employment spells but that this follows from their employment spell sequences being higher in complexity (more jobs) and lower in salience (more likely to have been part-time). In both Mathiowetz and Duncan’s study and our study, difficulty of the recall task appears to have been the determining factor behind sociodemographic differences in recall accuracy.
In our comparison of Add Health with the NLSY97, we were able to test more directly individuals’ performance in reporting job sequences with higher complexity and lower salience. We found that retrospective reports of Add Health respondents included too few jobs and were skewed toward incorrectly reporting longer (more salient) jobs as first full-time jobs. As a result, Add Health respondents were more likely on average to report their first full-time jobs as starting at older ages.
Survey-recall theory further asserts that important biographical life events provide “anchoring” assistance with respect to the timing of events being recalled. We found that women who had early births more accurately reported the occurrence and timing of their first substantial employment experiences. This is consistent with other studies that have applied “anchoring” theory to employment, notably Manzoni’s (2012) finding that women with children report on their employment transitions more accurately than women without children, and with Loftus and Marburger’s (1983) findings that anchoring events improve recall in other domains. As noted above, however, women with early births also had lower complexity and higher salience employment histories. Anchoring may therefore not entirely explain their more accurate recall than women without early births.
In summary, we found that when retrospective reporting of first substantial employment differed statistically significantly from our benchmark panel-survey data, the results were largely consistent with theory from the survey-recall literature and with previous evidence on reporting about employment at or across a given period. This previous evidence, however, has come largely from surveys in which a respondent’s contemporaneous panel-survey responses about a particular event were available to verify the same respondent’s later retrospective report of that event. For none of the three surveys we evaluated was this type of corroborative question available for checking respondents’ reporting about first substantial employment. It is valuable, therefore, to have shown in the present study that a benchmark survey can alternatively be used to evaluate the overall accuracy of retrospective reporting and to assess which sociodemographic groups are likely to report that employment event more and less accurately. We also caution, however, that whereas a benchmark survey comparison is well suited to assessing the mean direction and magnitude of reporting error, its usefulness is more limited for assessing variability in reporting error.
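One way to read this caution is sketched below in Python. The sketch is an illustration under stated assumptions, not the procedure used in the study: the ages are hypothetical values chosen for arithmetic convenience, and the function names are invented for this example. It contrasts two scenarios in which the set of reported ages is identical, so a comparison against a separate benchmark sample yields the same mean gap, even though the spread of person-level errors differs sharply.

```python
from statistics import mean, pvariance

# Hypothetical true ages at first substantial employment (illustrative only,
# not drawn from NLSY97, SIPP, Add Health, or any other survey).
true_ages = [18.0, 19.0, 20.0, 21.0]

# Scenario A: every respondent reports exactly half a year too late.
reports_a = [18.5, 19.5, 20.5, 21.5]
# Scenario B: the same set of reported ages, attached to different respondents.
reports_b = [21.5, 20.5, 19.5, 18.5]


def benchmark_gap(benchmark, retrospective):
    """Difference in means, which two separate samples can reveal."""
    return mean(retrospective) - mean(benchmark)


def individual_errors(truth, reports):
    """Per-person errors, which require both values for the same respondent."""
    return [r - t for t, r in zip(truth, reports)]


# The benchmark-style comparison cannot tell the two scenarios apart.
gap_a = benchmark_gap(true_ages, reports_a)
gap_b = benchmark_gap(true_ages, reports_b)

# Linked records, if they existed, would show very different error spreads.
error_var_a = pvariance(individual_errors(true_ages, reports_a))
error_var_b = pvariance(individual_errors(true_ages, reports_b))

print(f"mean gap: A={gap_a}, B={gap_b}")
print(f"error variance: A={error_var_a}, B={error_var_b}")
```

In this toy setup the mean gap is half a year in both scenarios, while the variance of individual errors is zero in the first and large in the second. Only linked reports from the same respondents would distinguish the two.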
The differences we found between the estimates of first substantial employment based on the annual panel reports of the benchmark survey and the estimates based on the retrospective questions of the three evaluated surveys were relatively small in magnitude. For example, timing of first substantial employment was on average about half a year later in the retrospective reports than in the benchmark survey’s annual panel reports. Furthermore, given the policy importance of early employment for the analysis of welfare and work programs (e.g., Pavetti and Acs 2001), it is reassuring that differences between estimates from the retrospective survey questions and estimates from the benchmark panel survey were especially small for women from disadvantaged groups, including those who have children at young ages. We do not claim that these more-disadvantaged women are inherently better (or worse) reporters of when they first achieved a stable period of employment. Instead, because their early employment experiences are marked by fewer total jobs and fewer part-time jobs than those of more advantaged women, their recall task is simpler. It is therefore unsurprising that estimates of the timing and occurrence of their first substantial employment tend to be closer on average to those from the benchmark data source.
Acknowledgements
This paper was presented at the 2015 annual meeting of the American Sociological Association. We are grateful for helpful comments and suggestions from Meredith Kleykamp, Liana Sayer, Wei-hsin Yu, and three anonymous reviewers.
Funding
The authors received financial support from the Eunice Kennedy Shriver National Institute of Child Health and Human Development (grants R03HD084974 and R24-HD41041).