Abstract
This paper discusses potential methodological issues in the design and implementation of calendar recall aids such as the Life History Calendar for cross-cultural surveys. More specifically, it aims to provide insights into how the use of landmark events in calendar interviewing may be influenced by cross-cultural variability. As an example, we compare the landmark events reported by Dutch and American respondents in two studies in which calendar recall aids were used. The study discusses differences that were found between the two countries in the numbers and types of reported landmark events, as well as in the temporal distribution of those events. The outcomes suggest that it is important for researchers to examine how landmark events in calendar instruments translate in diverse cultural contexts.
Introduction - Event History Calendars in Social Research
Researchers in many scientific disciplines use information on past behavior and the life events of individuals in their empirical research. Often, this information on past events is collected retrospectively by means of (semi-)standardized survey questionnaires. The problem with retrospective self-reports, however, is that their quality can be compromised by recall errors such as omissions, dating error, and biased retrieval (Dex, 1995; Moss and Goldstein, 1979; Schwarz and Sudman, 1994; Grémy, 2007). To reduce the reporting bias caused by recall error, social researchers have developed a variety of aided recall techniques, such as checklists, cue lists, and decomposition strategies (for discussions of the positive and negative effects of those techniques see Van der Zouwen et al., 1993, and Belli et al., 2000). In recent years, the use of calendar-based recall aids, such as the Life History Calendar, has become increasingly popular. This type of recall aid is based on the idea that retrospective questions about certain types of events or behaviors might be easier to answer if the respondent can relate the timing of those events to other events that occurred in close temporal proximity. This technique is assumed to be especially effective if the parallel event is personally significant to the respondent and has an “annual significance” (Auriat, 1993), such as the birth of a child.
Calendar instruments combine several types of memory cues, which stimulate the retrieval of autobiographical events and help the respondent date those events accurately (Belli, 1998; Belli and Callegaro, 2009; Glasner and Van der Vaart, 2009). First of all, they include public and/or personal landmark events from the reference period (as discussed above). Secondly, there is a visual display of the time dimension. The total reference period is divided into smaller time units, such as years, months or days. Finally, the data provided by the respondent about one or more thematic domains is represented on separate parallel timelines, thereby providing additional memory cues.
Calendar techniques have been integrated into large-scale, longitudinal social surveys such as the German Life History Study (GLHS), the Survey of Health, Ageing and Retirement in Europe (SHARE), and the Panel Study of Income Dynamics (PSID). Until now, calendar instruments have seldom been used in multinational surveys, and the authors are not aware of any methodological studies describing such efforts. However, as they have been known to increase the quality of retrospective self-reports on issues such as housing, job histories, and purchases, applying them in multinational demographic and health surveys could be considered an attractive option.
In the following, we will discuss the use of temporal landmarks in calendar interviews, and describe the way in which the multinational character of a study might affect their effectiveness in retrospective surveys. Intercultural differences in reports of landmark events will be illustrated by examples from two calendar studies, which were conducted in the United States and in the Netherlands.
Landmark Events
In most calendar studies, respondents will be asked to provide a number of personal events from the reference period which will help them date other events as they progress through the interview. By definition, those landmark events need to be time-tagged so that they can function as reliable dating cues for the respondent. Usually, those time-tagged landmarks are very salient autobiographical events, which “stand out” in memory and have been retrieved and rehearsed relatively often. Shum (1998) defines landmark memories as playing an active as well as a passive part in autobiographical memory. Not only do landmark events serve as indexes that (actively) help people organize and access other autobiographical memories, they are also stored in memory in a more detailed way than other events.
Instructions as to which type of events the respondent might use as landmarks differ per study. In general, the researcher will be interested in those landmarks from a functional point of view, that is: if the landmark helps the respondent date other events more accurately, it does not matter what type of event the respondent uses. Nonetheless, most surveys in which calendars are used will give a number of examples, often including vacations, family events, major health events and the like. Our findings from earlier studies indicate that reports of different types of landmark events might be slightly biased towards the event types that are mentioned as examples in those instructions (Van der Vaart and Glasner, 2011).
Besides the personal events mentioned by the respondents, many instruments also offer public event cues (Hoppin et al., 1998), such as natural disasters (Loftus and Marburger, 1983), and other memorable historical events. The Neighborhood History Calendar used by Axinn and his colleagues (1997) in Nepal contained memorable public events from national (for example, the deposition of the king), regional (natural disasters) and local levels (accidents or neighborhood level changes, such as household electrification). Instead of written cues, other researchers have used icons and toy figures (Engel et al., 2001), or adhesive pictures (Hoppin et al., 1998) to make their calendar instrument more attractive and more easily understood by populations with limited literacy.
Furthermore, institutional, economic, and educational calendars might provide respondents with temporal reference points, such as school terms and public holidays. Institutional calendars, of course, differ across nations. It is possible that those calendars shape perceptions of time as well as the de facto dispersion of events within the calendar year. If survey respondents use information from institutional calendars as dating cues, this could lead to differences in dating bias between countries.
Comparison of Landmark Events from Two Studies
In an effort to illustrate some of the methodological issues that we raised in the previous section, we present a comparison between the use of landmark events by Dutch and American respondents in survey interviews with calendar instruments. It explores differences in the types of landmarks reported, and in the distribution of landmarks over time.
Design of the Studies
The landmark events that we used for this study were derived from two separate studies, one in the United States (n=231) and one in the Netherlands (n=67). In both studies, calendar instruments were used to collect retrospective data from respondents. Personal landmarks were either recorded by the interviewer (US study) or by the respondents themselves (Dutch study). In the first study, a methodological comparison of Event History Calendars and question-list surveys within the Panel Study of Income Dynamics (PSID) conducted in spring 1998, interviewers administered a two-year Event History Calendar during a telephone interview (for details, see Belli et al., 2001). In the second study, a consumer survey done in the Netherlands in spring 2004, computer-assisted telephone interviews (CATI) were conducted. During the interview, respondents could use a simple seven-year calendar instrument, which had been sent to them by mail and which they had filled out in private before the interview (for a more detailed description of this study, see Van der Vaart and Glasner, 2007). Respondents were asked to send the completed calendar back to the researchers after the interview.
To enhance the comparability of the two data sets, we selected only events that were reported as having taken place during the two-year reference period of study 1 (1996 or 1997) or during the most recent two years of the reference period of study 2 (2002 and 2003). Both studies were conducted in April.
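The selection rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (`study`, `date`, `text`), the function name `select_comparable`, and the three example events are hypothetical and are not taken from either data set.

```python
from datetime import date

# Calendar years retained per study, as described in the text.
REFERENCE_YEARS = {
    "US": {1996, 1997},  # study 1: full two-year reference period
    "NL": {2002, 2003},  # study 2: most recent two years of the seven-year calendar
}


def select_comparable(events):
    """Keep only events dated inside the two-year window of their own study."""
    return [e for e in events if e["date"].year in REFERENCE_YEARS[e["study"]]]


# Invented example records (hypothetical layout, not real survey data).
events = [
    {"study": "US", "date": date(1997, 11, 3), "text": "surgery"},
    {"study": "NL", "date": date(1999, 7, 15), "text": "vacation"},
    {"study": "NL", "date": date(2003, 1, 20), "text": "moved house"},
]

selected = select_comparable(events)
```

In this toy input, the Dutch event from 1999 falls outside the retained window and is dropped, while the other two events are kept.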
Coding Scheme
Landmark events from both studies were transcribed and the verbatim descriptions were entered in SPSS. Our approach can best be described as emergent coding: we based our coding scheme on the data rather than on theoretical principles (Stemler, 2001). Two of the authors and one graduate student coded each data set independently so that disagreement between coders could be resolved by majority vote. Inter-coder reliability, measured by Krippendorff’s alpha (Krippendorff, 2004), was high (≥ .89) for both data sets and all pairs of coders. We developed a first classification scheme with 17 categories, which were later condensed into six categories, covering 97 percent of our data. Only 22 out of 715 selected events could not be classified as vacations, health events, family events, births or deaths, work and education events, or home and leisure events (including residential moves). For the purpose of this international comparison, “Family and Relationships” and “Births and Deaths” were merged into one category, leaving five main categories.
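As a rough illustration of the majority-vote step and of the coverage figure, consider the following Python sketch. The function name `majority_code`, the category labels as spelled here, and the three coded examples are assumptions made for illustration; only the counts 22 and 715 come from the text. How the study handled events on which all three coders disagreed is not stated, so the sketch simply returns `None` in that case.

```python
from collections import Counter


def majority_code(codes):
    """Return the category chosen by more than half of the coders, else None."""
    label, count = Counter(codes).most_common(1)[0]
    return label if count > len(codes) / 2 else None


# Invented examples: each event description maps to the codes of three coders.
coded = {
    "trip to Spain": ["vacation", "vacation", "home and leisure"],
    "started new job": ["work and education"] * 3,
    "ambiguous entry": ["health", "vacation", "family"],
}

resolved = {event: majority_code(codes) for event, codes in coded.items()}

# Coverage implied by the reported counts: 22 of 715 events left unclassified.
coverage = (715 - 22) / 715
```

With three coders, a two-to-one split is resolved in favor of the majority, and the reported counts imply that about 97 percent of the events were classified.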
Results
Table 1 shows the numbers of landmark events that were found in the two data sets by category. We found large differences between countries in the (relative) numbers of reported events in at least three of the five main categories. US respondents tended to report more family as well as work and education events, whereas Dutch respondents reported more vacations and, to a certain degree, health events.
Table 1. Reported events by category and data set, with inter-coder reliabilities.
In addition to the differences in category frequencies, we found discrepancies between the data sets with regard to the distribution of events over time (Figure 1). Dutch respondents tended to report relatively more events for the first two months of the year, while American respondents reported far more events for November and December. Even though the total number of reported events was higher for the most recent year of the reference period in both data sets, the temporal distribution of events across months was very similar for both years within each data set.

Figure 1. Distribution of landmark events over months in the Panel Study of Income Dynamics (US) and the Dutch Consumer Survey (NL).
In the US sample, respondents reported only 19 percent of vacations (versus 32 percent in the Dutch consumer survey) as having taken place during the first five months of the year. In the same sample, 30 percent of all health events were reported for November or December, versus only 10 percent for January or February. In the Dutch data, this discrepancy is smaller, with 13 percent of all health events reported for January or February versus 23 percent for November or December.
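Percentages of this kind are simple within-category shares of events by reported month. The Python sketch below shows one way to compute them. The function name `share_in_months` and the list of months are invented for illustration and do not reproduce the figures reported above.

```python
def share_in_months(months, target_months):
    """Fraction of events whose reported month (1-12) lies in target_months."""
    if not months:
        raise ValueError("no events to summarize")
    hits = sum(1 for m in months if m in target_months)
    return hits / len(months)


# Invented months of ten health events in one sample (not real survey data).
health_event_months = [1, 3, 6, 11, 11, 12, 9, 12, 2, 7]

early = share_in_months(health_event_months, {1, 2})    # January or February
late = share_in_months(health_event_months, {11, 12})   # November or December
```

Comparing `early` and `late` within each country, as in the paragraph above, indicates whether reported events cluster toward the start or the end of the calendar year.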
One of the reasons for this difference in the temporal distribution of events could be differences in employment benefit policies between the two countries or possibly differences in cultural attitudes toward work and leisure. Dutch employees can (at least in theory) take all their leave days and an unlimited number of sick days from the beginning of the calendar year, whereas those employed in the US often have to accrue leave days during the course of the year.
The dissimilar temporal distributions of landmark events found in the two calendar studies may have had an effect on data quality as we would expect survey respondents to relate the dates of other events to those landmark dates. In earlier studies on free recall of life events, heaping of reported events was found around temporal boundaries, such as the start and end dates of college terms (Pillemer et al., 1988). Anderson (2005) suggests that those so-called “calendar effects” might be due to people using those temporal boundaries as anchoring points when trying to determine the dates of other events. In doing so, they might subsequently underestimate the distance between the anchor (the landmark) and the target event, an effect that has been found in a variety of contexts (see Tversky and Kahneman, 1974). In our example, this could mean that American respondents would be more inclined than Dutch respondents to project transition dates towards the end rather than the beginning of the calendar year. In a hypothetical survey with a one calendar year reference period, this could lead to underreporting/omission of transitions that happened in January or February, as the respondent might (falsely) assume that those events took place close to the temporal landmarks at the end of the previous calendar year.
Discussion
Based on our preliminary results, we suggest that researchers should test how landmark events in calendar instruments translate in diverse cultural contexts. There may be considerable variation across countries in the way in which respondents generate personal memory landmarks and use them as temporal anchoring points. The question remains whether and how these potential differences actually influence the effectiveness of Event History Calendars. It is possible that culture-specific temporal heaping of events could bias date reports for autobiographical events. However, further research is warranted. The comparison presented in our study is merely illustrative, and its practical implications remain unclear. The next logical step in this line of research would be to explore the issues and questions raised above in an experimental design.
Footnotes
Funding
This work was supported by the National Institute on Aging (grant number 5R01AG17977-5) and the Netherlands Organization of Scientific Research (grant number 400-03-331).
