Computer Audio-Recorded Interviewing as a Tool for Survey Research

Abstract

Developers must understand the needs of user populations and the potential benefits of the new software to them, so that the development team can create an effective system. Especially for applications employed outside the computing profession, it is important for the software team to learn context and workflow and to understand the value that their development work will bring to the users. This article discusses design and implementation considerations for computer audio-recorded interviewing (CARI), a method coming into widespread use for survey research in the social sciences. When implemented as part of the data collection process, CARI allows a survey manager to listen to the exact circumstances of how questions were asked and answered during the interview, a much more powerful approach than prior indirect methods of quality control and improvement. Design considerations can be complex when planning an integrated system. Based on a decade of experience and prior implementation of several distinct CARI systems, this article explores a part of the operational world of survey research through the eyes of the system developer. It offers context for those developers who are unfamiliar with survey research or for anyone who is unfamiliar with CARI operations. Discussion focuses on benefits, requirements, user goals, system design challenges, and options.

Keywords

audio recording, system design, interface design, CARI, computer audio-recorded interviewing, survey research, qualitative research

What do the following have in common: making policy decisions for handling child abuse, estimating cocaine use in cities and states, knowing if the cost of living is going up or down, and setting the fees for hunting and fishing licenses? All of them depend on data collected through research surveys, underlying decisions made by federal, state, and local agencies. Nearly every step of gathering, managing, and delivering survey data occurs electronically, shepherded by software and hardware specialists who bring together computing technology and social science research. Though not visible, software and system engineers contribute indirectly to sound policy choices and social measures through their support of survey research.

It can be difficult to ensure that survey responses are factual and accurate. The widespread adoption of computers for surveys around the turn of the 21st century changed many aspects of survey data collection (Couper and Nicholls, 1998). With computerization came new options, and processes for reducing error changed. The word “error,” here, is used both in its common meaning of “mistake” and in its statistical sense, referring to estimates of completeness and accuracy (Groves, 2004).

Historically, survey data quality had been managed through analysis of response data sets and indirect performance management techniques. The adoption of computing technologies for survey data collection introduced the more direct approach of digital audio recording that lets a manager listen to the exact circumstances of how questions were asked and answered during administration of the questionnaire. The technique has come to be known within the social sciences as computer audio-recorded interviewing (CARI; Biemer, Herget, Morton, & Willis, 2000; Hicks et al., 2010), a valuable and powerful method. One encounters it during interactions with service centers, market researchers, or other telephone contacts, when the introductory words announce, “This call may be recorded for quality purposes.”

Recognized as a technology that can be expected to grow in multiple directions (Couper, 2005), digital audio recording has surged in popularity as a way to confirm survey data authenticity, evaluate interviewers’ job performance, and support other research needs. Governmental organizations such as the U.S. Census Bureau, Statistics Canada, Statistics New Zealand, Britain’s NatCen, various research organizations, and universities have adopted or are in the process of adopting CARI as part of routine operations (Thissen, Fisher, Barber, & Sattaluri, 2008), giving rise to an enlarged body of knowledge and growth in software applications. This article looks at the value, purposes, and design of systems for digital audio recording and review for survey research.

The Interviewing Cycle

Survey data collection can be viewed as a feedback cycle. Interviewers using computers enter responses from survey participants, deliver the data for compilation and review, receive feedback, and apply that guidance in the next interview. Thus, the process forms a loop, with information from one interview affecting conduct of the next. Use of audio recording improves feedback as shown in Figure 1. Strengthening feedback improves the quality of the data products and efficiency of the process. Information from audio files may also influence decisions to modify the survey instrument, to correct collected data or to flag certain data points for a second call to the respondent for confirmation. All of these actions increase the value of the eventual product of survey work: the final analytic data set.

Figure 1.

Overall CARI workflow. CARI = computer audio-recorded interviewing.

Preparation for a Computer-Based Survey

Prior to beginning data collection, system developers create or configure infrastructure to support data collection; social scientists design the questionnaire and operations; instrumentation specialists program and test the questionnaire; and statisticians select the sample population. When all is ready, data collection can begin.

Questionnaire programmers typically use a product such as Blaise (http://www.blaise.com) as shown in Figure 2, Computer-assisted Survey Execution System (CASES; http://cases.berkeley.edu/software.html), or other software. Blaise and CASES are specialty packages intended specifically for survey administration. Blaise supports audio recording through configurable built-in functionality, and CASES allows it through user-programmable extensions. The programmed questionnaire, called an instrument, controls the flow of interview administration by using conditional constructs, looping elements, and fills (insertion of respondent-specific or locale-specific information within the question) in conjunction with built-in consistency checks and computations. The instrument may also branch to other software to initiate and terminate calendar entries, global positioning system data capture, recordings, or screenshots as part of the data.

Figure 2.

Example of a questionnaire item programmed with Blaise software, as it would appear during the interview.

When audio recording is employed, the instrument must be programmed to activate and deactivate a microphone at specific times via configuration files or logical control. With Blaise software, for example, the programmer sets configuration parameters for which questions to record. With CASES software, the programmer calls a custom-built external application which in turn manages the recorder. The software tags recordings by respondent, question, and time stamp, in case a particular question is presented more than once to the same respondent. Not all survey software has the capability of recording, but many support custom extensions, allowing a developer to activate/deactivate the computer’s recording capabilities on demand.
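The tagging described above can be sketched as follows. This is a minimal illustration, not the Blaise or CASES mechanism: the `RecordingTag` class, its field names, the file-naming pattern, and the `.wav` extension are all hypothetical. The point is only that including the time stamp keeps repeated presentations of one question from colliding.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class RecordingTag:
    """Hypothetical identifying fields for one recorded segment."""
    respondent_id: str
    question_id: str
    started_at: datetime

    def filename(self) -> str:
        # The time stamp distinguishes repeated presentations of one question.
        stamp = self.started_at.strftime("%Y%m%dT%H%M%S")
        return f"{self.respondent_id}_{self.question_id}_{stamp}.wav"


# The same question presented twice to the same respondent.
first = RecordingTag("R0001", "Q12", datetime(2012, 3, 1, 14, 5, 9, tzinfo=timezone.utc))
second = RecordingTag("R0001", "Q12", datetime(2012, 3, 1, 14, 7, 30, tzinfo=timezone.utc))
print(first.filename())   # R0001_Q12_20120301T140509.wav
print(second.filename())  # R0001_Q12_20120301T140730.wav
```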

Each survey uses one or more data collection modes, a term that refers to the approach for gathering data:

In person, with the interviewer and respondent physically together, often called computer-assisted personal interviewing (CAPI);

Telephone, with the interviewer speaking by phone to the respondent, known as computer-assisted telephone interviewing (CATI); and

Self-administered questionnaires, such as web forms.

Of these, both CAPI and CATI modes can support audio recording techniques due to the presence of spoken interactions. Web surveys, whether programmed through online software services or custom-built, cannot easily record audio. While it is possible to activate client-side recording capabilities through applets, control of client-side functionality may not be allowed by the user’s security settings. In addition, web surveys rarely include vocal interactions, and so there would be little to record.

The choice of which questions or sections to record depends more on survey methodology than on technical concerns. For some purposes, recording a single question and response from the beginning, one in the middle, and one at the end may be adequate. Longer recordings, such as whole sections of the questionnaire, provide a better basis for managing interviewer performance or confirming data entry. Targeted selections may prove desirable if certain questions are key for data analysis and therefore need a higher level of data quality control, or if certain questions or sets of questions are under evaluation for how well the phrasing elicits valuable responses.

A nontechnical concern of great importance is the legality of recording. In the United States, laws vary from state to state as to whether just one or all parties must give consent. National survey organizations usually opt to collect informed consent from interviewers as an employment requirement and from respondents at the time of data collection. Instrument software configuration must allow the respondent to refuse recording without declining to participate in the interview.

Once the instrument has been programmed and background case-specific data sent to the field laptops or scheduled for the call center, the routine of production data collection is ready to begin. Many systems interoperate to support interviewing and subsequent processes, as shown in Figure 3. Though it is a detailed figure, the systems within it can be understood best by looking at the three vertical sections: on the left, the interviewer’s desktop, laptop, or mobile system; in the center, movement of information to and from a central repository; and on the right, systems for reviewing the gathered data. Thus, the illustration spans both geographical location and groups of users. For software developers, it is important to understand full system integration, even when working on only a subsystem. Hardware, software, and communications systems may vary from one CARI implementation to another, but this diagram describes the common and fundamental concepts and components. A generalized survey system such as this must be flexible and comprehensive enough to enable all expected modes of usage and to survive generations of hardware and software.

Figure 3.

Data flow and systems that support CARI data collection, transfer, and review. Dashed lines indicate optional processes, and shaded background indicates CARI-related functionality. CARI = computer audio-recorded interviewing.

Data Collection

In Figure 3, the left panel corresponds to software and processes that run on the interviewer’s workstation during the data collection activities of a survey. CAPI survey fieldwork and telephone interviewing both start with case management software that presents preloaded contact information and other details that may be needed for reaching the subject and encouraging participation, analogous to customer management systems used by sales force staff to reach buyers and close a deal. The preprogrammed survey instrument presents question text and response options. It follows programmed skip logic, routing the interview on an appropriate path through the possible questions, based on answers to branch-point questions. For example, the instrument would skip pregnancy and childbirth questions for male respondents.
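Skip logic of this kind can be illustrated with a short sketch. It is written in Python rather than in an instrument language such as Blaise, and the question identifiers and the `applicable` and `route` helpers are hypothetical. A production instrument evaluates routing one question at a time as answers arrive; this simplified version applies the rule to a set of answers already in hand.

```python
from typing import Dict, List

# Hypothetical question identifiers; a real instrument defines its own.
QUESTIONS: List[str] = ["SEX", "PREGNANT", "NUM_BIRTHS", "SMOKE_EVER"]


def applicable(question: str, answers: Dict[str, str]) -> bool:
    """Return True if the question should be asked given the answers so far."""
    if question in ("PREGNANT", "NUM_BIRTHS"):
        # Branch point: these items are skipped for male respondents.
        return answers.get("SEX") != "male"
    return True


def route(answers: Dict[str, str]) -> List[str]:
    """List the questions on the path implied by the answers."""
    return [q for q in QUESTIONS if applicable(q, answers)]


print(route({"SEX": "male"}))    # ['SEX', 'SMOKE_EVER']
print(route({"SEX": "female"}))  # ['SEX', 'PREGNANT', 'NUM_BIRTHS', 'SMOKE_EVER']
```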

Interviewers are trained to read each questionnaire item exactly as displayed, which may be as simple as the example shown earlier in Figure 2 or very complex, depending on the subject matter. Although interviewers are trained in reading verbatim and the response options are usually limited, the conversational exchange between interviewer and respondent often contains a wealth of information not captured by the response data. For example, when the words are difficult to read, as in surveys of medical conditions, the interviewer may stumble. If the text of the question or response list is lengthy or unwieldy, the interviewer may be inclined to paraphrase, affecting data quality. Respondents often ask for clarification of the questions, express hesitation before responding, or elaborate on their answers. Audio recording captures the nuances of the exchange, enriching the data entered by the interviewer.

As the interview progresses, audio files may be recorded for some or all questions and responses according to configuration of the instrument. Some surveys may record the entire interview, while others select a few specific questions or a probabilistic sampling of some parts. The recordings are tagged with identification fields and stored along with response data. It is important that neither interviewers nor respondents play an active role in the process of recording once they have completed initial procedures of providing consent to record. Neither person should be aware of the timing or extent of recording. Otherwise, a shortcutting interviewer (Ericksen, Kadane, & Tukey, 1989) might attempt to turn off recording during malfeasance, or a respondent’s answers might be biased or at least distracted by the start and stop of audio recording.
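One way to combine targeted questions with a probabilistic sample is sketched below. The function name, the fixed sampling rate, and the question identifiers are assumptions made for illustration; actual surveys define their own sampling designs. The selection is made by the software, so neither the interviewer nor the respondent knows which items are recorded.

```python
import random
from typing import Iterable, Set


def select_for_recording(question_ids: Iterable[str],
                         always_record: Set[str],
                         rate: float,
                         rng: random.Random) -> Set[str]:
    """Choose the questions to record for one interview."""
    # Targeted questions are always recorded.
    selected = set(always_record)
    for qid in question_ids:
        # Every other question is recorded with probability `rate`.
        if qid not in selected and rng.random() < rate:
            selected.add(qid)
    return selected


questions = [f"Q{n}" for n in range(1, 21)]
rng = random.Random(2012)  # seeded only to make the example repeatable
plan = select_for_recording(questions, always_record={"Q1", "Q20"}, rate=0.25, rng=rng)
print(sorted(plan))
```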

After the interviewing session ends, all collected data files are assembled, typically compressed and encrypted, and prepared for transfer to a central location. Transfer processes may occur immediately in telephone call centers that have full connectivity, or somewhat later for interviewing systems that lack continuous connectivity, such as CAPI laptops. Whether synchronous or asynchronous, the transfer process begins on the interviewer’s computer, as shown at the bottom of the left panel of Figure 3, and continues with Internet or intranet data transfer, shown in the middle panel, delivering data to centralized storage in the rightmost panel, where it becomes available for review and evaluation.

Data Storage

Four types of data are received and stored at the home office or data facility:

Response data: answers to the survey questions;

Metadata: descriptors specifying the data elements, such as question text and list of response options;

Paradata: process metrics and identifiers, such as start and stop time stamps, presence or absence of recording, interviewer identification; and

Auxiliary data: recordings, location coordinates, calendar entries, or images.

Audio recordings contain response data as well as auxiliary information. Under some circumstances, audio recordings may replace data entry by the interviewer, with later transcription of responses for better capture of long answers or discussions.

Data storage requirements depend on several characteristics of the survey: the instrumentation software; the system employed for audio review; the size of the survey’s subject pool; and the size of the response, metadata, paradata, and auxiliary data sets. Auxiliary data such as recorded audio may be stored alongside or separate from response data. In a file-based system, the database only holds links or locations of the audio recordings, and playback software must locate and retrieve the external files. In a database system with relational object storage, audio recordings may be packaged along with response data in binary large objects (blobs), and the review software must unpack the blobs prior to playback. The chosen method typically meshes with the data transfer system for receipt and storage on a flow basis, allowing quick turnaround of review and feedback. At the small end of the scale, an interviewer’s workstation may hold MySQL files that are moved to a central location periodically. At the large end of the scale, the U.S. Census Bureau employs Oracle databases with blobs for storage of response data, auxiliary files, and paradata together (Nguyen, Thissen, Siege, & Bikmal, 2010).
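The two storage designs can be contrasted in a small sketch. SQLite stands in here for the MySQL and Oracle systems mentioned above purely so that the example is self-contained; the table names, column names, and file path are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- File-based design: the database stores only the location of the audio.
    CREATE TABLE recording_link (
        respondent_id TEXT, question_id TEXT, audio_path TEXT
    );
    -- Blob design: the audio bytes travel with the rest of the record.
    CREATE TABLE recording_blob (
        respondent_id TEXT, question_id TEXT, audio BLOB
    );
""")

audio_bytes = b"RIFF....WAVE"  # placeholder bytes, not a valid audio file

conn.execute("INSERT INTO recording_link VALUES (?, ?, ?)",
             ("R0001", "Q12", "/audio/R0001_Q12.wav"))
conn.execute("INSERT INTO recording_blob VALUES (?, ?, ?)",
             ("R0001", "Q12", audio_bytes))

# File-based: playback software must locate and open the external file itself.
(path,) = conn.execute("SELECT audio_path FROM recording_link").fetchone()
# Blob: review software unpacks the bytes from the row before playback.
(blob,) = conn.execute("SELECT audio FROM recording_blob").fetchone()
print(path, len(blob))
```

The file-based design keeps the database small but requires the links and the external files to stay synchronized; the blob design keeps everything together at the cost of a larger database.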

The size of audio files places a load on storage, bandwidth, and tracking systems, compared with data collection that omits recordings. Such burden may be diminished by compression and judicious archiving or deletion policies. Fortunately, the cost of storage has declined in recent years as a fraction of operational expense, though data transfer of large files remains a challenge in areas lacking broadband.

Quality and Operational Review

CARI review processes take place in the rightmost panel of Figure 3. Data stored at the central site are made available to quality monitors through special-purpose software, usually known as a CARI review system. Review may include all or a subset of the audio recordings, perhaps supplemented by screen capture images or any other information collected during the interview. Often the review takes place as a codification of qualitative observations, providing a systematic synopsis that serves as a basis for scaled comparisons. CARI review is a synthetic method related to those described by Barnett-Page and Thomas (2009) for summarizing other types of qualitative observations in a semiquantitative form.

The reviewing system may be as simple as use of a file share with playback software and a mechanism for keeping notes, as used by Crichton and Childs (2008) for in-depth qualitative research on a modest scale. It may be as complex as a large commercial relational database with a role-based web application for playback, coding, and tracking of the cases and their audio segments, along with functionality for management of the review operation itself, as is found at the U.S. Census Bureau. It may fall in between, as does the QUEST system at RTI International, with a single coding interface and a versatile scoring scheme (Kinsey, 2012).

Although the process of recording may come to mind first, the review step is the heart of CARI’s value. With review and feedback, the data gathered during the interview can inform a variety of uses beyond simple responses, and it provides a basis for intervention and control of the data collection processes when needed. In many ways, collecting and storing the recordings present a lesser challenge than implementing software for the review stage. Design and specification of a CARI review system may present a major software development challenge, especially since such systems have only recently been requested by survey organizations and potential users are uncertain as to their exact requirements. To guide others who may need to build such a system, insight into design options is provided later in this article.

Feedback

Within the data collection cycle, only the delivery of feedback to interviewing staff, questionnaire designers, and data analysts takes place without software mediation; it is generally presented in person or by telephone. Yet, the feedback would lose effectiveness without input from the CARI system. The results of audio review pass to data collection supervisors who recognize and encourage accomplishments or address concerns. The process offers guidance for the interviewing staff and a mechanism for influencing and controlling future data collection quality. CARI evaluations may also inform improvement of the questionnaire or correction of data issues, influencing the data product directly or indirectly.

Challenges in Designing a CARI Review System

All software development goes through stages of defining needs, designing the system, implementing, testing, release, and refinement, whether those stages are agile, cyclic, or sequential. CARI technology came into use in survey organizations fairly recently, and many operations are still discovering the ways in which it can aid their work. At this date, general-purpose commercial CARI review software is not available, and each survey house must assemble its own specifications.

The CARI review portion of a full data collection suite pertains specifically to audio playback, comparison with expected content, and subsequent entry of the reviewer’s evaluations. In addition to software requirements that are common to multiuser data presentation systems, CARI review system design requirements generally include:

Scalability, from small surveys with a few hundred interviews to large national or international surveys, with hundreds of thousands of interviews;

Confidentiality and security, such as limited overall access, with role-based access to functionality and data and system tools;

Configurability, since surveys vary with respect to protocol, content, and outcome goals;

Interfaces that meet high demands for usability; and

Additional, organization-specific or survey-specific needs or concerns.

In addition, the CARI review system must integrate well with audio capture, transmission, and storage systems.

Scalability

Expectations of scale shape CARI review system design in several ways. One interview may yield many minutes of audio recording, even if statistical sampling algorithms are used to reduce the total number of questions recorded or to limit the number of recordings reviewed. For example, a survey may choose only to record a few questions and answers during the interview, selecting those of greatest analytic importance to the outcome data set, or those placed strategically for managing interviewing performance, such as recording a few each from beginning, middle, and end. Multiplied by hundreds or thousands of interviews per survey, and again by the number of surveys, the quantity of audio files to be tracked, stored, and processed in a CARI review system can quickly overwhelm a simple design.

The number of system users usually grows with the number of recordings. If each CARI reviewer plays 10 min of recording per interview and spends a few minutes making notes, for example, that reviewer can evaluate up to four cases per hour. If the organization must review 100 interviews per day, perhaps three reviewers can handle the workload. Do those reviewers themselves need to be monitored? If so, a fourth user may be needed, in a supervisory role. To monitor 1,000 interviews per day, the number of simultaneous users might approach 40 or 50. If the amount of time per review exceeds 10 min, those numbers expand proportionally.
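The staffing estimate can be worked through as simple arithmetic. The sketch below assumes 15 min per case (10 min of playback plus about 5 min of notes, which gives the four cases per hour mentioned above) and an 8-hr shift; the shift length and the function name are assumptions, not figures from any particular survey operation.

```python
def reviewers_needed(interviews_per_day: int,
                     minutes_per_review: float = 15.0,
                     hours_per_shift: float = 8.0) -> float:
    """Estimate full-time reviewers required for a daily review workload."""
    reviews_per_reviewer = (hours_per_shift * 60) / minutes_per_review
    return interviews_per_day / reviews_per_reviewer


print(reviewers_needed(100))    # 3.125
print(reviewers_needed(1000))   # 31.25
```

Supervisors, breaks, and reviews that run longer than the assumed 15 min all push the actual head count above these figures.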

At the smallest scale and simplest design, observations might be stored as free-form text commentary, with no software development at all. For very small operations, CARI reviewers could employ a free audio player and pair it with a spreadsheet for recording notes. However, such a system cannot handle more than a very low flow of interviews, cannot provide easy summaries, offers no security or linkage features, and would be subject to the inconsistencies inherent in most manual operations. For organizations serious about adopting CARI for the long term, more planning and investment is needed.

Confidentiality and Security

Role-based system design can streamline process management, particularly for large operations. To manage the system, security, progress of CARI reviewing, and outcomes of review, users may be assigned to specific roles:

System administrator role, to create new survey entries, manage user accounts, archive old data, and automate data input;

Survey manager role, to choose and configure coding schemes, monitor reports, and look up specific situations;

Security monitor, to audit user accounts, activity logs, and system error files;

CARI supervisor, to manage the daily workflow of reviewing; and

CARI monitor, to conduct the audio reviews.
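A role-to-permission mapping along these lines might be sketched as follows. The role names come from the list above, but the permission names and the `is_allowed` helper are hypothetical and are drawn only loosely from the duties described; a production system would also tie roles to specific surveys and user accounts.

```python
from enum import Enum
from typing import Dict, FrozenSet


class Role(Enum):
    SYSTEM_ADMINISTRATOR = "system administrator"
    SURVEY_MANAGER = "survey manager"
    SECURITY_MONITOR = "security monitor"
    CARI_SUPERVISOR = "CARI supervisor"
    CARI_MONITOR = "CARI monitor"


# Hypothetical permission names, drawn loosely from the duties listed above.
PERMISSIONS: Dict[Role, FrozenSet[str]] = {
    Role.SYSTEM_ADMINISTRATOR: frozenset({"create_survey", "manage_accounts", "archive_data"}),
    Role.SURVEY_MANAGER: frozenset({"configure_coding_scheme", "view_reports"}),
    Role.SECURITY_MONITOR: frozenset({"audit_accounts", "view_activity_logs"}),
    Role.CARI_SUPERVISOR: frozenset({"assign_cases", "view_reports"}),
    Role.CARI_MONITOR: frozenset({"play_audio", "enter_codes"}),
}


def is_allowed(role: Role, action: str) -> bool:
    """Return True if the role may perform the action."""
    return action in PERMISSIONS.get(role, frozenset())


print(is_allowed(Role.CARI_MONITOR, "play_audio"))       # True
print(is_allowed(Role.CARI_MONITOR, "manage_accounts"))  # False
```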

Built-in reports, usage displays, user account management tools, and other system features become necessary, as the scale of operations grows. The system maintains ties within the database among review results, identity of the monitors (for workload management), identity of interviewers (for feedback), and linkage of audio segments to questions, respondents, and surveys.

Configurability

Organizations sponsor surveys for a multitude of reasons. In some cases, the purpose is to capture a quick snapshot of health, activities, knowledge, or demographics. In other cases, the purpose may be to explore an experimental hypothesis, to probe the underpinnings of an aspect of society, or to define variations among subpopulations such as eating habits. Some surveys are conducted to test the survey questions themselves or the effectiveness of the protocol. Because the goals vary, the information sought from audio recordings varies accordingly.

At this time, audio recording for surveys tends to be used in one of three ways: for evaluating the performance of interviewing staff, for evaluating the questionnaire and survey protocol, or for confirming data quality and authenticity. Other potential uses are for primary data capture, as in recordings transcribed later, or for editing and correcting data entered by the interviewer. Consider how to extract such information from a collection of audio recordings: The generally accepted approach is through coding. The monitor, or audio reviewer, listens to the recording while selecting codes from a list of target characteristics. For example, in the case of performance monitoring, one desired behavior could be correct reading of the questionnaire items. Codes could indicate verbatim reading, minor change, major change, or other descriptions. Thus, the CARI review system should support a variety of coding schemes, likely with associated algorithms and weights for producing evaluation scores.
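A coding scheme with weights can be sketched in a few lines. The code labels follow the reading example above, but the weight values, the averaging rule, and the function name are assumptions chosen for illustration; each survey would configure its own scheme and scoring algorithm.

```python
from typing import Dict, List

# Hypothetical weights: 1.0 is the desired behavior, 0.0 the least desired.
READING_WEIGHTS: Dict[str, float] = {
    "verbatim": 1.0,
    "minor_change": 0.5,
    "major_change": 0.0,
}


def reading_score(codes: List[str], weights: Dict[str, float]) -> float:
    """Average the weights of the codes assigned to a set of recordings."""
    if not codes:
        raise ValueError("no coded recordings to score")
    return sum(weights[code] for code in codes) / len(codes)


# Codes a monitor might assign to four recordings from one interviewer.
codes = ["verbatim", "verbatim", "minor_change", "major_change"]
print(reading_score(codes, READING_WEIGHTS))  # 0.625
```

Keeping the weights in a data structure rather than in program logic is one way to let survey managers change the scheme without software changes.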

If the system supports multiple surveys with nonoverlapping or partially overlapping staff assignments, the CARI review administrator will need tools for setting up users with specific rights and privileges, account activation, deactivation, and change. Similarly, each new survey requires some degree of setup, perhaps identical for all surveys but potentially differing by sponsor, organizational branch, or other external constraint.

At system design time, this need for configurability affects database structure, interface functionality, display dynamics, and other fundamental system requirements. It may be possible to retrofit a flexible account management or coding scheme onto a non-configurable system, but the preferred practice is to lay the groundwork early and avoid rework.

Interface Design

Central to any CARI system is an interface for reviewing recordings. The CARI monitor needs to be able to listen to the audio, view any other required information such as question text, response or ancillary data, and enter review notes. Linkages among the notes, audio files, question, interviewer, respondent, reviewer, and other information are preserved in the supporting database, often in a many-to-many relationship, as a question may be recorded multiple times, and a single recording might pertain to multiple questionnaire items.

When CARI is used to manage interviewer performance, the system normally presents the reviewer with one recording at a time and stores an assessment of positive and negative interviewing attributes, with comments in free-form text. That approach is currently in use in RTI International’s QUEST system (Sattaluri, Spain, Nguyen, & Thissen, 2010) and is shown in Figure 4. QUEST supports a counting mechanism, in which reviewers count the number of occurrences of specific behaviors that are heard, and the system computes performance scores based on those counts. The option of entering counts, as opposed to a Boolean choice of present/not-present, allows refinement of the CARI-based metrics by including a factor of intensity. QUEST categorizes behaviors into groups that can be expanded or collapsed on screen to reduce screen space or skipped entirely if not relevant to a specific survey. Parts of the interface not shown provide identification information, a playback button that activates the user’s audio player, and navigation to other parts of the system.

Figure 4.

Coding screen design with expandable sections for categories of behavior.

Figure 5 shows an alternate CARI interface designed for the U.S. Census Bureau. In this system, a playback list is accompanied by an image of the interviewer’s computer screen, showing the wording of the question, response options, and any keyed information. The review page presents an embedded image of the interviewing screen at full resolution in an expandable window with scrollbars, so that the reviewer can inspect as much detail as needed. Alongside the image, a playback system allows several recordings to be played in sequence. Just below the screen image and playback list, a hierarchical set of assessment categories with one or more subgroups appears on screen dynamically when the coder selects any of the top-level categories (Nguyen et al., 2010). To preserve screen space for lower resolution screens, without losing the richness of available data, this system employs a tabbed design with the primary functionality on the main tab (shown) and less-frequently needed information and operational capabilities on the other tabs.

Figure 5.

Coding screen with dynamic coding categories showing tabbed page design for greater information density.

Other Concerns

For any CARI system, the review and coding screen is the core of the review but is not the only interface. Common activities, each with separate screens or sets of screens, include system management, user management, survey management, case assignment, workload access, statistical sampling, summary reports, data export, archiving, security review, and other role-dependent operations. Such screens are not visible to users who are not authorized to access them.

Additional overall system design, development, and support requirements may arise, such as a need to support simultaneous users working from different locations or in multiple time zones, mechanisms for evaluating the fairness of the review process itself, data flow into and out of the CARI review database, and maintenance of data linkage without divulging identities of individuals. In the future, CARI systems may be adapted to new technologies, such as new data collection hardware, cloud storage, or speech analytics.

Using a CARI System to Manage Surveys

In general, what gains can a survey organization hope to achieve through implementing an audio recording and review system such as CARI? Survey operations attempt to reduce error, especially for research surveys that guide policy makers or contribute to scientific understanding of contemporary problems. The three modules of the Census Bureau system reflect the main uses of CARI technology for total error reduction: improving questionnaires, confirming the quality of the collected data, and managing interviewer performance. Each of these forms of evaluation provides feedback that can improve the overall value of the survey.

As noted earlier, the Census Bureau’s CARI system was developed module by module over the course of several years. The behavior coding module was the first to be released, and the 2010–2011 American Community Survey (ACS) Content Test used it to evaluate new questionnaire modules and alternate wording of specific questions. This survey collected data in both CATI and CAPI modes, as shown in Table 1 (Pascale, 2011). In addition, data from the ACS Content Test were later used to test the quality assurance (QA) module, and results formed the basis for numerous system enhancements. At present, neither the QA nor the Coaching module has gone into production use, though the systems are available now and adoption plans are underway.

Table 1.

Summary statistics for the ACS Content Test. ACS = American Community Survey; CAPI = computer-assisted personal interviewing; CATI = computer-assisted telephone interviewing.

Measure	CATI	CAPI
Sample size	23,673	15,202
Completed + partial cases	4,523	6,384
Interviews coded	727	701
Recordings coded	27,163	22,676
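The counts in Table 1 support a few simple ratios that help put the coding workload in perspective. The short Python sketch below computes them. The dictionary keys and the function name are illustrative choices, not part of any CARI system, and the share of completed and partial cases is plain arithmetic on the table rather than an official response rate.

```python
# Counts copied from Table 1 (ACS Content Test).
# Key names are hypothetical labels chosen for this sketch.
table1 = {
    "CATI": {
        "sample": 23673,
        "completed_or_partial": 4523,
        "interviews_coded": 727,
        "recordings_coded": 27163,
    },
    "CAPI": {
        "sample": 15202,
        "completed_or_partial": 6384,
        "interviews_coded": 701,
        "recordings_coded": 22676,
    },
}


def summarize(row):
    """Return simple ratios derived from one column of Table 1."""
    return {
        # Completed + partial cases as a share of the sample.
        "case_share": row["completed_or_partial"] / row["sample"],
        # Coded interviews as a share of completed + partial cases.
        "coded_share": row["interviews_coded"] / row["completed_or_partial"],
        # Average number of coded recordings per coded interview.
        "recordings_per_interview": row["recordings_coded"] / row["interviews_coded"],
    }


summary = {mode: summarize(row) for mode, row in table1.items()}

for mode, stats in summary.items():
    print(
        f"{mode}: case share {stats['case_share']:.1%}, "
        f"coded share {stats['coded_share']:.1%}, "
        f"recordings per coded interview {stats['recordings_per_interview']:.1f}"
    )
```

On these figures, coders handled roughly 37 recordings per coded interview in CATI and roughly 32 in CAPI.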

Staff who used the CARI system for behavior coding in the ACS Content Test cited several features as being important to their work:

For analysts, the ability to define a behavior coding scheme specific to the survey;

For coders, the availability of an image of the interviewer’s screen along with the audio;

For supervisors, real-time reports on coding progress; and

For managers, availability of tests for interrater reliability among coders.

The Content Test thus provided valuable information not only for the research purpose of the survey but also for evaluating the performance of the CARI system itself.
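The interrater reliability tests mentioned among the features above can be illustrated with Cohen's kappa, a common agreement statistic for two coders assigning categorical codes to the same recordings. The source does not say which statistic the CARI system computes, so the following is a minimal sketch under that assumption; the code labels and sample data are invented for illustration.

```python
from collections import Counter


def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa for two coders who coded the same recordings."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("Both coders must code the same non-empty set of recordings.")
    n = len(codes_a)
    # Observed agreement: share of recordings given the same code by both coders.
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Chance agreement: based on how often each coder used each code.
    freq_a = Counter(codes_a)
    freq_b = Counter(codes_b)
    expected = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)
    if expected == 1.0:
        # Both coders used a single identical code throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical behavior codes for how closely a question was read as worded.
coder_1 = ["exact", "exact", "minor_change", "major_change", "exact",
           "minor_change", "exact", "major_change", "exact", "exact"]
coder_2 = ["exact", "exact", "minor_change", "minor_change", "exact",
           "minor_change", "exact", "major_change", "exact", "minor_change"]

kappa = cohens_kappa(coder_1, coder_2)
print(f"kappa = {kappa:.2f}")
```

Kappa corrects raw percentage agreement for the agreement expected by chance, which matters when one code dominates, as "read exactly as worded" often does.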

CARI coding schemes address the three major sources of error in questionnaire design and administration, all of which result from the human aspects of gathering information: the behavior of the interviewer, the reactions of the respondent, and the performance of the questionnaire. Although questionnaire improvement was the focus of the ACS Content Test, the other two aspects of quality management take precedence after a questionnaire has been finalized and moved into long-term or widespread use.

Behavior of the Interviewer

Survey personnel require training and monitoring to minimize certain errors that can arise from the way the interview is presented:

Failure to adhere to protocol, causing bias or variations in survey responses from person to person;

Loss of control over the interaction, when interviewing difficult respondents;

Erroneous use of the computerized system, such as entering responses incorrectly;

Shortcutting or skipping parts or all of the questionnaire; and

Deliberate falsification of responses by the interviewer, either at the unit level or at the item level.

Each recorded segment, when evaluated according to a standard set of assessment criteria, provides direct evidence of performance and data quality.

One very difficult task is obtaining proof of authenticity or, conversely, establishing the degree of data falsification and shortcutting. An interviewer may finesse a questionnaire to skip lengthy sections by falsifying the response to gateway questions, those such as “Did you ever smoke?” that lead to a subsequent series of more detailed questions. An interviewer in need of funds may not bother to contact the subject at all, instead completing a fictitious interview and pocketing the payment while charging hours at the expected level. However, the CARI reviewer can confirm validity by noting the behaviors on key questions, the sound of two distinct voices, and reasonable answer patterns. Suspicions about specific circumstances can be flagged for more intensive review.
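The reviewer's checks described above could be recorded in a simple structure and turned into a flag for follow-up. The sketch below is purely illustrative: the class, field names, and rules are hypothetical and do not describe how any existing CARI review system stores or evaluates reviewer judgments.

```python
from dataclasses import dataclass


@dataclass
class ReviewNote:
    """A reviewer's judgments after listening to one recorded case (hypothetical fields)."""
    case_id: str
    two_voices_heard: bool      # both interviewer and respondent are audible
    key_questions_asked: bool   # gateway questions were actually read aloud
    answers_plausible: bool     # recorded answers match what was said


def reasons_to_flag(note):
    """List the reasons, if any, to send a case for more intensive review."""
    reasons = []
    if not note.two_voices_heard:
        reasons.append("only one voice audible")
    if not note.key_questions_asked:
        reasons.append("gateway question skipped or not read")
    if not note.answers_plausible:
        reasons.append("answers do not match the recording")
    return reasons


# Invented example cases.
notes = [
    ReviewNote("case-001", True, True, True),
    ReviewNote("case-002", False, True, False),
]
for note in notes:
    reasons = reasons_to_flag(note)
    status = "flag: " + "; ".join(reasons) if reasons else "no concerns"
    print(f"{note.case_id}: {status}")
```

Keeping the reasons as a list, rather than a single yes/no flag, lets a supervisor see why a case was referred before listening to it again.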

Some individuals raise concerns about resistance of data collectors toward CARI monitoring. Few studies have been done, but in one debriefing of interviewers on a national CAPI survey, 82% of the interviewers reported feeling extremely positive, somewhat positive, or neutral about the overall use of CARI. Approximately 90% of the interviewers felt extremely positive, somewhat positive, or neutral about using CARI as a way to evaluate and provide feedback to interviewers, and even more accepted it as a falsification detection method (Biemer et al., 2000). Another study indicated that levels of acceptance or resistance to CARI might depend on how the technology was introduced or on geographic differences (Arceneaux, 2007).

Behavior of the Respondent

Errors relating to the respondent’s actions also come in several forms:

Deliberate falsification, perhaps due to a preference not to reveal certain information, fatigue, or a desire to give answers that fit societal expectations;

Accidental error, when the respondent does not comprehend the question correctly;

Item-level nonresponse, when the respondent refuses to answer or does not know the answer; and

Mode effects, in which the mechanism of delivering the questions (telephone vs. in-person) may have an effect on the content of responses.

Causes for these errors may be difficult to detect, but by listening to audio recordings, the circumstances may be understood. For example, confusion on the respondent’s part is often indicated in the audio file by unscripted discussion, as the interviewer provides additional information or explanations. When a situation conducive to error is suspected through review of CARI, the questionnaire may be modified, additional materials provided to the respondent, or additional training given to interviewers on how to provide better guidance.

Performance of the Questionnaire

Good design of survey questionnaires and even of individual items depends on many factors, some of them fixed across surveys and some relating to the population being examined. A well-designed question captures correctly and comprehensively the information desired by the survey writer. However, such a question is not easy to write. Errors relating to the design of the questionnaire include the following:

Cognitive effects, in which ambiguous wording or abstruse vocabulary may not be understood well by the respondents;

Short-term memory effects, such as when long lists of response options are asked aloud and the respondent has trouble remembering the first option by the time the last one has been offered;

Long-term memory effects, when respondents are asked to recall events long past;

Behavioral effects, in which the presentation of questions elicits an emotional response such as anger, resentment, or enthusiasm on the part of either the interviewer or the respondent; and

Translation faults, for surveys administered in multiple languages.

For example, items with awkward, lengthy, or complex wording give rise to incomplete or misclassified responses. Even if the respondent understands, interviewers may make mistakes or shortcut the work when trying to match a conversational response to a finite set of response options (Mitchell et al., 2008). Errors from questionnaire design may be assessed by review and behavior coding of audio recordings (Pascale, 2011), and the affected items improved for the next release of the instrument.

Summary

In summary, expertise in computing joined with subject matter expertise lets survey research take advantage of digital audio recording technology for a range of purposes. How an organization chooses to use the information or address challenges may differ from one implementation to another, but many design questions are held in common. The CARI review system presents the greatest development challenge, since organizational needs continue to evolve, commercial systems are not readily available, and flexibility of design demands foresight and understanding of survey research operations.

The CARI technique offers possibilities for other domains aside from survey research, as a monitoring approach for any situation involving recording-capable devices and conversation between individuals. One could imagine spot-checking a selection of airplane-tower communications, taxi-dispatch contacts, 911 calls, or other vocal interchanges. Similarly, audio monitoring could be used for almost any public-facing occupation that employs electronic equipment capable of recording, whether bank tellers, cashiers, or hotel clerks, where the quality of interaction with customers affects business. By combining the power of coding with the concepts of recording and review, simple pieces add up to a powerful source of information for evaluation and feedback.

Footnotes

Author’s Note

The author would like to recognize the numerous people in the Research Computing Division and Survey Research Division at RTI as well as in many branches of the U.S. Census Bureau who participated in CARI system design and development work over the years.

Declaration of Conflicting Interests

The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Funding for development of QUEST software shown in Figure 4 was provided by RTI International. Funding for development of the software shown in was provided to RTI International by the US Census Bureau under federal contracts 50-YABC-2-66053 Task Order 016 and YA-1323-09-CQ-0014 Task Order 005.

References

Arceneaux (2007). Evaluating the computer audio-recorded interviewing (CARI) household wellness study (HWS) field test. Proceedings of the American Statistical Association, statistical computing section, American Statistical Association, Alexandria, VA, 2811–2818.

Barnett-Page, & Thomas (2009). Methods for the synthesis of qualitative research: A critical review. BMC Medical Research Methodology, 9. Retrieved from http://www.biomedcentral.com/1471-2288/9/59

Biemer, P. P., Herget, Morton, & Willis, W. G. (2000). The feasibility of monitoring field interview performance using computer audio-recorded interviewing (CARI). Proceedings of the American Statistical Association’s section on survey research methods, American Statistical Association, Alexandria, VA, 1068–1073.

Couper, M. P. (2005). Technology trends in survey data collection. Social Science Computer Review, 23, 486–501.

Couper, M. P., & Nicholls, W. L., II (1998). The history and development of computer assisted survey information collection methods. In Couper, M. P., Baker, R. P., Bethlehem, Clark, C. Z. F., Martin, Nicholls, W. L., II, & O’Reilly, J. M. (Eds.), Computer assisted survey information collection (pp. 1–21). New York, NY: John Wiley.

Crichton, & Childs (2008). Clipping and coding audio files: A research method to enable participant voice. International Journal of Qualitative Methods, 4, 40–49.

Ericksen, E. P., Kadane, & Tukey, J. W. (1989). Adjusting the 1980 census of population and housing. Journal of the American Statistical Association, 84, 927–943.

Groves, R. M. (2004). Measurement error across disciplines. In Biemer, P. P., Groves, R. M., Lyberg, L. E., Mathiowetz, N. A., & Sudman (Eds.), Measurement errors in surveys (pp. 1–25). Hoboken, NJ: John Wiley.

Hicks, W. D., Edwards, Tourangeau, McBride, Harris-Kojetin, L. D., & Moss, A. J. (2010). Using CARI tools to understand measurement error. Public Opinion Quarterly, 74, 985–1003.

Kinsey (2012). Use of CARI to evaluate telephone and field interviewer performance. Paper presented at Federal Computer Assisted Survey Information Collection Workshop (FedCASIC), Washington, DC. Retrieved from https://fedcasic.dsd.census.gov/fc2012/ppt/06_kinsey.ppt

Mitchell, S. B., Strobl, M. M., Fahrney, K. M., Nguyen, M. T., Bibb, B. S., Thissen, M. R., & Stephenson, W. I. (2008). Using computer audio-recorded interviewing to assess interviewer coding error. Proceedings of the Joint Statistical Meetings, American Statistical Association, Alexandria, VA, 4414–4421.

Nguyen, M. T., Thissen, M. R., Siege, B. C., & Bikmal, S. H. (2010). Development of an integrated CARI interactive data access system for the US Census Bureau. Proceedings of the 13th International Blaise Users Conference, Baltimore, MD. Retrieved from www.blaiseusers.org/2010/papers/8b.pdf

Pascale (2011). Using behavior coding to evaluate questionnaires. Paper presented at Federal Computer Assisted Survey Information Collection Workshop (FedCASIC), Washington, DC. Retrieved from https://fedcasic.dsd.census.gov/fc2011/ppt/03_pascale.pdf

Sattaluri, Spain, C. J., Nguyen, M. T., & Thissen, M. R. (2010). Technical challenges in the development and implementation of QUEST. Paper presented at the International Field Directors and Technologies Conference, Chicago, IL. Retrieved from http://ifdtc.org/PC2010/presentation_2010_files/8E-Sridevi%20Sattaluri.pdf

Thissen, M. R., Fisher, Barber, & Sattaluri (2008). Computer audio-recorded interviewing (CARI), A tool for monitoring field interviewers and improving field data collection. Proceedings of the International Methodology Symposium 2008, Statistics Canada, Gatineau, Canada. Retrieved from http://www5.statcan.gc.ca/bsolc/olc-cel/olc-cel?lang=eng&catno=11-522-X200800010955