Does Research on Evaluation Matter? Findings From a Survey of American Evaluation Association Members and Prominent Evaluation Theorists and Scholars

Abstract

Research on evaluation theories, methods, and practices has increased considerably in the past decade. Even so, little is known about whether published findings from research on evaluation are read by evaluators and whether such findings influence evaluators’ thinking about evaluation or their evaluation practice. To address these questions, and others, a random sample of American Evaluation Association (AEA) members and a purposive sample of prominent evaluation theorists and scholars were surveyed. A majority of AEA members (80.95% ± 7.60%) and sampled theorists and scholars (84.21%) regularly read research on evaluation and indicate that research on evaluation has influenced their thinking about evaluation and their evaluation practice (97.00% ± 3.38% and 94.00% ± 4.79%, for AEA members, and 100% and 100%, for prominent theorists and scholars, respectively).

Keywords

research on evaluation, research influence, evaluation theory, evaluation practice, American Evaluation Association

Background and Introduction

In the last decade, the number of “research on evaluation” investigations has increased substantially, with many evaluation theorists and scholars, practitioners, and graduate students publishing and presenting findings from their investigations of evaluation theories, methods, and practices in scholarly journals and at meetings of specialized associations (e.g., Alkin, Vo, & Hansen, 2013; Azzam, 2010, 2011; Brandon & Fukunaga, 2013; Christie, 2007; Christie & Azzam, 2004; Christie & Fleischer, 2011; Coryn, Hattie, Scriven, & Hartmann, 2007; Coryn, Noakes, Westine, & Schröter, 2011; Cullen, Coryn, & Rugh, 2011; Davies & MacKay, 2014; Fleischer & Christie, 2009; Heberger, Christie, & Alkin, 2010; Johnson et al., 2009; LaVelle & Donaldson, 2010; Miller & Campbell, 2006; St. Clair, Cook, & Hallberg, 2014). In general, these studies have consisted of surveys of American Evaluation Association (AEA) members on various topics (e.g., Fleischer & Christie, 2009; Seidling, 2015; Szanyi, Azzam, & Galen, 2013), descriptive accounts of evaluation education and training programs (e.g., Davies & MacKay, 2014; LaVelle & Donaldson, 2010), research on and development of evaluation methods (e.g., Reichardt, 2011; St. Clair et al., 2014), and systematic reviews of evaluations applying differing evaluation approaches (e.g., Brandon & Fukunaga, 2013; Chouinard & Cousins, 2013; Coryn et al., 2011; Johnson et al., 2009; Miller & Campbell, 2006), among many others.

Although interest in and appeals for empirical investigations of evaluation theories, methods, and practices appeared far earlier (e.g., Shadish, Cook, & Leviton, 1991; Smith, 1993)—following a so-called “golden age” of research on evaluation beginning in the 1970s and continuing through the 1980s (e.g., Brown, Braskamp, & Newman, 1978; Braskamp, Brown, & Newman, 1982; Cousins & Leithwood, 1986; Patton et al., 1977; Weiss, 1977; Weiss & Bucuvalas, 1980)—recent events, including Christie’s (2003), Henry and Mark’s (2003), and Mark’s (2007) influential publications on research on evaluation as well as the formation of the Research on Evaluation (RoE) Topical Interest Group (TIG) within AEA in 2008 and the newly created AEA “Research on Evaluation Award” in 2015, have brought greater attention to the perceived need for and potential benefits of systematic investigations of evaluation theories, methods, and practices (cf. Christie, 2011; Coryn & Westine, 2015; Miller, 2010; Stufflebeam & Coryn, 2014; Szanyi et al., 2013). Similarly, the American Educational Research Association (AERA) has a Special Interest Group (SIG) for RoE, which was established in the early 1990s as well as a “Research on Evaluation Distinguished Scholar Award” created in 2011.

Despite this continued and seemingly growing interest, little is known about how published findings from research on evaluation investigations influence, or do not influence, evaluators’ thinking about evaluation or their evaluation practices (which was the impetus for the current investigation). Christie (2003), for example, found a significant disconnect between prescriptive evaluation theories and actual evaluation practice, which may suggest that a similar disconnect exists between findings from research on evaluation and their influence on, or application to, evaluation practice. Relatedly, Szanyi, Azzam, and Galen (2013) found that a majority of evaluators have an interest in many research on evaluation subjects (in particular, research on evaluation’s impact), but few are actually interested in conducting research on evaluation.

Definition of Research on Evaluation Used for This Investigation

Historically, research on evaluation has been an ambiguous term used to describe many forms of both systematic and unsystematic inquiry into evaluation theories, methods, and practices including, for example, post hoc reflections on evaluation experiences (though Christie, 2011, suggests that these practice-based reflections are pertinent to advancing evaluation theory and practice). Brandon and Fukunaga (2013) defined research on evaluation as “[any] … systematic inquiry into the methods, practices, and profession of program evaluation, with potential implications of its findings for evaluation theory” (p. 27). Brandon and Fukunaga’s definition, however, confines research on evaluation to research on program evaluation and evaluation theory (excluding research on evaluation methods and practices). An intentionally broader definition, encompassing research on evaluation not specifically related to but inclusive of program evaluation, as well as other types of evaluands or objects of inquiry (e.g., products, personnel; see Coryn & Hattie, 2006; Scriven, 1991), was applied for purposes of this investigation:

Any purposeful, systematic, empirical inquiry intended to test existing knowledge, contribute to existing knowledge, or generate new knowledge related to some aspect of evaluation processes or products, or evaluation theories, methods, or practices.

This conceptual and operational definition of research on evaluation was publicly vetted at the business meeting of the RoE TIG at the 2014 AEA annual conference and was generally perceived as accurate by those in attendance at the meeting (many of whom have conducted and published research on evaluation). Considerable debate occurred during the vetting process, and that debate significantly improved the final definition presented here.

Questions Investigated

The focal questions investigated in the study were:

What proportion of AEA members regularly read peer-reviewed, published research on evaluation?

For those AEA members who regularly read peer-reviewed, published research on evaluation, what influence do findings from research on evaluation have on their thinking about evaluation and their evaluation practice?

What proportion of prominent evaluation theorists and scholars regularly read peer-reviewed, published research on evaluation?

For those prominent evaluation theorists and scholars who regularly read peer-reviewed, published research on evaluation, what influence do findings from research on evaluation have on their thinking about evaluation and their evaluation practice?

Method

Design

A cross-sectional design was used to investigate and address the focal research questions. More specifically, the design consisted of a simple random sample survey of AEA members and a purposive sample survey of prominent evaluation theorists and scholars.

Sample

Following an application procedure with and approval from the AEA Research Request Task Force, the names and e-mail addresses (as well as secondary information relevant to the study; e.g., gender, ethnicity, and highest level of education) of all AEA members as of October 24, 2014, were obtained. The member database provided by AEA had a total of N = 7,026 unique records. Prominent evaluation theorists and scholars (n = 33) were purposively selected to participate in the survey on the basis of their reputations in the field and their contributions to evaluation theory, method, and practice (many were past presidents of AEA and/or founders of major evaluation theories and approaches) and were excluded from the sampling frame of general AEA members. Excluding the prominent theorists and scholars (n = 33) as well as the authors and two graduate students (n = 10; who developed and pilot/pretested the surveys) resulted in a final usable sampling frame for the survey of AEA members of N = 6,983. A simple random sample (using a bound on the error of estimation of ±5% and a conservative population proportion of p = .50) of n = 378 AEA members was estimated to be sufficient to address the focal research questions. A 20% oversample (n = 76) was taken to account for potential nonresponse, resulting in a total sample size of n = 454 AEA members. With the theorists and scholars included, the overall selected sample size was n = 487. At the time of the surveys, eight of the AEA members selected for inclusion in the sample had undeliverable e-mail addresses (n = 4), were unavailable (n = 2), or asked to be removed from AEA’s “research list” during the administration of the surveys (n = 2), reducing the usable sample of AEA members to n = 446 and the total sample to n = 479. None of the e-mails sent to prominent evaluation theorists and scholars were undeliverable.
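The sample size arithmetic described above can be approximated with a short calculation. The following is a minimal sketch, assuming the standard textbook formula for estimating a proportion under simple random sampling with a finite population correction, n = Npq / ((N − 1)D + pq), with D = B²/4. The article does not state which formula was used, and the function and variable names are illustrative only.

```python
import math


def srs_sample_size(population: int, bound: float, p: float = 0.5) -> float:
    """Approximate sample size for estimating a proportion under simple
    random sampling, with a finite population correction.

    Assumed formula (not stated in the article):
        n = N*p*q / ((N - 1)*D + p*q),  where D = bound**2 / 4
    """
    q = 1.0 - p
    d = bound ** 2 / 4.0
    return population * p * q / ((population - 1) * d + p * q)


# Values taken from the text: sampling frame N = 6,983, bound = 0.05, p = .50.
n_required = srs_sample_size(population=6983, bound=0.05, p=0.5)

# Round to the nearest whole person, then add a 20% oversample (rounded up).
n_base = round(n_required)
n_oversample = math.ceil(0.20 * n_base)
n_total = n_base + n_oversample

print(f"required n = {n_required:.2f}; base = {n_base}; "
      f"oversample = {n_oversample}; total = {n_total}")
```

Under these assumptions the formula yields a value slightly above 378, which is consistent with the reported n = 378 when rounded to the nearest whole number, and the oversample and total match the reported n = 76 and n = 454.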

From the random sample of AEA members, a response rate of 43.94% (n = 196) was obtained. From the purposive sample of prominent evaluation theorists and scholars, the response rate was 63.63% (n = 21). Combined, the overall response rate was 45.30%. Table 1 shows some of the general traits and characteristics of both the AEA member sample and the sample of evaluation theorists and scholars, contrasted with those of all AEA members. As shown in Table 1, the simple random sample of AEA members is generally congruent with the overall AEA member population. Other than z-test results for American Indian, Native American, and Alaska Native (z = 3.89, p < .001) and doctorate level of education (z = 2.52, p = .01), differences between the obtained sample of AEA members and the AEA member population were not statistically significant, suggesting that a reasonably representative sample was obtained.
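The response rates and one of the z statistics reported above can be checked with a few lines of arithmetic. This is a sketch under stated assumptions: the article does not name the form of z test used, so a two-proportion z test with a pooled standard error is assumed here, and the function names are hypothetical. The inputs are the counts and Table 1 percentages given in the text.

```python
import math


def response_rate(respondents: int, usable_sample: int) -> float:
    """Response rate as a percentage of the usable sample."""
    return 100.0 * respondents / usable_sample


def pooled_two_proportion_z(p1: float, n1: int, p2: float, n2: int) -> float:
    """z statistic for the difference between two proportions using a pooled
    standard error. ASSUMED test form; the article does not specify it."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    return (p1 - p2) / se


# Counts taken from the text.
rate_members = response_rate(196, 446)
rate_theorists = response_rate(21, 33)
rate_overall = response_rate(196 + 21, 446 + 33)

# Doctorate row of Table 1: 50.51% of the member sample (n = 196) versus
# 41.52% of the member population (N = 7,026).
z_doctorate = pooled_two_proportion_z(0.5051, 196, 0.4152, 7026)

print(f"members: {rate_members:.2f}%, theorists/scholars: {rate_theorists:.2f}%, "
      f"overall: {rate_overall:.2f}%, z (doctorate): {z_doctorate:.2f}")
```

The computed response rates agree with the reported values to within about 0.01 percentage points (the reported figures appear to be truncated rather than rounded), and the assumed test form gives approximately z = 2.52 for the doctorate comparison. Because the test form is an assumption, it need not reproduce every reported statistic exactly.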

Table 1.

AEA Member Sample, Prominent Theorist and Scholar Sample, and AEA Member Population Traits and Characteristics.

Trait/Characteristic^a	AEA Member Sample^b (n = 196; %)	Prominent Theorist and Scholar Sample^b (n = 21; %)	AEA Member Population (N = 7,026; %)
Gender
Male	28.57	52.38	26.27
Female	66.84	47.62	64.56
Ethnicity
African American, Black	8.16	0.00	7.17
American Indian, Native American, Alaska Native	2.55	0.00	0.48
Asian	7.14	0.00	6.02
Caribbean Islander	0.51	0.00	0.63
European American, White	60.71	66.66	56.99
Latino or Hispanic	4.08	0.00	3.97
Middle Eastern or Arab	0.51	0.00	0.74
Native Hawaiian or Pacific Islander	0.00	0.00	0.23
Other	2.55	4.76	4.34
Highest level of education
Doctorate	50.51	100.00	41.52
Master’s	39.29	0.00	41.94
Bachelor’s	5.10	0.00	5.61
Other	0.00	0.00	0.85
Country
United States	84.18	81.82	80.03
Other	12.24	18.18	14.86
Primary work setting
College/university	33.67	71.43	30.84
Nonprofit organization	19.39	0.00	21.02
Private business	18.88	14.29	20.35
Federal agency	8.16	0.00	5.31
State agency	4.08	0.00	3.19
Local agency	2.04	0.00	2.08
School system	1.53	0.00	2.32
Other	5.61	0.00	6.12
Major activity
Evaluation	45.92	28.57	41.45
Research	11.73	28.57	13.25
Consulting	12.76	0.00	11.19
Management/administration	9.18	4.76	9.22
Student	8.67	0.00	6.26
Teaching	3.57	19.05	4.92
Other	2.04	0.00	3.22
Number of years conducting evaluation (M, SD)	12.92 (10.47)	37.25 (11.37)	—
Number of years as an AEA member (M, SD)	7.38 (7.88)	27.47 (5.91)	—
Attended an AEA conference
Yes	75.15	100.00	—
No	24.85	0.00	—
Attended an RoE TIG Session at an AEA Conference^c
Yes	23.72	80.00	—
No	33.05	0.00	—
Unsure	43.22	20.00	—

Note. AEA = American Evaluation Association; RoE = Research on Evaluation; TIG = Topical Interest Group.

^aWithin subgroups, traits/characteristics percentages do not always total 100% due to item nonresponse/missingness and/or rounding error. ^bEstimates derived from obtained samples. ^cEstimates derived from respondents indicating that they had attended an AEA conference.

Instrumentation

The surveys of AEA members and prominent evaluation theorists and scholars consisted of closed-response, partially closed-response, and open (free)-response items. The surveys were intentionally brief and designed only to elicit information pertinent to the study (e.g., demographic characteristics were provided in the AEA database and, therefore, not asked of respondents). A series of skip patterns were also used to further reduce the burden of response where possible (e.g., if a respondent indicated “never,” “seldom,” or “don’t have access” for reading a particular evaluation journal, then the respondent was not asked how frequently he or she read research on evaluation published in that particular journal in subsequent items).^1,2

For the purposes of the study, “read research on evaluation” was operationally defined as “the frequency of reading articles in peer-reviewed evaluation journals,” specifically research on evaluation. Frequency of reading peer-reviewed evaluation journals and of reading research on evaluation in peer-reviewed evaluation journals was elicited in the survey questionnaires using an ordinal level of measurement consisting of the categories “never,” “seldom,” “often,” or “very often.” For questionnaire items regarding the influence of research on evaluation, “thinking about evaluation” was operationally defined as “… how you conceptualize and/or define evaluation” and “evaluation practice” was defined as “… how you go about conducting evaluation.”

Internal consistency (i.e., the lower bound estimate of reliability) over all ordinal-level items was estimated using ordinal α and Ω (Gugiu, Coryn, & Applegate, 2010), assuming a congeneric (i.e., unidimensional) measurement model. For the full usable sample of AEA members and theorists and scholars, ordinal α = .88 and ordinal Ω = .91.

Procedure

The surveys of AEA members and prominent evaluation theorists and scholars were administered from February 9, 2015, to March 20, 2015, using the Qualtrics web-based survey system. An initial e-mail message inviting the selected AEA members and theorists and scholars and informing them of the study and its purposes was sent on February 2, 2015, 1 week prior to the initial administration of the surveys. Thereafter, reminder e-mails were delivered weekly over the administration period to those who were selected for the sample but who had not yet responded. Throughout the planning and administration of the surveys, the principles of Dillman, Smyth, and Christian’s (2009) “tailored design method” for conducting surveys were carefully applied in an effort to increase the quality and quantity of responses.

Institutional review

The study was reviewed and approved by the Western Michigan University (WMU) Human Subjects Institutional Review Board. Members of both samples (AEA members and evaluation theorists and scholars) read an electronic informed consent document prior to participating in the study.^3

Data Processing and Analysis

Closed-response and partially closed-response data obtained through the web-based surveys were downloaded from the Qualtrics survey system as tab-delimited files and then imported into SAS 9.3 for processing and analysis. Where relevant, bounds on the error of estimation, B (denoted by ±; i.e., sampling error), were calculated for statistical estimates of population parameters. Sampling errors were not estimated for the theorist and scholar sample because that sample was nonrandom.

Two coders independently coded the open-ended responses, following collaborative construction of an emergent coding scheme derived from an initial screening of the open-ended responses. Text segments, phrases, and words were coded as 1 when a coding category was present (and as 0 when not present) so that frequencies could be tabulated for each response. For the independent coding procedure, exact interrater agreement was p_o = .88, and agreement corrected for the probability of chance agreement was κ = .83 (Davey, Gugiu, & Coryn, 2010). Following the independent coding procedure, the two coders worked collaboratively to build consensus on the final coding of the open-ended responses.
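The two agreement indices reported above, exact agreement (p_o) and chance-corrected agreement (κ), can be illustrated with a small example. The sketch below assumes Cohen’s κ for two coders and a single binary (1/0) coding category; the article does not specify the κ variant, and the coded data shown are hypothetical toy values, not the study’s data.

```python
def cohen_kappa(codes_a, codes_b):
    """Return (observed agreement p_o, Cohen's kappa) for two coders'
    binary codes, where 1 = category present and 0 = category absent."""
    if len(codes_a) != len(codes_b) or len(codes_a) == 0:
        raise ValueError("Both coders must code the same non-empty set of items.")
    n = len(codes_a)

    # Proportion of items on which the two coders gave the same code.
    p_o = sum(a == b for a, b in zip(codes_a, codes_b)) / n

    # Agreement expected by chance, from each coder's marginal rate of 1s.
    rate_a = sum(codes_a) / n
    rate_b = sum(codes_b) / n
    p_e = rate_a * rate_b + (1 - rate_a) * (1 - rate_b)
    if p_e == 1:
        raise ValueError("Kappa is undefined when chance agreement equals 1.")

    return p_o, (p_o - p_e) / (1 - p_e)


# HYPOTHETICAL codes for ten text segments (illustration only).
coder_a = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
coder_b = [1, 1, 0, 0, 0, 0, 1, 0, 1, 1]

p_o, kappa = cohen_kappa(coder_a, coder_b)
print(f"p_o = {p_o:.2f}, kappa = {kappa:.2f}")
```

In this toy example the coders agree on 8 of 10 segments (p_o = .80), but because half of that agreement would be expected by chance, κ is lower (.60). The same logic explains why the reported κ = .83 is smaller than the reported p_o = .88.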

Results

To investigate the focal research questions, two primary analyses were conducted. The first was a descriptive summary of the survey results. The second was a thematic analysis of responses to the open-ended items within the surveys.

Descriptive Findings From the Surveys of AEA Members and Prominent Theorists and Scholars

Nearly all (96.94% ± 2.38%) AEA members and all (100%) sampled theorists and scholars consider research on evaluation important. Even so, and although she considers research on evaluation to be important, one notable theorist/scholar commented that research on evaluation is, in general, “… of poor quality—surveys of member opinions such as this survey. [with] Sampling bias, etc. Too often the complexity of evaluation is reduced to measurable but meaningless variables.” Another, however, mentioned that research on evaluation “… provides essential information to guide the development of high quality professional practice, explore the utility of theoretical ideas on how best to do evaluation, and refine theory on critical topics.”

As shown in Figure 1, the American Journal of Evaluation (AJE) and New Directions for Evaluation (NDE) are, overall, the most frequently read journals (i.e., “often” or “very often”) by a majority of AEA members (70.35% ± 6.76% and 51.18% ± 7.44%, respectively), who, as members of AEA, are automatically subscribed to these two journals. In addition to AJE (90.48%) and NDE (80.95%), theorists and scholars tend to read other journals semiregularly or regularly (e.g., Evaluation: The International Journal of Theory, Research and Practice, Journal of MultiDisciplinary Evaluation).

Figure 1.

Frequency of reading evaluation journals*. *Percentages do not always total 100% due to rounding errors and because the estimated percentages for each group do not include a column for “don’t have access,” which was asked in the surveys (see “Instrumentation”). Therefore, where subgroup rows do not total 100%, the remainder is the proportion of respondents who indicated that they do not have access to a particular journal. For example, prominent theorists and scholars indicated “never” (75%) or “seldom” (10%) reading the journal Research Evaluation, for a total of 85%; the remaining 15% indicated that they do not have access to the journal (resulting in a total of 100%).

AEA members most often read articles on evaluation methods (92.85% ± 4.64%), reflections on evaluation practice (87.80% ± 6.15%), or research on evaluation (80.95% ± 7.60%), whereas the sampled theorists and scholars most often read articles on evaluation theory (94.73%), evaluation methods (89.47%), research on evaluation (84.21%), and evaluation ethics (84.21%), as shown in Figure 2. In the percentages displayed in Figure 2, the sample sizes for types of articles most frequently read by AEA members varied from n = 115 to n = 126 and for theorists and scholars ranged from n = 17 to n = 19, which were used as the denominators for the estimates shown.

Figure 2.

Types of articles most frequently read in evaluation journals by general American Evaluation Association members and prominent evaluation theorists and scholars.

Of general AEA members and the sampled prominent theorists and scholars who read at least one evaluation journal “often” or “very often,” each group reads research on evaluation published in a diverse range of evaluation journals, as shown in Figure 3. Proportional to the sample sizes associated with the percentages presented in Figure 3, both groups most frequently read articles reporting findings from research on evaluation in AJE and NDE, though AJE is the most commonly read by both.

Figure 3.

Frequency of reading research on evaluation published in evaluation journals*. *Percentages do not always total 100% due to rounding errors. Percentage of those indicating “often” or “very often” reading a particular evaluation journal and as derived from a skip pattern in the surveys (i.e., respondents indicating “never” or “seldom” reading a particular journal were not asked how frequently they read research on evaluation in those journals and, therefore, the ns are substantially smaller for American Evaluation Association members than the ns shown in Figure 1).

As illustrated in Figure 4, research on evaluation has influenced both general AEA members’ and the sample of prominent theorists and scholars’ thinking about evaluation and their evaluation practice, whether “some” or “a lot.” For general AEA members, the influence of research on evaluation on their thinking about evaluation and their evaluation practice is essentially equivalent (97.00% ± 3.38% and 94.00% ± 4.79%, respectively). Research on evaluation has influenced all theorists and scholars’ thinking about evaluation as well as their evaluation practice (100% and 100%, respectively), though when considering only whether research on evaluation has had “a lot” of influence, it has more greatly influenced their thinking about evaluation (80.00%) than their evaluation practice (25.00%). Very few AEA members (3.00% ± 3.33% and 6.00% ± 4.64%, respectively) and none of the theorists and scholars indicated that research on evaluation has had “no” or “very little” influence on their thinking about evaluation or their evaluation practice (due to the small frequencies, these categories are not labeled in Figure 4). In the figure, the denominator used for the percentages shown for AEA members was n = 100, and the denominators for theorists and scholars were n = 15 for influence on thinking about evaluation and n = 16 for influence on evaluation practice.

Figure 4.

Influence of research on evaluation on general American Evaluation Association members’ and prominent evaluation theorists and scholars’ thinking about evaluation and their evaluation practice.

Thematic Findings Emerging From Open-Ended Item Responses

Of those AEA members (n = 101) and theorists and scholars (n = 20) who indicated that research on evaluation is important and who also provided written responses as to why they consider it important, eight predominant themes emerged, as well as an “other” category (many responses were assigned to more than one coding category). Here, the category other (see Figure 5) included infrequently occurring themes and/or vague statements that could not be meaningfully coded.

Figure 5.

Why general American Evaluation Association members and prominent evaluation theorists and scholars consider research on evaluation important.

By far, and as shown in Figure 5, general AEA members and prominent theorists and scholars believe that findings from research on evaluation contribute to “improving, informing, and guiding evaluation practice” (40.59% and 50.00%, respectively). Statements such as “It is important to learn about what works well for evaluators and what does not work as well and why, and how practice links with theory” and “If we believe that evaluation can lead to improvement in our evaluands, we should apply it reflexively to our own work so that we continue to improve our theory and practice” exemplify this theme. A smaller minority of general AEA members (20.79%) and theorists and scholars (30.00%) reported that research on evaluation “provides empirical evidence” to support sound practices. As one AEA member noted, “Without sound foundational evidence to guide the process of evaluation, evaluators can only choose a particular thought-leader’s approach to evaluation to guide their efforts … [and we] … would be left with a celebrity-based practice rather than an evidence-based practice.”

Among AEA members (n = 19) who indicated that they rarely (i.e., “never” or “seldom”) read research on evaluation and who also provided written responses as to why, four major themes emerged. These themes were “lack of time” (52.63%), “lack of interest or application to work” (36.84%), other (26.31%), and “too esoteric or complicated” (10.52%). As with the reasons some consider research on evaluation important, the category other included infrequently occurring themes and vague statements that could not be meaningfully coded. Of those who rarely read research on evaluation, a majority indicated that they do not read research on evaluation due to time constraints. This theme is exemplified by statements such as “Time constraints prevent me from perusing these journals the way I would like to.” The theme of being too esoteric or complicated is represented by statements such as “… articles that fall into ‘research on evaluation’ tend to be very esoteric and cannot be directly applied to my work …” or are “… too complicated, wordy, and sometimes other worldly.”

Discussion

Although there are a small number of notable exceptions (e.g., Cousins & Leithwood’s, 1986, review of research on evaluation utilization and Patton et al.’s, 1977, analysis of evaluation utilization in health care), carefully conducted empirical investigations of evaluation theories, methods, and practices were relatively rare prior to Christie’s (2003) investigation of the theory–practice relationship in evaluation. In the decade following Christie’s (2003) study, however, such investigations have been occurring with increasing regularity. Even so, in 2004, Weiss recommended that advancing the practice of “… doing [italics in original] evaluation, teaching people how to do evaluation … and in general, advancing the practice [italics in original] of evaluation” (p. 166) should be given greater priority, rather than conducting more research on evaluation.

Despite Weiss’ (2004) admonition, the obvious influence of research on evaluation on AEA members’ and other leading theorists and scholars’ thinking about evaluation and their evaluation practice, coupled with the recent surge in published research on evaluation investigations, might suggest otherwise. Findings from rigorous, systematic research on evaluation studies could potentially lead to better evaluation practices, that is, evaluation practices that are grounded in empirical evidence as opposed to prescriptive theories of practice. By integrating research on evaluation findings into evaluation practice, evaluators could potentially produce better quality, more useful, and higher impact evaluations (Christie, 2003, 2011; Coryn & Westine, 2015; Henry & Mark, 2003; Mark, 2007; Shadish et al., 1991; Smith, 1993; Stufflebeam & Coryn, 2014).

Nearly all AEA members and all sampled prominent theorists and scholars consider research on evaluation important, though this is not always strongly associated with whether they actually read research on evaluation or whether it influences their thinking about evaluation or their evaluation practice, particularly among general AEA members. General AEA members tend to read only AJE and NDE, whereas theorists and scholars tend to read more widely. Both AEA members and evaluation theorists and scholars read most evaluation journals infrequently, if at all. Of those who read peer-reviewed evaluation journals frequently, research on evaluation ranks among the most frequent types of articles read by both groups, though evaluation methods was also one of the most frequently read by both groups.

Limitations

This study offers limited insight into how AEA members and prominent evaluation theorists and scholars interact with and are influenced by findings from empirical investigations of evaluation theories, methods, and practices published in evaluation journals, and the results are, therefore, subject to several caveats. First, one obvious limitation, or potential point of contention, could arise from the definition of research on evaluation used for the study. This definition intentionally excludes certain activities that some (e.g., Christie, 2011) consider research on evaluation, such as reflections on practice. Second, the operational definitions used for estimating the frequency with which AEA members read research on evaluation and the journals that are read most frequently could be considered a significant limitation, as many members of AEA very likely read other journals that influence their thinking about evaluation and their evaluation practice (e.g., American Journal of Public Health, Journal of Development Effectiveness, and Journal of Research on Educational Effectiveness). Third, the survey questionnaire did not indicate a time frame for respondents (e.g., how often in the last year) with regard to reading research on evaluation, or evaluation journals more generally. Fourth, the response rate of the survey of AEA members and prominent evaluation theorists and scholars was less than anticipated (and desired), thus reducing representativeness and generalizability, even though the response rate was greater (overall, 45.30%) than that of other recent surveys of AEA members, none of which used probability sampling methods (e.g., Fleischer & Christie, 2009 [29.81%]; Seidling, 2015 [15.30%]; Szanyi et al., 2013 [28.82%]). Fifth, there is some likelihood, although that probability is unknown, that social desirability influenced responses to the surveys (Furnham, 1986), thus possibly biasing and inflating estimates regarding reading and the influence of research on evaluation among AEA members and prominent theorists and scholars. Finally, it is unknown how many e-mail survey requests sent to AEA members were filtered as “spam” or “junk” by their e-mail servers, such that the intended recipients never received the requests, further reducing response rates. Given the exploratory nature of the investigation, these limitations are, however, somewhat offset by the unique knowledge generated.

Future Research

Because evaluation is predominately a practice-based discipline, with only slightly more than one third of AEA members working in academia, methods other than scholarly journals for disseminating findings from research on evaluation (e.g., evaluation listservs, evaluation websites, and evaluation blogs) should be carefully studied so that evaluation practice can continuously improve and, ultimately, better serve society by incorporating findings from research on evaluation into contemporary evaluation practice. Relatedly, evaluation methods and practices are (seemingly) increasingly driven by external forces and demands (i.e., by what those who commission evaluation [i.e., funders or sponsors] want or expect from evaluation and evaluators) rather than by what is presumed to be good practice based on empirical findings from research on evaluation and/or from evaluation theory. If true, then practitioners likely have less of an extrinsic motivation to read research on evaluation and incorporate research on evaluation findings into their practice, and this, too, is worthy of investigation. Moreover, response rates from recent surveys of AEA members have been consistently low, likely resulting in biased samples. Means for increasing response rates (e.g., tangible incentives), therefore, need to be investigated so that samples of AEA members (whether random or nonrandom) better represent the overall population of AEA members and, ultimately, increase the generalizability of findings produced from investigations of AEA members. In doing so, it should be noted that it is very likely that some “members” of AEA register for the association only when presenting at or attending the AEA annual conference rather than having a true interest in being an active member of the association, as is the case with many professional associations. Finally, “how” research on evaluation influences evaluation practice and theory (as opposed to merely “whether” it does) should be carefully investigated in future studies.

Footnotes

Acknowledgments

The authors would like to thank the AEA RoE TIG and the AEA Research Request Task Force for making this study possible. They would also like to thank the AERA RoE SIG for conferring the “Research on Evaluation Distinguished Scholar Award” on the first author in 2014. Additionally, the authors would like to thank Interdisciplinary PhD in Evaluation (IDPE) students Cheryl Endres and Sabrina Holley for pretesting the questionnaires used in the study and the three anonymous reviewers for their insightful feedback on and suggestions for improving the original version of this article.

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research and/or authorship of this article: IDPE program at Western Michigan University (WMU).

References

Alkin, M. C., Vo, A. T., & Hansen (Eds.). (2013). Using logic models to facilitate comparisons of evaluation theory. Evaluation and Program Planning, 38, 33–88.

Azzam (2010). Evaluator responsiveness to stakeholders. American Journal of Evaluation, 31, 45–65.

Azzam (2011). Evaluator characteristics and methodological choice. American Journal of Evaluation, 32, 376–391.

Brandon, P. R., & Fukunaga, L. L. (2013). The state of the empirical research literature on stakeholder involvement in program evaluation. American Journal of Evaluation, 35, 26–44.

Braskamp, L. A., Brown, R. D., & Newman, D. L. (1982). Studying evaluation utilization through simulations. Evaluation Review, 6, 114–126.

Brown, R. D., Braskamp, L. A., & Newman, D. L. (1978). Evaluator credibility as a function of report style: Do jargon and data make a difference? Evaluation Review, 2, 331–341.

Christie, C. A. (2003). What guides evaluation? A study of how evaluation practice maps onto evaluation theory. In C. A. Christie (Ed.), The practice-theory relationship. New directions for evaluation (Vol. 97, pp. 7–36). San Francisco, CA: Jossey-Bass.

Christie, C. A. (2007). Reported influence of evaluation data on decision makers’ actions. American Journal of Evaluation, 28, 8–25.

Christie, C. A. (2011). Advancing empirical scholarship to further develop evaluation theory and practice. Canadian Journal of Program Evaluation, 26, 1–18.

Christie, C. A., & Azzam (2004). What’s all the talk about? Examining EVALTALK, an evaluation listserv. American Journal of Evaluation, 25, 219–234.

Christie, C. A., & Fleischer, D. N. (2011). Insight into evaluation practice: A content analysis of designs and methods used in evaluation studies published in North American evaluation-focused journals. American Journal of Evaluation, 31, 326–346.

Chouinard, J. A., & Cousins, J. B. (2013). Participatory evaluation for development: Examining research-based knowledge from within the African context. African Evaluation Journal, 1, 1–9.

Coryn, C. L. S., & Hattie, J. A. (2006). The transdisciplinary model of evaluation. Journal of MultiDisciplinary Evaluation, 3, 107–114.

Coryn, C. L. S., Hattie, J. A., Scriven, & Hartmann, D. J. (2007). Models and mechanisms for evaluating government-funded research: An international comparison. American Journal of Evaluation, 28, 437–457.

Coryn, C. L. S., Noakes, L. A., Westine, C. D., & Schröter, D. C. (2011). A systematic review of theory-driven evaluation practice from 1990 to 2009. American Journal of Evaluation, 32, 199–226.

Coryn, C. L. S., & Westine, C. D. (Eds.). (2015). Contemporary trends in evaluation research (Vols. I–IV). (Sage benchmarks in social research methods). London, England: Sage.

Cousins, J. B., & Leithwood, K. A. (1986). Current empirical research on evaluation utilization. Review of Educational Research, 56, 331–364.

Cullen, A. E., Coryn, C. L. S., & Rugh (2011). The politics and consequences of including stakeholders in international development evaluations. American Journal of Evaluation, 32, 345–361.

Davey, J. W., Gugiu, P. C., & Coryn, C. L. S. (2010). Quantitative methods for estimating the reliability of qualitative data. Journal of MultiDisciplinary Evaluation, 6, 140–162.

Davies & MacKay (2014). Evaluator training: Content and topic validation in university evaluation courses. American Journal of Evaluation, 35, 419–429.

Dillman, D. A., Smyth, J. D., & Christian, L. M. (2009). Internet, mail, and mixed-mode surveys: The tailored design method (3rd ed.). Hoboken, NJ: Wiley.

Fleischer, D. N., & Christie, C. A. (2009). Evaluation use: Results from a survey of U.S. American Evaluation Association members. American Journal of Evaluation, 30, 158–175.

Furnham (1986). Response bias, social desirability, and dissimulation. Personality and Individual Differences, 7, 385–400.

Gugiu, P. C., Coryn, C. L. S., & Applegate, E. B. (2010). Structure and measurement properties of the patient assessment of chronic illness care instrument. Journal of Evaluation in Clinical Practice, 16, 509–516.

Heberger, Christie, C. A., & Alkin (2010). Understanding the influence of evaluation on other disciplines: A bibliometric analysis. American Journal of Evaluation, 31, 24–44.

Henry, G. T., & Mark, M. M. (2003). Toward an agenda for research on evaluation. In C. A. Christie (Ed.), The practice-theory relationship in evaluation. New directions for evaluation (Vol. 97, pp. 69–80). San Francisco, CA: Jossey-Bass.

Johnson, Greenseid, L. O., Toal, S. A., King, J. A., Lawrenz, & Volkov (2009). Research on evaluation use: A review of the empirical literature from 1986 to 2005. American Journal of Evaluation, 30, 377–410.

LaVelle, J. M., & Donaldson, S. I. (2010). University-based evaluation training programs in the United States 1980–2008: An empirical examination. American Journal of Evaluation, 31, 9–23.

Mark, M. M. (2007). Building a better evidence base for evaluation theory: Beyond general calls to a framework of types of research on evaluation. In N. L. Smith & P. R. Brandon (Eds.), Fundamental issues in evaluation (pp. 111–134). New York, NY: Guilford.

Miller, R. L. (2010). Developing standards for empirical examinations of evaluation theory. American Journal of Evaluation, 31, 390–399.

Miller, R. L., & Campbell (2006). Taking stock of empowerment evaluation: An empirical review. American Journal of Evaluation, 27, 296–319.

Patton, M. Q., Grimes, P. S., Guthrie, K. M., Brennan, N. J., French, B. D., & Blyth, D. A. (1977). In search of impact: An analysis of the utilization of federal health evaluation research. In C. H. Weiss (Ed.), Using social research in public policy making (pp. 141–184). Lexington, MA: Lexington Books.

Reichardt, C. S. (2011). Evaluating methods for estimating program effects. American Journal of Evaluation, 32, 246–272.

Scriven (1991). Evaluation thesaurus (3rd ed.). Thousand Oaks, CA: Sage.

Seidling, M. B. (2015). Evaluator certification and credentialing revisited: A survey of American Evaluation Association members in the United States. In J. W. Altschuld & Engle (Eds.), Accreditation, certification, and credentialing: Relevant concerns for U.S. evaluators. New directions for evaluation (Vol. 145, pp. 87–102). San Francisco, CA: Jossey-Bass.

Shadish, W. R., Cook, T. D., & Leviton, L. C. (1991). Foundations of program evaluation: Theories of practice. Thousand Oaks, CA: Sage.

Smith, N. L. (1993). Improving evaluation theory through the empirical study of evaluation practice. Evaluation Practice, 14, 237–242.

St. Clair, Cook, T. D., & Hallberg (2014). Examining the internal validity and statistical precision of the comparative interrupted time series design by comparison with a randomized experiment. American Journal of Evaluation, 35, 311–327.

Stufflebeam, D. L., & Coryn, C. L. S. (2014). Evaluation theory, models, & applications (2nd ed.). San Francisco, CA: Jossey-Bass.

Szanyi, Azzam, & Galen (2013). Research on evaluation: A needs assessment. Canadian Journal of Program Evaluation, 27, 39–64.

Weiss, C. H. (1977). Research for policy’s sake: The enlightenment function of social research. Policy Analysis, 3, 531–545.

Weiss, C. H. (2004). Routine for evaluation: A Cliff’s Notes version of my work. In M. C. Alkin (Ed.), Evaluation roots: Tracing theorists’ views and influences (pp. 153–168). Thousand Oaks, CA: Sage.

Weiss, C. H., & Bucuvalas, M. J. (1980). Truth tests and utility tests: Decision makers’ frames of references for social science research. American Sociological Review, 45, 302–313.