Abstract
In view of corporate wrongdoings like Enron’s accounting fraud and Volkswagen’s emissions scandal, the need to prevent unethical decision-making in the business sector has become widely accepted. Human resource development is of high relevance in this regard: a multiplicity of companies utilizes ethics training programs to teach their managers and employees business ethics and to develop their ethical competences. However, knowledge about the efficacy of these training programs is still rather fragile. In the present study, we (a) develop a framework of relevant design categories to consider in creating ethics training programs; (b) consolidate empirical insights by reviewing 92 studies about the effectiveness of standalone business ethics training programs regarding their impact, dependent variable and measurement methods, design, and conceptual foundation; and (c) identify remaining research gaps and provide theoretical-conceptual considerations for further investigation.
Ethics training, and the implementation of ethics training programs in organizations, is viewed as an essential element in institutionalizing business ethics (Ardichvili & Jondle, 2009; de Colle & Werhane, 2008; Foote & Ruona, 2008; Warren et al., 2014; Weber, 2015). Such training aims to develop and strengthen ethical competences—also known as ethical capacities (cf. Hannah et al., 2011) and ethical skills (Falkenberg & Woiceshyn, 2008)—that positively influence the ethical behavior of organizational members and managers. The importance of these competences has become widely acknowledged, as their lack has been associated with the occurrence of corporate scandals caused through fraud or corruption that severely threatened the well-being of individuals, companies, and society at large (Schwartz, 2017). In addition, these competences may be beneficial to overcome new challenges and crises that companies face and employees must deal with (Fregonese et al., 2018; Langher et al., 2017). Wang et al. (2009) indicate that organizations utilize a variety of human resource development (HRD) interventions to improve their members’ competences for crisis management. Valentine et al. (2013) also state that HR practices can be useful in establishing and consolidating corporate ethics. Put differently, “ethics training is necessary” to develop the competences that help in avoiding or overcoming organizational crises and related challenges (Craft, 2010, p. 600).
Ethics training is a specific form of HRD. In this regard, the HR department plays an important role in the formation and implementation of such training (Foote & Ruona, 2008). As a consequence, “there is interest in the role of the HR specialist as a guardian of ethics” (Winstanley & Woodall, 2000, p. 7; see also Valentine et al., 2013). HR experts get involved in ethics particularly by developing and running ethics training programs along with communicating ethical guidelines or codes (Valentine et al., 2013; Winstanley & Woodall, 2000).
Although the general public assumes that ethics training is an effective means to strengthen business ethics, researchers still debate its efficacy in the workforce. To better understand the impact of ethics training on ethical skills, a comprehensive review of related studies is needed, and the present work provides one. As pointed out by Briner et al. (2009) and Kunisch et al. (2018), systematic reviews have become a key method to consolidate existing knowledge and to synthesize evidence-based results. In line with recommendations for well-established practices in compiling empirical literature reviews (Fisch & Block, 2018; Torraco, 2005; Van Wee & Banister, 2016), we provide three major contributions by (a) developing a framework of training design categories, (b) taking stock of empirical insights gathered from 92 studies about the effectiveness of business ethics training, and (c) identifying and discussing remaining research gaps and providing theoretical-conceptual considerations for further investigation.
Our review reveals that studies heavily rely on cognitive antecedents of ethical behavior (e.g., Kohlberg, 1969; Rest, 1979). More comprehensive frameworks to guide and substantiate empirical studies on ethics training effectiveness are just beginning to emerge with integrated theories (e.g., Schwartz, 2016, 2017) that go beyond traditional cognitive explanations for ethical behavior. Because the variables of interest and the measurement methods vary widely among studies, generalizability across studies is restricted. Additionally, our review results suggest that future studies will benefit from strengthening their methodological rigor, opening the black box of training design that is largely ignored in extant research, tackling the need for broader theoretical frameworks, and studying practitioners rather than the student samples that continue to prevail.
We structure our article as follows: First, we introduce an evaluation model and develop a theoretical framework of design categories to analyze ethics training programs. Then, we explain the procedure of our review and, more precisely, how relevant studies were identified, which criteria were employed to select studies for the review, and the characteristics that were analyzed in these studies. In the subsequent section, we provide an overview of the studies and the results drawn from our analyses. More specifically, we examine extant evidence regarding the different variables and measurements of training effectiveness and their conceptual foundations. Finally, we critically discuss these results and highlight remaining research gaps.
Theoretical Background
Evaluation of Training Programs
To assess the effectiveness of ethics training programs, an evaluation framework is needed. A widely used framework is Kirkpatrick’s evaluation model (Kirkpatrick & Kirkpatrick, 2006), which is viewed as a standard model in research on training and development (Alvarez et al., 2004; Blanchard et al., 2000; Grohmann & Kauffeld, 2013; Kennedy et al., 2013; Kong & Jacobs, 2012; Phillips & Phillips, 2001; Rodriguez & Armellini, 2013). This model is the most established and, consequently, the most widely utilized evaluation framework in organizations (Kong & Jacobs, 2012), and training experts frequently use this model to evaluate training interventions (Kennedy et al., 2013). Kirkpatrick’s model is widely accepted due to, inter alia, its simplicity and ease of use (Alliger & Janak, 1989; Alvarez et al., 2004; Grohmann & Kauffeld, 2013; Khasawneh & Al-Zawahreh, 2015; Liebermann & Hoffmann, 2008). Many other evaluation models are built on Kirkpatrick’s framework (Kong & Jacobs, 2012).
Despite its simplicity, the evaluation model is comprehensive and does not only focus on the measurement of learning effects. To minimize biased conclusions from the evaluation of only one subset of training outcomes, Kirkpatrick’s model covers various effects of the training and includes the subjective satisfaction of training participants, the learning effects, their transfer into practice, and the benefits of this application for the organization (Galloway, 2007). More precisely, Kirkpatrick’s evaluation model consists of four levels to evaluate the program effectiveness: reaction, learning, behavior, and results (Kirkpatrick & Kirkpatrick, 2006).
(1) “Reaction” is a measure of user satisfaction and captures how training participants respond to the training program. The aim is for participants to react favorably to the training program, as this tends to foster their motivation to learn, develop, and grow. This evaluation level can be assessed with questionnaires that are applicable to a wide variety of training programs (Grohmann & Kauffeld, 2013). More specifically, “happiness sheets” have been proposed to gauge the degree of favorableness to which participants react to the training program (Kirkpatrick & Kirkpatrick, 2006, p. 27).
(2) “Learning” indicates the extent to which learners change attitudes, enhance knowledge, and/or improve skills through the training program. Suitable procedures for determining learning must measure attitudes, knowledge, and skills both before and after the training program and use a control group to identify the training effectiveness (Kirkpatrick & Kirkpatrick, 2006).
(3) The third level, “behavior,” refers to the change in behavior of participants that arises after the training program. Hence, this level reflects how training contents are utilized in practice (Grohmann & Kauffeld, 2013). To measure the degree to which the change of attitudes, knowledge, and skills has a behavioral impact (also known as the “transfer” of the training), common procedures include surveys of and/or interviews with training participants, supervisors, subordinates, and other individuals who are able to observe the behavior of participants before the training program and after a reasonable time for transfer (Kirkpatrick & Kirkpatrick, 2006). Experimental designs must again include an adequate control group.
(4) “Results” measure the outcomes that occur after the training attendance, which include, for instance, increased productivity, decreased costs, higher sales, and fewer accidents. Although these outcomes appear to be easily measurable, this is the most challenging part of the evaluation process, as it is difficult to precisely identify to what degree these results have been induced by the training program rather than another determinant (Kirkpatrick & Kirkpatrick, 2006). Stokking (1996) confirms that the evaluation of training programs “is notoriously difficult because . . . training is not the only relevant causal factor” (p. 179).
Evaluating the effectiveness of training programs poses ethical challenges that must be appropriately dealt with. In this regard, the American Evaluation Association (AEA, 2018) has proposed some principles to guide the professional ethical conduct of evaluators. According to these principles, evaluators should accurately communicate the applied methods and approaches, possess and maintain the ability to carry out competent evaluation practices, disclose potential conflicts of interest, respect and honor the dignity and well-being of individuals, and acknowledge the common good. In short, the evaluation should be conducted systematically and with integrity (AEA, 2018). In addition, the Academy of Human Resource Development (2018) provides ethical standards for HRD professionals in practice, research, consulting, and teaching. Six general principles should guide HRD professionals: they should (a) know the limitations of their expertise (competence), (b) be honest, fair, and respectful to others (integrity), (c) clarify their professional role and accept responsibility for their behavior (professional responsibility), (d) respect people’s rights and dignity, (e) contribute to others’ welfare, and (f) be aware of their professional responsibility to the community.
Dimensions to Analyze Extant Evidence on the Effectiveness of Ethics Training
Building on Kirkpatrick’s model, we analyzed extant evidence on the effectiveness of ethics training programs. As shown in more detail next, the impact of training programs may refer to different variables and measurements utilized in prior empirical studies. Because ethics training effects may depend on the specific design of the training programs, we also examined information in this regard. Furthermore, to reflect the underlying mechanisms that cause specific effects to arise or fail to appear, we highlighted the theoretical foundation used to substantiate the training effectiveness. Following prior research and conceptual considerations, we, therefore, employed the following dimensions and characteristics to guide our in-depth analysis of articles that are included in our review:
Impact of training
We determined if the studies indicated no effect or a positive, mixed, or negative impact from ethics training on ethical behavior or associated variables (such as the perceptions of the ethical climate within the organization).
Dependent variable and measurement
Ethics training programs strive to strengthen the ethical behavior of participants. Ethical behavior as the dependent variable of empirical effectiveness studies can be captured either broadly or narrowly and be measured in a variety of ways. In line with related review articles (Craft, 2013; O’Fallon & Butterfield, 2005; Treviño et al., 2006), we employed the more general framework of the four components proposed by Rest (1979) to guide our analysis of the empirical evidence on ethics training effectiveness. Rest’s (1979) model suggests that ethical behavior consists of the four components (or processes) of awareness, judgment, intention, and action. Consequently, ethical behavior consists of the behavioral processes used to perceive the ethicality of a specific situation, justify the superiority of a moral course of action, develop the motivation to choose the moral course of action, and have the endurance and ego-strength to act accordingly. Various individual characteristics, cognitive as well as non-cognitive, may influence the ability of an organizational member to pursue ethical sensitivity, judgment, motivation, and/or action. We identified the components covered in the reviewed studies and the instruments used to measure the effects. Furthermore, we designated if the impact of ethics training was measured on practitioners or students. Ethical awareness, judgment, and intent broadly refer to the second level of Kirkpatrick’s model, whereas ethical action captures the third level—the applied contents in practice (transfer).
Information about the training design
We searched for information about, and coded, three generic categories that are essential for the design of ethics training: (a) the learning approach, (b) the number of different training methods, and (c) the duration of the training (see Figure 1). These categories are derived from theoretical frameworks and studies (explained in detail as follows) and are expected to influence how participants are involved in and react to the training program.

Figure 1. Categories of Ethics Training Designs.
First, ethics training can employ different learning approaches. The learning approach determines whether the learning process maintains the active or passive involvement of participants. Thus, we distinguished an active learning approach and a passive learning approach of ethics training. Moreover, active and passive involvement of participants can be combined. An active learning approach rests on learning activities that require the active involvement of participants, such as developing solutions to ethical scenarios, in which participants must make decisions about ethical issues. A passive learning approach, in contrast, does not require the active engagement of participants, whose role may be satisfied by listening to a lecture or watching a video. Active learning approaches can furthermore differ on whether they require social interaction. Social interaction as a subcategory of active learning approaches occurs if there is some form of exchange (e.g., of ideas or opinions) between participants and/or participants and instructors. Group discussions of ethical issues, for instance, represent an active approach with social interaction, whereas choosing potential courses of action to respond to a given ethical scenario requires participant action but no social interaction.
The underlying theories of this first design category are the constructivist and socio-constructivist learning theories, as well as theoretical considerations of leading developmental psychologists such as Piaget (1970) and Blatt and Kohlberg (1975). Constructivist theories share the basic assumption that learners actively construct knowledge (Loyens & Gijbels, 2008). Learners activate and use knowledge that they already have to interpret new information. Thus, learning is not only information acquisition but also knowledge construction and transformation. The second assumption is that cooperative learning (via social interaction) contributes to this knowledge construction (Loyens & Gijbels, 2008). A representative of the constructivist and socio-constructivist learning theories is Vygotsky’s (1987) sociocultural theory of cognitive development. In line with the assumption of cooperative learning, Vygotsky (1987) suggests that individuals benefit from interaction and information exchange to build competences because all information is context- and culture-dependent. Social exchange therefore supports the learner in constructing his or her own knowledge bases. Similarly, Piaget (1970) claims a rather active role of learners. Knowledge does not result from a merely passive process by copying objective information. The learner must act upon—or “displace, connect, combine, take apart and reassemble”—objects to extend his or her knowledge (Piaget, 1970, p. 704). Based on the notions of the constructivist learning theory, Blatt and Kohlberg (1975) also recommend interaction through group discussion in ethical training, and the moral psychologist Lind (2016) emphasizes active and interactive methods to promote ethical competences.
The empirical insights from the general field of learning suggest that cooperative learning through social interaction tends to be effective and superior to individual learning (Hattie, 2009). These findings also support the strength of involving peers in (joint) learning situations (Hattie, 2009). The important role of social interactive approaches to promote learning is moreover evident in the well-established research field of instruction design. For instance, Gagné et al. (2005) suggest employing a collaborative learning environment in which learners work (and learn) together. Within the field of HRD, prior research has indicated that social interaction could be useful for learning (Martin et al., 2014), developing skills, and changing the attitudes of organizational members, whereas a passive learning approach could be sufficient to present information for knowledge acquisition (Carter, 2002; Heneman et al., 1989).
Second, ethics training programs can rest on different training methods like listening to lectures, discussing case studies, and watching videos. Martin et al. (2014) provide a review of well-established and widely used training methods in HRD that include lectures, case studies, team training, and role play. A training method is a didactic preparation of content aimed at enabling and supporting learning in an effective, efficient, and sustainable way (Meyer & Thielsch, 2017). There is a variety of training methods that can be assigned to two forms: the employed social form (who with whom?) and the applied task form (such as discussions, lectures, and role play) (Meyer, 2011; see also Cole, 1981, for a similar classification of training methods). Social forms can be traditional classroom teaching, individual work, partner work, and group work. Task forms include activities like listening to lectures, giving presentations, and engaging in discussions. Training methods may incorporate various forms that could also be combined. For instance, group work could include tasks that should be solved together by searching databases and collecting information in conjunction with discussing a story. These are two different training methods with the same social form (group work) but diverse tasks. Accordingly, various means can be utilized to transmit knowledge, open new lines of thought, and support the learning process of participants. In this regard, ethics training programs can be distinguished between those that include only a single training method (such as exclusively listening to lectures) and those that integrate a variety of developmental means (such as listening to lectures, watching videos, and discussing ethical scenarios).
The underlying theoretical framework of this second design category refers to learning style models. These models suggest that learners differ in the way they learn best and thereby develop individual learning styles. Learning style models differentiate, for instance, visual and verbal learners (Riding & Rayner, 1998; Willingham et al., 2015). Another example is Kolb’s (1984) model of four basic learning styles (for an overview of other models, see Hawk & Shah, 2007). Despite their differences, learning style models have two basic notions in common. First, a learning style is a consistent characteristic of an individual that remains stable across various situations. Second, learning is more effective when the individual’s preferred style is applied (Willingham et al., 2015). To reach as many kinds of learners as possible, learning style models suggest utilizing multiple training methods. This is also in line with the recommendations of Weber (2007), who suggests that ethics training should use different pedagogical activities and be sufficiently flexible to match the participants’ learning styles.
Third, ethics training programs can vary in duration. This design category pertains to several facets, such as the frequency of training sessions (i.e., only one session, two, or several), their length (e.g., 15 minutes or 3 hours), and their temporal distribution in the case of multiple training sessions (that could be held, for instance, within a week or within a year).
The underlying considerations of this design category are learning principles, which are proposed in the well-established instruction design literature (Gagné et al., 2005; Magliaro et al., 2005; Reigeluth & Carr-Chellman, 2009). The instruction principle “repetition” suggests that learners need repeated practice to improve their competences and to become more confident in using them (Gagné et al., 2005). Lind (2015) suggests that ethical training is comparable to muscle training and therefore requires time, patience, and regular repetition. Furthermore, empirical evidence indicates that spaced practice (temporal distribution and/or repetition of several training sessions) tends to be superior to massed practice (concentrated in a single training session) (Hattie, 2009). Massed practice is training in which participants perform a task continuously without interruption (like a block seminar), whereas spaced practice is training in which participants are given rest intervals, or intermittent pauses, within and between the sessions (Donovan & Radosevich, 1999), such as when training takes place once a week for 10 consecutive weeks (Goldstein & Ford, 2002; Kauffeld & Lehmann-Willenbrock, 2010). In general, the acquisition and retention of knowledge, skills, and competences tend to be enhanced by spaced practice (Dempster, 1988; Hattie, 2009). The efficacy of spacing length is related to the complexity and novelty of the tasks. Learners may benefit from increased spacing time when the task is more complex and demanding (Hattie, 2009).
Conceptual substantiation
Regarding the disclosed conceptual substantiation of the reviewed studies, we determined if authors made a theoretical framework for hypotheses formulation explicit (i.e., provided a precisely developed theoretical framework), outlined a theoretical reference (i.e., referred to theoretical substantiation without explaining in more depth how they used the theory for the development of their study and the formulation of their hypotheses), or relied only on prior empirical research to develop and substantiate their own investigation (refraining from any disclosed explanation of their theoretical underpinning).
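The four analysis dimensions described above can be pictured as one coding record per reviewed study. The following is a minimal sketch under stated assumptions, not the coding instrument actually used in the review: all class, field, and value names are hypothetical, and the mapping of Rest's components to Kirkpatrick's levels follows only the broad correspondence noted earlier (awareness, judgment, and intent at level 2; action at level 3).

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, List, Optional


class Impact(Enum):
    POSITIVE = "positive"
    MIXED = "mixed"
    NO_EFFECT = "no effect"
    NEGATIVE = "negative"


class Component(Enum):
    # Rest's (1979) four components of ethical behavior
    AWARENESS = "awareness"
    JUDGMENT = "judgment"
    INTENT = "intent"
    ACTION = "action"


class LearningApproach(Enum):
    ACTIVE = "active"
    PASSIVE = "passive"
    COMBINED = "combined"


class Substantiation(Enum):
    EXPLICIT_FRAMEWORK = "explicit theoretical framework"
    THEORETICAL_REFERENCE = "theoretical reference only"
    EMPIRICAL_ONLY = "prior empirical research only"


# Broad correspondence between Rest's components and Kirkpatrick's levels.
KIRKPATRICK_LEVEL = {
    Component.AWARENESS: 2,
    Component.JUDGMENT: 2,
    Component.INTENT: 2,
    Component.ACTION: 3,
}


@dataclass(frozen=True)
class CodedStudy:
    """One coded study; design fields stay None when a study reports nothing."""
    label: str
    impact: Impact
    components: FrozenSet[Component]
    practitioner_sample: bool
    substantiation: Substantiation
    learning_approach: Optional[LearningApproach] = None
    n_training_methods: Optional[int] = None
    duration: Optional[str] = None

    def kirkpatrick_levels(self) -> List[int]:
        """Kirkpatrick levels touched by the study's dependent variables."""
        return sorted({KIRKPATRICK_LEVEL[c] for c in self.components})


# Hypothetical example record (not one of the reviewed studies).
example = CodedStudy(
    label="Hypothetical study A",
    impact=Impact.MIXED,
    components=frozenset({Component.JUDGMENT, Component.ACTION}),
    practitioner_sample=True,
    substantiation=Substantiation.THEORETICAL_REFERENCE,
    learning_approach=LearningApproach.COMBINED,
    n_training_methods=3,
    duration="10 weekly sessions",
)
print(example.kirkpatrick_levels())
```

Keeping the design fields optional mirrors the fact that a study may disclose nothing about the training design while still being coded on the other dimensions.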
Method
Procedure to Identify Potentially Relevant Empirical Studies
To conduct a comprehensive study (and to update and go beyond previous reviews), we used the following sources to identify the relevant articles for our analysis. First, we used the search engines of the EBSCO, PsycARTICLES, and Web of Science databases. In addition, individual publisher databases were searched, including those of Emerald, Sage, SpringerLink, Taylor & Francis, and Wiley. Moreover, we manually searched the following major journals within the field of business ethics and HRD: Advances in Developing Human Resources, Business and Society, Business Ethics: A European Review, Business Ethics Quarterly, European Journal of Training and Development, Human Resource Development International, Human Resource Development Quarterly, Human Resource Development Review, International Journal of Training and Development, and Journal of Business Ethics. This selection of search engines, databases, and journals is supported by previous reviews (e.g., Craft, 2013; O’Fallon & Butterfield, 2005) and various rankings of top journals in business ethics (e.g., Albrecht et al., 2010) and HRD (e.g., Seo et al., 2019; Wang et al., 2012). Our systematic search, which also covered cited references, resulted in more than 200 studies of potential relevance to be considered in our review.
Procedure to Select Studies to Be Included in the Review
Ethics training is an instrument of HRD that includes all educational and developmental methods intended to implement and promote ethical behavior within organizations (Ruiz et al., 2015; Treviño, 1992; Weber, 2015). In addition to ethics training, there are various terms for ethics education instruments such as (pervasive vs. standalone) business and society courses (Wynd & Mager, 1989), ethics courses (Sorensen et al., 2017), ethics instruction (Waples et al., 2009), and ethics intervention (LaGrone et al., 1996). We subsume all these educational and developmental courses, instructions, and interventions under the term “ethics training.”
We employed four criteria to select the studies to be included in our review:
(1) The subject of the study is an empirical investigation of the effectiveness of ethics training. Because our focus is on empirical studies, we dropped purely theoretical, conceptual, and prescriptive papers.
(2) The empirical studies are about business ethics training. For this reason, we dropped empirical studies on ethics training without reference to the business sector, such as medical ethics training.
(3) Business ethics training participants are adults who are practitioners or students (e.g., adult students at universities). We therefore dropped all empirical studies on the ethics education of adolescents and children.
(4) The study examines the impact of standalone business ethics training. Thus, we dropped studies about pervasive approaches such as holistic ethics programs, as well as empirical investigations of the impact of overall business education. We focused on standalone training because pervasive approaches make it more difficult to clearly determine the factors and components that caused the observed outcomes.
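The four selection criteria act as a conjunctive filter: a candidate study enters the review only if it meets all of them. The sketch below illustrates this logic; the class, field, and function names are hypothetical, and the two candidate records are invented for illustration rather than taken from the search results.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CandidateStudy:
    label: str
    is_empirical: bool              # criterion 1
    concerns_business_ethics: bool  # criterion 2
    adult_participants: bool        # criterion 3
    standalone_training: bool       # criterion 4


def unmet_criteria(study: CandidateStudy) -> List[int]:
    """Return the numbers of the selection criteria the study fails."""
    checks = [
        (1, study.is_empirical),
        (2, study.concerns_business_ethics),
        (3, study.adult_participants),
        (4, study.standalone_training),
    ]
    return [number for number, met in checks if not met]


def is_included(study: CandidateStudy) -> bool:
    """A study is included only when no criterion is unmet."""
    return not unmet_criteria(study)


# Invented candidates for illustration only.
medical = CandidateStudy("Hypothetical medical ethics course", True, False, True, True)
standalone = CandidateStudy("Hypothetical business ethics seminar", True, True, True, True)

print(unmet_criteria(medical), is_included(medical))
print(unmet_criteria(standalone), is_included(standalone))
```

Returning the failed criterion numbers, rather than a bare yes or no, makes it possible to document why each excluded study was dropped.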
Following this selection process, our final sample consists of 92 studies. This sample size exceeds previous empirical reviews and meta-analyses of ethics training (Craft, 2013: 2 relevant studies on ethics training; O’Fallon & Butterfield, 2005: 3 studies; Waples et al., 2009: 25 studies; Weber, 1990: 4 studies) not only because of the longer timeframe but also because of the more comprehensive inclusion of relevant studies.
Results
General Overview of Studies
Out of the 92 studies included in our review, 1 was published in the 1970s, 8 in the 1980s, 32 in the 1990s, 34 in the 2000s, and 17 between 2010 and 2017. This indicates a peak of publications in the 2000s. The highest number of publications in a single year was 7 in 2006, followed by 6 studies each in 1991, 2000, and 2017.
Major Findings
In this section, we present the major results of our investigation regarding the impact of ethics training, the dependent variable and its measurement, the provided information about the training design under study, and the conceptual foundation of the articles. Table 1 provides a summary of the major findings.
Table 1. Summary of Major Findings.
Impact of ethics training
We determined if the reviewed studies found no impact or positive, mixed, or negative effects of ethics training on ethical behavior. Table 2 displays the results of the studies regarding the impact of ethics training.
Table 2. Extant Evidence on the Impact of Ethics Training on Ethical Behavior.
Note. N/A = not applicable.
Sixty-five of the 92 studies (71%) demonstrated a positive impact, 15 studies (16%) showed mixed results, 11 studies (12%) did not observe any (statistically significant) impact, and only 1 study (1%), published in 1998, indicated a negative impact as a result of ethics training.
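The reported shares follow directly from the study counts. As a quick arithmetic check, the sketch below recomputes them; the variable names are illustrative, and the counts are those stated in the text.

```python
# Study counts per impact category, as reported in the text.
impact_counts = {"positive": 65, "mixed": 15, "no impact": 11, "negative": 1}

total = sum(impact_counts.values())

# Share of each category, rounded to whole percentages.
shares = {category: round(100 * count / total)
          for category, count in impact_counts.items()}

print(total)
print(shares)
```

The four categories are exhaustive and mutually exclusive, so the counts add up to the full sample of 92 studies.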
The proportion of studies that found a positive impact from ethics training rose from 63% in the 1980s to 88% in the 2010s. Correspondingly, the proportion of studies that found no effect from ethics training declined from 25% in the 1980s to 6% in the 2010s.
Sample sizes to estimate the impact of ethics training ranged greatly from 19 to 17,248 subjects (Raile, 2013) with an average sample size of 430. After excluding the extraordinarily large sample of Raile (2013), the average sample size was 241.
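The two reported averages show how strongly a single large sample can shift the mean. The text does not state how many studies reported a sample size, but that number can be estimated from the two means. The sketch below does this under the assumption that both means were rounded to whole numbers, so the result is an approximation rather than a reported figure; the function name is hypothetical.

```python
def implied_study_count(mean_all: float, mean_without: float, outlier: float) -> float:
    """Solve mean_all * n - outlier = mean_without * (n - 1) for n."""
    return (outlier - mean_without) / (mean_all - mean_without)


# Values reported in the text; the outlier is the Raile (2013) sample.
n_estimate = implied_study_count(mean_all=430, mean_without=241, outlier=17248)
print(round(n_estimate, 2))

# Forward check: with about 90 reporting studies, dropping the outlier
# should bring the mean back to roughly the reported 241.
n = round(n_estimate)
mean_after_exclusion = (430 * n - 17248) / (n - 1)
print(round(mean_after_exclusion))
```

Under these assumptions, the two averages are consistent with roughly 90 of the 92 studies having reported their sample size.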
Dependent variable and measurement
Ninety-one studies measured the effectiveness of ethics training regarding its impact on a component of ethical behavior (such as ethical judgment or ethical intent) from Rest’s (1979) four-component model. One study referred to perceptions of ethical climate within the organization as the dependent variable. Comparing the reported effects is challenging because different components of ethical behavior were studied, and they were measured in divergent ways. Including combinations, prior studies investigated ethical awareness (23 out of 92 studies), ethical judgment (53), ethical intent (18), and ethical action (13). Only one study, as already mentioned, examined the impact of ethics training on perceptions of ethical climate (for in-depth information about ethics training and ethical climate in human resource management, see Wells & Schminke, 2001). Consequently, the most commonly examined component of ethical behavior, and its corresponding training effects, was ethical judgment. This is in line with the predominant conceptual foundation based on theories of moral development and reasoning, as indicated later. Studies that investigated ethical action tended to publish positive or mixed effects of training rather than negative effects or no impact.
Seventeen studies examined not just one component of ethical behavior but a combination of two processes (such as ethical awareness and ethical judgment). With the exception of three articles, all of these studies covered ethical judgment. None of the studies investigated more than two processes or all four components of ethical behavior. Five of these 17 studies found mixed effects from ethics training.
Regarding the whole sample of 92 studies, 2 studies did not provide information about their measurement methods. The most frequently utilized measurement instrument was the defining issues test (DIT) (Rest, 1979; Rest et al., 1986), which was applied in 25 studies. This frequency follows from the prevalent reference to theories of moral development and the wide use of ethical judgment as the dependent variable. Two studies used the DIT partly to develop a new or revised instrument. Remarkably, 29 studies applied completely self-developed instruments and 33 studies utilized partly self-developed measurement methods (i.e., instruments that revised or supplemented existing ones). Consequently, most of the studies relied on (partly) self-developed instruments. Other measurement instruments were also utilized, but not frequently, such as Clark’s (1966) personal business ethics score, Rokeach’s (1973) value list, and Harris’s (1990-1991) questionnaire on ethical beliefs regarding questionable business practices. Due to the variety of measurement instruments that were frequently, at least in part, self-developed, competing findings cannot be traced systematically to the application of a specific method. The authors of (partly) self-developed instruments provided few or no insights into the development of these instruments and the substantiation of their validity.
Information about the training design
Many studies addressed only the general effectiveness of training programs and ignored their specific design. More precisely, 36 studies (39%) revealed no details about the design of the studied training program. However, 56 studies (61%) gave at least some information about the training design (sometimes rather limited, such as the duration but no information about the learning activities).
Only 5 out of 11 studies (45%) that concluded that ethics training had no impact provided information about the design of the studied training program. Notably, the one study that indicated a negative impact of training gave no information about the training design. In contrast, 42 out of 65 studies (65%) that indicated a positive impact disclosed information about the training design. Nine out of 15 studies (60%) with mixed effects gave information about the training design.
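The disclosure rates above can be recomputed from the reported counts. The following is a minimal sketch rather than part of the original analysis; the variable and function names are illustrative assumptions, and only the counts are taken from the text.

```python
# Counts taken from the text: (studies in category, studies disclosing design details).
# The dictionary layout and all names are illustrative assumptions.
impact_counts = {
    "positive": (65, 42),
    "mixed": (15, 9),
    "no impact": (11, 5),
    "negative": (1, 0),
}


def disclosure_rate(total, disclosed):
    """Percentage of studies in a category that described the training design."""
    return round(100 * disclosed / total)


# Totals across all impact categories
total_studies = sum(total for total, _ in impact_counts.values())
total_disclosed = sum(disclosed for _, disclosed in impact_counts.values())

# Disclosure rate per impact category
rates = {category: disclosure_rate(*counts) for category, counts in impact_counts.items()}

for category, rate in rates.items():
    print(f"{category}: {rate}%")
print(f"overall: {total_disclosed} of {total_studies} "
      f"({disclosure_rate(total_studies, total_disclosed)}%)")
```

The category totals sum to the 92 reviewed studies, and the disclosing studies sum to the 56 studies (61%) that gave at least some design information.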
Employing the dimensions outlined in Figure 1, we analyzed the information about the learning approach, the number of employed training methods, and the duration of the studied training programs. Of the 56 studies with information about the training design, 50 referred to an active learning approach (17 studies) or a combined (active and passive) learning approach (33 studies). Only one study employed a strictly passive learning approach without any interaction (Green & Weber, 1997). Studies with an active learning approach observed positive and mixed effects. Results showing no impact were given only in studies using a combined learning approach.
Many studies stressed the focus on active methods, and the most common activity (41 studies) was the discussion of ethical issues (e.g., ethical dilemmas and case studies). Accordingly, interaction among participants occurred in many training programs (43 studies). Dilemma discussions and an interactive approach were also suggested to enhance moral development according to Kohlberg’s cognitive moral development theory (Blatt & Kohlberg, 1975).
Training can consist of one, two to three, or several different training methods. Nine studies used one training method, 26 studies used two or three methods, and 18 studies used four or more methods; no study used more than six methods. Regarding the impact of training in our review, the findings are inconclusive on the need for a combination of multiple training methods and on the optimal combination of activities for improving ethical behavior. Combining various training methods may, however, address individual learning styles that differ from person to person (Weber, 2007).
Statements about the duration of ethics training also varied widely and made it difficult to gather conclusions on this design category. The duration of academic ethics courses during a semester ranged from 5 to 17 weeks. Regarding the whole sample of reviewed studies, there were also shorter sessions, such as a half-day workshop. Thus, the length and the frequency of single training sessions differed greatly. Both short and long training programs were associated with positive effects, mixed results, and no impact. The optimal length of an ethics intervention can therefore not be substantiated on extant evidence because of the variety of statements and the missing or limited information about this design dimension.
Conceptual substantiation
Fifty studies explicitly disclosed their theoretical framework, 16 studies gave at least a theoretical reference, 24 studies made no theoretical framework explicit and gave no theoretical reference but provided an empirical literature review as a starting point for their investigation, and two studies lacked any information about their substantiation. There is a great diversity of conceptual foundations. The 66 studies that used an explicit framework or gave a theoretical reference employed 41 different theories, some of which were used in combination. In 20 studies, more than one theory was utilized.
The most commonly used theory is Kohlberg’s (1969) theory of cognitive moral development (also known as CMD theory or moral stage theory), which elaborates on Piaget’s (1950) more general work on cognitive development. In sum, 45 studies used Kohlberg’s theory: 34 studies with an explicitly outlined theoretical framework and 11 studies with a theoretical reference. Thirty of the 45 studies combined Kohlberg’s theory with another theory, such as Rest’s (1979) work on moral development or—particularly to scrutinize potential gender differences—Gilligan’s (1982) ethics of care.
Regarding the variety of (41 different) theories, only a few were employed in more than one study. These theories include Rest’s (1979) four-component model of ethical behavior (applied in 23 studies), Gilligan’s (1982) ethics of care (nine studies), Treviño’s (1986) person-situation interactionist model (seven studies), Piaget’s (1950) theory on cognitive development (five studies), Jones’s (1991) issue-contingent model (four studies), Bandura’s (1976) social learning theory (three studies), Ajzen’s (1991) theory of planned behavior (two studies), and Wotruba’s (1993) ethical decision action process (two studies).
Further In-Depth Comparisons
Type of participating subjects
Empirical business ethics studies vary regarding the participating subjects, who are generally either students or practitioners. Of the 92 reviewed studies, 65 (71%) measured the effects of ethics training on students. The results of 22 studies (24%) relied on practitioners. Four studies used a mixed sample (consisting of students and practitioners). One study did not disclose sufficient information about the type of subjects. The findings according to each dimension are summarized in Table 3 and explained next.
Table 3. In-Depth Comparison: Type of Participating Subjects.
Impact
In both student and practitioner studies, the impact of ethics training tended to be positive (71% of 65 student studies and 73% of 22 practitioner studies). Regarding mixed results and those showing no impact, the studies with students as subjects reported no effects from training more often than those with practitioners, whereas practitioner studies observed more mixed results in comparison. This finding is in line with the conclusion by O’Fallon and Butterfield (2005) from 41 empirical business ethics studies that students tended to indicate less ethical responses than practitioners (see also Craft, 2013).
Dependent variable and measurement
The studies based on student samples most frequently examined ethical judgment (40 studies), followed by ethical awareness (18) and ethical intent (13). Ethical action was addressed in three studies. In the practitioner studies, the most frequently examined variables were ethical judgment (10 studies) and ethical action (9). Ethical awareness and ethical intent were covered in four studies each. One study examined the perceptions of ethical climate. Therefore, the behavioral component of ethical action was more frequently addressed in practitioner rather than student studies (41% compared to 5%).
Both groups of studies primarily measured the dependent variable using (partly) self-developed instruments (66% of student studies and 82% of practitioner studies). The student studies used the DIT to measure the dependent variable more often than practitioner studies (34% vs. 18%). This may also follow from the comparably stronger focus on ethical judgment as the more frequently scrutinized dependent variable in studies based on student samples.
Information about the training design
Student studies provided more information about the training design than practitioner studies (69% vs. 45%). In both groups, studies with positive effects were more likely to disclose training design information than those with mixed or negative effects or showing no impact. There were 45 student studies with information about the training design (35 of which indicated positive effects) and 10 practitioner studies with information about the training design (6 of which indicated positive effects).
Conceptual foundation
Practitioner studies more often employed an explicitly explained theoretical foundation (64% compared to 52% of the student studies). However, after aggregating the explicit foundations and theoretical references, these differences leveled out (72% of 65 and 73% of 22 studies, respectively).
Type of conceptual foundation
As previously indicated, the reviewed studies differed in the explicitness of their conceptual foundation. In just over half of the studies, a theoretical foundation was explicitly used and disclosed in sufficient detail (54%). About a sixth (17%) of the studies gave at least a theoretical reference. Almost one-third of the studies did not make any theory explicit (28%).
Impact
Studies with an explicit theoretical substantiation indicated a positive impact from ethics training more often than studies that merely provided a theoretical reference or did not make the use of a theory explicit (75% vs. 62% vs. 65%, respectively).
Dependent variable and measurement
Studies with an explicit theoretical framework commonly used ethical judgment as the dependent variable (36 out of 50 studies). Nineteen of these studies used the DIT to measure the dependent variable. The studies with a theoretical reference also often used ethical judgment as the dependent variable (8 out of 16 studies). The studies without an explicit theory or reference examined ethical awareness about as much as judgment and intent. Ethical action, regardless of whether a theory was made explicit, was the least examined dependent variable.
Information about the training design
Sixty-eight percent of the studies with an explicit theoretical framework and 69% of the studies with a theoretical reference provided information about the design. In contrast, only 42% of the studies without any explicit theory made corresponding disclosures.
Discussion
In this section, we discuss the findings and summarize gaps in and critique the extant knowledge as outlined in Table 4.
Table 4. Overview of Gaps and Critique.
Conceptual Substantiation
Many studies gave no explicit theoretical substantiation for hypotheses formulation. The studies with a theoretical foundation often used Kohlberg’s CMD theory, sometimes linked with the cognitive development approach of Piaget (1950) and the theoretical concepts of Rest (1979) and Gilligan (1982). Kohlberg’s work is well-established and remains a major theory used to substantiate studies about training effects, especially on ethical judgment.
Kohlberg’s (1969) theory posits that the moral judgment of individuals develops over time through an invariant and irreversible sequence of six stages. The stages indicate that individuals develop their moral judgment with an increasing justice orientation. Therefore, justice is the principle that raises one stage above another in moral reasoning. The higher the position of individuals within the stages, the more they tend to substantiate their solutions to moral dilemmas, which indicates—according to this stream of theory—an increasing ability to judge ethically. The first stages are characterized by a person’s orientation toward punishment, rewards, and relations to family and peers. The higher stages are characterized by obedience to legitimate authorities and, finally, universal values of justice (cf. Dellaportas, 2006; Kohlberg, 1967).
Like any other theory, Kohlberg’s approach has been criticized, particularly due to its conception of justice and its rather narrow focus on cognitive processes (e.g., Pritchard, 1984). More recent theories therefore attempt to pursue a broader view and to also cover non-cognitive influences. Such theories incorporate a broader set of variables that may influence ethical behavior, including the specific situation, individual moral capacities, gender differences, and non-cognitive-based factors such as emotions (e.g., Reynolds, 2006; Schwartz, 2016, 2017; Zollo et al., 2017). However, these models were seldom used in the reviewed articles. Nonetheless, alternative theoretical foundations tend to be on the rise. Comparing older (1970–1999) and newer (2000–2017) studies revealed that the usage of Kohlberg’s theory declined over time (88% vs. 56%), whereas the variety of theories increased (14 vs. 34 different theories). In the 1980s, all studies (100%) with an explicit theoretical framework or a theoretical reference used Kohlberg’s theory. Of the studies published in the 1990s, 76% were developed based on this theory. More recent years show a sharper decline, with 59% of the studies in the 2000s and 57% in the 2010s referring to Kohlberg’s theory.
The rise of alternative theoretical approaches may indicate that the concepts of cognitive development are being challenged by theories with additional factors that explain and predict ethical behavior. One recently proposed comprehensive framework is Schwartz’s (2016) integrated approach of rationalist-based (CMD theories) and non-rationalist-based theories. Within this framework, Schwartz (2016) combined cognitive factors with individual factors, the situational context, and mental processes such as emotion and intuition. Zollo et al. (2017) investigated the interplay between (rational) moral reasoning and moral intuition, as prior studies indicated that ethical decision-making may not always be deliberate and reasoned (Reynolds, 2006) and may follow sensemaking intuition (Sonenshein, 2007). It is noteworthy that classical ethical theories such as deontology and consequentialism were not referred to in the reviewed studies. Rather, theoretical foundations from learning and moral psychology, such as the CMD theory, dominate extant studies. Further research could also be built on classical ethical theories and utilize these to study ethics training programs.
Assessment of Ethics Training Effectiveness and Design
The measured effects were not completely conclusive, but most of the studies revealed positive consequences of ethics training. Only one study (published in 1998) indicated a negative impact of ethics training. Eleven studies found no impact. Competing effects may also stem from differences in measurement and in the rigor of the empirical investigations, which we discuss in the next subsection.
In addition, many studies, especially those that were unable to prove any effect, did not give a precise description of the design of the training. About 65% of the positive-impact studies and 60% of the studies with mixed results provided information about the design (for an example of comparably clear training information, see Loe & Weeks, 2000), but only 45% of the studies that measured no impact revealed their underlying training concepts.
Regarding training design, it appears reasonable to suggest that different components of ethical behavior may need different training-based stimulations. In this regard, some studies that examined multiple components of ethical behavior showed competing effects depending on which component was addressed (Honeycutt et al., 1995; Kumar et al., 1991; Marnburg, 2003). If the training is not suitably designed for the component that is supposed to be stimulated, it is less likely that the specific behavioral process will be effectively strengthened. Stimulating, for instance, ethical awareness may require other training methods and activities than developing ethical intent. For example, a dilemma discussion can stimulate and improve ethical awareness because the participants are presented with ethical topics, problems, opinions, and solutions. This helps sensitize the training participants to ethical challenges in real-life situations but may not sufficiently strengthen their motivation to select and implement the morally superior course of action. To consider and capture these differences, a more systematic and fine-grained derivation and examination of design elements related to the relevant ethical component in the training is warranted.
Regarding the four processes of ethical behavior, the component of ethical sensitivity (awareness) can be viewed as paramount because a person must perceive a situation as ethically relevant to reach the next component of ethical decision-making: ethical judgment. If organizational members remain unaware of the ethical relevance associated with their decision tasks, they will be unlikely to make efforts to balance ethical considerations and to develop morally sound courses of action. Moreover, interdependencies among the various components of ethical decision-making must be taken into account to provide a comprehensive view of the transitions and interrelations between the four components of ethical behavior. Such comprehensive and fine-grained frameworks of analysis and design may also illuminate in more detail whether ethics training could simultaneously strengthen several (or all) components of ethical behavior or should focus on one component. Highly ambitious goals of ethics training may be associated with more complex and time-consuming training designs and methods compared to training programs that are intended to stimulate only one component of ethical behavior.
In this context, it is worth emphasizing that all the reviewed studies only addressed the potential effectiveness and ignored the costs of ethics training programs. Studies that balanced the positive consequences of training programs and the costs of their implementation were absent. This is important in comparing the eligibility of various measures of implementing corporate ethics training programs that had similar effects in promoting ethical behavior but may have been implemented at different cost levels.
E-based training methods can potentially reduce the costs of corporate training and enhance flexibility (Cascio, 2019). The relevance of this form of training has been indicated by Martin et al. (2014). There were only three studies in our review that examined the effectiveness of e-based learning. Although the findings of Bodkin and Stevenson (2007) indicated a positive impact, Lau (2010) found positive effects only in combination with other learning activities. Sorensen et al. (2017) investigated e-based learning without face-to-face activities: they broadcasted recorded lectures, used online threads for discussions, and asked participants to write reflection papers. Interaction between the participants and the instructor was limited to online threads. In these online discussions, participants were requested to contribute a minimum of one high-quality response to the questions of the instructor and two high-quality responses to the comments of other participants. In contrast to the critique of Antes et al. (2009), online courses may therefore provide some degree of social interaction. Video-based communication technologies also make on-demand instruction and video-based exchange feasible. The overall effect of the e-based approach was positive (Sorensen et al., 2017). Although the results of these initial studies appear encouraging, e-based training methods, their design, and their effectiveness will benefit from additional study and in-depth investigations to better inform researchers and practitioners how they can be used to strengthen ethical behavior in the business sector (for more information about e-based learning as a training trend, see Cascio, 2019).
Some scholars have argued that the development of ethical competences takes time and requires regular confrontation with moral dilemmas (Lind, 2015). Bodkin and Stevenson (2007) compared two groups of training participants and concluded that the members of the group that had more time for reflection achieved a higher level of improvement of their ethical perceptions. Relatedly, some studies (Arlow & Ulrich, 1985; Richards, 1999; Weber, 1990) suggested that the positive effects of ethics training should be interpreted with caution because these effects may have been short-lived and not necessarily sustainable in the long term. Although extant evidence does not reveal the optimal duration or repetition of ethics training, it seems reasonable to infer that ethical capacities, competences, and skills are not built overnight but require time to develop sustainably. Regular repetition of training may be beneficial to lifelong learning and the continuous development of ethical capacities, competences, and skills. In sum, it is important for future studies to consider not only short-term but also long-term effects of ethics training. Accordingly, studies should not only measure effects immediately or shortly after the last training session but also after a long period of time to better assess the sustainability of the training effectiveness.
Dependent Variable, Measurement, and Methodological Rigor
As pointed out, there is some variance among the studies regarding the dependent variable of interest and the utilized measurement methods. As a consequence, generalizations across studies are restricted. The consistent utilization of well-established and sufficiently validated instruments to measure the various components and processes of ethical behavior would facilitate further consolidation of extant wisdom.
The measurement of ethical behavior should be assessed critically. Kohlberg’s (1969) moral judgment interview has been replaced by less time-intensive procedures such as the DIT and self-developed measurements, which can be carried out and evaluated more easily than interviews. Kohlberg was concerned about the DIT and expected insufficient assessment results because, he argued, an assessment of ethical judgment must rest on open answers and autonomous formulations of participants (Kohlberg, 1979; Rest et al., 1999). Other researchers also suggested that the DIT may not be appropriate to capture all nuances of ethical judgment (Antes et al., 2009). Furthermore, there are researchers who investigate and prefer assessment centers as a possible alternative to measure ethical behavior (Eigenstetter et al., 2012). However, although these comprehensive tools provide better substantiated conclusions about the effects of ethics training, their application is time-consuming and, consequently, costly, which may diminish their practical eligibility.
Regarding the methodological design of the empirical investigation, a criticism is that a control group is often missing in pretest-posttest designs. Rigorous studies on the measurement of ethics training should be designed as longitudinal studies. For a sufficiently substantiated assessment of the training effectiveness, a comparison of the level of ethical behavior before and after the treatment must be carried out (pretest-posttest design). Pretest-posttest designs are commonly used in behavioral research to measure the change resulting from treatments. Because factors other than the treatment can influence the changes, the results must be compared with a control group that did not receive the treatment (Dimitrov & Rumrill, 2003). It is noteworthy that 32 studies in our review used a pretest-posttest design with a control group.
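The logic of a pretest-posttest design with a control group can be illustrated with a small calculation. The sketch below is only one simple way to express the comparison (the difference between the two groups' mean gains); it is not the procedure of any reviewed study, the function names are assumptions, and the scores are invented purely for illustration.

```python
from statistics import mean


def mean_gain(pre, post):
    """Average within-person change from pretest to posttest."""
    return mean(after - before for before, after in zip(pre, post))


def training_effect(treat_pre, treat_post, ctrl_pre, ctrl_post):
    """Mean gain of the trained group minus mean gain of the untrained control group."""
    return mean_gain(treat_pre, treat_post) - mean_gain(ctrl_pre, ctrl_post)


# Hypothetical ethical-judgment scores, invented purely for illustration
treat_pre, treat_post = [30, 35, 40, 45], [38, 41, 47, 50]
ctrl_pre, ctrl_post = [31, 36, 39, 44], [33, 37, 42, 46]

effect = training_effect(treat_pre, treat_post, ctrl_pre, ctrl_post)
print(f"trained group gain: {mean_gain(treat_pre, treat_post)}")
print(f"control group gain: {mean_gain(ctrl_pre, ctrl_post)}")
print(f"gain attributable to training: {effect}")
```

Without the control group, the full gain of the trained group would be credited to the training, although part of it may reflect factors other than the treatment, such as repeated testing.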
Type of Participating Subjects
Our comparisons between studies of students vs. practitioners revealed that student samples prevailed. The high frequency of student samples was expected, as it is difficult to get practitioners to participate in research studies. However, it is widely questioned whether results gained from student samples are relevant regarding real-life behavior within the workplace. There are varying views about the relevance of student samples and the permissibility to transfer results gained from these samples to business professionals and real-life workplace settings (Haslam & McGarty, 1998; Smart, 1966). However, although the moral development and individual characteristics may differ between students and practitioners, business students later become practitioners. Consequently, empirical results about the determinants of ethical behavior of students may have some explanatory and predictive power for the workplace. However, these predictions may be limited due to the fictitious scenarios and vignettes that are used to measure ethical behavior, which may not be tantamount to real-life observations. It is important to note, however, that practitioner studies also utilize such proxies. The conclusions gained from practitioner studies are therefore not necessarily applicable to actual behaviors in real settings but may also deserve careful interpretation.
Regarding the dependent variable, both groups of studies dealt considerably with the effects of ethics training on ethical judgment. However, the student studies seldom addressed the effects on ethical action, which was, in contrast, a frequently considered consequence in practitioner studies. The important role of ethical action in practitioner studies may stem from an interest in the payoff of ethics training in organizational situations and particularly in their ultimate outcome in the form of actual behavior (for more information about the role of training effects on behavior as an HR outcome in organizations, see Tharenou et al., 2007).
Implications for HRD Research
Our review has revealed that positive effects of standalone ethics training programs tend to prevail. Regarding further research, our insights suggest that more in-depth studies are needed that reflect on various dimensions and forms of training design, distinguish training program outcomes with reference to the various components of ethical behavior and elaborate on the conceptual substantiation why and how ethics training programs strengthen ethical competences of participants. Future research needs to incorporate that ethics training programs can be designed very differently. Effects will hence not only depend on whether an ethics training program is in place. Rather, the specific design and implementation of the program deserve more attention. Depending on the intended training consequences, different training program designs may be warranted to strengthen ethical awareness, ethical judgment, ethical intent, or ethical action. Future studies may also address the complex interactions between these various components of ethical behavior. Such studies need to reveal their theoretical underpinning and scrutinize in more detail the mechanisms through which ethics training programs influence the behavior of their participants. Extending the analyses to individual characteristics that may positively or negatively moderate training outcomes appears to be a particularly promising avenue that combines organizational-level means of HRD and individual learning outcomes.
Implications for Practice
Our review entails important implications for practice that refer, most notably, to the design and evaluation of ethics training programs. First, the HR professionals who are supposed to develop an ethics training program have to be clear about, and determine from the outset of the design process, what the intended outcomes of the training program are (Wallo et al., 2020). Depending on exactly which ethical competence is to be strengthened, different training designs could be suitable. In general, ethics training programs tend to actively involve the learners, provide a combination of various training methods, and offer repetitive training sessions. Furthermore, the discussion of ethical dilemmas is well established and frequently used in practice. The HR department, or HR manager, in charge of developing an adequate ethics training program needs to carefully ponder the intended goals of the program and select and align the essential design components of the program accordingly.
Second, and relatedly, the practice of ethics training programs will benefit from their thorough evaluation. HRD professionals may utilize Kirkpatrick’s model (2006) for training evaluation, which also provides guidance about which information needs to be collected to assess a training program adequately and comprehensively. This evaluation should also include feedback from the training participants about their experiences with and acceptance of the program. The HR department should reflect on the results of the evaluation and use them to advance the ethics training program further. The development and implementation of an ethics training program should be an open learning process that is accompanied by ongoing evaluation, incorporates feedback, and strengthens the program based on these insights.
Conclusion and Further Research
Most studies on ethics training effectiveness found positive consequences of training programs regarding ethical behavior. Ample evidence furthermore indicated that training frequently utilized an active learning approach, especially with interaction by or among the participants, adding support to corresponding theories of cognitive development. Thus, the discussion of ethical dilemmas and case studies seems to be a well-established and frequently used learning activity. Our derivation from learning style theories and the results of empirical studies suggest a combination of learning activities: most of the studies used a variety of two or more training methods. An optimal combination of training methods cannot be concluded, as it depends on the context in which the training program is embedded and the issues the program tackles. Similarly, the optimal duration of training varies but theoretical considerations and empirical results suggest a need for regular repetition and sufficient time for reflection (see also Bodkin & Stevenson, 2007).
Unfortunately, many studies treated ethics training as a black box and only investigated whether ethics training made a difference. A simple yes-or-no answer may not be adequate and does not capture the complexity of the subject. However, many studies did not provide sufficient information about the training design and concrete design elements. Effects are contingent on the context and depend on the specific design of the ethics training. Future research that reflects on the various design dimensions according to which ethics training can be distinguished and considers in depth the context in which training programs are provided will help enhance our knowledge on ethics training effectiveness.
Furthermore, many studies made no theoretical substantiation explicit. As a consequence, the theoretical underpinning of ethics training programs and their potential consequences deserve more attention. Different underlying mechanisms may drive the impact of ethics training depending on the component of ethical behavior they refer to. Ethical behavior covers the processes of ethical awareness, judgment, intent, and action (Rest, 1979). The most widely examined process was ethical judgment, whereas the remaining components of ethical behavior were addressed less frequently. Practitioner studies were comparably more interested in the effects of ethics training on ethical action than student studies. The instruments to measure the various components of ethical behavior considerably differed—most were self-developed instruments or based on the DIT. Because the methodological rigor of corresponding studies also varied widely, it was difficult to generalize the results across studies.
Further research needs to develop comprehensive frameworks to examine in-depth why a particular training design leads to the development of ethical capacities, competences, and skills and eventually an improvement in ethical behavior. This should rest on a strong theoretical foundation about the underlying mechanisms that influence ethics training effectiveness and clarify the component(s) of the broad construct of ethical behavior that should be studied. More in-depth analyses are sought to elucidate which design elements appear to be beneficial for which components of ethical behavior, and for which specific reasons these relations are to be observed. In this regard, the impact of various forms of ethics training such as different learning approaches (active vs. passive vs. mixed), different training methods (general discussion vs. specific business dilemma, lecture vs. discussion), different lengths of the training sessions (e.g., 5 hours vs. 30 hours), their frequency (on singular or repeated occasions), and their temporal distribution (within a short or long period of time, such as 1 week or 1 year) need to be studied systematically. These effects may moreover be influenced by the organizational setting in which they are embedded. Such research promises novel insights about the form of the design category and combinations of design elements that are most effective in strengthening the ethical behavior of training participants.
Footnotes
Authors’ Note
The authors declare that the manuscript is their original work; it has not been published elsewhere and it is not under consideration for publication elsewhere at the time it is submitted to the Human Resource Development Review.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
