Abstract
Employee performance appraisals are complex events in organizations. They occur in contextually rich environments and have implications for careers, training opportunities, remuneration, and interpersonal relationships. For years, the study of performance appraisals has mirrored this complexity and has revealed a multitude of variables that can influence the accuracy of performance ratings. Of late, the importance of managers’ intentions as a determinant of performance ratings has gained prominence. What is less understood is where these intentions come from and what determines their relative strength or weakness. In the current paper, we present a model that explains the simultaneous presence and strength of multiple rating intentions that managers can have when rating employee performance.
Keywords
From 360-degree rating formats to forced distribution systems, employee performance evaluations are a ubiquitous and recurring human resource management practice. In every subjective rating system, an individual must gauge the performance of another person and then physically render his/her judgment on a performance evaluation form and/or orally provide feedback to the employee. It is widely acknowledged that at several points in this sequence, the process can go askew with raters committing errors in judgment, memory, and attention (Arvey & Murphy, 1998; Ilgen, Barnes-Farrell, & McKellin, 1993; Landy & Farr, 1980). The performance rating process becomes much more complex when one considers that appraisals do not occur in objective and neutral contexts, but instead are carried out in contextually rich environments where pressures, expectations, and consequences exist. As a result, in addition to cognitive errors, performance ratings can be influenced by situational pressures (Harris, 1994; Mohrman & Lawler, 1983), appraisal procedures (Jawahar & Williams, 1997), measurement formats (Fay & Latham, 1982; Murphy, Martin, & Garcia, 1982), raters’ dispositional tendencies (Bernardin, Cooke, & Villanova, 2000; Murphy, Cleveland, Skattebo, & Kinney, 2004), and intentional rater adjustments (Harris, 1994; Kane, 1994; Murphy, 2008; Murphy & Cleveland, 1995). Given the multitude of influences on performance ratings, it is understandable why subjective performance ratings have suffered from low levels of reliability (Viswesvaran, Ones, & Schmidt, 1996) and problems with leniency and halo (Austin & Villanova, 1992; Viswesvaran, Schmidt, & Ones, 2005).
Over several decades, researchers have continuously tried to improve the quality of performance ratings (Farr & Levy, 2007). Rating quality was traditionally viewed as a measurement problem and then tackled from a cognitive perspective (Landy & Farr, 1980); some of the most recent advances in appraisal theory, however, have argued that rater goals and intentions are an important determinant of performance ratings (Murphy, 2008; Murphy & Cleveland, 1995). Therefore, instead of viewing managers as simply lacking the ability to effectively rate employee performance, researchers have argued that managers must possess the motivation to do so (Banks & Murphy, 1985; Harris, 1994; Murphy & Cleveland, 1995). Part of possessing the appropriate motivation may involve the intention to rate accurately as well as lacking other rating intentions, for instance, wanting to avoid conflict or manage impressions. A recent review of the literature highlights that there is a broad range of intentions that managers can hold when rating subordinates (Spence & Keeping, 2011) and that raters with different goals provide different performance ratings (Spence & Keeping, 2010; Wang, Wong, & Kwong, 2010; Wong & Kwong, 2007).
Decades of research and theory reveal that employee performance ratings can be influenced by a multitude of factors from several key domains: (a) measurement, (b) rater cognitions, and (c) rater volitions. Although rater motivation has been considered for several decades, the view that managers are active and volitional agents in the appraisal process is a relatively new idea in comparison to other research streams (cf. Harris, 1994; Landy & Farr, 1980). Work in this area has focused on establishing that raters do, indeed, have considerations other than accuracy, and has tried to map what these considerations are (Harris, 1994; Longenecker, Sims, & Gioia, 1987; Murphy & Cleveland, 1995; Spence & Keeping, 2011; Wang et al., 2010; Wong & Kwong, 2007). The literature is in agreement that managers’ intentions do affect ratings and that there are a number of considerations that managers may have when rating performance. For instance, models of intentional or motivated rating behavior (Harris, 1994; Kane, 1994; Murphy & Cleveland, 1995) include factors such as avoiding conflict, complying with organizational norms, and impression management and propose that managers can knowingly alter performance ratings for a number of reasons. What is lacking, however, is an understanding of where these different considerations come from and what determines their relative strength/intensity. That is, not all raters will want to avoid conflict and, of those who do, not all will want to avoid conflict to the same extent. The current paper provides a theoretical framework for understanding where these intentions come from and how strong they are likely to be.
One implication of the large number of factors that affect performance ratings is that it can be difficult for researchers to investigate specific phenomena, as no one factor operates in isolation from the others. For example, research on rater cognitions was often criticized for having too narrow a focus because the studies did not include contextual and motivational factors (Banks & Murphy, 1985; Dipboye, 1985). In addition to explicating how different rating intentions can arise, the current paper situates these processes alongside other domains of research, namely, measurement and cognitive streams. The result is a model that (a) explains the occurrence of multiple rating intentions, (b) integrates these different intentions and their causes alongside measurement and cognitive processes, and (c) proposes how these intentions and their causes work together alongside measurement and cognitive processes to affect performance ratings.
The model is built around the theory of planned behavior (TPB; Ajzen, 1991; Ajzen & Fishbein, 2005), an established, empirically supported, theoretical framework for predicting behavior from intentions. Building off an established theory is advantageous because it allows us to formulate theoretically grounded research propositions.
Background on the theory of planned behavior
Given that the TPB is the foundation of our model, the current section will provide some background on the theory and introduce how it will be utilized. The TPB proposes that the most proximal or immediate predictor of a behavior is one’s intention to perform that behavior (Ajzen, 1991; Ajzen & Fishbein, 2005). Data support this theoretical proposition, with meta-analyses demonstrating a strong relation between intentions and behavior (Armitage & Conner, 2001; Sheeran, 2002; Webb & Sheeran, 2006). Furthermore, these data support the causal nature of intentions leading to behavior, as a meta-analysis of studies that manipulated behavioral intentions and then examined behavior showed a strong relation between intentions and behavior (Webb & Sheeran, 2006).
Moving beyond prediction towards the explanation of behavior, Ajzen and Fishbein (2005) suggest that individuals consider three focal issues that lead to intentions. Although not necessarily occurring at a conscious level, individuals ask three integral questions when deciding to engage in a particular behavior: (a) What are the potential positive and negative consequences of the behavior? (b) What are the expectations of valued individuals or groups for my behavior? and (c) What factors might either assist or hinder me in engaging in this behavior? The answers to each of these questions predict three important constructs, respectively: attitudes, subjective norms, and perceived behavioral control.
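As a compact summary of the three constructs just named, the TPB is often tested as a weighted linear combination in which intention is regressed on attitude, subjective norm, and perceived behavioral control. The expression below is only an illustrative sketch of that common formulation. The symbols I, A, SN, and PBC and the weights w are shorthand introduced here, not notation from the present paper, and the weights are empirical quantities that are expected to differ across behaviors, raters, and contexts.

```latex
% Illustrative shorthand (assumed notation, not taken from the paper):
%   I   = strength of the intention to perform the behavior
%   A   = attitude toward the behavior
%   SN  = subjective norm regarding the behavior
%   PBC = perceived behavioral control over the behavior
%   w_A, w_{SN}, w_{PBC} = empirically estimated weights
I \approx w_{A}\,A + w_{SN}\,SN + w_{PBC}\,PBC
```

Read this way, a separate expression of the same form would apply to each rating intention considered in the paper, each with its own attitude, norm, and control terms.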
Model overview
In the current paper, we use the TPB to help predict and understand the behavior of a manager rendering a subordinate’s performance rating, a common dependent variable in much of the appraisal literature. The use of the TPB as a framework for understanding how performance ratings are rendered offers distinct advantages. First, the TPB is a well-established and empirically supported theory that has been used to predict and understand human behavior. Since its introduction to the social sciences, the TPB has been used to understand and predict a wide range of behaviors, from the type of interview technique that was used during an employee selection process to the occurrence of traffic violations (e.g., Elliot, Armitage, & Baughan, 2003; van der Zee, Bakker, & Bakker, 2002). Because the TPB has a distinct emphasis on the understanding of behavior and not just prediction (Ajzen, 1991), we can follow the theory and propose specific pathways to explain, rather than merely predict, performance ratings. Second, by applying the TPB to employee performance appraisal research, we can simultaneously account for the influence of dispositional, contextual, motivational, measurement, and cognitive variables in the prediction of appraisal ratings. Consequently, we are able to integrate these disparate literatures into one model.
To understand how a manager renders a performance rating, we carefully examined the literature to look for themes and commonalities among rating motivations. As a result of this search, we orchestrate the current paper around the prediction of four common rating intentions: accuracy (i.e., rating objectively and impartially), avoiding conflict (i.e., appeasing the employee), benevolence (i.e., providing the employee with a helpful or considerate rating), and impression management (i.e., self-enhancement). This list of intentions is not meant to be comprehensive; rather, it is intended to represent the diverse and relatively common intentions that are discussed in the appraisal literature.
To facilitate comprehension and clarity, we present our complete model using two figures. A simplified model that outlines the prediction of a performance rating through several rating intentions is presented in Figure 1. We use this figure to conceptually depict the notion that the performance rating rendered by a manager can simultaneously stem from more than one rating intention. However, we suggest that each of these intentions is predicted by the core components of the TPB. In turn, these core components are determined by many personal and contextual variables, which the TPB calls “background factors.” We have chosen to label this group of variables “tributary variables” as we feel it more appropriately reflects the role of these variables within an appraisal context. A tributary is a stream or river that does not flow directly into the sea, but instead feeds into a mainstream that then flows into the sea. Thus, this label emphasizes the indirect, yet important, influence that we propose these variables have on both intentions and performance ratings.

Heuristic depiction of overall model with example intentions. IM = impression management.
Figure 2 presents an elaborated version of our model, explicating the relationships involved in determining a single rating intention and identifying the propositions discussed in the paper. In Figure 2, we present the tributary variables in more detail and depict the important role that the core components of the TPB have in linking tributary variables to a rating intention, and finally, to the rating itself. In addition, we illustrate two moderators of the relationship between a rating intention and the performance rating: (a) internal control and (b) actual behavioral control, which represent factors within and outside of the manager that limit his/her ability to rate in line with his/her intentions.

Detailed model with propositions labelled.
Model elaboration and propositions
Managers’ attitudes and performance ratings
According to the TPB, an individual’s attitude toward a behavior is a core determinant of whether or not that behavior will occur. An attitude refers to an individual’s overall evaluation of the favorability of a particular behavior, determined by the assessment of a behavior’s consequences via two dimensions: instrumental (i.e., are the consequences valuable/worthless?) and experiential (i.e., are the consequences pleasant/unpleasant?; Ajzen & Driver, 1992). Thus, if a behavior’s consequences are pleasant and desirable, an individual’s attitude toward the behavior is likely to be positive. Because we are interested in simultaneously modeling several rating intentions, we consider managers’ attitudes toward each of these intentions.
Although performance appraisal researchers have not directly examined the effect of attitudes toward conflict avoidance, benevolence, and impression management, there is some research to support the idea of a relation between general attitudes toward performance appraisals and performance ratings. Longenecker et al. (1987), for example, found that managers’ beliefs and attitudes toward the performance appraisal system influenced the accuracy of performance appraisals. In interviews with executives, the researchers found that if managers failed to see value in the performance appraisal process, they were more likely to manipulate performance ratings. More specifically, if managers possessed the attitude that ratings were of little value, they were less likely to rate accurately, whereas if managers believed there was merit in the process, they were more likely to take it seriously and attempted to provide accurate ratings.
Researchers have long been interested in how perceived rating consequences alter performance ratings (Cleveland & Murphy, 1992; Harris, 1994; Murphy & Cleveland, 1995; Whisler, 1958). However, there is no widely accepted theoretical rationale to explain what happens when a rater foresees that a rating will result in multiple consequences. That is, there is no theory to predict why and how managers will evaluate and respond to simultaneous rating consequences. The TPB’s conception of attitudes can account for this, and makes specific predictions as to how managers will respond. Recall, the TPB’s conceptualization of a behavioral attitude is the overall favorability of performing that behavior considering the behavioral consequences. Therefore, each perceived consequence is evaluated and then aggregated to determine the valence and strength of the overall attitude. This framework will be applied to each of the rating intentions discussed herein. In what follows, we outline how the attitude construct is implicated in different rating intentions.
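The aggregation of perceived consequences described above can be illustrated with a small numerical sketch. It assumes the expectancy-value form commonly associated with the TPB, in which each consequence contributes the product of how likely the rater believes it to be and how favorably the rater evaluates it. The class name, function name, scales, and all numbers below are hypothetical and chosen only for illustration; they are not taken from the paper or from any data set.

```python
from dataclasses import dataclass


@dataclass
class Consequence:
    """One perceived consequence of rating in a particular way (hypothetical structure)."""
    label: str
    belief_strength: float  # assumed scale: 0 (will not happen) to 1 (certain)
    evaluation: float       # assumed scale: -3 (very bad) to +3 (very good)


def attitude_score(consequences):
    """Sum of belief strength times evaluation across all perceived consequences."""
    return sum(c.belief_strength * c.evaluation for c in consequences)


# Illustrative values only; they are not drawn from any study.
perceived = [
    Consequence("subordinate disputes the rating", 0.8, -2.0),
    Consequence("rating informs a fair promotion decision", 0.3, 1.0),
]

score = attitude_score(perceived)
valence = "favorable" if score > 0 else "unfavorable" if score < 0 else "neutral"
print(f"attitude score = {score:.2f} ({valence})")
```

With these made-up inputs, the likely and unpleasant consequence outweighs the unlikely and mildly valuable one, so the overall attitude comes out unfavorable. Changing either the belief strengths or the evaluations shifts both the sign (valence) and the magnitude (strength) of the aggregate.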
Attitude toward rating accurately
A manager’s attitude toward rating accurately is defined as the overall favorability or unfavorability of rendering an accurate performance rating for a particular subordinate, within a given context, at a given time. This includes general attitudes about rating accurately in his/her organization, as well as more specific attitudes about rating a particular subordinate accurately for a particular performance period in the organization. For example, if a manager expects that, as a consequence of rating accurately, an employee will be demoralized and that the organization will not benefit from the accurate evaluation of its employees, the manager will have a negative attitude toward rating the employee accurately. In this case, both the value and the experience of providing an accurate rating will be negative. Moreover, the strength of the attitude will be determined by the relative strength of these beliefs. According to the TPB, this less-than-positive attitude would result in a decreased intention to provide an accurate rating and, therefore, result in a biased rating. Our first proposition flows directly from this line of reasoning:
Attitude toward rating to avoid conflict
A manager’s attitude toward avoiding conflict is defined as the overall favorability or unfavorability of avoiding conflict with a particular subordinate, in a given context, at a given point in time. Thus, a manager’s attitude toward avoiding conflict includes general attitudes about conflict avoidance in his/her organization, as well as more specific attitudes regarding a particular subordinate during a particular performance period. Although there may be some variability with respect to how managers generally perceive conflict, we expect that most of the variability in this attitude stems from the degree to which managers perceive that conflict with a particular employee is of concern. Attitudes are largely formulated by the evaluation of perceived consequences (i.e., Are they valuable? Are they pleasant?). Therefore, it is possible that under certain circumstances, with a specific employee, conflict may be valuable. Thus, an increase in the perceived situational utility of conflict would decrease the severity of a manager’s overall negative attitude toward conflict, possibly creating a favorable attitude toward creating conflict. In some instances it has been argued that managers are inclined to provide intentionally low ratings in order to send messages to specific employees (Longenecker et al., 1987). In such a situation, a manager might not have a negative attitude toward conflict, but rather he/she may be seeking it. An attitude that views conflict as favorable would result in an intention to rate in a manner to produce conflict. Conversely, an attitude that conflict is neither valuable nor pleasant would generate an intention to avoid conflict.
Attitude toward rating benevolently
A manager’s attitude toward rating benevolently is defined as the overall favorability or unfavorability of providing a particular subordinate with a helpful or considerate rating, in a given context, at a given point in time. Some managers perceive that acting in the best interest of the employee is the goal of performance appraisal. This “best interest” may take many forms, with the ultimate goal of attempting to provide employees with performance ratings that are helpful or considerate for the context and time. For instance, managers have stated that they have intentionally softened a performance rating if the employee had recently gone through a difficult personal issue (e.g., death in the family, divorce, etc.), because it would not serve the employee or the organization well to provide a historically good employee with a low performance rating. Specific to in-role job difficulties, Jawahar (2005) found that raters effectively adjusted performance ratings to accommodate various performance constraints faced by ratees (e.g., longer distances to travel, more competitors in a sales territory). Similarly, it has been suggested that managers can use the performance appraisal process as a means to motivate employees rather than simply using it to evaluate employees (Cleveland & Murphy, 1992). Thus, we propose that managers who possess a favorable attitude toward rating benevolently will be more likely to provide a benevolent rating. Note that the label of benevolent rating is used broadly to represent ratings that are intended to be useful, helpful, considerate, or accommodating.
Attitude toward rating to manage impressions
Lastly, we propose that managers can also have favorable or unfavorable attitudes toward managing impressions during the appraisal process, which can influence rating intentions, and, subsequently, performance ratings. A manager’s attitude toward rating to manage impressions is defined as the overall favorability or unfavorability, for that manager, of making him/herself appear positive, in a given context, at a given point in time. Performance appraisal researchers have long recognized that managers may use the performance rating process to manage their own impressions (e.g., Bass, 1956; Cleveland & Murphy, 1992; Villanova & Bernardin, 1989). Villanova and Bernardin (1989) discussed impression management as the tendency for supervisors to provide appraisal ratings that either directly or indirectly forward their self-interests. The authors defined impression management as “any behavior that alters or maintains a person’s image in the eyes of another and that has as its purpose the attainment of some valued goal” (p. 299). This idea has been supported by recent empirical work, which found that managers are more likely to inflate performance ratings if they perceive that the ratings they give are tied to their own performance ratings (Spence & Keeping, 2010). We propose that the extent to which managers will engage in impression management during the appraisal process will be partially determined by their general attitude toward impression management, as well as their attitude toward impression management in the specific rating context. Specifically, we propose that:
Subjective norms and performance ratings
The second core component of the TPB is subjective norms. Subjective norms refer to an individual’s perception of social pressure to perform or not perform a particular behavior (Ajzen & Fishbein, 2005). As a whole, this component of the theory fits well within the existing performance appraisal literature; the possibility of managers feeling pressured to rate a certain way due to social pressure has been discussed within a performance appraisal context for a number of years (Harris, 1994; Mohrman & Lawler, 1983). Mohrman and Lawler (1983) suggested that the organizational context has a moderating effect on performance appraisal systems by informing managers how they should interpret the performance appraisal process. The authors suggest that in a competitive organization, managers would not view the appraisal as “developmental” (i.e., used to identify opportunities for improvement), even if the appraisal system were touted as such, because the norms of the organization would dictate that this would not be a reasonable expectation. Similarly, Glickman (1955) suggested that in competitive settings, rating inflation could beget rating inflation. For example, if some managers inflate their employees’ ratings, these managers are giving their employees a competitive advantage. Therefore, in order for other managers to give their employees a level playing field, they, too, would have to inflate their ratings.
As a whole, the literature in this domain suggests that there can be important overarching norms that dictate to managers how or how not to rate their employees’ performance. Consistent with these ideas, we also assert that normative pressures are an important component in determining rating behavior; however, we propose that norms are most influential when they pertain to specific rating intentions. In the context of the TPB, perceptions of normative pressures are generated from managers considering whether or not other important people think they should engage in the particular behavior (Ajzen & Fishbein, 2005). This is an apt conceptualization in the performance appraisal context because managers can be susceptible to pressure from a number of sources. Specifically, managers may feel that their superior(s), other managers in their organization, coworkers, or subordinates think they should rate a certain way. Such expectations for ratings can be communicated indirectly through organizational norms and social cues, or directly through explicit statements and requests. In what follows, we integrate subjective norms alongside our four focal rating intentions.
Subjective norms for rating accurately
Subjective norms for rating accurately refer to a manager’s perception of the social pressures to render accurate performance ratings (Ajzen & Fishbein, 2005). In the performance appraisal literature, norms are perhaps most frequently discussed with respect to rating inaccuracies, most notably rating inflation or leniency. Harris (1994) argued that situational variables, such as organizational norms, influence raters’ motivations to rate accurately. He suggested that if there are norms for raters to provide lenient performance ratings, this will encourage raters to be lenient. Similarly, Longenecker et al. (1987) found that managers were more likely to distort performance ratings if they perceived that top management did not take the appraisal process seriously. In interviews, managers stated that if others did not take the performance appraisal process seriously, then they would be likely to adopt a similar mindset and not strive to produce accurate ratings. More recently, Spence and Keeping (2010) found that presenting individuals with normative information indicating how other managers in their organization rate employees altered the performance ratings in the direction of the norm. Compared to when no information was available, norms for inflation resulted in higher ratings and norms for accuracy prompted lower performance ratings.
Somewhat related to norms, a concept referred to as “trust in the performance appraisal process” was found to predict performance ratings (Bernardin & Beatty, 1984). The authors defined trust in the appraisal process as raters’ perceptions about the extent to which other raters in their organization rate accurately. Bernardin and Beatty (1984) found that the less trust individuals had in the performance appraisal process (i.e., the more they perceived that others did not rate accurately), the more lenient were the ratings that they provided. Similarly, Bernardin, Orban, and Carlyle (1981) conceptualized and operationalized trust in the performance appraisal process as raters’ opinions as to whether the typical rater inflates ratings. In other words, is it normative to rate accurately or inaccurately? They developed the Trust in the Appraisal Process Survey and found that the more people believed that others inflated ratings, the more likely they were to inflate their own ratings. Tziner, Murphy, and Cleveland (2002) also examined trust in the performance appraisal process and found that individuals who have low trust have a greater tendency to assign higher ratings.
The research described above illustrates the potential effect that social pressures within organizations can have on rendering accurate performance ratings. In general, it has been demonstrated that norms can lead to inflation of performance ratings and contribute to rating inaccuracies. In the context of our model, we propose that perceived pressures to provide accurate or inaccurate ratings will directly influence a manager’s intention to provide an accurate rating.
Subjective norms for rating to avoid conflict
We expect that normative pressures to avoid or not to avoid conflict will alter a manager’s intention to avoid conflict in the appraisal process. Subjective norms for rating to avoid conflict refer to a manager’s perception of what important people value with respect to handling conflict (Ajzen & Fishbein, 2005). When rating employee performance, a manager may or may not feel that his or her boss, other managers, and/or subordinates feel it is necessary or desirable to avoid conflict. This pressure can be explicitly conveyed in organizational values that stipulate honesty and authenticity, or more indirectly through social information (i.e., witnessing conflict or failures to avoid conflict; Salancik & Pfeffer, 1978). On the other hand, in highly litigious or bureaucratic organizations it may be desirable to avoid conflict, as discord is likely to be time-consuming and costly. As such, we propose that social pressures with respect to conflict will influence a manager’s intention to render a rating that avoids conflict.
Subjective norms for rating benevolently
Subjective norms for rating benevolently refer to a manager’s perception of the social pressures to provide useful or kind performance ratings (Ajzen & Fishbein, 2005). Indirect evidence that managers alter performance ratings when trying to provide useful ratings can be found in the differences between ratings that are conducted for administrative versus developmental purposes (Jawahar & Williams, 1997). Specifically, ratings are typically lower when they are to be used for developmental purposes (i.e., identifying areas for employee improvement) as compared to administrative purposes (i.e., making formal personnel decisions; Jawahar & Williams, 1997). Although these findings do not represent a direct test of how social pressures alter intentions to render benevolent ratings, it stands to reason that managers will face different social pressures in different contexts. Findings that speak more directly to the effect of social pressures on the process of rendering performance ratings can be found in the rater accountability literature (Mero, Guidice, & Brownlee, 2007). It has been purported that performance ratings will differ as a function of whether the manager is accountable to the subordinate or to his/her own supervisor (Harris, 1994). Tetlock and Kim (1987) defined accountability as the amount of social pressure to justify one’s judgments to others. Consequently, it has been suggested that managers will rate in a manner that is most likely to gain them favor with those to whom they are accountable (Mero et al., 2007). If a manager is accountable to his/her employees, the manager will likely be concerned with the employees’ best interests (i.e., motivating them, procuring resources for them, and keeping them happy). Much like a manager facing pressures to avoid conflict, we propose that social pressures to provide employee-focused ratings will influence managers’ intentions to provide benevolent ratings.
Subjective norms for rating to manage impressions
Subjective norms for rating to manage impressions are defined as a manager’s perception of the social pressures to render performance ratings that advance his/her self-interest (Ajzen & Fishbein, 2005). The performance appraisal literature has purported and found that managers will rate to advance their own self-interest (Spence & Keeping, 2010; Villanova & Bernardin, 1989). Although this is typically viewed as a stand-alone motivation, Harris (1994) positioned this tendency as being the result of situational as well as personal variables. Although no specific propositions were made as to which factors can result in a manager’s motivation to manage impressions, Harris (1994) identified compliance with organizational norms as a predictor of impression management. Building upon Harris’ ideas and using the TPB, we propose that managers will perceive varying degrees of social pressure to manage a positive self-image when rating performance, which will predict managers’ intention to render ratings with this goal in mind. For instance, a manager’s intention to use the performance rating as a means to manage impressions will be partially determined by his/her perception that managing impressions is the appropriate thing to do (i.e., “Other managers do it”).
Perceived behavioral control and performance ratings
Lastly, according to the TPB, perceived behavioral control is another proximal determinant of behavioral intentions. Perceived behavioral control refers to “people’s perception of the ease or difficulty of performing the behavior of interest” (Ajzen, 1991). In general, the construct of perceived behavioral control is very relevant in a performance appraisal situation, as evaluating employee performance and providing feedback can be a difficult task. One reason it can be difficult is that it can require the communication of negative information. As McGregor (1957) suggested, managers might find it uncomfortable to transition from their daily task of motivating employees to evaluating their worth during appraisal time. Given the potentially difficult and uncomfortable nature of performance evaluations, it is understandable that some managers may be, or perceive themselves to be, more adept at handling employee evaluations compared to others.
Next, perceived behavioral control is discussed in the context of our focal rating intentions.
Perceived behavioral control to rate accurately
In describing perceived behavioral control, Ajzen and Fishbein (2005) also use the term “self-efficacy,” a construct that has received attention in the performance appraisal literature (e.g., Bernardin & Villanova, 2005). Most typically, rater self-efficacy has been studied in relation to performance appraisal inaccuracies by showing associations between low rater self-efficacy and inflated ratings (e.g., Tziner & Murphy, 1999). The increased occurrence of high performance ratings coming from raters with low self-efficacy is thought to be the result of raters coping through an avoidance strategy. For example, a manager may not feel capable of handling confrontations that result from low ratings, or providing truthful justifications for low ratings. Consequently, it is much easier for the manager to try to reduce the likelihood of these negative events occurring by providing everyone with high performance ratings. If no negative feedback or information has to be disseminated, the likelihood of having to face negative consequences is greatly diminished (Bernardin & Villanova, 2005; Tziner & Murphy, 1999).
A concept closely related to managers’ self-efficacy that has also been investigated in relation to appraisal inaccuracy is rater discomfort with the appraisal process. Rater discomfort has been proposed to result from a lack of confidence in one’s ability to carry out performance evaluation duties. Research has demonstrated an association between rater discomfort and elevated ratings. Villanova and his colleagues developed the Performance Appraisal Discomfort Scale and found that a rater’s level of discomfort with the performance appraisal process is stable and positively associated with the level of his/her performance ratings (Villanova, Bernardin, Dahmus, & Sims, 1993). The general idea is that individuals will experience discomfort when their circumstances are in disagreement with their preferences, so they try to avoid these circumstances. For instance, if managers prefer not to evaluate employees, they will likely experience discomfort when having to do so, and attempt to avoid actually evaluating employees by rating employees uniformly high.
Perceived behavioral control for rating to avoid conflict
A manager’s perceived behavioral control for rating to avoid conflict refers to a manager’s perception of how easy or difficult it would be to avoid conflict. This construct holds particular promise in helping to understand rating inflation, as the most common way for a manager to go about avoiding conflict is by elevating performance ratings. Receiving favorable performance ratings is usually a pleasant experience for employees, and providing such ratings is relatively easy and effortless for managers in many organizations.
However, a manager may perceive that there are constraints for rating in a manner that avoids conflict. For example, an employee may possess ongoing and persistent negative attitudes regarding the organization and/or performance appraisal. In this case, a manager may feel that his/her ratings will have no effect on avoiding conflict with this employee, resulting in a low perceived behavioral control to rate to avoid conflict. Similarly, some organizations utilize rating formats that limit the amount of control that managers perceive they have over the ratings they provide. For example, with a forced distribution system, some employees must be placed on the low end of the distribution. Managers who are rating performance in this kind of a system may experience less perceived behavioral control to avoid conflict because some employees will be unhappy.
Perceived behavioral control for rating benevolently
A manager’s perceived behavioral control for rating benevolently refers to a manager’s perception of how easy or difficult it is to rate in a manner that will be useful, considerate, or accommodating to employees. As with perceived behavioral control for rating to avoid conflict, perceived behavioral control for rating benevolently would stem from the constraints that managers believe will affect their ratings. For example, a manager may want to provide a long-time employee with a favorable rating, even though his/her performance was uncharacteristically poor for the year because the employee experienced a death in his/her family. However, managers will differ in their perceived ability to provide benevolent ratings.
Perceived behavioral control for rating to manage impressions
Lastly, perceived behavioral control for rating to manage impressions refers to managers’ perceptions of the extent to which it is easy or difficult to self-enhance when rating employee performance. These perceptions are expected to directly influence managers’ intentions to manage impressions during the performance appraisal process. As with the other rating intentions, organizational and situational constraints will influence the extent to which managers feel they can utilize the performance ratings of their employees to manage a positive impression of themselves.
Tributary variables and rendering a rating with specific intentions
General attitudes and personality traits are notoriously poor predictors of specific behavior (Ajzen & Fishbein, 1977). In our model we propose that tributary variables do not influence performance ratings directly, but rather, influence ratings through the core components of attitude toward rating accurately, subjective norms for rating accurately, and perceived behavioral control to rate accurately.
It is within this general category of tributary variables where a substantial amount of performance appraisal research fits. Much of this research has focused on predicting ratings and has sought to establish direct relations between these variables and actual performance ratings. For example, this line of research has attempted to answer questions such as, will having to provide face-to-face feedback result in higher ratings? Do the personality traits of raters predict ratings? Or, is the purpose of the appraisal associated with ratings? By categorizing this literature into the tributary variables section of our model, we effectively introduce mediating mechanisms to a large portion of the performance appraisal literature. As has been the case with the TPB, we hope that by conceptualizing this literature as distal variables mediated by the core components we will gain better prediction and understanding of how managers render performance ratings. Moreover, we go beyond existing mediated models that propose rater goals predict performance ratings (cf. Murphy, 2008) by introducing mediational mechanisms to explain how these intentions arise and how tributary variables are related to intentions; namely, through attitudes, norms, and behavioral control. Specifically, we propose that different tributary variables can work through TPB to determine the presence and/or strength of different rating intentions.
We divide tributary variables into four sections: Manager Characteristics, Situational Factors, Subordinate Characteristics, and Subordinate Job Performance. Given the volume of variables involved in these processes, the propositions we develop next are at an intentionally high level, as it would not be possible to articulate specific mediational chains for each variable. However, the relationships explicated in the propositions are specific and, thus, researchers should be able to apply our framework and test these propositions with particular contextual variables.
Manager characteristics
A major tributary variable category that we suggest should influence the core components is the personal characteristics of a manager. To examine this category, we draw on the fairly large literature that has examined the role of a rater’s dispositional factors in predicting performance ratings. A general premise in all of these studies is that a portion of the variance in rating inaccuracies, typically rating leniency, can be accounted for by stable individual tendencies. The idea that rating leniency is a stable predisposition dates back to Guilford (1954). Guilford proposed that certain people are naturally more lenient than others and that rating style is relatively stable within raters.
More recently, the idea that raters have a stable rating tendency has received empirical attention. In a series of studies, Kane, Bernardin, Villanova, and Peyrefitte (1995) found that rater leniency was stable within individuals. Additionally, Bernardin et al. (2000) found that agreeableness and conscientiousness, relatively stable personality traits, are associated with performance ratings. More specifically, the authors found that conscientious individuals tended to provide low ratings, whereas agreeable people were more likely to provide high ratings. Conscientious raters are thought to provide more accurate (i.e., less lenient) ratings because they are believed to be more diligent, performance oriented, responsible, and thorough individuals. Conversely, more agreeable raters are believed to provide more lenient ratings due to their tendency to be more generous, kind, and sympathetic (McCrae & John, 1992). Bernardin et al.’s (2000) results are particularly appealing, as they provide more nuance to the traditional conception of trait-based leniency. More recently, Murphy et al. (2004) found that rater goals when rating performance are relatively stable over time and can predict ratings. It has often been suggested that raters have specific goals in mind when rating employee performance (e.g., Cleveland & Murphy, 1992). These findings suggest that such motives may be relatively stable and almost trait-like.
In our model, we propose that managers’ dispositions, traits, and tendencies do not influence ratings directly, but rather predict different rating intentions by altering the core components associated with each of the rating intentions. For instance, highly conscientious managers may have different attitudes toward rating accurately than less conscientious managers. Because highly conscientious individuals are diligent, performance-oriented, and thorough (McCrae & John, 1992), they may have a more favorable attitude toward rating an employee’s performance thoroughly and accurately, irrespective of the rating consequences. Similarly, highly conscientious individuals might feel more capable of handling the consequences of negative ratings, altering their perceptions of social pressure and perceived behavioral control. Putting the example together, it is possible that having high levels of conscientiousness can lead to the intention to rate accurately and weaker intentions to avoid conflict because highly conscientious managers would have more favorable attitudes toward rating accurately, be less influenced by social pressures to inflate ratings, and have greater perceived behavioral control to provide accurate ratings.
Another example of how the research on dispositional characteristics may fit into our model is in the domain of managers’ appraisal experience. In a qualitative investigation, Bernardin and Villanova (2005) found that managers’ reports of performance appraisal critical incidents differed as a function of their performance appraisal experience. More specifically, the authors found that less experienced managers more frequently mentioned interpersonal features of the performance appraisal process, whereas more experienced raters commented on more administrative problems. Spence and Keeping (2010) found that individuals with more performance appraisal experience provided lower performance ratings than those with less experience when rating the same employees. In accordance with our model, we suggest that rating experience influences different rating intentions by impacting the core components of the model. Specifically, it is logical to expect that increased experience may result in a weaker intention to rate to avoid conflict because with an increase in experience, managers may be less concerned with the negative consequences of conflict and will not have as favorable an attitude toward avoiding conflict as those with less experience. Consequently, a weaker attitude toward avoiding conflict will result in a weaker intention to rate to avoid conflict. We would like to note that an effect of rater experience on conflict avoidance intentions is but one possibility, and that rater experience may influence performance ratings through other rating intentions as well. Although performance appraisal research has not specifically examined the relations between managers’ individual difference variables and the TPB constructs, our model provides the theoretical rationale for doing so.
Situational factors
Performance appraisal research and theory have suggested that managers consider the outcomes of performance appraisals prior to rating and that these considerations alter the ratings that are provided (e.g., Cleveland & Murphy, 1992). Quite frequently, the potential consequences of ratings are the result of situational variables (e.g., appraisal purpose) or the structure of the performance appraisal process (e.g., having to provide face-to-face feedback). For example, researchers have suggested that ratings will be influenced by managers who attempt to avoid such things as conflict (Villanova & Bernardin, 1989), uncomfortable situations (Fisher, 1979; Villanova et al., 1993), having to communicate negative feedback (Larson, 1984), and potential damage to interpersonal relationships (Harris, 1994). A number of empirical investigations have found evidence to suggest that individuals will be more lenient when they are required to give face-to-face feedback (Ilgen & Knowlton, 1980; Klimoski & Inks, 1990; Stockford & Bissel, 1949). According to our model, we propose that the leniency found in situations involving face-to-face feedback may be the result of the creation of an intention to avoid conflict via a change in managers’ attitudes. For example, when a manager knows that he/she must provide a subordinate with feedback, there are foreseeable consequences involved in communicating negative information (i.e., conflict). According to the TPB, the consideration and awareness of foreseeable negative consequences associated with a particular behavior will result in a negative attitude toward the behavior (Ajzen & Fishbein, 1977). In the case of performance appraisals, if there are foreseeable negative consequences associated with providing a low rating, managers may have a positive attitude toward rating to avoid conflict and form an intention to rate to avoid conflict.
In addition to affecting managers’ attitudes, different situations could evoke different social norms or perceptions of behavioral control, and could potentially create different rating intentions. For instance, in some contexts it may be more or less expected for managers to manage a positive impression or to rate benevolently. As mentioned earlier, the purpose of the performance appraisal has been shown to influence the level of performance ratings, with administrative appraisals typically producing higher performance ratings than developmental appraisals (Jawahar & Williams, 1997). Using our model, we can propose that different appraisal purposes create different subjective norms, which work to create different rating intentions. Namely, it may be normative to provide elevated ratings in an administrative context in order to avoid limiting an employee’s access to opportunities and resources (i.e., to be benevolent) and more honest ratings in a developmental context (i.e., rate more accurately). Furthermore, a manager’s perceived ability to render an effective rating in different rating contexts could be different, such that managers may perceive that it is easier to provide harsh ratings in a developmental context compared to an administrative context due to differences in procedural or policy mechanisms, which also contribute to the formation of different rating intentions. As a result, we propose:
Subordinate characteristics
In addition to manager characteristics and situational factors, subordinate characteristics have also been examined to determine how they contribute to performance ratings. Characteristics such as subordinate demographics and dispositional tendencies are included under this classification. Extant research has examined the relationship between subordinate characteristics and the performance ratings they receive. For example, subordinate characteristics, such as likeability and similarity to the manager have generally been shown to result in higher performance ratings, irrespective of actual performance (Bates, 2002; Cardy & Dobbins, 1986). Other research has shown that demographic characteristics, such as race or gender can alter the ratings that are received (e.g., Dobbins, Cardy, & Truxillo, 1986; Kraiger & Ford, 1985; Schmitt & Lappin, 1980). However, these findings are not universal, with other researchers failing to find similar differences in performance ratings as a function of gender or race (e.g., Pulakos, White, Oppler, & Borman, 1989; L. M. Shore & Thornton, 1986; Thompson & Thompson, 1985).
Examination of the relationship between subordinate characteristics and performance ratings should benefit from the inclusion of the mediating mechanisms that our model provides. We suggest that subordinate characteristics predict ratings by influencing managers’ rating intentions via the core components for each intention. This configuration is consistent with other frameworks that have argued that ratees’ personal characteristics can have an indirect effect on performance ratings through managers’ perceptions (e.g., Lefkowitz, 2000). For instance, managers may possess more favorable attitudes towards rating benevolently for employees whom they have an affinity for and are similar to (Wayne & Liden, 1995) or possess different subjective norms for rating accurately as a result of a ratee’s age (Ferris, Yates, Gilmore, & Rowland, 1985), race (Cox & Nkomo, 1986), gender (Fuegen, Biernat, Haines, & Deaux, 2004) or even disability status (Czajka & DeNisi, 1988). In addition, subordinate personality and gender may lead managers to have different levels of perceived behavioral control to avoid conflict as individuals respond differently to performance feedback as a result of these characteristics (Ilgen & Davis, 2000; Kernis & Sun, 1994; Roberts & Nolen-Hoeksema, 1989; Smither, London, & Reilly, 2005; Stucke & Sporer, 2002). Although these are but a few possibilities, they serve to illustrate the types of variables and relationships that we classify within this section. We offer the following proposition:
Subordinate job performance
We also include the “actual” job performance of the employee that a manager is evaluating under the category of tributary variables. We do this because we conceptualize job performance as one of the factors, albeit a central factor, a manager must consider when rendering a performance rating. In line with the rest of our model, we purport that the relationship between a subordinate’s job performance and a manager’s intention to rate that performance will be mediated by a manager’s attitude, subjective norms, and perceived behavioral control to rate in line with the different intentions. For instance, managers may have different attitudes toward rating accurately for high and low performers. Similarly, there may be different organizational norms for rating low performers compared to average performers. Additionally, managers may perceive that it is more difficult to provide a low performer with a low rating than it is to provide an average performer with an average rating because a low rating conveys negative information.
In addition to different levels of performance, different domains of job performance, such as task performance (Murphy, 1989), contextual performance (Organ, 1988), and counterproductive performance (Robinson & Bennett, 1995) could have differential effects on the core TPB components of our model. Although previous research has demonstrated that raters differentially consider each of these performance dimensions (e.g., Rotundo & Sackett, 2002), the mechanisms by which this occurs are elucidated by our model. For instance, contextual performance may give managers a more favorable attitude toward providing a benevolent rating, or counterproductive performance may create a more positive attitude toward rating accurately.
Consistent with the other tributary variables in our model, we propose that a subordinate’s job performance can create different rating intentions by influencing the core TPB components. Unlike the other tributary variables, however, we also propose a direct effect between a subordinate’s job performance and a manager’s rating intention. We believe it is difficult for a manager to form an intention for a rating without considering the performance of the person he/she is evaluating, because the evaluation of performance is the reason the process is occurring. Moreover, we suggest that a subordinate’s job performance may influence any of the rating intentions, so we do not specify one in our next proposition. Based on these arguments, we present the following two propositions:
Intentions to rate and rendering a rating: Direct effects and moderators
As we have stated previously, a key premise of the TPB is that intentions are the most proximal predictor of behavior (Ajzen & Fishbein, 2005) with meta-analytic data supporting the predictive ability and directionality of the effect (Sheeran, 2002; Webb & Sheeran, 2006). In our model, this is represented by the direct relation between a manager’s rating intention and the actual rating. As such, we present the following proposition:
To this point, many of the variables under consideration are ones that are within the volitional control of the rater. However, we know that despite one’s best intentions, behavior consistent with these intentions does not always follow due to constraints created by ability. The remainder of the paper will focus on outlining factors that moderate the intention–rating relation. We classify these mechanisms as ability and break them down into two categories: actual behavioral control and internal control to rate (Figure 2). These two categories are meant to represent the measurement and cognitive literatures, respectively, and thus, represent larger conceptual domains full of different variables and processes rather than single constructs. As such, they are depicted in Figure 2 as ovals rather than rectangles to distinguish them from other parts of the model. As a heuristic for understanding these two areas of our model, it may help to conceptualize actual behavioral control as primarily representing variables in the environment that prevent managers from rating in line with their intentions, such as appraisal formats, policies, processes, and systems. Internal control represents the cognitive, or internal, variables and processes that often prevent managers from rendering ratings in accordance with their intentions.
Ability
Actual behavioral control
In addition to perceived behavioral control, the TPB suggests that, not surprisingly, actual behavioral control can influence behavior. This concept is very relevant to the performance appraisal context as numerous constraints are placed on managers such that they may not be able to rate as they intend. For instance, appraisal formats, budgetary constraints, and the inability of managers to be omniscient and omnipresent can limit rating behavior. In fact, much of the existing measurement-based appraisal research has worked from the starting point of creating constraints that discourage inaccuracy and facilitate accuracy. This research has focused on manipulating various aspects of performance appraisal forms, such as the number of scale points, number of dimensions, and types of anchors, to induce accuracy (Landy & Farr, 1983).
Situational variables linked to appraisal formats, such as budgetary constraints, can also affect ratings. For instance, if ratings are used to determine bonuses, a manager will not be able to provide everyone with an outstanding rating if there is not enough money in the budget to provide the corresponding bonuses. Moreover, from an epistemological view, managers are limited in their capacities to know their employees’ actual level of performance. Managers may believe they know their employees’ true performance, but this belief can be inaccurate due to the limited opportunities managers have to observe performance.
As such, by including this dimension in our model, we believe our model is effective in depicting why measurement-based approaches to performance measurement cannot singlehandedly eliminate rating errors. In fact, under our current framework, even unconventional and creative rating inaccuracies can be explained. In the case of forced distributions, some research has shown that raters adapt their rating behavior to the constraints of the situation. For example, Murphy (1992) suggested that forced distributions are not flawless and that the restrictions they place on raters can be overcome when managers create an “it’s your turn” effect. In such a case, raters will essentially rotate their employees through the distribution, having every employee spend time at the top of the distribution in order to get around the problem of having to classify and rank employees.
We propose that situational constraints such as time pressures can limit a manager’s ability to rate in line with an intention and that rating format changes merely serve to alter the type of rating behavior exhibited, not the quality of ratings. Such predictions are understandable when considered within the context of our model: imposing restrictions on managers is just one aspect of a much larger process that determines how managers will rate their employees. Although we do not make specific propositions as to what behaviors will result from which constraints—this could be a topic for several other papers—we use this section to illustrate how our model is useful in explaining how rating format and restrictions alter performance ratings. As such, we propose the following:
Internal control
This section of the model pertains to cognitive variables and processes that can alter performance ratings, and is meant to embody the large cognitive literature. Although this section is not a component of the TPB, its inclusion accounts for a large portion of the performance appraisal literature. The concept that manager cognitions alter performance ratings is popular and heavily studied. In fact, a large portion of the research on performance appraisal inaccuracies has focused on the role of rater cognitions. Precipitated by Landy and Farr’s (1980) review of the performance appraisal literature, research turned from measurement issues to rater cognitions in order to explain and understand performance rating errors. In doing so, researchers began to examine how cognitive processes—the ways in which individuals attend to, select, encode, and retrieve performance information—contribute to appraisal errors (Ilgen et al., 1993). Given the scope and magnitude of this research, it would not be possible in the current paper to go into detail about how specific cognitive processes contribute to performance errors in our model. For our purposes, we will provide a very brief description of where these processes fit relative to others. Specifically, we refer to managers’ cognitions as “internal control” and propose that it moderates the relationship between a manager’s rating intention and actually rendering a rating in line with this intention.
Although cognitive performance appraisal research has been effective in increasing our understanding of the cognitive processes involved in evaluating performance, it has often been criticized for its laboratory focus (Banks & Murphy, 1985). Cognitive performance appraisal research is conducted in labs and, as a result, the experiments are free of contextual variables that are present in actual performance appraisal situations. The absence of contextual variables in cognitive research creates the implicit assumption that managers are trying to rate accurately. Because there is no context, and there are no rating consequences or rater motivations, managers should have no motive other than rating accurately. However, at the same time, the lack of context and motivations present in these studies speaks to the importance of their findings. Research has found that even when people are trying to rate accurately, there are significant differences in performance ratings as a result of biases and processing errors. Moreover, research has found that individuals can exhibit racial and gender biases when rating and that likeability, rater mood, and frame of reference can significantly alter performance ratings.
Consistent with other authors (Banks & Murphy, 1985; Harris, 1994), we propose that these cognitive and affective processes are not the starting point of rating error, but rather they influence performance ratings after a number of contextual, motivational, and dispositional variables have had their say. As we have shown in the first part of the model, even if a manager possesses the pure ability to rate accurately (i.e., the rater is free of biases and is not susceptible to cognitive errors), he/she will not necessarily produce accurate ratings. As exemplified in our model, inaccuracy could result if (a) a manager’s attitude toward accuracy is not positive, (b) he/she perceives social pressure to rate inaccurately, and (c) he/she perceives that he/she does not have the control to rate accurately. Thus, we suggest that cognitive biases moderate the relationship between a manager’s rating intention and rendering a rating in accordance with this intention, such that these biases still influence the way in which managers rate, irrespective of their intentions.
Putting it all together
We have proposed a model that integrates several core areas of performance appraisal research (i.e., measurement, contextual, motivational, and cognitive research) into one unifying framework. The basis for the model is the TPB, an established psychological theory to predict and understand behavior, which we have shown is compatible with existing performance appraisal theory and research. The behavior we predict is a performance rating rendered by a manager.
To summarize, we propose that the most immediate predictor of a performance rating is a manager’s rating intention (or intentions). Our model focuses on the intentions of rating accurately, rating to avoid conflict, rating benevolently, and rating to manage impressions. For each of these intentions, tributary variables, comprising manager characteristics, situational factors, and subordinate characteristics, predict a manager’s rating intention by first influencing the manager’s attitude, subjective norms, and perceived behavioral control to rate in accordance with a particular intention. These three core components work together to form a manager’s rating intention. We highlight the role of a subordinate’s job performance as working through the same mediational pathway as other tributary variables, but also directly influencing a manager’s rating intentions. The relationship between a manager’s rating intention and rendering a rating in accordance with this intention is moderated by a manager’s actual behavioral control and internal control to rate in line with the rating intention. Finally, although we specify propositions at the level of individual intentions, we acknowledge that rating intentions may also interact to influence a manager’s rating.
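As a reading aid only, the structure summarized above can be sketched as a set of linear equations. This is an illustrative formalization under assumptions of our own, not a specification offered in the model itself: the linear and additive form, all symbols, and all coefficients are hypothetical. Here k indexes the four rating intentions, T is a vector of tributary variables (including subordinate job performance P), A, N, and C are attitude, subjective norms, and perceived behavioral control, I is the rating intention, B is actual behavioral control, G is internal control, and R is the rendered rating.

```latex
% Illustrative sketch only: linear form and all symbols are assumptions.
% k \in \{\text{accuracy}, \text{conflict avoidance}, \text{benevolence}, \text{impression management}\}
\begin{align}
  % Tributary variables shape the three core components for each intention
  A_k &= \mathbf{a}_k^{\top}\mathbf{T} + e_{A_k}, \quad
  N_k = \mathbf{n}_k^{\top}\mathbf{T} + e_{N_k}, \quad
  C_k = \mathbf{c}_k^{\top}\mathbf{T} + e_{C_k} \\
  % Core components form the intention; job performance P also has a direct path
  I_k &= \beta_{A,k} A_k + \beta_{N,k} N_k + \beta_{C,k} C_k + \gamma_k P + e_{I_k} \\
  % Actual behavioral control B and internal control G moderate the intention-rating link
  R &= \sum_{k} \left( \delta_k + \theta_k B + \phi_k G \right) I_k + e_R
\end{align}
```

In this sketch, mediation appears as the absence of any direct path from T to R other than through the intentions, the direct effect of job performance appears as the term with coefficient gamma, and moderation appears as the dependence of each intention's weight on B and G. Possible interactions among intentions are left out for simplicity.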
To provide a specific example of our general propositions, we suggest that an experienced manager, with high levels of conscientiousness, in a developmental rating situation will likely intend to render an accurate rating because he/she will hold a positive attitude toward rating accuracy (due to conscientiousness, and purpose of appraisal), will have high perceived behavioral control (due to high level of experience), and will perceive social pressure to rate in the best interest of the employee (due to the developmental context). Subsequently, the manager will be limited by his/her cognitions (i.e., did the manager observe, store, and retrieve accurate performance information?) and by situational constraints (i.e., does the manager have time and an opportunity to render the desired rating on the appraisal form?).
Boundary conditions
Our model was devised to explain rating intentions in a context whereby a manager rates subordinates in a formal performance appraisal context. A formal performance appraisal context is defined as “a discrete, formal organizationally sanctioned event, usually not occurring more frequently than once or twice a year, which has clearly stated performance dimensions and/or criteria that are used in the evaluation process” (DeNisi & Pritchard, 2006, p. 254). This is not to say that our model could not be adapted to other contexts, such as the rating of peers, the rating of supervisors, informal feedback situations, or performance management contexts. Such applications would provide exciting opportunities for further theoretical development and refinement of the processes outlined herein. In making this stipulation, our intention is to explicitly identify the precise conditions for which our model was configured.
Conclusion
We believe that our model provides exciting opportunities for empirical research and further theoretical development. We have specified a causal chain of performance rating influences and outlined numerous theoretically grounded propositions. In doing so, our goal was to present a model that (a) explains the presence of multiple rating intentions, (b) integrates the explanation of these intentions alongside existing appraisal research and theories, and (c) proposes how these intentions and their causes work together alongside known processes (i.e., measurement and cognitive processes) to affect performance ratings. We hope the current paper will stimulate further research and theoretical developments in the performance appraisal literature.
Funding
This research was supported by grants from the Social Sciences and Humanities Research Council of Canada.
