Abstract
There is a sizable literature on higher education, both in the United States and beyond, that draws attention to the phenomenon known as grade inflation. We offer an interpretation of grade inflation that turns on the choices students have over academic departments, and we argue that patterns in grades cannot be considered in isolation from the incentives that students have to sort themselves strategically across departments. Our argument draws on a game-theoretic model in which students of varying abilities face a choice between enrolling in a department whose grades are inflated and thus ability-concealing versus enrolling in a department whose grades are ability-revealing. In equilibrium, all grades are high. Nonetheless, what appears to be grade inflation is a result of the fact that the ability-revealing department in our model attracts highly talented students seeking to distinguish themselves from students of lesser ability, who avoid said department because enrolling in it is costly. Our formalization shows how student sorting can confound grades, and it implies that a full understanding of a university’s grade distribution requires knowing which departments in the university are ability-concealing and which, in contrast, are ability-revealing.
Introduction
Grade inflation appears by many measures to be endemic across the higher education landscape. Ray (2014) documents rising grades in the United Kingdom, and as of 2012, “80 percent of students [in Germany] graduate[d] with one of the top two grades.” 1 A 2010 study of grade trends at over 160 American universities revealed that mean grade point average has risen by nearly a tenth of a point per decade (Rojstaczer and Healy, 2010), and the revelation that the most commonly awarded grade at Harvard University is an “A” unleashed a firestorm of media attention on the university’s grading policies. 2 In 2001, 91% of Harvard’s graduating class received honors, rendering this indicator of student ability essentially meaningless in the sense of distinguishing among Harvard undergraduates. 3 And at Yale University, approximately 62% of grades were “A” and “A−” as of Spring 2012, whereas in 1963 this proportion was only 10%. 4 Empirical research on grading tends to show that the phenomenon of grade inflation is broad-based, although Kuh and Hu (1999) argue that, in the United States, grade inflation is disproportionately a feature of research universities and elite liberal arts colleges.
Grade inflation can be associated with numerous social problems. Because inflated grades can mask variance in student abilities (Pattison et al., 2013; Sabot and Wakeman-Linn, 1991), post-education institutions that must contend with universally high grades will have difficulty selecting the best students for the most demanding tasks. This will impose an efficiency cost on society. In addition, differing rates of grade inflation across departments can skew student enrollment decisions; STEM (science, technology, engineering, and mathematics) departments are disproportionately less likely to grade inflate, and a student who receives an early and poor grade in a STEM class is less likely to major in a STEM field (Ost, 2010; Sabot and Wakeman-Linn, 1991; Strenta et al., 1994). Moreover, there are gender and socioeconomic differences in the types of students whose grades have increased over time; women in particular are disproportionately likely to enroll in disciplines prone to grade inflation (Riegle-Crumb et al., 2012; Riegle-Crumb and King, 2010) and in disciplines perceived as “soft” (Carnevale et al., 2011; Swift et al., 2013). This will make it difficult for the most talented women to stand out from the rest. There is also evidence that, among American high school students, grades have increased more slowly for Hispanic students than for members of other ethnic groups (Sawtell et al., 2003), and in general minority students consistently receive lower grades than White students (Farkas and Hotchkiss, 1989; Van Laar et al., 1999). We return later to a discussion of social consequences of grade inflation, but for the moment, it suffices to note that differential rates of grade inflation across department and student types can reinforce stereotypes which affect some students more than others.
As we review shortly, scholars have offered a variety of explanations for observed upward trends in grades. Some explanations for grade inflation focus more heavily on (and tend to blame) the role of educational institutions and the incentives that their rules and procedures engender, while others draw attention to the behaviors of students and the extent to which their choices over departments and courses have downstream effects on measures like average grades. Certainly there are connections between these two ways of thinking—students make decisions in an environment crafted by universities—but it is nonetheless still useful to distinguish between incentives facing faculty members and incentives facing students.
Having noted this dichotomy, we offer an interpretation of grade inflation that turns on student choice over departments or, more broadly, courses of study. In particular, we present a formal model that allows students to choose whether they want to study in a department whose grades are inflated versus a department whose grades are accurate. Inflated grades can be thought of as ability-concealing and accurate grades as ability-revealing. The students in the model differ in their underlying levels of talent, and this induces students to have different preferences over the extent to which they want their grades to reveal their underlying abilities. Students in the model sort themselves strategically across departments, and this confounds the interpretation of grades. Among other things, the model shows that trends in grades—upward or downward—can be induced by changes in student sorting abilities. In light of this, caution should be exercised when drawing connections between trends in grades and potentially deleterious consequences for educational institutions and society at large.
In what follows, we discuss literature on grade inflation, and we then present our model and explain its various components. The basic model has two types of equilibria, and we show that, in equilibrium, strategic sorting by students leads to high grades. We then present an extension of the model which generalizes our initial, and coarse, characterization of student ability and also allows for what we call an education bonus. In the extension, we observe high grades in equilibrium, and the extension yields a somewhat counter-intuitive result: the more difficult a university’s ability-revealing department, the higher are average grades and the more it appears that grade inflation is ubiquitous. This extension has policy implications for efforts to rein in high grades, and we discuss such implications and others in the conclusion.
Grade inflation
Concerns over grade inflation in higher education are not new (e.g. Ekstrom et al., 1994), and as we noted in the introduction, one consequence of grade inflation is the masking of true student abilities. To the point, Sabot and Wakeman-Linn (1991) show that introductory course grades received in low-grading departments are better predictors of student performance in future classes compared to grades given in what appear to be grade-inflating departments. Similarly, they show that alternate predictors of student ability (e.g. standardized test scores, parental education levels, high school grades, and so forth) are associated with student grades in low-grading departments but not departments that routinely give high grades. These dual findings show that inflated grades effectively mask student abilities and diminish the extent to which grades signal underlying skills and talent levels. 5
One of the commonly cited reasons for an increase in grades across American colleges and universities has been the increasing weight placed on instructor evaluations in hiring and tenure decisions (Eiszler, 2002; Stratton et al., 1994). Students tend to give better evaluations to professors who award them higher grades (Johnson, 2003), and thus an increased reliance on teacher evaluations during evaluation processes can create incentives for high grades. Nelson and Lynch (1984) argue that the relationship between evaluations and grades can be exacerbated by stagnating faculty salaries, and Pressman (2007) notes that pressure for high grades will tend to be stronger for untenured but tenure-track professors and strongest for adjunct faculty, whose employment depends on enrollments.
Perrin (1998) and Kelly (2009) draw attention to the fact that a university professor may, in the course of grading, compare her students not just to other students at her own university but also to the typical American student. Perrin writes, “[Professors] imagine our students at a mythical Average U., and give the grades they would get there.” If a faculty member believes that her institution’s admission policies lead to a highly talented student body, then it follows that said faculty member should in general assign high grades. On this point, see Achen and Courant (2009) and their anecdote of
a [University of Michigan] chemistry professor who had stuck to the standards of his own undergraduate work for decades, but who came to notice that incoming graduate students at Michigan often had better grades than graduates of [his] department with similar knowledge and skill.
Other arguments on the subject of grade inflation focus specifically on student course selection. For example, in some American universities students are allowed to take classes without grades appearing on transcripts. Strategically minded students may seek to take advantage of this practice by ensuring that grades for their most difficult classes are not visible. If students strategically select certain classes to have non-visible grades, then average grade point may increase even in the face of fixed grading policies (Birnbaum, 1977). Foreshadowing the model that we present here, Prather et al. (1979) argue that changes in average grades can reflect changes in enrollment patterns; their empirical research finds that
English majors tend to receive relatively higher grades in education courses than in their other courses, while the grades they receive for physical science and foreign language courses are, on the average, lower. Physical science courses generally record lower grades for all majors, while teacher education courses comparatively record higher grades for all majors. (pp. 21–22)
The implication of such a finding is that average grades reflect student selection into coursework of interest.
Bar et al. (2009) analyze data from Cornell University and find that publicly available median grades allow students to select into leniently graded classes. Strenta et al. (1994) and Ost (2010) have similar findings, and both note that low grades in first-year science classes increase the probability that a student chooses a non-science major. There is also evidence that grading policies respond to the perceived value of a major (Freeman, 1999). When a department’s graduates do not perform as well on the job market, the department is forced to “buy” students with higher grades. Jewell et al. (2013) validate this, finding that there is substantial departmental variation in grade inflation.
In contrast to theories of grade inflation that consider student and institution incentives, Adelman (2008) argues that increasing grades may be explained by improved student ability and/or teaching quality. In this view, increasing grades are not inherently problematic. Brighouse (2008) emphasizes that, to assume that there has been no improvement in student quality over the past 30 years is to assume that there have been no efficiency gains in higher education over this period.
There is some formal work on grade inflation, but the literature is not extensive. Four examples are Yang and Yip (2003), Chan et al. (2007), Franz (2010), and Popov and Bernhardt (2013). In the first, schools have incentives to give high grades because this helps weaker students obtain jobs; this leads to labor market inefficiencies. In Chan et al. (2007), employers cannot determine whether students with high grades are high quality or whether the university that granted said grades is an easy-grading institution; as in Yang and Yip (2003), this leads to inefficient labor market outcomes. Franz (2010) models professor–student interactions with an eye on the costs that students impose on faculty by requesting high grades. In equilibrium, the “nuisance” students in Franz (2010) lead professors to inflate grades. Finally, Popov and Bernhardt (2013) propose a model where universities compete for job market outcomes. They show that more selective universities have the strongest incentives to grade inflate.
Students and institutions in this limited formal literature are strategic, but extant models in the literature do not allow students to sort themselves across departments (or other academic units) in the way described here. Insofar as contemporary university students appear to be very attuned to grading policies and how they vary by field of study (and even by class and professor), our model of sorting fills a gap in the literature.
Model
We now describe a model that sheds light on the dynamics of student sorting across university departments and resulting patterns in grades. The model is set in a single university and includes a group of students and two departments. Its premise is that the students in the university have already been admitted but must choose a department or course of study in which to enroll. As will be clear, a student’s choice between departments is informed by her interest, or lack thereof, in signaling her intrinsic ability level to a labor market that she will enter upon graduation.
Students
In the model, there are two types of students, low ability and high ability. A student’s ability is fixed and exogenous, and let the proportion of high-ability students in our hypothetical university be
We assume that the post-university labor market rewards high ability, and by implication being high ability is valuable to a student. In particular, if a student is known to be of high ability, then after her education is complete she earns a wage that we call
This latter assumption is not binding; one could treat what we call the low wage
Asserting that there are two types of students—low and high ability—is a simplification. We could have assumed that student ability exists on a continuum, and we consider such an extension to our model after we present initial results.
Departments
We assume that our hypothetical university contains departments whose grades are either ability-revealing or ability-concealing. These types of departments differ only in the manner in which they assign grades to their students. It is broadly accepted that grading practices differ between departments and that these differences impact student enrollment patterns (Bar et al., 2009; Ost, 2010; Sabot and Wakeman-Linn, 1991; Strenta et al., 1994).
An ability-revealing department is one that offers courses with regular and discriminating examinations, projects, assignments, and so forth. These examinations, say, allow the department to know whether a given student enrolled in the department is of low ability or is of high ability, and the department indicates this knowledge via grades. In particular, an ability-revealing department assigns an “A” grade to high-ability students (because these students did well on the department’s examinations) and a “B” grade to low-ability students. Assuming that an ability-revealing department assigns grades of “A” and “B,” as opposed to “A” and “C” or “A” and “D,” is of no consequence. The key here is that an ability-revealing department assigns grades that discriminate between low- and high-ability students. The regular and discriminating examinations given in an ability-revealing department require an effort cost for enrolled students, who know that these examinations and related assignments will ascertain their underlying abilities. Let
In contrast, an ability-concealing department is one whose courses do not discriminate between low- and high-ability students. The courses in an ability-concealing department are by definition not excessively challenging, and the key is that all students enrolled in them receive excellent grades, in particular, marks of “A.” Moreover, the lack of discriminating examinations means that students in an ability-concealing department are not subject to the effort cost comparable to the cost
We could have assumed that students enrolled in an ability-concealing department are forced to pay an effort cost akin to the cost required of students in an ability-revealing department. Had we done this, our model’s equilibria, which follow shortly, would have been a function of the difference between the effort cost required of a student in an ability-revealing department and the cost required of a student in an ability-concealing department. Thus, the assumption that
We also could have assumed that low- and high-ability students pay different costs for attending an ability-revealing department. That is, we could have posited that high-ability students pay
Labor market
As we noted above, we do not formally model firms in a post-education labor market, nor do we model, say, admissions committees in graduate institutions. However, we assume that the market knows which types of departments are ability-revealing and which are ability-concealing. This does not strike us as a particularly strong assumption although we recognize that one could argue that firms, graduate schools, and other post-graduate institutions are not informed about which departments in universities give ability-discriminating grades.
Average grades and grade inflation
As will be clear shortly, our model generates a distribution of grades across students, and from this distribution we can calculate average grades in equilibrium. Before we discuss equilibria, we need to define grade inflation. In our two-type model, high-ability students cannot receive inflated grades because they are of high ability; only low-ability students can have inflated grades. Thus, we say that a low-ability student receives an inflated grade if said student receives a grade that is equal to or greater than the grade received by a high-ability student. If a low-ability student receives a grade of “A,” then we say that this student’s grade is inflated. Why? A high-ability student who attends an ability-revealing department will by definition receive an “A” grade. Thus, a low-ability student who also receives an “A” grade has an inflated grade.
It is important to distinguish between the fraction of inflated grades in equilibrium and the fraction of high grades. These two quantities are distinct: a high grade is not necessarily inflated if it is earned by a high-ability student. Distinguishing between inflated versus high grades helps us understand the difference between situations where there is serious grade inflation (many students receive grades greater than their abilities) versus situations where there are many high yet accurate grades.
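The distinction between high grades and inflated grades can be made concrete with a small tally. The following is only an illustrative sketch of the definitions above: the `Record` type, the function name, and the four-student class are hypothetical and do not come from the model's formal statement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """One student in the two-type model (hypothetical container)."""
    high_ability: bool  # the student's true type
    grade: str          # "A" or "B"


def grade_fractions(records):
    """Return (fraction of high grades, fraction of inflated grades)."""
    n = len(records)
    high = sum(r.grade == "A" for r in records)
    # Inflated: a low-ability student holding the grade that a high-ability
    # student earns in an ability-revealing department, namely "A".
    inflated = sum(r.grade == "A" and not r.high_ability for r in records)
    return high / n, inflated / n


# Hypothetical class: three high-ability students and one low-ability
# student, all of whom receive an "A".
records = [Record(True, "A")] * 3 + [Record(False, "A")]
high_frac, inflated_frac = grade_fractions(records)
print(f"high grades: {high_frac:.2f}, inflated grades: {inflated_frac:.2f}")
```

In this made-up class every grade is high, yet only the low-ability student's grade counts as inflated, which is the gap between the two quantities that the text emphasizes.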
Beyond grade inflation
Key to our model is the opportunity for individuals—in our case, students—to signal their types by engaging in costly behavior—taking classes in an ability-revealing department. What makes grading particularly interesting is the fact that grading scales typically have upper bounds. 10 As we will see in our forthcoming equilibrium analysis, this can lead to situations where many students receive the same top grade even if said students differ in their underlying abilities.
If grades lacked an upper bound, then there would be no such thing as the maximum grade a student could receive. Still, an ability-concealing department in this situation could nonetheless give all of its students identical marks. This would be fully consistent with the department’s being ability-concealing. In other words, in an ability-concealing department, the distribution of grades can be compressed even in the absence of something like a top grade. With this in mind, even if grades were in theory unbounded, our model still would speak to the existence of different types of departments, some that offer grades that discriminate between students and others that do not.
Many evaluation systems have “top” grades, however, and our model thus applies to situations beyond university grading. Firms regularly have to assess their employees, for example. Some tasks in a firm are presumably ability-concealing and others are ability-revealing. If a firm’s employees have a voice in choosing what they do within a firm, then the dynamics we have touched on in our model might lead to low-quality employees choosing tasks or career paths that are ability-concealing; high-quality employees will do the opposite.
Similarly, suppose that a set of legislators in a city council are of two types, low and high quality, and are faced with a policy reform problem, that is, how best to reform a municipal social welfare program. A legislator can choose between an easy reform, one that does not accomplish much but poses little risk, or a challenging reform, one that might expose a low-quality legislator to the critique that he or she did not design a reform very well. Suppose that after a reform effort, an interest group generates a rating—that is, a grade—of said reform. In this framework, high-quality legislators may seek to distinguish themselves by choosing challenging reforms if voters are sufficiently attuned to the interest group’s post-reform ratings. However, if interest group ratings are bounded above like grades, then the notion of rating inflation is plausible if, say, all legislators choose easy reform projects that shed little light on underlying legislator quality.
Our general point here is that the study of grading extends beyond academic environments. The model we have offered is one of evaluation in an environment in which tasks differ in the extent to which they identify talented individuals. The equilibrium dynamics that are forthcoming below will be helpful guides in understanding all such environments.
Equilibria
We assume that a university has two departments, one that is ability-revealing and the other, ability-concealing; the assumption of two departments is not constraining and we comment on it later. The university’s students,
The essence of a game-theoretic model is that the utility of a player selecting a given strategy is conditional on the strategies chosen by other players. One can see evidence of this type of mutual interdependence in the model described here when considering the value of a student’s enrolling in an ability-concealing department. To make this clear, suppose that a low-ability student is considering such a department. If all other students also enroll in the ability-concealing department, and if there are mainly high-ability students because πH
The (Bayesian) equilibria of the model depend on the fraction πH

Model equilibria as a function of
The proof of Lemma 1 is in Appendix 1, and the lemma characterizes the model’s pooling and separating equilibria. We discuss these equilibria in this order.
Pooling equilibrium. When
of high-ability students is large and indeed almost all students are of high ability. The intuition for this is as follows. Recall that
Continuing, when every student enrolls in the ability-concealing department, which is what happens in the pooling equilibrium under consideration, then all students receive the same grades, in particular, grades of “A.” An outside observer assessing this situation—in which the fraction of “A” grades is one and there is no variance in awarded grades—might be inclined to say that this is a situation characterized by rampant grade inflation. Such a characterization would be inaccurate, however. Rather, the fraction of inflated grades in the pooling equilibrium is
When
One might want to argue that the presence of all students’ pooling on an ability-concealing department is an observable indication that almost every student is of high ability. Thinking empirically about actual trends in grades, this reverses the concern that many have articulated about inflated transcripts. To the point, in the pooling equilibrium discussed here, an abundance of students who enroll in an ability-concealing department means that almost every student is highly talented and not, say, that all students lack ability and are choosing an ability-concealing department because they fear being exposed as such by an ability-revealing department. In our pooling equilibrium, high-ability students do avoid paying the effort cost
In the pooling equilibrium here, the ability-revealing department has no students in it. Presumably this is not ideal for the department, and indeed one might conjecture that such a department would anticipate a lack of students and change its grading policy prior to student enrollment decisions. Department grading policies are probably sufficiently sticky so that changing grading norms is not a simple process, and from this perspective treating department grading policies as exogenous seems natural. Nonetheless, we are exploring the matter of strategic department grading policies in other research.
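The logic of the pooling equilibrium can be sketched numerically. The sketch below rests on assumptions that are ours for illustration: a student's payoff is her wage minus any effort cost, a student whose ability is not revealed earns the expected wage given the share of high-ability students, and the names `pi_high`, `w_low`, `w_high`, and `cost` are hypothetical labels. It is not a restatement of Lemma 1, whose exact threshold is given in the paper's appendix.

```python
def pooling_is_stable(pi_high, w_low, w_high, cost):
    """Check whether any student gains by leaving the ability-concealing
    department when every other student enrolls there.

    Assumed payoffs (illustrative only): wage minus effort cost.
    """
    # The market cannot tell types apart, so each pooled student earns the
    # expected wage given the share of high-ability students.
    pooled_wage = pi_high * w_high + (1 - pi_high) * w_low
    # A student who deviates to the ability-revealing department is graded
    # accurately and pays the effort cost.
    high_deviation = w_high - cost
    low_deviation = w_low - cost
    return pooled_wage >= high_deviation and pooled_wage >= low_deviation


# Hypothetical numbers: the wage gap is 1 and the effort cost is 0.25.
for pi_high in (0.5, 0.9):
    stable = pooling_is_stable(pi_high, w_low=1.0, w_high=2.0, cost=0.25)
    print(f"share of high-ability students={pi_high}: pooling stable={stable}")
```

Under these assumed payoffs, pooling survives only when high-ability students are numerous enough that the pooled wage is close to the high wage, which matches the intuition that a high-ability student then has little to gain from paying the effort cost.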
Separating equilibrium. The model always has a separating equilibrium in which low-ability students attend the ability-concealing department and high-ability students, the ability-revealing department. In this equilibrium, whose existence is not a function of the relationship between
This feature of the separating equilibrium has one rather notable consequence: grades appear inflated in both departments. To be precise, in the separating equilibrium all high-ability students receive top grades—because they are in fact of high ability and are enrolled in an ability-revealing department—and low-ability students receive top grades, too—because they enroll in an ability-concealing department which provides everyone with high grades. Thus, our model’s separating equilibrium, which exists for all values of πH and
In fact, the distribution of grades in the separating equilibrium has literally zero variance because in it every student receives an “A.” These numerous “A” grades, however, reflect fundamentally different dynamics. “A” grades received by high-ability students are accurate evidence of excellent students being willing to subject themselves to an ability-revealing process; thus, these “A” grades do not reflect grade inflation. In contrast, however, “A” marks received by low-ability students are evidence of low-ability students avoiding an ability-revealing process; “A” grades received by these students do reflect inflation, and thus the fraction of inflated grades in the separating equilibrium is
The model’s separating equilibrium is more compelling than the previously discussed pooling equilibrium because the latter only exists when the fraction of high-ability students is very large. With this in mind, we argue that our model shows that student sorting by itself is sufficient to lead to a situation in which all students receive identical grades, all of which are “A” marks; this situation looks like one in which grade inflation is a serious problem but, at least for high-ability students, it is not.
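The separating outcome described above can be checked with a few lines of arithmetic. The sketch assumes, for illustration only, that payoffs equal wages minus effort cost, that the market treats anyone in the ability-concealing department as low ability in this equilibrium, and that the effort cost is smaller than the wage premium; the parameter names are hypothetical.

```python
def separating_outcome(pi_high, w_low, w_high, cost):
    """Return (is_stable, grades_by_type, share_of_inflated_grades) for the
    outcome in which high-ability students choose the ability-revealing
    department and low-ability students the ability-concealing one.

    Payoffs are an illustrative assumption: wage minus effort cost.
    """
    # A high-ability student who deviates to the concealing department is
    # taken for a low-ability student and earns the low wage.
    high_stays = w_high - cost >= w_low
    # A low-ability student who deviates is graded "B", still earns the low
    # wage, and additionally pays the effort cost.
    low_stays = w_low >= w_low - cost
    # Both departments hand out "A" grades in this outcome.
    grades = {"high_ability": "A", "low_ability": "A"}
    # Only the low-ability students' "A" grades are inflated.
    share_inflated = 1 - pi_high
    return high_stays and low_stays, grades, share_inflated


stable, grades, share_inflated = separating_outcome(
    pi_high=0.5, w_low=1.0, w_high=2.0, cost=0.25
)
print(stable, grades, share_inflated)
```

With these made-up numbers every student receives an "A", so observed grades have no variance, while the share of inflated grades equals the share of low-ability students.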
We earlier mentioned that our assumption about the existence of only two departments in a university is not binding. If there were more than two departments in our hypothetical university—some ability-concealing and others ability-revealing—the separating equilibrium we have described here would continue to exist as long as the ability-revealing department or departments imposed effort costs beyond those imposed by the ability-concealing departments. The key to the equilibrium is not the number of departments per se; rather, the key is the fact that ability-revealing departments impose more of a cost on enrolled students than do ability-concealing departments.
Another notable feature of the separating equilibrium is that it requires only that πH be neither 0 nor 1. If πH were 1, then all students would be of high ability and the only equilibrium that would exist in the model would be one in which students pooled on the ability-concealing department. Observationally speaking, grades would appear to be inflated in this scenario, but in reality they would not be because all students would be of high ability. If, on the other hand, πH were 0, then all students would again pool on the ability-concealing department. This would yield a situation with rampant grade inflation, one wherein all students of low ability are labeled by an ability-concealing department as high ability.
Extension: Continuous student ability and an education bonus
One might argue that our characterization of student ability as binary—either low or high—is too coarse and that this may be responsible for the result, above, that, when students separate, there is no variance in student grades. With this in mind, we now offer an extension of our model that allows us to explore the consequences of allowing student ability to exist on a continuum. Along with this change we also include in the extension an education bonus that a student receives if she enrolls in an ability-revealing department. As shown below, the extension of our model does lead to variance in student grades; however, it does not change our fundamental results about grade inflation and the effects of student sorting on the distribution of grades.
Let student ability be denoted
When student ability exists on a continuum, we can no longer speak simply of “low” and “high” ability students. In addition, with continuous student ability, we need a more refined characterization of grades and of post-education wages. We continue to assume that an ability-revealing department is one that assigns grades based on underlying student abilities, and with a continuous distribution of abilities this is obviously a bit of an abstraction insofar as a finite number of class letter grades—“A,” “A
With respect to ability-concealing departments, we continue to assume that such a department awards very high grades to all of its students. In particular, in the model extension, we assume that every student in it receives a grade of 1. This is parallel to our earlier assumption that ability-concealing departments award grades of “A” to their students.
In terms of wages, suppose that a student of ability
As was the case in our initial model formulation, a student whose wage is not known receives a base wage in the labor market corresponding to expected ability level where this expectation is taken given equilibrium student behavior. Such a student cannot receive the education bonus
Considering effort cost, education, and the effect of ability on what we are calling base wages, if a student with ability
The equilibrium of the model extension depends on a cutpoint that we call
If
If
The knife-edge condition
Lemma 2 shows that the indifference cutpoint

Extended model equilibria as a function of c and
Suppose first that
Suppose one were to argue on normative grounds that student pooling on the ability-revealing department is a good thing, that is, that society benefits when
Now consider the case
When
In the semi-pooling equilibrium, the average student grade is
Differentiating this expression with respect to
The result about the effect of the effort cost
When the education bonus e increases, then the effects described above move in the opposite direction. The claim that greater difficulty will lower average grades is, however, a partial equilibrium assertion. Since students sort themselves conditional on department difficulty, making an ability-revealing department more difficult will drive students away from it and thus have the opposite effect of what the dean or other figure intended.
Such an argument holds conditional on the existence in our hypothetical university of an ability-concealing department, and this highlights the possibility that there is a collective action problem in university grading, one that the dean could overcome if she could simultaneously convince an ability-revealing department to be more difficult while convincing (or compelling?) an ability-concealing department to become ability-revealing. We will return to this point later. At the moment, though, it suffices to note that, when students have an ability-concealing department as an option, the more difficult the ability-revealing department, the higher are average grades and thus the more grades look like they are inflated.
One sees a similar point when examining the variance in grades in the extended model’s semi-pooling equilibrium. This variance is
Algebra shows that this variance approaches
More importantly, the derivative of the above variance is negative for relevant parameter values. In other words, the more difficult the ability-revealing department becomes, the less variability there is in student grades. This is because increasing difficulty leads to an increasing number of students in an ability-concealing department and accordingly less grade variability. If, say, our aforementioned university dean or administrator were to argue that his or her institution should seek extensive variability in grades—because, say, variability in grades makes it easier to distinguish low and high ability students—the implication of our extended model is that the dean should insist that the ability-revealing department be as easy as possible on its students.
Given the definition of
We motivated the extension of our model with the recognition that the coarse way in which we modeled student ability—low versus high—diminishes our ability to ascertain whether in equilibrium there is variance in student grades. Our extension shows that this concern was indeed valid. Namely, as long as the effort cost
Our final point about the model extension concerns the possibility of including a cost term for enrolling in an ability-concealing department. Were we to have done this, then we would have seen that what we call
Discussion
We have offered a game-theoretic analysis of student grading, an analysis motivated by empirical studies documenting upward trends in grades in higher education institutions. Our formalizations shed light on the implications of student choice over departments, a key feature of grading processes in universities that is often neglected in discussions of grading trends in higher education. The students in our model are both effort-averse and forward-thinking, and this induces a dynamic in which the best students seek to distinguish themselves from their lower-ability counterparts and are willing to undertake costly behaviors so that their true abilities are revealed to a post-education labor market. The end result of this is that average grades are high but not because of, say, lax standards or enrollment pressures. Rather, grades are high because good students appropriately earn them from an ability-revealing department and lesser students garner them, so to speak, from an ability-concealing department.
To be clear, we are not arguing that our model should be thought of as a (or “the”) comprehensive explanation of grade inflation in higher education. Our primary objective has not in fact been to offer a complete theory of grading in educational institutions but rather to encourage scholars interested in grading to consider assiduously the consequences of student sorting on grade distributions. Existing literature makes it clear that there are a variety of explanations for the types of grade inflation that empirically driven scholars of education have identified, and our models should remind those considering these explanations to be mindful of how sorting can manifest itself.
The model adduced here shows how student sorting at one level of the higher education landscape (students within universities) leads to distributions of grades that seem extensively inflated but, for high-ability students at least, are nonetheless accurate. There are additional levels of sorting that we have not specifically engaged, and these include sorting across universities and sorting within departments. Although we have not modeled choice of university, one could envision a more general model of education wherein a student selects into the best university that accepts him or her and then chooses a department within the chosen university. If some universities are known for being ability-concealing and requiring little effort, then students of higher ability will presumably not apply to these institutions and instead pursue education in costly, ability-revealing institutions. These latter institutions will then be disproportionately populated by high-ability students, which will compound the grading dilemmas caused by within-university selection into departments. Given the recent increase in the United States in competition for admission into elite colleges and universities, it is conceivable that across-institution sorting may be a notable factor in explaining nationwide increases in average grades. 12
What does our model say about contemporary trends in grades? Perhaps the most direct implication is as follows: within-institution trends in grades are hard, if not outright impossible, to interpret in isolation from trends in the extent to which students sort themselves strategically into departments. Put another way, student sorting confounds grading, and therefore analyses of grade trends that are executed independently of sorting dynamics can be misleading. In our basic model where parameters are reasonable (i.e. the cost of attending an ability-revealing department is not too high compared to long-term wage streams), literally all students receive high grades and there is correspondingly no variance in the distribution of grades. Viewed from the lens of grades only, this situation looks problematic and in need of remedy; it is problematic, however, only for low-ability students as the high grades received by top students correctly reflect these students’ high-ability levels. Challenged to defend its plethora of high grades, an ability-revealing department in this situation might respond, “All of our students are excellent!” Due to student sorting driven by high-ability students seeking to distinguish themselves from low-ability students, this claim would be accurate.
Put another way, our models show that the grading practices of individual departments cannot be assessed simply by observing whether they assign many “A” marks or, say, “C” marks. Suppose that a university dean were to compare the grades across two departments in her jurisdiction, and suppose that she were to notice that both consistently give many (or perhaps exclusively) “A” grades. Should the dean insist that these two departments raise their grading standards or, say, ramp up the effort levels required for classes in said departments? Not necessarily. Of the two departments, if one is ability-concealing, then only top students choose the other, thus inducing this department to give a plethora of high grades that are accurate. If this department were to raise its standard, this would not alleviate the selection incentives that we have explored here.
Our model is not explicitly dynamic, but it nonetheless suggests that one source for observed trends in grades could be the emergence of one or two ability-concealing departments in a university. That is, suppose that many years ago all departments in a hypothetical university were ability-revealing and entailed effort costs. Were this the case, then we would expect these departments to have issued both low and high grades. Suppose then that an exogenous shock (the Vietnam War, as some have conjectured 13) led one department in a university, or perhaps a small number of departments, to adopt grade-inflating practices and simultaneously reduce the effort needed to enroll in said department or departments. As soon as this were to have happened, we would be in a situation where the university had a combination of both ability-concealing departments and ability-revealing departments. In the presence of both types of departments, our model suggests that forward-looking students of high ability will seek to separate themselves from lower-ability students, the latter of whom will choose ability-concealing departments and the former, ability-revealing departments. The result of this will be that all students earn high grades. In this example, the culprit for high average grades overall is the presence in a university of a small number (or even a non-small number) of ability-concealing departments. Indeed, one could argue that ability-revealing departments are somewhat at the mercy of ability-concealing departments: once some of the latter exist, the former will enroll only good students. This leads to a flattening of the grade distributions produced by ability-revealing departments.
This point highlights a collective action problem associated with grading. A department that by itself wanted to address institution-wide grade inflation can be stymied by the ability-concealing behaviors of other departments. If an ability-revealing department were to make its classes increasingly challenging in an attempt to mitigate inflation, then it would make the overall grade inflation problem worse and in so doing decrease its own enrollments. To the extent that low enrollments are problematic for departments that might want to use enrollment figures to argue for faculty positions, no department has an incentive on its own to increase the cost associated with its classes. This sort of collective action dilemma means that university administrators should not assume that individual departments will ever be able to coordinate among themselves and arrive at a solution to what administrators might consider a grade inflation problem.
Our model has implications for department enrollment patterns associated with gender, race, and socioeconomic status. We have assumed throughout this article that students know their own ability levels, but it is worth considering how it is that a student might learn whether she is of low or high ability and whether it is possible for a student to think, wrongly, that she is of low ability when she actually is of high ability. This sort of error is particularly pernicious for students because a high-ability student who believes that she is of low ability may prefer an ability-concealing department over an ability-revealing department, even though the latter would be more valuable. Moreover, if pre-university evaluations for a given group of students are systematically biased in this way so that students in said group regularly underestimate their abilities, then members of this group will enroll in ability-concealing departments even though they should not. This will lead to labor market inefficiencies (high-ability individuals will not be treated as such) and depress the group’s aggregate, post-education earnings in the long run.
There is evidence that these kinds of concerns shape student enrollment patterns across departments. Ehrlinger and Dunning (2003) show that women often underestimate their scientific knowledge and aptitude. According to our model, this will lead women disproportionately to select into ability-concealing departments, and this will have society-wide implications, namely, potentially wasted talent and a lack of women in certain scientific fields. Similar concerns have been raised about African-American students (Ewing et al., 1996; Nacoste, 1989), and there is evidence that, ceteris paribus, students of low socioeconomic status are less likely to pursue further academic study than students of higher socioeconomic status (Erikson et al., 2005; Jackson et al., 2007). Per our model, the culprit for this could be systematic biases in pre-university grades which lead low socioeconomic students to doubt their own abilities.
A final point on this subject concerns risk aversion. If a group of students is more risk averse than another, this could lead members of the group to enroll in ability-concealing departments rather than risking enrolling in the ability-revealing departments. The relationship between gender and risk aversion is not simple (e.g. Eckel and Grossman, 2008; Jianakoplos and Bernasek, 1998; Schubert et al., 1999), but it is worth pointing out that, if women are more risk averse than men, then this dynamic will compound the types of biases, noted above, that drive women disproportionately toward ability-concealing departments.
We conclude with comments that link our game-theoretic approach to empirical work on grade inflation, and here we pursue two approaches. First, we discuss our model’s implications for the ways in which one might seek to understand whether grades in a given institution are inflated. And second, we consider how one might test to see whether our model’s dynamics roughly approximate what one observes in higher education institutions.
With respect to the first point, suppose that a researcher at a university wanted to know about the extent to which her institution’s departments were ability-concealing or ability-revealing. Knowledge of this type could in principle inform university policy decisions insofar as regulating ability-concealing departments by compelling them to issue ability-revealing grades could “solve” the problem of grade inflation to the extent that it is considered a problem. How would an interested party determine which departments in a university are ability-concealing? As is clear from our theoretical results, looking at which departments have high grades is not sufficient and can actually be misleading. Rather, ability-concealing departments can be identified because their students vary in ability yet receive similar grades. Presumably the administration in most if not all educational institutions knows, say, standardized test scores of all its enrolled students. Departments whose enrolled majors, say, have high variance in test scores yet have received disproportionately high grades may be ability-concealing.
To be precise, suppose that of two departments in a university, one had high variance in student test scores and low variance in grades, and suppose that the second had low variance in test scores and high variance in grades. This would presumably indicate differences in grading practices in a way that falls roughly along the ability-concealing versus ability-revealing dichotomy described here. These variances are still subject to a student sorting confound, but this is generically true unless a university were to compel all of its students to enroll in a common course. Absent such a common course, combining variances in measures of ability like standardized tests with variances in grade distributions might yield a plausible picture of which departments in a university are ability-concealing and which are ability-revealing. In the long run, this will aid the general understanding of grading dynamics and how educators and researchers should interpret trends, both upward and downward, in grades.
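The variance comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the function names `variance_profile` and `possibly_concealing`, the median-based cutoffs, and all of the numbers are hypothetical and are not part of our model or of any actual data set.

```python
from collections import defaultdict
from statistics import median, pvariance


def variance_profile(records):
    """Map each department to (variance of test scores, variance of grades)."""
    scores = defaultdict(list)
    grades = defaultdict(list)
    for dept, test_score, grade in records:
        scores[dept].append(test_score)
        grades[dept].append(grade)
    return {d: (pvariance(scores[d]), pvariance(grades[d])) for d in scores}


def possibly_concealing(profile):
    """Flag departments with above-median score variance and below-median grade variance.

    The median cutoffs are an assumed rule of thumb, not a result from the model.
    """
    score_cut = median(v[0] for v in profile.values())
    grade_cut = median(v[1] for v in profile.values())
    return sorted(
        d for d, (sv, gv) in profile.items() if sv > score_cut and gv < grade_cut
    )


# Made-up records: (department, standardized test score, grade on a 4-point scale).
records = [
    ("X", 1000, 4.0), ("X", 1200, 4.0), ("X", 1400, 3.7), ("X", 1550, 4.0),
    ("Y", 1450, 2.7), ("Y", 1500, 3.3), ("Y", 1520, 4.0), ("Y", 1550, 3.0),
]
profile = variance_profile(records)
flagged = possibly_concealing(profile)
print(flagged)
```

In this toy example, department X enrolls students with widely dispersed test scores yet awards nearly uniform grades, so it is the one flagged. As noted above, such a flag remains subject to the sorting confound and is at best suggestive.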
Another approach to assessing whether grades in a university are inflated is to consider whether post-education institutions that in theory rely on these grades actually use them. Universities know the enrollment choices of their students and which students receive high grades; universities will often also know which students, say, receive interview opportunities and valuable job offers. Armed with this information, universities could in principle determine which grades are correlated with post-education success. For example, if there is a department that produces high grades which do not predict post-academic success, then it follows that this department may be guilty of grade inflation. Similarly, if there is another department such that grades from it—low or high—predict success, then this department is presumably ability-revealing and thus attracting the best students. Moreover, if a university observes that its students are subjected to grueling and extensive interviews, this might suggest that none of its grades is particularly meaningful. Long interviews are only necessary, one would think, if grades are not signaling abilities, that is, if there is university-wide grade inflation. Similarly, if student activities like internships and formal recommendations carry more weight in the post-education labor market than grades, then one might surmise that grades are lacking in their signaling ability.
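One way to make this second diagnostic concrete is to compute, department by department, the correlation between grades and some measure of post-education success. The sketch below assumes a hypothetical outcome measure (starting salary in thousands), a hypothetical helper `pearson`, and made-up numbers; it is meant only to illustrate the logic, not to prescribe an empirical design.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation, or None when either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        # Constant grades cannot rank students, so no correlation is defined.
        return None
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


# Made-up data: department -> (grades, starting salaries in thousands).
departments = {
    "X": ([4.0, 4.0, 4.0, 4.0], [38, 45, 52, 70]),
    "Y": ([2.7, 3.3, 4.0, 3.0], [40, 50, 65, 45]),
}
grade_signal = {d: pearson(g, w) for d, (g, w) in departments.items()}
print(grade_signal)
```

Here department X awards identical grades to students whose outcomes differ, so its grades carry no information about success, whereas grades in department Y track outcomes closely. A real analysis would of course need far larger samples and controls for student sorting.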
With respect to our second empirical point, we now consider four empirically testable hypotheses based on our model.
First, students will sort themselves based on self-perceived ability across types of departments. This hypothesis could be tested by conducting a survey among undergraduate students prior to enrollment decisions. Our model suggests that student perceptions about the difficulty of grading in a department should be correlated with the perceived ability level of the students majoring or concentrating in that department. Students who perceive themselves as high ability should cluster in difficult concentrations, ceteris paribus, while students who believe they are low ability will do the opposite. Since an important implication of our model is that the grade distributions within a department cannot be, on their own, evidence of grade inflation, clustering in self-perceived student ability could be an important sign that grade inflation is occurring at disparate rates among departments.
Second, employers’ preferences for hiring students in departments perceived as ability-revealing will stem in part from the perceived higher ability level of students in those departments. This hypothesis could be tested using a survey of employers who frequently recruit undergraduates on a particular campus. Employers could be asked about whether they prefer certain majors, how they interpret grades among students from different majors, and how they perceive the grading practices within those majors. Employer reactions to mock student profiles that vary only in the major-granting department could be used to measure how much of a difference perceptions of different majors make. These responses could then be used to assess whether the weight employers place on grades varies with the perceived grade inflation in the major. In the survey considered here, it would be important to focus on comparing departments that have similar course content and employers that do not have a strong preference for students with specific technical skills.
Third, departments that place a high value on attracting the highest ability students will be the least likely to have inflated grades. A key implication of our model is that high-ability students will select into non-grade-inflating departments. Consequently, departments that want to attract these students will have an incentive to provide grades that accurately reflect student abilities. This hypothesis could be tested by conducting a survey of department faculty regarding both the types of students a department is seeking to attract and the department’s grading policies: whether it mandates curves, mandates median grades, and considers grade distributions in the context of course evaluations. We would expect departments that put an emphasis on attracting the highest ability students to be the most aggressive in combating grade inflation, ceteris paribus, while departments that emphasize enrollment maximization regardless of student ability would be more tolerant of grade inflation.
Fourth, student sorting will increase in the presence of increased information about department grading practices. Students can only sort themselves across department types if they know which departments are ability-concealing and which are not. This said, it is probably rare that students have perfect knowledge about department grading practices. Consequently, in situations where information about department grading practices suddenly increases, we would expect a surge in student sorting. Suppose that an institution were to promulgate a policy by which class median grades are made public. Our model suggests that this sort of natural experiment will lead to increased student sorting.
With this last point in mind, we end by noting a previously cited study that draws on data from Cornell University. As we described earlier, Cornell began publishing information about course median grades in 1996, and Bar et al. (2009) discuss changes in student enrollment patterns at Cornell following the 1996 policy change. In particular, they show that enrollment in low-grading classes declined after Cornell began publishing median grades; these classes are presumably ability-revealing. However, and this is the key point, this effect was large for students with relatively low grade point averages and yet statistically indistinguishable from zero for students in the top 10% of Cornell’s grade distribution. In other words, Cornell’s policy change, which allowed students to determine which classes were ability-revealing and which were ability-concealing, produced a situation wherein low-ability students shifted out of the former in favor of the latter. High-ability students did not behave this way, and this is what our model would suggest. That the Cornell study produces results in line with our formalization is pleasing. It emphasizes the role that student sorting plays in grading and suggests that incentives for sorting should be a part of the academic literature on grading practices.
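The comparison behind the fourth hypothesis can be sketched as a simple difference-in-differences on enrollment shares. Everything in the sketch is an assumption made for illustration: the record layout, the group labels, the function names `low_grading_share` and `sorting_shift`, and the enrollment counts are all invented and are not drawn from the Cornell data or from any other study.

```python
from collections import Counter


def low_grading_share(enrollments):
    """Share of enrollments in low-grading classes for each (period, group)."""
    total = Counter()
    low = Counter()
    for period, group, class_type in enrollments:
        total[(period, group)] += 1
        if class_type == "low_grading":
            low[(period, group)] += 1
    return {key: low[key] / total[key] for key in total}


def sorting_shift(shares, low_group="low_gpa", high_group="high_gpa"):
    """Change in share for the low group minus the change for the high group."""
    change_low = shares[("after", low_group)] - shares[("before", low_group)]
    change_high = shares[("after", high_group)] - shares[("before", high_group)]
    return change_low - change_high


# Invented enrollment records: (period, GPA group, class type).
enrollments = (
    [("before", "low_gpa", "low_grading")] * 2
    + [("before", "low_gpa", "high_grading")] * 2
    + [("after", "low_gpa", "low_grading")] * 1
    + [("after", "low_gpa", "high_grading")] * 3
    + [("before", "high_gpa", "low_grading")] * 2
    + [("before", "high_gpa", "high_grading")] * 2
    + [("after", "high_gpa", "low_grading")] * 2
    + [("after", "high_gpa", "high_grading")] * 2
)
shares = low_grading_share(enrollments)
shift = sorting_shift(shares)
print(shift)
```

A negative value indicates that, after grading information became public, students in the low group moved out of low-grading classes by more than students in the high group did, which is the pattern of increased sorting that the hypothesis anticipates.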
Footnotes
Appendix 1
Acknowledgements
The authors thank Craig Volden for the conversation that inspired this paper and Mark McPeek, Russell Muirhead, two anonymous referees, and seminar participants at Dartmouth College, the University of California at San Diego, the University of Virginia, and Yale University for helpful comments.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
