Abstract
This research documents systematic gender performance differences (GPD) at a top business school using a unique administrative data set and survey of students. The findings show that women’s grades are 11% of a standard deviation lower in quantitative courses than those of men with similar academic aptitude and demographics, and men’s grades are 23% of a standard deviation lower in nonquantitative courses than those of comparable women. The authors discuss and test for different reasons to explain this finding. They show that a female instructor significantly cuts down GPD for quantitative courses by raising the relative grades of female students. In addition, female instructors increase women’s interest and performance expectations in these courses and are perceived as role models by their female students. These results provide support for a gender stereotype process for GPD and show that faculty can serve as powerful exemplars to challenge gender stereotypes and increase student achievement. The authors discuss several important implications of these findings for business schools and for society.
Women tend to be underrepresented in high-paying careers and jobs in management. The overall gender disparity in wages in management occupations in the United States is substantial: women earn 77 cents for every dollar earned by men (U.S. Bureau of Labor Statistics 2016). The paucity of women in some business sectors contributes a large share of the gender gap in wages (Blau and Kahn 2017) and handicaps economic growth (Hsieh et al. 2019). It can also have a negative impact on corporate structure and culture, with compounding downstream consequences for women’s success. Discrimination against women in terms of salaries earned and promotions garnered has been amply documented in many fields with lopsided gender representation, notoriously so in the technology firms of Silicon Valley (Cao 2017) as well as in finance, academia, and local government (Kolhatkar 2017). While lawsuits may take care of specific instances of discrimination, a contributor to the corporate culture that fosters such discrimination, namely gender imbalance, remains to be addressed.
Why is there a gender imbalance in certain business fields, and how can it be changed? Many technology positions in Silicon Valley firms, the current hotbed of highly lopsided gender ratios, represent quantitative business fields and encompass multiple business disciplines such as marketing, finance, and operations. Many other top-paying companies and industries also have substantial levels of gender imbalance and wage gaps, especially at higher levels of management (e.g., financial services: Edelman et al. 2019; consulting: Marriage 2019). Business schools train the talent fueling most of the management jobs. Even though business schools in the United States boast 43%–47% female student representation (Department of Education 2018), if female students are not successful in the academic paths that lead to more lucrative career options, the pipeline issue cannot be fixed. Therefore, it is imperative to explore what gendered achievement disparities exist in business education and whether interventions can reduce or expunge these disparities.
In this article, we investigate the differences in academic achievement of male and female students at a top undergraduate business program and examine what drives these differences. Our empirical analyses focus on the grades of 6,312 undergraduate students who represent the 2005–2018 graduating classes, focusing on students’ academic performance in the introductory courses of the core curriculum, where coursework is mandatory and students are randomly assigned to different sections of a course. This focus eliminates concerns about student self-selection. Due to the coordinated nature of the core curriculum, all sections of a course in a given term have the same syllabi, materials, and exams, further aiding comparability of performance outcomes. We supplement the administrative data with a survey of current undergraduate business students to assess their expectations, interests, and perceptions across different types of courses.
We document stark gender disparities in academic performance across different types of courses, even after we control for each student’s initial academic aptitude measures, current grade point average (GPA), family background, and a rich set of other demographics as captured by their college applications. While the grades of women are systematically lower than those of men in quantitative courses (e.g., 33.7% of a standard deviation lower in finance), they are systematically higher than those of men in others (e.g., 33.5% of a standard deviation higher in organizational behavior). We show a correlation between how quantitative a course is perceived to be by students and the gender performance difference (GPD) levels we document. Specifically, a more quantitatively perceived course is correlated with a positive GPD (men outperform women), and a less quantitatively perceived course is correlated with a negative GPD (women outperform men). We also show that women are less likely to take jobs in high-paying sectors (e.g., investment banking, technology), which leads to a gender wage gap at the time of graduation.
A priori, it is unclear why student gender would affect academic performance across quantitative and nonquantitative courses. Why should students of different genders, otherwise similar in their academic aptitudes, family background, and other demographics, perform differently in business school classes? In science, technology, engineering, and math (STEM) programs, women tend to perform worse than men (for a recent review, see Kahn and Ginther [2017]). While the commonly cited potential reason of low female representation does not apply in the case of business education, the education and psychology literature streams have discussed two other potential explanations that may apply. The first explanation is that men (women) innately prefer quantitative (nonquantitative) courses (innate preferences hypothesis). The second explanation is based on gender stereotypes and may be student-based or instructor-based. The student-based explanation is that gender stereotypes held by students can influence their performance in different courses (gender stereotype hypothesis—student-based). The instructor-based explanation is that instructors hold these gender stereotypes and subsequently influence student performance through instructor behaviors, and evaluations (gender stereotype hypothesis—instructor-based).
It is important to examine the respective congruency of these potential explanations with the grade disparities observed in the business education context. If innate preferences drive GPD in business education, then interventions are not likely to be effective and may not improve welfare. However, if GPD is shaped by gender stereotypes, then interventions that challenge those stereotypes might reduce GPD. One intervention that can challenge students’ gender stereotypes is an instructor who is a successful counterexample to the stereotype.
We provide evidence for the causal impact of instructor gender on GPD. Overall, we find that women’s grades in quantitative courses are 11% of a standard deviation lower than those of men. However, we document that when female students are taught by female instructors, their performance improves by 7.7% of a standard deviation in quantitative courses and that this effect is driven by female students with mid-to-high math aptitudes. In addition, we find that female instructors increase female students’ initial interest and performance expectations in quantitative courses and are viewed as role models by female students. This pattern of results provides support for a student-side process explanation of gender stereotypes and demonstrates that instructors can improve students’ academic success by providing a counterexample to the stereotype. Furthermore, we show that the data do not corroborate an explanation for GPD based on instructor bias, because the degree of subjectivity in grading does not change the impact of instructor gender on GPD and because students do not perceive a difference in the fairness of instructors based on their gender.
We find, however, that instructor gender does not affect male students’ performance, interest, or perception of instructors in any of the courses. The finding that instructor–student gender match in nonquantitative courses, in which male students have an observed handicap, does not improve male students’ performance suggests that male students’ underperformance may either be driven by innate preferences and not by gender stereotypes, and/or that having a male instructor in nonquantitative classes is not effective in challenging gender stereotypes regarding these courses.
Our article contributes to the long-standing marketing literature on consumer behavior research that has examined gender and gender-identity differences. This literature considers not only gender differences in responses to advertising and messaging (e.g., Dahl et al. 2009; Fisher and Dube 2005; Grohmann 2009; Meyers-Levy and Loken 2015) but also gender and identity differences in more global preferences (Gao, Mittal, and Zhang 2019; Govind, Garg, and Mittal 2020; Meyers-Levy and Loken 2015). Our research also contributes to the education literature, which we discuss in more detail in the next section. In that literature, our article is closest to Carrell et al. (2010), which documents a positive role of instructor gender in reducing GPD in STEM courses in the Air Force Academy. We contribute to the education literature in three main ways. First, we study GPD within a business school at a large public university, with immediate downstream implications for gender disparities that have garnered much attention in the corporate sector. Second, while most studies of the impact of instructor gender on student achievement have been correlational, as in Carrell et al. (2010), our identification strategy relies on exogenous student–instructor assignment to address identification challenges that arise from students’ self-selection into courses or instructors that may have confounded other work. We also improve on Carrell et al. (2010) by controlling for unequal instructor gender representation across courses, which may otherwise lead to an aggregation bias due to a potential correlation between instructor gender representation and GPD across courses. Such a correlation is plausible because academic success in undergraduate programs in a field is a strong predictor of pursuing graduate work in that field (Sax 2001). Third, we provide empirical evidence regarding the congruency of different potential mechanisms behind the observed GPD patterns. 
While these potential mechanisms have been discussed in the literature, causal evidence supportive of them has been scarce. We provide insight into these mechanisms both by examining the grade outcomes of students across instructors of different genders and by conducting a supplemental survey of the same student population.
Our results have several important implications for business education and the corporate careers for which it prepares students. Even though business schools themselves have achieved gender representation close to parity, this article shows that equal representation has not translated to equal academic achievement. Academic achievement gaps not only create inequity in education but also may have far-reaching consequences, because they may shape occupational choices (Ost 2010). We find that female students who have a mid-to-high aptitude for quantitative subjects benefit the most from role models. It is disconcerting that the professional world may be missing out on these women due to the misallocation of talent driven by stereotyped beliefs. Therefore, arresting these gaps in situations where they may arise (such as at business schools) is paramount to create more gender-healthy pipelines for senior management positions.
There is hope. Our results also provide an important avenue for reducing gaps in academic achievement disparities within business schools. We find that female students achieve higher grades in quantitative courses when they are taught by female instructors. As such, business school administrators can help align female students’ abilities more accurately with their educational choices by assigning more female instructors to teach quantitative courses.
If better alignment leads to a better allocation of students to careers, this would create a shift with several significant benefits. A reduction of GPD in quantitative courses may increase the representation of talented women in careers that quantitative courses prepare students for (e.g., finance, consulting, consumer technology), which are typically more lucrative. In addition, women’s success in quantitative fields can help recruiters hire and retain a more diverse workforce—a goal that many top-paying companies have underscored as being paramount. For both these reasons, business schools also have much to gain from reducing GPD.
Our results indicate that faculty teaching assignments and hiring practices in business schools have important downstream consequences for students and employers. The direct implications of female faculty representation in quantitative fields with regard to disparities in women’s academic achievements in business education also speak to the importance of reducing gender gaps in universities’ hiring practices. With regard to marketing faculty, as the discipline of marketing becomes more data-driven and analytical, our results suggest the need to hire more female faculty to bolster female students’ success in these fields.
In what follows, we provide a review of related literature, describe our data, and provide evidence for GPD in business education. We then discuss potential drivers of GPD and test for their implications in our data. We conclude by offering recommendations based on our findings.
Literature Review
Given the equity implications and downstream consequences, a large body of prior research has studied GPD in academic achievement. Prior GPD research has focused primarily on preuniversity education (elementary, middle, and high school) around the world. The focus on GPD at the university level is more recent, and only a few studies have been conducted. We first discuss these GPD studies and then discuss research on instructor gender effects on GPD. Importantly, GPD findings in the extant literature, as well as instructor effects on GPD, are mixed, highlighting the necessity and importance of examining these questions in a particular educational context.
Prior research at the elementary and middle school levels documents evidence, across countries, both for boys outperforming girls and for girls outperforming boys. For example, girls have been shown to do better in reading, writing, and math in elementary and middle school in England (Machin and McNally 2005). However, in other results, boys outperform girls in math; for example, Fryer and Levitt (2010) find no mean differences between boys and girls in terms of their math performance on entry to kindergarten in the United States but find that girls lose more than two-tenths of a standard deviation relative to boys over the first six years of school.
Findings for GPD among high school students are also inconsistent across countries. Using data on individual student performance across 41 countries, including the United States, Machin and Pekkarinen (2008) demonstrate that among 15-year-olds who take the same standardized tests, female students outperform male students in reading, but male students outperform female students in math. In the contexts of Korean and Chinese high schools, however, Lim and Meer (2017) and Xu and Li (2018) find that girls generally outperform boys, including in math.
Researchers studying GPD beyond high school have concentrated largely on STEM disciplines, where most courses are quantitative, such as introductory physics programs (Kost, Pollock, and Finkelstein 2009; Lorenzo, Crouch, and Mazur 2009; Miyake et al. 2010). Koester, Grom, and McKay (2016) use data from a large Midwestern university’s 116 introductory courses in STEM, social sciences, and humanities. They document a large GPD in STEM courses and in economics in favor of men. They do not find GPD favoring women over men in any courses, including social sciences and humanities. Carrell, Page, and West (2010) examine the context of the Air Force Academy, where women account for only 17% of the student body. They document that female students do worse than their male peers in STEM courses but do not find evidence of GPD in English and history courses.
An important question is whether GPD can be attenuated. Studies have discussed several possibilities (for a recent review, see Ceci et al. [2014]). One factor that has been considered is instructor gender. However, only a few studies have been able to address this question causally because of the difficulty of finding a context in which students do not self-select into courses and/or instructors.
Starting again with primary and secondary education, we find that the effect of instructor gender on GPD depends on the context studied. While girls outperform boys in general in Korean and Chinese high schools, having female instructors has a positive effect on female student academic performance (Gong, Lu, and Song 2018; Lim and Meer 2017; Xu and Li 2018). However, in U.S. primary schools, Antecol, Eren, and Ozbeklik (2014) document a negative effect of having a female instructor on female students in math classes. While these studies find a positive or negative gender-match effect for females, other research at the primary and secondary education levels finds no significant effect of a gender match on outcomes (e.g., Puhani 2018; Winters et al. 2013).
At the university level, results for gender-match effects have also been mixed. Hoffmann and Oreopoulos (2009) find that male students performed worse with a female instructor in the University of Toronto’s Arts and Science program; Carrell et al. (2010) find positive effects only for female instructors on female performance in the Air Force Academy; and Griffith (2014) finds gender-match effects for both male and female instructors at a small, selective liberal arts college in the northeastern United States.
To summarize, both GPD effects and gender-match effects in the extant research are mixed and vary depending on context. Business schools are unique in having both quantitative and nonquantitative courses, with neither course type dominating the curriculum (unlike STEM and liberal arts curricula) and a gender ratio close to equal (unlike the Air Force Academy). Furthermore, business schools attract people who want to prepare for the corporate world. All these factors can result in a very different GPD in business education. This research contributes to the GPD literature by documenting GPD and the impact of instructor gender in the context of business education at a top public university. An important contribution to the literature stems from the novel evidence we present for the mechanism behind GPD.
Data
We rely on four sources of data. We obtained the first data set from the business school library of a large public university in the U.S. Midwest; this data set contains undergraduate business administration (UBA) program bulletins listing core courses and their timing requirements. The second data set combines three administrative databases obtained from the same university: (anonymized) student grades in all enrolled classes, student background characteristics obtained from their university application, and instructor demographics. The third data set also comes from the business school library and includes most of the syllabi of the fixed core courses between fall 2006 and winter 2017. The fourth data set includes survey responses of 102 junior UBA students currently enrolled in the same business school; this survey provides us with student perceptions.
Using the administrative data sets, we follow the academic performance of 6,312 UBA students who represent the graduating classes of 2005–2018. Because the school admitted two cohorts as it transitioned from a two-year program to a three-year program in 2006, the data comprise 15 cohorts of students. The data span all introductory fixed core classes (discussed next) taken between fall 2003 and winter 2017. The survey then provides a lens to examine possible reasons for performance differences. We describe each data set in turn.
Program Bulletins: The Structure of the Core Curriculum
The UBA core (i.e., compulsory) curriculum consists of introductory courses in accounting, business law, business economics, business communications, finance, marketing, operations, organizational behavior, statistics, and strategy. The Business School Registrar publishes a bulletin specifying the timing of these core courses. Some of these courses are “floating core,” meaning that students can choose when to take these and thus can select their professor. In contrast, “fixed core” courses must be taken at a specific time in the program (e.g., fall semester of sophomore year). 1
Our analyses focus on students’ academic performance in the fixed-core curriculum of the UBA program because the fixed-core program is mandatory and structurally rules out student self-selection into courses and instructors. The exogenous assignment is ensured by the registrar’s office, which divides each cohort into five or six sections and randomizes students into these sections conditional on the female student proportion being the same across sections. Because instructors are assigned to sections and because students remain with their section mates in the fixed-core program, they are not able to choose the instructor teaching the class. We provide empirical evidence for the registrar’s success in ensuring random assignment in Appendix A.
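The assignment rule described above can be illustrated with a minimal Python sketch. This is not the registrar’s actual procedure, which we do not observe in code form; the function name `assign_sections`, the seed, and the toy cohort are hypothetical. The sketch shuffles students within each gender and deals them round-robin into sections, so that the female count differs by at most one student across sections.

```python
import random
from collections import Counter

def assign_sections(students, n_sections, seed=0):
    """Randomly assign students to sections, balancing gender.

    students: list of (student_id, is_female) tuples (hypothetical layout).
    Returns a dict mapping student_id -> section index.
    """
    rng = random.Random(seed)
    assignment = {}
    offset = 0  # continue the round-robin across strata to balance section sizes
    for flag in (True, False):
        group = [sid for sid, is_female in students if is_female == flag]
        rng.shuffle(group)  # random order within the gender stratum
        for i, sid in enumerate(group):
            assignment[sid] = (i + offset) % n_sections
        offset += len(group)
    return assignment

# Toy cohort: 100 students, 37 of them female (illustrative numbers only).
students = [(i, i < 37) for i in range(100)]
sections = assign_sections(students, n_sections=5)
female_counts = Counter(sections[sid] for sid, is_female in students if is_female)
section_sizes = Counter(sections.values())
```

Dealing round-robin within strata is one simple way to hold the female proportion constant across sections while leaving each student’s section otherwise random.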
Administrative Data
The administrative data combine three databases: students’ grades in university classes, students’ background characteristics obtained from their university application, and instructor demographics.
Student grades
As in other studies of GPD, we measure academic performance as students’ grades in each course. 2
The grades are determined on an A+, A, A−, B+, B,
Individual student background characteristics at the time of college application
Approximately 37% of the students are female, 65% are White, 25% are Asian, 3% are Black, and 3% are Hispanic. For 65% of the students, English is the primary language spoken at home. Students’ households vary in education and income levels. For example, while 10% of students are first-generation college students, 23% have at least one parent with a doctorate degree (in 17% of cases, parental education level is unreported and coded as such). Similarly, while 11% of students come from households with less than $50,000 income per year, 38% come from households with more than $150,000 in annual income (in 19% of cases, parental income is unreported and coded as such).
Table 1 breaks down the summary statistics of these characteristics as well as the GPAO variable by student gender and provides a test of differences across genders. We find significant differences across many demographic variables across female and male students. For example, male students are slightly more likely to be White and from a family with at least one parent who has a doctorate, whereas female students are more likely to be first-generation college students.
Table 1. Student Characteristics.
Notes: Gender was recorded in the administrative data as a binary variable.
The data also provide several measures of academic performance before students joined the university (e.g., SAT component scores, advanced placement [AP] subject test scores, high school GPA). The application process allows for SAT and/or ACT test scores. We have SAT exam scores for 53% of the UBA students and ACT exam scores for 71% of the students. In total, 24% of the students had taken both exams; less than .5% of students did not report either exam score.
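As a quick consistency check on these shares, using only the percentages reported above, the inclusion-exclusion rule gives the share of students with at least one test score:

$$P(\text{SAT} \cup \text{ACT}) = P(\text{SAT}) + P(\text{ACT}) - P(\text{SAT} \cap \text{ACT}) = 53\% + 71\% - 24\% = 100\%$$

This agrees, up to rounding, with fewer than .5% of students reporting neither score.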
Consistent with the literature on performance differences in high school, we note significant differences across male and female students in prior academic achievements by subject. On average, female students had a higher high school GPA and a higher language proficiency, as indicated by their ACT English and AP English literature scores. However, male students, on average, had a higher proficiency in more quantitative subjects, as indicated by their ACT math; ACT science; SAT math; and AP calculus, microeconomics, macroeconomics, and statistics test scores.
Instructor characteristics
For each class in which a student enrolls in the undergraduate business program, we have information on each instructor’s name, gender, ethnicity, type of appointment, and teaching experience. Table 2 summarizes these data by gender. Over our period of study, the average instructor taught for a little over five terms in the fixed-core program. Of the instructors teaching fixed-core classes, 34% are female.
Table 2. Faculty Characteristics.
Female instructors are marginally less likely to be White. We do not find other differences across instructor genders. However, representation of instructor demographics varies greatly across subjects, and a lack of correlation in the aggregate should not be taken as a lack of correlation within a subject area. An important feature of the data that may not be apparent from the aggregate statistics is the lopsided distribution of female instructors across subjects. Only 17% of students are taught business economics by female instructors. In contrast, 43%, 31%, 68%, and 85% of students are taught marketing, management and organizations, business law, and strategy, respectively, by female instructors. Our empirical analyses that investigate the impact of instructor gender also control for the unequal chances of students being taught by a female professor across different courses, because the discrepancies across courses in female instructor representation may be correlated with disparities in GPD, leading to an aggregation bias in the estimates of interest (impact of instructor gender on student performance). We may expect such a correlation because academic success in undergraduate programs in a field is a strong predictor of pursuing graduate work in that field (Sax 2001).
Syllabi Data
We obtained the syllabi for 278 course–instructor combinations out of the 308 in our sample from fall 2006 to winter 2017. 3 We coded each syllabus to indicate whether, and by what percentage, each of the following components contributed to a student’s grade in the course: class participation; individual in-class assignments, exams, and quizzes; individual take-home assignments, exams, and quizzes; and group take-home assignments and projects. The syllabus data confirm that all classes offered in the same term (by different professors) have the same graded components, the same distribution of points across those components, and the same grading rules, due to the coordinated nature of the UBA core program.
Table 3 reports the average percentage of the grade each of these components accounts for in the UBA core across the 278 syllabi. As we expected, the largest part of the grade is determined by individually completed in-class exams, such as midterms (28%) and finals (31%). These components are followed by group term projects (15%), take-home exams (10%), and class participation (10%). While most variation in the weight of these components is across courses, there is also within-course variation over terms. We use the syllabi data to test whether our results regarding the impact of instructor gender vary by the weight of different grading components in a course.
Table 3. Syllabi, Graded Components.
Survey of UBA Students
We conducted a survey with 102 junior UBA students. These students were in their last semester of taking introductory core classes and were enrolled in a mandatory core class that required them to complete research studies for course credit. First, we asked them to indicate the core classes they took or were currently taking and their career interests. Then, we asked the students to select the professor they had for each core class from a drop-down menu. On the next screen, we asked them to think back on each of the core classes and rate their excitement/interest, their initial probability of getting an A in the course (to measure initial performance expectations), and their effort in each course. All three items were assessed on a 0–100 scale, with larger numbers indicating greater interest, a higher probability of getting an A, and more effort. On the following screen, for each course, we asked the students to rate how they felt they were treated by the professor and to what extent the professor was a source of inspiration or a role model (1–7 Likert scale, with larger numbers corresponding to more positive sentiments). These questions were followed by an incentive-compatible elicitation of their beliefs about the average grade difference between male and female students (GPD beliefs). We chose to do this for 8 of the 11 core classes due to time constraints of the survey. The response range was set between −1.1 and 1.1, with 0 indicating no performance difference. Positive (negative) values indicate that the respondents believed men (women) perform better than women (men) in a given course. For example, .5 meant that the respondent thought that the average grade of male students is .5 grade points higher than that of female students in that course. Finally, we asked the students to rate their perception of the quantitativeness of each course (on a 0–100 scale, where 0 is not quantitative at all and 100 is extremely quantitative).
At the end of the survey, the respondents indicated their ethnicity and gender. The program randomly selected one of the eight GPD belief elicitation questions, and respondents received $5 for guessing within .005, $2 for guessing within .1, and $1 for guessing within .2 of the performance differences in grades in the core classes offered in the 2013–2014 academic year. Further details about the survey appear in the Web Appendix.
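The payment rule for the belief elicitation can be sketched in Python as follows. The dollar amounts and error bands are those stated above; the function name `payout` and the treatment of the band edges as inclusive are assumptions made for illustration, because the text does not specify how ties at a boundary were handled.

```python
def payout(guess, actual):
    """Return the dollar reward for a GPD belief, given the realized GPD.

    Bands follow the survey description; inclusive edges are an assumption.
    """
    error = abs(guess - actual)
    if error <= 0.005:
        return 5
    if error <= 0.1:
        return 2
    if error <= 0.2:
        return 1
    return 0

# Illustrative guesses against a hypothetical realized difference of .30.
rewards = [payout(g, 0.30) for g in (0.30, 0.35, 0.45, 0.90)]
```

Because the reward is largest when the guess is closest to the realized difference, respondents have an incentive to report their best estimate rather than a socially desirable answer.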
Table 4 reports the average responses. The differences across courses in terms of student interest, perceived initial probability of getting an A, and student effort are not large. Students felt that they were mostly treated fairly and had overall positive sentiments toward the professors in the core courses. However, the average responses reveal larger differences across core courses in terms of student perceptions of course quantitativeness. Students reported the following seven courses to be mostly quantitative in nature: finance (82.79), statistics (81.97), operations (77.88), accounting (level 1: 75.51 and level 2: 73.34), business economics (64.96), and business information systems (53.64). They rated the following four courses to be mostly nonquantitative in nature: strategy (34.98), business law (26.12), marketing (24.35), and organizational behavior (22.72). Therefore, in some of our analyses, we will be referring to these groups of courses as quantitative (or perceived to be quantitative) and nonquantitative (or perceived to be nonquantitative) courses, respectively.
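The grouping above can be reproduced from the reported mean ratings with a short Python sketch. The ratings are copied from the text; the cutoff at the scale midpoint of 50 is our assumption for illustration, as the grouping is described only as “mostly quantitative” versus “mostly nonquantitative.”

```python
# Mean perceived quantitativeness (0-100 scale), as reported in the text.
ratings = {
    "finance": 82.79,
    "statistics": 81.97,
    "operations": 77.88,
    "accounting (level 1)": 75.51,
    "accounting (level 2)": 73.34,
    "business economics": 64.96,
    "business information systems": 53.64,
    "strategy": 34.98,
    "business law": 26.12,
    "marketing": 24.35,
    "organizational behavior": 22.72,
}

THRESHOLD = 50  # assumption: scale midpoint, not a cutoff stated in the text

quantitative = sorted(c for c, r in ratings.items() if r > THRESHOLD)
nonquantitative = sorted(c for c, r in ratings.items() if r <= THRESHOLD)
```

With this cutoff, the split matches the seven quantitative and four nonquantitative courses listed above.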
Table 4. Survey Average Responses.
We also observe that students’ GPD beliefs are positive for quantitative courses and negative for nonquantitative courses, suggesting that students expect men (women) to perform better in quantitative (nonquantitative) courses. We examine these patterns and respondents’ reasons for their assumptions in our discussion of the actual GPD estimates obtained from the administrative data in the next section.
Documenting GPDs
In this section, we document systematic grade disparities across female and male students in the introductory courses in the fixed-core business curriculum. To quantify the GPDs after controlling for differences in other demographics and academic backgrounds, we estimate the following:
where Gradescpt is the standardized grade of student s in course c with instructor p in semester-year t, and 1(gs = F) is an indicator for whether the gender (gs) of student s is female. The coefficient of interest (ψc) captures the difference in mean performance between male and female students in course c after we control for their other demographics and their academic backgrounds. We refer to ψc estimates as the GPD. Recall that standardized grades have a mean of 0 and a variance of 1 within each course, semester, and year for ease of comparability. Therefore, the ψ estimates reflect the GPD in terms of a percentage of a standard deviation. We define GPD as negative when male students perform better than female students and positive when female students perform better than male students.
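The standardization step can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the record layout, the function name `standardize_grades`, and the use of the population standard deviation are all hypothetical choices.

```python
from collections import defaultdict
from statistics import mean, pstdev


def standardize_grades(records):
    """Z-score raw grades within each course-term cell.

    records: iterable of (student_id, course_term, raw_grade) tuples.
    Returns a dict mapping (student_id, course_term) to the standardized grade.
    Using the population standard deviation is an assumption.
    """
    by_cell = defaultdict(list)
    for _, cell, grade in records:
        by_cell[cell].append(grade)
    stats = {cell: (mean(g), pstdev(g)) for cell, g in by_cell.items()}

    standardized = {}
    for student, cell, grade in records:
        mu, sd = stats[cell]
        # A cell with no grade variation carries no ranking information.
        standardized[(student, cell)] = 0.0 if sd == 0 else (grade - mu) / sd
    return standardized
```

After this transformation, a coefficient of −.11 on the female indicator reads directly as 11% of a standard deviation, whatever the grading scale of the course.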
The vector Xst includes all student characteristics noted in Table 1. These include demographics such as ethnicity, whether English is the student’s native language, household income, maximum parental education level, previous academic aptitude variables (e.g., high school GPA; a high school calculus indicator; their ACT, SAT, and AP test scores), and indicators for the availability of these variables. 4 Furthermore, we allow SAT and ACT scores to have different coefficients for the group of students who took both exams, so that the incremental impact of the SAT score for a student who also took the ACT is captured separately from the impact of the SAT score for a student who took only the SAT. As a control for general academic performance as a university student, we also control for the student’s cumulative GPA at the university by term t and excluding course c, which varies by term, necessitating the term (t) subscript in Xst. These controls enable us to compare the academic performance of students in the fixed-core curriculum of the undergraduate business program who had similar demographics and academic aptitudes but differ in gender. We recognize that Xst may predict academic success in each subject k differently. For example, a student’s SAT math score may be a better predictor of his or her grade in the finance core course, while the SAT verbal score may be a better predictor of the student’s grade in the business law core course. Coefficient δk(c) allows the impact of Xst to vary by the subject k of course c. The specification also includes course-term fixed effects, ηct. Robust standard errors are clustered at the course–instructor level.
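For reference, the specification described above can be written out as follows. This is a reconstruction from the variable definitions in the surrounding text, not a reproduction of the published equation; the error term ε and the additive form are assumptions.

```latex
% Reconstructed from the text's definitions (presumably Equation 1).
% \varepsilon_{scpt} is an assumed error term not named in the text.
\begin{equation}
\mathrm{Grade}_{scpt}
  = \psi_{c}\,\mathbf{1}(g_s = F)
  + \delta_{k(c)} X_{st}
  + \eta_{ct}
  + \varepsilon_{scpt}
\end{equation}
```

Under this form, ψ_c is the gap between female and male students in course c after adjusting for X_st, which matches the stated convention that GPD is negative when male students perform better.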
Table 5 presents the results. The estimated GPD coefficient is the most negative (female students lagging relative to comparable male students) for finance and is most positive (male students lagging relative to comparable female students) for management and organizations. The magnitudes of GPD are substantial for these courses; women’s (men’s) grades in finance (management and organizations) are 34% of a standard deviation lower than the grades of men (women) with comparable academic and demographic backgrounds. Finance is followed closely by business economics (ψ = −.23), accounting (ψ = −.14 and −.07), and statistics (ψ = −.09) in having negative GPD coefficients. Management and organizations is followed closely by marketing (ψ = .21) and business law (ψ = .18) in having positive GPD coefficients. In the introductory core courses in operations, business information systems, and strategy, we do not find a significant GPD.
Table 5. Gender Performance Gap (Male Minus Female) in Fixed-Core Classes.
*p < .10.
**p < .05.
***p < .01.
Notes: The dependent variable is the individual student’s normalized grade. Control variables are course by semester fixed effects and student-term control variables Xst interacted with each subject. Robust standard errors are clustered at the course-instructor level. p-values in parentheses.
The estimated GPD coefficients track the average student beliefs about GPD elicited by the survey. A formal rank test confirms that the ranking of courses implied by students’ beliefs is congruent with the ranking based on our estimates (Spearman’s rho = −.78, p = .022). The correlation is negative because the two measures use opposite sign conventions: belief values are positive when men are expected to outperform women, whereas ψ is negative when men outperform women. In addition, GPD in a course is related to its quantitativeness. We find a significant rank correlation between the mean perceived quantitativeness of a course from the survey results and the estimated GPD coefficient in these regressions (Spearman’s rho = −.81, p = .003).
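A rank correlation of this kind can be computed with a small pure-Python sketch. The helper names are hypothetical, and the example uses only the five courses for which both a perceived quantitativeness mean and a single ψ estimate are quoted in the text, so it illustrates the technique and is not expected to reproduce the reported −.81.

```python
def average_ranks(values):
    """Rank values from smallest (1) to largest, averaging ranks over ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Order: finance, statistics, business economics, business law, marketing.
quantitativeness = [82.79, 81.97, 64.96, 26.12, 24.35]
gpd_estimates = [-0.34, -0.09, -0.23, 0.18, 0.21]
rho = spearman_rho(quantitativeness, gpd_estimates)
```

For this five-course subset the correlation is strongly negative (about −.9): the more quantitative a course is perceived to be, the more negative its estimated GPD.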
To summarize, and for ease of communication, we repeat the analysis with the binary grouping of courses based on the perceived quantitativeness results from our survey. Recall that according to the survey results, the finance, statistics, operations, accounting, business economics, and business information systems courses are categorized as quantitative, while the marketing, business law, strategy, and management and organizations courses are categorized as nonquantitative. Keeping the control variables unchanged, we estimate ψ for quantitative and nonquantitative courses with the following regression:
where subscript q(c) indicates which quantitative/nonquantitative binary classification course c belongs to. The rest of the specification remains the same.
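Written out from the description above, the pooled specification plausibly takes the following form. It is a reconstruction, not the published equation: it reuses the course-level terms defined earlier in this section and replaces the course-specific ψ with one coefficient per course type; the error term ε is an assumed symbol.

```latex
% Reconstructed from the text's definitions (presumably Equation 2).
% q(c) = quantitative or nonquantitative classification of course c.
\begin{equation}
\mathrm{Grade}_{scpt}
  = \psi_{q(c)}\,\mathbf{1}(g_s = F)
  + \delta_{k(c)} X_{st}
  + \eta_{ct}
  + \varepsilon_{scpt}
\end{equation}
```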
Column 2 of Table 5 presents the results. In quantitative courses, female students tend to lag behind male students. On average, they score 11% of a standard deviation lower than male students who are academically and demographically similar. In contrast, in nonquantitative courses, male students tend to lag behind female students. In these courses, on average, female students earn higher grades than comparable male students by 23% of a standard deviation.
The magnitude of the discrepancies is substantial. For comparison, prior studies focusing on STEM coursework at the college level report average GPDs of −15% of a standard deviation in STEM course grades at the Air Force Academy (Carrell et al. 2010), and −10% difference in absolute letter grades for STEM courses (Koester, Grom, and McKay 2016). In most STEM programs, women are in the minority. Given that business schools have paid a lot of attention to equity and representation in their programs, the differences are all the more interesting and raise the question of what may be driving them.
Potential Drivers of GPDs
Why are GPDs occurring? Can educational institutions attenuate them? In this section, we explore three hypotheses for GPD. The first two hinge on gender stereotypes, from which GPD may arise in two ways: (1) gender stereotypes held by students (we call this “stereotype bias”) or (2) gender stereotypes held by instructors (we call this “instructor bias”). There is much literature to support these two stereotype hypotheses. The third hypothesis we propose is related to innate differences in interest across genders; this hypothesis is more exploratory in nature.
Stereotype Bias (Student-Based)
Quantitative courses tend to rely more on math skills and are typically considered a male-stereotyped domain, whereas nonquantitative subjects tend to rely more on communication skills and are typically considered a female-stereotyped domain (Fennema and Sherman 1977; Hyde et al. 1990). Stereotypical beliefs can hamper academic performance through “stereotype threat,” or the idea that a person’s actual performance may suffer when a negative performance stereotype connected to his or her identity is evoked (Steele 1997; Steele and Aronson 1995). Performance deterioration after evocation of a negative stereotype has been demonstrated in the context of math test performance and gender identity (e.g., Cadinu et al. 2005; Spencer, Steele, and Quinn 1999). For example, when the Asian identity of Asian women was evoked before a math test, their performance was better (compared with the control); however, when their female identity was evoked, they performed worse (Shih, Pittinsky, and Ambady 1999).
Stereotype threat may hamper performance through competency and self-efficacy beliefs (e.g., Bordalo et al. 2019; Bouchey and Harter 2005), thus affecting motivation, the selection of activities, and focus (Bandura 1977; Bussey and Bandura 1999). In the university context, such beliefs can influence students’ beliefs about the career they are best suited for and their likelihood of success in that field, thus affecting their interest and motivation to do well in courses related to that career path. In the context of business school, female (male) students may be less (more) interested in and less (more) motivated to perform well in the finance core course than in the marketing core course due to their beliefs about their eventual success in that field. Given that students juggle several courses and activities in a single term, male students may underperform in the marketing core class, and female students may underperform in the finance core class relative to what their academic aptitude would predict because of their differences in motivation and interest.
It has been proposed that salient examples contradicting educational stereotypes can be powerful in changing gendered beliefs (Solanki and Xu 2018). These examples can help nonstereotypical students shape and maintain an identity related to that field (Gilmartin et al. 2007; Oyserman, Fryberg, and Yoder 2007), become confident that a future in that field is attainable, and increase their interest in the field. In particular, having a female instructor in quantitative courses in which GPD is negative can change the stereotype by providing a powerful counterexample (Spencer, Steele, and Quinn 1999). This argument can also hold for male instructors in nonquantitative courses in which GPD is positive.
In the next section, we test instructor gender as an intervention that challenges gendered beliefs. If stereotypes are the reason that students have diverging interests and expectations about success, and if instructor gender influences those stereotypes, grades and the interests of female (male) students should increase in quantitative (nonquantitative) courses when female (male) instructors teach them. We expect a stronger effect of instructor gender on GPD in quantitative courses compared with the effect of instructor gender on GPD in nonquantitative courses because female instructors in quantitative courses are rarer and thus more likely to change gendered beliefs. Therefore, we expect the interest and grades of female students to increase in quantitative courses when female versus male instructors teach those courses. If instructor gender indeed changes stereotype beliefs by providing a salient counterexample to the stereotype, we also expect female students to be more likely to rate female instructors as inspirational or good role models compared with male students in quantitative courses.
Instructor Bias
Instructors of different genders can also hold different beliefs about male and female ability, and these gendered beliefs may drive instructor behaviors, expectations, and evaluations, thus indirectly affecting students’ academic success (Lavy and Sand 2015; Leinhardt, Seewald, and Engel 1979). Research has shown that male and female instructors may differ in their perception and treatment of male versus female students (Krieg 2005; Rodriguez 2002; Stake and Katz 1982).
To elaborate in the context of our study, if instructors hold beliefs that female students will perform worse than male students in quantitative courses and that male students will perform worse than female students in nonquantitative courses, the instructor’s behavior may help realize these beliefs. For example, the instructor may give different levels of homework to male versus female students, thus facilitating their subject proficiency differently. The instructor may also grade male students and female students differently in subjective grade components or call more on male students than on female students in class. While all students in fixed-core classes get the same assignments due to the coordinated nature of the program, class participation and other subjectively graded components may influence grades in business school classes.
In the next section, we provide two empirical tests for the possibility of instructor bias. First, using the survey data, we test whether the way students feel treated by their professor varies by the student’s and the professor’s gender in quantitative and nonquantitative courses. Second, using data from course syllabi, we test whether the impact of the interaction between instructor and student gender on the student’s grade varies with the importance of subjective performance evaluations and class participation.
Innate Preference Differences Across Genders
One may also consider innate preference differences between men and women that can drive interest and motivation in different academic subjects and subsequently affect academic performance. A stream of literature in psychology claims that women are more people-oriented and men are more thing-oriented and that this dichotomy explains both college majors and vocational preferences (e.g., Lippa 1998, 2010; Su, Rounds, and Armstrong 2009). Zafar (2013) suggests that men care more about money than women and thus pursue more lucrative careers. If there are innate preference differences across genders, GPD may simply be a result of students optimally allocating time and effort into courses based on their interests.
If innate student preferences alone can explain GPD, GPD may not be a substantial problem. After all, if male students inherently prefer quantitative subjects more than female students do, and if female students inherently enjoy and prefer nonquantitative courses and/or careers, what does it matter if these preferences are reflected in the associated GPD? If this is the case, it would be an open question as to whether interventions are needed to change these preferences.
Note that if innate preferences drive GPD, there is no reason to believe that the instructor’s gender would change these innate preferences for some courses and not for other courses. More specifically, if innate preferences drive GPD, there is no reason to believe that female instructors would increase the interest of female (but not male) students in quantitative (and not in nonquantitative) courses.
Evidence for Drivers of GPDs
We use survey, administrative, and syllabi data to assess which of the hypotheses are most likely to be the primary drivers of the GPDs seen in the data.
Student Effort, Performance Expectations, Interest, and GPD Expectations Across Courses
Recall that we asked students in the survey to think back to the beginning of each of their core courses and evaluate how likely they thought it would be for them to get an A in that course, how interested/excited they were about the course, and how much effort they put into the course. We also asked them about their GPD expectations, which we report in Table 4.
Effort
We do not find differences in how women and men expected to allocate effort across types of courses. However, women reported more effort overall (men = 64.01, women = 75.33; p < .01).
Performance expectations
When assessing their own performance capacity, women reported higher expectations of getting an A in nonquantitative courses than in quantitative courses (71.72 vs. 58.69, p < .01), whereas men were equally confident across course types (64.04 vs. 64.28, p > .8). These differences support the conjecture that female students’ expectations about their performance competency vary across stereotypically male and female subjects.
Interest
Women reported being more interested in nonquantitative courses than quantitative courses (64.18 vs. 54.83, p < .01); in contrast, men reported being more interested in quantitative courses than nonquantitative courses (56.23 vs. 52.22, p = .042). Interest differences can be driven by gender stereotypes or by innate preferences. Recall that the survey asked about students’ career interests. We find that 64% of male students versus 48% of female students in the survey indicated an interest in pursuing a career in finance or consulting, the two highest-paying jobs after a UBA program. The administrative data also corroborate gendered differences in career paths: the percentage of female students among those who took three or more electives in a subject is 17% in finance, 34% in accounting, and 60% in marketing (the top three subjects of interest in electives). Taken together, these differences lend credence to the idea that student interests vary by gender. However, it is unclear whether the interest differences are driven by stereotypes or innate preferences.
GPD expectations
Recall that the students’ GPD expectations in the survey were in line with our findings for the actual GPD at the business school we studied. The survey also asked students to explain the reasons behind their GPD guesses. Interestingly, students’ lay theories correspond to theories put forth in the literature. Approximately 36% of the students made statements in line with gender stereotypes: “I also think girls are more creative and better at soft skills so those classes favor them,” and “I think that gender stereotypes in regard to numbers and the finance field is what made me choose certain answers. I think it varies across classes for that exact reason. I feel that because of these stereotypes and them being so prevalently spoken about at the business school that girls would probably do better in marketing and boys in statistics and finance.”
In addition, 38% of students mentioned gendered differences in course interest driven by differences in career interests. As one student stated, “The variation across classes could be because of the differences in career interests between males and females and the effort they put in [courses] because of these interests.” As another said, “I think men tend to concentrate more in areas such as finance and consulting, and put more effort into classes related to those fields.” A total of 13% of students mentioned lack of representation (e.g., few women in finance, few men in human resources) as a reason they expected women (men) to outperform men (women) in nonquantitative (quantitative) courses, for example, citing “male to female professor ratio” or “finance is a male-dominated field” as support for expecting GPD in certain courses. Four students reported the importance of feeling connected to the professor and the professor having an impact on female students’ performance and participation in the course. For example, one student stated, “I have noticed that some of my female classmates are more willing to participate/engage with the material depending on the type of professor. Therefore, in the courses that I believe females do better than males, it is because of the professors’ ability to make everyone more engaged and feel comfortable.” Another student referred to the gender of the professor as an important factor in explaining her GPD guesses: “Female professors can connect to the females in the class better, leading them to do better.”
Testing for Stereotype Bias (Student-Based)
Given the possibility that gendered stereotypes may be contributing to the GPDs we find, we test instructor gender as an intervention that challenges gendered beliefs. Recall that we expect our test to be particularly strong for female instructors in quantitative courses. We argue that the gender of the instructor in quantitative courses is likely to be a stronger manipulator of gender stereotypes because female instructors in quantitative courses are rarer.
We present three pieces of evidence in support of stereotype bias (student-based) driving GPD. First, using the administrative data, we provide evidence for the impact of instructor gender on GPD. Second, using the survey data, we provide evidence for the impact of instructor gender on student interest. Third, using the survey data, we provide evidence for the impact of instructor gender on the extent to which instructors are viewed as a role model.
Impact of instructor gender on GPD
In documenting the impact of instructor gender on GPD, we simultaneously address identification challenges that arise from students’ self-selection into courses or instructors and from the aggregation bias stemming from unequal instructor gender representation across courses that may have confounded some prior work in this area (e.g., Carrell et al. 2010; Griffith 2014). To estimate the causal impact of instructor gender on GPD, we rely on the random instructor–student assignment and estimate the following regression:
where 1(gs = F) is an indicator that the student is female and 1(gp = F) is an indicator that the instructor p is female. The coefficient βq(c) captures the average effect of having a female instructor teaching quantitative or nonquantitative courses compared with having a male instructor. We do not have a priori expectations of differences in the teaching effectiveness of male and female instructors. To ensure that the differences we document across instructor genders are not explained by differences in instructor descriptors (e.g., race, experience), we allow for student-gender-specific responsiveness to instructor–term-specific control variables (Xpt) by permitting the slope on Xpt to differ by student gender.
Note that Equation 3 is a difference-in-differences specification where the student gender main effect is subsumed in the vector of intercepts for each term–course–student gender combination (ηctgs). We let the student-gender intercept vary by course so that gender differences in academic performance can differ across courses. It is important to account for this variation. Johnson (2014) discusses the potential for an aggregation bias arising from unequal exposure of students to different instructor genders across courses/contexts. Across different fields in the business school, we may expect an unequal allocation of instructor gender across courses that is correlated with GPD in the undergraduate program, because undergraduate success in a field paves the way for a PhD and then an academic job in that field. To make inferences that are not confounded by such a correlation, it is necessary to control for the average GPD by student gender that would be observed regardless of the gender of the instructor teaching the course. In this manner, our empirical specification improves on the related prior literature (e.g., Carrell et al. 2010; Griffith 2014).
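Based on the definitions given for this specification, Equation 3 plausibly takes the form sketched below. This is a reconstruction and not the published equation: the coefficient symbol θ on the instructor controls and the error term ε are assumed names, since the text does not state them.

```latex
% Reconstructed from the text's definitions (Equation 3).
% \theta_{g_s} (student-gender-specific slope on X_{pt}) and
% \varepsilon_{scpt} are assumed symbols not named in the text.
\begin{equation}
\mathrm{Grade}_{scpt}
  = \beta_{q(c)}\,\mathbf{1}(g_p = F)
  + \gamma_{q(c)}\,\mathbf{1}(g_s = F)\,\mathbf{1}(g_p = F)
  + \delta_{k(c)} X_{st}
  + \theta_{g_s} X_{pt}
  + \eta_{c t g_s}
  + \varepsilon_{scpt}
\end{equation}
```

There is no separate term for student gender because the intercepts η already vary by student gender within each course and term.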
Given this specification, the coefficients of interest, γq(c), reflect the difference in grades female students achieve compared with male students in the same type of course when they are taught by a female instructor compared with their performance (relative to male students’) when they are taught by a male instructor. If female students’ relative academic performance in quantitative courses is better with female instructors than it is with male instructors, we expect γ to be positive, closing the negative gap we have documented between female and male student performance in quantitative courses. Because male students lag behind female students and the GPD is positive in nonquantitative courses, a positive γ in nonquantitative courses would suggest that the gap between female and male students when nonquantitative courses are taught by female (male) rather than male (female) instructors is larger (smaller). Therefore, a positive γ in either type of course would suggest that the gender performance gap decreases in absolute value when the instructor gender matches the gender of the students lagging behind on average (and therefore goes against the gender stereotype).
The estimates of interest appear in the first column of Table 6. We find that in quantitative courses, female instructors have no effect on the grades of male students (β = −.011, p = .559), but they have a differential positive impact on the performance of female students (γ = .077, p = .002). Focusing on the estimated coefficient on the female student × female instructor interaction, we observe that the estimate is of substantial magnitude (7.7% of a standard deviation). Recall that in Table 5, GPD was −11% for quantitative courses. Therefore, having a female instructor teach quantitative courses substantially reduces GPD, closing a majority of the original gap. In addition, this finding is entirely driven by female students doing significantly better when taught by a female instructor and not by a decline in male students’ academic achievement, as the estimate of the βq(c) = 1 coefficient (female instructors’ main effect on quantitative course grades) is small in magnitude and statistically insignificant. Results at the course level are presented in Appendix B and are in line with the results in Table 6.
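The claim that a female instructor closes a majority of the gap follows from simple arithmetic on the two reported coefficients. The sketch below is a back-of-the-envelope comparison only: the two estimates come from different regressions, and the function name is hypothetical.

```python
GPD_QUANT = -0.11     # Table 5, Column 2: pooled GPD in quantitative courses
GAMMA_QUANT = 0.077   # Table 6, Column 1: female student x female instructor


def share_of_gap_closed(gap: float, interaction: float) -> float:
    """Fraction of the absolute gap offset by the interaction coefficient."""
    return interaction / abs(gap)


share = share_of_gap_closed(GPD_QUANT, GAMMA_QUANT)
residual_gap = GPD_QUANT + GAMMA_QUANT  # rough gap remaining with a female instructor
```

With these inputs the share is about 70%, leaving a residual gap of roughly 3.3% of a standard deviation.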
Table 6. Impact of Instructor Gender on Grades, Interest, Performance and Effort.
*p < .10.
**p < .05.
***p < .01.
Notes: Column headings correspond to the dependent variable evaluated by the regression. Control variables in Column 1: Course by semester by student-gender fixed effects, student-term control variables Xst interacted with each subject, and instructor-term control variables Xpt interacted by student gender. Control variables in Columns 2–4: Course by student-gender fixed effects. Robust standard errors are clustered at the course-instructor level. p-values are in parentheses.
In nonquantitative courses, we do not find any impact of instructor gender, regardless of the gender of the students taking these courses. This finding suggests that male students do not benefit from having a gender-matched instructor in courses in which they lag. This null result suggests that male students’ underperformance in nonquantitative classes may not be primarily driven by stereotypical beliefs and/or that having a male instructor in nonquantitative classes is not effective in challenging these stereotypes.
How the instructor gender effect on GPD varies by student’s math aptitude
We explore whether the instructor gender impact on GPD varies by the initial math skills of students. We run a regression that extends Specification 3 by interacting student math skill group and course type with student gender, instructor gender, and the interaction of student and instructor gender. 5
Table 7 reports the main coefficients of interest by course type and student type. We find that female instructors have a positive impact on female students in quantitative courses, but this effect is mostly driven by the grade lift of students with math skills that are in the middle of the distribution, followed by the grade lift of students with top math skills. 6 As we expected, there is no heterogeneity of instructor gender impact based on initial math skills in the GPD in nonquantitative courses.
Table 7. Impact of Instructor Gender, by Students’ Initial Math Aptitude.
*p < .10.
**p < .05.
***p < .01.
Notes: The regression includes interactions of student’s initial aptitude level and course type with student gender, instructor gender, and the interaction of student and instructor gender. The table organizes coefficients from this regression across columns by student’s initial aptitude level. The regression also includes the following control variables: course by term by student-gender fixed effects, student-term control variables Xst interacted with each subject, and instructor-term control variables Xpt interacted by student gender. Robust standard errors are clustered at the course-instructor level. p-values are in parentheses.
Taken together, and interpreted through the lens of the additional process evidence we present, these results suggest that female students who have middle-to-high aptitude for quantitative subjects are the most disadvantaged by gender stereotypes and benefit the most from role models. Importantly, it is these women that the professional world is missing out on due to the misallocation of talent driven by stereotyped beliefs.
Impact of instructor gender on student interest
Turning to the survey data, we examine the impact of instructor gender on student interest and performance expectations in fixed-core courses. In particular, we estimate the following:
where 1(gs = F) is an indicator that the student is female and 1(gp = F) is an indicator that the instructor p is female. Note that because we survey one cohort of students who take fixed-core classes at the same time, there is no variation at the term t level that we need to account for. Again, the student gender main effect is subsumed in the course-by-student-gender fixed effects.
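From the description above and the control variables listed for Columns 2–4 of Table 6, the survey specification plausibly takes the following form. This is a reconstruction under assumptions: Y stands for the survey outcome (interest, performance expectations, or effort), and the symbols Y, η with course and student-gender subscripts, and ε are assumed names.

```latex
% Reconstructed from the text's definitions (presumably Equation 4).
% Y_{scp}: survey response of student s for course c with instructor p.
% \eta_{c g_s}: course-by-student-gender fixed effects (assumed symbol).
\begin{equation}
Y_{scp}
  = \beta_{q(c)}\,\mathbf{1}(g_p = F)
  + \gamma_{q(c)}\,\mathbf{1}(g_s = F)\,\mathbf{1}(g_p = F)
  + \eta_{c g_s}
  + \varepsilon_{scp}
\end{equation}
```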
First, we study the impact of instructor gender on student interest. The coefficients appear in the second column of Table 6. Female students’ interest in a quantitative course relative to men’s is higher (γ = 24.72, p < .001) when the instructor is female, but instructor gender does not substantially influence the gendered interest difference in nonquantitative courses (γ = −7.48, p = .077). The instructor’s gender does not influence men’s interest in quantitative (β = 4.50, p = .138) or nonquantitative (β = −1.27, p = .679) courses. These results are in line with the impact of instructor gender on GPD.
The third column of Table 6 reports results from the same specification when the dependent variable is students’ initial expectations about performance. We find marginal evidence that female students’ performance expectations relative to men’s in quantitative courses are higher when the instructor is female rather than male (γ = 11.64, p = .031). Instructor gender does not influence men’s performance expectations in quantitative (β = 1.62, p = .498) or nonquantitative (β = .76, p = .771) courses. These results are again in line with the impact of instructor gender on GPD. We do not find any significant effect of instructor gender on student effort for any group of students or courses (Column 4, Table 6, all ps > .49).
In summary, we find that having a female professor challenges gendered beliefs and increases female students’ interest (and to some extent performance expectations) in quantitative courses, in which gendered beliefs hamper female student performance. Taken together with the results pertaining to GPD, these results seem to suggest that the impact of female instructors on GPD in quantitative courses may operate by changing student beliefs and interests. Echoing the null effect of instructor gender on GPD in nonquantitative courses, we also find a null effect of instructor gender on student interest and beliefs in courses in which male students lag behind. If interest is an antecedent to performance, this result would suggest that instructor gender may fail to affect GPD in nonquantitative courses, because having a male instructor teach these courses does not increase male students’ interest in them.
Evaluations of the instructor as a role model
Recall that the survey asked students to rate the extent to which they considered their instructor a role model or felt inspired by the instructor on a 1–7 Likert scale. We use the same specification as in Equation 3 to explore the impact of instructor gender on these evaluations, divided by student gender and course type.
We report the estimates in Table 8. We find that female instructors teaching quantitative courses are viewed as marginally more inspirational than male instructors teaching these courses (β = .533, p = .060), but the positive inspirational/role-model effect is much larger for female students (γ = .911, p = .018). Consistent with other empirical patterns, instructor gender does not affect either student gender group’s perceptions in nonquantitative courses. These results provide direct evidence for the conjecture that instructors can be powerful role models by providing successful counterexamples to stereotypes. Taken together with the evidence we present regarding the positive impact of female instructors in closing the GPD in quantitative courses, this finding lends credence to the idea that role models can create meaningful performance changes by challenging stereotypical beliefs.
Table 8. Impact of Instructor Gender on Perceptions of Role Models and Treatment.
*p < .10.
**p < .05.
***p < .01.
Notes: Column headings correspond to the dependent variable evaluated by the regression. The regression includes the following control variables: course by term by student-gender fixed effects, student-term control variables Xst interacted with each subject and instructor-term control variables Xpt interacted by student gender. Robust standard errors are clustered at the course-instructor level. p-values are in parentheses.
Instructor Bias as a Possible Explanation for GPD
Instructor bias could be an alternative explanation for the instructor gender effects we document in quantitative courses if female instructors have preferences or beliefs that help female students (only) in quantitative courses relative to male instructors. For example, if male instructors think that women cannot perform well in quantitative classes, but female instructors do not hold this belief, we could potentially have gendered instructor differences in grading and in student treatment within quantitative courses. We test for potential instructor bias by examining differences in students’ perceived treatment as measured in the survey. If instructor bias were palpable, we would expect students to report differences in how fairly they felt male versus female instructors treated them. Of course, bias is not always noticeable but may still lead to grade differences. We expect such grade differences to be more prominent in courses where a larger proportion of the grade is subjective. Therefore, we also test for potential instructor bias by examining whether GPD varies with how performance in a course is graded (how subjective the grade is).
Instructor gender and perceived treatment
The survey asked students to rate how fairly they thought the instructor treated them on a 1–7 Likert scale. We use the same specification as in Equation 3 to explore the impact of instructor gender on these evaluations by student gender and course type. The second column of Table 8 reports the results. We find that how students feel treated by their professor does not vary by the student’s and the professor’s gender in quantitative or nonquantitative courses (all ps > .3). This null effect is supportive of the conjecture that overt instructor bias is unlikely to be a main driver of results in this context.
Moderation by grading component weights
We turn to the administrative data to test instructor bias in grading. As discussed previously, if instructor bias is a main driver of the impact of instructor gender on GPD in quantitative courses, we would expect this effect to be larger in classes in which the instructors have more discretion over the grades and/or when subjective performance evaluations comprise a larger fraction of a student’s grade in the course.
To test this conjecture, we code the syllabi of fixed-core courses. For each class, we denote the percentage of the grade that depends on participation, group assignments, and individual exams or assignments. Almost all individual exams and individual assignments are graded by teaching assistants and/or are standardized (e.g., multiple choice, common rubric); therefore, they are unlikely to suffer or benefit from instructor bias. However, class participation may be influenced by instructor bias. In addition, group projects are more likely to be graded by instructors; therefore, the potential for bias exists, but all students in a group receive the same grade. Using variation across terms within a course in the weight of class participation and group- and individual-graded components, we test the conjecture that the effect of instructor gender on GPD would be larger in classes in which subjective grading components (class participation and group assignments) represent a larger fraction of a student’s grade in the course. In particular, we extend Equation 3 by including interactions of professor gender, student gender, and the percentage of the grade that comes from participation and group assignments.
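The coding step described above can be illustrated with a minimal Python sketch. It is not the authors' code: the `GradeWeights` structure, the `interaction_terms` helper, the requirement that the three weights sum to 100, and the example syllabus values are all assumptions made for illustration. The sketch only shows how a subjective-weight share and its interactions with instructor and student gender could be built for one student-course observation; some of these terms may be absorbed by the fixed effects in the actual specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GradeWeights:
    """Grade weights coded from one syllabus, in percent (hypothetical structure)."""
    participation: float
    group: float
    individual: float

    def __post_init__(self):
        # Assumption: the three coded components cover the whole grade.
        total = self.participation + self.group + self.individual
        if abs(total - 100.0) > 1e-6:
            raise ValueError(f"weights must sum to 100, got {total}")

    @property
    def subjective_share(self) -> float:
        """Fraction of the grade from participation and group assignments."""
        return (self.participation + self.group) / 100.0


def interaction_terms(female_prof: int, female_student: int,
                      weights: GradeWeights) -> dict:
    """Build illustrative interaction regressors for one observation.

    female_prof and female_student are 0/1 indicators.
    """
    s = weights.subjective_share
    return {
        "female_prof_x_female_student": female_prof * female_student,
        "female_prof_x_subjective": female_prof * s,
        "female_student_x_subjective": female_student * s,
        "female_prof_x_female_student_x_subjective":
            female_prof * female_student * s,
    }


# Hypothetical syllabus: 15% participation, 25% group work, 60% individual work.
example = GradeWeights(participation=15, group=25, individual=60)
terms = interaction_terms(female_prof=1, female_student=1, weights=example)
```

If instructor bias drove the results, the coefficient on the triple interaction would be expected to differ from zero; this is the pattern the moderation test looks for.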
Table 9 presents the results. We provide the estimates for quantitative and nonquantitative classes across two columns for ease of exposition. The impact of professor gender on GPD (captured by γ) is not moderated by the weight of the class participation and group components versus the individual component of the grade. Therefore, we conclude that the positive impact of female instructors on closing the gender gap in quantitative courses (in which female students lag behind) is not moderated by how the students are graded in a course. Although we cannot explicitly rule out instructor bias because we do not observe instructor beliefs, these results do not show the kind of data patterns we would expect if instructor bias were a main driver of our results.
Table 9. Moderation by Weight of Individual, Group, and Participation Grade Components.
*p < .10.
**p < .05.
***p < .01.
Notes: The regression includes interactions of course type, professor gender, student gender, and the percentage of the grade that comes from participation and group assignments. The columns organize the coefficients by course type. The regression includes the following control variables: course by term by student-gender fixed effects, student-term control variables Xst interacted with each subject, and instructor-term control variables Xpt interacted by student gender. Robust standard errors are clustered at the course-instructor level. p-values are in parentheses.
Innate Preferences as a Possible Explanation for Positive GPD
Our results for instructor gender effects on GPD in quantitative courses are not consistent with innate preferences as the driver of GPD. As we stated previously, if differences in innate preferences are the main driver of GPD, then female instructors should not increase interest and improve the grades of female students in these courses. Our results are more consistent with the hypothesis that gender stereotypes hinder female students’ academic performance.
In contrast, we get null results for instructor effects in nonquantitative courses. We cannot rule in or rule out gendered differences in innate preferences driving the male students’ underachievement in nonquantitative courses.
Conclusion
We document significant academic achievement gaps among otherwise similar men and women in business education. The magnitude of our findings should be of concern to business schools. Women’s grades are, on average, 11% of a standard deviation lower in quantitative courses than those of men with similar academic aptitudes and demographics, and men’s grades are, on average, 23% of a standard deviation lower in nonquantitative courses than those of comparable women. In the case of women’s achievement gap in quantitative courses, we also show that instructors whose identity counters gender stereotypes of performance in these fields can help close these gaps.
These results suggest that business schools should strive to combat gender stereotypes that may be hindering students’ achievement. Academic achievement gaps not only create inequity in education but also may have far-reaching consequences, because they may shape occupational choices (Ost 2010). Bertrand, Goldin, and Katz (2010) cite gender differences in GPA and the propensity to take finance courses as a potential contributor to the gender gap in earnings among the master of business administration students they study in later stages of their careers. In the business school we study, the gender gap in wages was prominent even immediately after graduation and directly correlated with career choices. Analyzing the exit survey of the last three cohorts of undergraduate alumni, we find that women earn an average of $71,186 and that men earn an average of $75,155 after graduation (difference of $3,969, p < .0001). Starting salaries are highest for jobs in investment banking (average $84,600), consulting (average $75,000), other finance roles (average $74,100), operations and supply chain management (average $68,600), and marketing in the technology sector ($66,500). Jobs in human resources ($59,100), general management ($60,700), accounting ($60,700), marketing in consumer packaged goods (CPG)/retail ($62,100), and advertising (average $51,100) pay significantly less. The three highest-paying tracks are also the most popular first jobs in terms of the percentage of students choosing them, followed by marketing jobs across different industries (tech, CPG/retail, other). Although female student representation is 41% overall, women are underrepresented among students taking jobs in investment banking (20%) and overrepresented in accounting (65%) and human resources (90%). Women are also overrepresented in marketing positions, but more so for positions in CPG and retail (84%) than for positions in the technology sector that pay more (54%). In light of early achievement gaps in quantitative courses that are basic stepping stones for an academic path that prepares students for jobs in the investment banking and technology sectors, these results add to the concern that academic achievement gaps may be influencing career-path choices and success down the line.
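The salary and representation figures above can be put on a common scale with a short calculation. The Python sketch below is illustrative only: it uses the averages and percentages reported in the text, and the "representation index" (women's share in a job track divided by their 41% overall share) is an assumed summary measure, not one the paper reports.

```python
# Average starting salaries reported in the exit survey (US dollars).
women_avg = 71_186
men_avg = 75_155

gap = men_avg - women_avg            # absolute gap in dollars
relative_gap = gap / men_avg         # gap as a fraction of men's average
earnings_ratio = women_avg / men_avg

# Share of women among students taking each job track, as reported in the text.
overall_female_share = 0.41
female_share_by_track = {
    "investment banking": 0.20,
    "accounting": 0.65,
    "human resources": 0.90,
    "marketing (CPG/retail)": 0.84,
    "marketing (technology)": 0.54,
}

# Assumed summary measure: values below 1 indicate underrepresentation
# relative to the overall student body, values above 1 overrepresentation.
representation_index = {
    track: share / overall_female_share
    for track, share in female_share_by_track.items()
}

print(f"Absolute gap: ${gap:,}")
print(f"Relative gap: {relative_gap:.1%}")
for track, index in representation_index.items():
    print(f"{track}: {index:.2f}")
```

On this scale, investment banking is the only listed track with an index below 1, and the index for marketing in CPG/retail is higher than for marketing in technology, which matches the pattern described in the text.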
Both for educational purposes in their own right and to help close equity gaps in the initial placement and promotion of men and women in management jobs, it is important to look for ways to arrest achievement differences in business education. Our findings suggest the power of instructors as role models. In addition, business schools may also be able to reduce these discrepancies by providing other salient exemplars that counter gender stereotypes (e.g., speakers, student organization leaders, alumni). Given that we observe achievement gaps in the introductory curriculum, our findings also highlight the necessity of focusing these efforts on the early years of the UBA program. If providing role models for women in quantitative courses taught in business schools can close academic achievement gaps, this can have long-lasting repercussions on the extent to which early achievement begets interest and future performance in those fields.
Eliminating achievement gaps due to gender stereotypes can also help more accurately align students’ abilities and talent with their beliefs and educational choices. Better alignment, in turn, will improve students’ career success and satisfaction and therefore may ultimately benefit their employers in several ways. First, better alignment can increase productivity and employee satisfaction. Second, to the extent that it also levels the playing field among men and women, it can help recruiters hire a more diverse workforce. Considering both the student-side and the employer-side benefits, increased alignment of talent and educational choices can aid business schools as they race to attract high-quality and diverse applicants as well as top-paying recruiters who are looking for quantitatively qualified female applicants.
The potential misallocation of talent to careers due to stereotypical beliefs is particularly important to consider for the field of marketing as it becomes more quantitative and data-driven. Top business schools are introducing multiple “analytics” courses and specializations, and many of these courses are offered by marketing departments. To get desirable and lucrative jobs within the field of marketing and advance in the field, these courses are becoming increasingly important. Our results point to the need to consider the representation of female faculty in delivering these courses to help allocate the right talent to the field, to benefit students, and to assist future employers looking for a diverse and talented workforce.
Our results also underscore the importance of reducing gender gaps when hiring new faculty and highlight that special attention should be paid to this issue in more quantitative disciplines. We also recommend that faculty teaching assignments be linked to gender representation needs across courses, beyond the typical practice of “which new hire is willing and able to teach which elective.” Our findings indicate that differences in quantitativeness require different assignment practices—not because of the capability of the instructor in teaching the course but because of the impact on student performance. The good news is that a change in faculty gender representation in early coursework in UBA programs can be easily realized when hiring new faculty. Interestingly, this may also result in a more gender-balanced faculty in the quantitative disciplines, with more undergraduates moving to pursue doctorates in quantitative fields; currently the faculty gender ratio is heavily skewed in favor of men at most schools. As such, our results bear many implications for the administration of business schools.
We hope that our results prove useful to business school administrators, faculty, and students. We also hope that our work inspires further research on the determinants of and solutions to academic achievement disparities across genders in higher education. After all, our economy is only as strong as the talent that fuels it.
Supplemental Material
Supplemental Material, sj-pdf-1-mrj-10.1177_0022243720972368 for Gender (Still) Matters in Business School by Aradhna Krishna and A. Yeşim Orhun in Journal of Marketing Research
Acknowledgments
The authors thank Paul Kirsch, Regina Zmich, Heather Bryne, Lillian Chen, Lisa Kozlo, and Kevin Gates for their help in providing access to and support with institutional data from the business school. Max Resnick and Bruno Castelo Branco provided excellent research assistance. The authors also thank Timothy McKay and Mark Umbricht for their support, extensive knowledge, and help with the administrative data set. They also thank the Associate Editor for his guidance and the reviewers for their helpful comments.
Authors’ Note
The authors are listed alphabetically.
Associate Editor
Hari Sridhar
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.