Abstract
Efforts aimed at broadening participation in science, technology, engineering, and mathematics (STEM) require a holistic presentation of the state of racial and gender participation. Statistics currently used to describe participation often include raw counts of degrees and the percentages of demographic groups receiving STEM degrees. While these data provide insights into demographic trends, they do not present the complete picture because these “traditional” statistics do not capture how well a field of study reflects—or is proportionally similar to—a larger body, such as the college population. If the goal of broadening participation in STEM education is to ensure that all racial and gender groups are proportionally represented, analysts require direct measures of representation. In this article, we present a novel metric that assesses the degree to which groups are overrepresented or underrepresented in a given field. This metric calculates field-specific representation by comparing the proportion of degrees awarded to members of a demographic group in a specific field of study with the proportion of all degrees awarded to that group. Using data from the National Science Foundation and the Department of Education, we demonstrate the efficacy of this representation metric and show that it provides new insights into STEM participation levels for women and other groups considered to be underrepresented. While traditional measurements show the increasing number of degrees awarded to and the increasing share of underrepresented minority students in STEM, our metric revealed that underrepresented minorities remain underrepresented in STEM fields, especially in engineering and the natural sciences.
Introduction
Both national and local efforts seek to increase participation in science, technology, engineering, and mathematics (STEM) fields. Often, programs emphasize broadening participation, typically a policy shorthand for increasing the number of underrepresented minorities (URMs), that is, Black, Hispanic, and Native American students, in STEM overall and the number of women studying the physical sciences, technology, and engineering (National Science and Technology Council, 2013). The Census Bureau projects that the United States will become a majority-minority nation in 2043 (U.S. Census Bureau, 2012). However, the demographic profile of American students, particularly American college students, does not necessarily reflect the changing national populace. This issue is particularly pertinent in the STEM fields, whose recent graduates report higher earnings than recent graduates majoring in humanities or education fields (Melguizo & Wolniak, 2012).
Analyses of higher-level demographic trends that examine the relative representation of URMs and women in STEM fields often rely on descriptive statistics. Typical figures include the number of URM students awarded STEM degrees, the percentage of all STEM degrees awarded to URM students, or the percentage of URM students awarded STEM degrees (National Academies, 2011; National Science Board [NSB], 2014). These statistics, while useful, present an incomplete, and often contradictory, picture of representation in STEM. While the National Center for Education Statistics (Chen, 2013) suggests that minorities and women remain underrepresented due to higher rates of attrition in STEM, the NSB (2014) conversely reports that minorities are earning a greater share of undergraduate STEM degrees. The NSB figure lacks the context of the broader college population, making it unclear whether URM students are increasing their relative representation. Furthermore, other data have suggested that racial and ethnic minorities have obtained equal footing in STEM fields: approximately one third of undergraduate students within each racial group, with the exception of Asian Americans, earn STEM degrees (NSB, 2014). This would imply that broadening participation further requires increasing the number of URMs enrolling in college. Even if each racial group earns STEM degrees at near-equal rates, this figure is not put in perspective relative to the number of each group enrolled. This is particularly relevant given that URM students are disproportionately more likely to be first-generation college students, who typically experience higher rates of attrition in STEM (Kena et al., 2015; Shaw & Barbuti, 2010). Therefore, in order to assess whether women and URMs are trending toward parity in STEM, further work is needed to quantify their relative representation in the field.
Recent work has attempted to quantify the representation of URMs in STEM fields. The American Institutes for Research (2012) developed the Broadening Participation in STEM Progress Index, which compares the percentage of all STEM bachelor’s degrees earned by each racial/ethnic group with U.S. Census population estimates for each group. The index quantifies overrepresentation (value >100%) and underrepresentation (value <100%) in a way that is novel, yet the comparison of STEM graduates to national population estimates does not address whether the issue of underrepresentation lies in STEM attrition or college matriculation. Furthermore, although much research has been separately focused on women and URMs in STEM, there remains a gap in quantifying the interaction of gender and race in relation to STEM representation (Riegle-Crumb & King, 2010).
In this article, we present a metric that quantifies the degree to which demographic groups are overrepresented or underrepresented in STEM fields. Building on the framework developed by the American Institutes for Research (2012), this metric allows the quantification of representation by comparing the proportion of STEM bachelor’s degrees awarded to each race and gender group with the proportion of all bachelor’s degrees awarded to each group. The metric also addresses problems presented by metrics relying on percentage-based index values (e.g., 90% or 150% representation). While ordinary percentages represent data asymmetrically (200% is a “larger” deviation from 100% than is 50%), our metric presents representation on the order of doublings and halvings, making 200% representation as large a deviation from parity as 50% representation.
Method
Data Sets and Term Definitions
We used data from both the National Science Foundation’s (NSF) Science and Engineering Indicators 2014 report and the National Center for Education Statistics’ Integrated Postsecondary Education Data System (IPEDS). Although the NSF data set uses IPEDS data, we chose to use the data found directly in the NSF’s report in order to illustrate the further insight gained by the use of our representation metric. We used data from NSF’s Appendix table 2-17 for data on bachelor’s degrees awarded by sex and NSF’s Appendix table 2-23 for data on bachelor’s degrees awarded by race/ethnicity (NSB, 2014). Appendix table 2-23 counts temporary residents, excluded in this analysis, as a separate racial/ethnic group, while Appendix table 2-17 does not distinguish permanent and temporary residents.
For further insight, we turned to the IPEDS survey, which is conducted at all institutions that participate in the Federal student aid programs as authorized by Title IV of the Higher Education Act of 1965 (National Center for Education Statistics, 2015). We used data from the IPEDS Completion Survey, which includes information on the types of degrees awarded each year and the demographics of the students receiving them. The analysis was conducted exclusively on U.S. citizens; thus, students classified as “temporary residents” in the IPEDS data set were excluded.
We used two different definitions of science-related fields. For NSF data, we used their definition of science and engineering (S&E) fields, which includes engineering, natural science, and social science fields. For IPEDS data, we used the National Center for Education Statistics’ definition (Chen, 2013), which includes the broad fields of life sciences, physical sciences, mathematical and computer sciences, geosciences, science/engineering technologies, and engineering. Of note, this definition excludes the social sciences, including psychology and anthropology, as well as the preprofessional health sciences fields.
Black, Hispanic, and Native American students were classified as underrepresented minorities (URM), according to the NSF definition (NSF, 2008). White and Asian students were excluded from this classification. Because the IPEDS racial category of “Other/Unknown” was ambiguous, students in this category were classified as their own third category, inheriting the “Other/Unknown” label. Other/Unknown students have been a sizeable fraction of the college population in recent years and were therefore included in the analysis.
Calculation of the Representation Metric
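Let $N_{i}$ be the number of bachelor’s degrees awarded to members of group $i$, and let $N$ be the total number of bachelor’s degrees awarded.
Let
$$P_{i} = \frac{N_{i}}{N}$$
Equation 1: Proportion of bachelor’s degrees awarded to members of group i.
Next, let $N_{ij}$ be the number of field $j$ bachelor’s degrees awarded to members of group $i$, let $N_{j}$ be the total number of field $j$ bachelor’s degrees awarded, and let
$$P_{ij} = \frac{N_{ij}}{N_{j}}$$
Equation 2: Proportion of field j bachelor’s degrees awarded to members of group i.
Finally, let
$$R_{ij} = \log_{2}\left(\frac{P_{ij}}{P_{i}}\right)$$
Equation 3: Representation metric value of group i in field j.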
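The three steps of the calculation can be sketched in a few lines of Python. This is a minimal illustration rather than the code used for the analysis: the function name, its parameter names, and the counts passed to it are all hypothetical, and the counts are not drawn from the NSF or IPEDS data.

```python
from math import log2


def representation_metric(group_field, field_total, group_all, all_total):
    """Return log2 of (group share of field degrees) / (group share of all degrees)."""
    if min(group_field, field_total, group_all, all_total) <= 0:
        raise ValueError("all counts must be positive")
    share_of_all = group_all / all_total        # Equation 1: share of all degrees
    share_of_field = group_field / field_total  # Equation 2: share of field degrees
    return log2(share_of_field / share_of_all)  # Equation 3: representation value


# Hypothetical counts, for illustration only (not NSF or IPEDS figures).
value = representation_metric(
    group_field=130, field_total=1_000, group_all=2_000, all_total=10_000
)
print(round(value, 3))
```

A result of 0 indicates parity, a negative result indicates underrepresentation, and a positive result indicates overrepresentation. The sketch rejects zero counts because the logarithm is undefined when a group has no degrees in a field; how such cases should be handled in practice is left open here.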
Intuitive Explanation
This metric operates on the assumption of proportional representation throughout the college population. If there were no factors affecting the attraction to and retention within STEM, one would expect that the proportion of engineering bachelor’s degrees awarded to Hispanic students, for example, would be the same as the proportion of bachelor’s degrees awarded to Hispanics. If the proportion of engineering degrees awarded to Hispanic students is lower than the proportion of all bachelor’s degrees awarded to that group, one can reasonably conclude that Hispanics are underrepresented in that field.
Taking this line of reasoning a step further, dividing the proportion of engineering bachelor’s degrees awarded to Hispanic students by the proportion of all bachelor’s degrees awarded to Hispanic students yields a quantitative measure of how overrepresented or underrepresented the demographic is within the field of engineering. For example, if in a given year, 20% of all bachelor’s degrees were awarded to Hispanic students, we would expect 20% of engineering bachelor’s degrees to be awarded to Hispanic students. This proportion, based on the percentage of all bachelor’s degrees awarded to a specific demographic, is our expected value. However, if we find that in that same year only 13% of engineering bachelor’s degrees were awarded to Hispanics, we could divide this proportion by the proportion of all bachelor’s degrees awarded to the same group (13% / 20% = 0.65) to find that the number of Hispanic students graduating in engineering is 65% of the number we would expect, given the group’s proportion in the general college graduation pool. We consider this ratio of the observed proportion to the expected value a measure of representation within that field.
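Last, we take the binary logarithm ($\log_{2}$) of this ratio, so that parity corresponds to a value of 0, overrepresentation to positive values, and underrepresentation to negative values, with each unit representing a doubling or halving relative to the expected proportion.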
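The effect of the logarithm can be checked numerically. The short sketch below uses the 13% and 20% figures from the example above and compares a group at double its expected share with one at half of it; the variable names are illustrative only.

```python
from math import log2

# Ratio of observed to expected proportion from the example: 13% / 20%.
ratio = 0.13 / 0.20
metric = log2(ratio)

# A group at double and at half its expected share is equally far from parity.
doubled = log2(2.0)  # 200% of the expected share
halved = log2(0.5)   # 50% of the expected share

print(f"ratio = {ratio:.2f}, metric = {metric:.3f}")
print(f"doubled = {doubled:+.1f}, halved = {halved:+.1f}")
```

On the percentage scale, 200% sits 100 points from parity while 50% sits only 50 points away; on the binary-logarithm scale the two are exactly one unit from 0 in opposite directions.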
Results
Metric Insights With NSF Data Set
Summary statistics bolster the claims of increasing S&E degree numbers found in NSF’s Science and Engineering Indicators 2014. The number of S&E degrees awarded to each racial group increased between 2000 and 2011 (Figure 1A). In 2011, 101,207 S&E degrees were awarded to URM students, an increase of 59.2% over 2000 levels. Similarly, 390,840 S&E degrees were awarded to White or Asian students nationally, an increase of 27.8% over 2000 levels. Comparing these two populations with each other, the share of all S&E degrees awarded to URM students nationally increased from 16.6% in 2000 to 19.0% in 2011 (Figure 1B).

Figure 1. (A) Raw count of S&E degrees awarded. (B) Share of S&E degrees awarded to each racial group.
Another often-cited statistic—within-group share of S&E degrees—brings us closer to understanding possible representation discrepancies in S&E fields. This analysis of the data allows us to understand the rate at which individuals within a demographic group pursue degrees in S&E fields. We see that over the span of about a decade, the within-group S&E share has remained relatively consistent between 30% and 35% of all degrees awarded to each group (Figure 2A). Additionally, we see that URM students pursue S&E degrees at a rate similar to, but slightly below, the rate of non-URM students. When only considering natural sciences and engineering (i.e., excluding social sciences), the gap between URM and non-URM students is more pronounced (Figure 2B). In 2011, URM students were awarded engineering and natural science degrees at a rate over five percentage points lower than White or Asian students. Furthermore, examining engineering and natural science fields reveals that the degree gap between URM and Asian/White students is widening, with URM students receiving engineering or natural science degrees at a lower rate in 2011 than in 2000. While this method of data analysis brings some insight into S&E education discrepancies between URM and non-URM students, it still does not quantify the relative underrepresentation or overrepresentation of each group.

Figure 2. (A) Percentage of individuals within racial group awarded S&E degree. (B) Percentage of individuals within racial group awarded an engineering or natural science degree.
The representation metric reveals trends that are otherwise difficult to discern in the typical representations presented in Figures 1 and 2. While Figure 1 revealed increasing absolute numbers and increasing shares of S&E degrees, this analysis reveals that despite these figures, URM students have been decreasing in relative representation within S&E fields (Figure 3A). This trend may have been hinted at in the analyses presented in Figure 2; however, the representation metric allows one to quantify the overrepresentation and underrepresentation of groups within the field. In 2011, URM students had an S&E representation level of −0.058, meaning that they were awarded S&E degrees at a rate 96.1% of that which one would expect given their proportion of the college population. Once again, when examining only engineering and natural science fields, we see that URM students have been consistently underrepresented and have actually been experiencing a declining level of representation in these fields (Figure 3B). Between 2000 and 2011, URM representation in engineering and the natural sciences declined from −0.243 (84.5% of the expected value) to −0.408 (75.4% of the expected value).
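Because the metric is a binary logarithm, a reported value can be converted back to a percentage of the expected share by raising 2 to that value. The helper below is a small sketch of that conversion; the function name is hypothetical, and the three inputs are the metric values quoted in the preceding paragraph.

```python
def percent_of_expected(metric_value):
    """Convert a log2 representation value back to a percentage of the expected share."""
    return 100 * 2 ** metric_value


# Metric values quoted in the text for URM students in the NSF data set.
for value in (-0.058, -0.243, -0.408):
    print(f"{value:+.3f} -> {percent_of_expected(value):.1f}%")
```

Rounded to one decimal place, the results agree with the 96.1%, 84.5%, and 75.4% figures given in the text.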

Figure 3. Representation metric by demographic group over time using the NSF data set. (A) Representation metric by racial group using NSF definition of S&E fields. (B) Representation by racial group in only engineering and natural science fields.
The representation metric can also be used to disaggregate S&E fields and present a holistic snapshot of representation in one figure. Figure 4 presents the 2011 representation of URMs and women in S&E disciplines. By simultaneously presenting the representation of both URMs and women in each field, we see that underrepresentation is not uniform across S&E fields. For example, while computer science has racial parity, women are still notably underrepresented in the field. On the other hand, while women are well represented in both anthropology and biology, URM students remain underrepresented. Of course, both women and URMs remain significantly underrepresented in some fields, such as mechanical engineering, aerospace engineering, and physics. Aggregating the fields into their broad-level disciplines, we see again that women and URMs tend to remain underrepresented in engineering and natural science disciplines (Figure 5). On the other hand, women and URMs are actually slightly overrepresented in the social sciences and non-S&E fields.

Figure 4. 2011 Representation of URMs and women in S&E fields.

Figure 5. 2011 Representation of URMs and women in aggregate S&E disciplines.
Further Analysis With IPEDS Data Set
The NSF data set lacks some of the detail found in the original IPEDS data set because the NSF data stratify by race/ethnicity and by gender separately, but do not stratify by race/ethnicity and gender combined. Therefore, we applied the representation metric to IPEDS data in order to quantify how the interaction of race and gender relates to representation in STEM. At the same time, the two data sets use slightly different definitions of “STEM.” The IPEDS data use the National Center for Education Statistics definition of STEM, which includes technology fields and excludes the social and health sciences; the NSF data exclude technology fields and include the social sciences. This analysis revealed a similar but slightly different picture of representation compared with the analysis using NSF data. URM students have been consistently underrepresented in National Center for Education Statistics–defined STEM fields since 1995, making small gains in representation during the late 1990s and early 2000s before decreasing in relative representation to a value of −0.244 (84.4% of the expected value; Figure 6A). Moreover, including both race/ethnicity and gender in the analysis simultaneously revealed that underrepresentation in STEM is not uniform among URM graduates. While White and Asian women have increased their representation in STEM, even approaching parity, URM women have remained stagnant in their underrepresentation in STEM fields (Figure 6B). While men of both groups were initially overrepresented among STEM graduates, URM men have decreased in representation to now be below parity level in STEM.
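Stratifying by race/ethnicity and gender together amounts to applying the same calculation to each combined group. The sketch below illustrates the idea with invented counts; the dictionary layout, the group labels, and every number in it are hypothetical and are not taken from IPEDS.

```python
from math import log2

# Hypothetical counts keyed by (race group, gender): (STEM degrees, all degrees).
counts = {
    ("URM", "women"): (40, 400),
    ("URM", "men"): (60, 300),
    ("non-URM", "women"): (150, 700),
    ("non-URM", "men"): (250, 600),
}

stem_total = sum(stem for stem, _ in counts.values())
all_total = sum(total for _, total in counts.values())

# Representation value for each combined race-and-gender group.
metric = {
    group: log2((stem / stem_total) / (total / all_total))
    for group, (stem, total) in counts.items()
}

for group, value in metric.items():
    print(group, round(value, 3))
```

With these made-up counts, women in both race groups fall below parity but by different amounts, which is the kind of difference that separate race-only and gender-only tabulations cannot show.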

Figure 6. Representation metric by demographic group over time, calculated using the IPEDS data set. (A) Representation analysis by race/ethnicity, but not by gender. (B) Representation analysis by race/ethnicity, further stratified by gender.
Discussion
In this article, we presented a statistical metric representing the degree to which any given group is overrepresented or underrepresented within academic fields. By comparing the proportion of degrees in STEM awarded to a given group with the proportion of the college graduating class represented by that group, we demonstrated how the metric fills an information gap that other presentations of the data miss. While traditional measurements of STEM education demographics revealed the increasing number of degrees awarded to and the increasing share of URM students in STEM, this metric revealed that URMs remain underrepresented in STEM fields. The underrepresentation of URMs is particularly visible when considering only engineering and the natural sciences. In fact, analysis with our representation metric revealed that the gap between URM and non-URM students is widening in engineering and the natural sciences. Additionally, further analysis using IPEDS data revealed that representation within gender groups is not equal across racial groups. While White and Asian women have approached parity in STEM, URM women remain significantly underrepresented. This finding in particular illustrates the need for statistical analyses that seek to fully capture the state of representation and not only participation in STEM.
This metric may bolster efforts to evaluate broadening participation programs. In fiscal year 2012, $1.396 billion in obligations were designated to Federal STEM education programs, many of which are focused on minority, disadvantaged, or underrepresented groups (Government Accountability Office, 2014). With these significant resources dedicated to broadening participation, there have been efforts to frame methods by which to evaluate the effectiveness of these programs, including metrics that can be used to evaluate progress (Clewell & Fortenberry, 2009; Leggon & Pearson, 2009). Our metric could be added to the list of statistical tools that can be deployed for evaluative purposes. In addition, this metric not only allows national policy makers to quantify the trends in STEM representation at the national level but also empowers state, local, and academic policy makers to examine their respective programs. For example, California state policy makers could examine the representation levels of both URMs and women in STEM within the California State University system (Figure 7).

Figure 7. 2012 Institution-level representation of URMs and women in NCES-defined STEM fields within the California State University system.
While insightful, the representation metric is intended to be complementary to well-established statistical analyses. Each data analysis addresses issues that require slightly different policy remedies. For example, underrepresentation, as revealed by the metric, could warrant the funding of programs to recruit and retain URMs and women in STEM. On the other hand, if URMs were well represented in STEM but stagnant in the number of degrees being awarded, it may indicate that more emphasis should be put on URM college enrollment, as opposed to STEM recruiting or retention. Therefore, this metric should be considered another tool for policy makers and not a replacement for traditional statistical measurements.
The representation metric could be expanded to other comparisons. Since the metric is simply the comparison of two proportions, the input numbers can be interchanged to reveal other trends in the STEM pipeline. A variety of representation calculations could be used to identify the “leaks” in the STEM pipeline for URMs and women. For example, taking inspiration from the American Institutes for Research (2012), this metric could be used to determine how representative STEM is of the larger U.S. population. Similarly, since graduate education first requires an undergraduate degree, the metric could be used to reveal how representative graduate STEM fields are of the undergraduate population. This would be done by dividing the proportion URMs hold in STEM graduate fields by the proportion URMs hold in the undergraduate population, or perhaps by the proportion URMs hold in undergraduate STEM fields. Additionally, there has been speculation that LGBT students are severely underrepresented in STEM fields (Cech, 2015; Suri, 2015). Were adequate data acquired on the LGBT student population, this metric could be used to examine this issue.
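Swapping the reference population only changes the denominator of the ratio. The sketch below shows the graduate-education comparison described above with two alternative baselines; the function name and all three shares are hypothetical values chosen for illustration, not reported data.

```python
from math import log2


def representation(observed_share, reference_share):
    """log2 ratio of a group's share in one population to its share in a reference one."""
    if not (0 < observed_share <= 1 and 0 < reference_share <= 1):
        raise ValueError("shares must be in (0, 1]")
    return log2(observed_share / reference_share)


# Hypothetical shares held by one group, for illustration only.
grad_stem_share = 0.10       # share of STEM graduate degrees
undergrad_share = 0.20       # share of all bachelor's degrees
undergrad_stem_share = 0.16  # share of STEM bachelor's degrees

vs_all_undergrads = representation(grad_stem_share, undergrad_share)
vs_stem_undergrads = representation(grad_stem_share, undergrad_stem_share)
print(round(vs_all_undergrads, 3), round(vs_stem_undergrads, 3))
```

Comparing the two results gives a rough sense of where a gap opens: the difference between them reflects underrepresentation already present among STEM bachelor’s recipients, while the second value isolates the step from undergraduate STEM to graduate STEM.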
The data presented here were aggregated for demonstrative purposes, but further work could delve into specific racial and ethnic groups. Future work with this metric could seek to analyze the representation of each URM subgroup—Black, Hispanic, and Native American—separately. Furthermore, recent White House initiatives have sought the disaggregation of data in order to accurately portray representation in STEM (Yu & Ahuja, 2013). This notion is perhaps highlighted in the finding that URM women remain underrepresented despite progress in the representation of White and Asian women.
Given the scale of the challenge of increasing STEM participation and the limitations imposed by finite resources, policy makers require better tools for identifying the nature of underrepresentation. Traditional data representations do not always represent the state of STEM participation accurately and can often miss key trends, such as the surging increase in representation for White and Asian women. To address the future of scientific research and the STEM workforce, we must first accurately diagnose the state of STEM education so that resources and attention can be correctly applied. We believe this metric, when used in parallel with traditional measurements, can help bring us closer to that goal.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
