On Measuring Social Science Impact

Abstract

I went to a talk at a conference a few years ago (remember those?) where the speaker referenced the ongoing debate around a paper by Ken Gergen entitled ‘Social Psychology as History’. The paper was published in the Journal of Personality and Social Psychology (JPSP), in 1973! The fact that a paper published nearly 50 years ago is still being debated by social psychologists today should remind us that the time horizons over which social science work has an impact are, on the whole, much longer than in STEM disciplines. And relatedly, social science scholars, within distinct ecosystems of reward and recognition, typically grow their reputations and expertise over a longer time scale than their STEM counterparts.

Gergen’s paper is wonderfully self-referential in this regard given its claim that social psychology is different from STEM and cannot issue the same kinds of law-like generalizations and predictions that one might find in physics and chemistry, not least because its very findings can alter the nature of the behaviour it attempts to explain. What sociologists call reflexivity impacts the study of meaning in ways that are not so relevant to the study of molecules. And for this and other reasons the impact of work in social science is often more diffuse over time rather than being about breakthrough Eureka-like discoveries.

The impact and influence of Gergen’s article is enduring because it has folded into what the LSE-based authors of The Impact of the Social Sciences called a ‘dynamic stock of knowledge’ (Bastow, Dunleavy, & Tinkler, 2014). This dynamic stock is filled not only with articles but, of course, with influential books. That same year saw the publication of Clifford Geertz’s The Interpretation of Cultures, and two years earlier came John Rawls’ A Theory of Justice: extraordinarily important books that have made enduring contributions to anthropology and political theory respectively. Geertz and Rawls arguably made their reputations through these books in ways that dwarf the articles they published in high-impact journals. Again, this mixture of academic outputs is more a feature of humanities and social science disciplines than of STEM disciplines.

This is all to say that one size does not fit all when we talk of research impact or excellence. ‘Social Psychology as History’ was published in one of the most prestigious and impactful social psychology journals. JPSP is published by the American Psychological Association and has published many of the most important papers in the field. Gergen’s article is still controversial and, despite his warnings, most of American social psychology has wedded itself to the claim that the STEM research model is applicable to its domain too. And this in turn has led to a replication crisis, which arguably supports Gergen’s original point.

So what does the fact that JPSP has a two-year impact factor of 7.673 actually tell you about the journal? The impact factor is a two-year moving window, designed with STEM research in mind, and it certainly doesn’t include a nearly 50-year-old article with more than 3,000 citations. One might learn more about the journal from a comment made by Malcolm Gladwell who, on being asked where he would like to be buried, replied ‘I’d like to be buried in the current-periodicals room, maybe next to the unbound volumes of the Journal of Personality and Social Psychology (my favorite journal).’ His bestselling books such as The Tipping Point, Blink and Outliers have brought much of the empirical work in JPSP to a wider public audience, with, no doubt, various impacts on policy and public debate.

Social scientists and publishers of social science all know that impact is a complex, subtle, diffuse and many-layered effect that accumulates through time. Yet we focus so much on a simple and blunt measure of a journal’s reputation with significant unintended consequences. The danger of a powerful incentive is captured nicely in Goodhart’s law (named after economist Charles Goodhart) which says, ‘when a measure becomes a target, it ceases to be a good measure’. And thus we see attempts at gaming the system in various ways to try and boost that over-fetishized number ranging from self-citations and citation clubs, to conservative selection by editors of papers based on the profile of the authors and their guess around likely citations. So measuring social science research becomes highly political in that the attribution of value can shape research practices themselves. In the reputation economy, academics will inevitably behave in ways that are rewarded by those making decisions on promotion, tenure, research grants and more.

We at SAGE as publishers of social science have often debated this question and our role in perpetuating the problem. We have attempted to shift the emphasis by announcing five-year impact factors for the last couple of years and have created an annual prize for the most highly cited articles we have published over a ten-year period. While counting citations over a longer period of time will always have a place in measuring quality and impact, scholarly impact and significance cannot be restricted to this kind of measure alone.

Debate around a focus on the impact factor is not restricted to social science of course. Researchers in all fields have found it to be problematic. It is a decade since the San Francisco Declaration on Research Assessment (DORA) was launched with a statement that ‘There is a pressing need to improve the ways in which the output of scientific research is evaluated by funding agencies, academic institutions, and other parties.’

That need has only intensified in subsequent years. The Metric Tide, an independent assessment of the role of metrics in assessing research led by James Wilsdon (2015), concluded among other things that narrow metrics are not a responsible way to assess research unless they are supplemented by qualitative judgements such as peer review. The report advocated a move to responsible metrics (distinct from the broader moves towards responsible research more generally), which have the following characteristics:

Robustness: basing metrics on the best possible data in terms of accuracy and scope;

Humility: recognizing that quantitative evaluation should support – but not supplant – qualitative, expert assessment;

Transparency: keeping data collection and analytical processes open and transparent, so that those being evaluated can test and verify the results;

Diversity: accounting for variation by field, and using a range of indicators to reflect and support a plurality of research and researcher career paths across the system;

Reflexivity: recognising and anticipating the systemic and potential effects of indicators, and updating them in response.

In line with The Metric Tide, the UK government subsequently concluded that peer review should remain the primary method of research assessment in the next Research Excellence Framework (REF), supported by responsible uses of quantitative indicators. Another of its recommendations was for the creation of a UK Forum for Responsible Research Metrics, which was launched in 2016. Wilsdon himself now leads the Research on Research Institute which dedicates itself to helping create a healthier research ecosystem, in which the question of research evaluation is embedded.

The argument of course is not to abandon metrics entirely: we all need ways to filter knowledge claims to identify relevance and excellence. As Herbert Simon pointed out, long before the internet was born, when information becomes plentiful, attention becomes the scarce resource. So yes, we need filters and signals of quality and authority. Rather, the need is for the development of responsible ones that are a better guide than the journal impact factor (JIF).

However, despite the diagnosis of the problem there is no consensus on the cure, as to how social science impact can best be demonstrated. In the absence of good alternatives, we default to the simple and dominant mode. A recent survey of faculty at four United States universities from the Association of College & Research Libraries (Bakker et al., 2019) finds that social science researchers are shifting their conceptions of demonstrating impact toward ‘ways more aligned with the sciences and health sciences’ (and away from those favoured in the arts and humanities). This is problematic since impact measurements appropriate for astrophysics won’t read across well for anthropologists. Writing in Research Evaluation, the Italian National Research Council’s Emanuela Reale and her colleagues (2018) argue that the predominant methods used by natural scientists to demonstrate impact ‘tend to underestimate’ the value of social science research because of time lags, methodological variety and social science’s interest in new approaches, rather than solely iterative ones.

At the same time, an explosion of data has blown open new doors for social and behavioural research and pointed to many tools that could measure its impact. Yet the various attempts to establish alternative metrics, of which Altmetrics is probably the most well-known, often suffer from being measures of popularity rather than authority, and can also be easily gamed.

Recognizing that the issues are systemic, that they have to do with incentive structures in the academy, and that they are therefore not within the scope of a single actor, we have moved beyond publishing on this topic to acting as a convener for other voices. A couple of years ago, SAGE assembled a working group meeting at the offices of Google Scholar which included a range of experts from Clarivate, to Altmetrics, to the Social Science Research Council and more. As a result, we produced a white paper highlighting the findings of this group. The report maps out stakeholder categories, defines key terms and questions, puts forward four models for assessing impact, and presents a list of 45 resources and data sources that could help in creating a new impact model. And to keep the conversation alive for the longer term, we have also developed an impact section of the SAGE-sponsored community site Social Science Space. (This space itself is also being used to gather ideas, amplify diverse opinions, and engage in debate about impact with global actors engaged on the topic.)

But beyond these efforts there is not much more an individual publisher, or journal, can do. What we have learned is that while the current measures remain the central reputation-conferring mechanisms in the academic ecosystem, for funding, promotion, tenure and other forms of advancement, the publisher’s reliance on the JIF seems merely to reflect back the preferences of the community it serves. So the challenge sits with higher education institutions and scholarly associations such as EGOS to change the ways reputations can be built in various fields. It seems to me that the leading societies in particular can start to define new criteria for assessment in ways that are responsive to the needs of their own domain. Until the interested parties can overcome this coordination problem the blunt measures will persist.

It is fitting that 1973 was not only the year in which Ken Gergen published his article in JPSP, but also the year that saw the launch of the Social Science Citation Index (SSCI). The Science Citation Index was created nearly a decade before. The SSCI will be 50 years old next year. Surely, we can do better than that in creating a high quality and impactful social science ecosystem.

A relevant quotation is often attributed to Einstein. It is ironic that the line is attributed to a physicist when it reflects so well the issues faced by social science in particular. How satisfying then to discover, in the preparation of this piece, that the true attribution should be to a social scientist, William Bruce Cameron, who in his 1963 text Informal Sociology: A Casual Introduction to Sociological Thinking said the following (p. 13, emphasis added):

‘It would be nice if all of the data which sociologists require could be enumerated because then we could run them through IBM machines and draw charts as the economists do. However, not everything that can be counted counts, and not everything that counts can be counted.’

References

Bakker, Caitlin J., Bull, Jonathan, Courtney, Nancy, DeSanto, Dan, Langham-Putrow, Allison A., McBurney, Jenny, & Nichols, Aaron (2019). How faculty demonstrate impact: A multi-institutional study of faculty understandings, perceptions, and strategies regarding impact metrics. In Mueller, Dawn M. (Ed.), Recasting the narrative: The proceedings of the ACRL 2019 conference (pp. 556–568). Association of College and Research Libraries.

Bastow, Simon, Dunleavy, Patrick, & Tinkler, Jane (2014). The impact of the social sciences: How academics and their research make a difference. London: SAGE Publications.

Cameron, William B. (1963). Informal sociology: A casual introduction to sociological thinking. New York: Random House.

Geertz, Clifford (1973). The interpretation of cultures. New York: Basic Books.

Gergen, Ken J. (1973). Social psychology as history. Journal of Personality and Social Psychology, 26, 309–320.

Rawls, John (1971). A theory of justice. Cambridge, MA: Harvard University Press.

Reale, Emanuela, Avramov, Dragana, Canhial, Kubra, Donovan, Claire, Flecha, Ramon, et al. (2018). A review of literature on evaluating the scientific, social and political impact of social sciences and humanities research. Research Evaluation, 27, 298–308.

Wilsdon, James (2015). The metric tide. London: SAGE Publications.