Abstract
This article, based on a presentation at the Future of the History of the Human Sciences workshop (2016), discusses some of the potential benefits and pitfalls of digital humanities (DH) tools and approaches for historians of the human sciences. It reviews some of the major approaches that form DH and draws on the author’s experience as part of a team creating a large DH resource to consider the complications presented by these.
A century from now, a historian looking back at our field today could be forgiven for concluding that many historians in the 2010s were preoccupied with the promise of the digital humanities (DH). If our future historian used the same tools to study us that we historians of the human sciences use to study scholarly communities of the past, she might begin by examining the ‘official chatter’ of our field: articles about DH and historians’ uses of digital methods published in academic journals. She’d certainly find many, from pieces in the American Historical Review that claim to speak to the whole historical profession to articles aimed at individual subfields, like this very article. She could then find more material by studying the job advertisements on H-Net and jobs.ac.uk and in the Chronicle of Higher Education and the Times Higher – those that had been archived and made searchable – as well as the calls for articles and the conference programs that transit our scholarly networks each day. 1 Few historians in 2019 maintain extensive physical or even electronic correspondence files of the type that inform many of our own historical studies of the human sciences, so our future scholar will likely need to find another way to examine these more private forms of disciplinary discourse. On the other hand, if today’s Twitter feeds, academic blogs, and online discussion forums are preserved, she’d have access to an extensive array of materials documenting formal and informal discussions about DH. Our future historian could even draw on humorous ‘memes’ about DH. 2 What could better demonstrate that scholars in the 2010s were thinking about what DH meant for their work than a whole set of in-jokes about it?
Thinking about what these sources might suggest about the permeation of DH in our field is, however, a useful reminder that disciplinary discussion does not necessarily translate to disciplinary acceptance, use, or even awareness. The source material I’ve described above would suggest some presence for DH in history, and even in the history of the human sciences. But how accurately would this reflect the majority of everyday scholarly practice? My sense is that our field’s relationship with DH varies considerably: a few historians of the human sciences have embraced DH wholeheartedly, with some being early adopters; others have experimented or heard a talk or two about it; some are steadfastly skeptical; and many of us have probably filed the subject away mentally, for consideration someday when the latest article manuscript gets submitted and the pile of essay marking recedes a bit. We’re aware that DH is a ‘thing’ these days, and that some of our colleagues have found it intriguing and useful, but how seriously should we take the suggestion that DH will transform the way we research, write, communicate, and teach history, and more specifically, the history of the human sciences?
For some readers of this journal, DH may have first grabbed their attention in 2014, when the publication of Guldi and Armitage’s The History Manifesto spurred a debate about whether and how a new history centered on analyzing ‘big data’ would transform how we worked. (See Jacobs, 2016, for an excellent overview of this debate from the perspective of a historian of science.) Guldi and Armitage (2014) argued that their fellow historians needed to use computational tools such as those of DH to make best sense of the growing, perhaps overwhelming swarm of data increasingly available to us. This would enable historians to once again tell bigger, longue durée stories; in turn, the argument went, doing so would widen historians’ vision out from a myopic focus on the short-term while providing ‘real-world’ policy insights and perhaps even political inspiration. Their claims aroused a vocal variety of responses, most notably from Cohen and Mandler (2015), whose critique began by using those very computational tools to refute Guldi and Armitage’s claims that academic historians’ focus has narrowed in recent decades. Cohen and Mandler also strongly critiqued the notion that recent historians have become, as Guldi and Armitage (2014) claimed, more inward in their focus. Here they pointed to the expansion in recent decades of ‘public-facing’ scholarship in venues ranging from television shows to historian-policymaker networks (see also Hitchcock, 2013). They agreed that tools and approaches drawn from DH form an important new contribution to the historian’s toolbox, especially when approaching ‘big data’ that might supply new insights about the past and the present. But Cohen and Mandler also concluded that all of these require ‘tough-minded assessment’ rather than the kind of ‘unqualified encomium’ (2015: 541) Guldi and Armitage seemed to offer.
When I was asked to contribute to this symposium, I knew I would not be able to provide a ‘tough-minded assessment’ of the breadth and scope Cohen and Mandler called for, but I also knew from my own experience with DH that despite my enthusiasm, an ‘unqualified encomium’ was out of the question. I still count myself as a beginner when it comes to the uses of DH for history of science and medicine. My own research involving the human sciences has largely focused on how sociology, psychology and anthropology have (or have not) been employed to understand and change health behavior, and on the ways practitioners of and ideas from the social sciences have been integrated (or not) into biomedical understandings of health. Until fairly recently, most of my research has been conventional, relying largely on scrutiny and analysis of archival and published primary sources. But in 2015, I unexpectedly found myself involved in a large-scale DH project, specifically an exploration of how text mining could be used to facilitate historians’ and others’ use of historical medical texts. This experience was useful not only because it forced me to learn more about DH methods, but also, as I’ll discuss below, because it made me reflect more carefully about the tools and methods I rely on each day as a practicing historian.
What follows here, then, is a more pragmatic than programmatic discussion of DH. It is also a personal and idiosyncratic view, from someone who got involved with DH long after experienced practitioners and contributors had noticed a significant influx of newcomers (Nowviskie, 2010), long after it had been touted as ‘the next big thing’, and even long after that apparent ‘newness’ had been neatly deconstructed (Pannapacker, 2011). I begin by briefly describing some recent, current, and long-standing DH projects related or relevant to the history of the human sciences, as well as reviewing some of the tools and approaches available. I then discuss my own experience with a large-scale DH project and suggest how that experience might be instructive for others. My sense is that the tools and methods associated with DH are well worth our exploration: DH can help us historians do new and insightful things, and it can also enable us to do old things more efficiently. Of course, potential users need to be sure they are not simply asking the questions such exciting tools make it easy to ask, but rather using DH to ask the questions we want answers to in more creative or efficient ways. Nevertheless, the experience of grappling with the challenges posed by such new approaches to collecting, analyzing, and writing history has an added benefit, in reminding us to be careful, reflective, and transparent about our methodological choices. 3
What digital humanities offers
So what is ‘digital humanities’ anyway? Is it a disciplinary or subdisciplinary field (as job advertisements would imply)? Perhaps it is a methodological approach or orientation shared across the humanities, with differing iterations in different disciplines? Or is it a set of tools for doing humanities scholarship, or even simply an approach to scholarship? As historians of the human sciences, we know that when a new academic discipline, field, or subfield coalesces, one of the first things such a group does is engage in a veritable orgy of self-definition. Those leading the charge start giving talks and writing essays trying to set out what the field is, where it comes from, where it is going, who belongs and who does not. This has definitely been true for DH generally and for DH as practised by historians for the past decade, and many of those essays have been collected in useful readers such as Terras, Nyhan and Vanhoutte’s Defining Digital Humanities (2013) and the two editions of Debates in the Digital Humanities (Gold, 2012; Gold and Klein, 2016). Probably the broadest practical definition proposed is that DH means using digital tools and techniques to undertake humanities scholarship, whether that be research, teaching, or engagement. Another way of saying that is that DH follows on from what used to be known as ‘humanities computing’ (which itself has a long and rich history), but goes beyond that to include responsive, reflexive forms of research, teaching and scholarly communication enabled by social media and the web.
Let’s start with the last of these, which is also perhaps the most familiar to most of us: engagement and exchange. DH advocates generally maintain that ideally, at least, doing ‘digital humanities’ involves new modes of engagement for humanities scholars, not simply with texts and sources, but with each other. This extends even to social media and social networking sites, which (the argument goes) can encourage interaction and intellectual community building amongst scholars. Certainly if online engagement with each other counts as digital humanities, then lots of historians of human sciences have been ‘doing’ DH for the past decade without knowing it: many of us use social media to discuss our work, especially by posing and answering queries on Twitter or by live-tweeting conferences and talks, thus broadening scholarly conversations beyond the participants in a particular seminar room at a particular time. When Twitter initially gained popularity amongst academics in the 2010s, enthusiasts argued that social media venues like these, and the more informal (and thus perhaps more egalitarian and open) interactions they supposedly fostered, might even help change the culture of academia. This initial enthusiasm soon led to debates as to whether and how Twitter and other social media should be used in conferences and other professional scholarly settings. Social media’s appeal has been that it makes it possible to have rapid interactions with a large number of fellow scholars on a regular basis – but as enthusiasts and detractors alike have pointed out, participating in those discussions most effectively also requires the user to feel comfortable with social media and with expending time and energy on it. The lively conversations to be had on Twitter about academic work allow users to engage with new people, but have occasionally been perceived as exclusionary by those who are unfamiliar or uncomfortable with social media use.
As one blogger put it back when Twitter first gained traction at academic conferences, watching a live conference Twitterfeed can initially feel like watching the ‘cool-kids table’ in the cafeteria from the outside (Pannapacker, 2012). And even as live-tweeting from a talk allows those unable to attend to know what’s going on, those tweets reflect the understanding and reactions of the person doing the tweeting, which increases the opportunity for misinterpretation. What if a speaker wants to try out a less-well-formed thesis or a controversial interpretation? Is it fair to reproduce that beyond the seminar room’s walls, to distant users who won’t be able to catch subtle social cues from the speaker, and then have those impressions persist online indefinitely, available to anyone who searches for the speaker’s name? While participation in many social networking sites can be fenced-off in private, communally policed groups, Twitter is open to anyone who wants to search it. So much of the history of human sciences deals with sensitive and offensive material from the past, and the informal, telegraphic voice many Twitter users adopt is not well suited for subtlety or sensitivity. How effectively have our existing communal norms about scholarly interaction evolved to encompass social media use, and the accessibility of our discussions to wider, and perhaps unexpected, audiences?
Whether our norms have evolved or not, as the 2010s draw to a close, Twitter and other forms of social media are now firmly entrenched in academic practice. It’s perhaps unsurprising that the initial enthusiasm about the new kinds of interaction fostered by social media has now been tempered, as we consider how expectations about using it are incorporated into what is expected of us by our peers and by those who employ us. Anyone who takes time to participate in scholarly social media interaction knows that to do so is time-consuming. The history Twittersphere encompasses many people who generously respond to queries, offer suggestions and help maintain an active scholarly community and public face for our discipline online; they also actively consider how this new kind of interaction might make for more collaborative and open scholarship. Do their peers or their institutions recognize, respect, or reward that as legitimate academic or scholarly work? What about those who now find social media participation to be less an opportunity for stimulating conversation and increasingly an institutional expectation, another way our ‘productivity’ can be measured, assessed and judged by others? On the other hand, many of us have felt the excitement of making new intellectual connections around our work through social media, and it can genuinely broaden our conversations and take them in unexpected directions. Perhaps the real question here is how we can retain our enthusiasm for the ways new media technologies can facilitate scholarly collaboration and encounters, even as their use becomes more routine.
As important as engagement with other scholars and audiences is, however, when we talk about ‘digital humanities’ we are usually talking about the approaches that use computational tools and techniques to allow us to engage with texts, objects, data, sources and information in new ways. Some of these DH tools and resources are created by teams or institutions seeking to facilitate access to materials and information, in hopes that individual scholars, no matter their location or resources, could search for, interact with and/or analyze the large swaths of material held in digital repositories. Others are essentially ready-made analytical, visualization, or mapping applications that allow the user to add her own data set and begin to manipulate it, much as SPSS, Stata and similar applications facilitate statistical analysis of data. Some scholars in DH take their cue from the open-source programming movement, and combine open-source applications, borrowed coding shared by other users, and their own coding to perform bespoke analysis on their own source materials. 4 So, when we talk about DH projects, that includes everything from individual scholars using off-the-shelf and adapted tools to analyze a focused set of materials and objects, to large-scale interdisciplinary projects that create resources, platforms and tools for scholars and other researchers to use.
So what can we do? Before I discuss some of the DH resources, tools and projects I have found interesting, it’s important to point out that I will focus mostly on analytical tools that (like many history of human sciences scholars) revolve around text. But readers should keep in mind that DH can involve far more than the analysis of text and texts. Object-based learning, for instance, is an obvious arena where digital technologies can greatly enhance how we understand and encounter the past. For many of us researching and teaching in the history of the human sciences, our future encounters with digital tools may come through collaborations with colleagues in museum studies, where virtual reality, 3D printing and haptic technologies can enhance teaching and support cultural engagement with students and communities. Digital image processing and manipulation offers yet another potential application of DH, as does mapping (which I do briefly touch on below), which is one area of digital humanities that has gotten significant uptake from social, cultural, and medical historians, and has significant potential for use by historians of the human sciences.
But for most historians of human sciences, probably the most immediate application of DH comes in its ability to facilitate new approaches to the analysis of text and texts. This can include text encoding, where users tag and mark up texts electronically to facilitate further analysis, individually, collectively and/or automatically. This is a very popular tool for literary scholars and increasingly also for historians. Alternatively, it can mean writing one’s own code or using applications designed by others, in order to treat textual content as data to be analyzed or ‘mined’. Thanks to several decades of digitization, we now have the contents of masses of books, academic journals, newspapers, magazines, letters, diaries, broadsides, pamphlets, registers, transcripts and so on available to us. Thanks to sophisticated tools based on natural language processing, we also are able to make even very large collections of textual content, known as corpora, the subjects of our analysis, and thus identify patterns, fluctuations and relationships in texts (and thus the past) we may not have been able to see before. And the power and speed of our computational tools means we can not only have these new kinds of encounters with texts, but can do so faster and on a larger scale than ever before.
So how does this type of textual analysis work? 5 Many readers will be familiar with earlier efforts, from the mid 20th century onwards, to employ computational techniques to analyze text, which often determined the relative frequencies of words and patterns of word use. Literary and other scholars often use these frequencies and patterns to consider who authored particular disputed texts, by comparing them to texts of certain authorship (see for instance Juola, 2015). Today, several tools that rely on analyzing and representing relative word frequencies are available to even the casual user. The most obvious one is the ‘word cloud’ as produced by tools like Wordle (wordle.net): these clouds are easily generated representations of the most common words in a chosen text, where the size of those words in the representation illustrates the relative frequency.
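For readers who have never seen what sits behind a word cloud, the underlying calculation can be sketched in a few lines of Python. This is only a minimal illustration under simple assumptions: the function name relative_frequencies and the sample sentence are invented for the purpose, and the crude tokenizer ignores everything (stop words, hyphenation, spelling variants) that real tools have to handle.

```python
import re
from collections import Counter


def relative_frequencies(text):
    """Return each word's share of all word tokens in `text`."""
    # Lowercase the text and treat runs of letters (and apostrophes) as words.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens)
    return {word: n / total for word, n in counts.items()}


# Invented sample text, used only to show the calculation.
sample = "The mind studies the mind, and the study of mind is never finished."
freqs = relative_frequencies(sample)

# List the three most frequent words; a word cloud would scale type size
# by these same shares.
for word, share in sorted(freqs.items(), key=lambda kv: (-kv[1], kv[0]))[:3]:
    print(f"{word}: {share:.3f}")
```

Even this toy version makes a familiar problem visible: unless common function words such as ‘the’ are filtered out, they dominate the result.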
Even those unfamiliar with DH are likely to have encountered the Ngram Viewer associated with Google Books (books.google.com/ngrams). This easy-to-use tool allows the user to see the comparative frequency over time of words, terms and phrases as they appear in the gigantic corpus (‘lots of books’) of digitized books available to Google. Such tools allow for quick queries that many of us have used as a starting point for a lecture or conference presentation: the simplicity and apparent clarity of the curves produced makes for an eye-catching illustration and point of departure. Of course, Google Books Ngram Viewer users quickly realize the problems with using the results as a definitive statement. The Ngram Viewer tracks the frequencies of words and terms as they appear in an admittedly large, but far from complete selection of published books, but not necessarily other texts, printed or otherwise. Google Books itself depends on algorithms that are anything but transparent to users (Schmidt, 2018) and provides relative rather than absolute data, measuring frequency in what has been digitized rather than in what was originally published. Ngram users also quickly find that including variant spellings and usage complicates the picture, and even more so for scholars working with other languages; scholars looking over long periods of time then need to acknowledge that changing frequency of certain words and terms may have as much – or more – to do with linguistic change than cultural change (Koplenig, 2017). Once we get past those questions, even if we assume changing frequency is not simply evidence of changing underlying linguistic structures, we have to return to an older, fundamental question: what exactly does the relative frequency of a word or phrase in a large but necessarily incomplete set of texts tell us about the ideas, understandings and experiences of people in the past?
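The point about relative rather than absolute data can be made concrete with a small sketch. It is not a description of how Google computes its curves, which is not transparent to users; it simply assumes a hypothetical miniature corpus of (year, text) pairs and divides the number of occurrences of a term in each year by the total number of words available for that year.

```python
import re
from collections import defaultdict

# Hypothetical miniature corpus: (publication year, text) pairs.
corpus = [
    (1900, "psychology is the science of mental life"),
    (1900, "the new psychology relies on the laboratory"),
    (1950, "behavior not mind is the subject matter of psychology "
           "and of every science of behavior today"),
]


def yearly_relative_frequency(docs, term):
    """Share of all word tokens in each year that equal `term`."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for year, text in docs:
        tokens = re.findall(r"[a-z]+", text.lower())
        totals[year] += len(tokens)
        hits[year] += tokens.count(term)
    return {year: hits[year] / totals[year] for year in sorted(totals)}


print(yearly_relative_frequency(corpus, "psychology"))
```

Because the denominator is whatever happens to have been digitized for a given year, the resulting curve describes the collection as much as it describes the past.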
Frequency analysis is only one of a collection of approaches that together allow for more sophisticated attempts to automatically analyze texts, in hopes of seeing either previously invisible patterns, or being able to explore more texts than before. Others include cluster analysis and topic modeling, which can reveal particular groups of terms that occur together or are otherwise apparently related in one text or many. The idea here is to do a sort of machine-assisted content analysis, where the scope and speed of analysis enables the user to either see patterns she might not have noticed herself, or to see many more such patterns at once. Of course, to do this a user needs to think carefully about the question of what it means for words or phrases to be ‘related’ in a text. Do they simply co-occur in a text, in which case the size of the text examined matters quite a lot? Or do they occur within a certain proximity – within, say, 50 characters of each other, or in the same paragraph, or on the same page? Appearing on the same printed page may in some cases be an excellent proxy for the relatedness of words or phrases, but in others it could be completely accidental. (Think, for example, of the variety of different stories appearing on a single newspaper page, or of discussions appearing on the same journal’s letters to the editor page.)
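How much the definition of ‘related’ matters is easy to demonstrate. The sketch below is purely illustrative: the two functions and the invented newspaper-style page are assumptions made for the example, and the proximity window is counted in words rather than characters for simplicity. The same pair of words counts as co-occurring or not depending entirely on which rule is chosen.

```python
import re


def cooccur_within(text, a, b, window):
    """True if words `a` and `b` appear within `window` tokens of each other."""
    tokens = re.findall(r"[a-z]+", text.lower())
    pos_a = [i for i, t in enumerate(tokens) if t == a]
    pos_b = [i for i, t in enumerate(tokens) if t == b]
    return any(abs(i - j) <= window for i in pos_a for j in pos_b)


def cooccur_in_same_paragraph(text, a, b):
    """True if some blank-line-separated paragraph contains both words."""
    for paragraph in re.split(r"\n\s*\n", text):
        tokens = set(re.findall(r"[a-z]+", paragraph.lower()))
        if a in tokens and b in tokens:
            return True
    return False


# Invented "page" carrying two unrelated stories, one per paragraph.
page = (
    "Fever cases rose sharply in the eastern wards this quarter.\n\n"
    "The council also debated the cost of new street lighting."
)

narrow = cooccur_within(page, "fever", "lighting", window=5)
wide = cooccur_within(page, "fever", "lighting", window=50)
same_par = cooccur_in_same_paragraph(page, "fever", "lighting")
print(narrow, wide, same_par)
```

With a generous window, ‘fever’ and ‘lighting’ appear related simply because two unconnected stories share a page; with a narrow window or a same-paragraph rule they do not.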
Additional tools and approaches allow the user to enhance and modify this kind of analysis, to make it something more than an automatic parsing of texts. For instance, some tools will scan large amounts of textual data and compare them to existing digitized resources like dictionaries, lists of personal names, gazetteers, lists of institutions and organizations, place names, and so on. The user can then quickly find (or at least begin to find) all the place names mentioned in a text or corpus, or whether any of a standard list of named individuals or technical terms appear in those texts, or answer similar queries. Text-mining tools go beyond checking to see if certain words – or rather, certain strings of characters that form those words – appear, to find strings of characters with a form that suggests they belong to a certain class of term. 6 More simply put, these tools can not only search for a particular proper name, phone number, or email address, but identify strings of characters that look like proper names, phone numbers and email addresses. Advances in natural language processing have enabled the creation of increasingly sophisticated and precise tools that recognize patterns in syntax and structure, in a variety of languages. So far these have been more frequently adapted to allow scientists to do more ‘intelligent’ searching of masses of literature, but as I’ll discuss later, these could potentially be useful for scholars like us as well.
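The difference between looking a word up in a list and recognizing a string by its shape can be sketched as follows. Everything here is a simplifying assumption: the three-entry gazetteer, the sample sentence, and the two deliberately crude patterns are invented for illustration, and real text-mining tools rely on far richer linguistic models than regular expressions.

```python
import re

# Hypothetical mini-gazetteer; real projects use far larger reference lists.
GAZETTEER = {"london", "manchester", "edinburgh"}

# Deliberately simple pattern for strings shaped like email addresses.
EMAIL_SHAPE = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")
# Runs of two or more capitalized words: a crude stand-in for
# "looks like a proper name".
NAME_SHAPE = re.compile(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+)+\b")


def find_places(text):
    """List-based lookup: return words that appear in the gazetteer."""
    words = re.findall(r"[A-Za-z]+", text)
    return sorted({w for w in words if w.lower() in GAZETTEER})


text = ("Dr Mary Smith wrote from Manchester to the editor in London; "
        "replies to m.smith@example.org please.")

places = find_places(text)            # found because they are on the list
emails = EMAIL_SHAPE.findall(text)    # found because of their shape
names = NAME_SHAPE.findall(text)      # found because of their shape
print(places, emails, names)
```

The lookup can only ever find what is already on the list, while the shape-based patterns find things nobody listed in advance, at the cost of false positives and misses (a single-word surname, for instance, would escape the name pattern here).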
Another important aspect of DH is the creation of interfaces and systems that link or provide common homes for existing digitized resources. Search engines and visualization tools that bridge a large volume of diverse resources are especially valuable because they allow users to find material from multiple kinds or sets of data at once. This might sound simple, but because digital collections have been organized and made accessible in so many different ways, it’s actually an expensive and complicated task to interleave these and make them mutually usable. As we all know, a wide variety of primary material has been made available online in the last thirty-plus years, from full-text backfiles of academic journals and popular publications to maps, surveys and quantitative data sets. But while some of these share common interfaces for searching and can be used simultaneously, others do not. Here is where the work of groups like those behind two well-established DH projects, London Lives (Hitchcock et al., 2012) and Locating London’s Past (2011), becomes especially helpful. Sites like these offer a common search interface for data sets that the user would otherwise have to interrogate separately; London Lives, for instance, brings together under one search interface a tremendous number of resources, data sets, and digitized manuscripts allowing the user to explore everyday life and the institutions that governed it in 18th-century London. Meanwhile, Locating London’s Past provides the user with the ability to use some of these sources with past and current maps of the city, and thus explore and represent this source material geographically, spatially and visually. While the data the user can access through these projects is findable elsewhere, using their interfaces allows a scholar to both do an established task (searching) more quickly, and to see existing material in new ways, literally.
Indeed, tools allowing mapping and visualization are perhaps the ones where we can most easily see the potential of DH. With the availability of satellite data and the growth of GIS, it has become easier than ever to consider how people, organizations and events relate spatially. While this is obviously useful for those who want to understand past economic, crime, disease, or other patterns, it also enables us to remember how the ideas, practices and networks we study were located, literally, in the past. Consider the Mapping the Republic of Letters (2013) initiative, which encourages scholars to rethink the interactions and social and intellectual networks of early modern thinkers through interactive visualizations of their correspondence patterns. The site also hosts several exploratory case studies, allowing users to see how other scholars have used these visualizations and other tools to ask new questions about intellectual communities; users can also download many of the data sets from these projects for further individual exploration with their own choice of analytical tools. This is another obvious case where visualizing interactions geographically while also using other digital tools to explore the textual content of letters can add insight to analysis.
Visualizing relationships, networks, and associations connecting people, institutions, or even concepts is an especially productive way to use DH tools to discover or reveal patterns that may not have been visible or obvious before. An excellent example of this kind of approach, and one that speaks directly to historians of the human sciences, comes in the research led by York University’s Christopher D. Green (see for instance Green, Feinerer and Burman, 2015a, 2015b; and the broader reflections on this analysis in Green, 2016). In these and other explorations, Green and colleagues began by using the types of DH tools and techniques discussed above to analyze the textual content of articles in early psychology journals, and used this analysis to determine how similar in vocabulary articles were to each other. The group then used these measures of similarity and Gephi (a network analysis and visualization software package) to visualize these articles in a network, to reveal where articles – and thus disciplinary discussions – were clustered, and which topics tended to be outliers, relatively removed from the apparent centers of a nascent discipline’s interest. These visualizations help readers understand how the scholars, researchers and practitioners engaged in early psychology (at least those who managed to get published in key journals) envisioned the central and not-so-central topics of the field. What’s more, the visualizations gain further traction because Green and colleagues discuss them not as an end unto themselves, but in the existing scholarly context of what is already conventionally known about the journals in question, their editorial practices and the community as a whole.
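The general logic of moving from vocabulary similarity to a network can be sketched briefly. This is not a reconstruction of Green and colleagues’ procedure: the cosine similarity measure, the 0.3 threshold, and the three invented ‘articles’ are all assumptions chosen for illustration. The output is simply a list of weighted edges of the kind that a network package could then lay out visually.

```python
import math
import re
from collections import Counter
from itertools import combinations

# Hypothetical snippets standing in for full journal articles.
articles = {
    "A": "reaction time experiment attention reaction time",
    "B": "attention experiment reaction apparatus",
    "C": "child development school education child",
}


def vector(text):
    """Represent a text as a bag of word counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(u, v):
    """Cosine similarity between two word-count vectors (0 = no shared words)."""
    dot = sum(u[w] * v[w] for w in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


vectors = {name: vector(text) for name, text in articles.items()}

# Keep an edge only where similarity clears an arbitrary illustrative threshold.
THRESHOLD = 0.3
edges = []
for a, b in combinations(sorted(vectors), 2):
    similarity = cosine(vectors[a], vectors[b])
    if similarity >= THRESHOLD:
        edges.append((a, b, round(similarity, 3)))

print(edges)
```

Here articles A and B end up linked while C is left as an isolated outlier, but both the measure and the threshold are analytical choices, and different choices would draw a different map of the field.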
DH research like this, which combines computational analysis of text with new visual representation strategies, and lines it up against existing conventional scholarship, offers a good model of what can be done by historians of the human sciences who are able to invest in and experiment with the variety of tools available. But these articles also suggest some cautions to those of us interested in using DH tools and techniques to do such exploration. First, when compared to the more conventional historical investigations published in the same journal, this approach requires significant explanation of the methodology to readers who are most likely unfamiliar with it. Despite admirably clear description and discussion of the methods used by the group, their explanation may still be dauntingly unfamiliar to some readers and opaque to others. The explanation of method also takes up a good deal of space in the articles themselves. Does that mean that historical journals will need to reconsider traditional constraints, or will less explanation be needed as these tools become more familiar to a wider range of scholarly readers? Does such explanation belong in the article itself or in appendices and notes? This is of course a familiar question to historians using statistical, demographic and other forms of analysis. And other scholars in adjacent disciplines deal with this question all the time, including extensive discussions of method in their publications as a matter of course. 7 Perhaps we should take this as a reminder of the value of discussing our methods and assumptions in a structured way. Another consideration is that visualizing complex networks of ideas and relationships allows us to see their sources differently, but the very complexity of these networks means they can be very hard to negotiate and understand. 
In other words, as we all know from social studies of science, producing a visualization (or any other initial product of computationally assisted analysis) is difficult, complicated work, but then that visualization is not ‘real’ but a first step in analysis. Ideally, explaining a complex methodology will remind us as scholars to be especially clear about the analytical assumptions and choices we’ve made, but – again, as we know from thinking about scientific tools – some assumptions get ‘built’ into DH tools themselves.
Tools, methods and transparency
This brings me to my own and my colleagues’ experience with a large-scale DH project, which illustrates some potential considerations for the historian of human sciences. In 2014, some history of medicine colleagues and I joined up with colleagues in Manchester’s National Centre for Text Mining (NaCTeM) to explore how text mining could enhance historians’ use of digitized material. Specifically, we hoped to employ some of the tools that NaCTeM and others have used to facilitate life scientists’ exploration of published material. One of our central goals was to create a semantic search engine that would allow users to search large quantities of historical medical texts not just for specific terms, but for entire categories of terms. In other words, instead of searching a corpus (say, the entire century and a half run of the British Medical Journal) for a particular disease name, the user could find all the disease names, or therapeutic compounds, or other identifiable entities occurring in a subset of the corpus, allowing the user to explore the range of diseases mentioned in a text. We could also set up the search system so that it would find terms that were related historically, so, for example, searching for tuberculosis would also prompt the user to see documents containing terms like consumption, pulmonary tuberculosis, and other expressions used in the past for what we now tend to call tuberculosis. What’s more, the system we set out to create was able to recognize when a disease name was simply that, referring to a disease, as opposed to when it occurred as part of a larger expression; this would allow users to search enormous swaths of text for the term tuberculosis when it referred to the disease, without having the results cluttered up by organizational names (the National Tuberculosis Association) or instances where the term appeared as part of a larger term (tuberculosis nursing).
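To give a rough sense of what these two behaviours involve, here is a toy sketch. It is emphatically not how the NaCTeM system works: the variant list, the list of larger expressions, the four invented documents, and the simple masking trick are all hypothetical stand-ins for what in practice requires trained language-processing tools.

```python
import re

# Hypothetical mapping from a modern term to historical variants.
HISTORICAL_VARIANTS = {
    "tuberculosis": ["tuberculosis", "pulmonary tuberculosis", "consumption"],
}
# Hypothetical larger expressions in which the disease name does not
# refer to the disease itself.
LARGER_EXPRESSIONS = ["National Tuberculosis Association", "tuberculosis nursing"]


def search(documents, term):
    """Return ids of documents that mention `term`, or a variant, as a disease."""
    variants = HISTORICAL_VARIANTS.get(term, [term])
    pattern = re.compile(
        r"\b(?:" + "|".join(map(re.escape, variants)) + r")\b", re.IGNORECASE
    )
    hits = []
    for doc_id, text in documents.items():
        # Blank out the larger expressions so matches inside them do not count.
        masked = text
        for expr in LARGER_EXPRESSIONS:
            masked = re.sub(re.escape(expr), " ", masked, flags=re.IGNORECASE)
        if pattern.search(masked):
            hits.append(doc_id)
    return hits


# Invented example documents.
documents = {
    "d1": "Three deaths from consumption were recorded in the district.",
    "d2": "The National Tuberculosis Association met on Tuesday.",
    "d3": "Pulmonary tuberculosis remains common among mill workers.",
    "d4": "A course in tuberculosis nursing will begin in May.",
}
results = search(documents, "tuberculosis")
print(results)
```

The sketch returns the first and third documents and skips the other two, but it would also happily return a report on the consumption of coal or bread, which is precisely why a usable system has to take context into account rather than match strings.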
Finally, we hoped that our system would be able to recognize stretches of text where certain kinds of entities were related, so that users could examine the form and frequency of those relationships. A simple example: ideally, the system would learn to recognize and return instances where a term had a specific relationship with an entity. So a user could find every case in a set of texts where a virus [the specific search term] was said to have a causal relationship with a disease entity, meaning any term or expression recognizable as referring to a disease: the user would get a list of results containing such diverse but ultimately similar statements as ‘the cause of rabies is the rabies virus’, ‘an unknown virus appears to be responsible for polio’ and many more. Alternatively, one could search the full run of a journal to see the range of different things that were said to be related, causally or consequentially, to a specific disease entity over time. So, instead of carefully paging through years of journals, the user could quickly locate these expressions and relationships in a vast amount of text, and then set to work making sense of them. Exploring historical texts in all these ways seemed possible because our collaborators were very used to creating tools that explore contemporary biomedical literature and records in order to find statements and expressions that represent complex knowledge, such as statements about protein–protein interactions or gene-disease associations. And our collaborators were excited about working with us, as adapting their tools to work with historical biomedical literature would be a new challenge that allowed them to demonstrate their usefulness to new scholarly constituencies.
Over the next 15 months, we produced a semantic search system that worked with the full run of the British Medical Journal (BMJ) and with London’s Pulse, the Wellcome Library collection of the London Medical Officers of Health (MOH) reports. Together these two formed a corpus of medical, public health, and social investigative writing from the 1840s onward, and we hoped that after devising a proof of concept, we’d be able to include other large resources to expand our reach. As my colleagues and I have discussed elsewhere (Thompson et al., 2016; Toon, Timmermann and Worboys, 2016) this was a valuable experience, and both we and our text-mining colleagues gained much from working together and learning about each other’s work and approaches to analysis. We also learned quite a bit about the pitfalls of this kind of DH work, which gave us a more critical and realistic appreciation of what it can and cannot do for historians.
First, to work with large amounts of text as data, one has to process that text into a format that one’s tools can use. In our case, this meant making sure the BMJ and MOH files were in good shape and relatively free of scanning and optical character recognition (OCR) errors and thus usable versions of the printed texts they represented. The MOH reports, which had been recently processed with funding from the Wellcome Library, were (relatively speaking) in excellent shape: the newest versions of OCR proved to have been fairly good at capturing text, and the many tables and charts contained in the reports were separated from the main body of the text and processed (Wellcome Library, 2018). The BMJ files, however, had been produced over a longer period of time using older versions of OCR, and a quick look at the text files revealed that large segments of text had so many errors they were unrecognizable to our system. Our colleagues’ solution was to devise a specialized correction routine using historical medical dictionaries that could be applied to the BMJ text, and this was quite successful in improving the quality of the corpus we would be using.
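To give a sense of what dictionary-based correction involves, here is a minimal Python sketch. It is an illustration under stated assumptions, not the routine our colleagues built: the tiny `LEXICON` stands in for a historical medical dictionary, the function names and the similarity cutoff of 0.8 are hypothetical choices, and the matching relies on the standard library's `difflib.get_close_matches`.

```python
import difflib
import re

# Stand-in for a historical medical dictionary (hypothetical, far too small for real use).
LEXICON = ["tuberculosis", "consumption", "patient", "fever", "the", "with"]


def correct_token(token, lexicon=LEXICON, cutoff=0.8):
    """Replace a token with its closest lexicon entry, if one is similar enough."""
    lower = token.lower()
    if lower in lexicon:
        return token  # already a known word, leave it untouched
    matches = difflib.get_close_matches(lower, lexicon, n=1, cutoff=cutoff)
    # Note: the replacement is lower-case; a fuller routine would restore capitalization.
    return matches[0] if matches else token


def correct_text(text):
    """Apply token-level correction to every run of letters and digits."""
    return re.sub(r"[A-Za-z0-9]+", lambda m: correct_token(m.group()), text)


ocr_line = "the paticnt with tubercu1osis took a patent medicine"
corrected = correct_text(ocr_line)
print(corrected)
```

The sketch repairs 'paticnt' and 'tubercu1osis', but it also turns the perfectly good word 'patent' into 'patient' because the stand-in lexicon does not contain it. Automatic correction of this kind can therefore introduce new errors while removing old ones.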
The extent of the problems with the original corpus, however, reminded us of a fact all historians should acknowledge more frequently: those ‘full-text’ journal searches we’re doing with many historical texts, especially those that were digitized and processed with early versions of the technology, are not searching the same ‘full text’ we would see if we looked at the page ourselves. Rather, we are searching a version of the text that contains errors due to processing with earlier systems, or due to the poor condition of the original material. As Tim Hitchcock (2013) has pointed out, what we think are full-text searches in some cases are only searching three-quarters or even less of the full text, because the remainder is unrecognizable to the search engine. Users may not realize this, because many (but not all) of the search engines we may use search raw data files but then present us with PDF versions of the same text. But when those raw data files form the material on which one’s analysis is based, the analysis can be impeded. It’s useful to remember that, as Andrew Prescott (2018) explains about the Burney Newspaper Collection, when many materials were initially digitized, the goal at that time was to facilitate better access to a resource than existing microfilm copies provided. Today, however, many scholars have unconsciously come to expect digitized materials to have a high fidelity to the original text in line with current OCR performance. Commercial providers may have little incentive to go back and reprocess materials, such as 1920s medical journals, when usable copies exist and the market of potential users seems small; meanwhile, libraries and archives rarely have the funding to do this. This is a valuable reminder that the digitized material we often rely upon has its own issues, as it is usually the product of earlier interventions that attempted to satisfy goals different from our own.
Whether we choose to further intervene to refine such resources for our own use presents us with additional questions. On the one hand, applying correction tools and routines can help clean up the corpus, and thus improve the accuracy of the analysis; on the other hand, cleaning up a corpus through automatic correction routines still misses some errors and overcorrects others. In short, to get better data sets we need to correct that data, but correcting it means intervening in it, which, as good students of the history and social studies of science, we know introduces its own set of problems. At the very least, that intervention gets blackboxed and future users may be unaware of the degree of our clean-up and the errors it may have introduced.
Once we in the Mining the History of Medicine project had cleaned up our texts, we had to make fundamental decisions about the types of entities we wanted our system to recognize. We wanted to choose classes of entities that it would be technically possible for our tools to recognize with high accuracy, but also to choose ones that our historical colleagues using these materials and tools would want to be able to find automatically. We tried to think broadly and come up with categories that others would be interested in, but quickly became conscious that our interests weren’t necessarily congruent with those of other potential users (see Toon, Timmermann and Worboys, 2016 for a longer discussion of this). We ended up focusing on seven entity categories: Anatomical; Biological Entity; Condition; Environmental; Sign or Symptom; Subject; and Therapeutic or Investigational. But while organizing the system to automatically recognize these made it very useful to some historians, it made it much less valuable for others. A user could, for instance, easily find the whole range of therapeutic substances and interventions that were mentioned in association with tuberculosis, but might struggle to use it with more abstract and rhetorical subjects such as masculinity or culpability. We also tried to develop the system to be good at detecting causal statements or expressions where entities were associated. Here, though, we ran into another problem: 19th-century scientific writing differs quite a lot from early 20th-century scientific writing, and even more so from late 20th-century scientific writing. What this meant, practically speaking, was that our system was built on programming that was quite good at identifying straightforward statements about causation or association: x binds with y, this gene causes that disorder.
It struggled, though, with the more conditional, suggestive and downright elliptical language of relationships used by Victorian medics and others to talk about disease and ill-health. In the end, we were unable to make our system recognize relationships to the extent that we had hoped and intended, with the relatively short time available to us.
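The difficulty can be made concrete with a deliberately simple Python sketch. It is an assumption-laden illustration, not a description of how our collaborators' tools work: the two hand-written patterns in `CAUSAL_PATTERNS` and the function `extract_causal` are hypothetical, and the example sentences are the ones quoted in this article.

```python
import re

# Hypothetical patterns covering only direct, declarative statements of causation.
CAUSAL_PATTERNS = [
    re.compile(r"^the cause of (?P<effect>.+?) is (?P<cause>.+?)\.?$", re.IGNORECASE),
    re.compile(r"^(?P<cause>.+?) causes (?P<effect>.+?)\.?$", re.IGNORECASE),
]


def extract_causal(sentence):
    """Return a (cause, effect) pair if the sentence fits a known pattern, else None."""
    for pattern in CAUSAL_PATTERNS:
        match = pattern.search(sentence.strip())
        if match:
            return (match.group("cause").strip(), match.group("effect").strip())
    return None


direct_1 = extract_causal("The cause of rabies is the rabies virus.")
direct_2 = extract_causal("This gene causes that disorder.")
hedged = extract_causal("An unknown virus appears to be responsible for polio.")

print(direct_1)
print(direct_2)
print(hedged)
```

The two direct statements yield a cause and an effect, while the hedged statement yields nothing, even though a human reader sees at once that it makes a causal claim. Adding further patterns helps only up to a point when the language is conditional, suggestive or elliptical.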
Obviously, every project and every attempt to create a usable resource for others involves choices. What this project taught us, though, is just how daunting the act of making those choices could be, especially as we realized they would lessen our resource’s value in the eyes of many colleagues, all while requiring significant investment. Because developing the system required human intervention to ‘train’ it, my colleagues and I ended up making decisions about whether a term fell into an entity category for sample texts, while realizing that those decisions would affect how the system as a whole dealt with the much larger corpus. If I were writing an article involving conventional historical analysis where I’d had to make those hard decisions, I could discuss them and be fairly transparent in the text. But how would I warn future users about the decisions my colleagues and I had made, based on our own understandings of the history of medicine, that then were ‘built in’ to our large and expensive resource? The obvious answer is through ample and transparent documentation, and when we tutored potential users of the system and created materials for them, we tried to document those choices and their implications. But how would we do so once the system left our hands? As committed as we are to transparency, we’ve found that while some users will read a list of caveats and warnings, others want to dive in and use their new toy. Our proposed solution is to build in a good contextual help system, but again, doing so takes time and expertise.
These were only a few of the dilemmas that my colleagues and I faced, and there were a host of other issues we encountered that have now made us cautious when we use other digital tools and resources, or when we read DH scholarship. But I also think that you, as historians of the human sciences, can learn from our experience as you consider getting involved with DH projects, or using DH tools and approaches. I definitely encourage anyone to make the time to explore these new tools, use them to collaborate with others, experiment with them and try to understand them. Finding out how you can encounter your source materials in new ways is exciting, and there are excellent and accessible resources to allow you to do so, such as The Programming Historian site (programminghistorian.org). On the other hand, as exciting as it is to come up with new ways to explore your sources with DH tools, it’s easy to lean more and more on those sources that are easily amenable to such tools, while avoiding other sources that are not so easily digitized, collected, searched, or analyzed. It can also be easy to become focused on the tools themselves and what they can do, to become wrapped up in answering queries that a system or set of resources facilitates rather than the question you are interested in. This is especially true when using that system or set of resources requires significant investments of time and effort.
Having now had the experience of helping to create a DH tool, I also worry that because the technical make-up of these tools can be opaque to users, users can lose sight of the shortcuts and decisions that are built into those tools. But understanding the nuts and bolts of mapping software, text mining and the many other resources available to those interested in DH can be quite daunting. Furthermore, the tools themselves are often designed for very different contexts and purposes than ours: a system that parses and classifies statements about protein interactions written in 2010 will require quite a lot of modification if we want it to find patterns in Victorian discussions of mind.
Finally, if we are to use these tools well, we need to remember that, even as we use sophisticated computing to delve deep into the texts that reflect the past, those large corpora are also physical objects produced by communities that organized knowledge in ways that suited their own purposes, ways that are not always compatible with our purposes. What do we lose when we encounter the journals or books our historical subjects sweated to produce simply as collections of characters, words, entities, relationships? What do we miss when we use DH tools to help us see more, but see it from a distance while possibly losing sight of the discussions in between? DH tools and approaches can, if we are not careful, make us further removed from the physical objects that constitute texts, or from previous understandings of space that aren’t easily represented with our mapping software, or from data that we don’t remember is the product of many forms of human intervention. It may be trite to invoke the old tale of the sorcerer’s apprentice, but unless we really understand our tools, and what they cannot and should not be expected to do for us, we won’t appreciate what they do well.
Footnotes
Declaration of conflicting interests
The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author received no financial support for the research, authorship, and/or publication of this article.
