Data struggles: The life and times of a database in Historical Climatology

Abstract

Open access to research data has become an issue in many contemporary sciences. One of them is Historical Climatology, a discipline drawing on archival materials to study the climate’s past. Based on fieldwork, the article explores the construction of a shared database by a group of historical climatologists and describes the strategies and hopes built into that infrastructure. I examine how the possession and provision of data relate to issues of recognition and legitimacy, thereby turning database construction into a practice of social import. Further, I argue that taking into account the diversity of research materials from which climate data is constituted – historical documents, tree-rings, ice-cores, etc. – is crucial for apprehending both the status of distinct types of data and the status of distinct research groups in the scientific field under investigation here.

Keywords

data sharing, ethnography of databases, Historical Climatology, open access, Science & Technology Studies

Introduction

Taking a stroll through digital datasets might be among the more popular ordinary practices in the sciences today. With networked computers in virtually every office and proliferating open-access platforms, zillions of heterogeneous datasets that have hitherto been kept from view or exchanged on bilateral agreements are now out there in the ‘dataverse’ (Bowker, 2013: 167), ready to be consulted with a few clicks. Although it is presumably only a fraction of the research data generated that is available beyond the contexts of its production, it seems that the data has entered a new regime of circulation, to the point of ubiquity.

To be sure, open access is unlikely to dissolve epistemological or technological disparities in acquiring and using data for research − just think of marginal laboratories striving ‘to access online resources, appropriate bandwidth, adequate expertise, and computers powerful enough to analyze data found online’ (Leonelli, 2013: 9). But for a variety of reasons, turning data into shared data remains a principle embraced and at times imposed by funding agencies (e.g. European Research Council, 2017) and journals (e.g. Nature, 2016). There is a rush towards making data widely available that does not stop at traditional boundaries between the disciplines, such that endeavours as disparate as worm science and art history suddenly find themselves on the same track, trying to catch up with demands for ‘greater returns from the public investment in research’ (Arzberger et al., 2004: 1777) and ‘[n]ew solutions to scientifically and societally relevant challenges’ which allegedly ‘require that we bring all relevant data from the past, present and future to the table’ (Michener, 2015: 42).

Clearly, none of these data dreams could ever materialize without an earthly infrastructure that allows data to be stored, sorted, named, fused with metadata, rearranged and accessed from the remotest desks around the globe. Despite its apparent abundance, the data needs palpable, local habitats to exist and to be used, and so it is the database that provides the material-organizational underbelly for all endeavours of sharing data, big or small. Disproportionate to its importance, though, the work of those scientists and technicians who invest their talents into the construction of a database figures less prominently in discussions of the Age of Data. Less glitzy than other data-related practices in the sciences, database construction is oftentimes ignored, underestimated or neglected, a fate it shares with infrastructural work in general (Star & Ruhleder, 1996: 113). In the ‘complex matrix’ defining ‘the relation between invisible and visible work’ (Star & Strauss, 1999: 24), one aspect of database construction is particularly prone to invisibility: the rationales, thoughts and reflections of the actors that accompany their construction work. I will turn to this aspect in the article by examining a case from one of the branches of today’s climate science: Historical Climatology, the study of the climate’s past and of climate–society interactions on the basis of archival materials, or ‘documentary data’, as they are commonly called.¹ More specifically, I am concerned with the hopes and strategies behind a database built by historical climatologists, IT workers, and librarians at the University of Freiburg (Germany). Drawing on ethnographic work, I will give an account of their endeavours, still ongoing, to construct a data sharing infrastructure where before there was none.

The case presented here slightly differs from existent database studies. First, the database under investigation is a small-scale, locally operated infrastructure, compared to large-scale, interconnected, or even globalized research infrastructures (e.g. Edwards, 2010). Second, while other studies have put an emphasis on the daily work of operating and maintaining databases (Pontille, 2010; Ribes & Finholt, 2009), on the diverse actors partaking in these activities (Hine, 2006; Millerand, 2012; Nadim, 2016), or on the use and re-use of the data itself (Leonelli, 2016; Zimmerman, 2008), I inquire into the entanglements of the database with the field of its emergence. What do the scientists in Freiburg hope to achieve within Historical Climatology (and within climate science more broadly) by making their data accessible to other researchers? And how do they engage in field-specific struggles by means of their database?

That the products of science − be they theories, problems, findings or infrastructures − cannot be separated analytically from the contexts of their development and use is a truism in Science & Technology Studies, though to what extent and in which ways this holds true in the case at hand has to be specified first. To do so, we have to engage in an exercise of ‘contexting’ (Asdal & Moser, 2012: 293), by weaving together the variegated circumstances for data to exist and subsist in Historical Climatology, a field that has not received any special attention from observers of science so far.

Some preliminary observations

When I arrived among the historical climatologists at the Department of Physical Geography in Freiburg for the first time, I was expecting to see them interpret the weather diaries of medieval monks, calculate mean temperatures, and produce colourful maps and graphs that show the evolution of the climate over the past millennium. And so it came as a surprise that many of the scientists I talked to were occupied with the construction of a database going by the mysterious name of Tambora. At that time in February 2014, the head of the department, Rüdiger Glaser, one of the major actors in the field and well-known in Europe and abroad for his reconstructions of climatic changes based on documentary data, had just received a renewed grant from the German Science Foundation for this database, precursor versions of which had been developed since 1997. Tambora, he explained to me, was conceived as a platform for drawing together and making publicly available the mass of documentary data that he and his collaborators had collected over the years from various archives and libraries; data that has been used to investigate, for instance, extreme events like droughts and floods or warming and cooling patterns in Central Europe during the past c. 1200 years, notably for medieval and early modern times when standardized instrumental measurements did not exist.

It is customary for scientists in the field to colloquially refer to archival sources as their ‘raw materials’: weather diaries, harvest records, town chronicles, annals, journals and gazettes, administrative records, logbooks from ships, notes of amateur observers of the skies, and other historical paper-based sources are among Historical Climatology’s research materials. The scientists have a very specific interest when they evaluate these materials and search them for any fragments that are, directly or indirectly, related to weather and climate, such as descriptions of the length of the summer or the mercilessness of the winter, of the blossoming of trees and the freezing of lakes and harbours, observations of rain, cloud cover, heat and drought, but also reports on social phenomena such as famines (Krämer, 2015) or governmental measures taken in the face of inundations (Himmelsbach et al., 2015). Some historical accounts have been systematically produced over many years, like the aforementioned monastic weather diaries, kept for generations in a uniform manner, or wine-growers’ records containing grape harvest dates, an indicator for temperatures during the growth period. Taken together, hundreds of thousands of fragments extracted from all these materials are used by the scientists as documentary data to reconstruct the climate’s past. Simply put, the data is comparatively interpreted, analysed against the backdrop of historical-climatological theory and compiled into longer series, often spanning many centuries, by the use of statistical tools (Glaser, 1996; Riemann, 2012).

Both the documentary data (in textual form) and the intermediate results of its analysis (in numerical form) are accessible through the Freiburg group’s database, and each entry is wrapped up in geographic and bibliographic metadata. Much of the data stored in Tambora can be seen as remnants of research already completed. In a sense, the scientists in Freiburg reassembled an archive after its repletion. But why do so? From the actors’ perspective, the data is too precious to be discarded; although already used, it remains ‘a valuable treasure’ (Riemann et al., 2015: 63). Not only does Tambora document the immense amount of collection work by the Freiburg group, hence allowing the scientists to garner recognition; it also allows for the data to be reused by others within Historical Climatology and beyond for novel research questions or for trying to retrace or even replicate some of the results synthesized in Klimageschichte Mitteleuropas (Glaser, 2013), a best-selling book in its genre.

For the ethnographer of science, there is − at first sight − nothing unusual about Tambora. Its emergence on the scene of Historical Climatology is perfectly in line with current trends towards databasing and open access initiatives in the sciences. In this day and age, new databases designed for data sharing are created on a regular basis in various natural scientific fields, from Arctic Science, botany and ecology to genomics and oceanography, manifesting what Bowker called ‘a new kind of science in which the database is the end product’, and not the scientific paper in which the data used to be ‘enshrined’ (Bowker, 2000: 643).² For the Freiburg group, constructing their database is indeed a research practice in its own right, on a par with more classical forms of doing climate science, not least because it helps keep the department alive by securing continued external funding. Thus, the decision of the group to open its local archive could instinctively be depicted as the natural outcome of shifting priorities within both the sciences and associated funding organizations, a shift that is accompanied by accrued technological means to share data beyond local contexts (Borgman, 2007).

Or it could be depicted as evidence of a spirit of sharing, a spirit that might be seen as part of the character of ‘the honourable scientist’ who readily and generously gives away what he/she has so meticulously collected. In this view, sharing data as a common good is a case of applied altruism, a nod to ‘communalism’ (Merton, 1973[1942]) in science. Oftentimes, this ethos is connected to a rhetoric of transparency and accountability with respect to the evidence upon which scientific findings rest, as for example in a report by the Royal Society, titled Science as an open enterprise, where science is universally depicted as a ‘self-correcting process since the first scientific journals were established’ (Royal Society, 2012: 13). Openness − putting the data onto the table (or into an open-access database) for everyone to scrutinize and use − contributes to the general advancement of knowledge, its proponents claim, and helps uphold the norm of ‘organized skepticism’ (Merton, 1973[1942]). There is, however, a difference between openness as an ideal state of affairs and openness ‘as a process and practice’, Levin and Leonelli (2017: 296) point out: in ordinary scientific practice, openness ‘comes in degrees and varieties’ (2017: 295) and is ‘enacted’ (2017: 283) by researchers in multifarious ways – or not enacted at all, the result being ‘empty archives’ (Nelson, 2009: 160).

On the one hand, then, the ethnographer trying to understand Tambora’s existence is in danger of having recourse to an idealizing presumption about science as an inherently altruist and transparent endeavour, a presumption that can be found in many manifestos on why sharing data is beneficial to science as an institution, and usually serving the purpose of erecting a ‘moral economy of data exchange’ (Strasser, 2012: 87). On the other hand, there is the somewhat deterministic presumption that everything scientific − be it an idea, a way of doing research or a database − can be traced back to a zeitgeist which is, in our case, infused by promises of an upcoming Age of Data. I maintain that these presumptions lack concreteness, for they do not draw our attention to the field-specific circumstances of Tambora’s existence. Most importantly, these presumptions leave unquestioned the status of data sharing as normal conduct within the sciences. Yet whenever a practice is considered as normal, maybe even as an indisputable moral necessity, it is time for the ethnographer of science to be intrigued and turn the normal into a staggering phenomenon again.

To do so, I will chiefly rely on in-depth interviews with the actors involved in constructing Tambora and on field notes taken at the Department in Freiburg. Part of the interviews were informal, part of them semi-structured, with the latter recorded on tape.³ Thanks to cyclical visits, I was able to observe successive versions of tambora.org materialize and go online. I complemented my local insights by attending conferences in the field and conducting interviews with scientists from other research groups. It remains beyond reach, though, to do justice to the complexity and plenitude of the thoughts and practices associated with this database, which is, for the actors, not merely a technical object, but part of their very existence as historical climatologists.

The stakes of making data accessible

The Science Studies literature provides us with quite a few observations suggesting that sharing data is not self-evident at all. A prominent reason that can make scientists shy away from sharing is the fear that their data might be misinterpreted, misapplied or otherwise maltreated (Baker & Millerand, 2010: 116, 128; Levin & Leonelli, 2017: 298). As Millerand (2011: 231f.) infers from her ethnographic study on the Polar Data Catalogue, a repository used by a mixed natural and social science collective for the exchange of data pertaining to climatic changes in the Arctic, reluctance to feed the collective repository with recent data is not uncommon: at stake is the shifting status of the data from proprietary to (semi-) public good. For instance, one of Millerand’s interlocutors, a geographer, judges it unjust to be obliged to contribute data before getting the juice out of it, given the amount of work required to gather it, which is why he suggests a timespan of five years prior to sharing the data (2011: 223).

In the case of data, research organisms and software in the biological sciences, attitudes towards sharing these items are intertwined with dissimilar ‘goals, preferences, constraints and institutional settings’ (Levin & Leonelli, 2017: 282) and come with varying explanations; ‘choosing never to disclose data to the wider community, even after publication’ (2017: 289) is at the more conservative end of these attitudes. Analysing a database in mouse genomics, a field of bigger data, Hine (2006) points particularly to the concerns of PhD students who, not having arrived at their results yet, were ‘worried that contributing data would mean losing control over the data, leaving it open for someone else to use or delete accidentally’ (2006: 284). For them, the shared database signified ‘a potential threat to personal ambitions for gaining recognition’ (2006: 284).

These ideas are not foreign to historical climatologists, whether they are at the beginning or in the middle of their careers. In Historical Climatology, whole datasets are not automatically shared, even after results have been obtained from them and published. Although it is customary to give away (and receive) smaller bites of data bilaterally − for example, some passages of a particular source material − there is a careful selection of what is shared and with whom, according to my interviews. One of my interlocutors remarks, ‘Well, it depends on the agreements. There are different kinds of agreements. Potentially, you give everything. Or nothing.’ For instance, in the case of a short-term collaboration limited to one specific journal article, there can be an exchange of data for co-authorship, while main author(s) and co-author(s) might switch roles next time. After an article is completed, the data will stay at the disposal of its producer and ought not be used by the main author for an unrelated article again, the historical climatologist explains and adds: ‘You cannot be a lead author only. … It’s a good business actually to be a co-author because, you know, of course it depends on the paper, but usually the main author has to work the most. … Plus, to be in a large-scale work means at home that you are integrated into the international level.’

But there are downsides, too: if the agreement for sharing data is only informal or tacit, giving away data always involves the risk of not being adequately recognized for a contribution. The situation is different in the case of a grant-related collaboration that is regulated by a formal contract; here, the data is likely to be a resource collectively used by a group of researchers from the very beginning. Depending on the grant, it can possibly be mandatory to publish the data to some degree and in some form or another, albeit only after the research has come to a (provisional) end. These are but a few fragments of what Baker & Millerand (2010: 128) lucidly call a ‘taxonomy of data-sharing behaviour’. Generally speaking, the data is not freely circulating within the confines of Historical Climatology, but exchanged under specific, sometimes implicit, conditions only.

As to infrastructures in Historical Climatology, there has been a lack of shared databases so far. In contrast to other branches of climate science with well-known, collectively fed databases for the storage and exchange of climate data (like those hosted by the National Oceanic and Atmospheric Administration in the USA), a standard database that could have been widely used for sharing across research groups did not exist in Historical Climatology’s recent past. Tambora, designed to be a collective repository open to all research groups who wish to contribute datasets and officially labelled as a ‘climate and environmental history collaborative research environment’, is a novelty in this respect. Other databases in the field serve different purposes, such as oldweather.org, a game-like citizen science platform where individuals test their talents in transcribing historical ship logs that contain unique meteorological information on wind, precipitation, or extreme events over the oceans. Euro-Climhist (www.echdb.unibe.ch) is another example; analogous to Tambora, Euro-Climhist presents a wide array of documentary data, but does not provide tools for data contribution or data management. It is a display for data originating from the research group led by Christian Pfister, a professor of history at the University of Berne until his retirement, whose publications on climate history and its societal consequences in Switzerland became canonical (e.g. Pfister, 1985).

In providing a shared repository for documentary data to Historical Climatology, Tambora subverts the habit of exchanging data bilaterally, ‘de la main à la main’ (Heaton & Millerand, 2013: 903), which is still the prominent mode of exchange in the field. By enabling research groups and individuals contributing data to Tambora to maintain control of that data − contributors remain the sole owners of their datasets and can decide which parts will be consultable by others − Tambora’s developers hope to change the field-specific habit of keeping datasets to oneself instead of contributing to a centrally administered repository. This habit can be comprehended if one takes into account how much labour is required to constitute and assemble documentary data even for a smaller historical-climatological reconstruction of, say, precipitation patterns in the Northeastern Swiss Alps from 1450 to 1500 AD. Going to libraries and archives, reading, transcribing (and often translating), evaluating, sorting out, extracting and generally thinking about a large number of written materials is a time-consuming endeavour. Far from being a technical task, this form of data gathering is the primordial step of the research process. Contrary to half-automatized forms of data gathering (as in the case of continual instrumental measurements of precipitation and temperature), an immense personal involvement is required for this ‘bitter labour’ in Historical Climatology, as one scientist framed it. This detail helps explain why datasets are colloquially named after individual scientists who presumably took the largest share of work in constituting them: ‘Van-Engelen-Data’, ‘Glaser-Data’, or ‘Camuffo-Data’ are shorthands for referring to the datasets kept by distinct research groups, each specialized in a particular region in Europe (here, the Netherlands, Germany and Italy).

When recapitulating three decades of collecting climatological data, Glaser evokes the never-ending and sometimes futile character of his adventures in the archives. At the beginning, during the late 1980s, ‘the climate’ as an epistemic object had not yet been as commonplace beyond science as it is today, and so diplomacy was needed to convince archivists of his interest in the first place. Since historical-climatological analyses require a large stock of archival sources from different periods and places, data gathering began with touring all over Germany. Glaser depicts the mundane conditions of data gathering in one of our very first conversations:

It’s been an infinitely laborious task. That’s what sets Tambora apart, in fact. …

Every summer, for years, I was as pale as a submarine captain because I spent most of my time in the archives. … It was like stockpiling, like gathering, like a passionate drive, basically, to collect the data. This passion for collecting can run wild there. Maybe some sort of primal instinct. … With my team, I visited almost every archive. We searched systematically and thoroughly, approached all major archives as well as small ones, received lots of hints. I’ve been on the road for many years.

The stock of data accumulated over time is considered to be part of a lifetime achievement in Historical Climatology. Can it thus be an easy decision to enable other researchers to make use of precious goods so meticulously collected, not knowing how (well) they will be treated? Glaser is at the height of his scientific career, with major papers in reputed journals (e.g. Glaser & Riemann, 2009; Glaser et al., 2010), a long list of completed research projects and a tenured position at a renowned university; and still, it is not without risk to put whole datasets on a publicly accessible platform. What was usually kept on inaccessible hard disks or locked in office cabinets and exclusively shared with hand-picked colleagues, can now be consulted far beyond Freiburg, by almost everyone interested in doing so (after opening a free account on Tambora’s website). This gives the data a hitherto uncharted level of exposure well beyond the inner circle of Historical Climatology: one of the most fragile elements of the research process is now put on display, which enables critics to point at possible limitations or flaws in the data. These potential critics might be fellow scientists, or they might come from groups with a general interest in climate science, a field under close scrutiny. At this point, open access connects with public debates: Ramírez-i-Ollé (2015) describes how, in the case of relentless attacks by sceptics from outside science on climate scientists’ trustworthiness, making research data accessible can be a means for rhetorically demarcating the social boundaries of science by refuting accusations of ‘being opaque’ (2015: 398), while at the same time trying to preserve the autonomy of climate science from ‘external corrective interventions’ (2015: 402).

When interviewing Tambora’s constructors in more detail on their thoughts about their database, I successively found that there is an unexpected wealth of strategic rationales that justified taking the risk of laying data into unknown hands. Rarely have these rationales been wrapped up in concerns about the public position of climate science in general; rather, I was led deeper into the scientific sub-fields where Tambora is supposed to make a difference. First, there is the collective of historical climatologists: PhD students, postdocs, and pre-eminent scholars with their distinct geographical regions of expertise and unique sets of documentary data. These scientists are the most likely users of Tambora, and they are the ones most likely to give the Freiburg group credit for its efforts in operating and maintaining a sophisticated database for everyone to use. They have to be convinced to contribute if the database is to become an ‘obligatory passage point’ (Callon, 1984: 196) for sharing data, in a field that has not found its digital destiny yet.

Second, Tambora, and the data stored in it, is likely to get the attention of an extended collective of scientists working on a wide range of topics concerning the history of the climate. Doing research with documentary data generated from archival materials is only one of many climatological specialities, practised by a relatively small collective of researchers. Beyond the confines of that collective, there is a vast array of heterogeneous approaches to the climate’s past, each of them depending on one specific research material: ice cores, marine and lake sediments, loess, speleothems, pollen, corals and tree rings are among the most prominent (Bradley, 2015). Every material requires its own form of analysis, like chemical analysis of CO₂ in ice cores or microscopic measurements of tree-ring structures, which gives rise to heterodox findings on multiple timescales. Basically, the field of climate history is fragmented into distinguishable sub-fields, each one of them devoted to a specific research material: the tree-ring people, the ice-core people, the paper people, and so on.⁴ To be sure, they communicate with each other, collaborate through joint conferences, projects and papers, and they engage in arguments about methods, findings and research materials, the outcomes of which are likely to have some consequences on the distribution of status as well as on the distribution of resources between the different sub-fields. Reacting to this state of affairs, the Freiburg group’s strategy is to use Tambora as a tool for enhancing the paper people’s visibility and for defending the validity of its historical sources, a contested research material, in constant need of justification – or so I have been apprised time and again by my interlocutors in the field.

I will chronicle a quarrel between the tree-ring and the paper people later; it is already conceivable from my aperçu that the existence of Tambora is imbued with issues of legitimacy and recognition. With respect to these issues, one magisterial text can hardly be ignored, although that has largely been its fate in recent Science Studies: Pierre Bourdieu’s The Specificity of the Scientific Field and the Social Conditions for the Progress of Reason (1975).⁵ In this provocative essay, science is depicted as an incessant and vigorous ‘struggle for scientific stakes’ (1975: 21), a struggle ‘in which every agent must engage in order to force recognition of the value of his products and his own authority as a legitimate producer’ (1975: 23). Since it comes from competitors only, recognition is hard to gain, as is legitimacy, hinging on ‘the power to impose the definition of science (i.e. the delimitation of the field of the problems, methods and theories that may be regarded as scientific)’ (1975: 23). There is no romanticism in Bourdieu’s targeting the social realities of scientific practice, which puts his ‘conflictual approach’ (Gingras, 2013: 76) in sharp contrast to characterizations of science as a disinterested quest for pristine findings, an example being Rheinberger (1997), who, in reconstructing the details of experimental practice, clings to ingenious individual scientists as if they were living inside their test tubes, and largely abstracts from the contested fields in which scientific practice takes place.

In a detailed critique, Bourdieu’s fragmentary sociology of science has been reassembled by Sismondo (2011), who advances some neglected facets that might deserve consideration, in spite of Bourdieu’s exclusive focus on theoretical scientific work. More specifically, Sismondo proposes taking up the notion of capital as a tool for investigating ‘the production, distribution and consumption of knowledge’ within distinct ‘political economies of knowledge’ (2011: 95). This triplet not only pertains to knowledge, I think, but also to data which may see itself transformed into capital under specific conditions of exchange, as in the case of Tambora. Tambora’s data exceeds its original purport as a scientific commodity used for reconstructing the history of the climate. By means of the database, an infrastructure of distribution, the data is made to travel beyond the micro-contexts of its production. Thereby, it takes on new roles and is integrated into new valuation regimes at the moment of being consulted at the desks of scientists who judge its quality, interpret it as evidence for or against findings published by the Freiburg group, or simply assess the group’s wealth in terms of data.

In Bourdieusian terms, the decision to construct a database could be likened to an investment, intended to yield symbolic capital − and, by its conversion, social and economic capital as well (Bourdieu, 1986).⁶ In the rather technical language typical of his œuvre, Bourdieu introduces the following abstraction:

Every scientific ‘choice’ − the choice of the area of research, the choice of methods, the choice of the place of publication, the choice … between rapid publication of partly checked results and later publication of fully checked results − is in one respect − the least avowed, and naturally the least avowable − a political investment strategy, directed, objectively at least, towards maximisation of strictly scientific profit, i.e. of potential recognition by the agent’s competitor-peers. (Bourdieu, 1975: 22f.)

This position, akin to ‘a priori reasoning’ (Sismondo, 2011: 81), seems too strong, but we need not take it at face value or import Bourdieu’s social theory wholesale to turn his essay into a tool for sensing the ‘struggles and strategies’ (Bourdieu, 1975: 19) that are pivotal to the ‘choice’ of sharing data in Historical Climatology. Astonishingly enough, Bourdieu arrived at the descriptions of science articulated in his essay almost with a sleight of hand, largely sparing himself empirical efforts: although he presents a few and very brief examples (primarily from physics and sociology itself), he sticks to generalized considerations. If these are to be of heuristic value, we are supposed to complicate them. What kind of ‘investment strategy’ are we talking about in the case of Tambora? How does it translate into practice? In which ways is this strategy contingent upon the Freiburg group’s position in the field? Where does this position derive from? What are the ‘rules of the game’ (Kim, 2009: 62) regulating the distribution of recognition? And how exactly is recognition granted? Although I will refrain from burdening the remainder of my article with too much Bourdieusian terminology, taking these questions to the field is worth trying.

Materiologies

To understand the strategic side of Tambora is to understand the field of its emergence first, or so it ensues from the vantage point elaborated above. As briefly outlined above, research into the climate’s history is diverse: each of the different sub-fields is contributing its share to the composition of a climatological mosaic, though some of the pieces of that mosaic might not fit together, and so questions concerning the most reliable version of the climate’s past arise. These questions are closely tied to the status of the research materials used. For Bourdieu, who was not specifically interested in non-humans (Sismondo, 2011: 91–93) and thought about science ‘without reference to scientists’ interactions with the material world’ (2011: 93), status is a property of human beings only; in our case, however, it is the status of research materials which is most important and which, in turn, bears upon the status of the findings derived from these materials and upon the status of the researchers working with them.

But how to determine the status of historical documents within climate science? There is, of course, no single indicator allowing it to be pinned down in any definite way. The amount of funding for each of the competing sub-fields, the number of chairs attributed to them, the number of citations in high-profile journals, the percentage of rejected papers based on this or that research material, and so forth might give some tentative hints on how the field is structured, notwithstanding the difficulty of determining these scientometric indicators. But ‘status’ is, I hold, a much more qualitative, relational category: it is attributed, rather than being an inherent quality of any given material, and it reveals itself to the ethnographer of science in nuances, depending on whose perspective is taken into account. A scientist socialized in the humanities is thus likely to value written sources more emphatically than someone who has trained exclusively in atmospheric physics and may never have had the chance to confront medieval source materials, one might presume (exceptions apply). Likewise, a scientist primarily concerned with reconstructing concentrations of CO₂ in algae − a case examined by Schinkel (2016) − conceives of his/her material differently from his/her colleague from the ice-core department. The ice-core scientist might be more critical towards algae; at the same time, he/she cannot draw on the same familiarity with this material as those scientists who are confronted with it day by day and have gained more intimate insights into its strengths and limitations.

When asking my interlocutors to sketch their personal hierarchy of research materials, the most frequent answer I got was that this question must not be separated from the specific purposes for which the material is employed and the research questions asked. Nevertheless, I have on many occasions encountered a colloquial dichotomy between hard and soft materials when it comes to questions of reliability. Materials taken from ‘nature’s archives’ are on one side of this dichotomy, with tree-rings, stalagmites, or ice-cores being imagined to firmly conserve climate’s deep history. On the other side of the dichotomy are all sorts of historical documents, a material of man-made origin, epitomizing Culture. Instrumental data of temperature and precipitation, albeit collected by man-made instruments, somehow fall between the two sides of this divide, maybe precisely because there seems to be no concrete source material from which they are derived. My point here is that there are sets of assumptions, ideals, metaphors and values employed when scientists think about their research materials (and judge the materials of their competitors). I suggest we refer to these thoughts as materiologies. They are complementary to, but not identical with, methodologies, sets of considerations pertaining to scientific procedures. At the most basic level, materiologies come into play in Historical Climatology when researchers set out to pick the archival materials they consider best suited for a given project or when they distinguish ‘good’ from ‘bad’ materials at their desks, in ordinary practice.

Whenever the actors in Freiburg ponder the differential status of their archival sources, they enter the materiological game. Matthias Herbst, the project manager of Tambora for several years and a physicist by training, points out in one of my interviews what the positions in that game are, in his view:

Historical Climatology is located in a border zone where climatologists, historians, and researchers from all sorts of disciplines meet, even from the humanities. And the issue of having to justify oneself is a particular challenge we’re constantly facing. Whenever we attend a conference in paleoclimatology, there are researchers from the natural sciences … who say: This is all just subjective, just a couple of historical sources. Our data is given particular attention indeed. To which degree is it valid? … Our position is: OK, if one considers plenty of sources by one author, it is possible to examine them in an objectifiable way. And if one combines a very large number of these sources [by multiple authors, k.d.], an objectifiable picture comes into view. But a large part of what we do really consists of justifying our work … because it does, for instance, require diverse techniques and methods, taken from diverse disciplines, and so people who work in one particular discipline may have some difficulties accepting that. That’s why it is important to us to disclose almost all of our sources, so anybody can check them. If there is any doubt, you can go back to the original source. And so we cannot afford to make embarrassing mistakes, which result from sloppy editing of datasets. This can easily be used to call them into question.

k.d.: To discredit them?

Well, I wouldn’t say to discredit, but they can be challenged more easily. I believe that, as a result of these experiences, we have developed an excessive culture of openness and a culture of justification, too.

The openness Herbst is talking about is neither deployed for its own sake, nor is it derived from a universal scientific ethos. Rather, the decision to ‘disclose almost all of our sources’ is a strategic move in a fragmented field, helping to justify the paper people’s unusual materials and methods in the face of doubt, criticism or suspicion. Making data accessible can be understood, then, as an elaborate gesture of justification signifying something like this: Look, this is our data. We have nothing to hide. Come and judge it yourself. And Herbst adds: ‘If you know better, we are happy to discuss it with you.’ This gesture is less a reaction to any singular event or debate than a response to long-standing questioning of the nature of historical-climatological work.

Herbst’s explanations also indicate that the Freiburg group throws its data into an arena where players from dissimilar backgrounds confront each other. Even though tree-ring, ice-core and paper people share an epistemic object, their approaches to reconstructing the climate’s past differ significantly, with historical climatologists being alone in borrowing some of their ‘techniques and methods’ from historians and from the humanities, more broadly. Further, Herbst’s depiction of the field is coupled with the dichotomy between hard and soft research materials mentioned earlier. On the one hand, there are researchers from the natural sciences (who are, in this passage of the interview, presented as ‘others’): they deal with hard materials such as tree rings and stay true to the traditional division of labour between the humanities and the natural sciences, by using analytical procedures (like densitometry or the analysis of the CO₂ content of ice cores) which mostly originate from chemistry, physics or biology. On the other hand, there is the hybrid field of Historical Climatology, operating within a ‘border zone’ where dealing with a blend of techniques from physical geography, history, statistics or atmospheric physics is imperative to rendering archival sources − a soft material, as it were − amenable to analysis. This is how disparate techniques like source criticism and multivariate statistics, physical-geographical analysis and hermeneutics come together for findings based on documentary data to emerge.

Many of my interlocutors in Freiburg and beyond made remarks on what they perceive to be a hiatus between the status of Historical Climatology and other endeavours to investigate the climate’s past. Glaser commented that there is an implicit hierarchy of research materials across the field, a hierarchy he assumes to operate at a deeper level than concrete analytical questions or reflections about the most useful material for a specific research question. He paraphrases a typical accusation against Historical Climatology’s analytical procedures that a critic might put forward: ‘This cannot be true, this is not possible. It is subjective. Interpretation of text. … That response was and is a reflex.’ For some of Historical Climatology’s critics, Glaser continues, dealing with archival materials is a suspicious, maybe even arbitrary, technique that can never be of the same order as measuring temperatures with standardized instruments or analysing the chemical composition of ice cores.

When I was loafing around their desks and talking to the scientists in Freiburg about the nitty-gritty of their research, I came to realize that their way of approaching documents from the past is much more standardized than I had imagined. When selecting and interpreting historical weather diaries, town chronicles, or administrative records, historical climatologists employ source-critical procedures and make use of hermeneutic tools that are shared within the research group and have been refined over the years, such as temperature scales and coding catalogues used to decipher climate-related observations; these tools are also meant to regulate scientific practice and tame wild interpretations of the archival materials. Sophisticated as it is, this hermeneutic form of analysis may be denied the right to be a scientific technique, Glaser recounts in the interview:

I had to endure, at conferences, that someone was suddenly saying, notable fellow researchers included: … All of these things … What you read into these sources … I am sorry, but is this supposed to be science? … It was hurtful. And, being a young man who had worked for himself, I needed the grit to oppose and say, that’s what I keep saying today as well: This is one procedure. This is what I offer. Whoever wishes to may adopt it. And whoever doesn’t, should stay out of it.

Meanwhile, this type of archaic criticism has been moderated, with an increased institutionalization and consolidation of Historical Climatology (a branch that, under its current name, is roughly three generations old), but variations of disapproval can still be found today. At stake is not the unruly application of a method, to be sure, but the very choice of a method, which in turn derives from the choice of a specific research material. Glaser takes a pluralist stance here − ‘this is one procedure’ − knowing that despising hermeneutics for being inadequate would imply staying away from most documentary data altogether. If one opts for man-made archival materials, it inevitably comes with the cost of thwarting conventional methodologies within the natural sciences.

Nature vs. Culture

As we have seen, hybridity is a (perceived) stain on historical-climatological methods, while a critical issue with respect to historical documents as research materials is reliability. More specifically, written sources might be considered as ‘per se questionable’ (Glaser’s figure of speech), as prone to deformation and fantasy, contrary to the sturdiness of ice-cores or tree-rings, or the untouched naturalness of sediments. This idea of the unreliability of words on paper, of everything written, is indeed a very old idea, and one that doesn’t specifically and exclusively pertain to scientific quarrels (Mainberger, 1995). In everyday speech, there is some ambivalence as to the status of paper: it is ‘a figure both for all that is sturdy and stable (as in: “Let’s get that on paper”), and for all that is insubstantial and ephemeral (including the paper tiger and the house of cards)’, as Gitelman (2014: 3) notes. Historical climatologists would rather point to the former figure and argue that for a variety of reasons, among them the education of historical scribes, their long meteorological memory and their existential relation to weather phenomena, trust in well-chosen historical documents can indeed be justified, whereas scientists working with tree-rings might be able to exploit the latter figure to make a case against their colleagues’ use of data from paper on which ‘you can write anything’, as the saying goes.⁷

Judgements concerning the reliability of historical documents as research materials recently popped up in the field, in the course of an argument between the tree-ring and the paper people within the pages of Climatic Change, one of the flagship journals in climate science. In a recent article, co-authored by 32 historical climatologists (among them four from the Freiburg group), Wetter et al. (2014) claimed that a drought of many months in 1540 was the single most extreme drought for centuries in Europe, even outperforming the devastating summer of 2003. The authors based their results on a large sample of data consisting of drought-related observations found in ‘more than 300 documentary sources of weather reports’ (2014: 353) originating from all over Europe. Among these historical sources are, for instance, ‘rainfall observations made by four chroniclers from Switzerland (situated in Basel, Zurich, Lucerne and Winterthur) and neighbouring Alsace (France) who kept track of the duration and yield of precipitation events’ (2014: 354).

In a vehement comment on this article, a group of tree-ring scientists judged Wetter et al.’s (2014) claim as ‘unlikely’, stating that:

In summary, the presumed 1540 ‘Megadrought’ and its associated heat wave, did not leave a conspicuous fingerprint in the tree rings of hundreds of conifers from higher elevation settings in the European Alps, Pyrenees, and Balkans, nor in thousands of angiosperms from lower elevation sites in Austria, France, Germany, Switzerland and the Czech Republic, as would be expected by the degree of prolonged drought stress hypothesized in W14. (Büntgen et al., 2015: 186)

Their point is this: if such a severe drought is not ‘clearly visible’ (2015: 187) in their samples, i.e. in ‘millennial-long tree-ring chronologies available from several species and regions in Europe’ (2015: 184), it probably did not exist at all. By itself, this view reflects very well how the differential status of research materials is performatively defined in the course of controversies. Büntgen et al. doubt that man-made observations of leaf and needle loss, one of the phenomena described in the sources, are a drought indicator, ‘especially if descriptions predominantly originate from gardens, parks and orchards’ (2015: 187). Here, a metaphoricity of Nature versus Culture, of wild-growing trees versus domesticated species observed by humans in cultivated environments, sustains the attempt to find fault with the reliability of written historical accounts, with orchards becoming a symbol for the frailty of these accounts.

In their equally vehement response, titled ‘Tree-rings and people’, a collective of historical climatologists from (otherwise competitive) research groups stood by Wetter et al.’s conclusion that ‘1540 stands out as being the driest of the 510 summers from 1501 to 2010’ (Pfister et al., 2015: 195). They point out that a lack of correspondence between documentary and tree-ring data did not disqualify this claim, first, because there were still very few insights on how to bring together evidence from tree-ring and documentary data on the very subject of drought and unusually hot years (2015: 195, 197); second, because there was, suspiciously, no signal in tree-rings with respect to other ‘[w]ell-known hot and dry extremes’ in the record (such as 1616, 1636 and 1976), with ‘well-known’ indicating that these years stood the test of validation across various research materials (Pfister et al., 2015: 195f.). Pressing for more ‘systematic comparison’ (2015: 197) across dissimilar datasets, the respondents try to turn the tables by asking if tree-rings could, with respect to the reconstruction of droughts, be considered ‘representative and stable’ (2015: 195) at all.

Most interestingly for us, when explaining that tree-rings might show no signal of drought for the year in question because of heavy rainfall in late summer which ‘might have provided enough water to support continued growth’ of trees (2015: 195), the respondents bring Tambora onto the scene: ‘data from tambora.org’ (2015: 195) on heavy rainfall and floods of several European rivers is explicitly alluded to, in the main text, in support of their counter-attack on tree-ring data. Tiny as it is, this scrap from the debate exemplifies the strategic dimension of the database. Tambora is used by the paper people as a tool to pile up evidence for their claims and defend their cause. Amidst attempts at ‘monopolization of professional authority’ on the part of the tree-ring people who employ a kind of ‘boundary-work’ to rhetorically exclude ‘rivals from within by defining them as outsiders’ (Gieryn, 1983: 792), the paper people hope that Tambora allows them to intervene more powerfully in the struggle over notions of sound climatological work.

If we look at this controversy more broadly, then, what is at stake is a situation that can be framed, in theoretical terms, as a question of who is allowed to make a proper contribution to scientific knowledge, or, put more elegantly, as a question concerning ‘the capacity to speak and act legitimately (i.e. in an authorised and authoritative way) in scientific matters’ (Bourdieu, 1975: 19). Bourdieu would make us look at aspects like this:

In reality, the august array of insignia adorning persons of ‘capacity’ and ‘competence’ – the red robes and ermine, gowns and mortar-boards of magistrates and scholars in the past, the academic distinctions and scientific qualifications of modern researchers, all this social fiction which is in no way fictitious – modifies social perception of strictly technical capacity. In consequence, judgements on a student’s or a researcher’s scientific capacities are always contaminated at all stages of academic life, by knowledge of the position he occupies in the instituted hierarchies (the hierarchy of the universities, for example, in the USA). (Bourdieu, 1975: 20)

This passage might be of immediate appeal, since it bespeaks some archetypal ideas about science. In the context of Historical Climatology, though, it is insufficient. Here, ‘scientific capacities’ are inextricably dependent upon a materiality that is quite different from the robes and gowns described above: the research materials chosen − be they ice-cores, tree-rings, stalagmites, pollen, or paper − and the data derived from these materials are themselves part of the ‘insignia’ in a contested field. This twist complicates social characterizations of science that are too narrowly focused on the persona of the scientist or on academic institutions more generally.

Getting credit

One dimension of Tambora has not been examined in detail yet: what about the database’s significance with regard to the (symbolic) relations between different research groups in Historical Climatology? The situation here is quite different from what I have described in the previous paragraphs, since hardly anyone working among the paper people has to be convinced of the value of historical documents as a research material, nor of the validity of the hermeneutical procedures employed to analyse these documents.

Seen from within Historical Climatology, Tambora shines the spotlight on the treasures of the Freiburg group. Although historical climatologists might already have some idea of the amounts and the kinds of data collected and kept by particular research groups, the existence of Tambora allows a more thorough estimate of the Freiburg group’s wealth in terms of data. Showcasing the heterogeneity of its research materials that cover a multitude of regions and time periods sets the Freiburg group apart from other research groups whose possessions in data are considerably smaller or which have not put their data on display at all. In this sense, the database is turned into an infrastructure for ‘doing distinctions’ (Burri, 2008: 50). As in the case of the everyday matters of taste Bourdieu (1984) described − fashion, food, or furniture − it’s the subtle gradations that make a difference with respect to who is able to accumulate symbolic capital for what. In our case, not just any data and not just any database will do: the way the data is categorized and presented, the tools the database provides, the extent and quality of the metadata, or − as in the case of wine, indeed − the data’s provenance can be important criteria here.

Since symbolic capital is a relational category, it has to be granted to the Freiburg group by those who access, use or just look at the data on Tambora. By consulting the database, historical climatologists beyond Freiburg will realize how much work Glaser and his collaborators have put into the production of the datasets showcased on Tambora. This work, requiring wily craftsmanship, is highly valued among the paper people. As a general rule, most historical climatologists who entered the Pantheon are known for their data collection work. The status of this work is of some importance when it comes to questions of data re-use. It is assumed that colleagues who take up datasets from Tambora for their own analyses give credit to the work of the Freiburg group, e.g. by referring to the origin of the datasets used for a specific paper. To facilitate this practice, so-called DOIs, digital object identifiers, are currently affixed to the datasets in Tambora so that researchers can easily refer to the exact datasets they used. It is not eternal gratitude that the Freiburg group hopes for, but being correctly referred to (and, by implication, being visible as a ‘source’ in the research of others, similar to being included in a bibliography).

Getting credit can be a delicate matter, though. On occasion, the Freiburg group tries to follow who is using Tambora’s data for which research questions. Given the uniqueness, also in geographical terms, of some of the datasets in Tambora, e.g. relating to flood events of major rivers in Germany, it is possible to become aware of these datasets being integrated into analyses by researchers elsewhere, or the group might be contacted officially, since contributing substantial data can qualify for co-authorship of a paper. Tables listing datasets and their origins can be found in the supplementary materials of many papers online, so this is where to find references to Tambora. However, it has happened that members of the Freiburg group have been listed as co-authors of a paper, but without tambora.org being mentioned as the source from which the data has been taken. Glaser relates a case where numerous emails were exchanged to insist on correcting this omission, with the editor of the journal finally resolving the case in favour of the Freiburg group. On ‘the dark side of science’, Glaser says, you have to fight for the visibility of your database.

This anecdote slightly complicates our understanding of open access to data: putting data ‘out in the open’ is not synonymous with completely letting go of the data. Although freely accessible and in this respect already belonging to the public domain, the data can still be seen by its producers as somehow being part of them: understandably so, from a practical vantage point, given the arduous work invested in it. At stake here is the disentanglement of the data from its context of production. On the one hand, Tambora has been explicitly constructed to let datasets travel. On the other hand, the datasets’ ties, symbolic in nature, to their creators shall not be cut entirely during these travels. Even after having been put into the hands of others, the data remains sensitive matter to those who produced it.

This constellation is not categorically different from issues pertaining to the use of scholarly literature: authors of books or articles often care as much about the fate of their craft in the hands of others − and about being referenced and referred to − as the authors of datasets do. But while the classical referencing of texts counts among the most routine (and yet intricate) jobs of scientists, the referencing of datasets is, one might assume, of a less routine nature: the overwhelming plenitude of open-access datasets from which researchers can help themselves has not been there for as long as the overwhelming number of books in libraries, which makes the referencing of datasets a more unstable practice than the referencing of articles or books. How to cite datasets correctly is nowadays taken up in myriad recommendations; I think, however, that no set of best practices will completely dissolve the tensions involved in the sharing of data that once belonged to a more restricted sphere before being allowed to circulate.

Data’s origins

To end with what can only be a provisional analysis of the Tambora case, I will complicate the data game a little further. In most of the reflections about who receives recognition for what in the exchange of data, it is assumed that the data can indeed be traced back to its original creators, for without proper origins, receiving recognition for some dataset or another would be even trickier than it already is. If the data were generated automatically and circulated widely from the very moment of being generated, every attempt to turn it into symbolic capital would likely be in vain. In the case of ceaseless data streams coming from technical apparatuses, such as weather stations in meteorology or colliders and accelerators in physics, the question of who is at the origin of the data has to be put in a different way compared to Historical Climatology. In the latter case, every dataset bears the traces of its producer(s): textual fragments from chronicles, diaries or harvest records do not flow directly out of the archive onto the desks of historical climatologists, and documentary data has to be painfully gathered in a piecemeal fashion, from half-lit archives, by lonesome scientists. This circumstance accounts for the exceptional value of the data among the paper people, and it accounts, in turn, for the very possibility of turning data into symbolic capital.

If we look at it from a different angle, though, the question concerning the data’s origins gets convoluted. As Glaser does not cease to highlight in the interviews, the production of the mass of data now stored in Tambora involved a collective of researchers; he makes that clear in his publications as well, for instance by providing a list of data gatherers on the third page of a volume on Aufzeichnungen und Daten aus Franken, Sachsen, Sachsen-Anhalt und Thüringen 1500–1699 (Glaser & Militzer, 1993). As to datasets constituted from archival materials from the eastern part of Germany, Glaser mentions working, after the fall of the Berlin Wall, with two historians, Matthias Deutsch and Stefan Militzer, well-versed in foraging in the archives of Berlin, Leipzig and Dresden. With regard to the Tambora data as a whole, there has been a collaborative effort ‘with many colleagues’, Glaser emphasizes. ‘There were around 20, 25 people involved: student assistants, counterparts, colleagues, who contributed one part or another. … But I was in charge of the operation.’

From this vantage point, the term ‘Freiburg group’ used as a shorthand throughout this text extends well beyond Freiburg. Analogously, ‘the Glaser-Data’ can be thought of as a symbolic denominator for the efforts of a collective, coordinated by Glaser. And the same train of thought applies to Tambora, which is of boundary object-like character (Star & Griesemer, 1989), aligning heterogeneous actors with different interests. To complete the picture of this database, we have to consider the work of Peter Buchmann, a librarian and hydrologist by training who was involved in the construction of Tambora from the very start and specialized in questions of digital rights administration and data management; of Laurenz Winter and Theo Petersen, two senior scientists who contributed expertise gained in previous database projects; of Samuel Fehr, an IT worker programming and designing Tambora; of Matthias Herbst, Tambora’s project leader, striving relentlessly to build state-of-the-art ideas into the database; of student assistants investing their efforts into digitizing paper-based sources; and of the infrastructure workers setting up, maintaining and repairing servers in basements where the digital data is materially located. The efforts of these actors are a prerequisite for documentary data to exist and subsist the way it does.

And there are more prerequisites, to be found in the deep history of Tambora’s data − and of Historical Climatology, more generally − which are of some importance for the question concerning origins. Producing data is, by its very nature, embedded into the traditions of this field, since techniques and materials are handed down from one generation of researchers to another. For example, Tambora’s creators used inventories (in the form of catalogues or lists) indicating interesting materials for climatological analyses, sometimes assembled by the forefathers of Historical Climatology, such as Gustav Hellmann (1854–1939) or Curt Weikinn (1888–1966) who produced inventories of sources that are of immense practical value today, since they indicate entry points into the panoply of written materials the historical climatologist has to roam through. Originally preserved on index cards, Weikinn’s collection is now accessible, in scanned form, through Tambora (Riemann et al., 2015: 71f.). Apart from such inventories, a repertoire of practical knowledge is continually handed on among the paper people, e.g. with respect to standards for evaluating and interpreting archival materials, much like the transmission of skills and standard procedures for handling and analysing research organisms in a laboratory (Kohler, 1994).

These observations suggest that the everyday practices of historical climatologists are composed of many layers, some of them more historical, some more recent. Producing documentary data is a deeply cultural thing, in the pragmatic sense of being enmeshed in shared ways of doing. Accordingly, Tambora’s data has roots that ramify into a collective of researchers without clear-cut contours. From this point of view, the authorship of the data appears to be scattered across generations of historical climatologists.

Is this constellation comparable to what Bowker (2000: 673) describes with respect to data derived from indigenous knowledge? In the latter case, the data − for example on the whereabouts of herbs for medical treatment − belongs to a collective, handed down for generations; no specifiable individuals are at the origin of these data (a problem with respect to intellectual property law with its neglect of collectivities as owners, 2000: 672f.). With the data circulating and being used by actors who have not been implicated in its production (pharmaceutical companies, for instance), an issue of ‘fair recompense’ (2000: 673) arises. This issue is not exclusive to the case Bowker refers to; it is arguably to be found in many businesses, economic or scientific. In Tambora’s case, as in the case of indigenous herbs, the data can be seen to ‘belong to’ a collective, given the multiple actors taking part in its production.

However, there is at least one major difference between these two cases. Contrary to the ‘information imperialism’ characterizing the indigenous herbs case, with ‘a net flow of raw data out of the Third World into Western databanks’ (2000: 673), Tambora’s data largely stays within the collective of historical climatologists when it comes to generating valuable products out of it. The climatological data is not taken away from one dominated domain and then ruthlessly brought to fruition in another. Rather, the existence of Tambora enables the Freiburg group to hand the data back to the collective upon which it relied. In an interview, Glaser explains to me that:

It is pretty clear, there is a community which focuses on data collection, … on transcription, on palaeographic expertise, and there are many other scientists who have different fields of expertise. They need to come together. This research community is based on the division of labour, and that needs to be made visible. … As much as one would like to keep it [the data] to oneself, one has to take a leap. I need to untie it from my own person, I need to be able to pass it on, to unleash it.

k.d.: So that it can live on?

Yes, and be interpreted once again.

Tambora is characterized here as an infrastructure of concord. What the database ought to put on display is not only the Freiburg group’s unique datasets, it is also the distributed character of historical–climatological work. Together with its data, the paper people shall live on, wrapped up in a database.

Conclusion

To the outside observer of science, constructing a shared database might appear to be an example of applied altruism in science, or merely a technical procedure, or not fundamental to scientific practice at all. None of these presumptions applies to the case I have investigated. Instead, the coming into being of Tambora touches the core of what it means to do Historical Climatology: it is entangled with struggles over the status of documentary data, with matters of recognition, and with concerns for the traditions of the paper people. Finally, the construction of Tambora is fuelled by the hope of altering the very conditions of the field to which it responds.

Shifting between various layers of the database, as I attempted to do in this article, can yield fruit analytically. In one layer, the historicity of the Freiburg group’s data production transpires, while the turbulences between the tree-ring people and the paper people are to be found in another layer of the database’s existence. It is at the interstices of divergent materiologies that the ethnographer of science encounters some of the rudimentary values, assumptions, ideals and metaphors (remember the orchards) of the field under investigation. How scientists conceive of their research materials and of the materials of their competitors is not only consequential for how they judge the data generated from these materials; it is also a major element of their scientific identities and constitutive of what Fleck (1979[1935]) has called ‘thought collectives’. Further comparative studies on the diversity of materiologies within climate science would be enlightening. There is, I believe, an abundance of materiological turbulences to be found in other fields of science as well.

Now, what will happen to Tambora’s data, out in the open? Detached from the contexts of its production and reappropriated elsewhere, the data is likely to take on new modes of existence. As unpredictable as the directions in which the data will travel is the future of Tambora itself. How will the database be received and used among the paper people and within the vast confines of climate science? And how much will the Freiburg group receive back from its willingness to open its treasure trove, in the currencies of credibility, recognition and legitimacy? In its current version, Tambora is still young, and so patience is needed to observe whether, and in which ways, the hopes that have been built into that database will be fulfilled.

Meanwhile, the Freiburg group carries on, presenting Tambora around the world, implementing new tools, designing new interfaces and adding new datasets. As Heaton & Millerand (2013: 887) framed it, a database is a mutable entity that is never finished and requires constant maintenance, hence remaining fragile. In fact, if one acknowledges ‘the long now’ (Ribes & Finholt, 2009) of research infrastructures, often spanning multiple cycles of funding and enduring organizational and technological ‘tensions’ (2009: 393), the numerous vicissitudes of database construction become intelligible. There is a bewildering open-endedness attached to this practice, a situation that has some consequences for the ethnographer’s characterizations, which can only be of a provisional nature. Following the life and times of Tambora and of the scientists behind it is likely to be a long-term ethnographic endeavour, or so it seems today.

Acknowledgements

Generously welcoming me into their offices, Rüdiger Glaser and the members of the Freiburg group made this text possible in the first place; they remain the most engaging interlocutors one could imagine. It was Christoph Hoffmann, trailblazer in the study of desktop work, who discovered the intricate nature of Historical Climatology’s materials and made me think about it. Thanks to David Jaclin for lucid comments; to Verena Halsmayer, Meritxell Ramírez-i-Ollé, Tobias Brücker, Michael Hagner, Katharina Limacher, Carlo Caduff, Christof Rothenberger, Anne Weist, Jean-Philippe Rickenbach, Anke te Heesen and Flurin Rageth for conversations; to Patricia Isabelle for shrewd remarks; and to Sergio Sismondo for striking discussions, encouragement and a desk.

Funding

This work was supported by the Swiss National Science Foundation, under a grant to the ‘Desktop Studies’ project (University of Lucerne & ETH Zurich).

Author biography

Kris Decker is completing a PhD thesis at the University of Lucerne, Switzerland, where he teaches science studies to undergraduates. Deriving from his fieldwork in climate science, a book manuscript on the cosmos of historical climatologists may soon see the light of day.

References

Arzberger, Schroeder, Beaulieu et al. (2004) An international framework to promote access to data. Science 303(5665): 1777–1778.

Asdal and Moser (2012) Experiments in context and contexting. Science, Technology & Human Values 37(4): 291–306.

Baker and Millerand (2010) Infrastructuring ecology: Challenges in achieving data sharing. In: Parker and Vermeulen (eds) Collaboration in the New Life Science. Surrey: Ashgate, 111–138.

Bontems and Gingras (2007) De la science normale à la science marginale. Analyse d’une bifurcation de trajectoire scientifique: le cas de la Théorie de la Relativité d’Echelle. Social Science Information 46(4): 607–653.

Borgman (2007) Scholarship in the digital age: Information, infrastructure, and the Internet. Cambridge, MA: MIT Press.

Bourdieu (1975) The specificity of the scientific field and the social conditions of the progress of reason. Social Science Information 14(6): 19–47.

Bourdieu (1984) Distinction: A Social Critique of the Judgement of Taste. London: Routledge.

Bourdieu (1986) The forms of capital. In: Richardson (ed.) Handbook of Theory and Research for the Sociology of Education. New York: Greenwood, 241–258.

Bourdieu (2004) Science of Science and Reflexivity. Chicago, IL: University of Chicago Press.

Bowker (2000) Biodiversity datadiversity. Social Studies of Science 30(5): 643–683.

Bowker (2013) Data flakes: An afterword. In: Gitelman (ed.) Raw Data is an Oxymoron. Cambridge, MA: MIT Press, 167–171.

Bradley (2015) Paleoclimatology: Reconstructing Climates of the Quaternary. Kidlington: Academic Press.

Brázdil, Pfister, Wanner, von Storch and Luterbacher (2005) Historical climatology in Europe – the state of the art. Climatic Change 70(3): 363–430.

Büntgen, Tegel, Carrer et al. (2015) Commentary to Wetter et al. (2014): Limited tree-ring evidence for a 1540 European ‘megadrought’. Climatic Change 131(2): 183–190.

Burri (2008) Doing distinctions: Boundary work and symbolic capital in radiology. Social Studies of Science 38(1): 35–62.

Callon (1984) Some elements of a sociology of translation: Domestication of the scallops and the fishermen of St Brieuc Bay. The Sociological Review 32(1): 196–233.

Camic (2011) Bourdieu’s cleft sociology of science. Minerva 49(3): 275–293.

Carey (2012) Climate and history: A critical review of historical climatology and climate change historiography. WIREs Climate Change 3(3): 233–249.

Edwards (2010) A Vast Machine: Computer models, climate data, and the politics of global warming. Cambridge, MA: MIT Press.

European Research Council (2017) Guidelines on Implementation of Open Access to Scientific Publications and Research Data. Available at: http://ec.europa.eu/research/participants/data/ref/h2020/other/hi/oa-pilot/h2020-hi-erc-oa-guide_en.pdf.

Fleck (1979[1935]) Genesis and Development of a Scientific Fact, trans. Bradley and Trenn. Chicago, IL: University of Chicago Press.

Gieryn (1983) Boundary-work and the demarcation of science from non-science: Strains and interests in professional ideologies of scientists. American Sociological Review 48(6): 781–795.

Gingras (2013) Sociologie des sciences. Paris: PUF.

Gitelman (2014) Paper Knowledge: Towards a media history of documents. Durham, NC: Duke University Press.

Glaser (1996) Data and methods of climatological evaluation in Historical Climatology. Historical Social Research 21(4): 56–88.

Glaser (2013) Klimageschichte Mitteleuropas: 1200 Jahre Wetter, Klima, Katastrophen. Darmstadt: Primus.

Glaser and Militzer (1993) Wetter-Witterung-Umwelt: Aufzeichnungen und Daten aus Franken, Sachsen, Sachsen-Anhalt und Thüringen 1500–1699. Würzburg: Geographisches Institut der Universität Würzburg.

Glaser and Riemann (2009) A thousand-year record of temperature variations for Germany and Central Europe based on documentary data. Journal of Quaternary Science 24(5): 437–449.

Glaser, Riemann, Schönbein et al. (2010) The variability of European floods since AD 1500. Climatic Change 101(1–2): 235–256.

Heaton and Millerand (2013) La mise en base de données de matériaux de recherche en botanique et en écologie. Spécimens, données et métadonnées. Revue d’anthropologie des connaissances 7(4): 885–913.

Himmelsbach, Glaser, Schönbein et al. (2015) Reconstruction of flood events based on documentary data and transnational flood risk analysis of the upper Rhine and its French and German tributaries since AD 1480. Hydrology and Earth System Sciences 19: 4149–4164.

Hine (2006) Databases as scientific instruments and their role in the ordering of scientific work. Social Studies of Science 36(2): 269–298.

Hong (2008) Domination in a scientific field: Capital struggle in a Chinese isotope lab. Social Studies of Science 38(4): 543–570.

Kim (2009) What would a Bourdieuan sociology of scientific truth look like? Social Science Information 48(1): 57–79.

Kohler (1993) Drosophila: A life in the laboratory. Journal of the History of Biology 26(2): 281–310.

Kohler (1994) Lords of the Fly. Chicago, IL: University of Chicago Press.

Krämer (2015) ‘Menschen grasten nun mit dem Vieh’: Die letzte grosse Hungerskrise der Schweiz 1816/17. Basel: Schwabe.

Lamb (1977) Climate: Present, Past and Future (2 vols). London: Methuen.

Le Roy Ladurie (1971) Times of Feast, Times of Famine: A history of climate since the year 1000. Garden City: Doubleday.

Leonelli (2012) When humans are the exception: Cross-species databases at the interface of biological and clinical research. Social Studies of Science 42(2): 214–236.

Leonelli (2013) Why the current insistence on open access to scientific data? Big data, knowledge production, and the political economy of contemporary biology. Bulletin of Science, Technology & Society 33(1–2): 6–11.

Leonelli (2016) Data-centric Biology: A philosophical study. Chicago, IL: University of Chicago Press.

Levin and Leonelli (2017) How does one ‘open’ science? Questions of value in biological research. Science, Technology & Human Values 42(2): 280–305.

Mainberger (1995) Schriftskepsis: Von Philosophen, Mönchen, Buchhaltern, Kalligraphen. München: Fink.

Merton (1973[1942]) The normative structure of science. In: Storer (ed.) The Sociology of Science: Theoretical and empirical investigations. Chicago, IL: University of Chicago Press, 267–278.

Michener (2015) Ecological data sharing. Ecological Informatics 29: 33–44.

Millerand (2011) Le partage des données scientifiques à l’ère de l’e-science: l’instrumentation des pratiques au sein d’un collectif multidisciplinaire. Terrains & travaux 18(1): 215–237.

Millerand (2012) La science en réseau. Les gestionnaires d’information ‘invisibles’ dans la production d’une base de données scientifiques. Revue d’anthropologie des connaissances 6(1): 163–190.

Nadim (2016) Data labours: How the sequence databases GenBank and EMBL-Bank make data. Science as Culture 25(4): 496–519.

Nature (2016) Announcement: Where are the data? Nature 537(7619): 138.

Nelson (2009) Empty archives. Nature 461(7261): 160–163.

Panofsky (2011) Field analysis and interdisciplinary science: Scientific capital exchange in behavior genetics. Minerva 49(3): 295–316.

Pfister (1985) Klimageschichte der Schweiz 1525–1860. Das Klima der Schweiz von 1525–1860 und seine Bedeutung in der Geschichte von Bevölkerung und Landwirtschaft. Bern: Haupt.

Pfister, Wetter, Brázdil et al. (2015) Tree-rings and people – different views on the 1540 Megadrought. Reply to Büntgen et al. 2015. Climatic Change 131(2): 191–198.

Pontille (2010) Updating a biomedical database: Writing, reading and invisible communication. In: Barton and Papen (eds) The Anthropology of Writing: Understanding textually mediated worlds. London: Continuum, 47–66.

Ramírez-i-Ollé (2015) Rhetorical strategies for scientific authority: A boundary-work analysis of ‘Climategate’. Science as Culture 24(4): 384–411.

Rheinberger H-J (1997) Towards a History of Epistemic Things: Synthesizing proteins in the test tube. Stanford, CA: Stanford University Press.

Ribes and Finholt (2009) The Long Now of Technology Infrastructure: Articulating tensions in development. Journal of the Association for Information Systems 10(5): 375–398.

Riemann (2012) Methoden zur Klimarekonstruktion aus historischen Quellen am Beispiel Mitteleuropas. Freiburg i. Br.: Universität Freiburg.

Riemann, Glaser, Kahle and Vogt (2015) The CRE tambora.org – new data and tools for collaborative research in climate and environmental history. Geoscience Data Journal 2(2): 63–77.

Royal Society (2012) Science as an open enterprise. The Royal Society Science Policy Centre report 02/12. London: The Royal Society. Available at: http://royalsociety.org/policy/projects/science-public-enterprise/report.

Schinkel (2016) Making climates comparable: Comparison in paleoclimatology. Social Studies of Science 46(3): 374–395.

Sismondo (2011) Bourdieu’s rationalist science of science: Some promises and limitations. Cultural Sociology 5(1): 83–97.

Star and Griesemer (1989) Institutional ecology, ‘translations’ and boundary objects: Amateurs and professionals in Berkeley’s Museum of Vertebrate Zoology, 1907–39. Social Studies of Science 19(3): 387–420.

Star and Ruhleder (1996) Steps toward an ecology of infrastructure: Design and access for large information spaces. Information Systems Research 7(1): 111–134.

Star and Strauss (1999) Layers of silence, arenas of voice: The ecology of visible and invisible work. Computer Supported Cooperative Work 8: 9–30.

Strasser (2012) Data-driven sciences: From wonder cabinets to electronic databases. Studies in History and Philosophy of Biological and Biomedical Sciences 43(1): 85–87.

Wetter, Pfister, Werner et al. (2014) The year-long unprecedented European heat and drought of 1540 – a worst case. Climatic Change 125(3–4): 349–363.

Zimmerman (2008) New knowledge from old data: The role of standards in the sharing and reuse of ecological data. Science, Technology, & Human Values 33(5): 631–652.