Abstract
Ethnographic fieldnotes can contain richer and more thorough descriptions of social phenomena than other data sources. Their open-ended and flexible character makes them especially useful in explorative research. However, fieldnotes are typically highly unstructured and personalized by individual researchers, which makes them harder to use as a method for data collection in collaborative and mixed methods research. More precisely, the unstructured nature of ethnographic fieldnotes presents three distinct challenges: 1) Organizability: it can be difficult to search and sort fieldnotes and thus to get an overview of them, 2) Integrability: it is difficult to meaningfully integrate fieldnotes with other, more quantitative data types such as surveys or geospatial data, and 3) Computational Processability: it is hard to process and analyze fieldnotes with computational methods such as topic models and network analysis. To solve these three challenges, we present a new digital tool for the systematic collection, processing, and analysis of ethnographic fieldnotes. The tool was developed and tested as part of an interdisciplinary mixed methods pilot study on attention dynamics at a political festival in Denmark. Through case examples from this study, we show how adopting this new digital tool allowed our team to overcome the three aforementioned challenges of fieldnotes while retaining the flexible and explorative character that is a key strength of ethnographic fieldwork.
Introduction
Fieldnotes are qualitative social scientific data whose depth and richness can be incomparable to other data forms. Their open-ended and flexible nature makes them apt for explorative research and grounded theory (FitzGerald & Mills, 2022; Phillippi & Lauderdale, 2018), as well as particularly useful for studying settings marked by radical unpredictability, contingency, and uncertainty, such as natural disasters and political conflicts (e.g., Gill et al., 2015). Ethnographic field data can also play a vital role in “big data” research. On the one hand, fieldnotes can serve as “ground truth” for big data and black-boxed computational research (Blok et al., 2017; Krieg et al., 2017). On the other hand, fieldnotes contain highly spatially and temporally granular information, and making them amenable to computational analysis could help social scientists to discover new patterns and produce ground-breaking results (Bjerre-Nielsen & Glavind, 2022; Blok & Pedersen, 2014; Munk, 2019; Nelson, 2020).
Traditionally, fieldnotes have been handwritten in situ into notebooks and, over recent decades, increasingly typed into text-processing programs. Ethnographers typically develop their own idiosyncratic, and not always transparent, ways of writing, storing, and accessing their fieldnotes (Abramson et al., 2018; Sanjek, 1990; see also Albris et al., 2021). This makes it hard to combine and compare different researchers’ fieldnotes, and to integrate them with other kinds of data, not least quantitative ones. For the same reason, it is still uncommon for ethnographers to work in teams, especially in interdisciplinary collaborations. 1 A further drawback of conventional ways of producing and handling fieldnotes is that they are extremely time-consuming. For one thing, the process of typing (and sometimes re-typing) fieldnotes into text-processing software such as Word or commercial qualitative analysis software such as NVivo is very cumbersome. What is more, the subsequent coding of this data into themes in accordance with research questions will often involve many hours of tedious work. Indeed, researchers are often forced to limit their analysis to chunks of their raw data based on their own recollection of the best material and cases.
In this paper, we explore the advantages to be gained from logging and structuring ethnographic fieldnotes into a formalized digital corpus from the very onset of collecting them in the field, which makes it easier to sort, search, and analyze such ethnographic data afterwards. More specifically, we highlight three key challenges, and therefore also opportunities, pertaining to how fieldnotes are commonly collected, processed, and analyzed: 1) Organizability: it can be difficult to search and sort fieldnotes and thus to get an overview of them, 2) Integrability: it is difficult to meaningfully integrate fieldnotes with other, more quantitative data types such as surveys or geospatial data, and 3) Computational Processability: it is hard to process and analyze fieldnotes with computational methods such as topic models and network analysis. In what follows, we propose a solution to each of these three challenges in the form of a digital tool that allows for the systematic collection, processing, and analysis of ethnographic fieldnotes.
The EthnoPlatform, as we have called our pilot version, is a digital data architecture enabling fieldworkers to collect ethnographic notes in a structured format (see also Astrupgaard et al., 2022). The idea to increasingly structure ethnographic data collection is far from new, and we draw inspiration from a range of scholars who have also seen this potential (e.g., Nippert-Eng, 2015, pp. 36–43; see also Bernard, 2006, pp. 398–405; Lyon, 1999). Via examples from a mixed methods pilot study of political attention in Denmark, we show how adopting our tool for structuring and digitizing fieldnotes allowed our team to overcome the three challenges of traditional practices around taking fieldnotes, while retaining several key strengths of qualitative social scientific methods. More specifically, we argue that by structuring fieldnotes in a tabular format, the EthnoPlatform 1) ensures organizability of fieldnotes by making them sortable and searchable, 2) enhances integrability with other data types such as geospatial data, and 3) improves computational processability by making fieldnotes amenable for computational processing and analyses via social data science methods such as network analysis and methods for automated text analysis.
What follows below is divided into two parts. In the first, we begin by providing a critical overview of existing digital tools for ethnographic data collection, processing, and analysis. We then introduce the EthnoPlatform by describing its main functionalities as a digital tool and our motivation for creating an infrastructure that compels and nudges ethnographers to settle upon a shared data structure as well as identifying key tags for their fieldnotes prior to embarking upon the fieldwork. Then, in the second part, we first introduce the interdisciplinary research project on political attention dynamics that served as the context for our pilot study. Via selected examples from this pilot, we then show how using this new digital tool made it possible to process and analyze the collected ethnographic fieldnotes in new ways that overcome the three aforementioned limitations of traditional fieldnote production and handling. We conclude by reflecting on how the EthnoPlatform makes ethnographic research more efficient and systematic and how it can contribute to front-line research combining mixed methods and data science (Grigoropoulou & Small, 2022; Marda & Narayan, 2021).
Background
Hopes and aspirations that digital tools would revolutionize qualitative research, including ethnography, have existed for quite some time (e.g., Hymes, 1965; see also Kemper et al., 1992). The first applications and platforms very much reflected the state of software development in the early periods of the internet. The earliest examples such as ETHNOGRAPH (Seidel & Clark, 1984) or Anthropac (Borgatti, 1989) by now look somewhat antiquated, although they were at the forefront of the development at their launch. Since then, the usability, design, and power of software applications for qualitative research have vastly improved. A key distinction to make when surveying the landscape of qualitative research software, including commercial and non-commercial solutions, is whether the software is to be used in the data collection phase, in the data analysis phase, across these phases, or in any interlocking “sub-phases.” Most qualitative researchers are familiar with research software for the data analysis phase, which is known as Computer Assisted Qualitative Data Analysis Software, often abbreviated as “CAQDAS.” The most well-known examples are NVivo (Lumivero, 2023), Dedoose (SocioCultural Research Consultants, 2021), ATLAS.ti (ATLAS.ti Scientific Software Development GmbH, 2023), TAMS Analyzer (Weinstein, 2006), and MAXQDA (VERBI Software, 2021), to name but a few. While all these platforms are different, what unifies them is the ability to process and analyze qualitative datasets through coding, indexing, and basic visualizations. Some of these platforms, like Dedoose, are cloud-based, to some extent oriented to teamwork, and have over time become quite dynamic and flexible to use.
However, existing digital platforms for qualitative data analysis, such as NVivo and Dedoose, have several limitations (Fielding, 2012, pp. 126-133; Paredes et al., 2017, p. 1562). Apart from the fact that much of this software is proprietary and thus difficult to afford for individuals or institutions with limited resources, its inflexibility and closed design make it cumbersome to use for ethnographic research. In particular, the inbuilt and often encapsulated structure for the labeling of data, which is a common feature of most existing platforms, puts constraints on the iterative nature of grounded theory analysis, which requires that researchers can continuously re-type and re-label their data as their analysis progresses (Nelson et al., 2021). Moreover, applications such as Dedoose and NVivo do not incorporate in situ data collection in a way that is aligned with the affordances that smartphones and other handheld devices can offer ethnographers in the field, as they operate on computers only. Nor are Dedoose or NVivo designed to be interoperable with programming languages such as Python or R, although these applications have indeed expanded their functionality to include, for instance, gathering data from social media platforms.
Examples of software aimed at the data collection phase of the ethnographic research process with a specific focus on fieldnotes are sparse. Many ethnographers have promoted the use of generic note-taking software such as Evernote (Wang, 2012). However, while such applications can be used on handheld devices, are usable by teams, and can even do simple tagging of themes within notes, they are not designed for research and therefore lack further functionalities and design capabilities. Crowdsourced data collection or “mobile ethnography” represents another group of applications. A case in point is the Indeemo mobile ethnography app (Indeemo, 2018), which is designed explicitly for handheld devices and aimed at letting research participants contribute to the data collection through written interactions, photos, and videos. It is, however, not a software platform optimized for the writing of fieldnotes and the analysis of these across a team of researchers. This is also the case with similar platforms and apps such as EthnoAlly (Favero & Theunissen, 2018), the EthOS platform (EthOS, 2023), or QualMobile (Sago, 2023), as well as many other competing alternatives on the market.
To the best of our knowledge, then, there is currently no tool, software, or platform (commercial or non-profit) that facilitates the writing and storage of fieldnotes on the move (i.e., on handheld devices) while being easy to use, adjustable to the wishes of the individual or group of ethnographers using it, and allowing computational processing of fieldnotes. This motivated us to develop our own digital tool, and in the following section, we describe what it is built on and how it works.
A Digital Data Architecture for Ethnographic Fieldnotes
Our overarching ambition has been to develop a digital tool that helps to overcome the three limitations pertaining to existing ways of producing and handling fieldnotes, namely, organizability, integrability, and computational processability. Unlike existing digital tools for the handling of qualitative data, it had to be accessible for ethnographers in the field through a handheld device, usable for collaborative data collection and storage, and allowing for export of fieldnotes in computationally processable file-formats. Crucially, all these objectives should be met without restricting the flexibility and open-endedness of the ethnographic method and explorative and abductive social scientific analyses too much. As we are going to describe in detail in the following pages, the solution we developed involves a tagging approach as the basic principle undergirding the data infrastructure of our digital tool. This, we shall argue, provides a viable path out of each of the three above challenges.
A Tagging Approach
The tagging approach implies that researchers develop a set of tags before entering the field. These tags serve as categories of information that are assessed to be important to retrieve in all fieldnotes. Tags can be closed categories pertaining to each fieldnote, or they can be open categories that allow for more elaborate accounts, including a virtually mandatory tag for field observations.
An example of a fieldnote template from our own case study, which we will describe in more depth later, can be seen in Figures 1 and 2, which show the interface of the EthnoPlatform. Here, we used 8 tags in total. When initiating a new fieldnote, the ethnographer is met with a template of the pre-defined tags. In our case, these consisted of 6 close-ended tags: “Name” (of ethnographer), “Project,” “Location,” “Situation,” “Date,” and “Time” (see Figure 1), ensuring that each note contains the same contextual information. Beneath the close-ended tags, which we refer to as metadata, there are two additional open-ended tags with larger text fields: “Field observation” and “Reflections” (see Figure 2). “Field observation” is for descriptive fieldnotes pertaining to a particular setting or situation playing out, and “Reflections” is for adding analytical or methodological reflections.
Figure 1. Interface of the EthnoPlatform: close-ended tags.
Figure 2. Interface of the EthnoPlatform: open-ended tags.

In combination, the tags make up a fieldnote template that can be filled out by any member of a team. Every time an ethnographer writes a fieldnote, she makes sure to retrieve information that corresponds to the pre-defined tags. The template thereby ensures that all fieldnotes contain the same type of information, making them more aligned through the tags. 2 We want to emphasize that the tags exemplified in the template in Figures 1 and 2 were chosen because they made sense for our particular study at a politics festival. In other studies, another set of tags might be more useful, such as “Keywords,” “Names (or pseudonyms) of interlocutors present,” or “Summary of fieldnote.” 3 In the coming section on how our tool enhances organizability, we elaborate on how the template tags can be used to sort and retrieve specific fieldnotes. By using such templates, the ethnographers are forced to follow a structure, and the fieldnotes become less individualistic and idiosyncratic, as the single ethnographer becomes more detached from her notes. Moreover, if there are multiple researchers working on a project, they can better read and understand each other’s fieldnotes.
Developing the EthnoPlatform
The key feature of our tool, the EthnoPlatform, is the possibility of collecting fieldnotes in a common structure via tags. We have developed two test versions 4 that are based on the tagging approach. In the first test version, we used the existing off-the-shelf survey tool SurveyXact. Here, the fieldnote template corresponded to questions in a survey that was distributed to ethnographers through a link that they could access on their devices and fill out whilst observing in the field. Based on lessons learned from the first version, the EthnoPlatform was refined, and we developed the next test version as a web application. 5 The fieldnote template was similar to the first version, though this time the fieldnote archive was more accessible in the field, making it easier to add and edit notes on the go. The interface of our second version is shown in Figures 1 and 2. Here, the ethnographer fills out information in the fields below each tag and then clicks “Save note” to store it in a cloud-based archive. The ethnographer can read and edit the note in the field or later. Thus, if there has not been enough time to type in all information at once, it is possible to log on to the EthnoPlatform and elaborate or extend fieldnotes later on if needed.
Processing Fieldnotes in a Tabular Format
Because the ethnographic fieldnotes are structured via tags, they can easily be exported from the EthnoPlatform in a tabular format such as a CSV-file. In this data format, the tags are columns and fieldnotes are rows. The tabular format makes the collection of fieldnotes appear a lot like a quantitative data set, where the tags correspond to variables and fieldnotes to observations. Then, the fieldnotes can to a large extent be processed as quantitative data. If an ethnographer uses many tags, then the fieldnote template will provide better possibilities for quantitative analysis through aggregation and clustering of fieldnotes by tags. However, the open-endedness of a research project is reduced if there are many tags as it will make ethnographers less prone to delve into unexpected questions and dimensions emerging along the way in the field. For a project with an open-ended focus, where the ethnographer prefers to retain the traditional focus on writing and editing fieldnotes in a more idiosyncratic and flexible manner, she can decide to merely use tags such as “Text”, “Place”, and “Date”. This will make the data collection less structured, reducing the possibilities of overcoming the aforementioned limitations pertaining to traditional ways of doing fieldnotes. Nevertheless, even in such cases, ethnographic researchers could still benefit from digitizing and organizing their fieldnotes via time and location metadata by utilizing the EthnoPlatform’s efficiency as a handheld device for instantly logging fieldnotes into an online editable archive.
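To make the tabular format concrete, the following is a minimal sketch of how such an export could be read with Python's standard library. It is not the code used in the study: the column names simply mirror the eight tags described above, and the three rows, the variable `CSV_EXPORT`, and the helper `load_fieldnotes` are invented for illustration.

```python
import csv
import io
from collections import Counter

# Hypothetical export: one row per fieldnote, one column per tag.
# The rows are invented and only illustrate the structure.
CSV_EXPORT = """Name,Project,Location,Situation,Date,Time,Field observation,Reflections
A,Pilot,Main stage,Debate,2022-06-16,10:15,"The audience faces the stage.","Quiet start."
B,Pilot,Harbor,Walking,2022-06-16,10:40,"Small groups stroll past the stalls.",""
A,Pilot,Main stage,Debate,2022-06-16,13:05,"Several people check their phones.","Less focus after lunch."
"""


def load_fieldnotes(text):
    """Parse a CSV export into a list of dicts mapping tag names to values."""
    return list(csv.DictReader(io.StringIO(text)))


notes = load_fieldnotes(CSV_EXPORT)

# Close-ended tags behave like variables: count fieldnotes per location.
notes_per_location = Counter(note["Location"] for note in notes)
print(notes_per_location)
```

Each fieldnote becomes one record whose keys are the tags, so any close-ended tag can be used for counting or grouping, while the open-ended text fields remain available in full for qualitative reading.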
Which and how many tags to include should ideally be settled upon before entering the field. However, it is important to emphasize that when using the EthnoPlatform, the number of tags for fieldnotes does not have to be set in stone before the data collection begins. The fieldnote template can be edited after the ethnographic data collection has begun, and in case the ethnographer encounters new paths emerging in the field that make the pre-defined tags seem insufficient, then she can add or remove tags. Archived fieldnotes can be gone through and edited to include the revised tags if it seems sensible. The framework we developed generates a protocol of the changes made in the tag structure during the period of data collection, which ensures that potential changing of tags along the way is transparent. 6 Though the main purpose of our tool is to digitize and structure fieldnotes, the possibility of changing tags ensures a high degree of flexibility during data collection. Nevertheless, any revision of tag structure should preferably take place in the beginning of the ethnographic data collection to avoid a high degree of manual post-tagging and irregularities in the structure of the fieldnotes.
Fixing Fieldnotes
There are several ways in which our tool improves organizability of fieldnotes: The tabular format of its data infrastructure means that fieldnotes are more easily sorted or grouped based on conditions through tags (e.g., “date” or “ethnographer”). This means that the fieldnotes can better be organized according to one’s own preferences or preferred tags, which provides the ethnographer with an overview of the collected fieldnotes. Correspondingly, the integrability of the fieldnotes is also enhanced. If one or more tags of the fieldnotes match features of another data set, then it is possible to integrate the data sets. For instance, having a “Time”-tag allows the ethnographer to combine the fieldnotes with relevant quantitative data that also has a time feature. This could be different kinds of data depending on the specific project, and one could imagine integrating information about anything from weather conditions to social media posts with the ethnographic fieldnotes.
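The integration by a shared time feature can be sketched as follows. This is an illustrative example under stated assumptions, not the procedure used in the study: the weather readings, the `timestamp` and `condition` fields, and the helper `attach_latest_reading` are hypothetical, and the matching rule (take the most recent reading at or before each note) is only one of several reasonable choices.

```python
from bisect import bisect_right
from datetime import datetime

# Invented fieldnotes carrying "Date" and "Time" tags.
notes = [
    {"Date": "2022-06-16", "Time": "13:05", "Location": "Harbor"},
    {"Date": "2022-06-16", "Time": "10:15", "Location": "Main stage"},
]

# Invented readings standing in for any time-stamped quantitative data set.
weather = [
    {"timestamp": "2022-06-16 10:00", "condition": "sunny"},
    {"timestamp": "2022-06-16 12:00", "condition": "cloudy"},
    {"timestamp": "2022-06-16 13:00", "condition": "rain"},
]

FMT = "%Y-%m-%d %H:%M"


def attach_latest_reading(notes, readings):
    """Add to each note the most recent reading at or before its timestamp."""
    readings = sorted(readings, key=lambda r: datetime.strptime(r["timestamp"], FMT))
    times = [datetime.strptime(r["timestamp"], FMT) for r in readings]
    merged = []
    for note in notes:
        when = datetime.strptime(f'{note["Date"]} {note["Time"]}', FMT)
        idx = bisect_right(times, when) - 1  # last reading not after the note
        condition = readings[idx]["condition"] if idx >= 0 else None
        merged.append({**note, "condition": condition})
    return merged


merged = attach_latest_reading(notes, weather)

# Sorting uses the same tags, here ordering the notes chronologically.
merged.sort(key=lambda n: (n["Date"], n["Time"]))
```

The same pattern applies to any other shared tag, for example matching on “Location” instead of time.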
But that is not all. As we are going to see below, the digitized and structured format of the EthnoPlatform’s digital data architecture also makes computational processing of fieldnotes easier. This means that we can use methods and techniques from social data science such as automated text analyses to either test pre-existing hypotheses or to discover new patterns in the fieldnotes.
Exploring Fieldnotes From an Interdisciplinary Case Study
In this second part of the article, we introduce the ethnographic research context in which the pilot test and development of the EthnoPlatform took place, and we discuss how this digital tool enabled us to explore our ethnographic fieldnotes by using computational methods. Accordingly, what follows is structured around the three aforementioned limitations of traditional fieldnotes: organizability, integrability, and processability. First, we show how the structured format of our fieldnotes allows us to easily get an overview of the data using summarizing statistics, as well as to search and sort through them effortlessly. Then, we show how our fieldnotes can be integrated with geospatial data, which allows us to better explore spatial dimensions of the festival site through our fieldnotes. Lastly, we show how our fieldnotes can be analyzed through more advanced means of computational processing such as topic modelling and network analysis, as we present a topical mapping to visualize thematic relations in the fieldnote corpus.
Collecting Ethnographic Fieldnotes at a Danish Politics Festival
We used the EthnoPlatform to collect fieldnotes as part of a case study at the Danish politics festival, The People’s Meeting (in Danish: “Folkemødet”). The People’s Meeting aims to facilitate a democratic dialogue between citizens and decision-makers by hosting numerous topical debates and political speeches organized by public and private stakeholders. Since its launch in 2011, it has grown to become Denmark’s largest politics festival, and in 2022, the festival hosted 2,500 events over its four days with around 30,000 daily visitors (Folkemødet, 2022). Many events occur simultaneously, causing event organizers to use different means to try to attract the attention of visitors, who can freely choose which events to partake in. The People’s Meeting thus served as an interesting case for us to study different aspects of attention dynamics and behavior as well as for testing our new tool.
The pilot was conducted as two studies, 2021 and 2022, with different analytical objectives but a shared focus on data pertaining to political attention. 7 In 2021, we experimented with new ways of collecting ethnographic fieldnotes collaboratively to use these in combination with other data types. We also sought to produce fieldnotes pertaining to similar situations and events occurring in different places around the festival site. Given the somewhat chaotic nature of such a large-scale event, we were compelled to reconsider the traditional practice of fieldnote collection and instead introduce a more structured format leading to the development of the first version of the EthnoPlatform. In 2022, we returned to the festival for the second part of the study with a second version of the EthnoPlatform, this time as a web application. Using the EthnoPlatform alongside detailed observation guides, we found that this format indeed could produce organizable, integrable, and computationally processable fieldnotes. The EthnoPlatform thus allowed us to accumulate similarly structured, ethnographic fieldnotes from many locations at the festival square at the same time which were instantly digitized and stored safely from the field in both parts of the case study. 8 We have now presented our own experiences with the tool in the data collection phase. In the next section, we turn to how the structured fieldnotes can be analyzed, and here, the first step was to export the data as CSV-files to then process them computationally using the programming language, Python.
Organizability—Searching and Sorting Fieldnotes
Table 1. Summarizing statistics of fieldnotes.
It is also possible to produce summarizing statistics that may be of more analytical interest through word frequencies. We can, for example, pick out the most frequently used words. In Table 1, we see the most common words in the data seem to be related to the stage and audience, but also to behaviors such as walking and talking. We also see that in 2021 there was a lot more written about the audience and the stage, whereas the 2022 fieldnotes seem to be more concerned with groups of people and the general atmosphere.
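A word-frequency summary of this kind can be sketched in a few lines. The example below is a minimal illustration rather than the analysis behind Table 1: the notes are invented, the year is derived from the “Date” tag, and the stop-word list and the helper `top_words` are hypothetical simplifications.

```python
import re
from collections import Counter

# Invented notes; the year of each study is read off the "Date" tag.
notes = [
    {"Date": "2021-06-17", "Field observation": "The audience watches the stage. The stage is full."},
    {"Date": "2022-06-16", "Field observation": "Groups of people walk and talk near the harbor."},
    {"Date": "2022-06-17", "Field observation": "People talk in small groups."},
]

# A deliberately tiny stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"the", "is", "of", "and", "in", "near"}


def top_words(notes, year, n=3):
    """Return the n most frequent non-stop-words in one year's observations."""
    counts = Counter()
    for note in notes:
        if note["Date"].startswith(year):
            words = re.findall(r"[a-z]+", note["Field observation"].lower())
            counts.update(w for w in words if w not in STOP_WORDS)
    return counts.most_common(n)


print(top_words(notes, "2021"))
print(top_words(notes, "2022"))
```

Comparing the two lists side by side gives the kind of year-by-year contrast described above.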
Since we were interested in obtaining an overview of how many of the collected fieldnotes directly mentioned our primary research concept (“attention”), we count the notes containing the word “attention.” We also extract the words that most frequently appear together with “attention” in a sentence (co-occurrence). Doing this, we find that approximately a third of the fieldnotes explicitly mention attention, and that most of these fieldnotes are in the data from 2021. The five most common words associated with “attention” are nearly identical in 2021 and 2022, which indicates that our observations of attention dynamics in the two years share similar content. We assume that the differences in the content of notes between 2021 and 2022 are partly due to each year’s observations being guided by different instructions given to ethnographers on what kind of attention-related behavior to look for. These relatively basic descriptive statistics can thus reveal overall patterns and, more importantly, help to guide our qualitative reading of the fieldnotes.
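The counting and co-occurrence steps can be illustrated as follows. This is a sketch under assumptions, not the study's code: co-occurrence is interpreted here as appearing in the same sentence as the keyword, sentences are split naively on end punctuation, and the observations, stop-word list, and function names are invented.

```python
import re
from collections import Counter

# Invented "Field observation" texts.
observations = [
    "The speaker loses the attention of the audience. A phone rings.",
    "People queue for coffee.",
    "Loud music draws attention away from the debate.",
]

STOP_WORDS = {"the", "of", "a", "for", "from", "away"}


def mentions(text, keyword="attention"):
    """True if the keyword occurs as a whole word (case-insensitive)."""
    return re.search(rf"\b{re.escape(keyword)}\b", text.lower()) is not None


def cooccurring_words(texts, keyword="attention"):
    """Count words that share a sentence with the keyword."""
    counts = Counter()
    for text in texts:
        for sentence in re.split(r"(?<=[.!?])\s+", text):
            words = re.findall(r"[a-z]+", sentence.lower())
            if keyword in words:
                counts.update(
                    w for w in words if w != keyword and w not in STOP_WORDS
                )
    return counts


# Share of notes mentioning the keyword, and its sentence-level neighbors.
share = sum(mentions(t) for t in observations) / len(observations)
neighbors = cooccurring_words(observations)
```

Running the same two functions separately on each year's notes would yield the year-by-year comparison discussed above.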
As a next step of our exploratory analysis, we choose to focus on the temporal dynamics in the data. We find all sentences in the text within the “Field observation”-tag of all fieldnotes that include the word “attention” and derivatives of the word. Then we sort these sentences using the “Time” and “Date”-tags, constructing a timeline indicating how observations about attention differ during the day. An extract is shown in Figure 3, where we have highlighted the word “attention” in each sentence, and we have added information from the “Situation”-tag of the fieldnote in question to get a sense of the context.
Figure 3. Small extract of the timeline of sentences in fieldnotes containing the word “attention” and the “Situation”-tag of the fieldnote.
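A timeline of this kind could be built along the following lines. The sketch is illustrative only: the notes are invented, the regular expression is one hypothetical way of catching “attention” and simple derivatives such as “attentive,” and it assumes that dates are stored in ISO format and times as zero-padded 24-hour strings.

```python
import re

# Invented fieldnotes, deliberately out of chronological order.
notes = [
    {"Date": "2022-06-17", "Time": "09:30", "Situation": "Debate",
     "Field observation": "Most listeners are attentive. Nobody leaves."},
    {"Date": "2022-06-16", "Time": "14:10", "Situation": "Speech",
     "Field observation": "A demonstration pulls attention from the stage."},
    {"Date": "2022-06-16", "Time": "09:45", "Situation": "Debate",
     "Field observation": "The bins are noisy. Attention stays on the panel."},
]

# Matches "attention" plus simple derivatives such as "attentive".
PATTERN = re.compile(r"\b\w*attenti\w*\b", re.IGNORECASE)


def attention_timeline(notes):
    """Return (date, time, situation, sentence) tuples in chronological order."""
    rows = []
    for note in notes:
        for sentence in re.split(r"(?<=[.!?])\s+", note["Field observation"]):
            if PATTERN.search(sentence):
                rows.append(
                    (note["Date"], note["Time"], note["Situation"], sentence)
                )
    # ISO dates and zero-padded 24-hour times sort correctly as plain strings.
    return sorted(rows, key=lambda row: (row[0], row[1]))


timeline = attention_timeline(notes)
for date, time, situation, sentence in timeline:
    print(date, time, f"[{situation}]", sentence)
```

Only the matching sentences are kept, each paired with the “Situation”-tag of its note, which mirrors the layout of the extract in Figure 3.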
Reading the notes in this way allows us to analyze the temporal development of attention among the visitors at the People’s Meeting. This approach reveals that attention at the political debates might be more intense in the morning and that, at all times, several external distractors appear to draw attention away from the debates (such as phones, noisy bins, and demonstrations). While there might be many causes of these findings, our re-reading of the temporally organized sentences with the word “attention” reveals a pattern that we could then examine more thoroughly through detailed qualitative readings of the 48 full-length notes containing the word and its derivatives.
Having searchable and sortable fieldnotes is thus a big advantage, both practically and analytically. Researchers can, for example, easily discover new patterns in their own or others’ ethnographic data by searching for a specified keyword or set of keywords and extracting all fieldnotes or specific sentences in which those words, or derivations of them, are used. This is very useful for exploring, for example, how a specific concept is used differently across the total dataset, as it reduces the time researchers must spend skimming through their notes to find what they are looking for and can direct their attention to parts of the fieldnote corpus where they would not otherwise have looked. Sorting the notes can also be an important tool for researchers, and it is a lot easier to do computationally. Researchers can, for example, sort their corpus by time and examine their relationship and conversations with interlocutors over time. One could also imagine sorting fieldnotes by where they were recorded in order to understand why some areas seem to be better described than others. In this way, one could examine how notes with specific keywords are dispersed spatially by extracting notes with the keyword and sorting them by where they were recorded. One could go even further and look up the fieldnotes concerning a specific interlocutor that contain a specific keyword and sort these by time in order to read the notes in a very analytically focused manner. Thus, searching and sorting through the fieldnotes allows us to find patterns across the fieldnotes that can serve as independent analytical insights and/or guide further analysis.
Integrability—Combining Fieldnotes with Other Data Types
Besides examining the temporal dimensions of attention at The People’s Meeting, one can also use our digital tool to explore its spatial dimension. In this part of our exploratory data analysis, we integrate our digitized fieldnotes with geographical data from OpenStreetMap to create a fieldnote map, which can be used as a point of departure for further explorative research. As part of this process, we use the “Location,” “Time,” “Field observation,” and “Name” (same as researcher ID) tags to create an interactive spatiotemporal map, a snapshot of which is shown in Figure 4. Here, each point denotes a fieldnote from the 2022 study, showing the exact location of its production plotted onto a map of the festival site. The colors of the pins indicate the authoring ethnographer, meaning that we see where each field observation was conducted and by whom throughout the day. Accordingly, we can explore the interactive visualization, and by hovering the mouse over the pins, we see various information about each fieldnote such as the time of day and the researcher ID. We also added the most important words in each fieldnote, identified as the words with the highest TF-IDF weight (Spärck Jones, 1972). TF-IDF is a very commonly used measure of word importance from the field of Natural Language Processing (NLP) that can help find the words that set a text apart from the rest of the corpus. Here, the technique can help us summarize the fieldnote contents, so we can explore them spatially as well.
Figure 4. Map of data collection.
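The idea behind TF-IDF can be shown with a small hand-rolled example. The paper does not state which TF-IDF variant or library was used, so the sketch below assumes the textbook form, term frequency multiplied by log(N / document frequency); the observations and the helper names `tokenize` and `top_tfidf_words` are invented for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def top_tfidf_words(documents, n=2):
    """For each document, return its n highest-scoring words.

    Score = (count / document length) * log(N / document frequency),
    an assumed textbook variant of TF-IDF.
    """
    tokenized = [tokenize(doc) for doc in documents]
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))  # count each word once per document
    total = len(documents)
    results = []
    for tokens in tokenized:
        if not tokens:
            results.append([])
            continue
        tf = Counter(tokens)
        scores = {
            w: (c / len(tokens)) * math.log(total / doc_freq[w])
            for w, c in tf.items()
        }
        # Highest score first; ties broken alphabetically for determinism.
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        results.append(ranked[:n])
    return results


# Invented observations, one per fieldnote.
observations = [
    "the audience laughs at the comedian and the comedian waves",
    "the audience queues for tapas and more tapas",
    "the audience listens to the debate",
]
keywords = top_tfidf_words(observations)
```

Words that occur in every note, such as “audience” in this toy corpus, receive a weight of zero, whereas words concentrated in a single note rise to the top. This is what makes the measure useful for labelling individual pins on a map.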
Figure 4 allows us to see where the ethnographers in the team observed attention dynamics. In the specific case at hand, we see that data from the main stage zone (top left) appear to revolve around debates. The two notes highlighted in this zone in Figure 4, for example, pertain to a debate where the audience’s attention was distracted by noise from neighboring events (in one tent, the host was a stand-up comedian making people laugh, and in another tent, there was a debate with an ambassador where the main focus of the audience seemed to be on the free tapas being served). Conversely, we also see that the fieldnotes from the harbor zone (bottom right) tend to focus more on how groups of people move through the zone and on their interactions. Thus, the two highlighted notes illustrate that while some of the observed people in the zone stand still in small clusters (apparently discussing particular debates), other groups traverse the harbor while chatting, in one instance buying a flower from the local florist. While a more thorough investigation would be required to confirm and substantiate this, this preliminary analysis indicates that activities in this zone are less centered around political debates than those around the main stage.
To sum up, these examples demonstrate that integrating digitized fieldnotes with geospatial data in the form of an interactive visualization allows us to identify new patterns in our data that we might not otherwise have seen. One could also imagine combining the ethnographic data with weather data or maybe even social media posts pertaining to debates that are also described in fieldnotes. The tags indicating “time,” “date,” “location,” and “situation” allow us to combine the ethnographic data with a range of other data types that can further enrich our ethnographic data and our analysis. Similar to the sorting and searching steps described in the previous section, the ability to combine fieldnote data with other data sources can augment and expand our exploratory analysis. As such, this kind of analysis can be used to dig deeper and more systematically into the ethnographic data, for example, by re-reading and comparing fieldnotes from specific locations and moments and thereby tracing the flow of attention in and across specific areas.
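As a rough illustration of this kind of integration, the sketch below attaches an hourly weather record to each fieldnote by matching on the date and hour of its time tag. All field names, timestamps, and weather values are invented for the example and do not reflect the study's data or the EthnoPlatform's actual export format.

```python
from datetime import datetime

# Hypothetical fieldnote records; the keys loosely mirror the tags in the text.
fieldnotes = [
    {"name": "R1", "time": "2022-06-17 10:15", "location": "main stage"},
    {"name": "R2", "time": "2022-06-17 14:40", "location": "harbor"},
    {"name": "R1", "time": "2022-06-17 20:05", "location": "harbor"},
]

# Hypothetical hourly weather observations keyed by "YYYY-MM-DD HH".
weather = {
    "2022-06-17 10": {"temp_c": 17, "rain": False},
    "2022-06-17 14": {"temp_c": 21, "rain": True},
}


def hour_key(timestamp):
    """Truncate a 'YYYY-MM-DD HH:MM' timestamp to its date and hour."""
    parsed = datetime.strptime(timestamp, "%Y-%m-%d %H:%M")
    return parsed.strftime("%Y-%m-%d %H")


def enrich(notes, hourly):
    """Attach the matching hourly record to each note (None if no match)."""
    return [
        {**note, "weather": hourly.get(hour_key(note["time"]))}
        for note in notes
    ]


enriched = enrich(fieldnotes, weather)
```

Notes without a matching record keep a `None` entry rather than being dropped, so the ethnographic material stays complete even where the external data source has gaps.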
Processability—Exploring Fieldnotes through Topical Mapping
The structured format of our fieldnotes also allows us to make use of more advanced computational techniques to explore the textual content in more depth. Several techniques and methods from the data science fields of NLP and Machine Learning (ML) are particularly well suited for the analysis of ethnographic data, as they allow for statistical exploitation of language structure in search of potentially meaningful semantic patterns (Evans & Aceves, 2016). One such technique is topic modelling, a statistical method for identifying topics in large text corpora (Blei, 2012). Topic models come in many varieties, and they usually involve treating topics as latent distributions over the words in the text, estimated by optimizing the internal coherence within each topic and minimizing the overlap with the other topics in the text data (ibid.). This means that the model will generate a range of topics that are all, to put it simply, lists of words and associated weights. Provided that the researcher is equipped with sufficient context and domain knowledge, these lists can then be interpreted as topics in a text corpus.
Qualitatively oriented social scientists have used these types of computational methods for finding thematic relations in ethnographic material (Fischer & Ember, 2018); for augmenting the coding of archival data (Nelson, 2020); for generating hypotheses from interview data (Karlgren et al., 2020); and more generally for explorative purposes in keeping with the grounded or “abductive” nature of much anthropology and sociology (Brandt & Timmermans, 2021). Having fieldnotes in a tabular format exported as a CSV file makes it easy to load and process them with programming languages such as Python or R and then apply ML or NLP techniques to the data. Here, the tags containing descriptive accounts from the field are of key interest. However, other tags such as “Place” and “Time” can, if they have been applied, also be utilized to find textual patterns across time and space.
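The following sketch shows, under stated assumptions, how such a CSV export might be read and grouped by a tag in Python using only the standard library. The column names mirror tags mentioned in the text, but the actual export format of the tool may differ, and the three rows are invented for illustration.

```python
import csv
import io
from collections import defaultdict

# Hypothetical export; in practice this would be a file on disk.
csv_text = """Name,Time,Place,Field observation
R1,10:15,main stage,The audience turns toward the noise from the next tent
R2,10:40,harbor,Small groups stand still and discuss a debate
R1,14:05,main stage,People leave before the debate ends
"""


def load_fieldnotes(handle):
    """Read fieldnotes from an open CSV file handle into a list of dicts."""
    return list(csv.DictReader(handle))


def group_by_tag(notes, tag):
    """Group the free-text observations by the value of a given tag."""
    groups = defaultdict(list)
    for note in notes:
        groups[note[tag]].append(note["Field observation"])
    return dict(groups)


notes = load_fieldnotes(io.StringIO(csv_text))
by_place = group_by_tag(notes, "Place")
```

Grouping by “Place” or “Time” in this way yields sub-corpora that can be passed separately to a text analysis step, which is one simple route to comparing textual patterns across space and time.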
In our own analysis, we decided to apply a hierarchical stochastic block model (HSBM) (Gerlach et al., 2018) to the fieldnote text data in order to guide our exploratory search for overall patterns. In essence, HSBM topic models use co-occurrence information among words and documents to find words in the corpus of fieldnotes that typically belong together (ibid.). We chose an HSBM model because it has the desirable property that it also attempts to infer the number of topics in the data. Unlike other popular topic models such as LDA, it is thus not dependent on the researchers predefining the number of topics that the model should look for (Blei, 2012; Gerlach et al., 2018; see also Carlsen & Ralund, 2022).
Our initial model suggested that there were 31 topics in the data. However, after examining the words contained in each of the generated topics, as well as re-reading the fieldnotes related to the different topics, we found that five of the suggested topics represented the most relevant thematic topics in our data. In Figure 5, we present the five topics as generated by the HSBM model. The bar for each word in a chart indicates the importance weight of that word for the overall topic. For instance, the topic we have named Atmosphere comprises words like “event,” “atmosphere,” “exciting,” and other words that in both direct and indirect ways denote and revolve around the atmosphere at events and in groups. Debates on Stage is about what happens during and around debates. Crowd Reactions is centered on the reactions of the audience. Methodological Considerations differs by being more about the methodological reflections that ethnographers had in the field. The last theme, Planning the People’s Meeting, is about the process of planning the festival; here the word “Lars” is the name of our key contact person in the organizing team. Each of our fieldnotes can then be labeled with the topic(s) that it contains by using the weights generated by the model. We assign a topic if our model predicts that it is 50% or more likely that the fieldnote contains that topic.
Figure 5. Most important words for topics found using HSBM topic modelling.
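The labeling rule described above can be sketched as a simple threshold on per-fieldnote topic scores. The scores below are invented, and the sketch assumes that each score lies between 0 and 1 and is evaluated independently per topic; how the scores were derived from the HSBM output is not specified here.

```python
THRESHOLD = 0.5  # the 50% cut-off described in the text

# Hypothetical topic scores per fieldnote; real values would come from
# the fitted topic model.
topic_scores = {
    "note_1": {"Debates on Stage": 0.62, "Crowd Reactions": 0.55, "Atmosphere": 0.10},
    "note_2": {"Atmosphere": 0.81, "Crowd Reactions": 0.30},
    "note_3": {"Methodological Considerations": 0.45, "Atmosphere": 0.40},
}


def assign_topics(scores, threshold=THRESHOLD):
    """Label each note with every topic whose score meets the threshold."""
    return {
        note: sorted(topic for topic, score in topics.items() if score >= threshold)
        for note, topics in scores.items()
    }


assigned = assign_topics(topic_scores)
```

Under this rule a fieldnote can carry several topics or none at all; in the example, `note_3` remains unlabeled because no score reaches the cut-off.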
Moving on in the exploration of topics in our data, we can use tools from network science to visualize and examine the relations between the topics and fieldnotes. In Figure 6, we have constructed what we call a topical mapping of the fieldnotes. This is a so-called bipartite network visualization, in which we draw two types of nodes (points in the network): the green nodes represent fieldnotes, and the other colored nodes represent topics. Each of the topic nodes has been annotated with its topic name, and the fieldnote nodes have been annotated with their “Situation” tag. Likewise, the size of each fieldnote node corresponds to the length of the text from the “Field observation” tag of that fieldnote. Having visualized the nodes, we can then draw an edge (line) between a topic and a fieldnote if the fieldnote has been assigned that topic. In this way, our topical mapping visualizes the relations between each fieldnote and the topics it covers.
Figure 6. Topical mapping of fieldnotes.
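The construction of such a bipartite network can be sketched with plain Python data structures, as below. The fieldnotes, their identifiers, and the attribute names (`kind`, `label`, `size`) are hypothetical, and the sketch only builds the nodes and edges; drawing and laying out the network would require a separate visualization library.

```python
# Hypothetical fieldnotes with a "Situation" tag, an observation text,
# and the topics assigned to each note.
fieldnotes = [
    {"id": "n1", "situation": "debate",
     "text": "The crowd claps after the answer.",
     "topics": ["Debates on Stage", "Crowd Reactions"]},
    {"id": "n2", "situation": "break",
     "text": "Relaxed mood by the food stalls.",
     "topics": ["Atmosphere"]},
    {"id": "n3", "situation": "walk",
     "text": "Short note.",
     "topics": []},
]


def build_bipartite(notes):
    """Return (nodes, edges) for a fieldnote-topic bipartite network."""
    nodes = {}
    edges = []
    for note in notes:
        # Fieldnote nodes are labeled by situation and sized by text length.
        nodes[note["id"]] = {
            "kind": "fieldnote",
            "label": note["situation"],
            "size": len(note["text"]),
        }
        for topic in note["topics"]:
            nodes.setdefault(topic, {"kind": "topic", "label": topic})
            # An edge always links one fieldnote to one topic.
            edges.append((note["id"], topic))
    return nodes, edges


nodes, edges = build_bipartite(fieldnotes)
```

Because edges only ever connect a fieldnote to a topic, never two nodes of the same type, the resulting structure is bipartite by construction.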
In this mapping, the location of nodes indicates topical similarity. The closeness of the “Debates on Stage” and “Crowd Reactions” topics, for example, indicates that the two topics are often co-present in our fieldnotes. The fieldnotes located between the topics appear to mainly describe situations related to political debates. Conversely, the topic “Atmosphere” seems to appear more prominently with reference to specific situations in between debates. We can also see that fieldnotes related to the topic “Planning the Festival” are less connected to the other topics and thus seem to be a very separate kind of note. When we look closer at the individual notes and make use of their “Time” tags, it appears that they were written just before the festival opened to visitors. From the topical mapping, we can only see an overall thematic pattern in our fieldnotes. However, this pattern can provide a basis for a topically focused re-reading of the fieldnotes with a new analytical view.
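One simple way to check a visual impression of closeness between topics is to count how often two topics are assigned to the same fieldnote. The sketch below does this for invented assignments; it is a proxy for co-presence only and does not reproduce the layout algorithm behind the figure, which is not specified here.

```python
from collections import Counter
from itertools import combinations

# Hypothetical topic assignments per fieldnote.
assignments = {
    "n1": ["Debates on Stage", "Crowd Reactions"],
    "n2": ["Crowd Reactions", "Debates on Stage"],
    "n3": ["Atmosphere"],
    "n4": ["Atmosphere", "Crowd Reactions"],
    "n5": ["Planning the People's Meeting"],
}


def topic_cooccurrence(assigned):
    """Count the fieldnotes in which each pair of topics appears together."""
    pairs = Counter()
    for topics in assigned.values():
        # Sorting makes each pair order-independent, e.g. (A, B) == (B, A).
        for pair in combinations(sorted(set(topics)), 2):
            pairs[pair] += 1
    return pairs


pairs = topic_cooccurrence(assignments)
```

A topic that never shares a fieldnote with any other topic, like the planning topic in this toy example, does not appear in any pair, which corresponds to an isolated cluster in the network.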
Conclusion
Producing and processing fieldnotes do not need to be a lonely and low-tech endeavor. As we have shown in this article, significant gains can be reaped from using digital formats to make ethnographic fieldnotes more sharable between researchers and better suited for computational text analysis and other mixed methods approaches. Indeed, the EthnoPlatform is especially suited for interdisciplinary teams with multiple data sources and members with a mix of qualitative and quantitative backgrounds. The possibility of using computational techniques for pattern discovery can help mitigate the biases of individual researchers, as well as open up what Abramson et al. (2018) have described as the black box of idiosyncratic choices made by qualitative researchers when producing and processing their data. However, the tool can also be useful for individual researchers in the context of more open-ended long-term fieldwork. In this case, the researcher could use fewer pre-defined tags to retain as much freedom as possible, but still gain the benefits of the resulting structured corpus.
As we have demonstrated using our attention study as an example, the tag-based data architecture of the EthnoPlatform allows ethnographers to log their fieldnotes in a structured format. This not only increases organizability through sorting and searching and improves integrability through merging the fieldnotes with other data sources, but also opens the fieldnotes up as sources of computational analysis in their own right. As has been suggested in recent scholarship at the interface between mixed methods and data science, quantitative analysis of qualitative data can thus be used not only to validate qualitative findings (Abramson et al., 2018; Grigoropoulou & Small, 2022; Maxwell, 2010) but also to automate labor-intensive manual coding procedures (Marathe & Toyama, 2018). In many ways, the present article follows in the footsteps of, and seeks to contribute to, this important nascent literature. For example, in our topical mapping of fieldnotes from the People’s Meeting, we made several interpretative steps to infer meaning from the network. These steps derived from a qualitative reading of the data; however, the quantitative nature of the topical mapping made these steps transparent and translatable among ourselves as well as to others. As a quantitatively based form of ethnographic data exploration, the topical mapping thus provided an overview of the data and revealed patterns that would have been hard to find via traditional qualitative means. As such, we again see how the EthnoPlatform, as a tool for the digitization of fieldnotes, allows for not just deductive but also inductive/abductive strategies for the collection, processing, and analysis of ethnographic data.
This new form of topical modelling of fieldnotes, which we call topical mapping, shows significant promise not just for anthropological and sociological research but also for other fields of social scientific and digital humanities research. Indeed, it holds the potential to serve both as an instrument for analytical insights in itself and as a springboard for further empirical work. One could, for instance, explore the nature and intersections of topics by doing deep qualitative interpretations of selected fieldnotes or constellations of fieldnotes, or one might imagine that topical mappings could serve as codes in a qualitative coding protocol. Much as we saw in the two earlier sections on data organizing and data integration, the use of network visualizations and NLP (including topic models) can open up new, unexplored paths and patterns in ethnographic analysis that are difficult to find via existing ways of processing fieldnotes, whether manual or automated.
Despite these notable gains in terms of efficiency, transparency, and veracity, the EthnoPlatform of course has limitations. In particular, these include the inevitable trade-offs between rigor and flexibility arising from using tags to structure ethnographic fieldnotes, and, more generally, the significant methodological and epistemological pitfalls arising from any (over-)reliance on automated techniques for quantitative text analysis (Carlsen & Ralund, 2022). Even with the recent introduction of sophisticated unsupervised machine learning methods into the social scientific study of text data (Enggaard et al., 2023; Kozlowski et al., 2019; Milbauer et al., 2021), any attempt to make use of the EthnoPlatform’s significant potential for computational processing and modelling must be combined with a strong and systematic qualitative component. This spans all the way from the pre-processing to the visualization stage, including in-depth reading and continuous revisiting of selected raw fieldnotes. Ideally, such a systematic complementarity (Blok & Pedersen, 2014) between qualitative and quantitative data, methods, and approaches will allow for an iterative process of zooming in on and out from the microdetails of fieldwork to obtain a bigger and thicker understanding of such ethnographic materials. Indeed, as we hope to have shown, this might be particularly beneficial in interdisciplinary research projects with a core ethnographic component.
Footnotes
Acknowledgments
We thank Clara Rosa Sandbye and Hjalmar Carlsen for their academic assistance and for contributing invaluable insights and suggestions to the project. We are also grateful to all members of the DISTRACT Team, especially Malene Jespersen and Thyge Enggard, for participating in continuous discussions about collaborative ethnography and computational processing of fieldnotes. Special thanks also go to Asger Thomsen, Sofie Grave, Annika Isfeldt, and Maya Møller-Jensen for practical and administrative assistance with carrying out the project, and to Tobias Gårdhus for contributing suggestions to the first version of the EthnoPlatform. The article draws on ethnographic fieldnotes collected by students with training in anthropology: Asta Finke, Sofie Hulgaard, Molise Nørskov, Janus Høm, Karoline Uldbjerg, Johannes Jøhncke, Martin Lauritzen, Emil Godkin, Katrine Milde, Berthel Böttcher, Yasmin Touzani, Magnus Nørtoft, and Jeppe Hansen, whose help we are very thankful for. Last but not least, we thank the organization The People’s Meeting (Foreningen Folkemødet), especially Lars Rømer, for a constructive and worthwhile collaboration.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The overarching research project on political attention was supported by the H2020 European Research Council (grant number 834540) as part of the project: ‘The Political Economy of Distraction in Digitized Denmark’ (DISTRACT). The University of Copenhagen’s Data-Plus Program funded our pilot study, which included the development and the testing of the EthnoPlatform.
