Abstract
In their ‘Critical Questions for Big Data’, danah boyd and Kate Crawford warn: ‘Taken out of context, Big Data loses its meaning’. In this short commentary, I contextualize this claim about context. The idea that context is crucial to meaning is shared across a wide range of disciplines, including the field of ‘context-aware’ recommender systems. These personalization systems attempt to take a user’s context into account in order to make better, more useful, more meaningful recommendations. How are we to square boyd and Crawford’s warning with the growth of big data applications that are centrally concerned with something they call ‘context’? I suggest that the importance of context is uncontroversial; the controversy lies in determining what context is. Drawing on the work of cultural and linguistic anthropologists, I argue that context is constructed by the methods used to apprehend it. For the developers of ‘context-aware’ recommender systems, context is typically operationalized as a set of sensor readings associated with a user’s activity. For critics like boyd and Crawford, context is that unquantified remainder that haunts mathematical models, making numbers that appear to be identical actually different from each other. These understandings of context seem to be incompatible, and their variability points to the importance of identifying and studying ‘context cultures’ – ways of producing context that vary in goals and techniques, but which agree that context is key to data’s significance. To do otherwise would be to take these contextualizations out of context.
In response to boyd and Crawford’s ‘Critical Questions for Big Data’, Provocation 4: ‘Taken out of context, Big Data loses its meaning’.
I am sitting in a room, different from the one you are in now. It is 10:43 pm on a Tuesday, and I am on my phone using Songza, a context-aware music recommendation service. Recognizing that it is Tuesday night, Songza tells me so and suggests some activities I might want to accompany with music: ‘Unwinding’, ‘Bedtime’, and ‘Staying Up All Night’, among others. I pick the barely accurate ‘Summer Break’ and the app gives me some more options: do I want music for ‘Throwing a Rager’, ‘Blowing Off Your Curfew’, or ‘Hooking Up With Your Summer Fling’? Typing on the couch in my living room after a day of fieldwork, I pick the relatively tame ‘Having the Best Summer Ever: ’90s Edition’, and Songza informs me: ‘This summer’s going to be tha bomb dot com [sic]. Pick a playlist’. One tap and a few seconds later:
Yo, I’ll tell you what I want
What I really, really want
So tell me what you want
What you really, really want
I am sitting in a room listening to the Spice Girls’ ‘Wannabe’ and having the best summer ever: ’90s edition.
Context is king
Among the developers of commercial music recommender systems, it is now popular to suggest that what listeners really, really want depends on their context. If you are sitting on the couch, you might want to listen to something different from when you are working out at the gym. Sunny days call for different music from rainy nights, and we want different soundtracks for parties than for quiet dinners at home. ‘Context is king’, Songza CEO Elias Roman told CNET, as his company was bought by Google for a reported US $39 million (Solsman, 2014). Two years earlier, a post on the music technology blog Hypebot suggested that we would soon see the rise of ‘context culture’, a ‘freaky, but inevitable’ trend toward pervasive data collection used for hyper-contextual personalization (Hoffman, 2012). ‘If the first streaming music revolution was about access’, Eliot Van Buskirk (2012) wrote on Evolver.fm, ‘the second one is about context’.
Academic researchers in human–computer interaction have long anticipated the emergence of ‘context-aware computing’ (Schilit et al., 1994), in which software adapts to the situations in which it is used. Across subfields like ubiquitous computing and recommender systems research, methods for defining, apprehending, and using ‘context’ are active areas of study (Adomavicius and Tuzhilin, 2011; Anand and Mobasher, 2007; Cooke et al., 2002; Dey, 2001; Lathia, 2014; Panniello et al., 2014). Although current commercial applications tend to focus on information about time and location, the spread of sensor-packed smartphones and the ‘internet of things’ are expected to provide even more contextual signals that might be used for personalization: What is the ambient noise level, according to your cell phone’s microphone? Are you running, according to the built-in accelerometer? Have you had your coffee yet, according to your smart coffeemaker? How many messages are sitting unread in your inbox, according to your email provider’s API? How tense are you, according to your fitness tracker’s skin conductance meter? The accumulation and correlation of these implicit signals is one of big data’s distinctive features: massive databases can store not only your listening or browsing history, but also the readings of various sensors associated with it.
Personalized recommender systems were once pitched as a fine-grained improvement on coarse demographic targeting, allowing software to cater to the preferences of individuals and emergent groups with shared tastes (Riedl and Konstan, 2002; Seaver, 2012). If those systems reified neoliberal, desiring individuals, the new generation of contextual recommenders appeals to the partible person: your likes, and maybe even your identity, may vary according to the situations you find yourself in. Given these trends in recommender research and development, we are in for a future where data mining concerns itself increasingly with the determination of context, drawing on a range of signals to personalize more precisely than the unified ‘person’.
Context is key
In their ‘Critical Questions for Big Data’, danah boyd and Kate Crawford (2012) provoke: ‘Taken out of context, Big Data loses its meaning’ (p. 9). They argue that, through aggregation and mathematical modeling, big data analytics tend to strip data of their contexts. How should we square this provocation with the ‘contextual revolution’ in recommender systems, which has seen big data practitioners fixate on context’s significance? Through the rest of this essay, I investigate possible answers to this question, all of which turn on the question of how ‘context’ is defined and interpreted. To understand how context can be simultaneously missing from data science and central to it, we’ll need to put ‘context’ in context.
The idea that meaning crucially depends on context is shared across a wide range of academic disciplines. In analytic philosophy, Gottlob Frege (1980 [1884]) directs us ‘never to ask for the meaning of a word in isolation, but only in the context of a proposition’ (p. xxii). In cultural studies, Ien Ang (1996) argues for ‘radical contextualism’, or ‘the impossibility of determining any social or textual meaning outside of the complex situation in which it is produced’ (p. 61). We can find more arguments for context’s importance in ethology (Von Uexküll, 1957 [1934]), linguistic pragmatics (Grice, 1989; see Morgan, 1977), feminist philosophy of science (Haraway, 1988; Harding, 1986), the sociology of science (Bloor, 1976), and epigenetics (Morgan et al., 1999).
The injunction to consider context is perhaps nowhere more central than in anthropology, where placing practices, beliefs, and language in context has long been a primary disciplinary mission. Bronisław Malinowski (1923) argued for the importance of context in his essay on ‘The problem of meaning in primitive languages’: ‘Language is essentially rooted in the reality of the culture, the tribal life and customs of a people, and […] it cannot be explained without constant reference to these broader contexts of verbal utterance’ (p. 305). Earlier, Franz Boas made a similar argument for reorganizing ethnological museums: rather than arranging artifacts in putative evolutionary or functional schemes (all the fishhooks of the world together, in order of complexity), he argued they should be placed with other objects from their ‘culture area’, to be seen in context (Melanesian fishhooks alongside Melanesian baskets, knives, and canoes; Boas, 1887; see Stocking, 1982). In the ancestral mythology of the discipline, this attention to the specificities of context represented a dramatic turn from the earlier ‘armchair ethnology’ exemplified by James Frazer’s (1994 [1890]) The Golden Bough, which plucked ethnographic data from their contexts around the world and bundled them together in the service of overarching anthropological insights (Strathern, 1987).
The centrality of context to anthropology found new, extremely popular expression in Clifford Geertz’s (1973) argument for ‘thick description’. He borrowed the terminology from the philosopher Gilbert Ryle to name descriptions that took sociocultural context into account as opposed to ‘thin descriptions’ that did not. Under the sign of Geertz, thick description and the placing of things into context remains a guiding principle of ethnographic work across disciplines, and the rise of ‘context’ as a matter of concern in domains like media studies and science and technology studies has been accompanied by a turn to ethnographic methods for apprehending it (Schlecker and Hirsch, 2001). Surveying this work, we might expand on boyd and Crawford’s provocation to say: Taken out of context, everything loses its meaning.
Context is questioned
In their own argument for the importance of context, boyd and Crawford draw on this anthropological tradition to suggest that quantitative measures need to be understood in the contexts from which they are drawn. Take, as they do, the example of social network analysis, a roughly 70-year-old field that has exploded in popularity with the growth of large datasets explicitly understood as ‘social’ and ‘networked’. These datasets offer a variety of proxies one could use to calculate standard social network measures like the strength of a tie between two people. We might use the frequency of interaction on Facebook, the number of emails or phone calls, or the amount of time cell phones are in the same location to measure tie strength.
However, boyd and Crawford argue, these indications pull the notion of a ‘tie’ out of context, and as a result do not always measure what they purport to: someone may friend an interesting stranger on Facebook but not their parents; my phone may spend every workday a few feet from a stranger’s phone through the wall. Without the context necessary to make sense of these signals – an especially daunting task for massive datasets – researchers risk error or operationalism, mistaking their measures for phenomena of interest. Tie strength and many other social concepts are not simply reducible to transactional data and mathematical models, but are rather ‘a subtle reckoning in how people understand and value their relationships with other people’ (boyd and Crawford, 2012: 10).
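To make the worry concrete, here is a minimal sketch in Python. It assumes a hypothetical proxy that counts the hours two phones spend in the same place; the function name, the place labels, and the sample logs are all invented for illustration and describe no real system. It shows how very different relationships can yield the same number, and how a close tie can yield none.

```python
def colocation_hours(log_a, log_b):
    """Hypothetical tie-strength proxy: hours in which two phones report the same place."""
    # Each log maps an hour label to a coarse place label (invented sample format).
    return sum(1 for hour, place in log_a.items() if log_b.get(hour) == place)


# Invented sample data: one workday morning, located only to the level of a building.
me = {"Mon 09": "building 12", "Mon 10": "building 12", "Mon 11": "building 12"}
colleague = {"Mon 09": "building 12", "Mon 10": "building 12", "Mon 11": "building 12"}
stranger_through_the_wall = {"Mon 09": "building 12", "Mon 10": "building 12", "Mon 11": "building 12"}
parent = {"Mon 09": "another city", "Mon 10": "another city", "Mon 11": "another city"}

colleague_score = colocation_hours(me, colleague)
stranger_score = colocation_hours(me, stranger_through_the_wall)
parent_score = colocation_hours(me, parent)

print(colleague_score, stranger_score, parent_score)
```

Under these assumptions the colleague and the stranger receive identical scores, while the parent receives none. Nothing in the measure itself distinguishes the cases; that work falls to whatever the analyst already knows about the setting.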
In an earlier version of their paper, boyd and Crawford (2011) gave a different heading for their provocation on context: ‘Not all data are equivalent’. This original title hints at the definition of ‘context’ they have in mind: although quantitative measures encourage us to locate numerical equivalencies, these numbers often arise in notably different settings. Context is that unquantified remainder that haunts mathematical models, making numbers that appear to be identical actually different from each other.
The aggregation of signals from a smartphone’s sensors seems definitionally incapable of accounting for this remainder. One might even go so far as to say that what Songza et al. have concerned themselves with is not really context at all, since they know nothing beyond the iPhone’s narrow, quantitative umwelt. This, then, is the disagreement: what the developers of recommender systems see as contextualizing, a critic in the vein of boyd and Crawford would see as deracinating. Is the location of your smartphone a context, or is it data in need of contextualization?
Context is constructed
In ‘What We Talk About When We Talk About Context’, Paul Dourish (2004) describes these two attitudes toward context as ‘representational’ and ‘interactional’. In the representational mode, context is considered to be a stable container for activity: one’s context can be described as an accumulation of data points such as location, weather, the people nearby, or the time of day. This attitude toward context is compatible with a positivist epistemology that extends well beyond the social sciences, and Dourish argues that it is the prevailing attitude within most computing research. An alternate, interactional mode, Dourish suggests, is a legacy of phenomenology: contexts are not containers, but rather relational properties occasioned through activity. This attitude toward context is shared by linguistic anthropologists (e.g. Duranti and Goodwin, 1992) who study how, in conversation, speakers invoke certain things as context, how these contexts are contested, and how they shift dynamically over the course of interaction. For the interactionalist, context is not just there waiting to be characterized or quantified, but is rather a localized achievement, irreducible to a collection of sensor data.
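The representational attitude can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of how Songza or any other service works: the `ContextSnapshot` record, its fields, and the rule table are hypothetical, and only the three night-time activity labels are taken from the example that opens this essay.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContextSnapshot:
    """Hypothetical 'representational' context: a fixed bundle of readings."""
    day: str
    hour: int        # 0-23, local time
    location: str
    moving: bool     # e.g. inferred from an accelerometer


def suggest_activities(ctx):
    """Toy rule table mapping a snapshot to activity labels (invented rules)."""
    if ctx.moving:
        return ["Working Out"]
    if ctx.hour >= 22 or ctx.hour < 5:
        return ["Unwinding", "Bedtime", "Staying Up All Night"]
    return ["Working", "Commuting"]


tuesday_night = ContextSnapshot(day="Tuesday", hour=22, location="living room", moving=False)
suggestions = suggest_activities(tuesday_night)
print(suggestions)
```

Two snapshots with the same readings compare as equal in this sketch, and so receive the same suggestions. That equivalence is exactly what the interactional view disputes: on that account, no additional field in the record would capture what the participants themselves treat as context.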
Dourish argues that these two modes of understanding context are incompatible with each other: social science critiques of context usually refer to the interactional mode, while attempts to account for context in computing systems tend to the representational. We can spot such an incompatibility between contextual recommender systems and boyd and Crawford’s provocation: There is, it seems, no signal that could be added to the computational model to answer the criticism that it takes things out of context. ‘Essentially’, Dourish (2004) writes, ‘the sociological critique is that the kind of thing that can be modeled […] is not the kind of thing that context is’ (p. 22).
Although it is common sense that to put things in context is good and to take them out of context is bad, this simple take overshadows the fact that often our disagreements lie precisely in determining what context is. I want to suggest that the way ‘context’ is coming to be used in personalization systems – and the anxiety it provokes in qualitatively minded critics – is instructive for thinking about context’s diverse meanings and explanatory shortcomings. Rather than taking contextual recommendation as an error to be corrected, we might use it as a case for examining contextualization as a practice in its own right. The ‘contextual revolution’ in recommendation provides an opportunity to investigate how ‘context’ is differently imagined and managed by different groups of people. This raises the vertiginous possibility that context’s constructions may themselves be contextually contingent.
Context is contested
These problems have vexed anthropological critics since at least the 1980s, as they have attempted to clarify and pursue the mission of an anthropology that understands human life in context. ‘Interpretation in context’, Roy Dilley (1999) writes in The Problem of Context, ‘requires the pre-interpretation of the relevant context, that in turn informs the subsequent interpretation’ (p. 15). As Dilley points out, this is a classic problem of hermeneutics and it points to two key issues: Because the relevant context is not self-evident, establishing context is a necessarily political project, and it is also bound up with questions of method.
To elaborate, let’s say we wanted to contextualize some data about a social tie: I have interacted with your Facebook page five times this week. Once we start looking for context, it begins to accumulate: my interactions can be located in social context (we are colleagues), cultural context (who find meaning in online interaction), economic context (though it is contoured by the demands of venture capital), geological context (in the Anthropocene), biological context (where human actions threaten to trigger a global extinction event), and so on, ad infinitum. Any of these contexts could be ‘thickened’ in a number of ways, mixed with the others, and extended across scales. Because there is no objective limit to what can be invoked as context, an exhaustive catalog of context is impossible. As a result, such an analysis inevitably involves choices about what counts and where the work of contextualization should cease (Strathern, 1996).
These choices do not derive simply from the decision to consider context, but rather from the contexts and goals of the contextualizer: software engineers pursue different contextual goals than anthropologists do, as do office managers, activists, and venture capitalists. A mixture of choice, necessity, pragmatism, and unquestioned ‘home truth’, these definitions of context are closely tied to the methods used to apprehend it. Smartphones packed with sensors produce context in one way, while ethnographers with their field notes and participant-observation produce it in another. Though often presented otherwise by representationalists and interactionalists alike, these contexts ‘are not self-evident aspects of reality that are pre-given or to be taken-for-granted, in the sense of being understood as existing prior to analysis. They are part of the analysis and interpretation itself’ (Dilley, 2002: 449).
In the essay from which I borrow my title’s syntax, Marilyn Strathern (1995) describes how an anthropologically inflected understanding of ‘culture’ has spread around the world, just as anthropologists have sought to abandon the term for its totalizing and essentializing tendencies. ‘Context’ has traveled similarly, from common usage into specialist vocabulary and then back out into diverse settings. To argue for context without recognizing that we are arguing about context is to beg the question. Borrowing and pluralizing the phrase that Hypebot used to hype the rise of contextual recommendation, we might instead take as our object the diversity of ‘context cultures’, looking not just for their incompatibilities but also for the ways they necessarily interact in the invoked contexts of everyday life. Context is not the unique domain of thick describers – everybody has it, or does it, and they do it differently.
This does not mean that we need to accept any definition of context as valid or to abandon our own contextualizing projects. As corporations turn their data mining attention to context, they have the power to impose and normalize certain modes of contextualization at the expense of others. Critical engagement with these efforts is as important as ever. Our task, however, is not only to relocate mathematical models in the social world, but also to examine how the practices of big data themselves produce context in various and particular ways. We should remember that often our disputes are not about context’s merits, but about how it should be made.
Acknowledgements
I thank Taylor Nelms, Bill Maurer, and Ellie Harmon for their recontextualizations.
Funding
This research was supported by the National Science Foundation (Dissertation Research Improvement Grant 1323834), The Wenner-Gren Foundation for Anthropological Research (Dissertation Fieldwork Grant 8797), and the Intel Science and Technology Center for Social Computing.
