Abstract
This article reimagines the quantified self within the context of Black feminist technologies. Bringing computation and autoethnographic methods together using a methodology I call computational digital autoethnography, I harvest my social media data to create a corpus for analysis. I apply topic modeling to these data to uncover themes that are connected with broader societal issues affecting African American women. Applying a computational autoethnographic approach to a researcher’s own digitized data allows for yet another dimension of mixed-methods research. This radical intervention has the potential to transform the social sciences by bringing together two seemingly divergent methodological approaches in service to Black feminist ways of knowing.
Introduction
The use of social media, particularly among African Americans, has exploded within the last decade (Lenhart, Purcell, Smith, & Zickuhr, 2010). With fewer barriers to entry, activists, scholars, and everyday people have begun incorporating these communication technologies as extensions of their social and cultural selves. For sociologists, particularly African American women, engagement with social media is a double-edged sword. On one hand, it has become a way of extending pedagogy, sharing scholarship, and connecting with others through shared experiences (Mehra, Merkel, & Peterson, 2004). On the other hand, we have also seen technology used to oppress and silence Black women’s voices (Houston & Kramarae, 1991). Within the last year, Drs. Saida Grundy and Zandria Robinson, both African American women sociologists, came under public attack for their social media commentary (McClain, 2015). Despite the vulnerability it creates, technology continues to provide opportunities for disruption and liberation. The objective of this article is to introduce a methodological approach that reframes computation in service to Black feminist ways of knowing. I am particularly interested in expanding metanarratives around methods and troubling the conventional qualitative/quantitative dichotomy.
Using social media and computational analysis, this article reimagines the quantified self 1 within the context of Black feminist technologies. Using an autoethnographic approach, I harvest my personal social media data to create a corpus for analysis. I apply statistical modeling (Latent Dirichlet allocation or LDA) to these data to uncover themes that are then connected with broader societal issues affecting African American women. Applying a computational autoethnographic approach to a researcher’s own digitized record allows for yet another dimension of mixed-methods research. In an environment in which Black women find their digital selves attacked and distorted, I argue that the methodological cyborg, which is a research method that combines computation and reflexive human interventions, is a proactive and disruptive step toward standing “upright in a crooked room” 2 (Brown, Carducci, & Kuby, 2014). Although the use of both computational modeling and digital ethnography is well documented (Coenen, 2011), to my knowledge, this unapologetic methodological approach has never been done for autoethnographic research. This intervention has the potential to transform the social sciences by bringing together two seemingly divergent approaches in service to Black feminist ways of knowing (Collins, 1990; Cooper, 2015).
The Digital Is Political: On Knowledge, Legitimacy, Bias, and False Objectivity
Every abstraction is based on some preexisting knowledge or assertion of knowledge (Montangero & Maurice-Naville, 1997). Abstractions serve to expand upon existing assumptions about knowledge, thus, in the process, transforming existing knowledge into something new as well as reproducing knowledge that is old. Because computation is the automation of abstractions via algorithms, the types of computational abstractions created by algorithms to uncover new knowledge must be interrogated as something other than objective. We must interrogate algorithms’ relationships to preexisting knowledge and ideas through reflective abstraction. How an algorithm operates and constructs can determine its relationship to preexisting knowledge as well as its relationship to the knowledge creation process. For example, how an algorithm abstracts relevance can uncover biases present within that algorithm (Noble, 2013; Sweeney, 2013).
Algorithms are power and control technologies (Deleuze, 1992) that can create controls that resemble publics (Gillespie, 2014), thus exposing assumptions and potential stereotypes perpetuated by algorithms (see Ananny, 2011). Software studies explore how software functions as a sociotechnical actor that influences the practices and experiences of Internet users. Within the context of media, including social media, we understand this device to be preoccupied with visibility, with seeing, sensing, and gaining access. Gillespie (2014) contends that algorithms are a form of power that constructs legitimacy and are essential to visibility and public discourse. How something or someone is made visible, and chosen to be made visible, signals power. Consider Facebook’s EdgeRank algorithm to understand the mediated and constructed visibility of news feeds (Bucher, 2012), specifically “how the world is captured in code in terms of algorithmic potential” (Dodge, 2010, p. 15).
To understand how visibility is constructed and through which measures, one may use the algorithm’s output to expose how the algorithm operates. The algorithm’s architecture helps us to “see” power. Not only do the functionalities (and limitations) of algorithms expose how power in decisioning is being organized within the social world, but, beyond the technical, algorithms also promote a notion of “calculative objectivity” within their social ordering processes (Beer, 2016). But, because algorithms are only one of several software assemblages influencing the construction of visibility (Sandvig, Hamilton, Karahalios, & Langbort, 2014) and because new media is not constituted by a single algorithm, it is not one algorithm that should be interrogated, but rather collections of algorithms and interfaces and how they interact together (i.e., how they operate and compete) to constitute information and communication systems (Bucher, 2012; McKelvey, 2014). To further problematize the complex relationship between algorithmic assemblages and decision making, including the double articulation of LDA and Facebook’s EdgeRank algorithms, there must be a consideration of how such algorithmic assemblages mediate and influence corpora. Marciniek (2016) argues that though multiple algorithms are involved in processing text, bringing with them additional degrees of uncertainty, the algorithmic assemblages of computational text analysis reduce human interference. Specifically,
where classical forms of content analysis check themselves by comparing interpretations of different coders and thereby reach a common interpretation, computational text analysis potentially endows one single interpretation with the “objectivity” of a complex, mechanical process, hiding many of the decisions involved in its conception. Yet, it also has the potential to become more “objective” a tool by retaining some of the multiplicity incorporated in the documents. (Marciniek, 2016, p. 4)
These assemblages push forward epistemological questions related to how we understand objectivity as well as the recognition that subjectivities, either human or algorithmic, can still be very useful in understanding relations of power and visibility.
The decisions that algorithms make are decisions of representation and inclusion. “The vast majority of algorithmic media lack any similar sort of reflective apparatus, because algorithmic control operates a-semiotically and instantly. Algorithms leave little trace” (McKelvey, 2014, p. 602). Scholars are calling for a more critical, reflexive analysis of algorithms. These calls have been answered through traditional research methods such as reflexive journaling (Berg, 2014) and, I believe, can be brought to bear more deliberately through autoethnography, forming a reflexive apparatus for algorithmic media through collaborative research (experimental methods) of democratized publics (McKelvey, 2014). “Publics offer a valuable means of generating knowledge about algorithmic media” (McKelvey, 2014, p. 599). In other words, those publics that have a direct stake in issues created by algorithms can help us better understand the technology.
This project acknowledges my own insight and experiences as a legitimate source of knowledge and relies upon an informal data source to create knowledge in which I am heavily invested. (See Brown et al., 2016, for an explanation of how corpora can be interpreted as sociopolitical entities and how the incorporation of informal sources of data into analysis can reduce some forms of disciplinary bias.) The use of social media, specifically Facebook, to create a corpus of study and analysis allows for the exploration of the social self specifically for the purposes of engaging in identity work and explicitly in service to Black feminism.
Social Self Revisited
Social media platforms offer an abundant source of information related to the social self because of the ease of access as well as the speed of production of data. They allow the author to control how one’s self is presented. Using social media, specifically Facebook, to create a corpus for topic modeling analysis creates a disruptive space whereby the computational can be brought into the political realm. This type of self-quantification constitutes a methodological cyborg and puts the power in the hands of the researcher to control and interpret the cultural representation of self. In addition, using these computational methods in service to identity work allows the researcher to engage questions that are rarely explored in fields where computational methods are predominately used.
In Representation and the Text, Tierney and Lincoln (1997) discuss the implications of postmodernist interpretations of text. Namely, that text has come to represent our authentic selves to ourselves as well as others in ways that are partial perspectives. They challenge claims that text can offer objective truth. What is liberating about a Black feminist narrative is that it does not claim to bear some generalizable, universal, objective truth. Instead, the exclamation is that much can be learned and truths can be revealed from the analysis of a very specific Black woman subjectivity. As such, the analysis of this corpus does not tell the “whole story,” nor is it designed to. Instead, it tells a specific story, within a context of many interconnected stories, with a capturing and recognition of the temporal nature of my social self’s construction as well as how I come to differently understand it over time. An advantage of using Facebook to create my corpus is that it captures text in a way that allows for speaking across audiences and across selves as a textual performance of the social self (Denzin, 1997).
“We can speak in narrative voices which represent our different selves, or which may have special meaning for particular audiences” (Tierney and Lincoln 1997, p. 38). Though Tierney and Lincoln (1997) were not speaking specifically about social media, its ability to speak across audiences allows it to facilitate a more fluid expression of various selves. One’s ability to reflect and reframe hir narrative voices can be liberating when used in service to purposefully creating, owning, and controlling depictions of one’s social self. Use of computation on these various social selves benefits the analysis not because of computation’s ability to separate these selves, which it does, but rather because of its ability to detect them, thus making visible the simultaneous presence of multiple expressions of the social self. The pieces are taken apart, yet still understood and analyzed as interconnected parts of a whole. Computation, with this application, preserves the unity and wholeness of my social self while parsing out the various interconnected dimensions based on different situations and experiences. Acknowledging Black women holistically as multifaceted women with complicated human experiences is often considered a privilege not afforded because of the prevalence of controlling images (Collins, 1990). The idea of searching for and matching various selves with text to conduct reflexive analysis is not new to ethnographers (Geertz, 1991; Tierney & Lincoln, 1997). What is new is the use of computation to aid in this process.
Cultural Representation and Performance of Self
Papacharissi (2010) describes the “private sphere” as a reorganization of social space made possible through technology, which creates private spaces that exist for the purpose of inspiring public communication. Social media, specifically Facebook, allows for and encourages this traversing of public and private performances of social self. It is not just Facebook’s ability for the social self to broadcast and reproduce itself but also that participation in this private sphere constitutes the social self as it is broadcasting and reproducing itself. This form of public communication reaches beyond itself while seeking refuge in the security of the private sphere, which, in comparison with the public sphere, has been vetted and determined to be (more) safe to expose the social self.
Cyberfeminism(s) allows for the theorizing and critique of technology and its deployment, among other things, of representations of women (Daniels, 2009). From cyborg feminism (Haraway, 1991) to how technology reads and constructs gender (Balsamo, 1996), cyberfeminist imaginings of technological advancement have critical implications for how women engage and navigate power to create liberated cyberspaces. But even within these imaginings, Black women too often are absorbed, rather than included, into White middle-class feminist traditions (Fernandez & Wilding, 2003). This absorption is a violent erasure of the cyber/digital labor Black women have given toward reclaiming themselves in service to our liberation and preservation.
The use of social media allows for a performance of the social and civic self through technology. Papacharissi (2010) explains that these types of technology “enable a performative storytelling of the self, unfolding to multiple audiences and across several chronological points” (p. 136). As a cultural representation (Hall, 1997), one’s Facebook corpus serves as a way for the individual to give meaning of/to self through text (and visual media), as a somewhat more democratized expression of the constitutive power of this technological system of representation. In terms of representation of Black women, too often the “loudest” depictions are distorted, disproportionately negative, oversexualized, one-dimensional, superficial, and disempowering, with racism often serving as an organizing logic within online communities (Kendall, 1998; Noble, 2013). Therefore, the meaning-making around what it is to be a Black woman and to embody Black womanhood relies upon representation that is overly invested in the notion that there can only be a one-dimensional (negative) truth as it relates to Black women. The shift of Black women toward technology, specifically social media, creates a subversive space in which classifications of Black women are challenged, resisted, and created in our own self-image. This technology also showcases the ways in which Black women are simultaneously culturally connected and autonomous through various representations of self within complex and hostile domains.
Method
Topic Modeling
Topic modeling is one form of computational modeling that uses statistical analysis to discover abstract themes within a collection of documents and/or digitized text. Latent Dirichlet allocation topic modeling (Blei, 2012a) understands documents as “bags of words,” or unordered collections of different words and reveals the statistical likelihood of these words appearing as general themes, or topics (Blei, 2012b). Essentially, the topic model allows researchers to take very large amounts of data and condense them to topical keywords, similar to the keywords listed with the abstract of an academic journal or #hashtags grouping similar discussions of topics.
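The “bag of words” idea can be made concrete with a minimal sketch. The stop-word list, the function name `bag_of_words`, and the two example posts are invented for illustration and are not drawn from the corpus or from any particular topic modeling package.

```python
from collections import Counter
import re

# Hypothetical stop-word list for illustration; real pipelines use longer lists.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "about", "my"}

def bag_of_words(text):
    """Return an unordered word-count representation of one document."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(t for t in tokens if t not in STOP_WORDS)

# Two invented posts that use the same words in a different order.
posts = [
    "Teaching love and talking about love in class",
    "In class talking and teaching about love, love",
]
bags = [bag_of_words(p) for p in posts]

# Word order is discarded, so the two posts look identical to the model.
print(bags[0] == bags[1])
```

Because only counts survive, any meaning carried by word order is invisible to the model, which is one reason the word lists it produces need to be read back against the original posts.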
When documents are annotated according to thematically coherent terms (probability distributions over terms/words in a fixed vocabulary), algorithmic tools such as topic modeling help researchers explore large digitized archives. LDA is a mixed membership model: each document can be associated with multiple components/clusters (theta, the distribution over clusters). For grouped data (multiple topics represented in one document or, in my case, one Facebook post), a mixed membership model is more appropriate. The topic model applies hierarchical Bayesian models to grouped data. (See Bovens & Hartmann, 2003, and Kruschke, 2011, for introductions to, and the philosophical discourse and interpretation of, Bayesian-based approaches.) The goal of the algorithm is to infer corpus structure: the per-word topic assignments (Z_{d,n}), the per-document topic proportions (θ_d), and the per-corpus topic distributions (β_k) (Blei, 2012a). I used the MALLET open source software (McCallum, 2002) to run the topic modeling on my Facebook data. See the Poetics special issue “Topic Models and the Cultural Sciences” (Mohr & Bogdanov, 2013) for a variety of examples of how LDA topic modeling is utilized. I compared all theta (alpha/beta) combinations and determined that most topics of interest to my study (i.e., family, love, violence, death) were represented (as coherent topics) regardless of theta settings. I also ran the model using 10 increments of K, the number of topics (from 10 to 100), to determine which topics emerged, persisted, and/or disappeared with changes in the K parameter. See Blei’s (2012a) illustration of latent Dirichlet allocation in Figure 1.
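For readers who want the three inferred quantities in a single expression, the joint distribution of the hidden and observed variables in LDA can be written as below. The form follows Blei (2012a); the observed words w_{d,n}, the number of documents D, and the number of words per document N are symbols that the paragraph above does not introduce and are added here for completeness.

```latex
p(\beta_{1:K}, \theta_{1:D}, Z_{1:D}, w_{1:D})
  = \prod_{k=1}^{K} p(\beta_k)
    \prod_{d=1}^{D} p(\theta_d)
    \left( \prod_{n=1}^{N} p(Z_{d,n} \mid \theta_d)\,
           p(w_{d,n} \mid \beta_{1:K}, Z_{d,n}) \right)
```

Only the words are observed. Inference works backward from them to the topic assignments, the topic proportions of each post, and the topics themselves.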

Figure 1. Latent Dirichlet allocation.
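One way to check whether a topic persists across values of K is to compare top-word lists between runs. The sketch below is an assumption about how such a comparison could be done, not a description of the procedure used in this study: the Jaccard overlap measure, the 0.5 threshold, the function names, and the word lists are all hypothetical.

```python
def jaccard(a, b):
    """Overlap between two word lists, ignoring order (0 = disjoint, 1 = identical)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def persisting_topics(runs, seed_words, threshold=0.5):
    """For each K, find the topic that best matches a seed word list.

    runs: dict mapping K -> list of topics, each topic a list of top words.
    Returns a dict mapping K -> (topic_index, overlap), or None when no topic
    reaches the threshold, i.e., the theme is treated as absent at that K.
    """
    result = {}
    for k, topics in sorted(runs.items()):
        scored = [(jaccard(seed_words, words), i) for i, words in enumerate(topics)]
        best_score, best_index = max(scored)
        result[k] = (best_index, best_score) if best_score >= threshold else None
    return result

# Invented word lists standing in for model output at two values of K.
runs = {
    10: [["love", "black", "hooks", "prison"], ["coffee", "smoothie", "sugar", "drink"]],
    20: [["police", "violence", "protest", "anger"], ["love", "black", "hooks", "reading"]],
}
seed = ["love", "black", "hooks", "prison"]
result = persisting_topics(runs, seed)
print(result)
```

A match found by word overlap is only a prompt for closer reading. Whether two word lists name the same theme remains an interpretive judgment.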
Autoethnography
Autoethnography is a method of inquiry that acknowledges the explanatory value of writing, storytelling, new media, performance, art, and other expressive forms as legitimate sources of knowledge that connect the autobiographical with larger cultural, social, and political experiences (Ellis & Adams, 2014; Knowles & Cole, 2008). Autoethnographic work demands reflexivity of research and recognizes the importance of identity politics in understanding and representing social issues.
“Analytic autoethnography” is a dimension of ethnography that operationalizes abstraction and generalization by using empirical data to analyze and theorize about broader social phenomena (Anderson, 2006; Pace, 2012). Autobiographical narratives and critically reflexive writings as well as stories based on past experiences make for particularly useful data elements (Pace, 2012). The purpose of an analytic autoethnographic approach is not to come to some universal truth or reality, but to provide a specific type of flexible yet structured analysis of one’s own experiences with the intention of gaining broad social and cultural insights.
The production of digital social media data provides another space from which autoethnography, in its various forms, may pull to capture the lived experiences, stories, and reflections of individuals. Bailey (2015) in “#transform(ing) DH Writing and Research: An autoethnography of Digital Humanities and Feminist Ethics” discusses the methodological, ethical, and theoretical issues of employing collaborative autoethnography to follow contemporary Black trans women’s online networks. Bailey’s use of social media networks and collaborative consent to illuminate the experiences of Black trans women is inspired and stands in stark contrast to the guidelines and ethical considerations that are the focus among some computational social scientists (Kosinski, Matz, Gosling, Popov, & Stillwell, 2015; Lazer et al., 2009; Miller, 2011; Semaan, Faucett, Robertson, Maruyama, & Douglas, 2015; Weinberger, 2011). The transparency and care with which Bailey (2015) conducts digital research offers another conception of the unique offerings of new media.
Bringing computation and autoethnographic methods together using a method I call computational digital autoethnography, I harvest my personal social media data to create a corpus and apply the abovementioned topic modeling algorithm to uncover themes within my own digitized record. The corpus spans from 2007 to 2016 and includes all written text I authored, including posts, comments, and texts introducing pictures and/or links.
Corpus Considerations
Because of the visual nature of Facebook, I recognize there are limitations to privileging text within the analysis. There is also the potential for important data and meaning to be lost with the textualisation of data. To combat this limitation, I incorporated visual and audio analysis of pictures, images and videos to provide additional context for the textual analysis. I strongly encourage more research in this area to find computational solutions to visual and audio inquiry. However, the primary focus of this study is centered in the text of the corpus being analyzed. A primary focus on text is justified in that words and language, as the most common forms of communicating thoughts and emotions, are the most reliable units of analysis in which we can understand our social selves (Crystal, 2004; Eckert, 2008; Mehl, Gosling, & Pennebaker, 2006; Pennebaker & King, 1999; Tausczik & Pennebaker, 2010). Furthermore, the usage of Facebook data as sources of personal discourse has been shown in studies to be reliable enough and representative of personalities and concerns (Back et al., 2010; Gosling, Vazire, Srivastava, & John, 2004; Schwartz et al., 2013).
All autoethnographic work must consider ethical questions related to how others connected to the author are implicated within the work (Boylorn & Orbe, 2014). Because we are social beings and our relationships and experiences are bound up with our interactions with others, it is unavoidable that others would be referenced (either directly or indirectly) and would influence our interpretations and analysis. This reality is further complicated when we add the additional component of harvesting information within a digitized environment, where the ease with which one can obtain information is muddied by the tangled social networks that are promoted by social media platforms. This unique ethical consideration influenced my decision to only harvest digitized data that I created. As a result, complete conversations between Facebook friends and myself are not included in the digitized corpus. Still, because autoethnography allows for the retroactive and selective consideration of previous experiences, I am able to refer back to the complete preserved historical record (much like a journal) so that my reflections extend beyond the constructed corpus. This intervention allows for a fuller incorporation and context of specific lived experiences, rather than an artificial, abstracted corpus removed from my social interactions with others. Another concern is the asynchronous way in which algorithmic assemblages manage the narrative. The highly alienated time structures constructed and organized by these assemblages, if left without intervention, can serve to colonize the narrative. Again, because autoethnography allows for the retroactive consideration of previous experiences, within the context of specific time periods, the combined method (of computation and autoethnography) allows for patterns around experiences to become more visible.
In other words, the topic modeling output refers back, not to a specific experience but to a collection of experiences. The autoethnography allows for a parsing out of individual events, within their time structure, to understand experiences as enduring.
Resisting the Desire to Privilege One Method Over Another
Autoethnography is not only a method of research that honors individual experiences, voices, and narratives as an important aspect of knowledge creation, but also a reflection on the idea that art and science sit at opposite poles of a disciplinary spectrum, a divide that autoethnography attempts to shatter (Ellis & Bochner, 2000). Computation’s proclivity to generate abstractions via algorithms may be viewed as an attempt to tame autoethnography’s efforts to disrupt established boundaries. As a researcher, it was important to me that I show respect to the method of autoethnography by preserving its centering of reflexivity. This is precisely what autoethnography brings to computational interpretation and analysis. To that end, I used intermediate reading (Brown, Mendenhall, Black, Van Moer, Lourentzou, Zeria, & Flynn, 2016) in conjunction with the distant reading of the topic modeling output provided by the computation. With intermediate reading, topics emerging from computational modeling are traced back to the Facebook posts that comprise the topics. This incorporation of selective intermediate reading helps prevent the potential colonization of the narrative by computation by bringing the narrative voice back to the forefront of the analysis. The combination of computation and autoethnography allows for an omnipresent analytical lens, which can simultaneously focus on the subject of study from a great distance as well as up very close. If I did only close readings of the entire digital corpus, there would be the potential to miss patterns that would be identified using computation. Using computation alone to interpret the nuances of Black women’s lived experiences would potentially lead to gross misunderstandings and perhaps even perpetuations of popular stereotypes. Specifically, the computational topic modeling allows for a deconstruction of a personal and familiar text and destabilizes the text through this deconstruction to allow for deeper reflexivity as the corpus is analyzed and the knowledge reconstructed from the text is extracted.
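Tracing a topic back to the posts that comprise it can be sketched as a ranking over per-post topic proportions. The example assumes those proportions have already been exported from the modeling software as one list of numbers per post. The function name `representative_posts`, the posts, and the proportions are invented for illustration.

```python
def representative_posts(doc_topics, posts, topic, top_n=3):
    """Return the posts with the largest share of a given topic.

    doc_topics: list of per-post topic proportions (one list of floats per post).
    posts: list of post texts in the same order as doc_topics.
    """
    ranked = sorted(range(len(posts)), key=lambda d: doc_topics[d][topic], reverse=True)
    return [(posts[d], doc_topics[d][topic]) for d in ranked[:top_n]]

# Invented posts and proportions; topic 0 stands in for a "love" topic.
posts = ["Reading bell hooks tonight", "Need coffee", "Class discussion on love"]
doc_topics = [[0.7, 0.3], [0.1, 0.9], [0.8, 0.2]]

for text, share in representative_posts(doc_topics, posts, topic=0, top_n=2):
    print(f"{share:.2f}  {text}")
```

The ranking only selects which posts to read. The reading itself, and the reflexive work that follows from it, stays with the researcher.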
Computational Digital Autoethnography: One Manifestation of Methodological Cyborg
Computational digital autoethnography (CDA) serves to uncover and amplify African American women’s experiences using computation. The honoring of Black women’s experiences as legitimate through the usage of autoethnography is the reflexive response that exposes the agency of the algorithm and counterbalances its autonomy with that of a Black woman’s narrative. CDA as performance autoethnography recognizes that this computational self is presented before an audience to connect individual personal experiences publicly with larger systemic issues (Denzin, 1997).
For social scientists, our methods are the procedures we employ to gain insight into social life and society. We choose methods based on what they allow us to do and understand. But, methods also create. Methods can create legitimacy, disruption, and liberation. CDA takes the current abilities and techniques of traditional autoethnography and extends the method beyond its normal scaling limitations by using digitized data and embedding computational elements into the autoethnographic process. The strength of computational analysis to obtain emerging patterns within data, coupled with autoethnography’s disruptive centering of the individual experience in service to connecting larger contemporary concerns, creates a device well positioned to further legitimize the experiences of African American women as a source of rich knowledge. Autoethnography is about bringing the personal (experience) in, whereas computational analysis is about (the researcher’s) distance from the data. Yet, these qualitative and quantitative methods are compatible and fuse together to form the methodological cyborg as a Black feminist technology, which is employed in service to empowering and amplifying African American women’s social selves.
Whereas autoethnography is inherently inductive, in that it takes specific experiences of the individual as legitimate knowledge from which to connect to the general world (though generalization is not the goal of all types of autoethnography, but rather a potential side effect of creating community as one shares hir voice), computation is deductive in the ways it pulls in an entire corpus and abstracts specific thematic patterns in the data. Although these extremes may, on the surface, appear to render these methods incompatible, the reality of their polarity is what makes them uniquely coupled to produce a stronger methodological device.
Applying Computational Digital Autoethnography
Love and Violence as Reoccurring Themes
The following section provides illustrations of CDA output as well as demonstrations of the reflexive analysis and interpretive processes utilized while engaging the corpus applying this method. One of the first recognizable topics, which persisted throughout all variations of K, and also emerged multiple times within the same K value, was a topic I initially identified as related to teaching Black feminism, specifically bell hooks’ (2000) All About Love, within a local college-in-prison program (see Table 1).
Table 1. “All About Love Black Feminist Reading Group” Topic Word Lists. 3
As I analyzed posts from this persisting theme, I realized that this collection of topics was more related to what I learned about loving Black people while discussing Black feminist renderings of love in a carceral space. The theme related to how All About Love shaped my life and perspective during this time period. Not only was I connecting with students, discussing love in an incarceration setting, but I was also having similar discussions outside of the prison, with people who were not incarcerated. The power of connection through love was being highlighted in this particular topic. The revolutionary power, the transformative power, the healing power, the communal power: all ignited by my desire to have the “love conversation” in the most contrary/contradictory space I could access. It was a love for education and a love for Black people that brought me into that space to do that work. It was love that made room for connections outside of the prison. With every change in K, the algorithm held tight to love’s lesson. This theme was more indicative of Black feminist love praxis than another “love” theme (see Table 2), which focused on others’ love praxis toward me.
Table 2. “Love” Topic Word Lists.
These themes are complementary; they are symbiotic. They are not the same but have the same source. They need each other to survive. The key lesson for me was that my ability to deploy a Black feminist love praxis was only as strong as my community’s love praxis toward me. To my community, in the words of Whitney Houston, “you give good love.”
Another persisting topic was one I identified as being related to violence. Violence is categorized broadly within this corpus; it includes racism, sexism, police violence, and anti-Black violence, as well as protest in response to violence. While racialized violence is discussed most frequently, it is important to acknowledge the many forms of violence that are witnessed through the corpus. Many of the representative posts discussing violence include expressions of anger. I use more explicit language, language that is more direct and curt.
Audre Lorde (1984) encourages Black women to pay attention to the things that make us angry because a critical analysis can emerge from anger, one that offers powerful indictments of oppressive systems. That our anger can implicate power means our anger is often dismissed as being about our own shortcomings (the angry Black woman trope) rather than acknowledged as pointing out society’s flaws. Within this corpus, the state is clearly implicated as a primary source of racialized violence, as are various forms of the educational system. On the other side of anger, too, is love. Following through the anger helps to identify what we care about deeply. There would be no anger about the violence inflicted on Black people if there were no love for them/us. So, in this way, the themes of violence and love are also linked.
How Topic Modeling’s Perceived Flaws Aid in Reflexive Autoethnographic Practices
A major critique of topic modeling is that it can be easily misinterpreted. One must be extensively familiar with the corpus under investigation to interpret the topic model output, and even then, there are moments when the word lists will be misinterpreted. There are several strategies to reduce the likelihood of misinterpretation. As mentioned, intermediate reading, which is identifying the specific posts that make up the topics being interpreted, is one strategy for verifying the interpretations of word lists. Scholars have identified various randomized tests with which to validate interpretations (Chang, Gerrish, Wang, Boyd-Graber, & Blei, 2009). Still, topic modeling is most effective when used as an exploratory tool. Though one wants to reduce misinterpretations as much as possible, even in those moments of misinterpretation, I argue there is still knowledge that can be uncovered. For example, after reviewing the word list below, my initial interpretation was that this topic was partially related to my Uncle Curtis and Grandma. My uncle Curtis and (maternal) grandmother raised me and, though I was not convinced that the topic was solely about them, I was sure that this topic was somehow related to my experiences with them as a child.
When I reviewed the representative posts for the topic (see Table 3), I was met with much disappointment. When reading only the word list for Topic 26, I thought the computational rendering was recalling my uncle and grandmother in some way.
Table 3. Word List and Representative Posts for Topic 26.
It was not. This is a clear example of me seeing something that I wanted to see. I ignored all the other words that pointed to drinks (coffee and smoothie) and ignored how the sugar and caffeine in these products often cause me to have withdrawals that I (problematically) compare with illicit drug use. I ignored the words that pointed clearly to my research (i.e., consumer, gendered, racialized). I only saw in this word list what I wanted to see. I cannot tell why from the intermediate reading, but somehow my uncle and grandma got caught up in the probabilistic renderings of this topic. I felt a sense of loss that they were not rescued here. These are computation’s phantoms: word lists/outputs that are misinterpreted because of the researcher’s desire to see a triggered memory validated. This insight is important to the reflexive autoethnographic process because it reveals a frustration at not having an important part of my lived experience recognized and validated by the LDA algorithm, and it shows how these desires, if left unchecked, can lead to misinterpretation during analysis. From a methodological perspective, this is a valuable reminder of the limitations of certain quantitative methods. As a reflexive practice, however, there is still space to recognize what has been misinterpreted and to incorporate that experience in a way that honors the memory of my uncle and grandmother while clarifying the importance of validating interpretations. Computational phantoms also remind researchers that misinterpretations of themes may signal omissions within the corpus. These omissions provide an opportunity for reflection on gaps and/or silences within the corpus that are perpetuated by the algorithmic assemblages, the researcher’s omissions, or some combination of the two.
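The intermediate reading used above to check a word list against its posts can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure used in the study: it assumes the fitted model exposes a document-topic matrix in which each row holds one post's topic proportions, and the function name `representative_posts`, the posts, and the numbers are all invented for the example.

```python
from typing import List, Tuple


def representative_posts(
    doc_topic: List[List[float]],
    posts: List[str],
    topic: int,
    n: int = 3,
) -> List[Tuple[float, str]]:
    """Return the n posts with the highest proportion of the given topic."""
    if len(doc_topic) != len(posts):
        raise ValueError("doc_topic and posts must have the same length")
    scored = [(row[topic], post) for row, post in zip(doc_topic, posts)]
    # Sort by topic proportion, highest first; ties keep corpus order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:n]


# Invented toy data: three placeholder posts and a two-topic model.
posts = [
    "placeholder post about coffee and research",
    "placeholder post about community and love",
    "placeholder post about a smoothie and data",
]
doc_topic = [
    [0.80, 0.20],
    [0.10, 0.90],
    [0.65, 0.35],
]

# Read the posts behind topic 0 before trusting its word list.
for weight, post in representative_posts(doc_topic, posts, topic=0, n=2):
    print(f"{weight:.2f}  {post}")
```

Reading the highest-weighted posts alongside the word list is what exposes a computational phantom: the words may evoke one memory while the posts themselves speak about something else.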
Another challenge for those using topic modeling relates to determining parameter values. As I mentioned, I reviewed all theta parameter outputs and determined that my topics of interest were persisting, making the theta value less relevant for my purpose of using the output to promote reflexive autoethnography. I initially struggled with deciding which value for K would be most appropriate. After reviewing several K values, I decided that considering multiple K values provided more insight. As an exploratory tool, varying the K value allowed me to see at what point topics began to emerge and/or disappear and which ones persisted. Several of the topics I found particularly interesting persisted throughout the variations of K. In other words, there were topics of interest to this study that presented themselves when I set the model to produce 10 topics, 20 topics, 30 topics, and so forth, through 100 topics. Through intermediate readings, I verified that these topics were indeed the same. The consistency in the representative posts confirms that these are versions of the same topic reemerging as K increases.
These persisting topics were the ones I focused on within my study. The rationale was that if they could persist at every variation of K within the algorithm output, they deserved more attention. In this way, there is a delicate balance related to how the algorithm influences and shapes the reflexive autoethnographic process. I negotiate with myself about how much agency I want to relinquish to the algorithm to determine what is worthy of study and further investigation. As a social scientist, I take comfort in the decision to allow my exploratory research to be directed in a systematic way. The methodological cyborg is in communion with hirself.
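One way to shortlist candidate persisting topics before intermediate reading is to compare word lists across runs. The sketch below is an assumption-laden heuristic, not the verification described above, which relied on reading representative posts: it treats two topics as versions of one another when the Jaccard overlap of their top words meets a threshold. The function names, the 0.5 threshold, and the word lists are hypothetical, and the toy runs use K = 2 and K = 3 only to keep the example short.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Share of words that two word lists have in common."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def persisting_topics(
    runs: Dict[int, List[List[str]]],
    threshold: float = 0.5,
) -> List[List[str]]:
    """Topics from the smallest-K run that have a close match in every other run."""
    ks = sorted(runs)
    base_k, other_ks = ks[0], ks[1:]
    persisting = []
    for words in runs[base_k]:
        base = set(words)
        # Keep the topic only if each other run contains a similar word list.
        if all(
            any(jaccard(base, set(other)) >= threshold for other in runs[k])
            for k in other_ks
        ):
            persisting.append(words)
    return persisting


# Invented word lists keyed by K (the number of topics requested).
runs = {
    2: [
        ["love", "community", "heal", "black"],
        ["coffee", "smoothie", "sugar", "drink"],
    ],
    3: [
        ["love", "community", "black", "power"],
        ["police", "violence", "state", "anger"],
        ["coffee", "research", "consumer", "gendered"],
    ],
}

for topic in persisting_topics(runs):
    print(topic)
```

A match on word overlap only flags a candidate. Whether two word lists are versions of the same topic still has to be confirmed by reading the posts behind them.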
There is also a tendency within topic modeling interpretation to disregard topics that appear to be incoherent. Often, these seemingly incoherent topics are used as evidence of the failings of topic modeling. Instead, I have framed this challenge as a methodological opportunity to aid in the reflexive process. Within my analysis, I identified a seemingly incoherent topic around death. Throughout parameter variations, “death” would present itself across many topics, at the top of multiple word lists. Within my analytic memos, I noted how the algorithm pulled out death multiple times, but I was unable to make out a cohesive topic, one that I recognized and understood. I began to wonder if this was a space of disagreement, a place where the algorithm and I could not come to an understanding of how death is discussed. Was the cyborg experiencing an existential crisis? Was something getting lost in translation/deconstruction of how I (re)presented or discussed death within the corpus? Death from police violence was clear as a cohesive topic, but why did other types of death keep occurring in my output in such seemingly incoherent ways? Is the algorithm missing a spiritual death, an emotional death? Why does it pick up on one material death but leave other types of death incoherent? Is this signaling a complexity around death that cannot be sufficiently captured by the algorithm or, rather, understood from the word list? Why do I not recognize these other references to death as being generated from my experiences? What does that deconstruction block from my (re)memory? 4
Upon further investigation, utilizing representative posts, I was able to recognize a pattern, not in the individual topics themselves, but across the many topics that presented “death” at the top of their word lists. Though some of the topics were clearly speaking to a material death, this material death was also in conversation with an institutional (i.e., academia) erasure as another form of death. There was a death through exhaustion, from the demands of the “second shift” and of crushing tensions to preserve self within certain spaces; a death from suppression and pressure; and a death through a desire to come into an elevated self, in which an old self must die so that a newly evolved self might emerge. These are different types of death than the state-sanctioned death at the hands of police, for example. I view these deaths as more internal than external, implicating different sources. These “death” topics also focus on an individual death rather than a collective type of death, which is implicated when discussing police and anti-Black violence, for example. Within these distinctions, I also found references to death’s certainty, as opposed to its unpredictability. Did speaking of death in this way mean that I felt (under these circumstances) death might be certain, unavoidable? What then are the implications for these deaths through erasure? Exhaustion? Needless to say, these reflections caused me to pause as I reconsidered my career choices.
Conclusion
Potential for Decolonization of Computation
Computation is rarely used in service to identity work. The method is often inappropriately lauded as superior because of its assumed apolitical orientation. Though algorithms are not unbiased (and computation has not historically been employed in service to understanding, interpreting, or reimagining identity politics), the addition of autoethnography to computation calls forth computation’s desire to be a method greater than what it is currently. The coupling of two seemingly incompatible approaches to research is a great step forward in decolonizing computation. This decolonization process occurs through a queering of computation. E. Patrick Johnson (2009) theorized the relationship between blackness and queerness as quare. Understanding blackness as queer because of its “otherness,” specifically the ways in which it cannot be contained within conventional categories of being, sits within a larger ontological question. That question might then be used to consider how computation might be queered through a Black feminist lens as metanarratives around methods are pushed beyond qualitative and quantitative. This coupling also serves to disrupt a more insidious problem of computation, specifically, the ways in which computation embeds racialized and gendered biases within its algorithmic assemblages. These assemblages construct processes of differentiation and hierarchy that reify existing social inequalities (Dixon-Román, 2016).
Potential for Colonization of Autoethnography
While this combined method of computation and autoethnography has the potential to decolonize computation, it also has the unfortunate possibility of colonizing autoethnography. What I am referring to here is computation’s potential to overtake the narrative by privileging the algorithmic treatment of text and diminishing written narrative/storytelling. Though the output of the word lists may be interpreted as a different type of narrative, it is told using the language and logic of quantitative methods grounded in word frequencies that privilege probability over the lived narrative. For example, in October 2016, Facebook came under scrutiny for allowing advertisers to choose the race of specific groups they wish to target in their advertising under an “ethnic affinities” option. The detailed targeting allows the advertisers to include specific behaviors and interests and exclude specific races/ethnic affinities. Facebook assigns ethnic affinity using algorithms based on quantification of pages or posts liked or engaged with. However Facebook defines these categorizations, it’s clear that these racialized categorizations have an impact on how one is racially encoded as a Facebook user and on one’s access to information based on those racialized encodings. The decisions that algorithms make are decisions of representation and inclusion. It’s about who is seen and who gets to see. This misrecognition encoded within algorithms is insidious. Eduardo Bonilla-Silva (2003) framed masked systemic racism as color-blind racism (racism without racists). Combined with computation, we are in an era of coded-blind isms whereby biases are encoded and algorithmically masked as objective.
As we stretch deeper into a culture of calculation and quantification, we allow algorithms to determine who in society is made visible (and how), who is counted, and what is counted. Value judgments are being encoded while being misrecognized as objective. There is a literal and figurative accounting of lived experiences within this encoding. In this specific posthuman computational moment, without reflexive intervention, algorithms threaten violence on marginalized others through the simultaneous “othering” and assumptions that humanity is equally accessible to all. There is also concern that our society will become a technocracy whereby a system of government rule and control will be maintained through algorithms (Morozov, 2013). Predictive policing and its embedded coded-blind racism are but one example. In this way, to engage in autoethnography within the computational paradox of surveillance and resistance is to simultaneously expose oneself to and resist the technocracy. The methodological cyborg intervenes by allowing space for reflexivity, self-definition, recognition, and subjectivity.
Further Stretching of Art and Science Binaries
As we evaluate the utility of the methodological cyborg as Black feminist technology, we must also consider why we as social scientists are so invested in maintaining binaries/boundaries of art and science. At what cost do social scientists cling to notions of objectivity in our quest to understand lived experiences? Does “science” allow for the intimate understanding of the beautiful struggle that is Black girl genius? 5 (Brown, 2013) And who gets to decide? While autoethnography abandons this binary entirely, computation unrelentingly cleaves to it. The combination of the two, through the deployment of a Black feminist methodological cyborg, produces a fissure from which a new space develops that can mediate the methodological poles, drawing them closer together. This also actively engages computation in addressing political questions around identity and intersectionality.
CDA encourages a theoretical shift around discussions of methods by challenging traditional approaches to research within a single method. The methodological cyborg combines inductive and deductive approaches to research. In the process, it also queers computation by engaging Black feminist theory to trouble notions of objectivity and to force reflexivity onto computation, thus blurring the conventional lines between quantitative and qualitative research.
Footnotes
Acknowledgements
Many thanks to Karrie Karahalios, Ted Underwood, and Robert Deloatch for their thoughtful feedback and encouragement of this project. Special gratitude extended to Durell Callier for introducing me to the power of autoethnography.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
