Confessional data selfies and intimate digital traces

Abstract

Data selfies are representations of self through personal quantitative data: from graphs of Tinder dating outcomes, through to the story of brain surgery told through daily step counts. In this article, we explore practices around what we call ‘confessional data selfies’ shared on the reddit forum r/DataIsBeautiful, where more than 14 million subscribers – predominantly straight men – share often complex and intimate quantitative self-representations of their lives. We draw on an analysis of the top 1000 posts on r/DataIsBeautiful, and a sub-sample of 59 data selfies, to identify patterns in confessional data selfie practices. We identify three themes: families and relationships, routine management, and body rhythms. We argue that these data selfies generate opportunities for self-reflection, connection, discussions of mental health, grief and other personal experiences. Significantly, this occurs largely between men, modulating processes of gendered impression management and expanding the conceptualisation of selfie work.

Keywords

Data gender Quantified Self reddit self-tracking selfies visualisations

Data selfies are representations of one’s ‘self’, typically presented as a visualisation of quantitative data. For example, data gathered by self-tracking platforms, like Fitbit or Apple Health platforms, produce visualisations of activity: steps taken, hours slept, heart rate, calories burnt and so on. These are often in the form of graphs or ‘activity rings’ that trace activity over time. There are also manual data selfies, where one might record similar activities and log the information in a graph or table. In this article, we explore practices around what we are calling ‘confessional data selfies’ as shared on the reddit forum r/DataIsBeautiful. On this subreddit, more than 14 million subscribers, or redditors, share often complex, intimate, highly detailed quantitative self-representations, many of which reveal significant or compelling aspects of one’s life. These range from traces of dating (‘My 500 days on OkCupid’) to health data (‘My health decline and eventual brain surgery’) through to incredibly detailed lifelogging (‘Every Single Hour of My 2017 Recorded’). In these cases, the confessional data selfie appears to attempt to present its author in a kind of unbiased, ‘laid bare’, transparent way. Confessional data selfies invite analysis, elicit personal story-telling, and open one’s life up to others, in this case the r/DataIsBeautiful networked public. Importantly, the most highly circulated data selfies on r/DataIsBeautiful are also visually appealing. They package often deeply personal and emotive stories within aesthetically pleasing visualisations.

Take, for instance, this data selfie by u/andrew_elliott (on reddit, u/ indicates the username) included here with permission (see Figure 1). It is a visualisation of his daughter’s sleeping patterns over the first 4 months of her life, where the dark segments are time asleep and the light awake, with midnight at the top of a 24-hour clock configuration, circling inwards as time progresses.

Figure 1.

‘My daughters sleeping patterns for the first 4 months of her life. One continuous spiral starting on the inside when she was born, each revolution representing a single day. Midnight at the top (24 hour clock). [OC]’ post to r/DataIsBeautiful by u/andrew_elliott, with 57.8k karma/upvotes, 1.6k comments.

At one level this is a visually impressive, even beautiful data visualisation that tells a story, but it also reveals something personal about the relationship between a newborn and a parent, the rhythms of bodies, fatigue and how we make sense of time. In this sense, it is a self-representation of u/andrew_elliott as much as it is of his daughter, and through visualising quantitative data it confesses, reveals and acts.

In this article, we draw on an analysis of the top 1000 posts made to r/DataIsBeautiful – which included 59 data selfies – to identify patterns in confessional data selfie practices and to theorise the production of these digital self-representations. We seek to answer the question – what are confessional data selfies, and how do people use data selfies to represent themselves in digital spaces? Data selfies also appear to be heavily gendered, as most – according to our analysis, presented below – seem to be created by men, complicating wider readings of the gendered aspects of photographic selfies where young women are ‘often rebuked as the major producers of selfies’ (Warfield, 2017: 78). Indeed, according to Barthel et al. (2016), 69% of reddit users are men, and 56% are between 18 and 29 years old. Massanari (2017) traces the ways ‘geek masculinity’ (see also Braithwaite, 2016) plays out through reddit. In this context, are ‘data selfies’ selfies for dudes? Does the valorisation of quantitative data protect data selfie authors from the kinds of moral judgements (Tiidenberg and Whelan, 2017: 146) that have been directed at people, especially young women, who take and share photographic selfies?

Background

Because data selfies always involve some form of self-tracking, in order to build towards our conceptualisation of confessional data selfies, this article draws on two bodies of literature: one on digital self-tracking practices, and one on selfies and self-representational practices in digital cultures. Together these bodies of literature help shape our theorisation of the confessional data selfie.

Revealing the self through tracking

The term ‘self-tracking’ now represents many things: a range of record-keeping practices with long and deep historical roots; an individual process of management and curiosity; also a performance – perhaps of art (O’Neill, 2019) or protest (Neff and Nafus, 2016). At a macro level, self-tracking is indicative of an ‘ever more forensic, strategic, predictive and knowing . . . prosthetic vision [labelled the] data gaze’ (Beer, 2019: 7). In cynical assessments, self-tracking is an institutionally encouraged neoliberal process that sees human action commoditised (Neff and Nafus, 2016: 66) for consumer analysis and risk assessment. As a mixture of all these potentialities, it may also be a personal hobby that is shared with a ‘networked public’ (Boyd, 2010) of like-minded others. In this latter case, one might think of the extensive body of work on the Quantified Self (Lomborg and Frandsen, 2016; Lupton, 2016), a movement built around the sharing of data and ethos of self-efficacy. However, in this article, we are interested in exploring a microcosm of shared self-tracking, and the cultural reasons that lead individual self-trackers to share their self-tracking data – as personal representations of themselves – with strangers on reddit.

Self-tracking, of course, is not new. In tracing the history of the diary – up to and including digital forms – Rettberg (2014, 2018) and Wernimont (2018) note that the traditions of diary-keeping have historical religious affiliations. In tracing the history of ‘seeing’ oneself through various ‘data’ forms, Rettberg (2014) cites both Catholic (the confessional) and Protestant (diary keeping) traditions, noting that both examine ‘one’s own flaws and failures, seeing self-examination as the source for self-improvement and attaining grace’ (p. 5). Similarly, Cardell’s (2014: 17–19) genealogy of the diary documents religious confessional influence, as well as – following Nikolas Rose – the constructions of modern subjectivity that funnel values from the public sphere to the private self. In this way, creating and sharing a data selfie fulfils similar functions. Like Roseian approaches, Foucault (1998 [1976]) also traces how the Christian confessional ‘transform[s] desire . . . into discourse’, and certainly in many cases the confessional data selfie can also be read through this lens (p. 21).

These Foucauldian links are popular in self-tracking literature, but for the purposes of exploring r/DataIsBeautiful, there are a few key examples that are relevant. Reigeluth (2014: 252) and Cardell (2014: 17) have taken up Foucauldian ideas on how the self-reporting of data is confessional, and map this onto a general cultural shift towards intimate disclosures in personal written forms. As a useful precedent to our work with personal data and reddit, Esmonde and Jette (2018: 4) explored Fitbit’s online forums and analysed the exercises in biopower playing out in these spaces. They saw the manifestation of health self-surveillance being framed by neoliberal conditions as a moral concern of ‘risk’ and ‘burden’ (Esmonde and Jette, 2018: 4):

[t]he embodiment of these networks of power is also apparent in what Foucault describes as the modern confessional, whereby the priest as confessor is replaced with experts such as psychiatrists and medical practitioners. It is through these confessions that truths – which are always an effect of power – are established about ourselves.

Thus, we follow suit, to argue that digitally mediated networked publics – like reddit – offer another avenue for confessional behaviour. The structure of the subreddit we research – and its reification of data as offering truths about bodies and experiences – fits squarely within the positivist canon of self-tracking. While it moves from the specifics of health to broader patterns of daily life, it still shows individuals subjecting themselves to measures of self-discipline. It is the manifestation of these truth-claims-through-data on reddit that we are concerned with in this article.

What is a selfie?

Having established our framing around the confessional and ‘truth-claim’ nature of data and self-tracking, we also need to explain our use of the term ‘selfie’. What is a selfie? Are these data visualisations really ‘selfies’? We follow Senft and Baym’s (2015: 1589) theorisation of selfies as both objects and gestures:

a selfie is a photographic object that initiates the transmission of human feeling in the form of a relationship . . . A selfie is also a practice – a gesture that can send . . . different messages to different individuals, communities, and audiences.

The confessional data selfie is not a photographic object, but it is a graphic object ‘that initiates the transmission of human feeling’, even prompting recipients – other redditors in our study – to reply, ask questions, and to even respond with their own data selfies. In this way, the visual object forms a relationship between the author and the audience. The confessional data selfie is also a gesture, an act of story-telling that can be highly compelling, emotive and alluring, and we suggest that the gesture here is also confessional, revealing oneself to a pseudonymous networked public.

By theorising visualisations of self-tracking data as selfies, we also follow Tiidenberg and Whelan’s (2017) efforts to ‘decoupl[e] the mandatory presence of a face/body from the definition of a selfie’ (p. 146). However, we acknowledge that there is some contestation over what is and is not a selfie. Tiidenberg and Whelan caution against the conflation of different kinds of visual self-representational practices under the umbrella term ‘selfie’. For example, they describe the ‘Gratuitous Picture of Yourself’ (or #GPOY) practice (of relating to an image or scene in popular culture to describe a feeling or experience), and the genre of ‘Everyday Carry’ (or #EDC) photos (of objects like flashlights, pens and pocket knives to represent a set of values around preparedness, competence and knowledge) as examples of ‘not-selfies’. Despite this, we persist with framing these self-tracking visualisations as selfies precisely in order to trouble dominant narratives that, by centring faces and bodies, tend to lead to discussions of whether selfies are ‘good or bad’ or ‘empowering or objectifying’. As Tiidenberg and Whelan (2017: 146) argue, these lines of analysis have led to ‘abundant diagnoses of narcissism’ (see also Sorokowski et al., 2015; Weiser, 2015) or have produced the reading of selfies as the domain of young girls who Warfield (2017: 78) has noted are ‘often rebuked as the major producers of selfies’. Importantly, Williams and Marquez (2015) have observed how selfie work (and the perception of selfies) is modulated not just by gender but also by race, finding in their study in the United States that Black and Latino respondents tended to produce more selfies than White respondents. They also found that the figure of the ‘selfie king’ – a man who regularly takes and posts selfies – can be constructed as macho and confident but can also be read as shallow. One of their participants explained, ‘for girls to take selfies, I understand it. For guys to take selfies all the time . . . I just think it’s kind of shallow’ (Williams and Marquez, 2015: 1781).

Theorising the confessional data selfie as a selfie works to (a) move beyond the focus on faces and bodies in self-representational work; (b) disturb the moral judgement of selfies as narcissistic and self-interested; and (c) trouble assumptions about selfies primarily being the domain of young women. By studying and theorising the confessional data selfie, we seek to reveal processes of self-representation through self-tracking data where these visualisations operate as objects that transmit human feeling and forge connections.

Introducing r/DataIsBeautiful

The site of data collection for this article is the subreddit ‘Data Is Beautiful’ (r/DataIsBeautiful). With 14.4 million subscribers at the time of writing, this subreddit allows users with an interest in ‘data’ (almost always quantitative), analysis and aesthetics to gather, share and discuss data visualisations. The notion of reddit as a set of communities has been problematised elsewhere (see Summit-Gil, 2017 and Robards, 2018) but there are clearly sets of shared conventions, practices and identities that bind people together in these spaces in sometimes ephemeral and sometimes more enduring ways. Subreddits are organised by topics, such as alt-right politics (Mills, 2018), feminist memes (Massanari, 2017), naked selfies (Van Der Nagel and Frith, 2015), or just cute images of animals on r/aww.

Like most subreddits, r/DataIsBeautiful has rules – for both posting and commenting practices – which are enforced by volunteer moderators (for more on the labour involved in reddit moderation, see Squirrell, 2019). Reflecting some of the site-wide concerns referenced above, political discussion is policed here, with American political content ‘only permissible on Thursdays’ (r/DataIsBeautiful, 2020). The rules also explain the system of ‘flair’ – a system of visual signifiers and linguistic tags applied to usernames or posts within any given subreddit – through which r/DataIsBeautiful contributors are lauded for frequent sharing of Original Content (OC). The audience is inquisitive and suggestive, with comments that query datasets, methods and the presentation of data visualisations.

Methods

In this article, we examine the top 1000 posts, according to post ‘karma’ (a number that governs content visibility, determined by upvotes minus downvotes) made to r/DataIsBeautiful as of 11 September 2018. These 1000 posts were collected using a custom-made ‘Reddit Post Parser’ Google Chrome extension and exported to Google Sheets for analysis. Posts in this sample were made from August 2014 through to September 2018, but skewed to 2017 and 2018 due to the growth in the subreddit over this time. Posts can only gain karma for a 3-month window from their initial posting, and following this they become ‘archived’: they remain visible but can no longer be voted or commented on. We focussed on these top posts, to identify the subreddit’s most circulated and influential posts.

Looking at only highly circulated posts also attends to ethical concerns about this kind of research. Fiesler and Proferes (2018) explain how many Twitter users felt uncomfortable at the prospect of their tweets – even when posted ‘publicly’ – being used in research without their consent. While it may be impractical to seek consent from every user in social media research that may include thousands or millions of users, it is also insufficient to simply say that the data are public and thus open to fair use, as this places the burden of risk and visibility entirely on individual users. Instead, a more nuanced approach is needed in social media research ethics that looks beyond a simple public/private binary, towards a continuum where ‘publicness’ is on one end (the front page of a mainstream news website, a YouTube video with millions of views, a tweet with thousands of re-tweets, etc.) and ‘privateness’ on the other (a private chat or a small group chat, a message on a dating or hook-up app, an Instagram story with a ‘close friends’ group, etc.). The key consideration here is intentionality: who did the user intend this post, disclosure or exchange to be seen by? Who is the intended audience? We would argue that posts, disclosures and exchanges that tend towards the more private end of the continuum must be subject to explicit informed consent to be included in social media research projects, even where those posts are ostensibly ‘public’ but only intended to be seen by a small imagined audience (aligning with recommendations by Ravn et al., 2019). On the other end of the continuum towards ‘publicness’, posts that are much more visible and highly circulated can be subject to long-standing media analysis processes. However, this does not fully remove any sense of obligation from the researcher to consider potential harm, intentionality and issues around beneficence – even where individuals who posted the content being analysed are not conceptualised as ‘participants’, but as more abstract ‘users’ (for more, see Zimmer and Kinder-Kurlanda, 2017). We acknowledge that while many of these posts are highly personal, even intimate and – as we argue – confessional, they are also often anonymous. Redditors rarely post under ‘real names’, instead tending to pseudonymous usernames (Van Der Nagel and Frith, 2015), and while many of the data selfies in our sample reference locations like countries or cities, they tend not to be more specific than that. This relative anonymity no doubt contributes to the kinds of disclosures and confessions made in this particular context.

Like many platforms, reddit’s popularity-driven ‘karma’ system circulates some posts more than others. For us, analysing posts that had been highly circulated (6000–100,000 ‘upvotes’ and 100–9000 comments) places our sample more on the ‘public’ end of a public/private continuum. Users were also able to delete or mute their posts if the attention received from ‘going viral’ and being hyper-visible was unwanted, thus aligning with our concerns around intentionality and intended audiences. As we focus on a large subreddit and highly visible posts, we have also opted not to paraphrase the titles of posts – a strategy Fiesler and Proferes (2018) recommend to anonymise tweets, to avoid them being reverse-searchable – again because of the highly visible and circulated nature of these specific ‘top’ posts on r/DataIsBeautiful. Again, the key consideration here is intentionality and managing potential harms. To preserve the OPs capacity to delete posts, we have also not included the visualisations themselves in this article, aside from two examples where we have explicit permission to reproduce them.

From the initial sample of 1000 posts, we filtered post titles to look for posts about the user’s personal data, by initially searching post titles for the words ‘my’ and ‘I’ (e.g. ‘my 500 days on Tinder’, ‘my Netflix viewing habits’, ‘Failing to run the Paris Marathon . . . I’ve tried to animate how I did’, etc.) which produced a sub-sample of 84 posts. We also returned to the original corpus of 1000 posts to manually identify any other ‘confessional data selfies’ that were not included, but found no further posts. We analysed each of the 84 posts in this sample individually and excluded a further 25 posts that fell outside the criteria of being self-representations, or representations of some element of the OP’s personal life. Here, we defined the confessional data selfie as a visual representation of one’s self (or some aspect of one’s life), told through quantified data, that reveals or ‘confesses’ something. The latter element, around revealing or confessing, is more difficult to measure. For example, we included a visualisation of ‘my daughter’s’ sleep cycle (the opening example), because it was a starting point from which the OP spoke about his own sleeping patterns. However, we excluded a post visualising the occurrence of searches on Google for the search term ‘my eyes hurt’ around a solar eclipse, because while it included the word ‘my’, this was not a self-representation. This filtering led to a final sample of 59 posts for more detailed analysis. For each of these 59, we categorised, examined the comments, sought to discern the gender of the OP and coded various themes in the data selfie itself.

In our following analysis, the proposition of the ‘confessional data selfie’ is a term arising from what we found in the presentation and communication around data visualisations in a pre-existing group of users. This differs from the methodological trajectory of the similarly titled ‘DataSelfie’ (Kim et al., 2019) – a personal informatics system that aims to help visually co-construct representations of self-tracking data with users. Our project is also distinct from the 2016 ‘Data Selfie’ project by Hang Do Thi Duc and Regina Flores Mir, a Google Chrome browser extension that collects and analyses data on the user’s behaviour (Duc and Mir, 2016) on Facebook (Van Der Vlist and Helmond, 2017). While we acknowledge these approaches to the idea of the data selfie, we build on them to analyse a dataset composed of pre-existing content, shared in an enthusiast ‘lay-expert’ discussion space – rather than designing new ways for non-experts to create new means of visualising their data. However, our conceptualisation shares with the aforementioned projects, an awareness of the power of self-representations, and with this, a growing desire for greater user control over data presentation.

Relationships, routines and rhythms

Of the 59 confessional data selfies in our final sub-sample of the top 1000 posts made to r/DataIsBeautiful, we identified three broad themes or category of confessional data selfies: families and relationships (like posts charting children’s first words, analyses of text messages between partners and dating histories), routine management (mapping data like household spending, study habits, technology use and ‘lifelogging’) and finally body rhythms (such as visualisations of sleeping patterns, health measures and steps taken). Before we get to these categories, we first report on some broad analysis of these 59 posts.

The confessional data selfies in our sample ranged in their ‘karma’ from 57,870 for the top post (‘My daughters sleeping patterns for the first 4 months of her life’; see Figure 1) to 6309 karma for the least upvoted post in our sample (‘My three year old maintains her favourite colour is pink . . .’). The average karma on posts in our sub-sample of 59 was 21,548. In terms of comments, the most commented-on post attracted 4771 comments (‘What my gross income of 60000€/year is actually used on in Europe, Austria’) with the smallest number of comments being 124 (‘I made an animation using GPX of my 5K progression at the Edinburgh Park Run event’). The average number of comments per post was 1164. Thus, in terms of both karma and comments, these posts were highly visible and public.

We also sought to identify the gender of the original poster (OP) for each of the 59 posts in our sub-sample. In many cases these were immediately evident and explicit, but in other cases we undertook more extensive investigation to ‘decode’ gender. This involved searching through post and comment history, and considering how the OP responded to comments where gendered pronouns were used. Clearly this is not a perfect process and is open to bias and misinterpretation. Two of the authors (one who identifies as male, one female) undertook the analysis separately and conferred on interpretation to introduce some level of intercoder reliability. We found that 38 (66%) of the 59 posts in our sub-sample were made by users we were confident identified as men (‘I’m a straight male’), nine (15%) we coded as probably male (inferred based on masculine username, using masculine gendered language in comments such as ‘bro’ and ‘man’), five (9%) we confidently coded as female (‘I’m a big data mom’; ‘my 180 days of lesbian dating’) and for six posts (10%) we were unable to discern gender. We acknowledge this partial and imprecise and does not account for non-binary redditors, or OPs who did not correct others on pronouns or gender assumptions when responding to comments.

Families and relationships

There were 16 posts that we coded into the broad category of family and relationships, including four posts relating to children (as the most upvoted and least upvoted examples in our sample of top posts exemplify), three on dating, and nine on relationships.

Posts involving children related to sleeping patterns, first words and favourite colours. Arguably, posts about children are not ‘selfies’ (insofar as selfies strictly represent one’s self), but we have included them here in framing children as extensions of their parents, at least in their infancy, and also because these posts confess or reveal something about the OPs. As Rettberg (2014) argues, children can be presented by parents ‘as metonyms: this is part of me’ (p. 41). Data visualisations of sleeping patterns (Figure 1) of infants do of course reveal parent actions – being awake to comfort a newborn, to feed them, to change them – so the visualisation is a record of parents’ sleeping patterns as much as it is of the child’s. Similarly, posts that document children’s language development like ‘I tracked my daughter’s first words from 12–18 months’ and ‘I’ve tracked all my son’s first words since birth’ are about children, but they are also about the environment parents have created, through the language they use, media they might expose children to, the involvement of friends and family, and so on.

These data selfies then, using Leaver’s (2017) arguments regarding ‘intimate surveillance’ with digital infant-monitoring tools, are outcomes (and once visualised, evidence) of ‘care’ – but also speak to the growing normalisation of digitally mediated surveillance practices. Here, ostensibly, parents are ‘responsible’ for first words, and thus children are subjects within the data selfie. For Lim (2020), this is the reality of the ‘transcendent parent’ (p. 2–5) of the 21st century – a complex and entangled online/offline role that is often belied by the neat selfie images above. Media outlets, medical experts and academic raise serious concerns for digital data rights and this issue will test both the ‘emotional maturity and technological prowess’ of parents (Lim, 2020: 4). But this particular permutation of ‘other-tracking’ (see Lupton and Williamson, 2017), beyond the scope of this article.

The other two groups of posts under this broad category of families and relationships are closely related. We identified three visualisations that documented the dating experience, which we separated from a group of nine posts that traced romantic relationships over a longer period of time. We distinguished here between ‘dating’ and ‘relationships’ as dating posts were more about dating apps, and all three were variations on a genre of confessional data selfies that involve sharing dating app data to produce narratives related to success or failure. The three emblematic data selfies in our sample were ‘My 28 Days on Tinder’, ‘My 500 days on OkCupid’, and ‘My 180 Days of Lesbian Online Dating’. These data selfies all took the form of a Sankey diagram, charting the number of messages sent and received, and the outcome from those messages. See Figure 2 for an example, reproduced here with explicit permission.

Figure 2.

‘My 500 days on OkCupid [OC]’ post to r/DataIsBeautiful by u/meatenigma, with 55.4k karma/upvotes, 3.8k comments.

These Sankey diagrams tell a compelling story. For u/meatenigma, for instance, as can be seen in Figure 2, he initiated all but a few conversations in this 500-day period on OkCupid, he did not get a response to most of the messages he sent, and of 44 conversations, had but 4 ‘actual dates’. This data selfie invites questions and comparison. What is the difference between a conversation that ‘ended’ versus a conversation that ‘petered out’ or a conversation where OP was ‘ghosted’? What happened when he was stood up? How did the dates that he did have turn out? Indeed, this particular post was one of the most commented upon posts in our sample, with 3909 comments. These comments questioned, hypothesised, empathised and joked with the OP and other redditors. This post also became a template for a genre of dating-related Sankey diagram. Two others were captured in our data sample, and we have observed many since.

There were also nine posts that documented other aspects of relationships. Seven were visualisations of romantic relationships, two of which documented the OPs heart rate at the end of relationships (‘My heart rate watching my SO leave the country’ and ‘Heart rate when my wife asked for a divorce’), and four were visualisations of digital messaging data (such as ‘Smileys used in conversation since I met my gf’ and ‘I built a tool to visualize WhatsApp chats . . . with my girlfriend!’). Interestingly, two of these four data selfies based on analyses of message histories were framed as ‘gifts’ for partners, such as ‘Made this for my bf on our one year anniversary’, which broke down 425 messages over a 1-year period into frequency counts of messages, words, emoji and so on, and a similar post titled ‘My girlfriend made a visualization of all messages in the first year of our relationship for Valentines’. This latter post was based on a sample of 44,331 messages over a 344-day period, where the OPs girlfriend found they exchanged on average 129 messages per day and their average message length was 4.6 words. The final post in the group of seven on romantic relationships was an in memorim, a ‘Simple word cloud from writing I did for my girlfriend, who passed a bit over a year ago’.

According to Lambert (2016: 75), relationship tracking (through digitally quantified means) imports the norms of psychotherapy and makes the unpacking of emotional connections integral to the functioning of said relationships. Yet, for Lambert (2016), this tracking tends to spiral into further quantification and experimentation and ultimately fails to complete this interrogation (p. 83). The subreddit reflects this argument, as data selfies that confessed or revealed something about a relationship – like the agony of separation or word frequency analyses of message histories – tended to attract the most upvotes and comments. Even in posts that were more general (like the ‘lifelogs’ described below), there was often postulation in the comments about romantic relationships within these data.

While r/DataIsBeautiful is prone to jokes about sex, promiscuity and masturbation, the above examples show that users (to quote another punny reply) ‘come to r/DataIsBeautiful for the plots’. Despite the focus on numerical data, data selfies sidestep the reductive and often highly gendered tendencies of sex-quantification apps and visualisations, outlined by Lupton (2015). Although the shift from normative assessments of ‘sex’ to ‘relationships’ is not an inherent improvement: as Danaher et al. (2018) argue ‘what gets measured gets managed’ (p. 7). In a ‘quantified relationship’, metricisation remains a tool of management, surveillance and gamification. While they do not overhaul gender paradigms, the data selfies in our sample point to how these forms of self-sharing offer a largely heterosexual male user-base an intermediary through which to be vulnerable, celebrate family and romance, remember and share grief.

Routine management

There were 27 posts that we coded under routine management, the largest of our three broad categories, including three posts on ‘lifelogging’ (recording what you are doing every hour for a year, for instance), two posts that tracked location, six posts on how money was spent (all following the Sankey diagram format as seen in Figure 2), seven tracking study habits and nine log technology use. It is also our most ‘leaky’ category, with many of these 27 posts straying into the territory of family and relationships or body rhythms (our final category of posts). The concentration of posts in this category is likely due to the availability of the requisite tools: There are a lot of software services available to manage budgets and appointments – and these often intersect with location logs and calendars. While ironically time-consuming, ‘time’ is a foundational metric for personal organisation, one often addressed by histories of self-tracking tools like diaries and calendars (Rettberg, 2014; Wernimont, 2018). It forms the basis of much ‘lifelogging’-styled self-tracking today.

The posts on lifelogging were the most involved and detailed confessional data selfies in our sample, such as ‘Every hour of my 2016 charted by category’ and ‘Every Single Hour Of My 2017 Recorded’. Each hour of activity is itemised and coded (often using a spreadsheet) and is then graphed for analysis. Thus, these visualised lifelogs include length of sleep, time spent at school, playing games, travelling, hours in paid work, time with family or with a partner, and so on. Given the depth of these logs, the comments are often investigative and ask after anomalies uncovered in the visualisations. Why did the time spent with a partner disappear, and then reappear months later? What happened the night where OP did not sleep at all? The lifelogging data selfie confesses life patterns in detail, and then elicits curiosity, inviting viewers to probe for narrative.

Bell and Gemmell (2009) see lifelogging practices such as purposive, aimed at bypassing fallible biological memory. Barnet (2010) argues that the quest for recording and remembering is to put one’s self in a ‘dual state: at once pack-rat and amnesiac’. Barnet (2010) explains that to archive obsessively, to record everything in a digital culture, is also to ‘witness the death of memory’ (p. 218). And yet, these lifelogging confessional data selfies in our sample go beyond these motivations for lifelogging and work to forge – as Senft and Baym (2015) contend selfies more generally do – relationships, built on the transmission of human feeling.

The biggest group of posts in this category of routine management was on technology use (n = 9). Pre-installed step and location-tracking apps for smartphones have become near-ubiquitous, and a new trend has emerged whereby devices – through, for example, Apple’s ‘Screen Time’ or Google’s ‘Digital Wellbeing’ – capture meta-level traces of activity, providing users statistics on their app use. There is an ironic pivot here (and noted elsewhere) from technology creating the problem to technology creating the solution to disrupt or sedentary, screen-focussed lifestyles (see Lupton, 2016; O’Neill and Nansen, 2019). One such visualisation, ‘I tracked my phone usage before and after disabling notifications’, shows how OPs time spent on his phone went from 74.9 minutes per day to 61.4 minutes per day after disabling notifications.

Another post in this theme is self-referential to reddit’s interest-based conversations, ‘Plotting my interests over 5 years through subreddit comment distribution’, while yet another was a ‘Network visualization of my twitter followers and their followers’. These two examples are reminiscent of the kinds of data visualisations seen in social media research, such as Bruns and Moon’s (2019) research that maps a day in the life of a ‘national Twittersphere’. Like Bruns and Moon (2019), the redditor who made the data selfie on their Twitter network used Gephi software to produce the network visualisation. On r/DataIsBeautiful, the production of data selfies strays into the methodological territory of social network analysts, sociologists, demographers and media studies scholars. But rather than attending to broader social patterns or media trends, the focus in the data selfie is on the self, for self-reflection. The Twitter network data selfie in our sample contained a deeply confessional aspect, with the network map centres on two large central nodes, one coloured blue and the other pink. OP explains, ‘almost three years after her passing, my late girlfriend is still my strongest connection’.¹ Here, the network visualisation moves from being an abstract but interesting data selfie to a representation of connection, loss and pain, taking on a quality of what Barthes (1981) calls ‘punctum’ – that aspect of an image that holds our gaze. In the act of confession and revelation, that transmission of human feeling occurs.

We identified six finance-focussed confessional data selfies, categorised here under routine management because of the way they often centred on routine spending habits such as ‘Where my monthly paycheck goes – Japan’ and ‘What my gross income of $100,000 is actually used for in Seattle’. These posts again often followed the popular Sankey diagram format (like in Figure 2), with monthly income on the left, broken down into where that money went, from general areas of expense like tax, health insurance, retirement/pension savings, rent, travel and food, down to granular spending such as Netflix subscriptions, motorcycle costs, tattoos and videogames. In the comments on these posts, there were also comparisons across different countries, prompting discussions about how nation’s taxes worked; differing costs of education, rent and health insurance; and reflections on how policies shaped these outcomes. These visualisations touch on a topic that is usually private – how we spend money and justify our consumption practices. In this way, they clearly confess something about the OP, both in terms of where their priorities are (what percentage of income is spent on rent versus put towards savings, for instance) but also how they might be constrained by taxes or by care commitments. These indicators of priorities and constraints then produce discussions, where other redditors ask questions and make comparisons to their own contexts.

Another group of posts under the routine management category were related to study habits, such as mapping time spent on tasks to assessment results, monitoring heart rate during a PhD exam and tracking time slept in the final year of a graduate degree. While each post tells a story, confesses something about how people navigate study and elicits comparison with readers, one post in this group was particularly confessional. ‘Graph of my GPA per semester in college’ charted a decline in GPA over 3 years that bottomed out at zero with an annotation reading ‘suicidal’, before bouncing back up again with an annotation ‘diagnoses and meds’. This kind of image is emblematic of the confessional data selfie, as a data visualisation that operates as an object that – to again return to Senft and Baym’s (2015) conceptualisation of what a selfie says – ‘initiates the transmission of human feeling in the form of a relationship’ (p. 1589). In this example, the mapping of a GPA over 3 years tells one story, of highs and lows, but when it is punctuated by annotations (‘suicidal’ and ‘diagnosis and meds’) it takes on a deeper quality, of personal revelation, an intimate confession, a layer of narrative over the graph. The top comments on this post – among more than 500 – are supportive, celebrative and warm: they commiserate, they lament the ways in which grades come to define a sense of self-worth, they relate, they rejoice, with one redditor saying ‘you have no idea how much hope this gives me’.

While these posts represent extensive labour – in recording routines of spending, time spent on tasks, mapping technology use and so on – they tend to capture mundane experiences. At the same time, they often confess and reveal aspects of the OP like the end of a relationship or a struggle with mental health. Following our conceptualisation of the confessional data selfie and existing qualitative research on the work of self-tracking, data about life’s routines act as a new form of a ‘mirror for the self’ (Lomborg and Frandsen, 2016: 1021). These redditors are drawing on these everyday data to produce these confessional data selfies that, in the sharing, connect with others, but also serve as sites of personal reflection.

Body rhythms

Broadly speaking, this third category accounted for general health or well-being-related posts (n = 6), sleep (n = 4), physical activity (n = 4) and two posts on masturbation. While the everydayness and relatability of metrics like ‘steps’ are appealing to self-trackers, r/DataIsBeautiful tends to be invested in more detailed and apparently revealing metrics. A key example here is sleep data. The post ‘I’ve tracked and plotted my nightly sleep in Excel for the last three years’ presents this longitudinal data, plotted as a weekly rolling average. Redditors then speculate on a range of social events that may impact cycles of more or less sleep across the graph in the comments: Christmas ‘benders’, children, breakups, video gaming and sex are all suggested. The OP eventually explains that the cause was the stress of buying a house, and in addition, the sleep-tracking process itself was a by-product of self-tracking for a neurological condition. This example highlights some of the tensions between medical and social considerations of sleep (O’Neill and Nansen, 2019).

The appearance of posts related to bodies and health here is unsurprising, given the focus of much consumer self-tracking software. For Millington (2016), there is a second fitness boom in the early 21st century, mediated by digital technologies and data – such as smartphone apps, exergames and wearables. These are often framed under ‘healthist’ discourses, and engagement with these carry a sense of moral obligation. In many ways, body rhythm data selfies reflect this. Recorded and then visualised over time, data about bodies often featured a denouement, a moment of insight where some resolution is found and action is prompted. In the case of sleep statistics, this was often an identification of negative behaviour patterns, such as ‘My inefficient sleeping rhythm and how much I need to sleep in on the weekends (based on 1,000 days of data)’. Similarly, the post ‘My health decline and eventual brain surgery’, traced the OP’s daily activity vis-à-vis this illness, both before and after diagnosis.

The notion of confession as part of a story about self-development is not new – especially not where bodies are concerned. Following Foucault’s arguments around confession, studies of weight loss forums (Couch et al., 2019; Grønning and Tjora, 2018) have highlighted how confessional communications perpetuate dominant societal beliefs, and also responsibilise (or if taboo, inculcate) the individual in an ‘act of panoptic self-surveillance’ (Couch et al., 2019: 10). Taking this to a particular extreme, the two posts in our sample on masturbation are very detailed: ‘My Masturbation Habits for the Last 3 Months Visualized’ and ‘I recorded my masturbation habits for a fourth and final year’. In the former post, OP’s data selfie includes annotations such as ‘apparently I didn’t like to jerk off on Tuesdays’, and that he found he used porn to masturbate 92.1% of the time, revealing and confessing practices and patterns that are largely taboo in public discussions.

The final group of posts in this category are related to personal physical activity, such as ‘Three years of running around my neighborhood’. Taken together, the discussion in the comments of body rhythm posts in our sample were often positive, with encouragement and empathy being offered – especially around cases of illness and where the OP interpreted some kind of personal failure. In some instances though, commenters took up the confessional practice and were self-admonishing their own lack of activity. This is further evidence that responses to the sharing of self-tracking practices are often polarised: extreme performances are lauded, but sit in tension with the guilt, shame and jealousy of others – representing the pitfalls of ‘pushed’ and ‘communal’ tenets (cf. Lupton, 2016) of these monitoring practices.

Conclusion

What does the confessional data selfie do? Following Senft and Baym (2015) and Tiidenberg and Whelan (2017), in this article we have sought to further extend the conceptual lens of the ‘selfie’ to a new (but in some ways long-standing) practice of sharing personal data in the form of a data visualisation. As we have discussed through a sub-sample of 59 posts on r/DataIsBeautiful, the confessional data selfie invites analysis, elicits personal story-telling and opens one’s life up to others. The data selfie is an object that ‘initiates the transmission of human feeling in the form of a relationship’ (Senft and Baym, 2015: 1589) through what Barthes (1981) described as ‘punctum’ – that aspect of an image that holds our gaze. At the same time, the data selfie is also a practice, a gesture, that can send ‘different messages to different individuals, communities, and audiences’ (Senft and Baym, 2015: 1589).

Are data selfies, selfies for dudes? On r/DataIsBeautiful, the confessional data selfie has offered this user-base of predominantly straight men a space for open discussions about mental health, trauma and loss and offered opportunities for collectivity, celebration, commiseration and sharing aspects of self. Wernimont (2018: 110–120) traces the inception of self-tracking as a male domain through practices of cartography and timekeeping – through devices like the pocket-watch and, later, the pedometer. These contrasted the more feminised records of ‘life writing’. However, in this modern incarnation – the confessional data selfie – we see a blending of these practices of self-representation in a new form, where quantification is key, but also where personal life narratives reveal, resonate and elicit responses (in the form of upvotes, comments and spin-off reaction data selfies).

Importantly, at least on r/DataIsBeautiful, the valorisation of quantitative data as the object of the data selfie seems to inoculate OPs from the same criticism that many photographic selfie takers are subjected to (especially young women, as per Warfield, 2017: 78) such as shallowness, narcissism and self-absorption (Sorokowski et al., 2015; Williams and Marquez, 2015). Furthermore, the confessional nature of these data selfies – while often appearing secondary to the data in the initial presentation of the selfie – also appears to be the most compelling (commented-on) aspect. There is a paradox here, then, bound up in gendered processes of impression management and appearance. The confessional data selfie is deeply self-interested, reflective and revealing, often requiring equal if not more labour than a photographic selfie, but is not rebuked or belittled as narcissistic or indulgent.

In this article, we have theorised this novel form of digital self-representation that also has an established community. Self-tracking, and the sharing of this self-derived information, has a long history; however, the technologies we use are increasingly producing more elaborate records of our lives and bodies. These personal data raise new concerns around surveillance capitalism while also generating new opportunities for self-reflection, connection and sharing of experiences. It is our argument that confessional data selfies represent a compelling form of story-telling and self-representation, beyond the faces and bodies central to dominant photographic selfie practices, and provide alternate representations of lives and narratives through meaningfully interpreted personal data.

Footnotes

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The authors received support for this research from Monash University’s School of Social Sciences.

ORCID iD

Brady Robards

Notes

Author biographies

Brady Robards is a senior lecturer and Australian Research Council senior research fellow in Sociology at Monash University. He studies digital cultures, with a focus on how people use and produce social media.

Ben Lyall is a research fellow and digital sociologist at Monash University. His research uses mixed-method approaches to explore digital media use in the social world. His current research explores digital health, with a focus on wearable technology and associated self-tracking apps.

Claire Moran is a PhD candidate and research assistant at Monash University, working in the Monash Migration and Inclusion Centre. Her thesis examines the social media practises of African youth in Australia, with a particular emphasis on how digital experiences impact on a sense of belonging.

References

Barnet

(2010) Pack-rat or Amnesiac? Memory, the archive and the birth of the Internet. Continuum: Journal of Media & Cultural Studies 15(2): 217–231.

Barthel

Stocking

Holcomb

, et al. (2016) Nearly eight-in-ten Reddit users get news on the site. Pew Research Center, 25 February. Available at: https://www.pewresearch.org/wp-content/uploads/sites/8/2016/02/PJ_2016.02.25_Reddit_FINAL.pdf

Barthes

(1981) Camera Lucida: Reflections on Photography. New York: Hill and Wang.

Beer

(2019) The Data Gaze: Capitalism, Power and Perception. London: SAGE.

Bell

Gemmell

(2009) Total Recall: How the E-Memory Revolution Will Change Everything. London: Penguin Books.

Boyd

(2010) Social network sites as networked publics: affordances, dynamics, and implications. In: Papacharissi

(ed.) Networked Self: Identity, Community, and Culture on Social Network Sites. New York: Routledge, pp. 39–58.

Braithwaite

(2016) It’s about ethics in games journalism? Gamergaters and geek masculinity. Social Media+ Society 2(4): 1–10.

Bruns

Moon

(2019) One day in the life of a national Twittersphere. Nordicom Review 40(s1): 11–30.

Cardell

(2014) Dear World: Contemporary Uses of the Diary. Madison, WI: University of Wisconsin Press.

10.

Couch

Han

Robinson

, et al. (2019) Men’s weight loss stories: how personal confession, responsibility and transformation work as social control. Health 23: 76–96.

11.

Danaher

Nyholm

Earp

(2018) The quantified relationship. The American Journal of Bioethics 18(2): 3–19.

12.

Duc

HDT

Mir

(2016) Data selfie. Available at: https://22-8miles.com/data-selfie/ (accessed 23 August 2019).

13.

Esmonde

Jette

(2018) Assembling the ‘Fitbit subject’: a Foucauldian-sociomaterialist examination of social class, gender and self-surveillance on Fitbit community message boards. Health 24: 299–314.

14.

Fiesler

Proferes

(2018) ‘Participant’ perceptions of Twitter research ethics. Social Media + Society 4(1): 1–14.

15.

Foucault

(1998 [1976]) The History of Sexuality: 1: The Will to Knowledge. London: Penguin Press.

16.

Grønning

Tjora

(2018) Digital absolution: confessional interaction in an online weight loss forum. Convergence 24(4): 391–406.

17.

Kim

Riche

, et al. (2019) DataSelfie: empowering people to design personalized visuals to represent their data. In: CHI’19: proceedings of CHI conference on human factors in computing systems, Glasgow, 4–9 May, pp. 1–12. New York: ACM.

18.

Lambert

(2016) Bodies, mood and excess: relationship tracking and the technicity of intimacy. Digital Culture & Society 2(1): 71–88.

19.

Leaver

(2017) Intimate surveillance: normalizing parental monitoring and mediation of infants online. Social Media + Society 3: 1–10.

20.

Lim

(2020) Transcendent Parenting: Raising Children in the Digital Age. Oxford: Oxford University Press.

21.

Lomborg

Frandsen

(2016) Self-tracking as communication. Information, Communication & Society 19(7): 1015–1027.

22.

Lupton

(2015) Quantified sex: a critical analysis of sexual and reproductive self-tracking using apps. Culture, Health & Sexuality 17(4): 440–453.

23.

Lupton

(2016) The Quantified Self. Cambridge: Polity Press.

24.

Lupton

Williamson

(2017) The datafied child: the dataveillance of children and implications for their rights. New Media & Society 19(5): 780–794.

25.

Massanari

(2017) #Gamergate and the Fappening: how Reddit’s algorithm, governance, and culture support toxic technocultures. New Media & Society 19(3): 329–346.

26.

Millington

(2016) Fit for prosumption: interactivity and the second fitness boom. Media, Culture & Society 38(8): 1184–1200.

27.

Mills

(2018) Pop-up political advocacy communities on Reddit.com: SandersForPresident and The Donald. AI & Society 33(1): 39–54.

28.

Neff

Nafus

(2016) Self-tracking. Cambridge, MA: MIT Press.

29.

O’Neill

Nansen

(2019) Sleep mode: mobile apps and the optimisation of sleep-wake rhythms. First Monday 24(6). Available at: https://dx-doi-org.web.bisu.edu.cn/10.5210/fm.v24i6.9574

30.

O’Neill

(2019) Synaesthesia and cycling data art: towards cross-modal representations of self-tacking cycling data. Digital Creativity 30(2): 127–140.

31.

Ravn

Barnwell

Barbosa Neves

(2019) What is ‘publicly available data’? Exploring blurred public – private boundaries and ethical practices through a case study on Instagram. Journal of Empirical Research on Human Research Ethics 15: 40–45.

32.

Reigeluth

(2014) Why data is not enough. Surveillance & Society 12(2): 243–354.

33.

Rettberg

(2014) Seeing Ourselves through Technology: How We Use Selfies, Blogs and Wearable Devices to See and Shape Ourselves. Basingstoke: Palgrave Macmillan.

34.

Rettberg

(2018) Apps as companions: how quantified self apps become our audience and our companions. In: Ajana

(ed.) Self-Tracking: Empirical and Philosophical Investigations. Basingstoke: Palgrave Macmillan, pp. 27–42.

35.

Robards

(2018) Belonging and neo-tribalism on social media site Reddit. In: Bennett

Hardy

Robards

(eds) Neo-Tribes: Consumption, Leisure and Tourism. London: Palgrave Macmillan, pp. 187–206.

36.

Senft

Baym

(2015) What does the selfie say? Investigating a global phenomenon. International Journal of Communication 9(Feature): 1588–1606.

37.

Sorokowski

Sorokowska

Oleszkiewicz

, et al. (2015) Selfie posting behaviors are associated with narcissism among men. Personality and Individual Differences 85: 123–127.

38.

Squirrell

(2019) Platform dialectics: the relationships between volunteer moderators and end users on reddit. New Media & Society 21: 1910–1927.

39.

Summit-Gil

(2017) Is reddit a community? Cyborgology, 23 March. Available at: https://thesocietypages.org/cyborgology/2017/03/23/is-reddit-a-community/ (accessed 22 May 2017).

40.

Tiidenberg

Whelan

(2017) Sick bunnies and pocket dumps: “Not-selfies” and the genre of self-representation. Popular Communication 15(2): 141–153.

41.

Van Der Nagel

Frith

(2015) Anonymity, pseudonymity, and the agency of online identity: examining the social practices of r/Gonewild. First Monday 20(3). Available at: https://firstmonday.org/article/view/5615/4346

42.

Van Der Vlist

Helmond

(2017) Speculative data selfies. Internet Policy Review, 1 March. Available at: https://policyreview.info/articles/news/speculative-data-selfies/449 (accessed 23 August 2019).

43.

Warfield

(2017) MirrorCameraRoom: the gendered multi-(in)stabilities of the selfie. Feminist Media Studies 17(1): 77–92.

44.

Weiser

(2015) # Me: narcissism and its facets as predictors of selfie-posting frequency. Personality and Individual Differences 86: 477–481.

45.

Wernimont

(2018) Numbered Lives: Life and Death in Quantum Media. Cambridge, MA: MIT Press.

46.

Williams

Marquez

(2015) The lonely selfie king: selfies and the conspicuous prosumption of gender and race. International Journal of Communication 9(1): 1775–1787.

47.

Zimmer

Kinder-Kurlanda

(2017) Internet Research Ethics for the Social Age: New Challenges, Cases, and Contexts. New York: Peter Lang International Academic Publishers.