Abstract
This article investigates a rapidly expanding branch of journalism innovation in online news media. The umbrella term computational exploration in journalism (CEJ) embraces the multifaceted development of algorithms, data, and social science methods in reporting and storytelling. CEJ typically involves the journalistic co-creation of quantitative news projects that transcend geographical, disciplinary, and linguistic boundaries. Drawing on extensive empirical data, this article provides a conceptual overview of the field by identifying three main pathways of computational exploration in journalism: the newsroom approach, the academic approach, and the entrepreneurial approach. Implications for changing journalistic practice are discussed, and the theorizing is summed up in a triplex proposition about changing mindset processes coming out of CEJ. The study indicates that computational exploration leads not only to innovative uses of technology, but also to innovative ways for journalists to think and behave; journalism innovation leads to innovation journalism.
Keywords
Introduction
In the globalized, digital age, governments, organizations, and individuals across countries might store, and potentially have access to, unlimited amounts of structured and unstructured data. Large quantities of political, economic, demographic, and other quantitative data are released in public databases (Flew et al., 2012; King, 2011), and computing experts can easily search and run automated text analyses of more than 100 million social media posts per day (Hopkins and King, 2010). In practice, many countries are in the midst of a database revolution in which a great variety of actors – from politicians, bureaucrats, academics, web developers, and journalists to the general public – are discovering and conquering new ground in information collection and dissemination.
At the same time, individuals from government down and citizens up are increasingly experiencing a state of information overload. Information surplus is identified as a main constituent in the transition from a control paradigm to a chaos paradigm in the globalized news culture (McNair, 2006). But even if accessibility and diversity are constantly growing and database storage capacity is unlimited, knowing how to find what one is looking for is a threshold many people have yet to cross. McNair (2006) has termed the changing relationship between journalism and power ‘cultural chaos’, and several studies indicate that information overload affects decision-making negatively (Buchanan and Kock, 2000; Eppler and Mengis, 2004). Thus, researchers and media CEOs as well as political decision makers foresee that being able to retrieve information and identify patterns by cutting across immense quantities of data is crucial for the further democratic transparency of society (Flew et al., 2012; King, 2011; Meyer, 2002[1973]). Moreover, investing in content computerization is considered key for news organizations that aim to cope with efficiency pressures in the news market (Pavlik, 2013).
In the western world, major news organizations, such as the Guardian in the United Kingdom and the New York Times in the United States, have taken leading roles in the transnational open-source movement. These news outlets have taken tremendous steps to make governments open their databases, and to build their own huge databases where datasets are missing. The Guardian, for example, launched its Content API and Data Store, together known as the open-platform initiative. The main actors in the open-source movement across countries are identical to those who explore computing potential in journalism: journalists, software developers, and computer scientists inside and outside of established news organizations.
At the same time, observers note that governments are ‘tightening control over access to information under cover of the so-called “war of terror”’ (Nielsen, 2012: 61). Cost issues are listed by government representatives as a main obstacle against more transparency (Compact Voice, 2012; Ovrebo, 2011), along with a fear of data misinterpretation (Ovrebo, 2011). Using quantitative data in investigative reporting and accountability journalism is also challenging because much data does not originate digitally and is not collated centrally within governmental departments.
Nevertheless, it is easily taken for granted that the news media, as long-time providers of quantifiable news beats to many audiences, will naturally claim a leading position in the extended quest for freeing governmental datasets. Whereas much theoretical and practical research has focused on the technological promises of computing for journalism per se (Cohen et al., 2009, 2011; Flew et al., 2012; Hamilton and Turner, 2009), less is known about innovation processes that take place in knowledge institutions such as the news media. In particular, little is known about human aspects of extended technological approaches to journalism. Anecdotal evidence suggests that, within the industry itself, there is considerable confusion about human issues such as skill development, role changes, collaboration forms, ethical grey zones, and the application of new research methods.
Questions and data
This article investigates these issues from the inside out. By identifying the prevalent ideas, thoughts, and actions of a spectrum of professional news media actors, the study provides a new perspective on what is going on beneath the general vision of extended democracy building by means of extensive database trawling in journalism.
The main research question of the study is: What are the most crucial factors for the future role of journalism in the process of applying software and new technologies to the collection, analysis, and presentation of large datasets retrieved from databases?
The subsequent theorizing of what I term ‘computational exploration in journalism’ (CEJ) is based on empirical data from online news accounts, listservs, and English-language journalism blogs over a period of one year. I define computational exploration in journalism as ‘the innovative processing that occurs at the intersection between journalism and data technology’. The concept embraces the experimental use of algorithms, data, and social science methods in the news media, a process that ranges from data retrieval and data analysis to data visualization. CEJ may include the building of new technological tools for data mining and visualizations, or it may imply using existing tools in new ways. Computational exploration in journalism transcends former geographical, disciplinary, and linguistic boundaries and is carried out at individual and group levels as well as at organizational levels.
The data material includes several hundred posts about the use of quantitative data in journalism, most of which were retrieved from the Data Blog at www.guardian.co.uk (Guardian Data Blog, 2013) in the UK and the listserv of the National Institute of Computer-Assisted Reporting (Nicar-L) in the USA. The running news data were supplemented by relevant research literature and conversations with actors in the data journalism realm.
The theorizing that came out of the study includes: (a) an identification of the main attitudinal stances of actors in the field; (b) an analysis of terminology generation in the same field; (c) categorization and case illustrations of main approaches to computing in journalism; and (d) an alignment of CEJ with innovation news coverage in general. The threads are pulled together in a triple hypothesis about the impact of computational exploration in journalism in the future.
High-tech optimists and practice-focused skeptics
In the journalism realm as well as in media research, there is a divide between high-tech optimists and practice-focused skeptics. The former are convinced that by means of CEJ, the most influential journalistic approaches of the future will be carried out by professional news media institutions (Cohen, 2011; Flew et al., 2012; Hamilton and Turner, 2009; King, 2011). The latter, the practice-focused skeptics, point out that the traditional news industry does not have enough of the innovation culture needed for such a complex transformation (Uskali et al., 2008).
If one follows the high-tech optimist stance, the solution seems to be an adaptive-generative approach: online technological innovations should be constantly adapted and generated for journalistic purposes. The implicit assumption is that journalists have the resources and tools needed to get relevant and interesting news out of numbers in, for instance, governmental databases (Cohen, 2010, 2011; Flew et al., 2012). High-tech protagonists go as far as to highlight that analyzing quantitative data will be journalists’ principal work in the future; their main argument is identical to that of accountability journalism: the press should be responsible for knowing how to work with large data sets in order to hold governments, or anyone else, accountable (Arthur, 2010; Flew et al., 2012; Meyer, 2002[1973]).
The high-tech visions are supported by a survey conducted by the Pew Research Center’s Project for Excellence in Journalism (2008), according to which 96 percent of American editors say that it is important for journalists to acquire computational skills. However, in a survey among Norwegian editors, only 10 percent of the news editors wanted to prioritize innovative projects in data journalism, whereas 55 percent said they would not be investing resources in data journalism (Ovrebo, 2010). The editors were not disinclined to invest in technology; rather, they revealed a lack of interest in investing in human training. At the same time, more than half of the editors lamented the lack of statistical skill among newsroom reporters (Ovrebo, 2010).
In contrast, a widespread complaint among practice-focused skeptics is the lack of innovative competition in conventional newsrooms. The skeptics point out that, within mainstream newsrooms, there is an insufficient awareness of the innovation processes that are needed to do such digging. Other skeptics insist that the missing culture for innovation in the traditional news industry is a main reason why some newspapers, TV stations, and radio stations are losing money and going out of business. For instance, Uskali et al. (2008: 23) claim that ‘Traditional media companies preserve and encourage the established values, instead of having an innovative mindset’, and that nourishing an innovative mindset is a question of culture.
It appears that both the high-tech optimists and the practice-focused skeptics are concerned about processes and effects of journalism innovation. Where they disagree is over resources and priorities to support this significant turn in global media history.
Creating meaningful terms
To communicate meaningfully – to define situations, experiences, tasks, thoughts, and the relationships among them – humans need to be in possession of terms or concepts that seem relevant to what they are trying to convey (Demers, 2005). During periods of significant change, however, anecdotal evidence indicates that human actions often take place before those actions have been named. Similarly, when a multiplicity of new terms and concepts suddenly surface in a specific professional field, it typically signals that innovative processing is intensified and that many people are engaged.
Depending on positions and roles, different individuals tend to conceptualize the same field differently. This is true also for the evolving field of CEJ, which journalists, researchers, and web developers, to mention a few, might perceive quite differently. This article seeks to bridge that gap by providing an overview of journalist, developer, and research terms; reciprocal knowledge of vocabularies might open up further interdisciplinary collaboration and co-creation. Also, the coupling of theory-focused and practice-focused terms provides a firmer grasp of what is going on in the field; it highlights the fact that the evolving experimentation with data in journalism may take different forms, depending on access to human, technological, and financial resources. The many new terms speak to the fact that journalistic innovations, like innovation in general, are about cognitive expansion, and that they deal with changing values as well as with changing thought systems (Kazanchi et al., 2007; Schumpeter, 1934; Uskali et al., 2008).
During the first decade of the 21st century, new terms such as data journalism, data-driven journalism, computational journalism, journalism as programming, programming as journalism, open-source movement, and news applications have all contributed to the general conception of what is happening at the intersection of new technology and journalism. Among insiders, discussions typically contain a mix of technological jargon and a multitude of new concepts, as in this example from Sir Tim Berners-Lee, the inventor of the World Wide Web:
[The Web’s future] lies with journalists who know their CSV from their RDF, can throw together some quick MySQL queries for a PHP or Python output … and discover the story lurking in datasets released by governments, local authorities, agencies, or any combination of them – even across national borders. (Quoted here from Arthur, 2010)
The quote exemplifies the presence of jargon in innovative processes, which can be seen as ‘a vocabulary of action’. Often, jargonizing and conceptualizing go hand-in-hand. The use of jargon can distinguish ‘insiders’ from ‘outsiders’, and in an early phase of any innovative process, it might be challenging to predict which terms are going to survive and thrive in the long run. There is usually a fine line between jargonizing to demonstrate status, and the emergent development and use of sustainable concepts. Nevertheless, in the following section, I propose a prolonged existence for a few. Structurally, three main but diverging pathways for handling large data sets are identified here, and the focus of all three is on databases – both those that already exist and those that have yet to be built.
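As a hedged illustration of the kind of routine the Berners-Lee quote alludes to, the following minimal Python sketch loads a small CSV extract into a database and runs a quick SQL query over it. It is not drawn from any of the projects discussed in this article: the dataset, the column names, and the function names are hypothetical, and SQLite (bundled with Python) stands in for MySQL so that the example is self-contained.

```python
import csv
import io
import sqlite3

# Hypothetical CSV extract; the agencies and figures are invented for illustration only.
SAMPLE_CSV = """agency,year,spending
Health,2011,120
Health,2012,150
Transport,2011,80
Transport,2012,60
"""


def load_rows(csv_text):
    """Parse CSV text into (agency, year, spending) tuples."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(r["agency"], int(r["year"]), float(r["spending"])) for r in reader]


def spending_change(rows):
    """Return (agency, change) pairs for 2011 to 2012, largest increase first."""
    conn = sqlite3.connect(":memory:")  # in-memory SQLite, assumed here in place of MySQL
    conn.execute("CREATE TABLE budget (agency TEXT, year INTEGER, spending REAL)")
    conn.executemany("INSERT INTO budget VALUES (?, ?, ?)", rows)
    query = """
        SELECT agency,
               SUM(CASE WHEN year = 2012 THEN spending ELSE 0 END)
             - SUM(CASE WHEN year = 2011 THEN spending ELSE 0 END) AS change
        FROM budget
        GROUP BY agency
        ORDER BY change DESC
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


changes = spending_change(load_rows(SAMPLE_CSV))
for agency, change in changes:
    print(f"{agency}: {change:+.0f}")
```

In this sketch, the ‘story lurking in the dataset’ is simply the ranking of agencies by year-on-year change; deciding whether such a change is newsworthy, and checking the data behind it, would remain a journalistic task.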
The newsroom approach
The newsroom approach refers to CEJ that takes place within the structural frames of established news media institutions. The newsroom approach is based on journalistic traditions for handling quantitative data that actually go back more than half a century. It embraces data journalism and the open-source movement and is particularly concerned with cross-disciplinary collaboration. The newsroom approach integrates several sub-categories of CEJ which are clearly interrelated but spring from temporally, geographically, and disciplinarily different contexts. Implicitly, the overview of the newsroom approach also provides a historical background for computing in journalism in modern times.
Computer-assisted reporting (CAR)
Although high-tech terms used by news professionals have been much discussed during the last decade, the most widespread term within American newsrooms is still ‘computer-assisted reporting’, or CAR for short. CAR includes techniques such as data searches on the web, spreadsheet and/or statistical analysis of various public records, and geographical and other information mapping. CAR also allows users to seek background information online and to interview people by email or social media. The term ‘computer-assisted reporting’ emerged long before digitalization of the media. The concept can be traced back to 1952, when CBS used a computer program to analyze aspects of a presidential election in the United States (Meyer, 2002[1973]). Another term, ‘database journalism’ (15,100 hits on Google as of 10 February 2013), also emerged in the 1950s and has been used as a synonym for computer-assisted reporting.
Precision journalism, Philip Meyer, and NICAR
One of CAR’s foremost developers since the 1960s has been Philip Meyer, whose seminal work was Precision Journalism: A Reporter’s Introduction to Social Science Methods (2002[1973]). Meyer stood in the journalistic tradition, but advocated that journalists should learn adequate research methods from scientists. According to Meyer, precision journalism was perceived by his colleagues as more or less an equivalent to survey research, and he himself was introduced as a ‘computer journalist’. In the 2002 version of his book, Meyer points out that he was quite disappointed to be awarded this, in his view, degrading title, since his intention was ‘to encourage my colleagues in journalism to apply the principles of scientific method to their task of gathering and presenting the news’ (2002[1973]: vii).
It still took 16 years before NICAR, the National Institute of Computer-Assisted Reporting, was established in 1989. Since then, thousands of reporters from the USA and more than 30 other countries have been trained in applying computing to their journalistic activities through NICAR’s independent, organizationally overarching program for investigative editors and reporters. In addition to the annual IRE (Investigative Reporters and Editors) and CAR conferences, NICAR runs a number of boot camps, database services, an online listserv, and other discussion forums.
From an innovation perspective, the 60 years between 1952 and 2012 is a long time. Computers were used to analyze huge governmental records as early as 1969 (Meyer, 2002[1973]), and investigative journalists have built their own quantitative databases since the early 1990s. Elections, health statistics, national budgets, and other numerical variables are traditional, well-known quantitative sources in journalism. What is new in the digital era is the extended dimensioning and accessibility of computational opportunities inside and outside of news organizations. Fewer than 5000 editors and reporters are registered on the NICAR mailing list, which might be explained by the fact that CAR is mainly associated with investigative reporting, a small branch of journalism. Some people in the industry tend to speak of CAR as the earlier generation of data-devoted journalists.
The gradual growth of CAR is nevertheless of great importance, since it built a foundation for the current newsroom approach as well as the entrepreneurial and academic approaches to digital handling of quantitative data in journalism.
Data journalism: An extended newsroom approach
The growing academic focus on innovation in journalism is also evident in ‘data-driven journalism’ (458,000 hits on Google as of 10 February 2013), more often referred to as ‘data journalism’ (986,000 hits on Google as of 10 February 2013). This direction, unlike computer-assisted journalism, deals only with open data – data that is freely available online and can be analyzed with freely accessible open-source tools. Data journalism refers to the process by which journalists use numerical data in databases as their primary news material. When adopted by guardian.co.uk, the world’s leading pioneer in the field, the term rapidly spread in Europe and in the USA. The Guardian’s Content API and Data Store are called the open-platform initiative, since the Guardian not only does original research on data it has obtained; its Data Blog also provides a searchable index of world government data which contains more than 800 datasets (as of 10 February 2013).
The Guardian’s readers are encouraged to help analyze data sets, provide feedback or additional data, formulate research questions, and submit applications and visualizations that they have created from accessible data in the Data Store. According to the data blog editor Simon Rogers, ‘data has proven to be just as popular with non-developers – regular people who want the raw information’ (Stray, 2010: 1).
A successful example of cross-disciplinary teamwork
So far, I have identified and further developed the terminology of CEJ, pointing out the newsroom approach as a main pathway. All these initiatives, however, presuppose the existence of some human, journalistic qualities that are easily taken for granted. To get a better feel of the innovative processes going on within a newsroom environment, I examine a successful example of cross-disciplinary teamwork carried out by the Toxic Waters team working for the New York Times. Cross-disciplinary teamwork approaches involve the full integration of computing skills in newsroom work processes, but do not require each news professional to possess a full repertoire of computational thinking and skills. This example is significant, since it has general implications for issues of collaboration in CEJ.
In 2010, the reporter Charles Duhigg and his team received the IRE Medal, one of the most prestigious awards for investigative journalism in the United States (Alvares, 2010). Duhigg was the project leader and writer of ‘Toxic Waters – a series about the worsening pollution in American waters and regulators’ response’. The series was developed by Duhigg and a database-and-multimedia team comprising seven programmers and two videographers. The goal of this investigative team was to inform the public about the quality of drinking water across the nation. The series started out with stories about individuals who became ill because of excessive heavy metals and other toxins in tap water. Since a thorough investigation of data on the topic resulted only in a fractured collection of databases, Duhigg’s team members built their own database.
The data were collected by sending more than 500 Freedom of Information Act requests to all states and several federal agencies (Duhigg, 2010). From government records, which were of low quality, the investigative reporters and data experts gathered an exhaustive amount of information on the protection of drinking water. Shockingly, they found that, since 2004, 62 million Americans had been exposed to toxic tap water. The stories were complemented with quantitative data from databases and with interactive graphics that documented the situation in every state and every city in the country. The databases, which were searchable by anyone interested, generated millions of clicks. According to the IRE Award board, the result was ‘a sweeping indictment of the system’. The series led to crackdowns, new environmental rules in the United States, and new appropriations for clean-water projects.
The New York Times project is particularly interesting for at least three reasons. First, the series illustrates that high-quality CEJ requires ‘thinking at multiple levels of abstraction’ (Wing, 2006: 33), which implies understanding scale and its consequences, for economic and social reasons. It also requires news professionals to have clear-cut journalistic goals, to be able to spot significant societal problems, to think big, to work their way through bureaucracies, and to participate in teams of people who have diverse skills and understandings of what the project will entail. Second, the series made evident that, through methods of data mining, screen scraping, and a number of other technological tools, there is a rapidly growing quest among people to get access to untapped sources of data, particularly in the public sector.
Third, the series says something about the resources needed to produce quality journalism: one of the world’s largest and most influential news organizations invested in 10 experts over a period of several months. This was no small achievement; according to Duhigg (2010), it took a lot of work and much skill to keep such a big group of complementary experts working on a single project for so long. In most newsrooms, even at the New York Times, there is a shortage of human resources to carry out time-consuming investigations.
The entrepreneurial approach
The second approach, the entrepreneurial, is focused on the constitution and maintenance of the database that web or mobile applications can be built upon, and is exemplified by the Adrian Holovaty initiative. Inventions within a variety of newsroom structures support the general truth that innovation and change usually start with the ideas of individual creators. Since the turn of the 21st century, the funding crisis of the traditional news industry has prompted hundreds of online startups in both the United States and Europe, mostly non-profits. The competencies of investigative reporters have to a large extent moved from established newsrooms to the entrepreneurial realm.
Many people insist that most of the exciting innovation in journalism is happening outside news organizations (Boczkowski, 2004; Bradshaw, 2010, cited in Arthur, 2010). A strong voice in favor of the entrepreneurial approach in the UK is the journalist, blogger, and university reader Paul Bradshaw, who is the co-founder of helpmeinvestigate.com, a website for investigative journalism. Other entrepreneurial journalism sites in the UK include openlylocal.com, charitiesdirect.com, wheredoesmymoneygo.org, scraperwiki.com and thebureauinvestigates.com. In the USA, propublica.org and californiawatch.org represent two of the most successful news sites that have surfaced during the last decade.
The careers of many news entrepreneurs have been fostered and supported by NICAR and IRE. But much innovation is also happening outside what is traditionally perceived as journalism, or, at least, taking place on the outskirts of journalistic norms and formats. Once more, conceptualization is a leading thread.
One of the more disputed concepts is ‘journalism as programming’, which refers to the database as the locus of news attention. Journalism as programming was introduced by Adrian Holovaty, who initially was in charge of editorial innovations at the Washington Post, but left after a few years. As early as 2005, Holovaty developed one of the original Google Maps mashups, http://chicagocrime.org, followed by its successor, http://everyblock.org, in 2007.
In 2006, Holovaty published his much-cited manifesto ‘A fundamental way that journalism needs to change’, in which he proposed that to remain essential information-providers to their communities, ‘Newspapers need to stop the story-centric view’ (2006: 1). People don’t need all the words any more, Holovaty claimed. Storytelling is out. What they do need are ‘facts’, which according to him are numbers and figures, statistical data. As long as people have access to databases with vital data about crimes, or restaurants, he pointed out, they can create their own news, and search to find whatever it is that they need. In this view, they do not need anybody to interpret the data for them.
Holovaty was called the patron saint of a movement that created somewhat of a revolution by releasing the potential of news embedded in data, through new ways of sifting through and sharing data (Ingram, 2009). Of the data in everyblock.com, 60 percent was pulled from other sources, and one of Holovaty’s favorite tools was screen scraping (Holovaty, 2009a, 7 May). Three years after the 2006 manifesto, Holovaty indicated that he had moved on from the journalism paradigm. On his website he wrote:
It’s a hot topic among journalists right now. Is data journalism? Is it journalism to publish raw database? Here, at least, is the definitive two-part answer: 1. Who cares? 2. I hope my competitors waste their time arguing about this as long as possible. (Holovaty, 2009b)
The non-interpretive use of raw data is disputed. Since Holovaty’s method relies solely on trusting quantitative data as a secondary source, there is no validity checking of the data before it is made publicly available. Nevertheless, the case of Holovaty and his journalism-as-programming mission exemplifies an entrepreneurial start-up activity that has attracted considerable attention and sparked interest in digital innovations in journalism. On everyblock.com, anyone could check out what was going on close by in 16 American cities by getting civic information, media mentions, and postings from neighbors. In 2009, the site was acquired by msnbc.com, later NBC News Digital, and on 7 February 2013, everyblock.com announced that it had closed down. At the same time, the more than 1100 commenters recommended moving on to emerging sites such as nextdoor.com, neighblr.com, lociville.com, wikiblock.com, romio.com, and circlesavvy.com.
The academic approach
Even though the use of quantitative social science methods in journalism was first introduced by Philip Meyer (2002[1973]), several decades passed before information scientists and media researchers became interested in sparking computational innovation in journalism. With the internet, researchers across disciplines have increasingly engaged in journalism innovation projects and multidisciplinary collaboration. These researchers, who come from a variety of groups within the research community, make up the academic approach, which is the third direction of CEJ.
In the USA and parts of Europe, the academic approach to CEJ is now widely known under the term ‘computational journalism’. The field surfaced at the Georgia Institute of Technology in 2006, and gained international acceptance as it was adopted by a number of science communities in the USA and in Europe, and also by funders such as the Knight Foundation. On Google, the number of hits for ‘computational journalism’ (17,800) is roughly equivalent to that for ‘precision journalism’ (17,000) as of 10 February 2013.
Hamilton and Turner (2009: 1) define computational journalism as ‘the combination of algorithms, data and knowledge from the social sciences to supplement the accountability function of journalism’. By contrast, Nick Diakopoulos has chosen a work process approach and defines computational journalism as:
… the application of computing and computational thinking to the activities of journalism including information gathering, organization and sensemaking, communication and presentation, and dissemination and public response to news information, all while upholding core values of journalism such as accuracy and verifiability. It is inclusive of CAR (Computer-Assisted Reporting) but distinctive in its focus on the processing capabilities (e.g. aggregating, relating, correlating, abstracting) of the computer in comparison to mundane aspects of storage or access. The field draws on technical sub-fields of computer science including information retrieval, artificial intelligence, content analysis, visualization, personalization, and recommender systems as well as aspects of social computing and information science. (2011[2010]: 1)
Sarah Cohen et al. (2009: 3) envision a system to support collaborative investigative journalism, a system based on ‘a cloud for the crowd, which combines computational resources as well as human expertise’ to increase efficiency. The researchers exemplify how crowd-sourcing via the cloud can be actively directed to generate and verify results in stages, and thus multiply stories at a low cost. They point out an important switch in journalistic competency: to date, ‘much of the database research has focused on answering questions’, but for journalism, finding the most relevant and interesting questions to ask is becoming increasingly important (2009: 3). The DocumentCloud project, started by a group of ProPublica and New York Times reporters in 2008, addresses some of these issues, and the large-scale crowd-sourcing experimentation carried out at guardian.co.uk also reflects the visions of Cohen et al.
Computational thinking and the impact of individual creators
The many approaches merging into computational exploration in journalism make evident that the quality and direction of CEJ do not depend primarily on technological skills or tools. Rather, I argue that they depend on what Jeanette M. Wing has termed ‘computational thinking’ (Wing, 2006), which is an aspect of human cognition. Computational thinking refers to a way of ‘solving problems, designing systems, and understanding human behavior that draws on concepts fundamental to computer science’ (2006: 33). The focus of computational thinking is on humans, and it includes a range of mental tools: computational thinking is ‘a way that humans solve problems, it is not trying to get humans to think like computers’, since ‘Computers are dull and boring: humans are clever and imaginative’ (2006: 35).
The idea of computational thinking integrates logical, algorithmic, scientific, and innovative dimensions of human cognition. It includes abstracting and decomposing data when approaching complex tasks, in addition to building algorithms for pattern recognition. In a journalism context, computational thinking supports and helps explain the impact of individual creators such as Philip Meyer, Adrian Holovaty, and Charles Duhigg. The work of these outstanding computational innovators speaks to the storytelling power of computational exploration in journalism. Computational thinking is in alignment with cultivating an innovative mindset; it presupposes openness and curiosity towards new ideas, change, and quality improvement, and it welcomes risk-taking and new challenges.
Philip Meyer, for example, was a computing pioneer whose computational thinking skills were ahead of his time. Nearly half a century had to pass before his ideas were widely embraced. His visions of journalism as science are still not fully realized. Fifty years after Meyer, Adrian Holovaty also broke out of established norms of what contemporary journalism was supposed to imply. By acting in an innovative style instead of an adaptive style, both of them have contributed to the extension of journalistic frames for work. Interestingly, Charles Duhigg chose the collaborative pathway to reach the level of journalistic excellence through computational thinking. He demonstrated how cross-disciplinary collaboration might help resolve the very complex challenges facing news professionals in the digital age. Collaborative incentives are spreading within and among news organizations. Former walls between academics and practitioners are also breaking down, not least through the computational journalism initiative.
There is a link from these examples to WikiLeaks’ system for controlled leakages on a large scale. WikiLeaks’ release of top secret raw data from the US government in 2010 also demonstrated the potential impact that a single creator’s thinking about computing power can have. These cases all highlight that the outcomes of computer-assisted reporting, data journalism, journalism as programming, and computational journalism are not accidental; they all depend on human input.
From programmers to creators
Here we arrive at a decisive point in the argument I am developing in this article. From the listservs and blogs that constitute the empirical data of this study, it emerges that a main concern for programmers, a professional group who only recently entered the newsrooms, is how to become creators and innovators. These new technologists aspire to do more than routine computing tasks for other content creators. They want to become computational thinkers, that is, to actively take part in content development and the search for computational solutions to complex problems. At the current stage of CEJ, a large number of computing specialists are struggling to find their place within the journalistic realm. Many individuals engaged in the cross-collaboration processes are pondering their roles, their status and their professional titles (Myers, 2011; Taylor, 2009).
On the helplist of ‘hacks&hackers’, an online interest group for news people who work at the intersection between journalism and programming, the question of titles has been repeatedly brought up. New terms such as ‘jourveloper’, ‘progojournalist’, and ‘hacker-journalist’ have been introduced. Among the most frequently used at this point of digital history are data journalist, programmer, computational journalist, database editor, news application developer, journalist-programmer, coder-journalist, and computationalist.
Once more we are back to questions of conceptualization and jargonizing. Thus, awareness of evolving terms in an innovative field such as CEJ might help to explain the state of computational skills in the newsrooms as well. Evidently, such competence integration is a gradual process, which, similar to any other innovation process, takes time. Pearson (2009: 2) predicts that the infusion of computational thinking into journalism is going to change the epistemology of the field just as much as did the implications of ‘objective reporting’ 100 years ago. However, when it comes to applying aspects of computational thinking to practical situations, there is much disagreement both among practitioners and among computational researchers. Anecdotal evidence indicates that neither the news industry nor media researchers claim that every journalist should become a programmer or vice versa. However, a survey (Project for Excellence in Journalism, 2008), which found that the quest for computational skills ranks highest among American editors (96%), indicates that news professionals are expected to get better at reasoning abstractly about what they do and to have more insight into a breadth of computational tools. The survey also indicates that journalists and programmers are expected to collaborate to use those tools most efficiently.
A question that still awaits an answer is this: To what extent are journalistic and computational thinking, and not only computational skills, embedded in the generation and analysis of significant questions? To what extent is training in logical, algorithmic, scientific thinking and in journalistic sorting and selection part of the same package?
It can be argued that the concepts used in CEJ are all indicators of the actors’ innovative ambitions, and that they are on their way. For instance, computational journalism, the academic approach, aims to support computational thinking by changing the fundamental thought system in journalism from descriptive storytelling to abstract reasoning, autonomous research and visualization of quantitative facts. Journalism as programming, which exemplifies an entrepreneurial approach, basically conveys that there is no difference between journalism and the presentation of raw data: readers are autonomous and creative people who are trusted to do the searches and analyses they want on their own. The cross-disciplinary team approach conveys that in a complex society, quality journalism requires innovative ways of doing investigations. On top of the list are good ideas and management skills that enable humans with complementary, logical and algorithmic skills, attitudes, and values to implement the ideas together.
CEJ as a door opener to innovation at large
So far, we have investigated a dimension of journalism innovation that operates mainly within the media business itself. In this section, the focus is broadened, since innovation is not a phenomenon exclusively linked to journalism practice. Rather, innovation as an influential societal force is scarcely mirrored in the news beat. Thus, we ask: What transference value might there be between journalism innovation and innovation journalism, that is, the reporting of innovation at large?
One decade ago, John Pavlik predicted that the ongoing technological changes would exert a profound influence on journalism in at least four ways (2000: 236). He suggested that journalistic work processes, news content, and the structure of the newsroom and the news industry would change. He also predicted that the relationship between news organizations and their audiences would change (2000). Furthermore, Pavlik suggested that the emergence of the internet and the World Wide Web in the 1990s would redefine the notion of who is a journalist. CEJ is but one indication that Pavlik was right. Even though it was not explicitly stated by Pavlik, what he foresaw was a future where journalism would become more open minded, more solution oriented, and more innovative.
This article argues that the emergence of innovation journalism as a specific dimension of journalism is a vital part of this process. The concept was coined by David Nordfors in 2003 (Uskali et al., 2008), and this new term, again, opens up a different way of categorizing and thinking about the output of journalistic work. Instead of traditional labelling such as business journalism and political journalism, innovation journalism implies thinking and working across established disciplines.
Innovation journalism research pioneers claim that reporting on innovation processes in society at large has been ignored. A main explanation is that the media often:
… separates technology, business, politics and culture into discrete beats, while the process of innovation is about how technology, business, politics and culture drive each other and interact. (Uskali et al., 2008: 3)
Once again it is demonstrated how the labelling and categorizations that are made in the journalistic field serve as norms for news work. As pointed out by the innovation journalism initiative, the accountability dilemma for news media arises at the moment when established categorizations and norm sets no longer mirror societal processes of great importance. A main goal for advocates of innovation journalism, therefore, has been to bridge a gap. The void is sought to be filled by building a sustainable body of theoretical knowledge and practical skills in the field. These actions are justified by the fact that even though innovation is a main driver of economic growth it has been largely overlooked by established news media (Uskali et al., 2008).
During the first decade of the 21st century, reporters from several continents participated in the Innovation Journalism Program at Stanford University. This initiative was based on the idea that ‘if enough journalists write about innovation, their work should have an impact on society in the long run’ (Uskali et al., 2008: 19). A main aim of the program was to provide an opportunity for fellows to learn how to work horizontally instead of vertically when searching for beats. The participants explored innovation within a paradigm that stressed autonomous creation. The focus was on ways that science, technology, business, politics, and culture are nested, since these cross-disciplinary aspects are co-variables in innovation journalism. The Stanford model of teaching innovation journalism has later spread to several European countries, in particular Finland.
There are several striking similarities between the Stanford innovation journalism approach and the CEJ as identified in this article. First, there is the dimension of cross-disciplinarity and horizontality when establishing work teams and in targeting news beats.
Second, the focus on developing individual journalistic autonomy should be noted. Studies of skill development in journalism support the idea that journalistic creativity, autonomy and innovation, which are closely interlinked, develop faster when they are deliberately fostered. The study also illustrates that developing an innovative mindset takes time and concentrated effort. Empirical research among journalists indicates that individuals need to pass at least two stages of journalistic proficiency – novicing and conventionalizing – before they are able to contribute creatively and innovatively to a newsroom culture (Gynnild, 2007).
Uskali et al. point out that journalists do a better job when they cover issues they are really interested in: ‘Innovation is a mindset and a culture, where increased value comes from “something new” rather than “more of the same”’ (2008: 20). The authors compare reporting on innovation without having an innovative mindset with covering the IT industry using a typewriter. And yet they point out that due to the pace of online journalism, ‘journalists risk becoming minor players in the same unfolding drama they are attempting to critique’ (2008: 2).
The relevance of this statement might be disputed. It runs counter to the ongoing wave of CEJ. The joint revelation, retrieval, analysis, and presentation of beats from immense amounts of quantitative data leaked to WikiLeaks and further to established media institutions in the last few years suggest that CEJ is empowering news media to an extent that was formerly unknown.
At the same time, the ability to develop ideas and retrieve quantitative data, as well as to analyze and extract value from that data, is clearly still in short supply. The scarcity relates to a number of interactional, human aspects of handling huge datasets, for instance cross-disciplinary collaboration, the contesting of established status hierarchies, contested news priorities, and the integration of computational thinking.
Furthermore, the analysis of computational exploration in journalism indicates that in a first phase of innovation, the drive of particularly devoted individuals is of crucial importance for the development and growth of the field. They seem to work independently of institutional structures or systems. In the next phase, more systematized and focused investments by news media organizations in human resources are needed to further expand the data potential. A main argument that comes out of the analysis is that CEJ supports the development of an innovative mindset among news professionals, which in turn may lead to more focus on covering innovation processes within society at large.
Three propositions
I conclude this article with three propositions that derive from the study. The propositions are interlinked and are based on alignments between innovation journalism and computational exploration within journalism. Behind this choice of comparisons lies the idea that the ongoing CEJ exemplifies a particularly innovative arena in which the further development of an innovative mindset among news professionals is beneficial. The propositions can also be viewed as a future vision in which the human search for meaningful journalistic work is implied.
Proposition Number 1: The most critical factor for journalism’s impact on society in the future is not the accessibility of high-tech tools. It is the professional fostering of news professionals who are intrinsically motivated to explore, contest, and further develop meaningful journalistic approaches within the contexts in which they are operating.
Proposition Number 2: The second most critical factor for journalism’s impact on society in the future is not how much money can be saved through the development of high-tech tools. It is the development of news professionals’ ability to think abstractly and work collaboratively to solve problems – with the assistance of high-tech tools.
Proposition Number 3: The third most critical factor for journalism’s impact on society in the future is not an extended focus on one-time, often catastrophic events that cannot be undone. It is the news professionals’ ability to extract complex problems of societal importance from such events, provide adequate and innovative solutions to these problems, and move on.
Concluding remarks
In this article, I have argued that the impact of computational exploration in journalism is less dependent on technological creation than on news professionals’ values, goals, and interactional skills development. In an era of cultural chaos, in which information scarcity is turned into information surplus; opacity into transparency; exclusivity into accessibility; homogeneity into heterogeneity; hierarchy into network; passivity into (inter)activity; and dominance into competition (McNair, 2006), journalists are exposed to fundamental challenges of continuous change.
In processes of journalism innovation, heterogeneous and cross-disciplinary insights, experiences and knowledge are synthesized into new ways of seeing, understanding and presenting societal issues. Consequently, it might be argued that journalism innovation clears the way for innovation journalism. Thus, the critical issue for the expansion of both journalism innovation and innovation journalism is not primarily the business models or publishing visions of media companies. The critical issue is the facilitation and development of an innovation-oriented mindset among the people working professionally in the field. Fundamentally, it is about journalistic will and skill to put good ideas into action for the benefit of society.
Footnotes
Funding
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
