Abstract
This syndicate offers four recommendations to help educators adjust curricula to accommodate the rapid integration of data into journalism. First, instruction in numeracy and basic descriptive statistics must be required as either modules in existing courses or as separate offerings. Second, students should be taught to avoid mistakes in interpreting and writing about data in both reporting and visual classes. Third, ethics courses should discuss data as a transparency tool that poses distinctive dilemmas. Fourth, computational thinking, or how to dissect and solve problems like a computer does, can be incorporated into existing classes that teach logic.
Skills for working with numbers, large and small data sets, public records, and visualizations are essential in news organizations today because data provide journalists with the reporting tools to disentangle social complexity (Berret & Phillips, 2016a; Boyles & Meyer, 2017). Four in 10 journalists use data regularly to tell stories (Rogers et al., 2017). The rapid integration of data into journalism, both as a specialty and as a baseline expectation of everyday practitioners (Stalph & Borges-Rey, 2018), is a challenge for journalism education. The purpose of this article is to identify the data skills emerging journalists must learn and the approach educators should take in teaching them.
Data and Computational Journalism
Data journalism defies simplistic explanations (Lewis & Waters, 2018). Yet a strict definition may be less important than an appreciation for the broad scope of activities it encompasses and the fundamental skills required in numeracy and basic descriptive statistics. Numeracy is thinking critically about data to understand what numbers represent and omit (Maier, 2002). Essential descriptive statistics include averages, percentages, and ratios to make valid comparisons and ask informed questions (Nguyen & Lugo-Ocando, 2016).
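As an illustration of the descriptive statistics named above, the following minimal Python sketch computes an average, a percentage change, and a rate. All figures are hypothetical and were invented for this example, and the helper names `percent_change` and `rate_per_100k` are assumptions rather than part of any cited curriculum. It is meant only to show why a mean and a median can tell different stories and why raw counts need a common base before they are compared.

```python
from statistics import mean, median

# Hypothetical annual salaries in one city department (invented for illustration).
salaries = [38_000, 41_000, 43_000, 45_000, 47_000, 52_000, 210_000]


def percent_change(old: float, new: float) -> float:
    """Return the percentage change from an old value to a new value."""
    return (new - old) / old * 100


def rate_per_100k(count: int, population: int) -> float:
    """Return a count expressed as a rate per 100,000 residents."""
    return count * 100_000 / population


# One very high salary pulls the mean far above the typical (median) salary.
average_salary = mean(salaries)    # 68000
typical_salary = median(salaries)  # 45000

# A rise from 200 to 250 incidents is a 25% increase.
increase = percent_change(200, 250)

# Hypothetical town and city: the town has fewer incidents but a higher rate.
town_rate = rate_per_100k(50, 25_000)      # 200.0 per 100,000
city_rate = rate_per_100k(400, 1_000_000)  # 40.0 per 100,000

print(average_salary, typical_salary, increase, town_rate, city_rate)
```

In this invented case, reporting the "average" salary as 68,000 would describe no actual employee well, and ranking the two places by raw counts would reverse the comparison that the rates support.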
More specialized data journalism skills involve at least four topics: acquiring, cleaning, analyzing, and presenting data. Acquiring data includes knowing what public records exist and how to obtain them, as well as seeking numbers suitable for spreadsheets or databases. Cleaning data involves software and human intelligence to standardize spelling, punctuation, and representation to obtain accurate counts. Analyzing data may be accomplished with spreadsheets, programming languages, database management systems, or visual representations to find commonalities or outliers. Presenting data may involve charts, maps, or graphics to enable audiences to comprehend the analysis or personalize the data through interactives.
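The cleaning step described above can be sketched in a few lines of Python. The records below are hypothetical, and the `normalize` function is an assumed, deliberately simple rule set (lowercasing, removing punctuation, collapsing spaces), not a standard tool. Real data usually need additional human review, for example to decide whether two similar names refer to the same entity.

```python
import re
from collections import Counter

# Hypothetical contractor names as they might appear in a public-records export.
raw_names = [
    "Acme Corp.",
    "ACME CORP",
    " acme corp ",
    "Acme, Corp",
    "Beta LLC",
    "beta llc.",
]


def normalize(name: str) -> str:
    """Lowercase a name, remove punctuation, and collapse extra whitespace."""
    name = name.lower()
    name = re.sub(r"[^\w\s]", "", name)  # drop punctuation such as "." and ","
    name = re.sub(r"\s+", " ", name)     # collapse runs of whitespace
    return name.strip()


# Before cleaning, every spelling variant is counted as a separate entity.
raw_counts = Counter(raw_names)

# After cleaning, variants of the same name are counted together.
clean_counts = Counter(normalize(n) for n in raw_names)

print(len(raw_counts), dict(clean_counts))
```

Before cleaning, the six entries look like six different contractors; after cleaning, they resolve to two, with four and two records respectively.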
Like data journalism, computational journalism also lacks a common definition. One typology identifies computational journalism, data journalism, and computer-assisted reporting as three elements of quantitative journalism (Coddington, 2015). Those three elements can be seen as a continuum (Anderson, 2018). Others see computational journalism as a discrete field (Cohen et al., 2011). Educators should distinguish between computational and data journalism because automated journalism, bots, and machine learning algorithms typically found in computational journalism involve skills and tools different from using numbers to find or verify news stories in data journalism (Carlson, 2015; Diakopoulos, 2019).
Another definitional issue is whether the portion of computational journalism that involves coding or programming is integral to data journalism. One view holds that programming is inextricably linked with data acquisition, cleaning, analysis, and presentation, and that learning the language of the web is essential (Chimbel, 2015). Others contend that programming skills are useful but not necessary to practice data journalism: some practitioners rely on the R programming language while others find Excel sufficient. Given this disagreement, programming classes should be offered in journalism programs as electives.
Data Journalism Education
An examination of 219 data journalism modules and programs in 24 countries found most courses (55%) covered the four-part data journalism process: acquiring, cleaning, analyzing, and presenting data. However, few courses cover advanced data analysis. Almost half (48%) were graduate courses, most were in the United States, and only 25 worldwide had dedicated programs or degrees (Heravi, 2019).
Almost half of 113 accredited U.S. journalism programs evaluated had no data journalism courses, and most courses were at an introductory level. Only 18 of the programs offered more than two courses in which spreadsheets, statistical software, relational databases, or programming were used while 27 programs offered only one relevant course. In 69 programs, some data journalism was taught in reporting courses (Berret & Phillips, 2016b).
Statistical literacy is rarely taught, even though administrators say it is important (Dunwoody & Griffin, 2013). A census of 369 U.S. journalism programs found statistics was not required in 79% of the programs, and none offered its own statistics course (Martin, 2017). Similarly, numeracy has long been absent from journalism programs in England (Harrison, 2014).
Recommendations
Required: Numeracy and Basic Statistics
Emerging journalists need an understanding of numeracy and quantitative data to confidently interpret numbers and avoid errors. Journalists in a shrinking job market can no longer afford to let a fear of numbers restrict their career options. As student comfort with numbers varies across nations and socioeconomic groups, educators should tailor their numeracy offerings accordingly. Benchmarks can be obtained from an international audit of adult skills (Organisation for Economic Co-Operation and Development, 2016), and instruction can begin with averages, percentages, and ratios. Critical thinking about numbers involves knowledge about samples and populations, how data are defined and obtained, and an ability to detect biased or manipulated statistics.
Educators should consider whether numeracy and basic statistics are best inserted into an existing required course, such as reporting, or taught as stand-alone courses. If the latter, journalism educators should consider whether to outsource the teaching to other departments or create a journalism-specific approach as Martin (2017) recommended.
Required: Communicating Data
Communicating data accurately is as fundamental as precisely reporting a speaker’s words. Accurate communication of data can take place in text alone, such as by correctly conveying the uncertainty of polling results within a margin of error. It also means correctly interpreting visualizations and being alert to truncated axes that overemphasize differences or presentations that fail to account for population growth and inflation. Journalists do not have to know how to create visualizations, but they must know how to present information fairly.
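To make the polling example concrete, the sketch below applies the standard 95% margin-of-error formula for a proportion from a simple random sample, z * sqrt(p(1 - p) / n) with z = 1.96. The poll figures are hypothetical, and real polls often use weighting and design adjustments that widen the margin, so this should be read as a classroom approximation rather than a pollster's method. Checking whether two intervals overlap is likewise only a rough first test of whether a lead is meaningful.

```python
from math import sqrt

Z_95 = 1.96  # z-score for a 95% confidence level


def margin_of_error(p: float, n: int, z: float = Z_95) -> float:
    """Margin of error for a proportion p from a simple random sample of size n."""
    return z * sqrt(p * (1 - p) / n)


def interval(p: float, n: int) -> tuple[float, float]:
    """Return the (low, high) bounds of the interval around p."""
    moe = margin_of_error(p, n)
    return (p - moe, p + moe)


def intervals_overlap(a: tuple[float, float], b: tuple[float, float]) -> bool:
    """Return True if two (low, high) intervals share any values."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical poll of 1,000 respondents: Candidate A 48%, Candidate B 45%.
sample_size = 1_000
moe_a = margin_of_error(0.48, sample_size)
overlap = intervals_overlap(interval(0.48, sample_size), interval(0.45, sample_size))

print(f"Margin of error for A: +/- {moe_a * 100:.1f} points")
print(f"Intervals overlap: {overlap}")
```

With these invented numbers, the margin is roughly 3 percentage points and the two intervals overlap, so a headline declaring that Candidate A "leads" would overstate what the poll can show.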
Learning how to accurately communicate data can be incorporated into existing reporting and visual communication classes. Detecting misleading visuals can be built into numeracy or statistics modules.
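One exercise that fits such a module is adjusting a money figure for inflation and population growth before comparing years. The sketch below uses hypothetical spending, population, and price-index values, and the function name `real_per_capita` is an assumption made for this example. In practice the price index would come from an official statistical agency.

```python
def real_per_capita(nominal: float, population: int,
                    price_index: float, base_index: float) -> float:
    """Convert nominal spending to base-year money per resident."""
    real = nominal * (base_index / price_index)  # remove the effect of inflation
    return real / population                     # remove the effect of population growth


def percent_change(old: float, new: float) -> float:
    """Return the percentage change from an old value to a new value."""
    return (new - old) / old * 100


# Hypothetical city library budget in two years (all values invented).
spending_2010, population_2010, index_2010 = 10_000_000, 100_000, 100.0
spending_2020, population_2020, index_2020 = 13_000_000, 125_000, 120.0

nominal_change = percent_change(spending_2010, spending_2020)

per_capita_2010 = real_per_capita(spending_2010, population_2010, index_2010, index_2010)
per_capita_2020 = real_per_capita(spending_2020, population_2020, index_2020, index_2010)
adjusted_change = percent_change(per_capita_2010, per_capita_2020)

print(f"Nominal change: {nominal_change:.1f}%")
print(f"Real per-capita change: {adjusted_change:.1f}%")
```

In this invented case, a budget that appears to have grown by 30% has fallen by about 13% per resident once prices and population are taken into account, which is the kind of reversal a chart of raw totals would hide.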
Required: Data Ethics
The use of data in journalism creates distinct ethical challenges, starting with data transparency. Audiences may not expect online transcripts or audio files of interviews, but they do want the option to explore data files for themselves. Yet only 13% of stories made data downloads possible (Zamith, 2019). Second, journalists who scrape websites must be aware of both legal and ethical ramifications of compiling data that originators did not intend to be aggregated. Third, data stories can create their own ethical quandaries, such as when a U.S. news organization mapped locations of gun owners (Craig et al., 2017).
An existing course in ethics can include a discussion of data as a transparency tool as well as the dilemmas of reporting with data.
Encouraged: Computational Thinking
Sometimes overlooked in the debate over whether learning programming languages is essential to data journalism practice is the larger benefit brought by computational thinking. This is thinking with a computer in mind rather than thinking like a computer: learning how to disassemble a problem into steps and find reproducible solutions (Bradshaw, 2018). Emerging journalists should know how programming works without necessarily having to write code, and not just because it will equip them to work with specialists. Computational thinking enables better journalism by endowing “new ways of seeing, understanding and presenting societal issues” (Gynnild, 2013, p. 728).
Computational thinking can be incorporated into numeracy or statistics modules or plugged into courses that teach logic, like media law (paired with legal thinking) and ethics (paired with philosophical thinking).
Conclusion
The use of data to find, verify, and tell stories is becoming a mainstream journalism skill and thus must be incorporated into the curriculum. Journalism educators have a daunting task because the field attracts students who would rather avoid numbers, and the skills to be taught are evolving more rapidly than normal academic processes and faculty turnover can accommodate. Still, the steps outlined are achievable minimums. Numeracy and basic descriptive statistics are no longer optional; today they are as essential as spelling and grammar.
Footnotes
Authors’ Note
Data Journalism syndicate participants: Nouha Belaid, Central University of Tunisia, Tunisia; Martin Chorley, Cardiff University, UK; Andrea Czepek, Jade University, Germany; Kayt Davies, Edith Cowan University, Australia; Barry Finnegan, Griffith College, Ireland; Miao Guo, Ball State University, USA; Bahareh Heravi, University College Dublin, Ireland; Oleg Igoshin, South Ural State University, Russia; Guido Keel, Institute of Applied Media Studies, Switzerland; Sophie Knowles, Middlesex University, UK; Jennifer Leask, Langara College, Canada; Michael Lithgow, Athabasca University, Canada; Jack Lule, Lehigh University, USA; Scott Maier, University of Oregon, USA; Siobhan McHugh, University of Wollongong, Australia; Radu Meza, Babeș-Bolyai University, Romania; Adrienne Mong, NBC News, UK; Glyn Mottershead, City University of London, UK; John Price, Sunderland University, UK; Rayya Roumanos, Institut de Journalisme, France; Cindy Royal, Texas State University, USA; Magdalena Saldana, Universidad Católica, Chile; Daniel Thomas, Western Washington University, USA; Cheryl Vallender, Sheridan College, Canada.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
