Abstract
The rise of automated journalism—the algorithmically driven conversion of structured data into news stories—presents a range of potentialities and pitfalls for news organizations. Chief among the potential legal hazards is one issue that has yet to be explored in journalism studies: the possibility that algorithms could produce libelous news content. Although the scenario may seem far-fetched, a review of legal cases involving algorithms and libel suggests that news organizations must seriously consider legal liability as they develop and deploy newswriting bots. Drawing on the American libel law framework, we outline two key issues to consider: (a) the complicated matter of determining fault in a case of algorithm-based libel, and (b) the inability of news organizations to adopt defenses similar to those used by Google and other providers of algorithmic content. These concerns are discussed in light of broader trends of automation and artificial intelligence in the media and information environment.
Keywords
Introduction
In August 2016, Facebook fired the team who had been curating its “Trending Topics” section and replaced them with an algorithm that would automatically recognize and promote popular topics. The move came after months of controversy about Facebook’s handling of Trending Topics, including charges of political bias on the part of the editors managing the feature. About 2 days later, however, “Facebook’s foray into automated news went from messy to disastrous” (Oremus, 2016). For at least 8 hours on August 28, Trending Topics highlighted a popular—but erroneous—article claiming that then-Fox News anchor Megyn Kelly had been fired from the cable TV network because she had endorsed Hillary Clinton for U.S. president. Although Facebook still employed a few human editors who were supposed to watch for algorithmically surfaced hoaxes and fake news, they failed to recognize that the source of this particular article (the website endingthefed.com) was neither a reputable mainstream outlet nor a popular conservative site. Instead, “the Trending review team accepted it thinking it was a real-world topic” (Gunaratna, 2016). In the aftermath, the question of responsibility arose: “Even if algorithms are now running the show, is Facebook legally responsible for what happened . . . ?” (R. Meyer, 2016). In other words, had Facebook’s Trending Topics algorithm defamed Megyn Kelly? Or, more generally, can a bot commit libel?
Such questions figure into a broader set of concerns about the “social power of algorithms” (Beer, 2017), or the increasing and complicated role that algorithmic processes play in structuring our social world, in connection with big data, automation, and the overall quantification of everyday life (Carr, 2015; Crawford, Miltner, & Gray, 2014; Howard, 2015; O’Neil, 2016; Ziewitz, 2016). Drawn more narrowly in the context of big data for media work and journalism (Lewis & Westlund, 2015b), the question of algorithmic libel, as we demonstrate in this article, is one that news organizations must consider as they increasingly develop automated forms of news production—that is, machine-driven modes of news creation and publication based on algorithmic procedures. This is particularly true as news companies deploy algorithms that transform structured data (e.g., financial earnings reports) into narrative news texts, with little to no human intervention (for an overview, see Graefe, 2016).
Although a growing body of research has begun to examine automated journalism (e.g., Carlson, 2015; Clerwall, 2014; Haim & Graefe, 2017; Jung, Song, Kim, Im, & Oh, 2017; Linden, 2017; Thurman, Dörr, & Kunert, 2017) and related matters of human–machine interactions in journalism (e.g., Bucher, 2017; Lewis & Westlund, 2015a; Lokot & Diakopoulos, 2015), questions of legal liability have remained almost entirely unexamined. The possibility that such “robot-written” texts could libel individuals or institutions—however futuristic and implausible such a scenario may seem—merits scholarly legal study. Such a perspective can strengthen media law scholarship by synthesizing cases on emerging issues of algorithm-oriented accountability (Diakopoulos, 2015), while also providing journalism scholars and practitioners with a better understanding of the potential pitfalls of automated news (Kirley, 2016; Ombelet, Kuczerawy, & Valcke, 2016; Weeks, 2014). This study addresses such matters primarily in the context of First Amendment law, in part because the U.S. Constitution provides robust protection for defendants accused of libel. As a result, if American courts were to hold media organizations liable for defamatory content produced by a news algorithm, it would likely follow that these organizations could face even greater liability in other countries where protections for freedom of expression are weaker. The article proceeds by outlining algorithms and automated journalism, the application of First Amendment principles to changing technologies (with a particular focus on algorithms), significant findings from our review of legal cases, and, finally, an agenda for future research.
Algorithms and Automated Journalism
Algorithms are often misunderstood: to software developers, they are simple sets of operations, programmatic rules for computers to follow; to software users, however, they may represent incomprehensible processes, black boxes of inscrutability (Gillespie, 2016). Amid the proliferation of digital media technologies and automated decision-making systems, algorithms increasingly influence what we see, hear, and experience via the likes of Google, Amazon, and Netflix (Gillespie, 2014; Hallinan & Striphas, 2016; Van Dijck, 2013); how we perceive public discourse (Braun & Gillespie, 2011) and engage in matters of public interest (Napoli, 2015); and how some determinations are made regarding job applications, car loans, and parole sentencing (O’Neil, 2016). As Pasquale (2015) put it, we live in a “black box society” influenced by “enigmatic technologies”—enigmatic precisely because the decisions underlying such systems are guided by value sets and prerogatives that are often opaque (p. 1).
Amid the pursuit of “big-data solutions” in many sectors of society, there is a chorus of concern about the implications of big data for matters of privacy, surveillance, ethics, bias, manipulation, and power imbalances in a world dominated by algorithms and automation (Boyd & Crawford, 2012; Carter & Lee, 2016; Crawford et al., 2014; Stewart & Littau, 2016; Stoycheff, 2016). Thus, big data, algorithms, and related phenomena are of great importance for the fundamental nature of digital information: how it is captured, configured, and ultimately made visible in society.
A key aspect in this process is the rise of automated forms of journalism, such as stories produced not by human authors but written by machines. In journalism today, algorithms are responsible for producing tens of thousands of stories that fill online news sites, even from leading news organizations such as the Associated Press (Dörr, 2016; LeCompte, 2015). At the moment, these news articles are restricted mostly to topics that have well-structured datasets associated with them, such as sports results and quarterly financial earnings reports—data that can be fed into an algorithm and transformed into a narrative (Graefe, 2016; Marconi, Siegman, & Machine Journalist, 2017).
Such automated journalism is defined as “algorithmic processes that convert data into narrative news texts with limited to no human intervention beyond the initial programming” (Carlson, 2015, p. 416). What began a few years ago as small-scale experiments in machine-written news has, amid the development of big data broadly, become a global phenomenon, with technology providers from the United States to Germany to China developing algorithms to deliver automated news in multiple languages (Dörr, 2016). Moreover, beyond merely producing routine sports stories, automation is being used by The New York Times, Forbes, and other news companies to complement human reporting—for instance, by suggesting story ideas based on trends in data, or by offering new forms of personalized news to suit different audiences. As one trade-press report summed it up, “Automation is taking off, in large part because of the growing volume of data available to newsrooms, including data about the areas they cover and the audiences they serve” (LeCompte, 2015, p. 3).
The deployment of automated processes at select news organizations also has significantly increased the output of news content. For example, the Associated Press (AP) reported that its collaboration with Automated Insights, which creates automated-writing algorithms, and the financial data company Zacks Investment Research has allowed the AP to move from producing 300 financial earnings stories per financial quarter to more than 3,000 stories per quarter (Madigan White, 2015). The AP has since expanded its automated journalism to include coverage of minor-league baseball, and it aims to use machine learning to automatically translate print stories into broadcast ones (Lichterman, 2016). Such impressive increases will only encourage other news organizations to turn to algorithms for content production, amid a broader turn toward exploring what artificial intelligence could mean for developing “augmented journalism” in the next decade (Marconi & Siegman, 2017; Marconi et al., 2017). As a result, legal teams must consider how laws regulating the press should handle this new technology, especially as automated journalism becomes adopted more widely, whether as supplemental processes to human reporting or in permitting algorithms to produce content independently.
Algorithms and First Amendment Protections
Broadly speaking, developing technologies pose challenges to the traditional understanding of media law and the U.S. Supreme Court’s interpretation of the First Amendment (Anderson, 2002). As advancing media technologies become widely adopted, the U.S. Supreme Court is confronted with questions about the scope of First Amendment protection that should apply to the new technologies (Wu, 2013). The Supreme Court has generally extended First Amendment protection to content produced using emerging technologies, including films (Joseph Burstyn, Inc. v. Wilson, 1952), radio broadcasts (Red Lion Broadcasting Co. v. FCC, 1969), Internet sites (Reno v. ACLU, 1997), and video games (Brown v. Entertainment Merchants Association, 2011), but typically does so only after the new technology has been adopted broadly within American society.
Scholars have begun to grapple with questions about whether algorithms, and the content they produce, should qualify for First Amendment protection. Bracha and Pasquale (2008) suggested that algorithms themselves should not receive broad First Amendment protection. Rather than producing speech, algorithms are merely designed to facilitate the protected speech of others. As a result, they argued that algorithms producing content are more akin to common carriers, such as telephone wires carrying phone calls, whose output can be regulated, than to more traditional media outlets, such as newspapers. Thus, the First Amendment would not bar government regulation of algorithm output unless such regulations would limit the speech that the algorithm is helping to facilitate.
Wu (2013) took a similar approach in his views of how the First Amendment should apply to content produced by algorithms, noting that, over time, courts have adopted a “de facto functionality doctrine” that can apply to algorithm-produced content. Under the doctrine, the First Amendment would not provide protection to “communication tools,” such as some types of algorithms, that “primarily facilitate the communications of another person, or perform some task for the user” (p. 1498). Alternatively, he argued that “speech products,” which are technologies using algorithm output, such as “blog posts, tweets, video games, newspapers, and so on,” that are “viewed as vessels for the ideas of a speaker, or whose content has been consciously curated” (p. 1498), should qualify for First Amendment protection.
Bambauer (2014) has suggested that the Supreme Court’s decisions in a variety of First Amendment cases regarding privacy, commercial speech, and copyright involving the dissemination of data indicate that First Amendment protection extends to useful knowledge and information. Therefore, she argued that most algorithm-produced content would likely be protected under a “thinker-centered First Amendment” that assumes a right to learn new things (p. 77). Benjamin (2013) substantially agrees, noting that the Supreme Court’s jurisprudence that has created broadening interpretations of First Amendment protection would likely consider algorithm-produced content as speech worthy of protection.
In the most expansive view, Volokh and Falk (2012) argued that search results—the algorithmic output of search engines such as Google and similar online search providers—clearly deserve First Amendment protection. The scholars likened an Internet search firm’s algorithm-produced content to the editorial-produced content of more traditional media organizations, such as newspapers, because human actors made specific decisions about how to write an algorithm so that it would publish some types of content while excluding other types. According to Volokh and Falk (2012),
[Search engine companies] exercise editorial judgment about what constitutes useful information and convey that information—which is to say, they speak to their users. In this respect, they are analogous to newspapers and book publishers that convey a wide range of information from news stories and selected columns by outside contributors to stock listings, movie listings, bestsellers lists, and restaurant guides. (p. 899)
As a result, the scholars suggested the First Amendment bars the government from dictating or regulating the types of content that search engine algorithms produce.
Taken together, these arguments indicate that no clear consensus has developed about whether algorithm-produced content deserves First Amendment protection. However, as Blackman (2014) noted, “in certain cases it will soon become more difficult to unbundle technology’s speech rights from our own” (p. 36) as algorithm-produced content becomes more prevalent in everyday life. Automated journalism clearly complicates the evaluation of First Amendment protection for algorithm-generated content. To help unpack these challenges, we examine the various constitutional problems likely to arise in one specific legal context: when automated journalism results in the publication of libelous content.
Robots Off the Rails: Automated Journalism and Libel
News organizations’ use of automated journalism poses a legal quagmire for both the news organizations and American courts when it comes to libel. The publication of a false statement of fact about someone that injures his or her reputation in the eyes of the community or deters others from associating with him or her constitutes libel (Restatement [Second] of Torts, 1977, §559, §568). Although the First Amendment does not protect libelous statements in some instances, American news organizations have benefited from increasing protections against libel judgments during the past 50 years. 1 Although libel was primarily governed by state law until 1964, the U.S. Supreme Court ruled in New York Times v. Sullivan (1964) that the First Amendment provided some constraints on libel actions. Writing for the majority in Sullivan, Justice Brennan noted, “that erroneous statement is inevitable in free debate, and that it must be protected if the freedoms of expression are to have the breathing space that they need . . . to survive” (pp. 271-272).
Ultimately, the Supreme Court’s decision to constitutionalize libel law created a complicated patchwork of state common law, state statutory law, and federal constitutional law to regulate false speech that harms individuals’ reputations. Nonetheless, American law provides extraordinary protection for libelous speech, establishing the United States as one of the most speech-protective countries in the world when it comes to defamation. In nearly all cases, a plaintiff must demonstrate that the defamatory statement identified him or her and that the defendant published that statement with some level of knowledge as to its falsity, 2 resulting in injury to the plaintiff’s reputation. 3 Few clear precedents suggest how constitutional protections would apply to defendants employing automated journalism if a plaintiff asserted a libel claim on the basis of words produced by algorithms, rather than by human journalists. Two major problem areas emerge in relation to automated journalism and libel law: (a) the constitutional requirements that plaintiffs must prove that a media defendant is at fault for libelous statements, and (b) news organizations’ inability to use defenses against libel suits that other organizations publishing algorithm-generated content, such as Google, have been able to use.
Determining Algorithms’ Fault
One of the five elements that plaintiffs must prove to win a libel suit in the United States is that the defendant was at fault—in other words, that the defendant was responsible for the harm to the plaintiff’s reputation. The level of fault that a plaintiff must prove depends on how the courts categorize the plaintiff. The U.S. Supreme Court ruled in Sullivan that public officials could not recover damages for a false statement about their official conduct unless they first proved that the defendant had acted with “actual malice,” meaning that the defendant had knowledge of the statement’s falsity or acted with reckless disregard for the truth (New York Times v. Sullivan, 1964, p. 280). In Curtis Publishing Co. v. Butts (1967), the U.S. Supreme Court extended the “actual malice” standard to statements about plaintiffs who are considered public figures. An additional degree of protection exists for statements about public plaintiffs; the plaintiff must prove “actual malice” with “convincing clarity” (New York Times v. Sullivan, 1964, pp. 285-286). This is a higher burden of proof than the typical “preponderance of the evidence” standard used in civil cases, and the difficulty of meeting this burden further insulates defendants who discuss public officials and public figures. 4
Actual malice takes shape after Sullivan
Throughout the past five decades, the U.S. Supreme Court has continued to refine the meaning of actual malice. As it stands, actual malice presents a significant hurdle for public officials and public figures to overcome to prevail in a libel suit. Subsequent decisions have included clear articulations that Sullivan applies equally to both criminal and civil libel actions. In Garrison v. Louisiana (1964), the Court noted that “only those false statements made with the high degree of awareness of their probable falsity demanded by New York Times may be subject to either civil or criminal sanctions” (p. 74). For many plaintiffs, proof of a defendant’s awareness of the statement’s falsity is often impossible. Even prevailing on the theory that the defendant showed reckless disregard for the truth can be difficult. For example, the U.S. Supreme Court noted in St. Amant v. Thompson (1968) that merely repeating false statements without actually verifying whether such statements were false or failing to check the source’s credibility was not enough to constitute “reckless disregard for the truth” on the part of the defendant.
Subsequently, in Pape v. Illinois (1971), the Court ruled that the omission of the word “alleged” from a Time magazine report on a government agency’s investigation of police brutality did not amount to actual malice in regard to a Chicago police detective, who claimed the article defamed him.
Applying this standard to Time’s interpretation of the Commission Report, it can hardly be said that Time acted in reckless disregard of the truth. Given the ambiguities of the Commission Report as a whole, and the testimony of the Time author and researcher, Time’s conduct reflected at most an error of judgment. (Pape v. Illinois, 1971, p. 292)
Although the Court’s opinion noted the decision was heavily dependent on the facts of the case, those facts are similar to ones that might present themselves in an automated journalism case. In Pape, the magazine’s reporters and researchers had relied on one volume (“Justice”) of a 1961 report by the U.S. Commission on Civil Rights in an article addressing police brutality. “The Time article went on to quote at length from the summary of the Monroe complaint, without indicating in any way that the charges were those made by Monroe rather than independent findings of the Commission.” The Court went on to describe the report and its contents as “anything but straightforward” and “extravagantly ambiguous” (pp. 286-287). In ruling that the reporters’ conduct did not rise to the level of actual malice, the Court argued that
Time’s omission of the word “alleged” amounted to the adoption of one of a number of possible rational interpretations of a document that bristled with ambiguities. The deliberate choice of such an interpretation, though arguably reflecting a misconception, was not enough to create a jury issue of “malice” under New York Times. To permit the malice issue to go to the jury because of the omission of a word like “alleged,” despite the context of that word in the Commission Report and the external evidence of the Report’s overall meaning, would be to impose a much stricter standard of liability on errors of interpretation or judgment than on errors of historic fact. (Pape v. Illinois, 1971, p. 290)
Cases involving interpretations by algorithms might lead to similar outcomes where datasets and other publicly available information are less than clear.
More recently, the Court held in Masson v. New Yorker Magazine, Inc. (1991) that “a deliberate alteration of the words uttered by a plaintiff does not equate with knowledge of falsity for purposes of [Sullivan and Gertz] unless the alteration results in a material change in the meaning conveyed by the statement” (p. 517). There, a psychoanalyst alleged that a freelance reporter for The New Yorker had defamed him by falsely attributing direct quotations to him. The magazine article, and a subsequent republication as a book, contained six passages that he believed portrayed him negatively and injured his reputation. Despite recording more than 40 hours of interviews with Mr. Masson, the writer claimed the passages were from additional interviews that had not been recorded either because of the situation or because the writer’s equipment was inoperable. 5 Among the quotes in dispute were statements that his coworkers viewed him as an “intellectual gigolo” whereas the recording contains the words “was much too junior within the hierarchy of analysis for these important . . . analysts to be caught dead with [him].”
Masson presents a fairly egregious fact pattern among U.S. Supreme Court libel cases. When staff members of The New Yorker engaged in fact-checking the article about Masson, he claimed that he alerted them to numerous errors and asked to review portions of the article that quoted him or attributed information to him. However, no such review ever occurred. Although the Court reversed a Ninth Circuit ruling in favor of The New Yorker, its opinion provided wide latitude for media defendants, even with regard to the altering of quotations. The Court concluded that deliberate alteration of quotations—something that flies in the face of accepted journalistic practice—does not automatically support a finding of actual malice. Instead, the Court noted,
Deliberate or reckless falsification that comprises actual malice turns upon words and punctuation only because words and punctuation express meaning. Meaning is the life of language. And, for the reasons we have given, quotations may be a devastating instrument for conveying false meaning. In the case under consideration, readers of In the Freud Archives may have found Malcolm’s portrait of petitioner especially damning because so much of it appeared to be a self-portrait, told by petitioner in his own words. And if the alterations of petitioner’s words gave a different meaning to the statements, bearing upon their defamatory character, then the device of quotations might well be critical in finding the words actionable. (Masson v. New Yorker Magazine, Inc., 1991, pp. 517-518)
In another case exploring actual malice, the Court’s decision in Harte-Hanks Communications, Inc. v. Connaughton (1989) suggests that significant factual findings must be made with regard to the defendant’s actions. In that case, the Court upheld a lower court decision that a newspaper had acted with “actual malice” when it claimed a judicial candidate had been using deceptive tactics to criticize an opponent. The candidate was able to demonstrate the newspaper had made several missteps that constituted “actual malice.” The newspaper relied heavily on a source whose credibility was seriously questioned by the newspaper’s other sources, the newspaper refused to listen to a tape the plaintiff had provided that disputed several claims, the newspaper published editorials clearly indicating that it would publish the dubious claims against the candidate regardless of conflicting developments, and witness testimony suggested the newspaper failed to conduct a full investigation into its source’s claims, which indicated the news organization deliberately avoided the truth (Harte-Hanks Communications, Inc. v. Connaughton, 1989). The plaintiff’s efforts to show that the newspaper acted with “actual malice” in Connaughton were no small feat and relied on a patchwork of circumstantial evidence to support the court’s findings of fact.
Taken together, Sullivan, Butts, Garrison, St. Amant, Pape, Masson, and Connaughton demonstrate that public officials and public figures must overcome significant hurdles to prevail in a libel suit. Because public persons are often the subjects of news coverage and, as a result, face a greater likelihood of being the subject of algorithm-produced content, these protections have important implications for automated journalism. 6 Even if the content an algorithm produces is false, a public figure plaintiff would likely struggle to prove actual malice under the Court’s current standard because algorithms, in and of themselves, do not engage in the kind of subjective decision-making processes present in St. Amant, Masson, or Connaughton. Rather, the algorithms merely produce content as a result of specific programming efforts. As a result, public figure plaintiffs would likely struggle to show that an algorithm had acted with knowledge of falsity or reckless disregard for the truth (see Harte-Hanks Communications, Inc. v. Connaughton, 1989).
Algorithms’ processing and distribution of information do not even rise to the types of actions that were in question in St. Amant and Masson, as algorithms have little ability to verify statements or alter the meaning of quotations unless specifically programmed to do so. Algorithm programming comes with its own subjective judgments (Gillespie, 2014), which raises questions about whether programmers could be held liable for libelous statements that their algorithms produce. 7 It stands to reason that the courts would allow a public plaintiff to prove knowledge of falsity or reckless disregard on the part of either the news organization or the creator of the algorithm, but in most instances plaintiffs would likely run into the same hurdles they face proving an editor or reporter acted with actual malice. In such a lawsuit, plaintiffs would have to prove the human programmers had a “high degree of awareness” of false statements rather than interrogating the awareness of an algorithm. To do so, the plaintiff would need to show that the programmer knew, or should have known, that the algorithm would produce false statements that would be harmful to an individual’s reputation. Such a showing could occur if an algorithm were intentionally programmed to develop and produce false content.
In the alternative, a plaintiff may be able to succeed in cases where the news organization relied on an algorithm to produce content without any additional editorial oversight prior to publication. To do so, the plaintiff’s attorney could introduce evidence that the editor or publisher knew, or should have known, that the algorithm was not foolproof but published the content without editorial review. In reality, most competent news organizations would be able to avoid such scenarios simply by following traditional journalistic norms and practices, which include editorial review of content prior to publication. However, even in instances where a news organization’s editorial staff failed to catch defamatory content produced through automated journalism prior to publication, most courts would be reluctant to find actual malice.
A brief modification of the Megyn Kelly example from the introduction illustrates how such a scenario might play out in court. Assume that instead of Facebook, an algorithm at a major news organization came across the story of her firing, crafted a short news alert and published it to the organization’s website without editorial review. As a public figure, Kelly would be required to prove actual malice. Assuming the algorithm’s programmer had not nefariously programmed the algorithm to falsify content, Kelly would need to prove actual malice on the part of the news organization. Her attorney might do this through examination of the editorial team, inquiring about whether they knew the algorithm could produce false content and if they subjected automated content to editorial oversight prior to publication.
Short of proving actual malice on the part of the programmer or news organization, public plaintiffs would likely struggle to recover in libel-by-algorithm cases unless the courts were willing to create a new standard of liability to address such instances. Given the courts’ steadfast adherence to the actual malice standard for more than 50 years, this seems highly unlikely. For instance, although the Internet has allowed the rapid spread of libelous speech against professionals and other public plaintiffs, the Supreme Court has not tamped down on libelous speech. Influential media attorney Floyd Abrams has lauded the Roberts Court for its decisions in speech cases. In an interview, he recently noted, “Taken as a whole it has rendered First Amendment-protective decisions in an extraordinarily broad range of cases, and it deserves great credit for doing so” (Chapman, 2017, para. 6). In the same column, published in July 2017 in the Chicago Tribune, law professor Geoffrey Stone is quoted as saying, “The Roberts Court has given more protection to free speech across a larger range of areas than any of its predecessors have—although sometimes unwisely” (Chapman, 2017, para. 7). With this in mind, it seems unlikely the Court would begin to unravel the protections of Sullivan based on the adoption of automated journalism.
Negligence: The level of fault for the rest of us
As discussed, the actual malice standard provides a high level of protection for news organizations engaged in publishing algorithm-generated content about public officials and public figures. Of much greater concern to news organizations should be their legal liability in cases brought by private plaintiffs who allege algorithm-produced content has harmed their reputations. In Gertz v. Robert Welch, Inc. (1974), the U.S. Supreme Court drew distinctions between public officials/public figures and private plaintiffs, holding that the latter were not necessarily required to prove “actual malice” to recover damages. The Court determined that states retained the ability to establish the level of fault that private plaintiffs must prove in libel suits, so long as the states did not permit recovery on a theory of “liability without fault” (Gertz v. Robert Welch, Inc., 1974, p. 347). In other words, the Court allowed states to determine whether private plaintiffs could prove defendants acted with a lower standard of fault, such as negligence, to recover monetary awards for injuries suffered as a result of false factual statements. More than 40 jurisdictions permit private plaintiffs to recover damages in a libel suit by proving a lower standard of fault, such as negligence (e.g., Jadwin v. Minneapolis Star & Tribune Co., 1985). The Gertz Court carved out one caveat, though, holding that all plaintiffs—public and private—must prove “actual malice” to recover punitive damages, which are intended to punish defendants for their egregious behavior and deter others from taking similar actions (Gertz v. Robert Welch, Inc., 1974, p. 349).
Plaintiffs can prove negligence by showing the defendant did not act with reasonable care, resulting in the publication of a libelous statement (Diamond, 1996; Sack, 2012, § 6:2.1). As negligence is often judged based on professional standards, this fault standard could present problems for news organizations because it does not require a subjective examination of decision-making processes made prior to publication. The negligence standard allows juries to hold news organizations liable for making a mistake that falls outside the bounds of reasonable professional standards (Diamond, 1996; Franklin, 1984).
About 2 years after the Court decided Gertz, it once again confronted the need to evaluate whether the conduct of a media organization constituted negligence. This time, it took a hard line in favor of protecting the interests of private plaintiffs. In Time, Inc. v. Firestone (1976), Justice Rehnquist squarely rejected Time magazine’s assertion that an organization should never face liability for negligent errors in reporting the contents of judicial proceedings. “As to inaccurate and defamatory reports of facts, matters deserving no First Amendment protection, we think Gertz provides an adequate safeguard for the constitutionally protected interests of the press and affords it a tolerable margin for error by requiring some type of fault” (Time, Inc. v. Firestone, 1976, p. 457). In similar fashion, it logically follows that the Court would likely not unconditionally excuse news organizations from negligent errors in algorithmically generated content. Therefore, news organizations could face the threat of large monetary judgments when they fail to “exercise reasonable care” in publishing automated journalism content. In the case of automated journalism, negligence could include failing to properly clean input data or fact-check the algorithms’ output when automated journalism produces and publishes content that identifies private individuals.
Publisher Liability Versus Distributor Liability: Algorithms and Section 230
As noted earlier, publication is an element of the libel tort, and some companies using algorithms have been able to use this to their advantage by claiming they were distributors who should be immune from libel lawsuits. Recall that to prove publication, a plaintiff must show that the defendant communicated a libelous statement to a third party (Restatement [Second] of Torts, 1977, § 577). Under the related doctrine of republication, anyone who repeats or republishes the false statements can also be held liable (Restatement [Second] of Torts, 1977, § 578). However, the law draws a strong distinction between publishers and distributors, and a number of cases and statutes have outlined the parameters of liability for defendants who do not exercise editorial control over the content they distribute (e.g., Cubby v. CompuServe Inc., 1991; Stratton Oakmont v. Prodigy, 1995; Zeran v. America Online, Inc., 1997). As a result, a party who simply transmits or delivers libelous statements is subject to liability only in situations in which it has reason to know that the information is false (Restatement [Second] of Torts, 1977, § 581).
Prior to the ubiquity of Internet communication, the distinction between publisher and distributor immunized newsstands and libraries from being held liable for the sale or loan of materials that contained libelous content. As speech on the Internet proliferated, American courts disagreed on the role of early Internet Service Providers such as Prodigy and CompuServe. In the wake of two conflicting court decisions, Congress passed Section 230 of the Communications Decency Act (2016), which clearly applied the offline distinction between publisher and distributor to the Internet.
Under Section 230, providers of an “interactive computer service,” such as website operators, cannot be held liable for information that third parties post on their websites (Communications Decency Act, 2016). As a result, Section 230 has provided broad protections for website operators who face libel claims (see Zeran v. America Online, Inc., 1997; Jones v. Dirty World Entertainment Recordings LLC, 2014). For example, Google has successfully used Section 230 in the past to immunize itself from defamation suits stemming from algorithm-generated content. In 2007, the Third Circuit affirmed a 2006 district court decision holding that Google was an interactive computer service that qualified for Section 230 immunity, and that it should not be held responsible even though its algorithm produced search results based on websites linking to false information about individuals (Parker v. Google, Inc., 2006).
Although Section 230 provides a strong shield for interactive service providers, news organizations are not likely to succeed in a claim of immunity for algorithm-produced content appearing on their websites because the organizations are more akin to “information content providers.” Information content providers are the entities who “creat[e] or develop[] content provided through the Internet” (Communications Decency Act, 2016). Under Section 230, such providers do not qualify for statutory immunity from libel suits involving online content. As Weeks (2014) noted, courts are likely to view any automated journalism content published online as though it were produced through a traditional editorial process, meaning that Section 230 immunity would not apply. Thus, courts would find that news organizations using algorithms still exercised the traditional editorial control associated with publishers.
Returning to our Megyn Kelly example, Facebook would certainly qualify for Section 230 immunity if someone were to post false content about Kelly being fired. Similarly, based on the Third Circuit ruling involving Google, Facebook would likely qualify for Section 230 immunity even if its algorithm were responsible for the distribution of false content through Trending Topics or other curated posts. However, our news organization—being an information content provider rather than an interactive service provider—would not qualify for Section 230 protection in most instances where its algorithm publishes false content.
Weeks suggested that news organizations may qualify for Section 230 immunity as interactive service providers in rare situations in which Internet users can input data into an algorithm to produce libelous content that is published on the news organization’s website. He acknowledged that such situations would require courts to read Congress’s intent for Section 230 to provide a “marketplace of information” online very broadly to grant the news organization immunity (Weeks, 2014, p. 85). Even if a broad reading were possible, courts appear to be reluctant to grant Section 230 immunity to website operators who may have knowledge that their site is being used to conduct illegal activity (Doe No. 14 v. Internet Brands, Inc., 2014; Fair Housing Coun. of San Fernando Valley v. Roommates.com, LLC, 2008). As a result, Section 230 immunity does not seem to be an effective avenue for news organizations seeking to defend libelous online content produced by automated journalism. News organizations would need other defenses to combat libel lawsuits over algorithm-produced content.
American courts have not yet had to decide lawsuits alleging that a news organization’s automated journalism has libeled an individual. But technology companies have had to defend themselves against similar claims. For example, Google has been sued in the European Union in lawsuits alleging that the search giant was a publisher of libelous information based on its auto-complete algorithm function for user searches (see Diakopoulos, 2013; Ghatnekar, 2013). These cases, nearly all of which have been brought in foreign jurisdictions, have involved an individual or organization suing Google because a search of the plaintiff’s name included auto-complete suggestions that were incorrect and had negative connotations, such as “scam” (Cushing, 2011) or “con man” (D. Meyer, 2011). In 2012, Google faced a similar lawsuit in California when Australian doctor Guy Hingston sued for defamation in federal district court (Kearn, 2012). Hingston alleged that upon entering his name in a Google search, the auto-complete algorithm added “bankrupt.” The doctor maintained in his complaint that he was not bankrupt and that Google had failed to address his complaint that its auto-complete feature suggested that he was. Hingston argued that Google should be held liable for harming his reputation when it refused to make corrections to its auto-complete function (Hingston v. Google, 2012). However, the case was never decided because the doctor withdrew the lawsuit 3 months later (Grubb, 2013).
Pasquale noted that as Google has faced lawsuits related to content harming individuals’ and organizations’ reputations, it has maintained two different positions on its search algorithm’s role in providing information to avoid liability for libel (Pasquale, 2015). In some instances, Google has defended itself as a publisher of information, claiming that its algorithm’s output is protected First Amendment speech (Volokh & Falk, 2012). In other situations, Google has claimed its search function is a conduit for information, meaning that it cannot be held liable because it merely distributes libelous information that is published by others, but it does not have personal knowledge that the information is false (Pasquale, 2015).
News organizations using automated journalism are unlikely to be able to successfully rely on the legal positions that Google has staked out to avoid liability for algorithm-produced content. The element of publication is rarely at issue in libel suits involving media defendants because such organizations’ role in society is to publish information for consumption by third parties. Thus, a news organization employing algorithms to produce content could rarely, if ever, claim a position similar to Google’s—that it was merely acting as distributor of others’ information. In reality, news organizations must rely on First Amendment arguments—similar to some of those made by Google in certain cases—to defend themselves against libel lawsuits involving algorithm-produced content (Volokh & Falk, 2012).
Discussion
Although still in an early phase of development, automated journalism will grow substantially in the coming years as news organizations increasingly explore opportunities with big (structured) data and simultaneously seek to cut labor costs and expand content coverage by means of automation (Lewis & Westlund, 2015b). Algorithms for automated news content present both great potential (for producing news faster, at scale, with fewer errors) and substantial concerns, be they technical (e.g., data quality and writing quality), normative (e.g., the ethics encoded in algorithms), or political-economic (e.g., the labor dynamics of human displacement; for example, see discussion in Ananny, 2016; Dörr & Hollnbuchner, 2016; Linden, 2017). In some instances already, algorithms are producing news content that is hardly distinguishable from that written by human journalists (Clerwall, 2014; Graefe, Haim, Haarmann, & Brosius, 2016; O’Connor, 2016). The algorithms producing news content will become only more sophisticated and autonomously oriented in the future. Such a development may challenge fundamental notions about what “communication” means in an era of human–machine communication (Guzman, 2016; Jones, 2014), or what it means to be a (human) journalist in a moment of growing dependence on machines (Carlson, 2017; Lewis & Westlund, 2016), as well as raise more practical issues of determining authorship credit in automated journalism (Montal & Reich, 2016). In all, the news industry will likely encounter challenges that other industries have faced as software has assumed job roles previously undertaken by humans (Brynjolfsson & McAfee, 2016; Ford, 2016).
More broadly, technology companies are developing more advanced artificial intelligence projects that can create and communicate messages (Metz, 2016), developing forms of automated public communication that, despite occasional missteps (Neff & Nagy, 2016), point to a future where automated journalism is hardly the purview of news organizations alone. This makes the potential for bot libel a salient concern across many fields.
News organizations’ deployment of automated journalism will force courts to reconsider traditional understandings of First Amendment protection for speech, particularly in relation to libel. In automated journalism libel suits, public officials and public figures may find it nearly impossible to prove “actual malice,” the level of fault that public plaintiffs must prove before recovering damages for libel, because of the strict standards that the U.S. Supreme Court has imposed. The actual malice standard requires plaintiffs to show the defendant knew that published statements were false or acted with reckless disregard for whether they were false, and algorithms do not make subjective judgments independent of their programmers’ decisions. In many jurisdictions, however, news organizations should be concerned about liability for libelous automated journalism content affecting private plaintiffs, who can recover by proving negligence on the part of the news organization.
Legitimate concern exists about whether the negligence standard outlined in Gertz will hamper technological developments in the news media given the serious legal issues presented by algorithm-generated content about private citizens. Legal scholar David Anderson’s initial concerns, that Gertz’s negligence standard “favored journalistic orthodoxy because the conventional press was more likely to focus its attention on government and public figures, leaving coverage of privately exercised power to magazines and newspapers that emphasize investigative reporting” (Anderson, 2002, pp. 453-456), continue to ring true in a changing media landscape. About 40 years later, the economic constraints placed upon news organizations suggest that his fear resonates even more loudly in an industry with dwindling human resources and an increased emphasis on producing more content at lower cost. Even more, as the institutional press battles against newer news/entertainment start-ups for both audience and advertising revenue, it is likely that other cost-cutting measures, such as the shrinking role of the copydesk (e.g., Beaujon, 2013; Dunlap, 2017), will only increase the likelihood that defamatory content slips through the cracks.
News organizations will not be able to rely on one of the key defenses that Google and other organizations deploying algorithms to produce content have used to shield themselves from libel suits. Google has often claimed that its algorithm serves as a mere conduit to search results containing libelous information, rather than being the actual publisher of such libelous statements. Conversely, news organizations using automated journalism to produce content cannot marshal convincing arguments that their algorithms were just providing readers a path to harmful, false information.
Courts, news organizations, and libel plaintiffs will be required to confront these issues—determining fault in cases involving algorithmically generated content, and whether and when media organizations are liable as publishers—as automated journalism becomes more widely adopted. In addition to libel, automated journalism will pose challenges in other areas of the law, including intellectual property and newsgathering. Scholars have noted that automated journalism may raise questions regarding who should be considered the author of algorithm-produced content for copyright purposes (Montal & Reich, 2016; Ombelet et al., 2016; Weeks, 2014). As artificial intelligence and automated decision-making advance to the point that automated journalism can gather, synthesize, and publish information with less and less human assistance, courts may be confronted with whether an algorithm could qualify for testimonial privileges (i.e., journalist’s privilege) to protect its sources—however far-fetched such a scenario may seem today.
This article has examined questions related to automated journalism and libel law only in the American context—an important consideration given that the First Amendment to the U.S. Constitution provides greater protection for freedom of speech than many other countries are willing to grant (e.g., Schauer, 2005; Sedler, 2006). As a result, news organizations publishing beyond U.S. borders are likely to face even stiffer penalties and greater regulations in situations in which automated journalism violates the law. For example, European countries vary significantly on the degree of fault that public plaintiffs must show to successfully recover damages for libel, with none providing a standard as high as “actual malice” (Ambrose & Ambrose, 2014).
As automation becomes a defining feature of the media and information landscape, matters of accountability for automated content will become pressing concerns: Who takes responsibility for automated journalism? What happens when automated content libels someone, whether because of erroneous data, poorly programmed algorithms, or some combination that produces not just false information but particularly damaging, reputation-harming material? It was once unimaginable that news organizations would publish content that had not been fully vetted by the copydesk, but market forces—including economic and deadline pressures—are dramatically affecting traditional news routines (Harlow, 2012). Reporters are increasingly becoming their own editors, quickly churning out copy to break news via social media channels (Rawlinson, 2016). As a result, it is no longer unthinkable that news organizations would publish content untouched by human employees. To the extent that automated journalism becomes a fixture in journalism’s future—and every indication suggests that it will be, in the near term or the long term—scholars of journalism and media law, as well as professionals managing such processes in newsrooms, will need to think through the legal as well as social and ethical consequences of such technologies.
Acknowledgements
The authors wish to thank Kyu Ho Youm, Jonathan Marshall First Amendment Chair at the University of Oregon, for his comments on an earlier version of this paper.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
