Microdebates: Structuring debates without a structuring tool

Abstract

Argumentative debates are a powerful tool for reaching agreements in open environments. However, in large-scale settings, such as social networks, making sense of ongoing debates may be a challenging task, and debates risk losing their effectiveness. We thus propose “microdebates” to help organize and confront users’ opinions in an automated way.

Keywords

Online debate, social networks, social media, Twitter, argumentation

1. Introduction

According to Mercier and Sperber [45], the need to create and use arguments to convince others is the main driver behind the evolution of human reasoning. Supporting evidence is that people reason better when they communicate in an argumentative context than in an abstract setting. Moreover, arguments are used to convince others especially in the absence of trust.

Fig. 1.

Network graph representation of the arguments about Lights for sensors in the Evidence Hub for Energy Awareness in KMi. Retrieved on November 22, 2013 from http://isave.evidence-hub.net/. (Colors are visible in the online version of the article; https://dx-doi-org.web.bisu.edu.cn/10.3233/AIC-150690.)

Given the prominent role of arguments in human reasoning, it comes as no surprise that, also in the context of social media, people have become accustomed to arguing online. Online debates are usually organized in threaded sequences of posts and, unlike debates requiring physical presence, they can be very long-lived and involve many active participants and an even larger number of observers and bystanders.

However, this “freedom of expression” comes at a cost. When many individuals participate in a discussion, making sense of opinions emerging from these streams of unstructured text may be challenging even for the participants themselves: there is simply too much “noise”. One way of coping with such noise is to restrict one’s focus on the general sentiment emerging from an ongoing discussion, ignoring any specific claims or opinions that may be there. State-of-the-art opinion mining/sentiment analysis techniques and tools classify the sentiment orientation of opinions by defining positive/negative scales of values for every specific domain. Usually these algorithms need training from large corpora of sentences expressing opinions, such as online reviews about some product, brand, or service [43,49].

Such an approach is especially effective if the domain is well-defined, and if a large enough training set is available. In domains such as customer reviews [36], where the concepts involved can be defined using specialized ontologies, and the jargon is relatively narrow and well-defined, the classification accuracy of existing sentiment analysis algorithms is more than acceptable. In other domains instead, such as in political debate, this is not the case [61]. Importantly, sentiment analysis does not explicitly tell why certain opinions are in place and how they relate to other opinions.

A second approach is to force users to structure their contributions using dedicated tools. Debate-friendly tools that should help users visualize and understand the outcome of a discussion are now becoming popular. Among them there are (a) visualization tools, such as DebateGraph (http://www.debategraph.org); (b) community-based tools relying on user ranking, such as DBee (http://dbee.me), a global debating network which features scoring and ranking with both positive and negative values, and Debate.org (http://www.debate.org), a social network platform where users can start a debate and comment with pro/con ratings against the main argument in the debate; and (c) community-based moderated, professional discussion forums.

An example of the last category is Debatepedia (http://idebate.org/debatabase), an International Debate Education Association project containing a database of more than 500 debates, produced by professional debaters and used as a training set for students who want to learn how to debate effectively and improve the database. Another one is Deliberatorium (http://deliberatorium.mit.edu), a community-moderated system where comments are subject to moderator approval before they may be certified and finally become visible to a larger community. The Evidence Hub [18,19] is an argumentation-based tool to structure conversations, which puts issues, ideas, and evidence at the center of a reflective community of practice. The Evidence Hub adopts a version of the IBIS model [37] to create argument maps which are then visualized in the form of graphs (see Fig. 1). This approach is quite common [3,9]. Other argumentation-enabled web applications are discussed in [4,9,12,52,56].

Applications of such debate-friendly tools include e-gov, e-participation and policy-making. For these specific purposes, diverse platforms are being developed within a number of EU cooperation projects. The present work is part of the ePolicy (http://www.ePolicy-project.eu) project activities.

One of ePolicy’s aims is to develop methods for deriving social impacts through opinion mining on e-participation data extracted from forums and blogs. Other related projects are IMPACT (http://www.policy-impact.eu), which is developing an innovative argumentation toolbox for supporting open, inclusive and transparent deliberations about public policy, and WeGov (http://www.wegov-project.eu), a project which provided members of the European Parliament and of several regional parliaments with a toolbox for exploiting existing social networking sites to engage citizens in two-way dialogues as part of governance and policy-making processes.

Grosse et al. [29] are also oriented towards the deployment of an e-gov framework in which argumentation and debates are used. That is particularly related to our work, because it uses a Twitter-based perspective. However, the focus is on mining opinions from Twitter, by taking Twitter messages into consideration for analyzing underlying arguments, rather than on the definition of a Twitter dialect and on the identification of the emerging computational argumentation framework, which is the subject of the present article. On the other hand, the idea behind DECIDE 2.0, by Chesñevar et al. [14] – a framework integrating argumentation technologies and context-based search [44] for processing citizens’ opinions in social media – is to employ context-based search and knowledge engineers with expertise on defeasible logic programming [28] who will be in charge of maintaining a knowledge base. This is a major difference from the bottom-up argumentation approach which, as we will see shortly, envisages a self-regulating discussion without expert intervention.

More recently, a cluster of projects has been funded in the CAPS (Collective Awareness Platforms for Sustainability and Social Innovation) initiative of the EC under FP7 (https://ec.europa.eu/digital-agenda/en/caps-projects). Notably, the CATALYST project (http://www.catalyst-fp7.eu/) also focuses on related issues, building on the Evidence Hub experience.

The WeGov toolbox, like DECIDE 2.0, relies on sentiment analysis algorithms. IMPACT instead has developed a language-independent approach which aims at modeling policy issues at a conceptual level. The project delivered an argumentation toolbox for supporting open, inclusive and transparent deliberations. The IMPACT toolbox supports argument reconstruction from online text data, and visualization of arguments about policy issues. The argumentation toolbox is pluggable into existing content management systems in order to be used with a variety of e-participation platforms.

These activities show a general interest in argument retrieval for understanding how people argue about policy issues. In such settings it is reasonable to expect users to publish not only an opinion (as in writing a review), but also to expand on their opinions by laying out arguments, to convince others and reply to criticisms. Following this stream of research, in the scope of ePolicy we have been working along two directions: extracting arguments from text [41], and fostering a bottom-up approach to argument production. The present paper introduces and discusses this second approach.

The idea of bottom-up argumentation [55] is that traditional opinion gathering methods, such as questionnaires and polls, pose severe limitations in that they mainly force those interviewed to express preferences upon some predetermined options. Social media may overcome such limitations, enabling online debates between (informed) citizens, who can come up with new ideas and perspectives. The problem is then how to keep conversations manageable; debate-friendly and effective tools are thus needed.

Current online debating tools, such as the aforementioned ones, build on and extend the traditional forum-like structure, where users can reply to or quote other users, by introducing debate-oriented concepts. They are not very different from a standard discussion forum with reputation, moderators and recommendation features. Moreover, they require the user to comply with and adapt to the abstractions they are built around, and not vice-versa.

We believe that introducing built-in debate-oriented concepts is an important step in understanding people’s opinions and facilitating online debate. However, it is hard for such tools to achieve the huge amount of traffic of the main social networks. This is not necessarily related to the size of marketing (i.e. monetary) investments, but could be well explained instead with the concept of tipping point. Even big companies, like Apple or Google, tried to launch a service, like Ping or Orkut, and then were forced to shut it down because it did not reach the threshold of users’ engagement (the so-called hype moment) needed to spread a new online behavior.

Other companies, like Facebook, YouTube and Twitter, have instead had huge success. Would it be possible to exploit the massive online presence of such services to gather structured debates?

The WeGov project is somewhat moving in this direction by developing a toolbox that detects, tracks and mines opinions and discussions on policy oriented topics among existing and well established social networking sites, like the aforementioned and other ones such as Wordpress and Baboo. The key idea is to mine opinions and engage with users by means of channels citizens are already familiar with. However, the project mainly focuses on describing the key features of a discussion, such as: which post is most relevant, how active or important a discussion is, or which user is perceived as most influential, rather than on argument retrieval and on the reasons why users make such and such claims.

Our aim is to encourage free, unconstrained online debates and to provide policy-makers and users with tools to automatically make sense of possibly very lengthy debates. Such tools should not only show the general sentiment around a specific topic, as current sentiment analysis tools do. Instead, they should also identify specific opinions, as well as the relations among them. At the same time, the tools should gather structured debates from well-established social networks, in order to take advantage of the high level of participation and rich discussions that already take place in such contexts. But how to retrieve structured debate from online social networking sites? Following the bottom-up argumentation approach, we identified a possible solution, summarized in the phrase: structuring debates without a structuring tool.

We rely on a well-established convention among online users, namely the ability to tag their own messages [1]. We selected Twitter to start our experimentation, because Twitter users are already accustomed to annotating their messages with #, @, RT and $ tags. Twitter is currently one of the most popular social networks, and it distinguishes itself among them by blurring the boundaries between ludic and serious [38]. It is thus more likely to find users interested in policy issues on Twitter than elsewhere. Finally, Twitter has important features that greatly help in implementing our ideas: messages (tweets) are short – there is a 140 character limit – and thus easy to analyze; they are broadcast and thus publicly visible; they can be retrieved using efficient tag-based search; and there exist several tools that support application integration via the Twitter APIs.

Fig. 2.

A fragment of a Twitter stream, showing an example of microdebates. Twitter organizes its entries top to bottom from newest to oldest. $$ tags and !$ tags represent supported and attacked arguments. (Colors are visible in the online version of the article; https://dx-doi-org.web.bisu.edu.cn/10.3233/AIC-150690.)

In this article, we describe microdebates for Twitter. Figure 2 shows an example of them. The microdebate language is a simple yet powerful Twitter dialect. It allows users to debate about an issue, by explicitly referring to emerging arguments, either in support or in opposition, with no need to learn new interfaces or move to new social networks. We identify computational argumentation, and in particular abstract argumentation [21], as the conceptual and computational framework to model the retrieved arguments and reason from them automatically.

This paper offers a comprehensive description and an empirical evaluation of microdebates and their enabling software tools. It is structured as follows. In Section 2, we show that the most popular organization of online conversation is not well suited to argumentative debates, and motivate the introduction of new solutions. In Section 3 we introduce Twitter and micro-blogs. We then define microdebates and their syntax, and provide examples of bottom-up argumentation via microdebates. Enabling software tools are the subject of Section 4. In Section 5 we introduce the notion of group support of arguments, and discuss the role of weights in microdebates. Section 6 presents some experimentation done with different user groups and elaborates on the observed results. We conclude with Section 7, where we discuss areas of application, criticalities, and future perspectives.

2. Threaded discussions and arguments online

Threading is a popular way to structure online conversations, common to diverse media such as email, blogs, discussion forums, and online platforms in general. User messages (comments, posts, tweets, emails, etc.) are grouped in a hierarchy by topic, with any replies to a message arranged visually near to the original message. A set of messages grouped in this way is called a thread. Figure 3 shows an example of a threaded discussion on Facebook.

Fig. 3.

Excerpt of one of the Syria threads on the New York Times’s Facebook home page, retrieved on September 27, 2013. These are only a small fraction of over 200 comments. (Colors are visible in the online version of the article; https://dx-doi-org.web.bisu.edu.cn/10.3233/AIC-150690.)

Organizing messages via threads has many advantages. For example, in an educational environment, online threaded discussions may offer a forum for quiet students to develop and verbalize ideas; promote in-depth response and reflection; encourage peer affirmation; and provide opportunities for more teacher–student and student–student interaction [24].

Threads are very useful for sorting mail and other types of online content originating from a reply-to mechanism. However, threading does not correspond to the way natural human conversations take place in the real world. Threaded comments risk breaking up the dialogue into a bunch of private conversations instead of an ongoing, open discussion, which leads to a more confrontational debating style [11,33].

In computational argumentation, and in abstract argumentation in particular [21], arguments are unordered objects, connected with one another by binary attack relations. This results in general in a directed argument graph, as opposed to an argument thread. A thread is a special case of a directed graph. If we were to map comments to arguments, and threaded discussions to argument graphs, we would obtain a tree of arguments, which is in general less expressive than an argument graph.

Arguments and their attacks form argumentation frameworks. A (Dung-style) argumentation framework (AF) is defined as a pair $\langle X, A \rangle$, where $X$ is a set of atomic arguments and $A$ is a binary attack relation over arguments, $A \subseteq X \times X$, with $\langle x, y \rangle \in A$ interpreted as “argument $x$ attacks argument $y$”. If some arguments attack some other arguments, not all arguments can be accepted at the same time. Collections of acceptable arguments can be described by various extension-based semantics [2]. Some notable semantics defined by Dung are called the admissible, preferred, and complete semantics. In particular, let $S$ be a set of arguments, $S \subseteq X$:

$S$ is conflict-free if $\forall x, y \in S$, $\langle x, y \rangle \notin A$;

an argument $x \in X$ is acceptable w.r.t. $S$ if $\forall y \in X$ s.t. $\langle y, x \rangle \in A$, $\exists z \in S$ s.t. $\langle z, y \rangle \in A$;

$S$ is an admissible extension if $S$ is conflict-free and all its arguments are acceptable w.r.t. $S$;

$S$ is a preferred extension if it is a maximal admissible set, w.r.t. set inclusion;

$S$ is a complete extension if $S$ is admissible and every argument $x \in X$ that is acceptable w.r.t. $S$ belongs to $S$.

Other extension-based semantics have been defined, such as the grounded, stable, semistable, and ideal semantics [2].
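To make these notions concrete, the following is a minimal brute-force sketch in Python, not the method used by any of the tools discussed in this article. It enumerates all subsets of a small framework and tests each one, using Dung's standard definition of a complete extension (an admissible set that contains every argument acceptable with respect to it). The function names and the three-argument framework are hypothetical and chosen only for illustration.

```python
from itertools import combinations

def conflict_free(S, attacks):
    """No member of S attacks a member of S (including itself)."""
    return not any((x, y) in attacks for x in S for y in S)

def acceptable(x, S, args, attacks):
    """Every attacker y of x is itself attacked by some z in S."""
    return all(
        any((z, y) in attacks for z in S)
        for y in args if (y, x) in attacks
    )

def admissible(S, args, attacks):
    return conflict_free(S, attacks) and all(
        acceptable(x, S, args, attacks) for x in S
    )

def complete(S, args, attacks):
    # Admissible, and every argument that S defends is already in S.
    return admissible(S, args, attacks) and all(
        x in S for x in args if acceptable(x, S, args, attacks)
    )

def all_subsets(args):
    items = sorted(args)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def extensions(args, attacks, test):
    """All subsets of args that satisfy the given semantics test."""
    return {S for S in all_subsets(args) if test(S, args, attacks)}

def preferred(args, attacks):
    """Admissible sets that are maximal w.r.t. set inclusion."""
    adm = extensions(args, attacks, admissible)
    return {S for S in adm if not any(S < T for T in adm)}

# Hypothetical framework: a and b attack each other, b also attacks c.
ARGS = {"a", "b", "c"}
ATTACKS = {("a", "b"), ("b", "a"), ("b", "c")}
```

In this example the admissible sets are the empty set, {a}, {b} and {a, c}; {a} is not complete because it defends c without containing it. The enumeration is exponential in the number of arguments, so it is only suitable for very small frameworks.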

From a practical viewpoint, nowadays there are many efficient implementations of abstract argumentation semantics [13], which can be used to support social applications with an underlying Dung-style knowledge representation. Figure 4 shows the output produced by the web interface of one such tool, ASPARTIX (http://rull.dbai.tuwien.ac.at:8080/ASPARTIX/), from a model of the Syria example shown in Fig. 2.

Fig. 4.

Argumentation framework emerging from the Syria microdebate: ASPARTIX interface showing a complete extension. (Colors are visible in the online version of the article; https://dx-doi-org.web.bisu.edu.cn/10.3233/AIC-150690.)

If argument graphs are in fact argument threads, the last arguments in the thread are by construction unchallenged, and therefore acceptable. In a sense, this reflects the way threaded discussions evolve: the most recent post has, typically, an advantage over its predecessors, as it is often displayed on top of them, and – at least for a while – it stays unchallenged. On the other hand, we think that graphs represent a better way to organize arguments in a debate.
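The advantage of the most recent post can be illustrated with a small Python sketch. It assumes, purely for illustration, a thread of four posts in which each reply attacks the post it replies to, and it computes the grounded extension (one of the semantics mentioned above) by iterating from the empty set until the set of defended arguments stops growing. The post labels and the function name are hypothetical.

```python
def grounded_extension(args, attacks):
    """Least fixed point of 'arguments defended by S', starting from the empty set."""
    S = set()
    while True:
        defended = {
            x for x in args
            # x is defended if each of its attackers is attacked by a member of S
            if all(
                any((z, y) in attacks for z in S)
                for y in args if (y, x) in attacks
            )
        }
        if defended == S:
            return S
        S = defended

# Hypothetical thread p1 <- p2 <- p3 <- p4: every reply attacks its parent.
POSTS = {"p1", "p2", "p3", "p4"}
REPLIES = {("p2", "p1"), ("p3", "p2"), ("p4", "p3")}
```

Here the last post p4 has no attackers and is accepted immediately; it defeats p3, which in turn reinstates p2, so the grounded extension is {p2, p4}. Under this reading the outcome of a thread is decided entirely by whoever posted last.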

Our work is an attempt to use the existing technology to host debates where arguments and their attack relations are clearly identified, and mutually compatible arguments and opinions can be clustered and visualized together.

Our proposal is not to do away with thread-based discussions altogether, but to enhance them. We enable authors to mark the arguments in their posts, and let arguments be expressed in several posts, possibly by several users, in a collaborative fashion, as opposed to mapping individual posts to arguments one-to-one. In this way, we make it possible for “older” arguments to attack “newer” arguments. This also allows posts to make explicit reference to the arguments they attack.

3. Micro-blogging and microdebates

Micro-blogging is a form of communication whereby users can describe their current status in short posts distributed by instant messages, mobile phones, email or the Web [20]. A very popular platform for micro-blogging is Twitter, where people talk about their daily activities and seek or share information [34] by broadcasting brief textual messages (tweets) to their followers [32].

Users can also add tags to their messages. Such tags include the now famous hashtag, i.e., the # symbol followed by a text string, which represents the stream of news a tweet belongs to; the @ symbol followed by a user name, also adopted by other social networking facilities such as Facebook, to identify a Twitter user, so that they can be explicitly addressed within a tweet; and the RT tag, which indicates a re-tweet, i.e., a re-post of a tweet from another user.

Figure 5 shows a tweet broadcast on January 22, 2013 by user @EU_Commission.

Fig. 5.

A sample tweet. (Colors are visible in the online version of the article; https://dx-doi-org.web.bisu.edu.cn/10.3233/AIC-150690.)

The tweet indicates a reposting (RT) of a message relevant to the topic of literacy. This tweet reaches all followers of @EU_Commission, but it is explicitly addressed to users @VassiliouEU and @dennisabbott. It signals a speech, titled “Overcoming the literacy taboo”, delivered at the European Parliament by Twitter user @VassiliouEU and provides a URL with further information.

The Twitter jargon may seem cryptic to the novice. Nevertheless, it is interesting to notice its adoption by government officials. For instance, as of January 2013, all US senators as well as 90% of US representatives were on Twitter, as officially announced by the company (pic.twitter.com/JkuJMSCI).

Recently, Twitter has also endorsed the $ tag (cashtag), which is similar to the hashtag, but is especially used in tweets about the stock market (http://money.cnn.com/2012/07/31/technology/twitter-cashtag/index.htm).

The interesting fact about the hashtag (from a sociological perspective) is that users invented it. Twitter users started adding hashtags to their messages sometime around February 2008 [8]. In a short while, hashtags became very widespread. Twitter simply accommodated its users’ behavior by highlighting the hashtags in the tweet and by facilitating their retrieval.

Tagging behaviors in Twitter are interesting not only for their bottom-up nature, but also because they are distinct from those in other social media. Twitter users are less likely to index messages for later retrieval [51]. This reflects the fact that tagging patterns in Twitter have a conversational rather than organizational nature [31], i.e., users follow what people are saying about a topic by following the related tag.

Users can also reply to tweets, which results in threaded discussions. As we pointed out in Section 2, threaded discussions are not expressive enough to represent argument graphs. Microdebates address exactly this shortcoming.

A microdebate is a set of tweets, each contributing to a debate. The contribution may be for instance a statement expressing an opinion, providing some evidence, or defining a fully-fledged argument. Such tweets may contain explicit references to ideas expressed in other tweets in the same debate. Such reference is made via short combinations of characters that express positive or negative relations.

In this way, all that is asked of the user is to use certain combinations of characters in order to put their opinion in the context of other opinions. In exchange, debates will be easier to parse, and a number of visualization tools could be deployed to facilitate browsing, participation and focus. Indeed, microdebates can be processed by automatic reasoners, such as argumentation-based reasoning tools [7,23] and the output can be visualized graphically as clusters of coherent opinions, where different clusters may attack each other. This could foster awareness of different opinions on a topic. In some cases it may encourage arguers to reach an agreement.

Microdebates are inspired by Twitter’s micro-blogging nature. They consist of a stream of tweets annotated with all the available tags, plus two new tags to mark opinions and conflicts between opinions. Introducing these special tags to a tweet enables us to convert a stream of tweets into a microdebate – where the prefix “micro” reflects the micro-blogging nature of tweets.

Let us summarize the meaning of tags in the microdebate Twitter dialect:

# (hashtag) identifies the discussion (e.g., #literacy, in a debate about literacy): as customary, this ensures that a tweet will appear in the right streams;

@ (at) identifies a Twitter user: for instance, this could be used for call and response between two users, or, as it often happens, by a user associated with a web site that collects and displays all tweets directed to that user;

RT (Re-Tweet): re-tweeting repeats a post authored by others; it contributes to making the post more salient, by bringing it back up in a (chronologically ordered) stream;

$$ (double-cashtag), as in $$redLooksGreat: here, redLooksGreat has to be interpreted as (the label of) an opinion or argument supported by the author of the tweet;

!$ (bang-cashtag), as in !$greenLooksBetter: here, greenLooksBetter has to be interpreted as (the label of) an opinion or argument opposed by the author of the tweet.

There is no special syntax for tweets belonging to a microdebate, other than the usual Twitter syntax which imposes a 140 character limit for a tweet, and space-free tags. However, tweets belonging to a microdebate should at least contain a discussion identifier (hashtag), and an argument identifier (double-cashtag). There are no other restrictions on the number and type of tags a tweet can contain.
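The tag conventions above lend themselves to simple pattern matching. The following Python sketch shows one possible way to extract the tags from the text of a tweet and to check the two minimal requirements just stated (a hashtag and a double-cashtag) together with the 140 character limit. The regular expressions, the `ParsedTweet` class and the function names are assumptions made for this illustration; they are not taken from the tools described in this article.

```python
import re
from dataclasses import dataclass

HASHTAG_RE = re.compile(r"(?<!\w)#(\w+)")   # discussion identifier
MENTION_RE = re.compile(r"(?<!\w)@(\w+)")   # addressed user
SUPPORT_RE = re.compile(r"\$\$(\w+)")       # double-cashtag: supported opinion
ATTACK_RE = re.compile(r"!\$(\w+)")         # bang-cashtag: attacked opinion
RETWEET_RE = re.compile(r"^\s*RT\b")        # re-tweet marker at the start
MAX_TWEET_LENGTH = 140

@dataclass
class ParsedTweet:
    hashtags: list
    mentions: list
    supported: list
    attacked: list
    is_retweet: bool

def parse_tweet(text):
    """Collect the microdebate tags found in the text of one tweet."""
    return ParsedTweet(
        hashtags=HASHTAG_RE.findall(text),
        mentions=MENTION_RE.findall(text),
        supported=SUPPORT_RE.findall(text),
        attacked=ATTACK_RE.findall(text),
        is_retweet=bool(RETWEET_RE.search(text)),
    )

def belongs_to_microdebate(text):
    """At least one hashtag and one double-cashtag, within the length limit."""
    parsed = parse_tweet(text)
    return (
        len(text) <= MAX_TWEET_LENGTH
        and bool(parsed.hashtags)
        and bool(parsed.supported)
    )
```

For example, `parse_tweet("@mdeb_a They have a video of $$rebels loading chemical weapons #SyriaWar !$Assad")` yields the hashtag `SyriaWar`, the mention `mdeb_a`, the supported label `rebels` and the attacked label `Assad`.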

This is how microdebates work in practice:

content elements are tweets with a suitable # tag, used to identify the specific microdebate users are contributing to. (Twitter then displays such tweets in the public stream associated with such a hashtag.);

users annotate their tweets using $$/!$ tags. When a user @A specifies $$opinion_1, it means that the text in that tweet supports opinion_1, which can be an opinion expressed by the same user in the comment, or by another user @B. In that case opinion_1 will be seen as based on two comments, @A’s and @B’s respectively. The opinion name is an abstract label, preferably but not necessarily evocative of the user’s opinion;

users can attack (counter) opinions using the !$ tag, e.g., by a tweet containing a !$opinion_2 tag. This negation states that the tweet is a comment that attacks opinion_2. This enables establishing relations among opinions;

a single tweet may well fit more than one argument; in that case, it may include more than one double-cashtag, indicating support for a set of opinions. Similarly, a single tweet may explicitly attack more than one argument: in that case, it will include more than one bang-cashtag;

if a user adds a tweet with a new $$ tag, the user is in fact identifying a new opinion in the microdebate;

RT, i.e. re-tweets, are handled like new tweets that add “weight” to the original tweet.

Our interpretation of this set of tags allows us to distinguish between different microdebates, because each of them will have a different hashtag, and between supported ($$) and attacked (!$) opinions or arguments within each microdebate. Consistently with Dung’s formulation, our syntax does not allow for the presence of support relations. In spite of its simplicity for the user, this approach enables us to translate microdebates into fully-fledged abstract argumentation frameworks.

To demonstrate how microdebates unfold, let us consider again the stream of tweets in Fig. 2, between three fictitious Twitter users: Angel Eyes (@mdeb_a), Blondie (@mdeb_b), and Tuco (@mdeb_c). This set of tweets is a fragment of a longer microdebate modeled on a discussion that took place on Facebook’s New York Times home page in August/September 2013. The whole microdebate and pointers to the original posts are in the Appendix.

In this example, a microdebate consisting of around 30 tweets eventually produces around 20 focal arguments. Such arguments are not statically defined. They do not even exist before the debate starts. We can see instead how arguments take shape bottom-up [55] as time goes by. $$rebels is an embryonal “argument”, thrown on the table at some point in the discussion, as a statement with supporting evidence (They have a video of $$rebels loading chemical weapons into a pick up and Germany says rebels used them). Its author, @mdeb_b, also indicates that this argument attacks another argument referred to by !$Assad. As a different position emerges, arguments are fleshed out by their supporters. A number of tweets further into the dialogue, @blondie, @angel_eyes and @tuco are focusing on a limited number of arguments, which surely contain a great deal of implicit premises and conclusions (the debate may go on indefinitely), but are, nevertheless, identifiable as such. For example:

rebels	There is a video of rebels loading chemical weapons into a pick up and Germany says rebels used them. UN confirms. Even some Americans do not believe that Assad attacked his own people (implicit claim: this was a set up organized to justify an intervention against Assad).
assad	Assad did It. We’ve investigated before and evidence showed they’ve used chemical weapons (implicit claim: this was not a set up).
enforce	Someone should enforce some order. Two years of civil war has to end. It is our duty to make sure they’re not using chemical weapons. We should intervene.
out	We can’t prove who used the gas; it’s not our war; There is no logical reason to become involved; money had better be used to create jobs; bombing Syria won’t solve the problem; AlQaeda may take over Syria; we don’t want another Iraq. These are all valid reasons not to intervene. Therefore we should stay out.

From a Twitter user’s perspective, especially for those interested in policy issues, the motivation to use microdebates is that opinions can be named, and thus identified and given prominence, which is not possible in standard Twitter exchanges. This is thanks to a number of visualization tools that we are currently developing (see Section 4).

In the context of ePolicy, where we are interested in gathering and analyzing e-participation data extracted from forums and blogs, and where we aim to encourage citizens to participate and get engaged in the policy-making life cycle, this approach seems to offer a promising avenue.

In the next section, we discuss some prototypes we have implemented so far.

Fig. 6.

A snapshot of a discussion labeled #speakinpublic loaded into the NetLogo-based analysis tool. (Colors are visible in the online version of the article; https://dx-doi-org.web.bisu.edu.cn/10.3233/AIC-150690.)

4. Tools

The first microdebate analysis tool prototype was implemented as a NetLogo model [59]. Figure 6 shows a snapshot of its user interface. In this model, each NetLogo agent (or turtle) represents an argument used in the microdebate. Attacks between arguments are represented by directed links from one agent to another one. Notice that in NetLogo turtles are primitive types, and they are essential elements in creating and reasoning about graphs. Therefore, unintuitive as it may seem to those not familiar with the language, using turtles and directed links to model graphs, such as AFs, is common practice in the NetLogo community. The choice of language for implementing this first prototype fell on NetLogo because it is a framework widely known to potential audiences, such as computational sociologists, and because of its simplicity.

NetLogo does not provide native methods for computing the semantics of abstract AFs. Instead, we bundle together, using the NetLogo API, an extension called semconarg. That includes ConArg [7], a Java tool that uses constraint programming to model and solve different reasoning problems related to abstract AFs. In particular, we rely on ConArg to compute admissible, complete, and stable semantics with and without weights (see Section 5), and to solve problems such as enumerating and counting all the extensions and deciding if an argument is credulously or skeptically accepted. The outcome of such reasoning is reflected in the visualization.
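The difference between credulous and skeptical acceptance can be stated in a few lines once the extensions are available. The Python sketch below assumes that the extensions have already been computed by some solver and are given as a list of sets; it does not use or reproduce the ConArg interface, and the function names and sample data are hypothetical.

```python
def credulously_accepted(argument, extensions):
    """The argument belongs to at least one extension."""
    return any(argument in ext for ext in extensions)

def skeptically_accepted(argument, extensions):
    """The argument belongs to every extension (vacuously true if there are none)."""
    return all(argument in ext for ext in extensions)

# Hypothetical solver output for some framework and semantics.
EXTENSIONS = [{"a", "b"}, {"a", "c"}]
```

With this sample output, `a` is skeptically accepted, `b` and `c` are only credulously accepted, and counting the extensions amounts to `len(EXTENSIONS)`.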

A concise summary of reasoning problems in AFs is provided by Charwat et al. [13], where the authors also report known results about the computational complexity of reasoning in AFs. An experimental assessment of ConArg is given in [6]. Due to the lack of benchmarks for abstract argumentation systems [46], but still wanting to study the scalability of their algorithm on realistic cases, the authors run experiments on particularly complex networks of different topologies, such as those known in the literature as Barabási, Kleinberg, Erdős–Rényi and Watts–Strogatz networks. The experiments show that in such worst-case AFs ConArg can find all the stable and complete extensions of networks of 40 to 60 nodes in a matter of seconds on an Intel® Core™ i7 2.4 GHz processor with 16 GB of RAM. In practice, our initial experiments indicate that AFs deriving from online debates are much sparser graphs; we therefore expect ConArg to be able to handle even larger AFs without problems. According to our measurements, most of the computation time is instead devoted to the visualization tasks.

Fig. 7.

3D view of the #SyriaWar debate in NetLogo.

To use the NetLogo model, as a first step, one should enter a hashtag identifying a microdebate in the GUI’s debate text box. For instance, the debate identifier in Fig. 6 is #speakinpublic. At that point, NetLogo retrieves from Twitter (via the Twitter API) and parses the stream of tweets containing that hashtag. We obtain:

a new argument for each new double cashtag encountered;

a new attack for each pair (double cashtag, bang cashtag) encountered.
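The two extraction rules above can be sketched in Python as follows. This is an illustration, not the NetLogo implementation: we assume that a double cashtag is written `$$name` and a bang cashtag `!$name`, that the double-cashtag argument is the source of the attack, and the function name and sample tweets are made up for the example. Retrieval via the Twitter API is left out.

```python
import re
from collections import Counter

# Assumed tag syntax: "$$name" is a double cashtag, "!$name" a bang cashtag.
DOUBLE_CASHTAG = re.compile(r"\$\$(\w+)")
BANG_CASHTAG = re.compile(r"!\$(\w+)")


def parse_microdebate(tweets):
    """Return (argument counts, attack counts) for an iterable of tweet texts."""
    arguments = Counter()
    attacks = Counter()
    for text in tweets:
        supported = DOUBLE_CASHTAG.findall(text)
        opposed = BANG_CASHTAG.findall(text)
        # One argument occurrence per double cashtag.
        for name in supported:
            arguments[name] += 1
        # One attack occurrence per (double cashtag, bang cashtag) pair.
        for source in supported:
            for target in opposed:
                attacks[(source, target)] += 1
    return arguments, attacks


# Made-up tweets, loosely modeled on the #speakinpublic debate.
tweets = [
    "#speakinpublic $$useless review meetings are of no importance",
    "#speakinpublic $$reviewmeetings !$useless they are a good venue to share ideas",
    "#speakinpublic $$argument !$useless that is not a valid argument",
]
arguments, attacks = parse_microdebate(tweets)
```

The counts returned here are raw occurrence counts, of the kind that could drive node radius and edge thickness in a visualization.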

The NetLogo tool analyzes the content of the tweets and then visualizes the outcome. The user can interact with the graph in 2D (see Fig. 6) or by moving, orbiting, and zooming a 3D representation of the underlying AF, such as the one in Fig. 7. Nodes and edges have different radius and thickness, depending on the number of times an argument or an attack is found in the given set of tweets. It is possible to inspect edges and nodes to find which tweets are associated with them.

Our prototype supports all semantics supported by ConArg, which include, among others, the admissible, complete, and stable semantics. The interface lets the user choose a specific microdebate by selecting a debate ID (buttons debate and Get Debate), indicate which semantics to apply (buttons semantic and Get Extension), and finally choose one among possibly multiple extensions to visualize (buttons available-extensions and Apply). The user can also specify an alpha value, whose meaning will become clear in the next section.

Figure 6 shows all argument tags of the selected debate. A complete extension – actually, the only complete extension – in this example contains two arguments, represented by larger circles on the right-hand side of the figure: $$argument, which claims that $$useless is not a valid argument, and $$reviewmeetings, which claims that review meetings are a good venue to share ideas. These arguments defeat $$useless, which states that review meetings are of no importance, and $$solittletime, which states that usually review meetings do not give enough time to really express one’s ideas. The outcome of the microdebate #speakinpublic can be thus compactly summarized.

Alongside developing analytic tools for the policy maker, we have been developing user-oriented tools. In particular, we have implemented a web service and interface for storing and browsing microdebates, and an Android application. One can therefore take part in microdebates either via a Twitter client (such as the Twitter web site or any mobile App for Twitter), in which case the microdebate will not be visualized any differently from any other Twitter stream; or one can use the Microdebates web site (http://argu.ing.unibo.it/microdebates/) from any device with a browser and an Internet connection; or one can use the Microdebates App for Android. The web site and App enable browsing views of debates and arguments using tag clouds. Users who have a public account with Twitter can also send tweets to the microdebates. Tag clouds are visual presentations of a set of words, typically a set of “tags” selected by some rationale, in which attributes of the text such as size, weight, or color are used to represent features, such as frequency, of the associated terms [50]. By looking at the tag cloud, a participant can form a general impression of the underlying arguments and their status in the ongoing discussion. Tapping (or clicking) on a tag cloud shows the full text of the related tweets. This visualization method seems to be much more accessible to the non-tech-savvy than other interfaces that put explicit emphasis on a data model (the underlying AF), which should instead remain invisible to a large class of end users (for example, to those interested in participation but not in analytics).

Fig. 8.

Some tag clouds generated from the Syria discussion.

Figure 8 shows automatically generated tag clouds for some arguments in the Syria microdebate. Each tag cloud shows a group of keywords that describe, somehow, the content of the tweets in support of that argument. The font color indicates the status of the argument in the debate, which can either be skeptically accepted (present in all extensions), credulously accepted (present in at least one extension) or defeated (absent from all extensions). The font size is indicative of the prominence of a certain keyword in the set of tweets supporting the central argument. When the user touches an argument’s tag cloud, the App shows the tweets that support or attack such an argument. Arguments are ranked based on the support they receive (see Section 5) and displayed top to bottom based on their ranking.
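The three statuses can be derived directly from the set of extensions returned by the solver. The following Python sketch illustrates the classification; the function name and the sample data are hypothetical, and we report only the strongest status of each argument (a skeptically accepted argument is also credulously accepted).

```python
def argument_status(arguments, extensions):
    """Map each argument to its strongest acceptance status."""
    status = {}
    for arg in arguments:
        hits = sum(1 for ext in extensions if arg in ext)
        if extensions and hits == len(extensions):
            status[arg] = "skeptically accepted"   # present in all extensions
        elif hits > 0:
            status[arg] = "credulously accepted"   # present in at least one
        else:
            status[arg] = "defeated"               # absent from all
    return status


# Hypothetical debate with four arguments and two extensions.
arguments = ["a", "b", "c", "d"]
extensions = [{"a", "b"}, {"a", "c"}]
status = argument_status(arguments, extensions)
```

A front end could then map each status to a font color; the actual color scheme used by the App is not specified here.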

In this way, each tweet can potentially impact the visual representation of a discussion: some keywords can emerge, gain emphasis, or disappear as new tweets about a given argument are broadcast. By giving users a possibility to contribute also visually to a discussion, we can improve an ongoing discussion in terms of focus, accessibility, and impact.

Because tag clouds rely on linguistic features, the Microdebates visualization in the web site and Android App is language-dependent. Currently, English and Italian are supported. The language is automatically detected by the server routines.

5. Weighted microdebates

As Dung observes [21], the way humans argue is based on a very simple principle which is summarized succinctly by an old saying:

The one who has the last word laughs best.

Indeed, considering all attacks equally important may give rise to counter-intuitive outcomes and makes the whole framework unstable. It may happen that many users believe in an argument, or in a specific attack, and what we would expect is that such an argument or attack is, somehow, acceptable. However, a new attack on a single argument posted by an individual user may very well suffice to defeat everything everybody else agrees upon. Although this is not a logical problem in the abstract world, it does create a problem in concrete applications of microdebates. We do want to let mainstream arguments be challenged and possibly defeated by new arguments, but intuitively, this should happen only if there is enough consensus on such new arguments.

Notice that this is not only a problem of microdebates. Online community discussions are subject to this and all other kinds of “bad behaviour”, whose effects can be limited with the help of various techniques [35].

For microdebates, we address the issue by relying on weighted argumentation frameworks [22]. Thus we consider not only the content of the tweets, but also the number of tweets containing the same arguments and attacks, and how many times these are re-tweeted.

Weighted Argument Frameworks (WAFs) are a natural extension of Dung’s AFs. The idea is simple: each attack between two arguments has a weight that specifies its intensity. A weighted argument system is thus a triple $W = \langle X, A, w \rangle$, where $\langle X, A \rangle$ is a Dung-style abstract argument system and $w : A \to \mathbb{R}_{>0}$ is a function assigning positive real-valued weights to attacks.

A possible semantics of WAFs is described by Dunne et al. using the idea of inconsistency budgets [22]. An inconsistency budget β defines how much inconsistency we are prepared to tolerate within an argumentative framework. Then, we can disregard attacks up to a total weight of β in order to find extensions in a Dung-style AF.
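In other words, a set of arguments is a solution under budget β if it is an extension of $\langle X, A \setminus R \rangle$ for some set of attacks $R \subseteq A$ whose total weight does not exceed β. The following Python sketch enumerates such solutions by brute force. It is only meant to illustrate the idea: the function names are ours, the stable semantics is chosen arbitrarily as the underlying Dung-style semantics, and the enumeration is exponential in the number of attacks.

```python
from itertools import combinations


def stable_extensions(args, attacks):
    """Conflict-free sets that attack every argument outside them."""
    ordered = sorted(args)
    result = []
    for k in range(len(ordered) + 1):
        for combo in combinations(ordered, k):
            s = frozenset(combo)
            internal = any((a, b) in attacks for a in s for b in s)
            covers = all(any((d, a) in attacks for d in s)
                         for a in set(args) - s)
            if not internal and covers:
                result.append(s)
    return result


def beta_extensions(args, weights, beta, semantics=stable_extensions):
    """Extensions obtainable by dropping attacks of total weight <= beta.

    weights maps each attack (source, target) to a positive weight.
    """
    attacks = list(weights)
    found = set()
    for k in range(len(attacks) + 1):
        for dropped in combinations(attacks, k):
            if sum(weights[e] for e in dropped) <= beta:
                remaining = set(attacks) - set(dropped)
                found.update(semantics(args, remaining))
    return found


# Hypothetical WAF: a attacks b with weight 1.
args = {"a", "b"}
weights = {("a", "b"): 1}
strict = beta_extensions(args, weights, beta=0)
tolerant = beta_extensions(args, weights, beta=1)
```

With β = 0 the only solution is {a}; with β = 1 the attack can be disregarded, so {a, b} becomes a solution as well.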

Fig. 9.

Another snapshot of the NetLogo-based analysis tool, this time considering weights.

Some generalizations of this approach emerged in recent literature. In particular, Coste-Marquis et al. [16] focus on generalizing the WAF setting by considering other ways to aggregate weights than using summation and to show how weights can be exploited to define new notions of extensions. Leite and Martins [39] propose a class of semantics for “social abstract argumentation”, extending Dung-style frameworks with positive (pro) and negative (con) votes. Efficient implementations of these semantics have been presented in [15].

Bistarelli and Santini [7] propose a solution that captures, within a single parametric semiring-based framework, the semantics of the different metrics that the literature handles with independent models. That leads to a unifying modeling framework, supported by Soft Constraint Programming techniques. Moreover, they focus on small-world networks and are strongly oriented towards application in real-world contexts, like discussion fora or online social networks, where arguments may be rated by users, leading to the definition of WAFs. These algorithms are efficiently implemented in the aforementioned ConArg Java tool (see discussion in Section 4), which suffices for our purposes. For these reasons, we decided to adopt Bistarelli and Santini’s solution, and embed their software in semconarg.

Inconsistency budgets come in handy when we try to incorporate the wisdom of the crowd into argumentative systems. In a social networked environment it does matter how many people agree with a certain attack, and that should affect the outcome of the debate, in terms of the extensions we are prepared to accept.

In our setting, weights are related to how many users reiterate the same attack. In particular, every new attack has an initial weight of 1. For each new tweet that expresses an attack already present in the framework, if the tweet comes from a user that has not expressed that attack yet, the weight associated to that attack is increased by one. This also applies to re-tweets.
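Under this rule, the weight of an attack equals the number of distinct users who have expressed it, whether by tweet or re-tweet. A minimal Python sketch follows; the function name, the event format, and the sample users are assumptions made for the example.

```python
from collections import defaultdict


def attack_weights(events):
    """Compute attack weights from (user, source, target) events.

    Each event is one tweet or re-tweet in which `user` expresses the
    attack source -> target. Repetitions by the same user do not count.
    """
    supporters = defaultdict(set)
    for user, source, target in events:
        supporters[(source, target)].add(user)
    # The first user gives weight 1; each further distinct user adds 1.
    return {attack: len(users) for attack, users in supporters.items()}


# Hypothetical stream: alice repeats herself, carol re-tweets bob.
events = [
    ("alice", "useless", "reviewmeetings"),
    ("bob", "useless", "reviewmeetings"),
    ("alice", "useless", "reviewmeetings"),
    ("carol", "useless", "reviewmeetings"),
    ("dave", "argument", "useless"),
]
weights = attack_weights(events)
```

Here the attack from useless to reviewmeetings gets weight 3, because alice's second tweet is ignored, and the attack from argument to useless gets weight 1.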

ConArg proposes a new approach to weighted extensions, called α-extensions, which is particularly well suited to microdebates. Similarly to Dunne et al.’s inconsistency budgets, a certain level of inconsistency in the theory is tolerated, thus attacks that sum up to the threshold level β, but do not exceed β, are tolerated within each extension. In ConArg’s α-extensions, however, defenses are also weighted, as well as attacks. In order to defend itself, an extension should have arguments that counter-attack external attacks, as usual, but the counter-attacks should overall outweigh the external attacks. An α-extension is thus a set of arguments that defends itself because those who agree with it outnumber those who agree with its attackers.

This feature is particularly relevant in our context: to be successful, an attack should be not only put in place, but also reach a significant consensus, otherwise it will not be effective. In this way, attacks that do not attract enough consensus are disregarded and the argumentative framework, i.e. the microdebate, cannot be spoiled by a single user with a “last minute” broadcast.
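The intuition can be illustrated with a small Python check. This is a simplified reading of the informal description given here, not ConArg's formal definition: we assume that the total weight of attacks inside the set must not exceed α, and that, for each external attacker, the total weight of the set's counter-attacks on it must be at least the total weight of its attacks on the set (using "at least" rather than "strictly more" is our assumption). Function names and weights are hypothetical.

```python
def attack_weight(sources, targets, weights):
    """Total weight of attacks going from `sources` to `targets`."""
    return sum(w for (s, t), w in weights.items()
               if s in sources and t in targets)


def is_alpha_admissible(ext, args, weights, alpha):
    """Simplified weighted admissibility check (see assumptions above)."""
    ext = set(ext)
    # Tolerate internal conflict up to a total weight of alpha.
    if attack_weight(ext, ext, weights) > alpha:
        return False
    # Counter-attacks must outweigh each external attacker's attacks.
    for attacker in set(args) - ext:
        incoming = attack_weight({attacker}, ext, weights)
        outgoing = attack_weight(ext, {attacker}, weights)
        if incoming > 0 and outgoing < incoming:
            return False
    return True


# Hypothetical WAF: a is backed by four users against b, b by one user
# against a, and c by one user against a.
args = {"a", "b", "c"}
weights = {("a", "b"): 4, ("b", "a"): 1, ("c", "a"): 1}
majority_ok = is_alpha_admissible({"a", "c"}, args, weights, alpha=1)
no_tolerance = is_alpha_admissible({"a", "c"}, args, weights, alpha=0)
minority_ok = is_alpha_admissible({"b"}, args, weights, alpha=1)
```

In this example the set {a, c} passes with α = 1 despite the internal attack from c to a, fails with α = 0, and the single-user attack by b is not enough for {b} to defend itself against a.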

The application of α-extensions is illustrated in Fig. 9. The microdebate is the same as in Fig. 6, but this time repetitions and re-tweets (RTs) have been considered, allowing for a majority who supported the attack $$useless versus $$reviewmeetings. We can see that RTs have the effect of making the $$useless argument become a part of the α-complete extension.

As we see from the picture, adding group support reverses the dominant position in the microdebate. The α-extension contains $$useless and $$solittletime: the majority of participants do not like review meetings. At the same time, with an α threshold of 5, a little inconsistency is tolerated in the extension, thus also $$argument, which attacks $$useless, is allowed in the extension.

So we can say that if the outcome of microdebates is decided using WAF, the one who has the last word laughs best – only if enough words have been said that agree with that last word.

However, while microdebates allow users to contribute to the acceptability of an argument with multiple tweets, there is no guarantee that each argument in the microdebate is a well-formed one. We cannot guarantee that, because by adopting an abstract approach, we do not look inside the arguments. Thus the outcome of microdebates should not be considered as a logical truth, but as a socially accepted position.

Fig. 10.

Some examples of AFs extracted from our initial experimentation with students at the University of Bologna.

6. Experiments

We ran some initial experiments in November 2012 with students from the University of Bologna. We asked them to use microdebates to discuss certain issues, such as:

what sort of jobs they wish for their future (#lavorochevorrei),

what they think of microdebates (#microdibattiti),

what they think of a candidate for the Italian Democratic Primaries (#renzidovevavincere),

what they think of speaking in public (#speakinpublic).

Since back then we had neither a server to collect tweets nor any specially designed user interface, we asked users to participate using their own devices and tools, and to address their tweets also to a Twitter profile we created specifically for this purpose (@microdebate). We used the Twitter API to retrieve and parse tweets from the microdebate profile, where we collected the debates that occurred during this experimental phase.

Figure 10 shows a number of graphs corresponding to some of the microdebates resulting from the experimentation. We collected twelve microdebates. Those illustrated here belong to a sample that spans from very short dialogues to more complex debate structures.

The resulting microdebates are of various graph structures, as there are no restrictions on the number and chronological order of the attacks. Indeed, linear graphs (see Fig. 10(b), (f)) as well as non-linear graphs (see Fig. 10(d), (e)) were possible outcomes of microdebates.

Based on this initial experimentation with students we observed, encouragingly, that users who had never been exposed to any type of formal argumentation, computational or otherwise, got accustomed to our syntax almost immediately.

As of January 2014, the Microdebates App tool introduced in Section 4 became available, so we could run further experimentation. A first study, described in [60], was designed to understand whether Microdebates App for Android could provide understandable, useful input to a human user; and under which circumstances. We also wanted to gauge how much the user experience would be influenced by the system’s calibration (in particular, by the value given to α), and whether having to create new cashtags would be seen as a hurdle by users not accustomed to microdebates.

For this study, we approached ten participants in the 25–34 age group, all of them with an Android phone and a Twitter account. The experiment was conducted in English, in Turkey among non-native English speaking participants, all with a reasonable command of English. We divided all participants into two equally sized groups; each group was given a topic, and a 40 min time frame, to discuss using Microdebates App. At the end of 40 min we gave a two-hour break. Then we gave a different topic, and an additional 40 min for a second microdebate. Eventually, we asked participants to answer an anonymous survey.

The topics were: Are occupy protest movements justified? and Is nuclear energy justified and should it be expanded? In the first debate, participants were allowed to create new cashtags in order to label their arguments. In the second debate, participants were given a fixed set of cashtags, each one with a brief explanation of the concepts around it. These conditions were the same for both groups. α was set to 1 for a group’s first debate (#mdoccupy) and to 3 for its second debate (#mdnuke). Conversely, α was set to 3 for the other group’s first debate (#mdprotest), and to 1 for its second debate (#mdenergy).

We observed that the structure of the debates was not visibly influenced by α, and that there was no substantial difference between debates whose cashtags were given and those with free cashtags.

Based on the data we gathered via questionnaires, we observed an interesting correlation between a participant group’s interest in a topic and the number of tweets and explicit attacks produced by the group, all else being equal. When at least 4 out of 5 participants declared interest in a topic, the discussion received 14 to 21 tweets, containing 6 to 10 attacks, and the connectivity of the argument network was 3. When fewer than 4 out of 5 participants declared interest in a topic, the discussion received 7 to 9 tweets with 2 to 5 attacks, and the connectivity was less than 3.

In general, interest in the topic under discussion is key for obtaining argumentation frameworks rich with connections and weights, where groups engaged in microdebates are more likely to enjoy a sharper consensus. This is also thanks to the information extracted from attack edges and weights which enables a better, sharper visualization, which in turn may be generally perceived, by the group, to be appropriate and useful.

We designed and ran a second experiment, in Italian, on May 15, 2014, on the night of the pan-European presidential debate. The debate among candidates to the Presidency of the European Commission was broadcast live. We gathered a group of around 50 Italian students of Political Science in Bologna, plus some more students at other University sites (Trento and Siena), and gave them a 30 min tutorial about microdebates. As in the first study, all participants were equipped with their own devices and Internet connections. About half of the participants had an Android phone, and not all of them installed the App, so visual feedback was limited, as most participants could see the other tweets, but could not enjoy the hash tags and browsing features provided by the Microdebates App. The web site was not yet available at that time.

The participants in Bologna were accommodated in a room with a large screen from which they could follow the two-hour debate. In preparation for the experiment, we identified 9 different topics which would be addressed by the candidates during the debate, and we published a corresponding list of 9 hashtags, one for each topic (#MDcrescita for economic growth, #MDpolest for foreign policy, #MDimmi for immigration, etc.). We asked the participants to engage in microdebates about what was being said in real time, during the live broadcast, using their mobile phones.

We gathered 293 tweets from 24 active participants. However, the majority of tweets contained syntax errors (for example, the $$ tag was missing) and thus could not contribute to the microdebate’s visualization and were not visible in the App (Microdebates App only retrieves and stores syntactically correct microdebate tweets). We also noticed far fewer connections between tweets, resulting in very sparse graphs. Indeed, while 43% of tweets contained an endorsement of a candidate’s claim, only 39% contained an “argument” and even fewer (23%) an “attack”. Finally, 27% of tweets contained ironic messages or outright insults, indicating a markedly different style of participation from the previous study.

The differences between the outcomes of the two studies could surely be ascribed, at least in part, to differences in the experimental settings. The demographics were similar, but the technology used was different – much more varied in the second study – and the context and background were also quite different – the dynamics of a fast-paced and often emotionally engaging ongoing debate had a different impact on the participants. However, we believe that these two studies also reflect different debating styles identifiable in online communities.

According to a recent study published by the Pew Research Center [54], not all online conversations are of the same “type”. On the contrary, the study identifies at least six distinctive structures of social media crowds (“conversation archetypes”) which form depending on the subject being discussed, the information sources being cited, the social networks of the people talking about the subject, and the leaders of the conversation. Each has a different social structure and shape.

For example, polarized discussions feature two big and dense groups that have little connection between them. Polarized crowds on Twitter are not arguing: they are ignoring one another while pointing to different web resources and using different hashtags. Discussions in so-called community clusters instead are characterized by multiple smaller groups, which often form around a few hubs, each with its own audience, influencers, and sources of information. Community clusters conversations look like bazaars with multiple centers of activity. There we can see arguments: some information sources and subjects ignite multiple conversations, each cultivating its own audience and community. These can illustrate diverse angles on a subject based on its relevance to different audiences, revealing a diversity of opinion and perspective on a social media topic.

As there is no single way conversations take shape in social media, there is no single way arguments take shape, either. For this reason, the macroscopic difference between the outcomes of the two studies could also be explained, in part, by relating them to the different archetypes. In particular, we could argue that the participants in the political debate behaved like “polarized crowds”, as witnessed by the large number of endorsements and insults and by the few arguments and connections, whereas the participants in the first study behaved like “tight crowds”, where tight, information-rich argument graphs could emerge if the topic was interesting for that crowd.

7. Discussion

Recent findings in cognitive science [45] suggest that people are good at arguing, actually that the main function of reasoning is argumentative. However, when big numbers are in play, as it may happen with very crowded online platforms or with very complex debates, it may be difficult for bystanders and potential contributors to make sense of what is going on.

The motivation behind this work is to improve online debates and support the agreement process, by formalizing and rationalizing the debate itself. Microdebates aim to encourage debaters to focus on the arguments involved in the debate, on the relations between such arguments, and on the possible evolutions of arguments.

A legitimate question is whether microdebates could be accommodated by Twitter using current tags, with no additional syntax at all. The answer is negative. The hashtag is grounded in Twitter habits as the way to indicate “what we are talking about”, therefore it cannot be used to identify arguments. It was thus unavoidable to introduce new symbols to label arguments and attacks. The recently introduced cashtag is meant to be used to keep track of specific stocks. To the best of our knowledge, double- and bang-cashtags are not in use yet. Weights are calculated based on the existing language and do not require further additions. All in all, the extension to the Twitter language that we propose is both necessary and, in a sense, minimal.

A purpose that microdebates could serve particularly well is to support the activities preceding deliberations, in online democracy and e-participation environments that call for new experimental solutions.¹⁵

¹⁵

This is especially true in the Italian context. Quoting Fiorella de Cindio, “Right now Italy is a lab for participatory online platforms since there is a strong need to rebuild trust into politics and politicians” http://techpresident.com/news/wegov/24489/italy-test-lab-participatory-democracy.

In this perspective, the type of agreement microdebates aim at is not about which option to choose, but rather about the definition of the different options. Community members could be called to deliberate on them at a later stage, using well-established voting mechanisms.

Microdebates follow the grassroots argumentation philosophy introduced in [55]. Users contribute to a debate by broadcasting annotated comments. As a result, arguments arise bottom-up.

Interestingly, arguments here are dynamic entities that evolve over time. A particular argument, identified by a double cashtag, can be thrown in the debate arena even if it is not “well-formed”. We do not have a concept of well-formedness. There are no arbitrarily chosen standards that an argument is expected to meet. This marks a fundamental difference with other argumentation-based sense-making tools such as those reviewed in our introductory section.

Instead, many users can cooperate, tweet by tweet, to make that argument evolve, gain focus, strength and social support. New elements can be added at any time, turning a simple claim into a fully-fledged argument. Other arguments can arise, that counter those already in place, forming an argumentation framework. Such a collaborative effort does not require conforming to (rigid) rules, learning new interfaces, nor creating new social networking environments.

Microdebates also offer important degrees of flexibility. Our proposal is orthogonal to the semantics of argumentation, by design. Different extension-based semantics may suit different domains. Moreover, we are platform-independent. We do refer to Twitter, because the Twitter user base seems to be particularly well versed in this type of content tagging and could easily pick up the spirit of microdebates. Besides, Twitter has important features that greatly help in implementing the idea. However, microdebates may be run in other online social platforms as well.

In a broader perspective, the role of microdebates goes beyond that of an innovative tool for debating and reaching agreements. As an analytical tool, microdebates allow a deep analysis of arguers’ position in a debate. For instance, policy-makers need to understand why citizens feel in some way or another, about a given policy. There is a lot of material available in public online forums, but it is very hard to perform an analysis of arguments using only statistical approaches. Even state-of-the-art argumentation mining techniques are not yet able to provide high levels of accuracy [41].

As a tool to support scientific research, microdebates could produce significant benchmarks for argumentation. The lack of benchmark libraries for argumentation is a well-known issue [46]. As Modgil et al. say, a benchmark library will bring various benefits to the field of argumentation as it will support the implementation of new theoretical ideas, as well as their testing and comparison with the state of the art. Moreover, we can gain insights into the practical use of different semantics for the specific domains under debate.

Finally, microdebates could provide valuable input for agent-based social simulations. In particular, NetArg [26,27] is a model for agent-based social simulations, where agents are modeled using AFs that represent their beliefs. It would be interesting to use AFs produced by microdebates, to set up simulations and possibly predict how opinions may spread in a social network. That would contribute to sociological theory and could also be used as a social monitoring tool. Indeed, arguing is a social process, so we may use sociological models to capture diffusion of ideas and innovations among arguers, in order to monitor anomalies.

There are also intrinsic risks in our approach. In particular, users may not embrace our syntax, even if they already have an account on Twitter. This sort of risk is associated with many initiatives whose success depends on community engagement. The experimentation discussed in Section 6 indicates that users who have never been exposed to computational argumentation can get accustomed to the microdebates syntax almost immediately. This fact suggests that our syntax will not represent a hurdle for Twitter users, and that it will fit their habits without particular effort. But will it be engaging to join a “real world” microdebate? Only wider experimentation will answer this question. We made a Twitter profile¹⁶ and a Facebook page,¹⁷ and we are working on new experimental settings that use mobile technology and alternative visualization methods. However, while evaluating user engagement, we must not forget that online conversations follow different archetypes, and it would be unrealistic to expect polarized crowds to exchange arguments and interact with each other as if they were tight crowds, only because the enabling technology is available.

¹⁶

https://twitter.com/microdebate.

¹⁷

http://www.facebook.com/Microdebates.

We are actively working on the microdebates software tools described in Section 4, in order to enable further experimentation. The last experiments taught us that well-designed user interfaces are essential to the uptake of this technology, so we are actively working on improving them. Tools are also crucial to understanding the boundaries of our approach: skilled Twitter users may develop habits that could be different from what we expect, leading to unforeseen system behaviors.

Other risks are due to the openness of our approach. A critical issue with all open, non-moderated online debate platforms is that they are prone to (intentional or unintentional) “bad behavior”, and microdebates are no exception. They suffer from a weakness shared by many other democratic approaches: they offer maximal freedom to citizens, but are prone to exploits that could inflate support without real arguments, or even without real followers. Trolling is a well-known issue [5,57], as are fake accounts, which could be spammers or bots.¹⁸

¹⁸

http://wafi.iit.cnr.it/TheFakeProject/.

Solving these problems is beyond the scope of this work, but it is important to be aware of them. Indeed, devising robust mechanisms to cope with bad behavior is an important subject for future work. We mitigate this problem by equipping microdebates with mechanisms for group support. In particular, the use of weights derived from community support for computing acceptable extensions could introduce a form of crowd-sourced self-regulation over the outcomes of microdebates.

Since arguments are dynamic entities that evolve over time, it is inevitable that the meaning of arguments changes over time and it may happen, for instance, that someone who referred to a given argument in the past no longer agrees with it at some later point, though the user’s contribution to the argument weight is still there. Addressing this issue does not pose a technical hurdle: a Twitter user can easily delete old tweets. However, it is unrealistic to assume that users will monitor the content of all arguments they have been referring to in the past. In the end, the impact of argument dynamics and the user response to the issue are the kinds of effects we expect to be able to observe at the macro-level.

Another matter of discussion is the scope of microdebates. Twitter streams are usually public, i.e., visible to everybody. However, some microdebates may need to be kept private. We are working on privacy-enabling extensions using other online platforms, as there is no reason why the microdebates concept should be restricted to Twitter.

Scalability is another well-known issue of online debates [17]. We address it by following a principle of locality and distribution, and via an incentive mechanism. In particular, since users tag their own arguments, tagging is still done locally even if large crowds bring in many arguments. Moreover, because arguments that weigh more have a better chance to emerge, there is an incentive to build on consolidated, “heavy” arguments rather than to create new arguments with minimum initial weight. This should help contain the number of arguments.

For future work, it would be useful to sift through arguments automatically in order to filter out spam. We are aware that argument analysis [53] is a very difficult task, even when it is done by hand on a single argument. Besides, the bottom-up, unstructured nature of arguments arising from microdebates makes the task even harder. However, automated argument analysis is a rapidly expanding field [41] and some interesting results are becoming available [10,29,30,40,42,47,48,58]. Thus a scenario of intelligent argument filtering may not be as unrealistic as it seemed a few years ago. We are working on integrating an “argument filtering” module into our visualization tools, so that “better” arguments can be given more emphasis.

The concept of group support (or “social acceptance” [39]) also needs to be further explored. It is somewhat related to a democratic principle: what the majority believes to be fair will be considered fair. Consider a scenario in which microdebates are widely adopted to collect opinions in settings such as political elections, much as some TV shows nowadays use tweets to measure what the audience thinks about an issue, or to fact-check what politicians say in live debates.¹⁹

¹⁹ See for example the EPSRC EDV project, http://edv-project.net/.

Argumentation frameworks then need to represent not only opinions, but also their social endorsement, so as to understand what people think, why they think it, and how strongly they feel about it.

While we believe that argumentative skills are fundamental to foster democratization processes, we recognize that argumentative elements in generic social media tools are very basic: Twitter, Facebook and Google Plus use “RT”, “Like” and “+1” buttons, while YouTube also added a “thumbs down” option. “Argumentation support has not yet moved firmly from the academic lab, into the mainstream”, conclude Schneider et al. [52] after a comprehensive review of the state of the art in argumentation tools. They claim that different interfaces are needed to support different kinds of arguing, because people often argue to position and establish themselves, not only to solve hard problems. This is where microdebates come into play. To the best of our knowledge, this is the first attempt to introduce unstructured, community-based argumentation into a popular social web platform.

Acknowledgements

We thank the anonymous reviewers for providing high-quality, constructive feedback. We thank Nefise Yağlıkçı, Lorenzo Michelacci, and Günce Çağloğu for their work on the implementation of the Microdebates App and of the web interface, and our colleagues and students at the Political Sciences department of the University of Bologna for their contribution to the experimentation. A special thank you and farewell to Aldo Di Virgilio.

The research described here was mostly carried out when the first author was affiliated with University of Bologna’s DISI, in the research staff of the ePolicy project. This work was partially supported by the ePolicy EU project FP7-ICT-2011-7, grant agreement 288147. Possible inaccuracies of information are under the responsibility of the project team. The text reflects solely the views of its authors. The European Commission is not liable for any use that may be made of the information contained in this paper.

The original posts can be retrieved from:

References

[1] Ames and Naaman, Why we tag: Motivations for annotation in mobile and online media, in: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI’07, ACM, New York, NY, USA, 2007, pp. 971–980.

[2] Baroni and Giacomin, Semantics of abstract argument systems, in: Argumentation in Artificial Intelligence, Simari and Rahwan, eds, Springer-Verlag, Heidelberg, 2009, pp. 25–44.

[3] Baroni, Romano, Toni, Aurisicchio and Bertanza, An argumentation-based approach for automatic evaluation of design debates, in: CLIMA, Leite, T.C. Son, Torroni, van der Torre and Woltran, eds, Lecture Notes in Computer Science, Vol. 8143, Springer, 2013, pp. 340–356.

[4] Bex, Lawrence, Snaith and Reed, Implementing the argument web, Commun. ACM 56(10) (2013), 66–73.

[5] Bishop, Examining the Concepts, Issues, and Implications of Internet Trolling, IGI Global, 2013.

[6] Bistarelli, Rossi and Santini, A first comparison of abstract argumentation systems: A computational perspective, in: CILC, Cantone and M.N. Asmundo, eds, CEUR Workshop Proceedings, Vol. 1068, 2013, pp. 241–245; available at ceur-ws.org.

[7] Bistarelli and Santini, A common computational framework for semiring-based argumentation systems, in: ECAI 2010 – 19th European Conference on Artificial Intelligence, Lisbon, Portugal, 2010, pp. 131–136.

[8] Bruns and J.E. Burgess, The use of Twitter hashtags in the formation of ad hoc publics, in: 6th European Consortium for Political Research General Conference, Reykjavik, 2011, Univ. Iceland.

[9] Buckingham Shum, Cohere: Towards web 2.0 argumentation, in: Computational Models of Argument: Proceedings of COMMA 2008, Besnard, Doutre and Hunter, eds, Toulouse, France, May 28–30, 2008, Frontiers in Artificial Intelligence and Applications, Vol. 172, IOS Press, 2008, pp. 97–108.

[10] Cabrio and Villata, Combining textual entailment and argumentation theory for supporting online debates interactions, in: The 50th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference, July 8–14, 2012, Jeju Island, Korea – Volume 2: Short Papers, The Association for Computer Linguistics, 2012, pp. 208–212.

[11] Campbell, Fletcher and Greenhill, Conflict and identity shape shifting in an online financial community, Information Systems Journal 19(5) (2009), 461–478.

[12] Cartwright and Atkinson, Political engagement through tools for argumentation, in: Computational Models of Argument: Proceedings of COMMA 2008, Besnard, Doutre and Hunter, eds, Toulouse, France, May 28–30, 2008, Frontiers in Artificial Intelligence and Applications, Vol. 172, IOS Press, 2008, pp. 116–127.

[13] Charwat, Dvořák, S.A. Gaggl, J.P. Wallner and Woltran, Implementing abstract argumentation – A survey, Technical Report DBAI-TR-2013-82, Technische Universität Wien, 2013.

[14] C.I. Chesñevar, A.G. Maguitman, Estévez and Brena, Integrating argumentation technologies and context-based search for intelligent processing of citizens’ opinion in social media, in: Proc. of ICEGOV 2012, Albany, NY, 2012, pp. 166–170.

[15] Correia, Cruz and Leite, On the efficient implementation of social abstract argumentation, in: ECAI 2014 – 21st European Conference on Artificial Intelligence, 18–22 August 2014, Prague, Czech Republic – Including Prestigious Applications of Intelligent Systems (PAIS 2014), Schaub, Friedrich and O’Sullivan, eds, Frontiers in Artificial Intelligence and Applications, Vol. 263, IOS Press, 2014, pp. 225–230.

[16] Coste-Marquis, Konieczny, Marquis and M.A. Ouali, Weighted attacks in argumentation frameworks, in: Principles of Knowledge Representation and Reasoning: Proceedings of the Thirteenth International Conference, KR 2012, Rome, Italy, June 10–14, 2012, Brewka, Eiter and S.A. McIlraith, eds, AAAI Press, 2012, pp. 593–597.

[17] Cottica, How online conversations scale, and why this matters for public policies, Contrordine compagni, 2012; available at http://www.cottica.net/2012/08/01/how-online-conversations-scale-and-why-this-matters-for-public-policies/.

[18] De Liddo and Buckingham Shum, The evidence hub: Harnessing the collective intelligence of communities to build evidence-based knowledge, in: Workshop: Large Scale Ideation and Deliberation at 6th International Conference on Communities and Technologies, Munich, Germany, 2013.

[19] De Liddo and Buckingham Shum, Improving online deliberation with argument network visualization, in: Workshop: Digital Cities 8 at 6th International Conference on Communities and Technologies, C&T 2013, Munich, Germany, 2013.

[20] K.M. DeVoe, Bursts of information: Microblogging, The Reference Librarian 50(2) (2009), 212–214.

[21] P.M. Dung, On the acceptability of arguments and its fundamental role in nonmonotonic reasoning, logic programming and n-person games, Artificial Intelligence 77(2) (1995), 321–357.

[22] P.E. Dunne, Hunter, McBurney, Parsons and Wooldridge, Weighted argument systems: Basic definitions, algorithms, and complexity results, Artificial Intelligence 175(2) (2011), 457–486.

[23] Egly, Gaggl and Woltran, ASPARTIX: Implementing argumentation frameworks using answer-set programming, in: ICLP: Proceedings of the 24th International Conference on Logic Programming, Garcia de la Banda and Pontelli, eds, Lecture Notes in Computer Science, Vol. 5366, Springer, 2008, pp. 734–738.

[24] English, Finding a voice in a threaded discussion group: Talking about literature online, The English Journal 97(1) (2007), 56–61.

[25] Gabbriellini and Torroni, Large scale agreements via microdebates, in: Proceedings of the First International Conference on Agreement Technologies, AT 2012, Dubrovnik, Croatia, October 15–16, 2012, Ossowski, Toni and G.A. Vouros, eds, CEUR Workshop Proceedings, Vol. 918, 2012, pp. 366–377; available at ceur-ws.org.

[26] Gabbriellini and Torroni, Arguments in social networks (extended abstract), in: Proceedings of the 12th International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS-2013), Saint Paul, MN, USA, May 6–10, 2013.

[27] Gabbriellini and Torroni, A new framework for ABMs based on argumentative reasoning, in: Advances in Intelligent Systems and Computing, Kamiński and Koloch, eds, Advances in Social Simulation, Vol. 229, Springer, Berlin/Heidelberg, 2014, pp. 25–36.

[28] A.J. García and G.R. Simari, Defeasible logic programming: An argumentative approach, Theory and Practice of Logic Programming 4(1) (2004), 95–138.

[29] Grosse, M.P. González, C.I. Chesñevar and A.G. Maguitman, Integrating argumentation and sentiment analysis for mining opinions from Twitter, AI Communications 28(3) (2015), 387–401.

[30] Habernal, Eckle-Kohler and Gurevych, Argumentation mining on the web from information seeking perspective, in: Proceedings of the Workshop on Frontiers and Connections Between Argumentation Theory and Natural Language Processing, Forlì-Cesena, Italy, July 21–25, 2014, Cabrio, Villata and Wyner, eds, CEUR Workshop Proceedings, Vol. 1341, 2014; available at ceur-ws.org.

[31] Huang, K.M. Thornton and E.N. Efthimiadis, Conversational tagging in Twitter, in: Proceedings of the 21st ACM Conference on Hypertext and Hypermedia, HT’10, ACM, New York, NY, USA, 2010, pp. 173–178.

[32] B.A. Huberman, D.M. Romero and Wu, Social networks that matter: Twitter under the microscope, First Monday 14(1) (2009).

[33] Ignacio, An open letter: Threaded comments suck, LJWorld.com weblogs, Ronaldo’s World, 2011; available at http://www2.ljworld.com/weblogs/ronaldos-world/2011/feb/7/an-open-letter-threaded-c/.

[34] Java, Song, Finin and Tseng, Why we Twitter: Understanding microblogging usage and communities, in: Proceedings of the 9th WebKDD and 1st SNA-KDD 2007 Workshop on Web Mining and Social Network Analysis, ACM, New York, NY, USA, 2007, pp. 56–65.

[35] Kiesler, Kraut, Resnick and Kittur, Regulating behavior in online communities, in: Building Successful Online Communities: Evidence-Based Social Design, R.E. Kraut and Resnick, eds, MIT Press, Cambridge, MA, 2012, pp. 125–178.

[36] S.-M. Kim and Hovy, Automatic identification of pro and con reasons in online reviews, in: Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions, 2006, pp. 483–490.

[37] Kunz and H.W.J. Rittel, Issues as elements of information systems, Working Paper 131, Institute of Urban & Regional Development, Univ. California, 1970.

[38] Kwak, Lee, Park and Moon, What is Twitter, a social network or a news media? in: WWW’10: Proceedings of the 19th International Conference on World Wide Web, ACM, New York, NY, USA, 2010, pp. 591–600.

[39] Leite and Martins, Social abstract argumentation, in: IJCAI 2011, Proceedings of the 22nd International Joint Conference on Artificial Intelligence, Barcelona, Catalonia, Spain, July 16–22, 2011, Walsh, ed., IJCAI/AAAI, 2011, pp. 2287–2292.

[40] Levy, Bilu, Hershcovich, Aharoni and Slonim, Context dependent claim detection, in: COLING 2014, Hajic and Tsujii, eds, ACL, Dublin, Ireland, 2014, pp. 1489–1500.

[41] Lippi and Torroni, Argument mining: A machine learning perspective, in: International Workshop on Theory and Applications of Formal Argument (TAFA), Buenos Aires, Argentina, 2015.

[42] Lippi and Torroni, Context-independent claim detection for argument mining, in: Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, IJCAI 2015, Buenos Aires, Argentina, July 25–31, 2015, Yang and Wooldridge, eds, AAAI Press, 2015, pp. 185–191.

[43] Liu, Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data (Data-Centric Systems and Applications), Springer-Verlag, New York, Inc., Secaucus, NJ, USA, 2006.

[44] C.M. Lorenzetti and A.G. Maguitman, A semi-supervised incremental algorithm to automatically formulate topical queries, Inf. Sci. 179(12) (2009), 1881–1892.

[45] Mercier and Sperber, Why do humans reason? Arguments for an argumentative theory, Behavioral and Brain Sciences 34(2) (2011), 57–74.

[46] Modgil, Toni, Bex, Bratko, C.I. Chesñevar, Dvorák, M.A. Falappa, Fan, S.A. Gaggl, A.J. García, M.P. González, T.F. Gordon, Leite, Molina, Reed, G.R. Simari, Szeider, Torroni and Woltran, The added value of argumentation, in: Agreement Technologies, Law, Governance and Technology Series, Vol. 8, Springer-Verlag, 2013, pp. 357–404.

[47] R.M. Palau and M.-F. Moens, Argumentation mining, Artif. Intell. Law 19(1) (2011), 1–22.

[48] Pallotta and Delmonte, Automatic argumentative analysis for interaction mining, Argument & Computation 2(2,3) (2011), 77–106.

[49] Pang and Lee, Opinion mining and sentiment analysis, Found. Trends Inf. Retr. 2(1,2) (2008), 1–135.

[50] A.W. Rivadeneira, D.M. Gruen, M.J. Muller and D.R. Millen, Getting our head in the clouds: Toward evaluation studies of tagclouds, in: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI’07, ACM, New York, NY, USA, 2007, pp. 995–998.

[51] D.M. Romero, Meeder and Kleinberg, Differences in the mechanics of information diffusion across topics: Idioms, political hashtags, and complex contagion on Twitter, in: Proceedings of the 20th International Conference on World Wide Web, WWW’11, ACM, New York, NY, USA, 2011, pp. 695–704.

[52] Schneider, Groza and Passant, A review of argumentation for the social semantic web, Semantic Web 4(2) (2013), 159–218.

[53] Sinnott-Armstrong and Fogelin, Understanding Arguments, 8th edn, Wadsworth/Cengage, 2010.

[54] M.A. Smith, Rainie, Himelboim and Shneiderman, Mapping Twitter topic networks: From polarised crowds to community clusters, Technical report, Pew Research Center, Washington, DC, 2014; available at http://www.pewinternet.org/2014/02/20/mapping-twitter-topic-networks-from-polarized-crowds-to-community-clusters/.

[55] Toni and Torroni, Bottom-up argumentation, in: TAFA 2011: Revised Selected Papers from the First International Workshop on Theory and Applications of Formal Argumentation, Modgil, Oren and Toni, eds, Lecture Notes in Computer Science, Vol. 7132, Springer, 2012, pp. 249–262.

[56] Torroni, Gavanelli and Chesani, Arguing on the semantic grid, in: Argumentation in Artificial Intelligence, Simari and Rahwan, eds, Springer, New York, 2009, pp. 423–441.

[57] Torroni, Prandini, Ramilli, Leite and Martins, Arguments against the troll, in: Proceedings of the 11th Italian Symposium on AI, 2010, pp. 232–235.

[58] M.P.G. Villalba and Saint-Dizier, Some facets of argument mining for opinion analysis, in: Computational Models of Argument – Proceedings of COMMA 2012, Vienna, Austria, September 10–12, 2012, Verheij, Szeider and Woltran, eds, Frontiers in Artificial Intelligence and Applications, Vol. 245, IOS Press, 2012, pp. 23–34.

[59] Wilensky, Netlogo. Center for Connected Learning and Computer-Based Modeling, Northwestern Univ., Evanston, IL, 1999.

[60] Yaglikci and Torroni, Microdebates app for android: A tool for participating in argumentative online debates using a handheld device, in: 26th IEEE International Conference on Tools with Artificial Intelligence, ICTAI 2014, Limassol, Cyprus, November 10–12, 2014, IEEE Computer Society, 2014, pp. 792–799.

[61] Yu, Kaufmann and Diermeier, Exploring the characteristics of opinion expressions for political opinion classification, in: Proceedings of the 2008 International Conference on Digital Government Research, Dg.o’08, Digital Government Society of North America, 2008, pp. 82–91.