Abstract
In this paper we discuss the issues related to the formal representation of thematic roles in an ontology modeling historical events. We start by analyzing the ontological distinctions between thematic roles and social roles, which suggest different formal representations. Coupling the study of existing approaches with an analysis of historical texts – available within the Harlock’900 project – we propose a formal representation of thematic roles in HERO (Historical Event Representation Ontology), based on binary properties, directly connecting the event to its participants. Moreover, we show that a fine-grained formal ontological model of participation in (historical) events should include general thematic roles (e.g., agent, patient) – able to capture the common aspects of the ways entities are involved in events – and event-specific roles (e.g., sniper), introduced in the ontology according to a specific criterion, that guarantees the needed expressivity without proliferating roles. We conclude the paper by discussing the benefits of our approach.
Introduction
The notion of role has been discussed in different research areas, where it has received many different, sometimes overlapping, interpretations. It also remains an important challenge for the applied ontology and Semantic Web communities.
In this paper, we try to clarify the notion of role in the representation of historical events, by providing an ontological analysis and a formal account of thematic roles, as well as a comparison between thematic and social roles. In the historical domain, in fact, roles are widely used to describe events: Historical texts, as well as multimedia documents (e.g., video, images), are full of expressions that refer to roles, e.g., king, prime minister, leader, conqueror, loser, victim.
Starting from the work presented by Goy et al. (2017), we can identify three different notions that have been called “roles” in the literature, but show a quite different nature:
The role somebody or something plays when she/he/it
The role somebody plays within a given
The role that can be attributed to someone/something from a specific
The three different meanings of the notion of role mentioned here are not totally independent of each other, since many relations among them can be considered:
As a consequence of killing someone (sense (a)), somebody can be socially considered a killer (sense (b)).
Somebody can participate in an international meeting (sense (a)) “in the role of” Italian Prime Minister (sense (b)).
Somebody can be considered a liberator (sense (c)) as far as her/his participation in a specific event is concerned (sense (a)).
However, we claim that these three senses have some important ontological differences that suggest they should be modeled in different ways. In particular, in this paper, we focus on case (a) and we analyze the historical domain to support our thesis, which can be summarized as follows:
Although sense (a) and sense (b) share several features, they also show important
As far as sense (a) is concerned, the notion of participant in an (historical) event is not enough and a more fine-grained formal ontological model of participation modality1
Participation modality here means the way an entity participates in an event, i.e., the role it plays in the event.
Participation modality should be modeled by relying on a set of
A criterion is needed to regulate the introduction of
Actually, these roles are event
Here sniper (cecchino in Italian) means somebody who shoots from a hidden position (against an enemy); we do not refer to a marksman (specialized soldier).
As a consequence of the ontological characterization of thematic roles (sense (a)), the participation in an (historical) event should be modeled by means of
In the following, we will briefly analyze two research areas in which the notion of role has been studied: formal computational ontologies modeling events (Section 2.1), and (computational) linguistic approaches accounting for thematic roles (Section 2.2). Then, by relying on the analysis by Goy et al. (2017), we will summarize the reasons supporting the ontological differences between the ways of participating in an (historical) event and social roles (Section 3.1), leaving the discussion of role attribution according to points of view for a future investigation. We then present the results of our domain analysis of documents describing events related to 20th-century Italian history (Section 3.2). These results provided us with the empirical basis of the model for representing the participation in historical events, presented in Sections 4.1 and 4.2. The benefits of the proposed model are discussed in Section 4.3. Finally, we summarize conclusions and future directions of our work in Section 5.
Event ontologies and participation in events
In the following, we briefly sketch the modeling choices concerning roles that can be found in some existing and well-known event ontologies (or ontologies that also account for the concept of event).
Simple Event Model (SEM; van Hage et al., 2011) provides a pattern that allows one to specify that a participant in an event (i.e., an instance of the
LODE (Shaw et al., 2009) provides two basic properties (
CIDOC-Conceptual Reference Model (CIDOC-CRM; Doerr, 2003; Le Boeuf et al., 2015), offers two binary properties (
It is worth noting that CIDOC “properties of properties” cannot be directly expressed in the current most common Web ontology languages, such as RDFS and OWL.
The Event Model F (Scherp et al., 2009) extends DnS (Gangemi and Mika, 2003), therefore it inherits the DnS capabilities of representing roles (in particular, type (b) roles, represented as instances of the DnS
The Event Ontology (EO; Raymond and Abdallah, 2007) does not explicitly claim to provide mechanisms for role representation, but it includes some properties and classes that can be used to express a few participation modalities in events, namely: the
Also in the Europeana Data Model (EDM; EDM, 2016) roles are not modeled, but generic participation in events can be expressed through the
The event ontology defined by Hyvönen et al. (2012), and related to the CultureSampo project, supports the representation of people and objects involvement in events, but it does not provide a model to further specify the different types of involvement, i.e. the roles played by participants.
A slightly more fine-grained account of participation in events can be found in the ABC ontology (Lagoze and Hunter, 2001), which provides some support for type (a) roles (but no support for either type (b) or type (c) ones): In ABC, various properties can be used to express simple involvement in (
An interesting perspective is provided by the Generalized Upper Model (GUM; Bateman, 1990; Bateman et al., 1995, 2010), a linguistically-motivated ontology containing a backbone taxonomy of classes and a hierarchy of relations. The most relevant class, from our point of view, is
Finally, Mizoguchi et al. (2015) divide
The analysis of existing event models, as far as the participation modality in events is concerned, shows that:
All models provide some properties to represent the
Only a subset of the models explicitly deal with the notion of
None of the models analyzed account for
None of the analyzed models make any (formal)
In the wide literature about thematic roles belonging to the linguistics and computational linguistics fields, the terminology is sometimes confusing, and different expressions can be found, referring sometimes to the same concepts, sometimes to different ones: thematic roles, theta-roles, semantic roles, thematic relations, and sometimes also simply arguments.
Probably, the most classical choice would be using arguments and/or theta roles at the syntactic layer,6
Obviously, the term arguments can also be used to refer to the arguments of a relation at the (formal) semantic level: e.g., hero:hasParticipant
For these reasons, in this paper, we will use thematic roles for referring to the semantic relations in the formal representation of the meaning of a natural language expression. In this sense, thematic roles represent the distinguishing relations that link entities to events, thus providing a way to (partially) characterize events themselves on the basis of their participants, i.e. on the basis of which entities are involved and how they participate in a given event.
The majority of works in the (computational) linguistics area have faced the issue of the linking between syntactic structures and thematic role assignment in the semantic representation (also referred to as semantic role labeling or theta-marking). Although this is a very important issue, especially as far as automatic knowledge extraction is concerned (see Section 4), this paper does not discuss it and, instead, it focuses on thematic roles as relations defined in a semantic model – conceptualization, or ontology (Guarino et al., 2009) – of events and their participants. In this perspective, an important early contribution can be found in the work by Dowty (1989), who considered thematic roles (such as agent, patient, etc.) as prototypes that can be used to classify events, i.e., as prototypical ways to participate in events (and, in fact, he defines two “proto-roles”, called proto-agent and proto-patient). Another important work that deserves to be mentioned is the one by Parsons (1990), who lists the following roles: agent, experiencer, theme, source, goal, instrument, benefactive. Jackendoff (1990) groups thematic roles into three distinct tiers: the Thematic Tier, defining the role of a participant with respect to its agentivity/affectedness (thus providing agent/patient distinctions); the Action Tier, defining the role of a participant with respect to its movement and position (thus providing theme/goal distinctions); and the Temporal Tier, representing the temporal dimension of the event. Interestingly, a participant can play multiple roles, provided that they belong to different tiers (e.g., given the sentence “The car hit the tree”, the tree can play both the goal and the patient roles).
One of the most influential studies of verbs, their arguments and corresponding thematic roles is the one by Levin and Rappaport (1991), where the authors argue for the intrinsic relational nature of thematic roles, which are not semantic primitives, or properties characterizing an entity, but rather relations between individuals and events (see also the paper by Jackendoff, 1990, among the others).
An important debate concerns the specificity of thematic roles: Are thematic roles specific for every type of event (buyer, seller, …) – as claimed by McRae and Matsuki (2009) – or are they general ways of participating in events of different kinds (agent, patient, …) – as maintained by Dowty (1989) and Levin and Rappaport (1991)? These two perspectives are not necessarily in competition, but, instead, they can be seen as complementary, and used together, defining verb-specific roles (e.g., victim of a killing) as specializations of general semantic roles (e.g., patient). However, as pointed out by Lebani et al. (2015), the relationship between event-specific roles and general semantic roles is a complex issue, which deserves further study.
Another significant thread, in the thematic roles debate, is Frame Semantics, introduced by Fillmore (1982) – see also Petruck (1996). Fillmore claims that the meaning of a linguistic expression can be understood only within its context, and such a context is represented by a frame, i.e., a cognitive-grounded structure depicting a typical real-world scenario and supporting natural language understanding (Fillmore, 1982). A frame basically includes the entities involved and the relationships between those entities. In 1997 Fillmore and colleagues started the FrameNet project (framenet.icsi.berkeley.edu), a huge English lexical database, both human- and machine-readable, based on Frame Semantics. In FrameNet, word meanings are represented as frames: Each frame represents an event type (e.g., cooking), and participants playing different roles (e.g., cook, food, heating_instrument, container) are frame elements (FEs). Frames are linked by different types of relations (e.g., IS-A, using relation, sub-frame relation). FrameNet contains more than 1000 frames, and also provides an annotated corpus. Moreover, FrameNet entries have mappings onto other lexical resources, including VerbNet (verbs.colorado.edu/~mpalmer/projects/verbnet.html). VerbNet, the largest online verb lexicon for English, is organized into verb classes extending the work by Levin (1993) and its lexical entries include thematic roles and selectional restrictions on verb arguments. The complete list of thematic roles used by VerbNet includes 30 different roles (see verbs.colorado.edu/verb-index/vn/reference.php).
An interesting project, partially based on FrameNet, at least as far as thematic roles are concerned, is the Event and Situation Ontology (ESO; Segers et al., 2015). The main goal of ESO is to model implications of events, in terms of pre and post conditions, in order to enable a reasoner to infer situations holding before and after the occurrence of certain types of (static or dynamic) events. In ESO thematic roles are modeled as object properties connecting events to entities, and are mapped onto frame elements in the corresponding FrameNet frames.
As far as the semantic representation of linguistic expressions denoting events is concerned, again the literature, within the (computational) linguistics field, is extremely wide. A good, although not recent, survey is provided by Tenny and Pustejovsky (2000). Interesting suggestions, in particular concerning thematic roles, can be found in this literature, even though Tenny and Pustejovsky explicitly say that they refer to events as “grammatically or linguistically represented objects” and not as events in the world (Tenny and Pustejovsky, 2000, p. 4). Our perspective is slightly different: Following the DOLCE cognitive approach (Masolo et al., 2003), our goal is the definition of a historical event ontology modeling the knowledge that supports cognitive representations of events occurring in the world. Obviously, such a model can be used by people to understand linguistic expressions referring to events, but the same knowledge can also be exploited to interpret a picture, a movie or a real-world scene. This less “linguistically grounded” perspective over events is also justified by the fact that we will use the ontology to represent historical events that are accounted for in archival resources, which can be texts (books, documents, letters, …), but also images, video, and even objects (e.g., dresses, or flags).
Our ontological analysis of thematic roles has been supported by a mixed approach, including: (1) an analysis of the literature, coupled with a deep investigation of existing (event) ontologies (see Section 2) – which provided us with a “top-down” hypothesis; (2) an analysis of historical texts describing events and their participants – which offered us a grounded “bottom-up” point of view. We describe these two processes, and their results, in the following sections.
Ontological analysis of event participation modality and social roles
In this section we analyze, from an ontological point of view, the similarities and the differences between playing a role in an event as participant (case (a) above) and playing a social role (case (b)). We leave the study of role attribution according to points of view (case (c) above), which is linked to the notion of (historical) interpretation (van den Akker et al., 2011), for future work.
As described by Goy et al. (2017), we use the notions of rigidity and foundedness, defined by Welty and Guarino (2001) and used by Masolo et al. (2004), to characterize roles as anti-rigid and founded concepts.
Both social and thematic roles are anti-rigid since they are “concepts that can be ‘played’ (in a contingent and temporary way) by certain entities” (Masolo et al., 2004, p. 267); this means that individuals playing a role do not play it necessarily. Entities can start and stop playing a role and they can change role during their life. For example, no Prime Minister necessarily is a Prime Minister; any Prime Minister starts being a Prime Minister at a certain point in her/his life and, usually, (s)he stops being a Prime Minister before her/his death. Similarly, no entity playing the patient role in an event plays it necessarily; moreover, it/(s)he stops being a patient as soon as the event ends and there can be periods in which it/(s)he is not patient in any event; moreover, an entity can play the same role more than one time. Furthermore, an entity can play different roles simultaneously and a role can be played by different entities at the same time.7
There are roles that cannot be played by different entities at the same time (e.g., Italian Prime Minister), but there are examples of roles that call for this possibility (e.g. Italian citizen), because of their very nature and independently from their formal representation as types or as individuals.
According to Masolo et al. (2004), roles (in particular social roles) are also founded concepts. The notion of foundedness captures the definitional dependency relation between concepts: Intuitively, a concept x is founded if and only if its definition mentions another concept y, “such that for each entity classified by x, there is an entity classified by y which is external to it” (Masolo et al., 2004, p. 273).8
The concept of external entity is a complex notion involving those of parts, qualities and constituents; for the purpose of the present discussion, it can be approximately defined as in the paper by Masolo et al. (2004): y is external to x iff x is not part of y and y is not part of x.
The relation between participating in an event acting as a musician (thematic role) and being a musician (social role) can also be taken into account by referring to the distinction between performable and definitional content that can be found in the work by Mizoguchi et al. (2015); see Section 2.1. However, this issue seems to be more complex: The relation between playing a thematic role in an event – sense (a) in Section 1 – and playing, as a consequence, the “corresponding” social role even when the event itself is over – sense (b) in Section 1 – deserves a further discussion. In Section 1, we mentioned the fact that, as a consequence of killing someone, somebody can be socially considered a killer. This relation can be (at least partially) accounted for by the notion of derived role, and in particular by the concept of retrospective derived role, proposed by Mizoguchi et al. (2015): In their approach, the retrospective derived killer role9
Mizoguchi et al. (2015), for the very same case, use the term murderer instead of killer: in the present discussion, we consider them as synonyms (both translations of the Italian word assassino).
We agree with Masolo et al. (2011) and with Mizoguchi et al. (2015) in recognizing a relation between the original occurrent-dependent role (i.e., our thematic role) and the retrospective derived occurrent-dependent role. However, we claim that such derived roles are full-fledged social roles, since, besides the context represented by the original event (e.g., the event of reaching the top of Everest), they need a social context, i.e., a community that recognizes such events as credits for the agent who performed them. This means that, for a derived role to be played by the individual who played the original one, the original context is not enough and a current social context, defining such a role and attributing it to the individual, is needed. Consider, for example, a Partisan in a liberation war, or a war hero: S/he killed people, but s/he does not play the (derived) killer role, because s/he is not socially considered a killer. In order to play the killer social role, s/he needs to have killed someone and to be socially recognized as a killer. Therefore, differently from Mizoguchi et al. (2015), we do not think that having killed someone is a sufficient condition for being a killer (social role), although it is probably a necessary one: In fact, someone can become a killer (socially recognized as such), and after some time, maybe due to changes in the socio-political context, s/he can stop being (considered) a killer.
The ontological differences between thematic and social roles, just discussed, support our claim that in an ontology they should be conveniently formalized in two distinct ways.
Although in a few cases some social roles, such as jobs, are modeled as states or events, with properties expressing temporal boundaries (e.g., BiographyNet; Ockeloen et al., 2013), in computational ontologies, roles (as well as contexts) are often reified in order to place them in the domain of discourse, thus being able to “talk about” them, i.e., to explicitly represent their properties and the relations they are involved in (e.g., the link with the definitions that introduce or use them, the relationships with the contexts in which they hold, the reciprocal relationships among them, etc.). In this perspective, social roles are usually represented as instances of some sort of Role class. Sometimes role attributions are reified as well. One of the most common reasons is that reification makes it possible to express temporal boundaries for the relation between an entity and the role it plays, especially in those ontology languages in which only unary or binary predicates can be specified, such as OWL (see, for instance, the RoleInTime class in the Publishing Role Ontology, exploited also in PRoles; Daquino et al., 2014).
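The reification pattern just described can be illustrated with a minimal sketch. It is not the Publishing Role Ontology itself: the class and field names (Role, RoleInTime, holder, defined_in), the individuals and the years are hypothetical and purely illustrative. The sketch only shows why turning a role attribution into an object makes room for temporal boundaries that a plain binary relation could not carry.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Role:
    """A reified social role: an individual we can predicate on."""
    name: str
    defined_in: Optional[str] = None  # hypothetical link to a defining document


@dataclass(frozen=True)
class RoleInTime:
    """A reified role attribution with temporal boundaries (years)."""
    holder: str
    role: Role
    start: int
    end: Optional[int] = None  # None: the attribution has no recorded end

    def holds_in(self, year: int) -> bool:
        """True if the holder plays the role in the given year."""
        return self.start <= year and (self.end is None or year <= self.end)


# Illustrative individuals and dates, not taken from any historical source.
prime_minister = Role("PrimeMinister", defined_in="SomeConstitution")
attribution = RoleInTime("person_a", prime_minister, start=1950, end=1953)
```

Because both the role and its attribution are objects, statements about the role (where it is defined) and about the attribution (when it holds) can be expressed separately.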
As a consequence of the previous analysis, we claim that, in order to represent (historical) events:
We need a formal representation that enables us to “use” both social and thematic roles – for instance, to formally represent the fact that De Gasperi played the role of Prime Minister of the Italian Republic (social role), and the fact that a group of Fascists played the role of agent (killer) in the murder of Galimberti (thematic role).
We need a formal representation that enables us to “talk about” (i.e., predicate on) social roles – for instance, to formally represent the fact that the role of President of the Italian Republic (social role) is defined in the Italian Constitution.
We do not need a formal representation enabling us to “talk about” (i.e., predicate on) thematic roles per se.
Therefore, reification is a suitable pattern to represent social roles in a first order logical theory,10
The large majority of computational ontologies are expressed in first order (possibly modal) logic languages. This is true, in particular for DOLCE (Masolo et al., 2003), which we take as a reference framework and, in general, for application-oriented ontologies, where a trade-off between expressivity and computability/computational complexity must be reached.
This claim enables us to state that thematic roles should be formally represented as binary properties, linking events and individuals participating in them, in line with some of the approaches discussed in Section 2.1, like, for instance, the one by Bateman and colleagues (2010). Such a representation, in fact, provides a more immediate account of the close relationships holding between events and participants, as we will show in Section 4.
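As a rough illustration of this choice, a thematic role modeled as a binary property is simply a set of (event, participant) pairs, with no role object in between. The sketch below assumes hypothetical property names (has_agent, has_patient) and identifiers; they are not HERO's actual vocabulary.

```python
# Each thematic role is a binary property: a set of (event, participant) pairs.
# Property names and identifiers are hypothetical, chosen for illustration only.
has_agent = {("murder_of_galimberti", "group_of_fascists")}
has_patient = {("murder_of_galimberti", "galimberti")}


def participants(prop, event):
    """Return the entities linked to the event by the given binary property."""
    return {participant for (e, participant) in prop if e == event}


agents = participants(has_agent, "murder_of_galimberti")
patients = participants(has_patient, "murder_of_galimberti")
```

Since nothing has to be said about the agent role per se, the event and its participant are connected directly, in contrast with the reified treatment of social roles.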
With the purpose of verifying the ontological analysis described in Section 3.1, we examined written texts related to the Italian history of the 20th Century. This choice was driven by the context of Harlock’900, a project (running 2016–2018) involving the Department of Computer Science of the University of Torino and the Fondazione Istituto Piemontese Antonio Gramsci (www.gramscitorino.it), a non-profit institute promoting research on contemporary history, within the framework of the Polo del ’900 initiative (www.polodel900.it). The project aims at implementing a semantic layer, based on computational ontologies of historical events, to enrich archive metadata with information about the content of resources. An overview of the overall approach adopted in Harlock’900 can be found in the paper by Goy et al. (2015).
We performed a (manual) qualitative, in-depth analysis of 200 text fragments, extracted from books (biographies, war reports, testimonies, etc.) containing the narration of events which occurred in Piemonte, a region in North-West Italy, in the period 1943–1945, and belonging to the “Resistenza” (the partisans' struggle against the Fascist regime and the Nazi occupation).
Each fragment contains references to 1.7 events on average (with many fragments referring to a single event and some fragments containing references to up to 4 or 5 events). For each event, we identified the time period and the place in which it occurred (when available in the text), as well as the participants (a total of 380), distinguishing persons, organizations and groups; moreover, we assigned each event a typology (see below).
To perform the overall task (identification of events – with typology, participants, time and place) we set up a small team of “annotators”, with an average expertise about the historical period in focus. For the identification of event typologies, each annotator was provided with the following rules: (a) look at the available typologies (initially an empty set) and choose one (or more) of them, if suitable; (b) if no available typology is suitable, then define new suitable ones; (c) try to assign the most specific typology that makes sense to you.
After a first set of fragments was “annotated” (i.e., events with typology, participants, time and places were identified), the annotators set up a panel to discuss their choices and to formulate a hypothesis about more general classes that could represent generalizations of the identified typologies, thus defining the backbone of the taxonomy of our historical event ontology HERO (Historical Event Representation Ontology). We continued iteratively, alternating annotation rounds and panel discussions until all the fragments were annotated, the event typologies identified and the more general classes defined. For each participant in an event, we also asked the annotators to assign a thematic role, starting from a standard list of general roles found in the literature (see Section 2.2), with the possibility of proposing new roles, if the available ones were not suitable.
A small excerpt of text fragments from our corpus can be found in Table 2, together with their (partial) formal representation (that will be described in the next section); a small part of the event class taxonomy is depicted in Fig. 1; the final list of (general) thematic roles used is shown in Table 1(a) and some examples of event-specific roles can be found in Table 1(b).
A formal model for representing participation modality in historical events
Events in HERO
In this section we describe the ontological model for representing participation in historical events (thematic roles) in HERO, by relying on the results of the previously described analysis. Moreover, we evaluate the proposal by showing the benefits of the approach.
Our approach is inspired by the Davidsonian view of events (Davidson, 1967), in which events are ontologically treated as individuals, thus enabling quantification and predicate attribution over them; more precisely, our approach is inspired by the neo-Davidsonian perspective (Parsons, 1990), which maintains that event types are represented as unary predicates (e.g., stabbing
In order to rely on a well-founded account for the notion of event, we refer to the
Similar considerations could hold for stative perdurants (see the
Space (like other event properties, such as time) can often be described in a rough and fuzzy way (e.g., “in Italy”, “yesterday”), but this does not mean that events do not occur in specific spaces and at specific times: for example, a trans-Atlantic telephone conversation takes place in a space region that is maybe scattered and difficult to represent formally, but it does take place in some space.
Sometimes participants are not easy to identify and describe (e.g., what are the participants in a rainstorm?), but “something” seems to always participate (masses of air, rain drops, etc.).
As discussed above, the general notion of participant, such as the one provided by DOLCE and by other event models (see Section 2.1), does not seem to be enough to describe how entities are involved in events, at least as far as the historical domain is concerned: In the corpus we analyzed (see Section 3.2), clearly distinguishing, for example, victims from perpetrators emerged as a major requirement.
For this reason, we claim that a more fine-grained formal ontological characterization of participation modalities is needed, describing which entities are involved and how they participate in a given (historical) event (i.e., what role they play in the event).
But what is the correct level? Does a good ontological model need general roles (such as agent, patient, beneficiary, etc.) or specific ones (killers, makers, cooks, etc.)?
We claim that participation modality in historical events should rely on a set of
Moreover, there seem to be formal reasons for supporting the representation of event-specific thematic roles in an ontology of events. There are, in fact, some cases in which two participants play the same (general) role, but with peculiar characteristics that are lost if more specific roles are not defined. For example, in a commercial transaction, it is impossible to distinguish between the buyer and the seller if only the agent role is used; in a trial (at least in the Italian legal system), different participants typically play an agentive role, with different characteristics: the State Attorney (Pubblico Ministero), the private prosecution attorney (avvocato dell’accusa), people bringing a civil action (Parte Civile); in a combat, among “shooters” there can be individuals playing the specific role of snipers.
However, a very specific characterization of participation modality alone is not enough, since it does not capture the common semantic aspects shared by specific roles (such as killers, makers, cooks, etc.), i.e., it fails to represent the common general aspects of the ways entities are involved in events. Capturing such common aspects could also be useful from an application-oriented perspective, for example when a user wants to retrieve all participants playing a given general role (e.g., all beneficiaries) in a set of events.
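The retrieval scenario mentioned above can be sketched as follows, under the assumption that event-specific roles are declared as specializations of general ones. The property names (hasKiller, hasVictim, hasAgent, hasPatient), the generic top property and all identifiers are hypothetical and serve only to illustrate the idea.

```python
# Hypothetical role hierarchy: each role points to the more general role
# it specializes. Names are illustrative, not HERO's actual vocabulary.
SUB_PROPERTY_OF = {
    "hasKiller": "hasAgent",
    "hasVictim": "hasPatient",
    "hasAgent": "hasParticipant",
    "hasPatient": "hasParticipant",
}

# Illustrative assertions: (event, role, participant).
ASSERTIONS = [
    ("murder_1", "hasKiller", "person_a"),
    ("murder_1", "hasVictim", "person_b"),
    ("meeting_1", "hasAgent", "person_c"),
]


def specializes(role, general):
    """True if role equals general or is a direct or indirect specialization."""
    while role is not None:
        if role == general:
            return True
        role = SUB_PROPERTY_OF.get(role)
    return False


def players_of(general, events):
    """All participants playing the general role (or a specialization of it)
    in any of the given events."""
    return {
        participant
        for (event, role, participant) in ASSERTIONS
        if event in events and specializes(role, general)
    }


all_agents = players_of("hasAgent", {"murder_1", "meeting_1"})
```

With this arrangement, a query on the general role also returns participants that were asserted only through an event-specific role, which would not be possible if specific roles were unrelated to general ones.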
On the basis of these considerations, we introduced in our model
Thematic roles (both general and event-specific ones) are formally defined in our historical event ontology HERO; HERO top layer relies on the already mentioned foundational ontology DOLCE (Masolo et al., 2003) and its extension DnS (Gangemi and Mika, 2003). In particular, HERO includes the following axioms:14
We express them in First Order Logic, for the sake of clarity. Free variables should be intended as universally quantified.
Within the domain considered (historical events), although in the analyzed corpus also phenomena are described, the major role is played by actions. Therefore, we introduced a taxonomy of subclasses of

Fig. 1. A small fragment of the taxonomy of action types in HERO.
Based on the analysis discussed above, in HERO, we represented thematic roles as binary properties, connecting an event with a participant. More precisely, we have a hierarchy of thematic roles, where:
The
The
The
Table 1. (a) General thematic roles in HERO; (b) some examples of event-specific thematic roles in HERO
This property can be seen as a shortcut for: Event x causes object y to come into existence.
General thematic roles are defined as properties with
Event-specific thematic roles are formally introduced only when needed, according to Criterion (1); for example, a gunfire e (instance of
The intermediate layer, hosting general thematic roles, is worth a final remark. As already mentioned, the proposed list is the result of the convergence of two analytical processes, taking into account the literature and the existing ontologies (top-down), and historical texts (bottom-up). Not surprisingly, such a result is not far from “standard” thematic role lists that can be found in the literature (see Section 2.2). Only a few peculiarities may deserve a further comment:
We included both the patient and the theme roles, because we think that these two roles, with their intended meaning in HERO, capture different relevant aspects of participation modality. Basically, we consider as patients all entities affected by the event (“A group of Fascists killed Galimberti”), while we use theme to describe the role of participants that change place or owner in the event (“The shepherd gave the two Partisans some bread”).
Typically, patients are “affected” because they change their (intrinsic) state, while themes only change their position in space (see, for example, the approach by Bateman and colleagues, 2010). However, the issue is a complex one and deserves a deeper investigation, which is beyond the scope of this paper.
The choice of having both patient and theme roles is supported by authors like Jackendoff (1983, 1990), with his tier-based approach (see Section 2.2), as well as by projects like VerbNet (verbs.colorado.edu/~mpalmer/projects/verbnet.html), where both roles are present.
We added the damaged role alongside the traditional beneficiary role, in order to capture cases in which the side effects of an event cause some disadvantage to a participant (“The collapse of the bridge represented a great handicap for the Allied troops”). A similar role, although not very common, can be found in the literature, for instance in the approach by Smith and Grenon (2004), where the authors distinguish between facilitation and hindrance as specifications of a more general influence role.
We did not include properties referring to time and place in the list of HERO general thematic roles, since we prefer to consider thematic roles as properties representing only participation modalities in a strict sense. Obviously, HERO provides properties expressing the time and the place at which an event occurs (although they are beyond the scope of the current discussion).
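The layered organisation of thematic roles discussed in this section can be illustrated with a minimal Python sketch. It is not the HERO formalization: the role names, and in particular the top-level name `participant`, are assumptions introduced only for illustration. The sketch shows how asserting an event-specific role between an event and a participant lets the corresponding general role be derived.

```python
# Hypothetical role names: only the three-layer organisation follows the text.
SUPER_ROLE = {
    # intermediate layer: general thematic roles
    "agent": "participant",
    "patient": "participant",
    "theme": "participant",
    "beneficiary": "participant",
    "damaged": "participant",
    # bottom layer: event-specific thematic roles
    "buyer": "agent",
    "seller": "agent",
    "sniper": "agent",
}


def ancestors(role):
    """Return the role followed by its super-roles, most specific first."""
    chain = [role]
    while chain[-1] in SUPER_ROLE:
        chain.append(SUPER_ROLE[chain[-1]])
    return chain


def entails(asserted_role, queried_role):
    """True if asserting `asserted_role` between an event and a participant
    also makes `queried_role` hold between them."""
    return queried_role in ancestors(asserted_role)
```

Under these assumptions, `entails("buyer", "agent")` holds while `entails("buyer", "seller")` does not, so the two parties of a commercial transaction remain distinguishable even though both are agents.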
In order to evaluate the benefits of our representation model, we formulated, with the help of domain experts involved in the project, a set of questions that an application offering smart access to the content of archival resources about the “Resistenza” in Piemonte should be able to answer. From the dialog with historians, it emerged that, in the different activities of historical research based on access to library and archive resources, both generalization and specification mechanisms can be useful to address the typical questions researchers try to answer during their investigation. For instance, as far as conflicting actions are concerned, according to the domain experts we interviewed, the application should be able to answer questions representing generalization needs (G), as well as questions representing specialization needs (S); for example:
(G)
Who were the victims of conflicting actions?
Who perpetrated conflicting actions?
How many people have been executed (by shooting, hanging, etc.)?
Which buildings have been destroyed (by fires, or bombing, or …)?
(S)
In which combats did snipers take part?
Who voted in favor of Grandi’s Order of the Day, in the Grand Council of Fascism, on July 25th 1943?
Who took the floor in the Grand Council of Fascism that took place on July 24–25th 1943?
Considering these requirements, if the ontology includes only a generic participation relation, only general thematic roles, or only event-specific thematic roles,
then the application cannot answer both kinds of questions, (G) and (S), mentioned above.
If the ontology includes both
In order to explain how the proposed approach supports an application in answering questions like those listed in (G) and (S), we discuss in detail the first question of each group, namely:
(a) Who were the victims of conflicting actions?
(b) In which combats did snipers take part?
Table 2 shows some examples of text fragments extracted from our corpus (see Section 3.2); expressions referring to events that are classified as conflicting actions (i.e., instances of the
Text fragments from the Harlock’900 corpus, with their semantic representation
In order to answer questions (a) and (b), an application (App) providing access to resource content is faced with the following situations:
If the participants in conflicting actions are simply represented by some
If
If
As a consequence, in order to be able to answer both questions, App should be provided with both general and event-specific thematic roles.
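The comparison above can be made concrete with a small Python sketch. It is only an illustration under stated assumptions, not the App or the HERO ontology: the role names, the action-type names, and every assertion except the Galimberti example are invented. Each assertion is a binary property linking an event to a participant, and a query matches an assertion whenever the queried role is the asserted role or one of its super-roles.

```python
# Hypothetical hierarchies: names are illustrative, not HERO identifiers.
SUPER_ROLE = {"sniper": "agent", "executed_person": "patient"}
SUPER_TYPE = {
    "killing": "conflicting_action",
    "combat": "conflicting_action",
    "execution": "conflicting_action",
}

# Invented toy data: event types and (event, role, participant) assertions.
EVENT_TYPE = {"e1": "killing", "e2": "combat", "e3": "execution", "e4": "combat"}
ASSERTIONS = [
    ("e1", "agent", "group_of_fascists"),
    ("e1", "patient", "galimberti"),
    ("e2", "sniper", "person_a"),
    ("e2", "patient", "person_b"),
    ("e3", "executed_person", "person_c"),
    ("e4", "agent", "person_d"),  # only a general role is known here
]


def closure(name, parent_of):
    """Return `name` together with all of its ancestors in `parent_of`."""
    found = {name}
    while name in parent_of:
        name = parent_of[name]
        found.add(name)
    return found


def matches(event, asserted_role, role, event_type):
    """True if the assertion satisfies the queried role and event type."""
    return (role in closure(asserted_role, SUPER_ROLE)
            and event_type in closure(EVENT_TYPE[event], SUPER_TYPE))


def participants(role, event_type):
    """Participants playing `role` in events of `event_type`."""
    return sorted({p for e, r, p in ASSERTIONS if matches(e, r, role, event_type)})


def events(role, event_type):
    """Events of `event_type` in which someone plays `role`."""
    return sorted({e for e, r, _ in ASSERTIONS if matches(e, r, role, event_type)})
```

With this toy data, `participants("patient", "conflicting_action")` answers a question of kind (G) by also returning `person_c`, who was asserted only through an event-specific role, while `events("sniper", "combat")` answers a question of kind (S) and returns only `e2`, since the assertion for `e4` uses the general agent role and carries no information about snipers.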
The main goal of this paper was to clarify the different meanings of the notion of role and to provide an ontological analysis and a formal representation of thematic roles in (historical) events. In particular, this paper has discussed an ontologically-grounded distinction between social and thematic roles, leading to the claim that, in a (historical) event ontology, thematic roles should be represented as binary properties, connecting events to their participants. The model resulting from our analysis includes a set of general thematic roles coupled with a criterion for the introduction of event-specific roles. The paper has also presented a validation of the model, by showing the benefits it would provide to an application that supports content-based access to resources from historical archives.
There are also interesting open issues we plan to address. In particular, we are building an OWL version of the HERO ontology to be used in the already mentioned Harlock’900 project and in PRiSMHA, a national project started in May 2017 and funded by Compagnia di San Paolo and Università di Torino; it involves the Computer Science and the Historical Studies Departments of the same university, and it is based on a close collaboration with the Polo del ‘900 and, in particular, with the archives and library of the Fondaz. Ist. Piemontese A. Gramsci. Both projects aim at enriching metadata with semantic knowledge about historical events: This will enable us to test our approach in a running application, with real users.
Moreover, we are investigating the opportunities offered by automatic information extraction (Rovera, 2016; Rovera et al., 2017), and in particular by event mining and thematic role labeling approaches. In this perspective, the possibility of mapping HERO classes and properties onto frames and frame elements in FrameNet will be taken into consideration. This task is both interesting and challenging, due to the peculiarity of the domain (the Italian history of the 20th Century, and the “Resistenza” in particular) and the language used in our corpus, represented by biographies, war reports, testimonies, etc.
Finally, the case of role attribution on the basis of specific (historical) perspectives (case (c) in Section 1) represents an interesting open research issue to investigate.
