Abstract
This article discusses work on implementing emotional and cultural models in synthetic graphical characters. An architecture, FAtiMA, implemented first in the antibullying application FearNot! and then extended as FAtiMA-PSI in the cultural-sensitivity application ORIENT, is discussed. We discuss the modelling of relationships between culture, social interaction, and cognitive appraisal. Integrating a lower level, homeostatically based model is also considered as a means of handling some of the limitations of a purely symbolic approach. Evaluation to date is summarised and future directions discussed.
This article presents work on the modelling of emotion and culture and their interrelationship from a computational perspective. In order to implement any model on a computer, the model itself must be sufficiently specific. From this perspective, many psychological models are not usable as they stand, but must be operationalised. Qualitative relationships must be quantified, a definite selection made from competing alternatives, and internal structures must be mapped onto software entities. Those that cannot be so mapped may be omitted altogether. Often this produces a simplified version of the original theory, not always appreciated in the originating disciplines.
Thus, when computer scientists select models from psychology, they tend to favour those that are already sufficiently specific or that can be made so relatively easily. It is for this reason that the cognitive appraisal-based approaches discussed later have often used what is generally referred to as the OCC approach (after Ortony, Clore, & Collins, 1988) in spite of the availability of more recent and much more sophisticated models (Scherer, Schorr, & Johnstone, 2001). In the same way, the cultural dimensions model of Hofstede (Hofstede, 2001), also selected here, appeals because of its systematic and easily operationalised taxonomy.
One motive for the computer-based modelling of emotion and culture indeed lies in the pressure that implementation puts on the original theory. This may tease out ambiguities or lack of precision not initially visible to the constructors of the theory. Increasingly though, computational models are being built for specific applications, often those in which graphical characters perform as autonomous intelligent agents or synthetic characters. These may interact with each other in graphical worlds (Marsella, Johnson, & LaBore, 2000; Paiva et al., 2005), or with human users (Cassell, 2001; Narita & Kitamura, 2010). It is this application domain that has prompted the work we report.
Affect has been a part of work on synthetic characters for nearly 20 years (Elliot, 1992). Work in the OZ project in the 1990s identified emotion as a key aspect of creating what is known as believability for synthetic characters (Bates, 1994). This term has become a touchstone for research on the subject.
It had become clear that users interacting with graphically represented characters as well as robots would often treat them as if they were human interaction partners even though they knew that they were not (Reeves & Nass, 1996). This included applying the intentional stance (Dennett, 1989) to them, and acting as if they had their own inner life: motives, goals, beliefs, and feelings. Believability may be taken as the extent to which human users are willing to suspend their disbelief in an artificial computer-based synthetic character.
Affect relates to believability in synthetic characters in two main ways. The first is through the requirement for contextually appropriate expressive behaviour such as glance, facial expression, posture, and gesture. The impact of a synthetic character acting as a humanised interface depends on expressive behaviour that is consistent and appropriate. Producing expressive behaviour involves sending the right markup commands to a graphics engine. However, synthetic characters that are responsive to user interaction in real time must generate such markup “on the fly” rather than through predetermined scripting. Generating expressive behaviour in this way requires affective models linking inputs from the user and from other synthetic characters to contextually appropriate outputs.
The second requirement for affective models is a generalization of the first. Synthetic characters that are responsive in real time must in fact be autonomous, that is, able to select their actions in relation to real-time input, in the same way as an autonomous robot. It has become clear from work on action selection in the human case (Damasio, 1994) that affect and cognition do not form two separate systems, but are intimately linked. In the case of synthetic characters this requires an affective model that generates overall motivations (Canamero, 1998; Velásquez, 1997), helps to select which goals to pursue at a particular time, and allows alternative plans to be weighed (Aylett, Dias, & Paiva, 2006). Expressive behaviour supports the intentional stance precisely because it acts as a window for the user into the decision-making process of the character, supporting the decoding of motive and intention referred to before. Thus the user will track the supposed motives, goals, and plans of the character, and if these are not appropriate then they will not be believable either.
FAtiMA: An Affective Architecture
In this section we discuss the development of the first version of an affective architecture for synthetic characters as part of work on the application FearNot! (Fun with Empathic Agents Reaching Novel Outcomes in Teaching). This version of FAtiMA (FearNot! AffecTIve Mind Architecture) (Dias & Paiva, 2005) modelled an integration of affect and cognition for autonomous characters but did not specifically model culture. In the following section we describe extensions to FAtiMA for the application ORIENT (Overcoming Refugee Integration with Empathic Novel Technology), intended to incorporate an explicit model of culture.
Empathic Agents
In the work reported here, synthetic characters were developed for educational applications in which the objective was not to produce greater knowledge per se, but to impact attitudes and, as a result, change behaviour. FearNot! focused on education against bullying (Paiva et al., 2005), and ORIENT on education in intercultural empathy (Aylett et al., 2009). In both cases, creating empathy between the user and synthetic characters was seen as a key requirement for meeting the educational objectives. Empathic characters that could successfully evoke this were therefore required.
A modern definition suggests that “empathy” is “any process where the attended perception of the object’s state generates a state in the subject that is more applicable to the object’s state or situation than to the subject’s own prior state or situation” (Preston & de Waal, 2002).
One can distinguish between two types of empathy: cognitive empathy and affective empathy. For cognitive empathy, perception of the “object” (another person in this case) produces an understanding of their “affective state.” In the case of affective empathy, a change in the affective state of the subject is produced.
One can also distinguish two different mechanisms (Bischof-Köhler, 1989) for mediating empathy. The first is mediation by the situation in which the object is perceived to be: for example, seeing someone have their handbag stolen may produce the “cognitive empathy” effect of understanding that they are sad and angry. Empathy may also be mediated by expression, where any element of the full range of expressive behaviour produces the empathic effect. Thus an affective empathy reaction of sadness may be produced by seeing the target crying.
FearNot! was designed as a virtual drama in which one synthetic character was bullied by another in a virtual school (see Figure 1). The actions of the characters were not prescripted but were dynamically generated by the parameters in their affective models: anger and hate in the case of the bully and fear and distress in the case of the victim.

Figure 1. FearNot!: Education against bullying.
The child user watched the interaction and after it the victimised character would ask them for advice about how to deal with the situation. The child user was able to input free text (see Figure 2). The advice given would influence the behaviour of the victimised character indirectly by altering some of the parameters of its affective model, and this in turn would make specific actions more likely—though not certain—in the next scene.

Figure 2. Interaction with FearNot!
Including an affective model both allowed the character to display appropriate expressive behaviour (mediation by expression) and produced the necessary dramatic actions (mediation by situation).
Modelling Cognitive Appraisal and Coping
There are many models of affect in the psychological literature, more than can be reviewed here. Not all of these are readily implementable. Of those that have been implemented, some derive from a more neuro-physiological perspective (e.g., Canamero, 1998; Velásquez, 1997), less relevant to a synthetic character that interacts with users using natural language. Multilevel models, and notably Russell’s model of core affect (Russell, 2003) and its extension by Mehrabian (1996), have been widely implemented; but while the concepts of pleasure, arousal, and dominance are influential, and can be used to generate expressive behaviour on their own, they provide no link between an event in the world, an affective response, and a resulting action.
The required action repertoire of a FearNot! character included “physical” actions such as one character pushing another, or movement in the graphical 3D scene, as well as dialogue or language actions such as “mock” or “insult.” While expressive behaviour required a reactive model—characters do not plan to cry, but may do so if they are sufficiently distressed—other behaviour, such as actual bullying behaviour, involved planning from an initial set of goals.
Cognitive appraisal was chosen as a means of generating a modelled affective state. Already implemented in earlier systems (Elliot, 1992; Marsella & Gratch, 2003; Marsella et al., 2000), a cognitive appraisal mechanism can be represented as a set of symbolically encoded rules in software linking an event in the virtual world and the current goals of the synthetic character to a generated affective state. The OCC model (Ortony et al., 1988) was selected because it was straightforward to represent the events and goals of the FearNot! synthetic characters and link them to the 22 affective states of the OCC taxonomy. Moreover, the OCC taxonomy includes emotions that concern behavioural standards and social relationships (like/dislike, praiseworthiness and desirability for others) and thus it was felt that it would support appraisal processes that take into consideration cultural and social aspects of interaction.
While in the human case a taxonomy of emotions could be criticised as conflating linguistic representations with much more complex underlying states and processes, from a computational perspective, it leads to a structure representing each of the 22 OCC emotions for a specific character as shown in Table 1.
Table 1. Representing an emotion computationally
Here valence corresponds to the pleasure dimension of Russell (2003), and intensity can be combined over all 22 emotions to represent his arousal dimension. What OCC adds to this is an account of how external events can be coupled to the generation of one of the 22 emotions, readily implemented as a set of rules. To this static representation can be added numeric thresholds, and decay factors, so that the level of an elicited emotion is reduced over the succeeding time periods. By defining thresholds and decay factors differently for different characters, it becomes possible to model different emotional dispositions (see Figure 3). Thresholds represent a resistance to a specific emotion, so that though evoked by an appraisal, an emotion only becomes part of the emotional state if above threshold. When emotions are linked to actions as described in what follows, this will produce different patterns of behaviour over time. These patterns can then be perceived by the user as personality, removing the need to model personality separately from the affective model. This is an example of the modelling economy introduced by using a cognitive appraisal approach.

Figure 3. Defining a character’s emotional disposition.
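The threshold and decay mechanism described above can be illustrated with a minimal Python sketch. It is not the FAtiMA implementation: the field names, the 0 to 10 intensity scale, the subtraction of the threshold from the appraised potential, and the treatment of the decay factor as the share of intensity retained per time step are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Emotion:
    """One OCC emotion slot for one character (field names are assumed)."""
    name: str
    valence: int        # +1 for positive emotions, -1 for negative ones
    threshold: float    # resistance: potentials at or below this are ignored
    retention: float    # assumed decay factor: share of intensity kept per tick
    intensity: float = 0.0

    def elicit(self, potential: float) -> None:
        """Add an appraised potential only if it exceeds the threshold."""
        if potential > self.threshold:
            gained = potential - self.threshold
            self.intensity = min(10.0, self.intensity + gained)

    def tick(self) -> None:
        """Reduce the intensity by one time step of decay."""
        self.intensity *= self.retention


# Two hypothetical dispositions reacting to the same appraised event.
sensitive = Emotion("distress", valence=-1, threshold=1.0, retention=0.9)
stolid = Emotion("distress", valence=-1, threshold=6.0, retention=0.5)
for emotion in (sensitive, stolid):
    emotion.elicit(5.0)
    emotion.tick()
```

After one event and one time step the sensitive character still carries distress while the stolid character never registered it, which is the kind of difference a user could read as personality.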
Cognitive appraisal links external events to generated emotion through emotional-reaction rules, but not to action selection. Reactive behaviour was modelled in FAtiMA by taking up the idea of action tendencies, using the Lazarus (1991) view that action tendencies are well-established biological impulses as compared to coping behaviour. Every character is equipped with its own set of action tendency rules, each linking a minimum level of a specific emotion to a reactive action. For example, distress at a high minimum intensity may trigger tears, an expressive behaviour. Action tendencies can vary across characters. Not only might one character experience more distress for longer (low threshold, high decay rate), but that distress might also quickly trigger crying, whereas a different distress behaviour might be generated for another character.
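An action tendency rule set of this kind could be sketched as follows. The rule fields follow the description above (an emotion, a minimum intensity, a reactive action); the action names and the tie-breaking policy, which prefers the rule with the highest minimum intensity among those that fire, are hypothetical.

```python
from typing import NamedTuple, Optional


class ActionTendency(NamedTuple):
    emotion: str          # emotion that triggers the rule
    min_intensity: float  # minimum intensity needed for the rule to fire
    action: str           # reactive action to perform


def reactive_action(rules: list[ActionTendency],
                    state: dict[str, float]) -> Optional[str]:
    """Return the action of the most demanding rule that the state satisfies."""
    fired = [r for r in rules if state.get(r.emotion, 0.0) >= r.min_intensity]
    if not fired:
        return None
    return max(fired, key=lambda r: r.min_intensity).action


# Hypothetical per-character rule sets: the same distress, different reactions.
victim_rules = [
    ActionTendency("distress", 7.0, "cry"),
    ActionTendency("distress", 3.0, "look_down"),
]
stoic_rules = [ActionTendency("distress", 7.0, "walk_away")]

high_distress = {"distress": 8.0}
mild_distress = {"distress": 4.0}
```

Here `reactive_action(victim_rules, high_distress)` yields the crying behaviour, while the same emotional state produces a different reaction for the character configured with `stoic_rules`.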
Coping behaviour is more likely to involve planned actions (Lazarus, 1991). Problem-based coping involves actions in the world in relation to the event that caused a strong emotion, while emotion-based coping results in internal adjustment—for example, mental disengagement in which a goal or plan is dropped, or wishful thinking about the outcome when faced with a threatening condition. In the computational paradigm, intelligent planning has been studied largely as a means of sequencing actions together correctly such that they meet some goal(s) efficiently. However, efficiency is not the same thing as believability, and the standard artificial intelligence (AI) approach neglects the interaction between planning and affect (Damasio, 1994). In an application where one character is bullying another, fear, anger, and other emotions are intimately involved with planning.
A detailed discussion of the planning process created for characters may be found elsewhere (Aylett et al., 2006). It draws on the OCC prospect-based emotions, that is, emotions relating to events or actions that have not yet taken place: hope and fear. Hope and fear may be directly generated from the appraisal process, but through emotion-directed coping may also be generated from within planning itself as future actions are considered. From a computational point of view, emotions can be considered as if they were planning heuristics, helping to control which goals are selected for planning, and which of a number of possible plans for each goal are favoured for execution. We have already seen that the use of an affective model that includes thresholds and decay factors, in a character subject to variable external events, can produce a complex dynamic model. The addition of planning extends the dynamics of the model beyond single-step action–reaction and introduces internal cognitive processes in a feedback loop with the affective system.
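The idea of hope and fear acting as planning heuristics can be sketched in a few lines of Python. The formulas used here (hope as importance of success weighted by success probability, fear as importance of failure weighted by failure probability) and the selection rule (largest hope minus fear) are assumptions chosen for illustration; the actual planning process is described in Aylett et al. (2006).

```python
from dataclasses import dataclass


@dataclass
class Plan:
    name: str
    success_probability: float    # estimated chance that the plan works
    importance_of_success: float  # how much achieving the goal matters
    importance_of_failure: float  # how bad it is if the plan fails

    @property
    def hope(self) -> float:
        return self.importance_of_success * self.success_probability

    @property
    def fear(self) -> float:
        return self.importance_of_failure * (1.0 - self.success_probability)


def favoured(plans: list[Plan]) -> Plan:
    """Favour the plan with the best balance of hope over fear."""
    return max(plans, key=lambda p: p.hope - p.fear)


# Two hypothetical plans a victimised character might consider.
plans = [
    Plan("fight_back", 0.5, importance_of_success=6.0, importance_of_failure=8.0),
    Plan("walk_away", 0.25, importance_of_success=6.0, importance_of_failure=2.0),
]
choice = favoured(plans)
```

In this example the less promising plan is favoured because failing at it generates far less fear, showing how an affective heuristic can depart from a purely efficiency-driven choice.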
Table 2 summarises the FAtiMA model elements specified when defining a specific character.
Table 2. Model elements used for FAtiMA character definitions
There is no straightforward way of testing a model such as FAtiMA in isolation from a particular application, even though FAtiMA is a generic architecture. Actual characters, actual events, and actual actions in a specific domain are required in order to run the model. This means that one cannot test the generic mechanisms of cognitive appraisal and reactive and planned coping behaviours independently of specific settings of the many variables in the model. Advice from the psychologist members of the development team was taken, drawing on characterizations in the literature of the personality of bullies and victims. Variable settings were then refined through trial and error in which values were tried and the resulting character behaviour considered for appropriateness and believability.
FearNot! was extensively trialled in schools in the UK, Germany, and Portugal. A large-scale longitudinal evaluation (Sapouna et al., 2010) showed that FearNot! did have a positive impact and, just as importantly, that it did not produce cleverer bullies. This underlines the importance of characters that can evoke empathy.
Extending FAtiMA: Including Motivations and Culture
Though embodying a generic model, FAtiMA was initially developed in response to the demands of FearNot! While there are strong cultural factors at work in human bullying behaviour, these were incorporated in FearNot! through content and not through generic modelling. Some different scenes/situations were authored for a German version as against a UK version, and cosmetic changes such as characters having graphical school uniforms or not were incorporated. Nevertheless, the evaluation showed that the impact of FearNot! was indeed different in the two different cultures in which it was tested (Watson et al., 2010).
However, FAtiMA was next reapplied to education in intercultural empathy in the ORIENT application. This required a generic model of culture such that characters could be configured to behave according to the norms of different cultures.
In this section, we first consider ways in which FAtiMA as first implemented might be modified to include a model of culture. These possibilities were inherent in the way the model had been constructed from both a static and dynamic point of view.
Having established that at least one of them requires much better goal management than FAtiMA initially offered, we explain how ideas from the PSI model (Doerner, 2003) were included so that motivations could be used as a way of managing goals. We then consider a computationally feasible model of culture—Hofstede’s (2001) cultural dimensions. Finally, we explain how part of Hofstede’s model has been incorporated into the extended FAtiMA needed for the ORIENT application.
FAtiMA and Extensions for Modelling Culture
First, culture may impact the FAtiMA appraisal process linking character goals to events or actions to generate a change in one or more emotions. Emotional reaction rules (above in Table 2) link events or actions to values representing desirability (how far this supports or impedes character goals) and praiseworthiness (how far it supports or impedes social norms). Praiseworthiness and desirability values are then used to update all 22 OCC emotions for a character. Social norms could be expressed as the praiseworthiness of specific events or actions or as the desirability of specific social goals held by all characters in a culture.
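As a rough illustration of this first option, the sketch below maps one emotional reaction rule onto a handful of OCC emotions. Only a small subset of the 22 emotions is covered, and the event name, the numeric values, and the two cultures are hypothetical. The pairing of desirability with joy or distress and of praiseworthiness with pride, shame, admiration, or reproach follows the OCC taxonomy.

```python
from typing import NamedTuple


class ReactionRule(NamedTuple):
    event: str
    desirability: float      # > 0 supports the character's goals, < 0 impedes them
    praiseworthiness: float  # > 0 upholds social norms, < 0 violates them


def appraise(rule: ReactionRule, own_action: bool) -> dict[str, float]:
    """Map one rule onto a few OCC emotions and their elicited potentials."""
    emotions: dict[str, float] = {}
    if rule.desirability > 0:
        emotions["joy"] = rule.desirability
    elif rule.desirability < 0:
        emotions["distress"] = -rule.desirability
    if rule.praiseworthiness > 0:
        emotions["pride" if own_action else "admiration"] = rule.praiseworthiness
    elif rule.praiseworthiness < 0:
        emotions["shame" if own_action else "reproach"] = -rule.praiseworthiness
    return emotions


# The same event appraised under two hypothetical cultures with different norms.
strict = ReactionRule("interrupt_elder", desirability=-1.0, praiseworthiness=-8.0)
relaxed = ReactionRule("interrupt_elder", desirability=-1.0, praiseworthiness=0.0)
```

Changing only the praiseworthiness value attached to the event is enough to make observers in the stricter culture feel reproach where observers in the relaxed culture feel none.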
Second, culture might be encoded within the network linking praiseworthiness and desirability to emotion intensity updates, independently of specific goals or norms. This would be to assert that a culture has a generic impact on the amount of emotional change generated during appraisal—possibly differentially across emotions. For example, a culture might be a censorious one in which everyone reacts more angrily to blameworthiness.
Third, culture might impact emotional disposition or action tendencies (see Table 2) via global modification of thresholds and decay rates. Adding a cultural increment to all thresholds in all characters, for example, would reduce their emotional sensitivity to the appraisal process, and subtracting globally would increase it and produce an increase in volatility. A global increment to decay rates would produce personalities that would stay in given emotional states for longer and thus would be more emotionally driven, while subtracting would produce a set of relatively calmer personalities. Action tendencies are expressed as rules triggered by a minimum intensity of a specific emotion, so a cultural increment or decrement to emotion thresholds and decay rates would itself impact the reactive behaviour of a character. However, the minimum intensities could themselves be adjusted with reference to a cultural model, affecting the perceived impulsiveness or stolidity of characters.
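The effect of a global cultural increment on thresholds can be sketched as follows. The 0 to 10 scale, the clamping, and the example values are assumptions; decay rates and the minimum intensities of action tendencies could be shifted in the same way.

```python
def apply_culture(thresholds: dict[str, float], increment: float) -> dict[str, float]:
    """Shift every emotion threshold by a culture-wide increment, clamped to 0..10."""
    return {name: min(10.0, max(0.0, value + increment))
            for name, value in thresholds.items()}


def felt(thresholds: dict[str, float], potentials: dict[str, float]) -> list[str]:
    """Names of the emotions whose appraised potential exceeds the threshold."""
    return sorted(name for name, potential in potentials.items()
                  if potential > thresholds.get(name, 0.0))


base = {"anger": 3.0, "distress": 4.0}
reserved = apply_culture(base, +2.0)  # hypothetical less sensitive culture
volatile = apply_culture(base, -2.0)  # hypothetical more sensitive culture
event = {"anger": 4.0, "distress": 3.0}
```

With these values the same appraised event elicits no emotion in the reserved culture, anger alone for the unmodified character, and both anger and distress in the volatile culture.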
Finally, relationships between culture and coping behaviour could also be modelled. For example, culture could affect the goal selection process, in which a subset of possible goals is foregrounded for active planning. However, in the first version of FAtiMA, preconditions are hardwired in goals attached to specific characters, and goal priority is determined solely by the amount of hope and fear generated by the planning process and the other emotions that result from it. In the next section we discuss the addition of motivations to the model as a way of improving goal management, and in the following section we discuss the use of Hofstede’s cultural dimensions that build upon this.
Adding Motivations to FAtiMA
The cognitive appraisal approach discussed so far represents one broad paradigm, characterised computationally by symbolic representations and explicit reasoning processes. However, particularly in robotics, a different paradigm has often been applied, taking ideas from neurophysiology (Canamero, 1998; Velásquez, 1997). This posits a small set of basic drives, some survival-related, such as that for food, and some more general, such as that for novel experiences.
Drives have associated states—for instance, how much an agent has eaten recently, how many novel experiences it has had—whose value has upper and lower bounds relating to the overall comfort of the agent. When a state moves outside of these bounds, the drive produces a motivation for actions that will move it back inside its comfort range. Thus homeostasis is the primary mechanism for the dynamics of this type of model. Affect is procedurally modelled through behaviour that can be interpreted as affective, rather than through an explicit representation as in cognitive appraisal. There is no requirement either for symbolic representations or for explicit reasoning, with stimulus–response mappings able to manipulate numerical sensor inputs directly.
Instead of a mutually exclusive alternative, one can include this approach as part of a heterogeneous model in which drives, and the ensuing motivations, act as a goal-management system for symbolically represented goals (Aylett, 2006). In this account, a motivation is a long-term and generic reaction to a drive that activates a set of relevant goals without necessarily choosing between them. Thus a hungry agent has a motivation to find food, but this motivation could be met by the different goals of opening the fridge, buying a sandwich, or picking wild berries, depending on the context and the resources of the environment.
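A homeostatic drive acting as a goal-management layer might look like the following sketch. The class and field names, the comfort bounds, and the goal table are hypothetical; the point is only that a drive outside its comfort range yields a motivation, and the motivation activates a set of symbolic goals without choosing among them.

```python
from dataclasses import dataclass


@dataclass
class Drive:
    name: str
    level: float   # current state, e.g. how much the agent has eaten recently
    lower: float   # bottom of the comfort range
    upper: float   # top of the comfort range

    def motivation(self) -> float:
        """Distance outside the comfort range; 0.0 while inside it."""
        if self.level < self.lower:
            return self.lower - self.level
        if self.level > self.upper:
            return self.level - self.upper
        return 0.0


# Hypothetical mapping from a drive to the symbolic goals it can activate.
GOALS = {"food": ["open_fridge", "buy_sandwich", "pick_berries"]}


def activated_goals(drive: Drive) -> list[str]:
    """A motivated drive activates all its goals without choosing between them."""
    return list(GOALS.get(drive.name, [])) if drive.motivation() > 0 else []


fed = Drive("food", level=0.6, lower=0.3, upper=0.9)
starving = Drive("food", level=0.1, lower=0.3, upper=0.9)
```

Choosing among the activated goals is left to the symbolic layer, which is where context, resources, and affective appraisal can come into play.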
Consider a cultural reason for satisfying hunger in a specific way—say not eating the ham sandwich in the fridge but going out to buy an egg sandwich instead. It is hard to model this purely at the level of drives. Plausibly, culturally mediated food preferences operate through an affective state, so that the agent finds ham sandwiches disgusting. However, a cognitive appraisal seems a more convincing mechanism for generating disgust when the fridge door opens. In the same way, an affiliation drive might produce the socially interactive behaviour that allows cultural norms to be learned and internalised, but the drive alone seems too generic to invoke culturally specific behaviours.
FAtiMA-PSI
In incorporating drives and motivations into FAtiMA, we could have picked an existing computational implementation of a neuro-physiologically inspired theory. However, these were designed to act in isolation, not to be integrated with a cognitive appraisal-based system. We therefore examined the PSI theory of Doerner (2003), since this already integrated cognition, emotion, and motivation for human action regulation and included links to planning. The PSI theory starts from the definition of five basic drives or needs, as seen in Table 3.
Table 3. PSI drives/needs
A deviation from the threshold set for a need will give rise to a motive. Motives feed into an action selection process and a goal is selected for execution based on its anticipated probability of success, the degree to which it satisfies needs, and its estimated urgency. If the character does not have any knowledge of how to satisfy this goal, the success probability will be low; however, if its competence is high, it will perform exploratory behaviour and may still consider selecting the goal. A PSI agent has three strategies for dealing with a goal. First, the agent tries to recall an automatic reaction. If this is not successful or if no such reaction exists, it attempts to construct a plan. If both automatic reaction and planning fail, the agent resorts to applying trial and error, a type of exploratory behaviour.
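The selection step and the three fallback strategies can be sketched as below. Combining the three factors by multiplication is an assumption made for illustration (the equations actually used are given in Lim, Dias, Aylett, and Paiva, 2010), and the goal names, values, and strategy callables are hypothetical.

```python
from typing import Callable, NamedTuple, Optional


class Goal(NamedTuple):
    name: str
    success_probability: float  # anticipated chance of success
    need_satisfaction: float    # how far it reduces the deviating need
    urgency: float              # how soon it has to be done


def score(goal: Goal) -> float:
    """Combined strength of a goal (the product form is an assumption)."""
    return goal.success_probability * goal.need_satisfaction * goal.urgency


def select_goal(goals: list[Goal]) -> Goal:
    """Pick the goal with the highest combined score."""
    return max(goals, key=score)


Strategy = Callable[[Goal], Optional[list[str]]]


def pursue(goal: Goal, recall: Strategy, plan: Strategy, explore: Strategy) -> list[str]:
    """Try an automatic reaction, then planning, then trial and error."""
    for strategy in (recall, plan):
        actions = strategy(goal)
        if actions:
            return actions
    return explore(goal) or []


goals = [
    Goal("greet_stranger", 0.5, 4.0, 1.0),
    Goal("find_food", 0.25, 6.0, 2.0),
]
chosen = select_goal(goals)
actions = pursue(
    chosen,
    recall=lambda g: None,  # no automatic reaction is known
    plan=lambda g: ["walk_to_market", "buy_food"],
    explore=lambda g: ["wander"],
)
```

If both the recall and plan strategies return nothing, `pursue` falls through to the exploratory strategy, mirroring the trial-and-error fallback described above.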
Unlike a cognitive appraisal approach, PSI has no explicit representation of emotion. Rather, cognitive and motivational processes are modulated under different environmental circumstances in ways that are interpretable by an observer as emotionally modified. The three modulating parameters are:
Activation or arousal: This is the degree of preparedness for perception and reaction. A high level of arousal produces faster behaviour, and arousal itself increases with the overall pressure from the motivational system as well as in relation to the strength (urgency and importance) of the currently active intention.
Resolution level: This determines the accuracy and deliberateness of cognitive processes such as perception, planning, and action regulation. It varies inversely with arousal: when arousal is high, an agent will put less effort into considering the consequences of its actions.
Selection threshold: This prevents oscillation between behaviours by giving the current active intention priority. It varies in proportion to arousal: An agent is easily distracted from its current intention when the threshold is low, and is highly focused when it is high.
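The qualitative relationships between the three parameters can be written down as a small sketch. Arousal is assumed to lie between 0 and 1, and the linear forms and the `gain` constant are illustrative assumptions rather than the PSI equations.

```python
def resolution_level(arousal: float) -> float:
    """Resolution falls as arousal rises (the linear form is an assumption)."""
    return 1.0 - arousal


def selection_threshold(arousal: float, gain: float = 0.5) -> float:
    """Selection threshold grows in proportion to arousal."""
    return gain * arousal


def should_switch(current: float, rival: float, arousal: float) -> bool:
    """Switch intention only if the rival beats the current one by the threshold."""
    return rival > current + selection_threshold(arousal)


calm_switch = should_switch(current=1.0, rival=1.2, arousal=0.25)
aroused_switch = should_switch(current=1.0, rival=1.2, arousal=1.0)
```

The same rival intention distracts the calm agent but not the highly aroused one, which stays focused on its current intention while reasoning at a lower resolution.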
A big advantage in linking this to the FAtiMA architecture is that PSI incorporates a built-in learning process. By trying different goals and actions under different circumstances, the agent will learn which goal and action is the most effective in satisfying its needs. PSI agents’ differences in behaviour will then correspond to different life experiences that lead to different learned associations, offering a potential not only for the modelling of personality, but also of culturally mediated behaviours.
As shown in Table 2, an emotional reaction rule has to be written for each action of each FAtiMA agent to define the praiseworthiness and the desirability (or undesirability) of the action, both for the agent itself and for other agents. However, PSI agents can derive desirability for events automatically from needs: the better an action or goal satisfies need(s), the more desirable it is, eliminating the hardwired emotional reaction rule sets. FAtiMA goals and actions then require an expanded representation that includes the potential effects on needs of carrying out the corresponding goals or actions. The same applies to the action tendency rules also seen in Table 2. A FAtiMA agent needed a reactive rule, for example, run when in danger. For a FAtiMA-PSI agent, this action is automatic because in this case, the need for certainty would be high and the agent would choose the run action.
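Deriving desirability from needs, rather than from hand-written reaction rules, might be sketched as follows. The weighting scheme (each expected effect on a need is multiplied by the current deviation of that need), the action names, and the numeric values are assumptions; the need names certainty and competence are taken from the text.

```python
def desirability(effects: dict[str, float], deviations: dict[str, float]) -> float:
    """Weight each expected effect on a need by how far that need deviates."""
    return sum(change * deviations.get(need, 0.0)
               for need, change in effects.items())


# Hypothetical expected effects of two actions on the agent's needs.
ACTIONS = {
    "run_away": {"certainty": 3.0},
    "explore": {"certainty": -1.0, "competence": 2.0},
}


def best_action(deviations: dict[str, float]) -> str:
    """Choose the action whose effects best satisfy the current needs."""
    return max(ACTIONS, key=lambda name: desirability(ACTIONS[name], deviations))


in_danger = {"certainty": 4.0, "competence": 1.0}
at_ease = {"certainty": 0.0, "competence": 2.0}
```

When the need for certainty deviates strongly, running away scores highest without any dedicated reactive rule, whereas the same agent prefers to explore once that need is satisfied.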
The FAtiMA-PSI architecture was developed for the ORIENT application in which meeting the pedagogical goals required a greater degree of control over agent behaviour than a pure PSI architecture can provide. While the desirability of actions for the agent itself can be learned using the PSI mechanisms, the use of OCC and cognitive appraisal allows an explicit model of desirability for others and praiseworthiness as described in what follows. More detail on this architecture, including all the equations used to calculate internal variables, can be found in Lim, Dias, Aylett, and Paiva (2010).
A Computational Model of Culture to Support ORIENT
It is widely accepted that culture pertains to the social world and determines how groups of people structure their lives (Bennett & Bennett, 2004). Thus culture can be seen as a collective phenomenon shared by people that live in the same social environment (Hofstede, 2001). A constructivist definition of culture (Berger & Luckmann, 1967) looks at culture in two ways. The first covers the institutional aspects of culture, such as political and economic systems, and products of culture: works of art, music, cuisine, etcetera. The second, subjective culture, covers the experience of the social reality formed by a society’s institutions, that is, the worldview of its people (Bennett & Bennett, 2004).
However, in order to operationalise a model of culture, we sought work that could easily be expressed as a set of specific rules and focused on Hofstede’s taxonomy of cultural dimensions (Hofstede, 2001). This work views culture as those patterns of thinking, feeling, and acting that are shared and learned by members of the same culture. These patterns can manifest in several forms: values, rituals, heroes (persons that serve as models of behaviour), and symbols (gestures, words, pictures to which members of the culture have assigned a particular meaning).
Values are defined as “broad tendency to prefer certain states of affairs over others” (Hofstede, 1991, p. 35). “They transcend specific situations, guide selection or evaluation of behavior and events, and are ordered by relative importance” (Schwartz & Bilsky, 1987, p. 551). Values are often unconscious to those who hold them and as such cannot be directly observed. Instead they can be inferred from the way people act under certain circumstances.
Unlike values, rituals are clearly observable in cultures and are essential to social activities—it is known that humans have been involved in ritual activities since the earliest tribal communities. According to Bell (1997), rituals not only regulate the relationships between people within a community, but also between people and their natural resources. In general a ritual can be defined as a particular set of actions, often thought to have symbolic value. The performance of a ritual is usually prescribed by a religion or by the traditions of a community. Finally, cultures also have associated symbols, which constitute words, gestures, pictures, or objects with meanings specific to that particular culture.
Based on a large-scale study of IBM employees in different countries, Hofstede added to these patterns five dimensions across which cultures vary, and that are indications of general behavioural tendencies. These dimensions are: power distance, individualism/collectivism, masculinity/femininity, uncertainty avoidance, and long-term orientation (defined in Table 4). The importance of these dimensions is that they can be associated with manifestations of cultural difference, thus linking cultural parameters to cultural behaviour.
Table 4. Hofstede’s cultural dimensions model
Note: PD = power distance; M = masculinity; F = femininity.
Two of these dimensions were modelled in FAtiMA-PSI. The first is power distance, the degree to which less powerful members of the group expect and accept that power is distributed unequally. In low power distance cultures, power relations are usually more consultative or democratic, whereas in high power distance cultures, people tend to accept power relations that are more autocratic, and usually respect and acknowledge the power of others just by their formal status. The second dimension considered was individualism/collectivism, which looks at the relations between the individual and the group. Collectivism pertains to societies in which people are integrated into strong, cohesive in-groups, whereas individualism pertains to societies in which everyone is expected to look after themselves and their immediate family.
Extending FAtiMA-PSI for ORIENT
Just as FearNot! was the motivating application for the development of FAtiMA, so ORIENT motivated extending it to deal with culture. ORIENT is an intelligent graphical-character-based system designed to enhance intercultural empathy. It attempts to take a group of three cooperating teenage (about 14-year-old) users through the early stages of the Bennett (1993) model of the development of cultural sensitivity. This application is discussed in more detail elsewhere (Aylett et al., 2009; Lim, Dias, Aylett, & Paiva, 2010); here we consider only the graphical characters involved, visualised as “aliens” called Sprytes, on a planet called Orient.
Sprytes are somewhat humanoid but, as seen in Figure 4, are actually modelled visually on tree frogs. Their culture is a synthetic one, not identifiable as any specific human culture. There were several reasons for this choice. First, the application itself was to be used in different cultural settings, and for evaluation purposes it was better that the culture portrayed be unfamiliar to all users; second, there was a desire to avoid real-world cultural stereotypes (and for similar reasons Sprytes were ungendered); finally, it was very clear that representing all the richness of a real culture was infeasible in the current state of the art. We wished to avoid a sketch or caricature that could be quite offensive to members of a real culture.

Figure 4. Spryte characters in ORIENT.
It is clear that in creating an affective agent architecture, culture should be taken into account in the modulation of affective states—both at the expression and at the generation level. Some emotionally based agent architectures already do consider cultural differences and have explored this issue for concrete applications.
CUBE-G (Rehm et al., 2007) is an interesting project that also uses Hofstede’s cultural dimensions (Hofstede, 2001) for modelling nonverbal communication aspects of different real-world cultures. Agents in CUBE-G establish conversations between themselves and users, and in those interactions the cultural background of a user is inferred by sensing their nonverbal behaviour while using a Nintendo Wii remote controller. The nonverbal behaviour of the agents is then dynamically adapted according to the culture inferred by the system. The culturally affected behaviour (CAB) model (Solomon, van Lent, Core, Carpenter, & Rosenberg, 2008) takes a different approach, allowing the encoding of specific ethnographic data on cultural norms, biases, and stereotypes. These are used to modulate the behaviour of agents. However, neither of these systems considers how culture might influence emotional processes, a requirement for the creation of intelligent agents for ORIENT.
Our own approach allows the cultural dimensions of an agent society to be explicitly represented through individual culturally specific behaviours. This in turn supports the emergence of collective behaviours for a society of agents. Our aim in extending FAtiMA-PSI was to parameterise the two Hofstede cultural dimensions described above, so that different settings of these parameters change the cognitive–affective processes of FAtiMA-PSI, modelling the impact of a specific culture. The affected elements in our model were the appraisal process, appraisal variables, and goal selection. A set of values is specified for the symbols, cultural dimensions, and culturally specific rituals of a particular culture, and all the characters that are part of that culture inherit these settings.
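The idea of a single culture specification shared by all its characters can be sketched as a small data structure. This is a minimal illustration only: the class names `Culture` and `Character`, the field names, and the 0–100 scale for power distance are assumptions made for the example and are not taken from the FAtiMA-PSI implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Culture:
    """Hypothetical container for one culture's parameter settings."""
    name: str
    individualism: int                 # 0-100, 0 = most collectivist
    power_distance: int                # assumed 0-100 scale, 100 = highest
    symbols: dict = field(default_factory=dict)   # perceived sign -> meaning
    rituals: list = field(default_factory=list)   # names of known rituals


@dataclass
class Character:
    """Hypothetical character that inherits settings from its culture."""
    name: str
    culture: Culture                   # shared reference, not a copy


# All characters of a culture point at the same Culture object,
# so changing one setting changes it for every member.
spryte_culture = Culture(
    name="spryte",
    individualism=20,
    power_distance=80,
    symbols={"wave_hand": "greeting"},
    rituals=["shared_meal"],
)
characters = [Character("a", spryte_culture), Character("b", spryte_culture)]
```

Holding a reference rather than a copy is one simple way to obtain the inheritance behaviour described; other designs, such as class-level defaults with per-character overrides, would serve equally well.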
Symbols, Rituals, and Appraisal in FAtiMA-PSI
When an event is perceived by the agent it now passes through a symbol translator to obtain the meaning of that event according to the particular agent’s culture (for instance a waving hand may be considered a greeting in one culture but insulting in another). The perceived event is then used to update the agent’s knowledge base (containing its knowledge of the world), its autobiographic memory (containing events organised as episodes), and the motivational state discussed before.
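The symbol translation step can be pictured as a per-culture lookup applied before any other processing. The sketch below is an assumption about how such a step might look, not a description of the actual component; the table `SYMBOL_TABLES`, the culture labels, and the function `translate_symbol` are all hypothetical.

```python
# Hypothetical per-culture symbol tables: perceived action -> cultural meaning.
SYMBOL_TABLES = {
    "culture_a": {"wave_hand": "greeting"},
    "culture_b": {"wave_hand": "insult"},
}


def translate_symbol(culture: str, perceived_action: str) -> str:
    """Return the meaning of an action in the given culture.

    Actions with no culture-specific entry are passed through unchanged,
    which is an assumption of this sketch.
    """
    table = SYMBOL_TABLES.get(culture, {})
    return table.get(perceived_action, perceived_action)


meaning_a = translate_symbol("culture_a", "wave_hand")
meaning_b = translate_symbol("culture_b", "wave_hand")
```

Here the same waving hand is read as a greeting by an agent of the first culture and as an insult by an agent of the second, mirroring the example in the text.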
Some events affect the agent’s motivational state: for example, if the agent finished an eating action, its “need” for energy should go down. However, if the agent sees another agent finishing eating, then it should also capture that event, and then predict the other agent’s level of energy. FAtiMA-PSI includes a mechanism to model other agents and their relationship to the individual agent, which is able to build and update a record of the motivational state of other agents according to events perceived. This information is used later in the cultural goal selection and cultural appraisal processes.
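A minimal sketch of this bookkeeping is given below. It assumes that needs are stored as intensities on a 0–10 scale, that unobserved needs start at a default value, and that each action has a fixed effect; the names `ACTION_EFFECTS` and `MotivationTracker`, and all numeric values, are hypothetical and chosen only for illustration.

```python
# Assumed effect of each action on need intensity (negative = need reduced).
ACTION_EFFECTS = {"eat": {"energy": -4.0}}


class MotivationTracker:
    """Keeps an estimate of need intensities for the agent and for others."""

    def __init__(self, default_need: float = 5.0):
        self.default_need = default_need
        self.needs = {}  # agent name -> {need name -> intensity in [0, 10]}

    def on_event(self, subject: str, action: str) -> None:
        """Update the record of whichever agent performed the action."""
        record = self.needs.setdefault(subject, {})
        for need, delta in ACTION_EFFECTS.get(action, {}).items():
            current = record.get(need, self.default_need)
            # Clamp so that repeated events cannot leave the valid range.
            record[need] = min(10.0, max(0.0, current + delta))


tracker = MotivationTracker()
tracker.on_event("self", "eat")     # the agent's own energy need drops
tracker.on_event("other", "eat")    # the modelled need of another agent drops
tracker.on_event("other", "eat")
```

The same update rule is applied to the agent itself and to its models of others, so the later cultural appraisal and goal selection steps can query both in the same way.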
After the agent has updated its motivational state, it finally appraises the event. A “cultural appraisal” process was integrated into the reactive appraisal component of FAtiMA. This process is based on the idea that the appraisal variable praiseworthiness (one of the OCC appraisal variables) is culturally dependent. Indeed, as stated in Ortony et al. (1988), events with a positive praiseworthiness will potentially cause the character to feel pride or admiration, and events with a negative praiseworthiness will potentially cause the character to feel shame or reproach.
According to Markus and Kitayama (1991), people in an individualist culture appraise events in terms of their individual achievements and properties, whereas collectivists appraise events in terms of the group the person belongs to or the interpersonal relationships of the group. Collectivist cultures therefore try to avoid conflicts that would disrupt the harmony of the group. The extensions to FAtiMA-PSI incorporate a cultural appraisal process where the praiseworthiness variable calculated in appraising an event depends on the agent who caused the event and the impact that it has on the other characters. This means praiseworthiness is calculated differently for behaviours that involve others, depending on the degree of individualism or collectivism.
Thus, the more collectivist a culture is, the more an event that is undesirable for others but beneficial for the responsible character will be blameworthy (e.g., stealing something). In addition, the more an event is good for others, even if it is bad for the responsible character, the more praiseworthy it will be (e.g., giving food). By taking into account the benefits that an event has for the self (the agent) and for others (as modelled by the agent) according to the culture parameterisation, the agent reacts differently.
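The qualitative rule just stated can be illustrated with a small sketch. The linear weighting used here is an assumption chosen because it reproduces the two examples in the text; it should not be read as the formula used in FAtiMA-PSI. The function name and its parameters (`impact_self`, `impact_others`, and an individualism score `idv` from 0 to 100, with 0 the most collectivist) are likewise hypothetical.

```python
def praiseworthiness_sketch(impact_self: float,
                            impact_others: float,
                            idv: float) -> float:
    """Illustrative praiseworthiness of an event for its responsible agent.

    Assumption: the gap between the benefit to others and the benefit to
    the responsible agent is scaled by the degree of collectivism.
    """
    collectivism = (100.0 - idv) / 100.0
    return collectivism * (impact_others - impact_self)


# Stealing: good for the thief (+5), bad for others (-5).
steal_collectivist = praiseworthiness_sketch(5.0, -5.0, idv=20)
steal_individualist = praiseworthiness_sketch(5.0, -5.0, idv=80)

# Giving food: costly for the giver (-2), good for others (+5).
give_collectivist = praiseworthiness_sketch(-2.0, 5.0, idv=20)
give_individualist = praiseworthiness_sketch(-2.0, 5.0, idv=80)
```

Under this assumed weighting, stealing is more blameworthy and giving food more praiseworthy in the more collectivist setting, which is the pattern the text describes.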
If Ia(e) is the impact of an event on the agent that causes it, and if Io(e) is the sum of the impacts on other agents according to the first agent’s models of them, and IDV is the Hofstede degree of individualism (0–100; 0 more collectivist), then the praiseworthiness P(e) of an event e can be calculated as:
The link between culture and agent behaviour also impacts the agent’s goal selection. In cognitive architectures, goal utility is a value relating to how useful a goal is to the agent. For example, if an agent has the goal of drinking some water, then that goal’s utility rises as the agent becomes thirstier.
A cultural goal selection process was added that also calculates the expected cultural utility for each active goal. This is the expected impact the goal will have on the agent’s own motivational state and on the motivational state of the goal’s target (determined using the representation of that other agent). This allows the modelling of individualistic agents that are primarily concerned with themselves, and concerned for another agent only if they have a strong interpersonal attraction (symbolising a close bond) with them.
Collectivistic characters are, however, equally concerned with themselves and with others and treat everyone alike (regardless of social bonds). The details of how goal utility is calculated can be seen in Mascarenhas, Dias, Prada, and Paiva (2010). This link between culture and goal utility also allows us to capture the “power distance” dimension. It can be applied so that characters belonging to a high power distance culture favour goals that positively affect others with a higher status than themselves (Mascarenhas et al., 2010) by giving a higher utility to such goals.
The outcome of the cultural evaluation of goal utility is that different goals will be selected in the same situation by an agent depending on its cultural parameterisation. As a result, different actions will be carried out. For example, consider a scenario in which a sick character reports their sickness to some other characters. Using the mechanism just discussed, if the culture is highly individualistic, a character that has medicine but is not a friend of the first character will criticise them for complaining. Conversely, if the culture is highly collectivistic, the same character will promptly offer their medicine to help (Mascarenhas et al., 2010).
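The sick-character scenario can be reproduced with a short sketch of culturally weighted goal selection. The weighting scheme, the status bonus, and all numeric impacts below are assumptions made for illustration; the calculation actually used is the one given in Mascarenhas et al. (2010). The names `Goal`, `cultural_utility`, and `select_goal` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Goal:
    name: str
    impact_self: float               # expected effect on the agent's own needs
    impact_target: float             # expected effect on the target (as modelled)
    attraction: float = 0.0          # interpersonal attraction to target, 0..1
    target_status_gap: float = 0.0   # > 0 when the target outranks the agent


def cultural_utility(goal: Goal, idv: float, pdi: float = 0.0) -> float:
    """Assumed utility: self impact plus culturally weighted target impact."""
    individualism = idv / 100.0
    # Collectivists weight everyone fully; individualists weight others
    # only in proportion to interpersonal attraction.
    other_weight = (1.0 - individualism) + individualism * goal.attraction
    # High power distance adds utility to goals that benefit superiors.
    status_bonus = ((pdi / 100.0)
                    * max(0.0, goal.target_status_gap)
                    * max(0.0, goal.impact_target))
    return goal.impact_self + other_weight * goal.impact_target + status_bonus


def select_goal(goals, idv: float, pdi: float = 0.0) -> Goal:
    """Pick the active goal with the highest assumed cultural utility."""
    return max(goals, key=lambda g: cultural_utility(g, idv, pdi))


# A character with medicine hears a non-friend complain of sickness.
offer = Goal("offer_medicine", impact_self=-2.0, impact_target=6.0)
criticise = Goal("criticise", impact_self=1.0, impact_target=-1.0)

individualist_choice = select_goal([offer, criticise], idv=90).name
collectivist_choice = select_goal([offer, criticise], idv=10).name
```

With these illustrative numbers the highly individualistic character selects the criticising goal and the highly collectivistic one offers the medicine, and raising `attraction` to 1.0 makes the individualistic character help as well.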
Evaluating the Extended Model
As with the initial FAtiMA model, the culturally extended version can only be evaluated in the context of a specific application, in this case ORIENT, with actual settings for its large number of parameters. The same combination of expert advice followed by trial-and-error adjustment of the selected settings was used in ORIENT as in FearNot! Once the development team judged character behaviour to be acceptable, ORIENT was evaluated with groups of users.
Evaluation of a complex model embedded in a specific application is not an all-or-nothing affair but involves a set of evaluation metrics. One can pose questions relating to user perceptions of the characters in relation to themselves in the story world context, both in relation to a very unfamiliar cultural context and to the overall interaction experience. One can also evaluate whether the cultural model just discussed produces perceptibly different behaviour for users and whether they ascribe this to personality or to culture. Finally, the pedagogical objectives—desired changes in attitudes and/or behaviours—must be evaluated.
This article focuses on the modelling issues tackled in the development of FAtiMA and FAtiMA-PSI and so the first two types of evaluation are the most relevant to this discussion. Pedagogical evaluation of ORIENT has in any case not yet been carried out. There are logistical problems in evaluating an application in which three users jointly interact with a large screen in a real physical space using a variety of interaction devices (mobile phones, a dance mat, a Wii remote); these are outside the scope of this discussion (but see Lim et al., 2009). The pedagogical objectives would also need to be explained within the context of the Bennett (1993) stages model and space does not allow this here.
Four groups of three in Germany (N = 12, all female) and three groups in the UK (N = 9, six male, three female) were evaluated through a whole engagement session with ORIENT, using a set of questionnaires with 5-point Likert scales. The results were not of course statistically significant with these small numbers, but gave interestingly suggestive differences between German and UK subjects on some questions relating to perception of the Sprytes and their culture. While all German subjects felt Sprytes’ culture was “friendlier” than theirs, the majority of UK subjects thought it was “less friendly.” On the Intergroup Anxiety Scale (Stephan, Diaz-Loving, & Duran, 2000), while values were similarly high for adjectives comfortable, confident, at ease, entertained, interested, and happy, only UK subjects scored high for frustrated and concerned. Finally, the UK subjects rated the Sprytes low on the attribute consider you as an enemy/friend (i.e., Sprytes seemed hostile towards them), while the German subjects rated the Sprytes high.
Perception of cultural difference for different cultural parameter settings was evaluated in two studies, reported in Mascarenhas, Dias, Afonso, Enz, and Paiva (2009) and Mascarenhas et al. (2010). These did not use the ORIENT Sprytes, but humanoid figures with slightly archaic robes and hats, meeting together for a meal. These were presented in a noninteractive form as pairs of videos. The first study showed different rituals (Mascarenhas et al., 2009), where one video represented a culture with high power distance and the other a culture with low power distance. In the second study (Mascarenhas et al., 2010), one video had extreme individualist parameter settings, and the other extreme collectivist parameter settings. Questionnaires were used to see if users perceived a difference when the cultural parameters were varied.
The first study asked users to identify attributes that related to high and low power distance. Nearly all the sample characterised the cultures according to the power distance of the ritual that had been depicted. In response to a question asking whether culture or personality was the cause of the differences in behaviour between the two videos, 67% associated the differences with culture, 30% with personality, and 3% answered neither. In the second study, relating to the attributes of individualistic and collectivist cultures, nearly all users again showed an ability to distinguish between them correctly. Interestingly, however, they also characterised the collectivist culture as more hierarchical (higher power distance) than the individualist one. This suggests that the Hofstede dimensions are not completely orthogonal. The question about the source of the difference produced the opposite result from the first study: 63% associated the differences with personality, 30% with culture, and 7% with neither.
This result might be explained by the argument of Hofstede (2001) that behaviours associated with the cultural dimensions are implicit cultural manifestations, while rituals and symbols are explicit manifestations. However, it is not clear how far an observer of an unfamiliar culture is able to distinguish between ritual and nonritual behaviours. Both culture and personality are assessed as patterns of behaviour over time. An attribution to personality is an individually centred explanation of these patterns, while seeing them as culture is a socially centred view of behaviour. The dinner party scenario is an inherently social event in which there is a great deal of common behaviour in both the high and low power distance cases. The alteration of collectivist–individualist parameters in contrast produces more changes in individual behaviour. This may make it inherently more likely that an observer will interpret the differences as due to personality.
This contrasts with the described ORIENT study in which users showed no indications of differentiating the Sprytes by personality. However, Sprytes were not equipped with many variable expressive features—for example, they had no facial expression changes. They were equipped with a range of gestures, but these were deliberately chosen to be unfamiliar to human users. Though their behaviour was driven by the FAtiMA-PSI affective architecture, only the content of their utterances and their movement in space (advancing or retreating, for example) could provide emotional cues. Finally, rather than watching a video, ORIENT users interacted with Sprytes, mimicking the unfamiliar gestures, and trying to understand the meaning of culturally specific artifacts. Arguably this forced users to focus on social rather than individual aspects of behaviour.
Conclusions and Future Directions
In this article we have reviewed work taking certain theories from psychology and cultural modelling and bringing them together in computationally implemented models. These models have been developed for specific applications in which the affective engagement of the human user is the basis for a desired pedagogic effect. Inevitably this means that models originally developed from a descriptive or analytical perspective are applied generatively in order to produce behaviour rather than to analyse it.
The model adopted is inherently interactional, in that the moment-to-moment affective states generated within characters depend entirely on the events they perceive in their environment. While the number of affective states modelled and the thresholds and decay rates of each are givens for a specific character, we have already described how the intensity of specific affective states and the extent to which they result in actions are contingent and learned. Two different contributions can be made by this type of work, one to the theoretical field and one to the external world.
Once a computational model has been constructed, then its precision lays the basis for questions that the theorists may not have yet posed. In considering how to extend FAtiMA, we asked questions about the relationship between more physiologically based theories and cognitive appraisal theories that challenge both kinds of theory. In the same way, our detailed consideration of how to include culture in an affective architecture raises detailed questions about how culture impacts emotional responses: for example, are some cultures more prone to certain emotions, and do some cultures nurture particular types of temperament?
It may be, however, that the greatest impact of this style of work lies in the external world. As computational resources spread from the desktop into everyday human social environments via powerful handheld devices and the embedding of computational power into the environment, affective and culturally sensitive characters may become the new generic interface. We have noted the human tendency to project social-partner status onto autonomous graphical characters. If the models that sustain this engagement can take its weight, affective technology could be the currency of all our lives in the future.
