Abstract
The objective of this study was to determine the influence of alexithymia on the ability to identify emotions through visual and auditory stimuli. We assessed alexithymia using the Toronto Alexithymia Scale (TAS-20). As visual stimuli, we employed the images of faces from the Ekman 60 Faces Test, while the auditory stimuli consisted of fragments of instrumental music. A total of 303 students participated, 139 in secondary education and 164 in the first year of university (M = 17.58 years; SD = 4.16). The results show higher alexithymia levels in the female participants than in the male participants, mainly in the difficulty identifying feelings (DIF) and difficulty describing feelings (DDF) factors, and higher in the secondary students than in the university students, especially in externally oriented thinking (EOT). In terms of the identification of emotions through auditory stimuli, the EOT factor showed a strong predictive effect for the emotions of surprise and anger. For the visual stimuli, the EOT factor showed predictive validity for identifying happiness, while the DDF factor showed predictive validity for identifying sadness. We conclude that there is a relationship between alexithymia levels and emotion recognition, which varies depending on the nature of the stimulus.
Adolescence is often seen as a critical and vulnerable period (Gomes et al., 2016; Lees et al., 2020), particularly affected by a reorganization involving a multitude of risks and opportunities (Steinberg, 2005). During adolescence, a series of neurobehavioral changes linked to pubertal development occur, which also have a significant effect on emotions. In turn, these specific affective changes in this evolutionary stage are related to numerous changes in development (Dahl, 2004). From this perspective, adolescence is a decisive period during which neurobiological processes related to social and emotional behaviors mature, which include discriminating among emotional signals (Yurgelun-Todd, 2007). Furthermore, most individuals are able to identify and define the emotions evoked by distinct stimuli, although certain patient groups present difficulties in developing this ability, a phenomenon known as alexithymia. Taruffi et al. (2017) suggested that alexithymia, classically associated with a deficit in the emotion perception of facial and vocal expressions, is also linked to difficulties related to emotion perception through musical stimuli. Studies have found that alexithymia is a predictor in the recognition of facial, vocal, and musical emotions (Cook et al., 2013; Heaton et al., 2012). Nevertheless, Lyvers et al. (2020) argue that there is a positive relationship between emotional response to music and alexithymia, considering that alexithymic people can rely on music to experience emotions more fully. The relationship between alexithymia and emotional recognition through music therefore needs further research. Moreover, research on the relationship between alexithymia and emotion identification has generally been conducted with either visual or auditory stimuli, and mostly in the adult population.
As far as we know, there are no studies with adolescent populations that have evaluated the relationship between alexithymia and emotion identification, comparing the potential differences between visual facial and musical stimuli, which is the motivation for the present study.
The construct of alexithymia and its measurement by the Toronto Alexithymia Scale
Alexithymia is a personality trait characterized by an internal state lacking hope (Sifneos, 1973) combined with a thought pattern aimed at external details (Martínez-Sánchez, 1996; Parker et al., 1993), as well as difficulty in identifying and verbally describing emotions (Martínez-Sánchez, 1996; Parker et al., 1993). In other words, alexithymia manifests through a series of processes that yield deficient results in terms of the intrapersonal and interpersonal processing of emotions (Páez & Velasco, 1993). It is worth adding that alexithymia manifests in psychiatric and psychosomatic patients and among healthy individuals (Taylor & Bagby, 2004). Currently, there is growing interest in determining the aspects related to the influence of alexithymia (Kajanoja et al., 2017), perhaps because the rates of alexithymia are increasing in both the adolescent and adult population (Courty et al., 2015). In terms of age, a number of studies have found no significant association with alexithymia (Joukamaa et al., 2007), while other studies have found significantly higher scores in older adult patients (Mattila et al., 2006). Other studies have considered that alexithymia shows an attenuated tendency from adolescence to the entry into adulthood, a stage in which alexithymia levels undergo fewer variations (Säkkinen et al., 2007). Moral de la Rubia and Retamales Rojas (2000) found lower alexithymia levels in intermediate ages and higher levels in adolescent and older adult populations. From another perspective, Säkkinen et al. (2007), in a study with adolescents, found no differences in alexithymia in terms of gender, while Joukamaa et al. (2007) detected a higher rate of alexithymia in girls than in boys. Although men have been observed to show higher rates of alexithymia than women, the data on gender, like those on age, are inconclusive and inconsistent, which leaves the question open to debate.
Several studies to assess alexithymia levels have employed the Toronto Alexithymia Scale (TAS-20; Bagby et al., 1994). In a study of alexithymia using this scale in an adult population, Kooiman et al. (2002) found low internal consistency for the “externally-oriented thinking” (EOT) factor, as well as low and variable correlations between the factors “difficulty identifying feelings” (DIF) and “difficulty describing feelings” (DDF) on one hand and the EOT factor on the other, agreeing in this aspect with Loas et al. (2017). From this perspective, other studies conducted with adolescent populations using the bifactorial model (DIF–DDF) of TAS-20 showed a significantly greater fit than those studies performed with a three-factor structure, thereby increasing the homogeneity of the concept of alexithymia. Some studies conducted with children, adolescents and youths have indicated the low reliability of the EOT factor and have proposed suppressing EOT in future studies (Loas et al., 2017). Unlike the situation with the other two factors of alexithymia (DDF and DIF), individuals with higher rates of EOT present reduced physiological reactivity in response to film fragments that evoke sadness (Davydov et al., 2013). This finding is in line with the results of another study that showed an association between EOT and difficulties in identifying facial manifestations that express anger, sadness or fear (Prkachin et al., 2009). All of this evidence highlights the fact that the presence of alexithymia can predict problems in the setting of social relationships and the ability to identify and control one’s emotions (Leshem et al., 2019).
Alexithymia in relation to visual facial and musical auditory stimuli
Emotion perception is the ability to detect and interpret emotions in faces, images, voices, and cultural manifestations, which include music (Scherer & Scherer, 2011). For an emotion to be experienced or perceived, the stimulus that has generated it needs to be categorized through a conceptualization process. The emotions perceived through static stimuli (images) are easier to decipher than those that come from dynamic channels, as in the case of music (Scherer, 2004). Nevertheless, this debate generates controversy, given that other studies that assessed the subjective perception of emotions through music achieved similar results to those obtained with visual or vocal stimuli (Laukka et al., 2013), although Keltner and Cowen (2021) note that emotions such as joy and love are general, while others are specific to the modality in question. Furthermore, the intrinsic difficulties of alexithymia that affect the recognition of emotional expressions through visual and language stimuli (Parker et al., 1993; Prkachin et al., 2009) involve a deficit in the interpretation of external and internal emotional signals, especially of those emotions with negative valence (Prkachin et al., 2009) that can be extrapolated from music (Taruffi et al., 2017). Along these lines, Larwood et al. (2021) suggest that the perceptual deficits in alexithymia tend to be specific to the unpleasant emotions associated in the study with music. Based on the concept supported by Ekman and Friesen (2003) in which all individuals react emotionally to music, this emotional reaction when perceiving musical stimuli is considered easy to measure (Zatorre, 2015), develops over time, continuously produces expectations, is a complex stimulus that rarely generates a single emotional perception (Mallik et al., 2017) and frequently manifests intermediate emotions, consisting of other primary origin emotions (Larsen & Stastny, 2011).
The emotions can easily be cataloged by hearing musical fragments (Juslin & Laukka, 2004); however, the ability to identify emotions through these stimuli appears to be specific to certain emotion categories. The accuracy in identifying emotions, therefore, varies based on the emotions under study. This fact indicates that certain musical stimuli seem to have pancultural elements that are more determinant than others (Argstatter, 2015). A number of emotions such as happiness and sadness, when encoded musically, are identified more easily than others (Laukka et al., 2013), perhaps due to a combination of simple musical parameters, while other emotions such as anger and surprise are more difficult to evaluate because they encompass more complex musical patterns (Eerola & Vuoskoski, 2013). More specifically, surprise is an emotion that is difficult to decipher interculturally through music and is frequently confused with happiness, due probably to the close similarity of the musical characteristics between the two emotions (Argstatter, 2015), such as the frequent association between surprise and an immediate and unexpected event with pleasant characteristics for the recipient.
Furthermore, the unpleasant emotions of fear, revulsion, and anger often cause bewilderment (Argstatter, 2015), creating confusion in the identification of these emotions through musical stimuli (Argstatter, 2015; Eerola & Vuoskoski, 2011). The complexity of the musical stimulus and the multiple modalities of the elements that make it up can significantly affect the emotional perception of music and, as a result, its identification.
Purpose of the study
The primary objective of this study is to determine whether individuals with higher alexithymia scores show greater difficulty in identifying and describing emotions evoked by both facial visual stimuli and auditory musical stimuli, and whether the emotion processing of the two types of stimuli follows similar patterns.
Material and methods
Participants
A total of 303 students participated in the study, 139 secondary education students and 164 first-year university students, without applying additional exclusion criteria. Their mean age was 17.58 years (SD = 4.16), and 66% were female.
Stimuli and measures
Musical fragments
A professional composer created six unedited musical fragments, each 24–33 s long, aimed at evoking the six primary emotions, with the musical characteristics indicated in Table 1. To corroborate that the compositions provoked the emotions under consideration, the compositions were submitted to a panel of experts formed by 50 music specialists, who determined whether each melody produced the emotion under consideration. The level of agreement was very high for all emotions except Disgust: Sadness 96%, Fear 88%, Surprise 86%, Anger 74%, Happiness 72%, and Disgust 38%. The discrepancy in Disgust was discussed with both the composer and the experts. The conclusion reached was that the composition could be valid, although the emotion was difficult to evoke, so it was decided to keep it.
Table 1. Characteristics of the Musical Fragments Employed as Stimuli.
Facial expressions
For this section, we employed the Ekman 60 Faces test (EK-60C; Young et al., 2002).
Alexithymia
To determine the effective alexithymia levels in the sample, we administered the TAS-20.
Procedure
We opted for the free identification of emotions as, in line with Cowen et al. (2020), the aim was to obtain spontaneous rather than directed answers. Moreover, this procedure (rather than one involving a closed choice) allows us to evaluate the possibility that the same stimulus could generate very different emotions (Swaminathan & Schellenberg, 2015).
First, after listening to each musical fragment the participants had 20 s to specify the emotion evoked. The answers were classified by categories, starting with the emotional states with the highest scores. Those that were not mentioned by more than 5% of the sample were not counted. Next, the participants completed the TAS-20. Finally, they were asked to identify each of the emotions expressed in the facial images of the EK-60C test. The procedure took approximately 20 min and was conducted during class periods in the reference classroom. The order in which the stimuli were presented was the same for all the participants, and there was a 3-min pause between each activity.
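The categorization step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `frequent_responses`, the synonym map, and the example answers are hypothetical, and the 5% threshold is applied here to the share of all answers given to one fragment, which may differ from how the threshold was computed in the study.

```python
from collections import Counter

# Hypothetical synonym map (assumption): close synonyms are merged into one label.
SYNONYMS = {"peace": "serenity"}


def frequent_responses(responses, min_share=0.05):
    """Return {label: share} for labels given by more than min_share of respondents.

    Labels are lower-cased, stripped, and mapped through SYNONYMS before counting.
    The result is ordered from the most to the least frequent label.
    """
    normalized = []
    for raw in responses:
        label = raw.strip().lower()
        normalized.append(SYNONYMS.get(label, label))

    total = len(normalized)
    counts = Counter(normalized)
    return {
        label: count / total
        for label, count in counts.most_common()
        if count / total > min_share
    }


# Invented free-text answers to a single fragment (not data from the study).
answers = ["serenity"] * 10 + ["Peace"] * 2 + ["happiness"] * 6 + ["nostalgia", "boredom"]
kept = frequent_responses(answers)
print(kept)
```

In this invented example, "nostalgia" and "boredom" each account for exactly 5% of the answers and are therefore dropped, because the rule keeps only labels mentioned by more than 5%.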
Ethics approval
Participation was voluntary. For participants under 18, an informed consent document was signed by their families or legal guardians. The protocol was approved by the Ethical Committee of the Faculty of Psychology of the Complutense University of Madrid (Pr_2019_20_045) and was conducted in accordance with the requirements and ethical principles set out by the Declaration of Helsinki.
Data analysis
Factorial structure of the TAS-20 scale for adolescents
We tested the three-factor structure through a confirmatory factor analysis (CFA) with a robust weighted least squares mean- and variance-adjusted (WLSMV) estimation method (see Figure 1), obtaining an adequate goodness of fit as shown by the following indices: χ²(167) = 343.83, p < .001; root mean square error of approximation (RMSEA) = .060 (.051; .069); comparative fit index (CFI) = .95; Tucker–Lewis index (TLI) = .94; and standardized root mean square residual (SRMR) = .07. The recommended cut-off points for an adequate fit are CFI and TLI > .90 (Bentler, 1990) and RMSEA and SRMR < .08 (Hu & Bentler, 1999).
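As a rough plausibility check on these indices, the RMSEA point estimate can be approximated from the reported chi-square, its degrees of freedom, and the sample size. The sketch below uses the standard textbook formula and assumes that all 303 participants entered the analysis. The robust estimator used in the study works with a scaled statistic, so the value reported by the software can differ slightly from this approximation. The function names are illustrative only.

```python
import math


def rmsea(chi2: float, df: int, n: int) -> float:
    """Textbook RMSEA point estimate: sqrt(max(chi2 - df, 0) / (df * (n - 1)))."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def fit_is_adequate(cfi: float, tli: float, rmsea_value: float, srmr: float) -> bool:
    """Apply the cut-offs cited in the text: CFI, TLI > .90 and RMSEA, SRMR < .08."""
    return cfi > 0.90 and tli > 0.90 and rmsea_value < 0.08 and srmr < 0.08


# Values reported in the text; n = 303 is an assumption about the analysis sample.
approx = rmsea(chi2=343.83, df=167, n=303)
print(round(approx, 3))

# Check the reported indices against the recommended cut-offs.
print(fit_is_adequate(cfi=0.95, tli=0.94, rmsea_value=0.060, srmr=0.07))
```

The approximation comes out at about .059, close to the reported .060, and all four reported indices satisfy the cited cut-offs.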

Figure 1. Path Diagram of the Three-Dimensional Factor Model.
In Figure 1, we can see a strong correlation between the DIF and DDF factors, while the EOT factor has a weak and inverse relationship with DDF and a practically null relationship with DIF. This result implies that the TAS-20 is a multidimensional instrument in which the factors should be studied separately and that using an overall measure for alexithymia is inadequate.
Invariance analysis
Measurement invariance was tested using multigroup CFA. The invariance analysis performed for gender and educational stage (secondary vs. university) revealed strong invariance in both, although there were differences in latent means (Table 2).
Table 2. Invariance Analysis.
Note. CFI = comparative fit index; RMSEA = root mean square error of approximation.
The mean alexithymia scores were higher for the female participants (M = 52.18; SD = 12.51) than for the male participants (M = 49.47; SD = 10.74). The secondary education students scored higher for alexithymia (M = 53.81; SD = 11.76) than the university students (M = 49.03; SD = 11.76). Although the difference in terms of gender was associated with the DIF and DDF factors, the difference between the secondary education and university students was observed in the EOT factor (Figure 2).

Figure 2. Differences by Sex and Educational Level in the Alexithymia Factors.
Predictive validity of alexithymia factors in identifying emotions
Table 3 shows the emotions reported by the participants in response to the visual facial stimuli. As the heterogeneity of responses was low, it was not necessary to categorize them into more general groups, except in the case of some synonyms such as peace and serenity.
Table 3. Rates of Identification of Emotions with Music and Faces.
Note. Reported answers with percentages >5% are shown.
Table 3 demonstrates that facial emotions are easier to identify than musical emotions. It was particularly difficult for participants to identify happiness, which they confused with serenity or a feeling of peace. In fact, the interpretation of emotions varies clearly between responses to facial and musical stimuli. For example, in the case of musical stimuli, emotions as different as sadness and joy are identified as serenity, which does not happen with visual facial stimuli.
To assess the relationship between emotion recognition and alexithymia levels, we divided the total alexithymia score into terciles to define low, middle, and high scoring levels. We subsequently performed a multinomial logistic regression, both for the six facial visual stimuli and for the six auditory stimuli (musical fragments). We observed that the predictive validity for identifying the musical fragments was greater (Cox–Snell pseudo-R² = .58, Nagelkerke’s R² = .65, and McFadden’s R² = .39) than for the visual facial stimuli (Cox–Snell R² = .37, Nagelkerke’s R² = .42, and McFadden’s R² = .21). For the faces, the effects of the variables were not statistically significant; in the case of the musical fragments, we observed that, together with the age variable (p = .011), the identification of the musical emotions of anger (p = .022) and surprise (p = .008) has a statistically significant effect in predicting alexithymia (Table 4).
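The three pseudo-R² statistics named above are all functions of the log-likelihoods of the null (intercept-only) and fitted models. The following sketch shows how they relate. The log-likelihood values are illustrative assumptions, not figures from the study: the null value assumes three equally sized tercile groups, and the fitted value is invented for the example.

```python
import math


def pseudo_r2(ll_null: float, ll_model: float, n: int) -> dict:
    """Cox-Snell, Nagelkerke, and McFadden pseudo-R2 from two log-likelihoods."""
    # Cox-Snell: 1 - (L_null / L_model)^(2/n), written with log-likelihoods.
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    # Nagelkerke rescales Cox-Snell by its maximum attainable value.
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    return {
        "cox_snell": cox_snell,
        "nagelkerke": cox_snell / max_cox_snell,
        "mcfadden": 1.0 - ll_model / ll_null,
    }


n = 303                        # assumed analysis sample size
ll_null = n * math.log(1 / 3)  # assumption: three equally sized outcome groups
ll_model = -201.4              # hypothetical fitted log-likelihood

r2 = pseudo_r2(ll_null, ll_model, n)
for name, value in r2.items():
    print(f"{name}: {value:.2f}")
```

With three equally likely outcome categories, the Cox–Snell statistic cannot exceed 1 − 1/9 ≈ .89, which is why the Nagelkerke value is always somewhat larger than the Cox–Snell value for the same model.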
Table 4. Likelihood Ratio Test for Predicting Alexithymia in Relation to Emotion Identification Through Music, by Sex and Age.
To evaluate the predictive validity of alexithymia factors in identifying the emotions evoked through visual and auditory stimuli, we applied a receiver operating characteristic analysis for each of the six emotions. In terms of the facial visual stimuli (faces), Figure 3 shows that the EOT factor has predictive validity for identifying facial expressions that manifest happiness (area under the curve [AUC] = .68, p < .001), while the DDF factor showed marginally significant predictive validity for identifying sadness (AUC = .59, p = .059).
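For readers less familiar with the AUC, it can be read as the probability that a randomly chosen participant from one group has a higher factor score than a randomly chosen participant from the other group, with ties counted as one half. The sketch below computes it by direct pairwise comparison. The scores are invented, and treating participants who missed the emotion as the "positive" group is an assumption made only for illustration.

```python
def auc_from_scores(scores_pos, scores_neg):
    """AUC as P(pos > neg) + 0.5 * P(pos == neg), over all positive-negative pairs."""
    wins = 0.0
    for p in scores_pos:
        for q in scores_neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical factor scores (not data from the study).
missed_emotion = [24, 27, 22, 30]          # participants who did not identify the emotion
identified_emotion = [18, 22, 20, 25, 16]  # participants who identified it

auc = auc_from_scores(missed_emotion, identified_emotion)
print(auc)
```

An AUC of .50 corresponds to no discrimination between the groups, so values such as .59 to .68 indicate modest separation.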

Figure 3. Receiver Operating Characteristic Curves for Happiness (Left) and Sadness (Right) for the Emotions Expressed by the Faces.
In terms of identifying the emotions expressed by the musical fragments, Figure 4 shows predictive validity for the EOT factor for the emotion of surprise (AUC = .62, p = .007) and marginally significant predictive validity for the EOT (AUC = .60, p = .085) and DIF (AUC = .60, p = .083) factors for the emotion of anger.

Figure 4. Receiver Operating Characteristic Curves for Surprise (Left) and Anger (Right) for the Emotions Evoked by the Music.
Discussion
The objective of the present study was to determine the relationship between alexithymia and emotion recognition using visual (facial) and auditory (musical) stimuli. Although adult males usually show higher alexithymia levels (Mattila et al., 2006), the results for adolescents and youths are more controversial. This study showed significant differences between men and women in their alexithymia scores, with women scoring higher than men. This result does not agree with previous studies that have shown a higher tendency among males (Mattila et al., 2006) or with studies that have found no differences in alexithymia in terms of gender (Säkkinen et al., 2007). Nevertheless, the results of this study are in line with those of another study conducted with an adolescent population that observed higher alexithymia scores in the girls than in the boys (Joukamaa et al., 2007).
In terms of the age variable, the results of our study agree with those obtained by Moral de la Rubia and Retamales Rojas (2000), given that higher rates of alexithymia were found in younger students than in university students, showing an attenuating tendency for alexithymia as age increases, as has been indicated by other studies (Säkkinen et al., 2007). A possible reason for the discrepancy between studies is the treatment of alexithymia as a single scale or one-dimensional measure. The results of our study show that alexithymia in adolescents and youths should be assessed as a multidimensional construct. While the difference in terms of gender was associated with the DIF and DDF factors, the difference between secondary education students and university students was evidenced in the EOT factor. Thus, the differences in gender are associated with a greater or lesser ability to identify and describe emotions, while the difference between adolescents and university youths is established at the thinking level, which is probably related to the cognitive development level.
To a great extent, these results are consistent with those of other studies that have indicated that the factors DIF and DDF of the TAS-20 are strongly correlated, while the factor EOT has a weak and inverse relationship with DDF and practically null with DIF (Kooiman et al., 2002). Based on these results, it is plausible to consider that the TAS-20 is a multidimensional instrument in which each factor should be studied separately and that it is not appropriate to use a single overall measure of alexithymia. This conclusion agrees with Loas et al. (2017) who conducted a study with an adolescent population and discovered that the bifactorial model of TAS-20 (DIF–DDF) showed a significantly better fit than those studies performed with a three-factor structure. From this line of analysis, the results of this study are related to those of other studies that established the limitations of the EOT factor in the overall evaluation of alexithymia with this tool (Loas et al., 2017).
With respect to the identification of emotions evoked through facial stimuli, we observed that the EOT factor has predictive validity for identifying facial expressions that manifest happiness, while the DDF factor has a marginally significant capacity for predicting the identification of sadness and fear. In terms of identifying the emotions expressed by the musical fragments, the EOT factor was predictive of the emotions of surprise and anger, although DDF and DIF have certain predictive weight in the emotion of anger. Unlike the situation with the other two factors of alexithymia (DDF and DIF), individuals with higher scores on the EOT factor showed reduced physiological reactivity in response to auditory stimuli (musical fragments) that evoke sadness (Davydov et al., 2013). This finding is in line with the results of another study that showed the relationship between the EOT factor and difficulties identifying the facial expressions of anger, sadness, and fear (Prkachin et al., 2009). According to this interpretation, perhaps the participants with higher scores on the EOT factor achieved poorer results in the ability to recognize emotions through music due to the existing pattern of avoidance of emotion information processing, given that the EOT factor, which is characteristic of alexithymia, significantly influences the recognition of musical emotions and can predict low performance in tasks of recognizing emotions in music (Taruffi et al., 2017).
With respect to the identification of emotion through musical fragments, the participants in the sample generally recognized the emotions transmitted by the music successfully, although certain emotion categories, such as fear and anger, were frequently confused, especially when anger was the emotion being studied. This finding coincides with those of other investigations (Eerola & Vuoskoski, 2011). Our study also found a second source of confusion for the emotion categories of revulsion and surprise evoked by music (see Table 3). As expected, the emotion revulsion was the least successfully recognized in the music compared with other primary emotions such as sadness, as shown by the results obtained by Argstatter (2015). In the case of the musical fragment composed with the intent to evoke happiness, most participants opted for serenity, probably due to the low arousal of the composition, primarily because of the moderato tempo. In this respect, our finding coincides with the results of other research (Droit-Volet et al., 2013; Gagnon & Peretz, 2003).
These results appear to be novel when reviewing the current literature. Regarding emotion identification using musical fragments, we observed that the EOT factor of alexithymia exerted a strong predictive effect for the emotions of surprise and anger. Similarly, in the case of visual stimuli, the EOT factor showed predictive validity for identifying the emotion happiness, while the DDF factor showed predictive validity for identifying sadness. The fact that we found patterns of connectivity between the factors that determine the level of alexithymia and emotion recognition through musical fragments and the fact that these and the linked emotions change according to the visual or auditory modality of the stimulus could be explained by the evidence that reveals the involvement of various brain pathways in the regions of the brain responsible for perception, identification, evaluation, coding, emotional regulation and emotional response (van der Velde et al., 2013). The analysis also shows that the relationship between alexithymia and emotion identification through facial stimuli manifests in the emotions of happiness and sadness, while this relationship through musical stimuli manifests in the emotions of surprise and anger. The processing pathway for the two types of stimuli is therefore likely to be different, as suggested in certain research (Donges & Suslow, 2017; van der Velde et al., 2013). In the case of the facial stimuli, the specific weight lies mainly on the axis of valence, while in the case of the music, the axis of arousal is activated to a greater extent (Salimpoor et al., 2009). Our results are consistent with other studies that rely on the hypoarousal model of alexithymia (Neumann et al., 2004; Wehmer et al., 1995). From this perspective, people with high levels of alexithymia may find it necessary to resort to external stimuli that generate the necessary arousal to fully experience different emotional states.
We can, therefore, conclude that alexithymia affects not only the perception of visual stimuli and of language but also affects music sensitivity, as indicated by Taruffi et al. (2017). This highlights the need for further research on the representation of emotions in music and its relationship with alexithymia and other related constructs such as anhedonia, the difficulty in properly processing emotional stimuli and reward processes. In light of the results, there could be abnormalities in certain brain structures related to emotion processing (van der Velde et al., 2013). From this line of analysis, it is important to note that the processes of emotion perception can be supported by other more general mechanisms related to emotional sensitivity, which are necessary for adequate social functioning (Taruffi et al., 2017). Finally, as suggested by the literature (Keltner & Cowen, 2021; Laukka et al., 2013; Scherer, 2004), the results obtained may lead to a better understanding of a consistent perception of stimuli across different modalities, and allow more significant perceptive experiences.
Practical implications
The presence of alexithymia can hinder the development of social skills during adolescence (Honkalampi et al., 2009; Iannattone et al., 2021); therefore, if the approach of any educational system is to train individuals comprehensively and prepare them for society, activities need to be promoted through educational proposals that are related to the study of the emotional and affective needs of adolescent students. Furthermore, this type of study introduces other professionals of education and music therapy-related settings to the manner in which music can help in the cognitive and emotional development during this evolutionary stage. In short, these types of interventions can play an essential role in regulating the intrinsic physiological reactions of alexithymia and improve skills in educational and social contexts. Moreover, the results obtained have practical implications in specific areas such as music therapy, where more effective intervention strategies can be designed; since by modifying the musical parameters that influence emotional perception, the interventions may be adjusted in real time to patient needs (Droit-Volet et al., 2013; Gagnon & Peretz, 2003). Similarly, the multimodal use of supplementary information from visual and aural stimuli as a way of perceiving information previously known by one sense as completely as possible can have a practical application: for example, by the use of pictograms and music for people with special educational needs. Thus, multimodal perception could be used in a variety of different educational and therapeutic contexts.
Limitations
As the presented results are based entirely on nonclinical samples, it would be interesting to compare these data with, and conduct direct tests on, the perception of emotions evoked by music in samples diagnosed with alexithymia. Furthermore, it would be useful to design interventions that address the three factors evaluated by the scale (DIF–DDF–EOT), both in clinical and nonclinical populations, in which the TAS-20 scores can be assessed before and after the interventions aimed at reducing the levels of alexithymia.
Footnotes
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: the Santander-UCM “Evaluación de la Inteligencia Emocional a partir de Secuencias Cinematográficas” (grant number PR87/19-22661).
