Abstract
Diverse emotional responses to music are often cited as evidence that major and minor modes are perceived categorically. Yet it is uncertain whether this categorical performance is independent of emotion tagging. This study therefore adopted a direct measure with properly controlled stimuli to re-examine the categorical nature of major and minor modes across ages. Results showed that, except for the elderly male group, untrained participants in all three age groups performed better than chance in the categorisation task. Their above-chance performance might therefore suggest an implicit conceptualisation of major and minor. Suggestions regarding hearing ability and item validity are also made so that the performance of the elderly and the potential performance difference between the sexes can be further interpreted.
Although the birth of the first piece of music is unlikely ever to be traced, music has certainly been closely bonded with human societies for a very long time. From mating choices to religious rituals, music serves irreplaceable functions in smoothing the processes and intensifying the emotional experiences of these activities (Miller, 2000; Nattiez, 1990). People may feel energetic when listening to the theme song of ‘Rocky’, or experience fear when listening to the classical background music in Hitchcock’s ‘Psycho’. The relationship between music and emotion is complex, and the induced emotions can vary with different combinations of tempo, dynamics, and phrasing (Baugh, 1993; Tagg & Clarida, 2003). In general, tempo and major/minor are two of the most researched musical features associated with emotions (e.g., Balkwill & Thompson, 1999; Gagnon & Peretz, 2003; Kamien, 2008; Parncutt, 2014). A positive relationship between tempo and emotional valence has been widely validated (e.g., Balkwill & Thompson, 1999; Liu et al., 2018). Some findings have even suggested that tempo is more salient than mode in inducing emotions (Gagnon & Peretz, 2003, Experiment 1), though the evidence is still inconclusive (Gagnon & Peretz, 2003, Experiment 2; Peretz et al., 1998). Similarly, listeners’ emotions can be induced by major/minor, as evidenced by studies using triads, which are combinations of three musical notes (Howard, Rosen, & Broad, 1992), modes, which are strings of consecutive notes from the base note to its octave (Kastner & Crowder, 1990), and chords (Pallesen, Brattico, & Carlson, 2003). Furthermore, the emotion induction effect of major/minor has been verified in different cultures (Fang, Shang, & Chen, 2017; Fritz et al., 2009), age groups (Bonetti & Costa, 2019; Kastner & Crowder, 1990), and people with autism (Heaton et al., 1999; Kopec, Hillier, & Frye, 2014).
Emotion Induction Effect of Major/Minor
Emotions can be induced and expressed according to some general principles that can be observed in different perceptual domains (Damasio, 2018). For example, people prefer regularity over uncertainty (Shermer, 2011). When they are placed in an uncertain situation or a situation with low control, stress and anxiety, which are negative in emotional valence, are commonly experienced (Brosschot, Verkuil, & Thayer, 2016). Similarly, in music perception, uncertainty can be observed in the minor scale. Compared with the major scale, the minor scale has less predictable patterns and numerous ambiguous roots, which possibly induce negative emotions in listeners (Parncutt, 2014). In terms of expression, negative emotions, except anger, are usually associated with a low activity level (Darwin, 2009; Lo, Li, Lee, & Yeung, 2018; Russell & Barrett, 1999). This association between low energy and low frequency also fits findings from studies of speech sounds and music. Speech sounds connoting sadness usually contain lowered pitch (Parncutt, 2014), which echoes the slightly lower pitches found in music in minor compared with music in major (Huron, 2008). In other words, the correspondence between emotions and major/minor may not be completely arbitrary. Humans are possibly equipped with a set of algorithms for associating music (e.g., major/minor) with specific types of emotional feeling. Research on brain activity in the amygdala also showed distinctive brain wave patterns when music in major and minor chords was presented (Pallesen et al., 2003). The ventral striatum, which is responsible for the feeling of satisfaction, reached higher activation when participants listened to music in major keys than in the control condition (Mitterschiffthaler et al., 2007).
With these possible structural dispositions, children as young as three years old showed an understanding of the affective meaning of music by pointing to the emotional expressions corresponding to the keys of the music being heard (Cunningham & Sterling, 1988; Kastner & Crowder, 1990). The association between emotion and major/minor chords is also gradually strengthened from age 2 to 6 (Bonetti & Costa, 2019). In addition, social learning affects the emotional experience of music listening (Howard et al., 1992). Soundtracks in major keys are commonly paired with scenes showing joy and hope, whereas a melancholic atmosphere can be further strengthened with music in minor keys. Such rules are commonly practised in movies and TV dramas, which are powerful tools in shaping the emotional experience of music (Cohen, 1990; Hevner, 1935).
Non-Emotional Perception
A strong correspondence between emotion and major/minor is sometimes taken as evidence of categorical perception of major and minor (e.g., Goldstone & Hendrickson, 2010; Kamien, 2008). Nevertheless, it is uncertain whether these categorical performances were independent of the emotional elements. Even when the valence level of the stimuli was not controlled, non-musician participants who were asked to judge the similarity of two pieces of music on any basis they chose did not rely mainly on the major/minor feature (Halpern et al., 1998). Furthermore, non-musicians were able to discriminate major/minor melodies with emotion labels (Leaver & Halpern, 2004). Yet their performance only reached chance level in a mode discrimination task without emotion tagging and improved only slightly after short-term training. Their poor performance was possibly explained by a lack of the corresponding mental representations for the music stimuli (Leaver & Halpern, 2004). Similarly, in a colour categorisation task, participants who did not have specific vocabulary for blue and green did not conceptually differentiate the two colours (Roberson, Davidoff, Davies, & Shapiro, 2005). Major/minor is terminology developed in the Western tonal music system. It is therefore possible that an individual does not know about this classification without acquiring the specific representation for major/minor through proper training. Mere exposure alone, on the other hand, may only influence one’s affective judgement of or preference for music (Peretz et al., 1998) and may not be adequate to enhance the core understanding of the corresponding music terminology.
Categorisation does not necessarily come with linguistic labels. In categorisation studies with young children who have no or very limited verbal skills, researchers usually observe how their participants assign objects to different groups or discard the odd one from a group of similar objects (e.g., Mak & Vera, 1999). In order to minimise the linguistic loading, a similar methodology was adopted in this study for the untrained participants, who had no or very little music vocabulary.
Control of Parameters
Music stimuli with valence control are important for examining the categorical ability of untrained participants with minimal influence from emotional elements. The present study therefore created its own set of music stimuli in order to control not only the valence level but also other musical parameters, e.g., tempo. People generally feel more energetic and are more highly aroused by music in a fast tempo than in a slow tempo (Fernández-Sotos, Fernández-Caballero, & Latorre, 2016). When music in a fast tempo is heard during sport activities, the induced refreshing and encouraging experiences are usually regarded as positive (Szabo & Hoban, 2004; Waterhouse, Hudson, & Edwards, 2010). Hence, composing a dedicated set of music stimuli can minimise the variability of musical parameters that may divert the design from its original goal.
On the other hand, some research took a minimalistic approach, achieving tight control over unrelated musical parameters by presenting only triads (i.e., combinations of three musical notes) as the sole stimuli to examine the emotion induction effect of major/minor (Cook & Fujisawa, 2006; Fujisawa & Cook, 2011). Although the experience of hearing triads can be quite different from what laymen generally expect of music, findings from this kind of study provide a basis for further investigation of major/minor categorisation with stimuli that are closer to daily music listening experiences.
Tonality refers to a basic musical system that includes the arrangement of pitches and chords in a piece of music or a song (Lerdahl & Jackendoff, 1987). With this format, listeners can be fully aware of major and minor keys (Crowder, 1984). Furthermore, compared with a triad or a mode, music stimuli in the form of tonality, which were adopted in the present study, can facilitate a holistic processing style that is also close to the type of music listening experience that untrained people would expect (Reybrouck, 1997).
The Present Study
Through investigations of the association between emotional valence and major/minor over the past few decades, major and minor have come to be regarded as categorically perceived. In order to further examine whether the categorical perception of major/minor can be largely independent of emotional elements, music stimuli with controlled emotional valence were composed for this study. With this set of newly composed stimuli, other musical parameters could also be controlled. The stimuli were composed in the form of tonality, which aligns with the general impression of music held by people with no music training background.
If major and minor modes can be categorically perceived, the present study further predicted that there would be no significant performance difference in the categorisation task among different age groups. Children’s performance in categorising major and minor, as reflected by their emotional responses, has been reported in different studies in the literature (e.g., Kastner & Crowder, 1990; Trainor & Heinmiller, 1998). Yet no reviewed study attempted to examine discriminative performance in a valence-controlled condition for both children and the elderly. In order to construct a developmental account of the categorical perception of major and minor modes, this study recruited participants in three different age groups (i.e., children, adults, and elderly) to take the same set of tests. Variations in task difficulty and format could therefore be largely minimised, making a fair performance comparison among these age groups possible.
Methodology
Participants
Ninety-four participants in three age groups were recruited for this study. Twenty children aged 4 to 5 years (female: 8; male: 12) were recruited from a kindergarten in Hong Kong. Thirty-nine undergraduates aged 18–23 (female: 22; male: 17) were from Hong Kong Shue Yan University. Lastly, thirty-five elderly participants aged 60 to 73 (female: 19; male: 16) were recruited via personal networks and snowball sampling. None of the participants had received any formal music training, i.e., they were untrained listeners. They had no knowledge of the Western tonal music system. Music listening history and preference, however, were not controlled in this study. None of them reported any visual or auditory defect that would affect their performance in this study.
Materials
Eighteen music stimuli were composed for this study. Half of them were in major keys (i.e., E major, G major, and C major), whereas the other half were in minor keys (i.e., A minor, F minor, and D minor). All of them lasted around 7 s and shared the same tempo, i.e., 80 quarter notes per minute. They also shared the same note intervals and chord progressions (see Fig. 1). All stimuli were created with the notation software MuseScore.
A group of 12 raters was recruited to evaluate the acceptability of each stimulus on a 7-point scale, where ‘1’ represented ‘the least likely to regard it (i.e., the music stimulus) as music’ and ‘7’ represented ‘the most likely to regard it as music’. Over 65% of raters chose “5” or above for all the music stimuli. Since there were more than two raters, weighted kappa, which applies to a pair of raters, was not used to measure the reliability of the raters’ responses. Instead, Kendall’s tau-b correlation showed a positive relationship among the acceptability responses of all 12 raters (tau-b = 0.477, p < 0.05).

Fig. 1. Sample of the stimulus used in trial 1.
The same group of raters was also asked to evaluate the emotional valence of each music stimulus on a 7-point scale, where ‘1’ represented the lowest level of pleasantness and ‘7’ represented the highest. Over 70% of raters chose “3” or “4” for all the music stimuli. A significant positive relationship among the rating responses was also found (tau-b = 0.501, p < 0.05).
After the rating tests, the eighteen stimuli were arranged into six trials so that each trial contained three different music stimuli. The small number of trials was based on feedback from pilot tests, in which children and elderly participants might not stay fully engaged with a large number of trials. The six trials were divided into two groups, i.e., major dominant and minor dominant. In the major dominant group, each trial consisted of two music stimuli in major keys and one in a minor key. The same rationale was applied in the minor dominant group, with the ratio of major to minor stimuli reversed. An example of a trial in the minor dominant group can be found in Fig. 1. Moreover, the two major (or minor) stimuli within a single trial of the major (or minor) dominant group were in different keys. This design minimised the chance that a correct categorisation resulted simply from matching two stimuli with identical keys rather than from the ability to group the same kind of mode together. The acoustic difference among the keys in each of the six trials was achieved by chromatic transposition. Stimuli in trials 1 and 6 were transposed upward, whereas stimuli in trials 3 and 5 were transposed downward. Stimuli in trials 2 and 4 were transposed both upward and downward. See Table 1.
Table 1. Distribution of the Stimuli Across Six Trials
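The trial composition described above can be checked with a short sketch. The pairing of keys below is a hypothetical assignment for illustration only and is not the actual allocation reported in Table 1; the function and field names are likewise assumptions. The sketch builds three major dominant and three minor dominant trials, keeps the two same-mode stimuli in different keys, and counts how many major and minor stimuli the design uses.

```python
from collections import Counter
from itertools import combinations

# Keys named in the Materials section.
MAJOR_KEYS = ["E major", "G major", "C major"]
MINOR_KEYS = ["A minor", "F minor", "D minor"]


def build_trials():
    """Return six odd-one-out trials (hypothetical key pairing, not Table 1)."""
    trials = []
    # Major dominant: two major stimuli in different keys plus one minor stimulus.
    for pair, odd in zip(combinations(MAJOR_KEYS, 2), MINOR_KEYS):
        trials.append({"group": "major dominant", "same": list(pair), "odd": odd})
    # Minor dominant: the ratio of major to minor stimuli is reversed.
    for pair, odd in zip(combinations(MINOR_KEYS, 2), MAJOR_KEYS):
        trials.append({"group": "minor dominant", "same": list(pair), "odd": odd})
    return trials


def mode_of(key):
    """Extract the mode from a key name, e.g. 'E major' -> 'major'."""
    return key.split()[1]


def is_correct(trial, chosen_key):
    """A response is correct when the odd-mode stimulus is picked."""
    return chosen_key == trial["odd"]


trials = build_trials()
mode_counts = Counter(mode_of(k) for t in trials for k in t["same"] + [t["odd"]])
print(len(trials), dict(mode_counts))
```

Under this assumed pairing the six trials use nine major and nine minor stimuli, which agrees with the eighteen stimuli described in the text.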
Procedure
The experiment was conducted in a quiet place. Participants sat in front of a 10” touchscreen. In each trial, three identical gift boxes labelled 1 to 3 were placed horizontally on the screen (see Fig. 2). Participants were told that there was a music box in each gift box and were required to identify which music box was different from the other two by listening to the music it produced. The experimenter then pointed to box 1 and the corresponding music was played simultaneously. This was repeated until all three pieces of music had been played. The experimenter then suggested that the participants listen to all three pieces again before making a decision. As before, each music stimulus was played only when the corresponding box was pointed to. After listening to the music twice, participants only needed to point to the gift box that was different from the other two. Their choices were marked on recording sheets. No participant refused to give a decision in this study. The six trials were presented in random order, and the music stimuli in each trial were randomly assigned to the gift boxes.

Fig. 2. Sample of the stimuli shown on the screen.
In order to familiarise the participants with the procedure, a practice session with four trials was conducted before the experiment. The procedure was identical to that of the experimental trials. The practice trials consisted of animal sounds, e.g., two acoustic stimuli were bird songs and the remaining one was the bleating of a goat. The practice session not only familiarised the participants with the testing format but also ensured that they could hear the sounds well. No participant made an error on the first attempt in any of the four trials.
All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards. Informed consent was obtained from all individual participants included in the study.
Results
A correct answer was defined as picking the stimulus in the minor mode in the major dominant group, or the stimulus in the major mode in the minor dominant group. Descriptive findings on accuracy among the three age groups are shown in Table 2.
Table 2. Means and Standard Deviations of the Accuracy (in %) Across Three Age Groups
Consistency Among Trials
In order to check the consistency of the difficulty level between the major and minor dominant groups, a repeated-measures ANOVA with Greenhouse-Geisser correction was conducted. There was no significant performance difference between the major and minor dominant groups in any of the three age groups (Children: F(1, 19) = 0.012, p = 0.915, ηp² = 0.001, Cohen’s d = 0.051; Adults: F(1, 38) = 0.009, p = 0.925, ηp² = 0.000, Cohen’s d = 0.051; Elderly: F(1, 34) = 0.01, p = 0.923, ηp² = 0.000, Cohen’s d = 0.051).
Furthermore, the difficulty level among the three trials within the major dominant group and within the minor dominant group was checked with a similar analysis. There was no significant performance difference among the three trials in the major dominant group in any age group (Children: F(1.74, 33.07) = 0.137, p = 0.845, ηp² = 0.007, Cohen’s d = 0.238; Adults: F(1.927, 73.24) = 0.21, p = 0.803, ηp² = 0.005, Cohen’s d = 0.081; Elderly: F(1.903, 70.51) = 0.172, p = 0.819, ηp² = 0.005, Cohen’s d = 0.051). Similar results were obtained in the minor dominant group (Children: F(1.563, 29.69) = 0.322, p = 0.674, ηp² = 0.017, Cohen’s d = 0.092; Adults: F(1.953, 72.27) = 0.057, p = 0.941, ηp² = 0.02, Cohen’s d = 0.058; Elderly: F(1.911, 69.07) = 0.038, p = 0.653, ηp² = 0.02, Cohen’s d = 0.018). Given the non-significant differences in the repeated-measures analyses, the responses of the different age groups were combined to compute Cronbach’s alpha coefficients for the major and minor dominant groups. The coefficients for the major and minor dominant groups were .77 and .67 respectively. The repeated-measures results suggested that the trials in both the major and minor dominant groups were similar in difficulty for participants across the three age groups. Furthermore, there was acceptable consistency among the participants’ responses within the groups. Hence, performance on both groups of stimuli was combined in the following analyses.
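As an illustration of the reliability coefficient reported above, the following is a minimal sketch of Cronbach's alpha using the standard formula based on item variances and the variance of the total score. The response matrix is hypothetical (1 = correct, 0 = incorrect) and is not the study's data; the function name is also an assumption.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha; rows holds one list of item scores per participant."""
    k = len(rows[0])  # number of items (three trials per stimulus group here)
    # Sample variance of each item across participants.
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of each participant's total score.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses of four participants to three trials (not study data).
demo = [[1, 1, 0], [0, 0, 0], [1, 1, 1], [0, 1, 0]]
alpha = cronbach_alpha(demo)
print(round(alpha, 2))
```

With only three items per group, alpha is sensitive to each item's variance, which is one reason a small item set tends to yield modest coefficients.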
Categorical Performance
Given that this was a three-alternative forced-choice task, the chance level is 1/3, i.e., around 33%. When performance in each of the six groups (three age groups by sex) was compared against this level, only the elderly male group (35.41%) performed at chance (t(15) = 0.426, p = 0.676). In other words, most participants across the three age groups performed better than chance in the categorisation task.
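The comparison against chance can be sketched as a one-sample t statistic, assuming each participant's accuracy is the percentage of correct responses over the six trials. The counts below are hypothetical and the helper names are assumptions; the sketch returns only the t statistic and degrees of freedom, since the p-value requires a t distribution from a statistics package.

```python
from math import sqrt
from statistics import mean, stdev

CHANCE = 100 / 3  # chance accuracy (%) in a three-alternative forced-choice task


def accuracy(n_correct, n_trials=6):
    """Percentage of correct responses for one participant."""
    return 100 * n_correct / n_trials


def one_sample_t(scores, mu=CHANCE):
    """Return the t statistic and degrees of freedom for H0: mean == mu."""
    n = len(scores)
    t_value = (mean(scores) - mu) / (stdev(scores) / sqrt(n))
    return t_value, n - 1


# Hypothetical numbers of correct trials for eight participants (not study data).
correct_counts = [2, 3, 1, 2, 4, 2, 3, 1]
scores = [accuracy(c) for c in correct_counts]
t_value, df = one_sample_t(scores)
print(round(t_value, 3), df)
```

A group of 16 participants, such as the elderly male group, would give 15 degrees of freedom, which matches the t(15) reported in the text.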
Age Difference
A one-way ANOVA was conducted to examine the performance difference among the age groups. A significant main effect of age was obtained (F(2, 91) = 6.392, p < 0.05, ηp² = 0.123, Cohen’s d = 0.893). Post-hoc tests showed that children were as capable as adults in discriminating major and minor modes in the categorisation task (Tukey HSD = 5.46, p = 0.117). On the other hand, the elderly were less accurate than the adults (Tukey HSD = 4.62, p < 0.05). This is probably due to the low accuracy of the elderly male participants (35.41%) compared with their female counterparts (42.11%), which lowered the overall performance of the elderly group.
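The reported effect size can be cross-checked from the F ratio and its degrees of freedom, using the standard identity that partial eta squared equals F × df_effect / (F × df_effect + df_error). The function name below is an assumption. No conversion to Cohen's d is attempted, because the text does not state how that value was computed.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Values taken from the age-group ANOVA reported above: F(2, 91) = 6.392.
eta_p2 = partial_eta_squared(6.392, 2, 91)
print(round(eta_p2, 3))
```

For F(2, 91) = 6.392 this gives about 0.123, in line with the reported value.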
Discussion
Excluding the elderly male group, untrained participants across the age groups were able to categorise major and minor modes at an above-chance level. With the valence level controlled, participants’ categorical sense of major and minor was found to be not necessarily emotion-oriented. The difference from some previous findings (e.g., Halpern et al., 1998) could be explained by some slight changes in the experimental design. Compared with the traditional discrimination task, in which two stimuli are presented, participants in the current study were prompted to compare the possible similarities between any two stimuli and discard the remaining one in every trial. This possibly made the participants more sensitive to the features signifying major/minor, which were salient enough once other parameters were controlled. All these methodological changes aimed at minimising distraction and lowering unnecessary cognitive load, which helped in examining the categorical sense of major/minor in untrained participants who were unaware of their possible implicit ability.
Possible Criteria
The categorical performance also implied that participants were attracted by and relied on some salient acoustic features to accomplish the categorisation task. Minor triads are found to be more dissonant and less harmonic than major triads (Parncutt, 2014). Acoustic dissonance is also associated with negative emotions (Kastner & Crowder, 1990) and is less preferred by infants (Trainor & Heinmiller, 1998). Yet consonance and dissonance are not always perceptually distinctive, and their emotion induction effect varies with different combinations of chords and tonality (Parncutt, 2014). It is therefore less certain whether the acoustic difference between consonance and dissonance could be easily perceived by the untrained participants in this study.
Another possible feature that might catch participants’ attention in categorising major and minor is the presence of a minor third. In short, a minor third interval refers to the three-semitone difference between the first and second notes of a triad. This kind of interval is usually found in music in minor keys but not in major keys. This musical feature seems subtle, but even untrained participants were sensitive to the difference (Cook & Fujisawa, 2006). The presence (or absence) of the minor third interval probably became implicitly effective in helping the participants make categorical decisions in the present study. Similar findings have been reported in studies on pitch perception. Perfect pitch is regarded as a rare phenomenon, but over 40% of untrained participants could reproduce a pitch they had heard before within a deviation of 2 semitones (Levitin, 1994). The two-component theory further suggests that ordinary people are seldom aware of this ability possibly because they lack a conceptual representation (Levitin & Rogers, 2005). Supportive evidence was also reported in a linguistic context. Speakers of non-tonal languages were generally believed to have difficulty differentiating tonal changes among words in a tonal language system. Nevertheless, their performance was found to be fully comparable with that of tonal language speakers in discriminating syllabic differences in a same-different task, even at the unconscious level (Lo, 2015).
Experiences and Hearing
Although participants in the present study generally performed above chance level in the categorisation task, these results should be interpreted with caution. There was no control of the participants’ music listening preferences, experiences, or hearing ability. With the rapid advancement of internet connectivity, music from around the world can be easily reached, which possibly fosters different music preferences among the participants in this study. Given that there is a possible relationship between gender and music preferences (Langmeyer, Guglhör-Rudan, & Tarnai, 2012; McCown et al., 1997), a gender difference in categorical performance would be expected if male and female music preferences determined the ability to categorise major/minor. No gender effect was found in general, except in the elderly group. The absence of a gender effect in the children and adult groups does not, however, imply that music preference is unimportant when interpreting the present results. A more reliable interpretation can be generated only when finer control of participants’ music preferences and listening history is provided in future investigations.
Furthermore, if the amount of music listening experience were positively correlated with categorical performance, children would be expected to perform the worst among the groups. Nevertheless, the present findings show no significant performance difference between children and adults. On the contrary, the elderly, especially the elderly male participants, performed the least satisfactorily among the groups. This performance difference seems to be explained not so much by the amount of participants’ listening experience as by their hearing ability.
Age-related hearing loss, or presbycusis, is commonly found in adults over 60 (Patterson et al., 1982). It is usually characterised by low sensitivity to different parameters of sound signals, including pitch, intensity, and directionality (Dobreva, O’Neill & Paige, 2011; Whitbourne, 1998). Since hearing ability was not controlled in this study, the elderly participants might have performed less well than the adults not because of a loss of the categorical sense of major and minor but because of an inability to hear the music stimuli clearly. Though males have a lower threshold for sound signals at low frequencies than females (Murphy & Gates, 1997), females in general are more sensitive to different ranges of frequencies (e.g., frequencies above 1 kHz) and have a later onset of hearing deterioration than males (Pearson et al., 1995). Consistent with the literature, elderly female participants in this study also performed better than their male counterparts in the categorisation task. No such gender difference was found in the adult group (t(37) = 0.195, p = 0.847). Due to the skewed sex ratio in the children group, no analysis was conducted to examine the gender effect. A balanced sex ratio in the different age groups, together with control of hearing ability, would be important for constructing a more comprehensive understanding of how the untrained population conceptualises musical features.
Construct Validity
Another factor that may limit the significance of the present findings is construct validity. Consistent with most of the stimulus designs found in the reviewed literature, the present stimuli were created with different types of control over musical properties, acceptability, and emotionality. Yet no specific statistical analysis, such as confirmatory factor analysis (CFA), was provided as additional validation of the stimulus types, which lowers the validity of the present findings. Cronbach’s alpha coefficients were calculated for both the major dominant and minor dominant groups. Due to the small number of items (i.e., 3) in each stimulus group, a low alpha coefficient (i.e., 0.67) was obtained in the minor dominant group. On the other hand, the mean inter-item correlations of the major and minor dominant groups were 0.503 and 0.404 respectively, which are still considered acceptable for constructs with a small number of items (Briggs & Cheek, 1986). In order to have a more accurate analysis of construct validity, the number of items in each stimulus group and the number of participants in each of the three age groups should be substantially increased. No fewer than 200 participants in each age group (or no fewer than 10 participants per observed item, whichever criterion yields the larger number) should be recruited to test the theoretical construct securely in a CFA model (Barros et al., 2017; Myers, Ahn, & Jin, 2011). Furthermore, an increased number of items in each stimulus group can also enhance the reliability of the latent variables in the tested theoretical model. By satisfying at least these two conditions, the present findings can be further verified in future investigations.
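The link between the mean inter-item correlation and alpha, together with the sample-size rule stated above, can be illustrated with a short sketch. It uses the standardised form of alpha, k × r / (1 + (k − 1) × r), which assumes standardised items; the text does not say whether the reported coefficients are raw or standardised, so the comparison is only indicative. Both function names are assumptions.

```python
def standardized_alpha(n_items, mean_r):
    """Standardised alpha from the number of items and mean inter-item correlation."""
    return n_items * mean_r / (1 + (n_items - 1) * mean_r)


def minimum_cfa_sample(n_items, floor=200, per_item=10):
    """Larger of the two criteria quoted in the text: 200 or 10 per observed item."""
    return max(floor, per_item * n_items)


# Mean inter-item correlations reported for the two stimulus groups (3 items each).
alpha_major = standardized_alpha(3, 0.503)
alpha_minor = standardized_alpha(3, 0.404)
print(round(alpha_major, 2), round(alpha_minor, 2))

# Sample needed per age group for the current six observed items.
print(minimum_cfa_sample(6))
```

With three items, a mean correlation of 0.404 corresponds to a standardised alpha of about .67, consistent with the reported coefficient for the minor dominant group, while 0.503 gives about .75, close to the reported .77 for the major dominant group. With six observed items the 200-participant floor is the binding criterion; the 10-per-item rule only takes over beyond 20 items.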
