Abstract
The aim of this study was to correlate musical aptitude scores derived from two tests based on the same structural model for musical aptitude in a sample of 9- to 13-year-old children. We controlled for the influences of socioeconomic status (SES; measured by parents’ education), the amount of music lessons, and general cognitive abilities (i.e., IQ). The sample comprised 89 (46 girls) 9- to 13-year-old children. We applied two different tests by Edwin Gordon: Intermediate Measures of Music Audiation (IMMA) and Advanced Measures of Music Audiation (AMMA) to measure musical aptitude. As control variables, IQ, socioeconomic status, and amount of music training were assessed. A hierarchical multiple regression analysis revealed that the total score of the IMMA together with the control variables could not predict the total score of the AMMA. Furthermore, regression models for each of the subtests were also not significant. With respect to the control variables, we revealed an association between the IMMA and socioeconomic status as well as amount of music training. We conclude that even tests that are based on the same structural model of musical aptitude were not associated significantly. This might indicate problems of validity. Additionally, it seems to be difficult to assess musical aptitude independently of influences from music training and SES. Ultimately, this may support the notion that we still need valid musical aptitude tests for this particular age group.
It is perfectly conceivable that the artistic faculty in any person might be somehow measured, and its amount determined, just as we may measure strength, power of discrimination of tints, or the tenacity of memory. (Galton, 1883)
Researchers are interested in quantifying individual differences, whether they relate to intelligence or musical abilities. However, it seems to be especially challenging to measure musical aptitude. So far there is no universal standard for musical aptitude tests, and the tests that are applied in various studies differ considerably (Schellenberg & Weiss, 2013). This is problematic because the results of different musical aptitude tests are hardly comparable. Musical aptitude tests are based on different models representing the structure of musical aptitude and therefore differ in the musical abilities (e.g., pitch, rhythm, loudness, or timbre) they assess. This might explain why different musical aptitude tests show only moderate correlations with one another (Gembris, 2009). To clarify this it might be useful to compare tests that belong to the same theoretical approach and assess the same musical abilities. This may shed light on the question of whether moderate correlations between musical aptitude tests are due to differences in the structural models of musical aptitude that underlie the construction of a test, or more generally due to unsuitable test construction (e.g., problems with validity and reliability). Besides differences in structural models of musical aptitude, factors such as the amount of music training, socioeconomic status, or level of cognitive functioning might partly explain the modest correlations between tests, because they may influence the outcomes of different tests to varying degrees.
The aim of this study was to measure musical aptitude in 9- to 13-year-old children with two tests based on the same structural model of musical aptitude and to find out whether these two tests were associated. We applied two different tests by Edwin Gordon: Intermediate Measures of Music Audiation (IMMA) and Advanced Measures of Music Audiation (AMMA). These tests are based on the same structural model of musical aptitude and assess the same musical abilities (pitch and rhythm). Thus, it seems highly likely that the scores of one test (IMMA) significantly predict the scores of the other test (AMMA) in a hierarchical regression model. Furthermore, we controlled for influences of the amount of a child’s music training, socioeconomic status (SES; measured by parents’ education), and general cognitive abilities (i.e., IQ).
Musical aptitude
Musical aptitude can be defined as a student’s potential to achieve high levels of proficiency in music. One could even say that it is the innate potential to succeed as a musician (Schellenberg & Weiss, 2013). In contrast to this definition, there are approaches that emphasize the amount of practice rather than the innate potential. They anticipate that one achieves an expert level by working hard and accumulating many hours of practice (Howe, Davidson, & Sloboda, 1998). More inclusive approaches define musical aptitude as the interaction of nature (i.e., the innate potential) and nurture. Whereas nurture refers to environmental influences like education, musical experience (e.g., music training), and self-regulated activities such as practice, nature refers to the innate level of musical aptitude (Gembris, 2003). In sum, it can be stated that it is difficult to arrive at a general definition of musical aptitude. Schellenberg and Weiss (2013), for example, concluded (well aware of their circular argument) that musical aptitude tests measure musical aptitude. Apart from the circularity of this notion, it is an important point to mention that creating a theory-based measurement did not primarily motivate the construction of musical aptitude tests. It was rather the desire to support teachers in their work at schools that inspired some test constructors (Bentley, 1966). Only later did an interest in systemizing different test approaches and assumptions about the structure of musical aptitude arise.
The currently available musical aptitude tests can be categorized into two opposing approaches: the multifactorial approach and the general factor approach (Kormann, 2005), with an additional third approach combining those two. The multifactorial approach assumes that musical aptitude is a conglomerate of several relatively independent facets of musical abilities. Tests that are based on this approach assess several musical abilities and result in a musical aptitude profile. A total test score is not computed. The Seashore Measures of Musical Talents (Seashore, 1919) is an example of a multifactorial test. It measures pitch, loudness, rhythm, timbre, pitch memory, and tone length. In contrast to the multifactorial approach, the general factor approach assumes that musical aptitude is a complex ability that cannot be divided. It has different aspects, but these facets are interrelated and therefore integrated into one total score of musical aptitude. The Measures of Musical Abilities (Bentley, 1966) may serve as an example of a test that can be subsumed under the general factor approach. It measures pitch discrimination, pitch memory, chord analysis, and rhythm memory. A composite score based on the scores of each subtest is calculated and used as an estimate of a person’s musical aptitude. The third approach – Gordon’s understanding of musical aptitude – combines features of the multifactorial and general factor approaches. Gordon postulates that musical aptitude consists of several aspects that are interrelated, but also independent. This assumption refers to theoretical as well as to statistical relatedness and independence. Theoretically, these aspects are related because they belong to the concept of musical abilities. Nevertheless, several interrelated aspects represent different independent facets of musical abilities. 
Statistically, moderate correlations between the various musical abilities represent their interrelatedness, whereas the possibility that a person can differ in the performance level in different musical abilities puts forward the independence of the various aspects of musical aptitude.
Gordon’s theoretical framework
According to Gordon (1989), musical aptitude equates to the ability to audiate. Audiation is the basis for musical aptitude and comprises the abilities to hear and feel music, as well as comprehend music for which the sound is not physically present. Audiation (i.e., musical aptitude) is innate, but it is not inherited. Although it is innate, the environment (especially the early musical environment) contributes to the unfolding of the innate potential (Gordon, 2004). Even with a perfect early music environment, however, no one ever exceeds his or her innate level of musical aptitude, and typically no one develops his or her musical aptitude to the highest possible level. According to Gordon, the development of musical aptitude consists of a developmental and a stabilized stage (Gordon, 1986). The developmental musical aptitude stage lasts from birth until 9 years of age. In this stage, environmental influences can have an impact on the development of one’s musical aptitude. At age 9, musical aptitude stabilizes and remains the same throughout life. After the stabilization of musical aptitude, it is still possible to learn music, but the level of music achievement is limited by the potential (i.e., musical aptitude) that is reached by the time of stabilization. Taken together, the level of a student’s musical aptitude at age 9 becomes (1) the level of stabilized musical aptitude throughout life and (2) the limit that musical achievement will never exceed. Appropriate tests are available for the developmental as well as for the stabilized musical aptitude. The first Gordon test was the Musical Aptitude Profile (MAP), which assesses stabilized musical aptitude with several subtests in 9- to 17-year-old children. Later, Gordon simplified his approach to two subtests (pitch and rhythm) (Schellenberg & Weiss, 2013). These simplified versions are designed to measure musical aptitude across a broad age range. 
They are available for children between 3 and 5 years of age (AUDIE-test), 5 and 8 years of age (Primary Measures of Music Audiation), 6 and 9 years of age (IMMA), as well as for students in post-secondary education (AMMA). However, Gordon (1989) postulates that the IMMA can measure not only developmental musical aptitude in children between 6 and 9 years of age, but also stabilized musical aptitude in children between 10 and 11 years of age. Gordon (2001) also assumes that the AMMA can be administered to younger students: he tested 12-year-old children and reported a satisfactory reliability. Hence, it might be possible to assess stabilized musical aptitude in 9- to 13-year-old children with the IMMA as well as with the AMMA. What interested us was whether both tests come to similar judgments about a person’s musical aptitude. Although the tests might differ in their overall difficulty, and they are unlikely to produce exactly the same score for a person, the relative position of a person on the musical aptitude scale may be similar for both tests and result in a positive association between them.
Objectives
Therefore, we investigated whether the estimation of music aptitude by the IMMA can predict the estimation of music aptitude by the AMMA in 9- to 13-year-old children. Because children in the investigated age range (from 9 years on) stabilize in their musical aptitude, both tests of musical aptitude should yield similar evaluations of a person’s aptitude. We wanted to test this and we expected, according to Gordon’s assumption, that IMMA scores are a significant predictor of AMMA scores in a hierarchical multiple regression. Beyond that we included measures of socioeconomic status (SES), general cognitive abilities, and amount of music lessons in our study, because former studies found that these variables were linked to musical aptitude. Although Gordon assumes that audiation measures musical aptitude independently of musical training, associations between music lessons and musical aptitude (Forgeard, Winner, Norton, & Schlaug, 2008) have been observed. Hence, we included amount of music lessons as an informative control variable. Aside from musical experience, SES has also been found to be associated with musical aptitude (Rainbow, 1965; Sergeant & Thatcher, 1974). In addition, we assessed IQ because prior research revealed associations between IQ and musical aptitude (Norton et al., 2005; Sergeant & Thatcher, 1974). These associations have been revealed in children and even in adults, across both older and more recent studies (Schellenberg & Weiss, 2013). With the assessment of the mentioned control variables, we wanted to investigate the relatedness between both measures of musical aptitude while controlling for possible covariates.
Method
Participants
The sample comprised 89 (46 girls) 9- to 13-year-old children (M = 10 years, 9 months; SD = 8 months) recruited from volunteers listed in the database of the department of developmental psychology and from a secondary school in Germany. The mean IQ was M = 105.29 (SD = 12.63). In the sample, 21.6% of the children reported no former music education except for music as a school subject, 32.9% reported music education (private music lessons, individual or in small groups) for 2 years or less, 21.6% reported 2 to 4 years of music education, and 23.9% reported a history of music education (private music lessons, individual or in small groups) lasting for 4 years or more. The sample showed diversity with respect to parents’ education: 54.1% of households had no parent holding a university degree, 27.0% of households had one parent holding a university degree, and 18.9% of households had both parents holding a university degree. Parents’ education was correlated with children’s music education (r = .37, p = .001).
Measures
Amount of a child’s music lessons, socioeconomic status, and intelligence were measured. Beyond that, music aptitude was assessed with two different tests.
The amount of a child’s music training and socioeconomic status, measured by parents’ education, were assessed with a questionnaire. Formal music training (in months) was determined for each child from questions about current and former music education. When a child played more than one instrument, the numbers of months of instruction for each instrument were combined. Mothers’ and fathers’ education were initially coded individually as dichotomous variables (0 for “no university degree” and 1 for “a university degree”), and for the later statistical analyses parents’ education was collapsed into a single variable (0, 1, or 2 parents with a university degree).
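The coding of the two questionnaire variables can be illustrated with a minimal sketch. The function names are hypothetical, and the sketch assumes that "combined" means the months per instrument were summed, which the text does not state explicitly.

```python
def parents_education(mother_has_degree, father_has_degree):
    """Collapse two dichotomous codes (0/1) into one SES variable (0, 1, or 2)."""
    return int(mother_has_degree) + int(father_has_degree)


def total_training_months(months_per_instrument):
    """Combine months of instruction across instruments (assumed: simple sum)."""
    return sum(months_per_instrument)


# Hypothetical child: only the mother holds a degree; 24 months of piano
# and 12 months of violin lessons.
ses = parents_education(True, False)        # 1
training = total_training_months([24, 12])  # 36
```

Under this assumption, overlapping instruction periods on two instruments count twice, so the variable reflects the amount of instruction rather than calendar time.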
A short version of the HAWIK III (Hamburg-Wechsler-Intelligenztest für Kinder; Tewes, Rossmann & Schallberger, 2000), which consisted of two verbal and two performance subtests, was administered to assess intelligence. The verbal subtests were the vocabulary test and the information test; the performance subtests were the picture arrangement test and the mosaic test. Based on these four subtest scores, an estimate of full-scale IQ was computed according to the formula by Schallberger (2005). The short form of the HAWIK III explains at least 90% of the variance in full-scale IQ and allows a good estimation of the participant’s IQ (Schallberger, 2005).
Musical aptitude was measured using the Intermediate Measures of Music Audiation (IMMA) (Gordon, 1982) and the Advanced Measures of Music Audiation (AMMA) (Gordon, 1989). To ensure a standardized procedure, both tests were CD-based.
The IMMA consisted of a tonal test and a rhythm test. For each test, 20 minutes were allowed for the complete test administration. The IMMA offered a total score as well as separate tonal and rhythm subscores. The CD recording consisted of four practice examples and 40 test questions for the tonal test, as well as two practice exercises and 40 test questions for the rhythm test. Each tone in the two phrases of the tonal test was in beats of equal length, and notes in the two phrases of the rhythm test were on the same pitch. The tonal phrases were all three tones in length. The rhythm phrases varied between 2/4 and 8/8 beats. The procedures for the tonal and the rhythm tests were the same: the child listened to both phrases of a pair and decided whether the phrases sounded the same. If the two phrases sounded the same, the child drew a circle around a box with two faces that looked the same. If the two phrases of a pair differed, the child marked the box with two different-looking faces (see Table 3 for scoring information).
The AMMA, like the IMMA, provided tonal and rhythm subscores as well as a total score. The CD recording contained the instruction for test taking, three practice exercises, and 30 test items. The complete test administration lasted about 30 minutes. Each test item consisted of two musical phrases. All items comprised 18 tones. The participant was asked to listen attentively to each item, and subsequently indicate (by filling a space on the answer sheet) whether the two musical phrases were the same or different. When the musical phrases were different, the participant had to decide whether the difference was a result of a tonal change or a rhythm change. In contrast to the IMMA, the answer sheet of the AMMA did not contain faces that should be marked. Instead of faces, the answer sheet contained 30 lines with three boxes in each line. Depending on their decision, participants could mark the box for tonal changes, rhythm changes, or no changes (same) (see Table 3 for scoring information). In contrast to the IMMA, in the AMMA the participant was asked to audiate tonality, keyality, melody, implied harmony, rhythm, meter, and tempo in one musical phrase together, and respond to tonal or rhythm aspects of the phrase at the same time.
Procedure
The testing procedure consisted of an individual testing session (40 minutes) and a group testing session (80 minutes including a 10-minute break). In the individual testing session, we assessed intelligence; children were tested in a quiet room by a female assistant trained in administering the test. The group session comprised the assessment of musical aptitude with the IMMA and the AMMA. The order of the musical aptitude tests was counterbalanced: 45 participants completed the IMMA first and the AMMA afterward, and 44 participants completed the tests in the opposite order. Three female assistants carried out the group sessions. The questionnaire assessing music education and socioeconomic status was sent via mail or email to the participants, and participants handed it in at the individual testing session. At the end of the group session children received a certificate and a voucher to thank them for their participation.
Data analysis
In preliminary analyses, we looked at Pearson correlations for both musical aptitude tests to determine whether total test score and subtests were correlated. Note that all correlational analyses reported in the results are two-tailed. Second, we used a nonparametric test, the Wilcoxon test, to compare the average achievement on the music aptitude tests. In the next step, the sample distributions were inspected more closely. The skewness and kurtosis values were z-transformed and compared to the critical z-score to determine whether they were significant. This had two implications: first, significant skewness yields information about the experienced difficulty of the tests (too easy, appropriate, too difficult). Second, a significant skewness or kurtosis indicates a deviation from a normal distribution. Thus, where possible, appropriate nonparametric tests were applied in the analyses.
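The z-transformation of skewness and kurtosis described above can be sketched as follows. The text does not say which estimators were used, so this sketch assumes the bias-adjusted sample skewness and excess kurtosis with their conventional standard errors, as reported by common statistics packages. The function names are illustrative only.

```python
import math


def skew_kurtosis_z(scores):
    """Return (z_skewness, z_kurtosis) for a list of test scores.

    Assumption: bias-adjusted skewness (G1) and excess kurtosis (G2),
    each divided by its conventional standard error.
    """
    n = len(scores)
    if n < 4:
        raise ValueError("need at least 4 scores")
    mean = sum(scores) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in scores) / (n - 1))
    sum_z3 = sum(((x - mean) / sd) ** 3 for x in scores)
    sum_z4 = sum(((x - mean) / sd) ** 4 for x in scores)

    skew = n / ((n - 1) * (n - 2)) * sum_z3
    kurt = (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * sum_z4
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))

    se_skew = math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    se_kurt = 2 * se_skew * math.sqrt((n * n - 1) / ((n - 3) * (n + 5)))
    return skew / se_skew, kurt / se_kurt


def is_significant(z, z_critical=1.96):
    """Two-tailed check: the absolute z-value must exceed the critical value."""
    return abs(z) > z_critical


# Made-up scores with a long tail toward low values (negative skewness),
# the pattern expected when a test is easy for most participants.
z_skew, z_kurt = skew_kurtosis_z([50, 70, 72, 74, 75, 76, 77, 78, 79, 80])
```

The comparison is made on the absolute value because a strongly negative z-score deviates from normality just as much as a strongly positive one; the sign only indicates the direction of the asymmetry.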
The main analyses contained correlations between music aptitude, socioeconomic status, amount of music lessons, and IQ and three hierarchical multiple regression models (first step IMMA measures, second step control variables). In the first model the AMMA total score was the criterion measure and the IMMA total score as well as SES, amount of music lessons, and IQ were the predictors. In the second model the AMMA tonal score was the criterion measure and the IMMA tonal score as well as SES, amount of music lessons, and IQ were the predictors. In the third model the AMMA rhythm score was the criterion measure and the IMMA rhythm score as well as SES, amount of music lessons, and IQ were the predictors.
Results
Preliminary analyses
Mean, standard deviation, and minimum as well as maximum scores of age, SES, IQ, and amount of music lessons are shown in Table 1.
Table 1. Mean, standard deviation, and minimum as well as maximum scores of age, SES, IQ, and amount of music lessons.
Note. Parents’ education was collapsed into a single variable (0, 1, or 2 parents with a university degree).
Correlations between the subtests and the total test score of the IMMA are shown in Table 2. All correlations were significant. Also, for the AMMA the correlations between the subtests and the total test score were significant (see Table 2).
Table 2. Correlations between subtest and total test score for the Intermediate Measures of Music Audiation (IMMA) and the Advanced Measures of Music Audiation (AMMA).
Note. +Non-parametric correlation coefficient (τ).
The central tendency of the IMMA, M = 68.33 (SD = 4.55), and the AMMA, M = 46.62 (SD = 6.66), indicated a difference in overall achievement on the tests. A Wilcoxon test revealed that this difference was significant, z = 8.19, p < .001, d = 3.51. Because a ceiling effect might be possible for the IMMA (maximum score = 80), we analyzed the sample distribution of the IMMA scores to assess whether the distribution deviated from a normal distribution. To this end, we calculated the skewness and kurtosis of our sample distribution (see Table 3) and compared the z-transformed skewness and kurtosis values to the critical z-scores. For the IMMA total score, zskewness = −1.91 and zkurtosis = −0.17 were both smaller in absolute value than the critical z-score, zcritical = 1.96, indicating that the sample distribution was not significantly skewed and had no significant kurtosis.
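The size of the Wilcoxon z can be put into context with a short calculation. The sketch below uses the normal approximation of the signed-rank statistic without tie or continuity correction, which is an assumption, since the text does not report how z was computed. Under that assumption, the largest z attainable with n = 89 pairs is reached when every child scores higher on one test than on the other.

```python
import math


def wilcoxon_z(w_plus, n):
    """Normal approximation for the Wilcoxon signed-rank statistic.

    w_plus: sum of ranks of the positive differences
    n:      number of non-zero paired differences
    Assumption: no tie correction and no continuity correction.
    """
    mean_w = n * (n + 1) / 4
    sd_w = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (w_plus - mean_w) / sd_w


n_pairs = 89                          # sample size reported in the Method section
w_max = n_pairs * (n_pairs + 1) / 2   # all differences point in the same direction
z_max = wilcoxon_z(w_max, n_pairs)    # about 8.19
```

The reported z = 8.19 coincides with this upper bound, which would be consistent with every child obtaining a higher raw score on the IMMA than on the AMMA. This is an inference from the reported statistic, not something the text states.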
Table 3. Overview of constitution of score values, mean, standard deviation, skewness, and kurtosis for the total score, the tonal, and the rhythm subtest of the Intermediate Measures of Music Audiation (IMMA) and the Advanced Measures of Music Audiation (AMMA).
With respect to the IMMA tonal score, the comparison of the z-transformed kurtosis and skewness scores revealed no significant kurtosis, zkurtosis = 0.76, of the distribution, but a significant skewness, zskewness = −3.04. The IMMA tonal distribution was significantly skewed to the right (i.e., scores clustered at the upper end of the scale), which could indicate that the tonal subtest was relatively easy for our participants. Regarding the IMMA rhythm score, the z-score for skewness, zskewness = −2.13, exceeded the critical z-score in absolute value. The distribution was skewed to the right side (again with scores clustered at the upper end), which might indicate that the rhythm test was easy. The kurtosis, zkurtosis = 0.19, did not exceed the critical z-score, zcritical = 1.96. Just like for the IMMA, we analyzed the sample distribution of the AMMA to assess the adequacy of this measurement for this particular age group. Concerning the AMMA total score, the comparison of the z-transformed kurtosis and skewness scores revealed a significant skewness, zskewness = 2.06, but no significant kurtosis, zkurtosis = 0.62, of the distribution. The sample distribution was skewed to the left side (i.e., scores clustered at the lower end of the scale), pointing toward a higher difficulty of the test. With regard to the AMMA tonal score, no significant skewness, zskewness = 1.27, or kurtosis, zkurtosis = 1.33, was revealed. With respect to the AMMA rhythm score, z-transformed skewness, zskewness = 0.65, and kurtosis, zkurtosis = −0.96, were both smaller in absolute value than the critical z-score, zcritical = 1.96, indicating that the sample distribution was not significantly skewed and had no significant kurtosis. Furthermore, we analyzed whether the possible differences in test difficulty were reflected in correlations between age and test scores. There was no significant correlation between age and AMMA (tonal, rhythm, total); see Table 4. Age did not correlate significantly with IMMA total and tonal score (see Table 4). However, the IMMA rhythm score was significantly correlated with age, τ = .16, p = .04.
Table 4. Correlations between age in months, SES, IQ, and music lessons, and the Intermediate Measures of Music Audiation (IMMA) as well as the Advanced Measures of Music Audiation (AMMA).
Note. +Non-parametric correlation coefficient (τ).
Correlations between musical aptitude, socioeconomic status, music lessons, and IQ
Socioeconomic status (SES) was not correlated with any test score of the AMMA. However, SES was significantly associated with the IMMA total score, r = .26, p = .03, and the IMMA tonal score, τ = .21, p = .03. SES was not correlated with the IMMA rhythm score; for details, see Table 4. Amount of music lessons in months was not correlated with any test score of the AMMA. However, amount of music lessons was significantly associated with the IMMA tonal score, τ = .16, p = .05; for details, see Table 4. Amount of music lessons was not correlated with IMMA total score or IMMA rhythm score. IQ was not significantly correlated with any test score of the AMMA or the IMMA (see Table 4).
Our study reports a large number of correlations, which inflates the overall probability of a Type I error. Because 30 correlations were calculated in the principal analyses, it might be reasonable to adjust the alpha level. The adjusted alpha level is α = .002.
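The adjusted alpha level can be reproduced with a short sketch. The text does not name the correction method; the sketch assumes a Bonferroni adjustment (nominal alpha divided by the number of tests), which yields the reported value after rounding. The dictionary labels are illustrative, and the p-values are the three significant control-variable correlations reported above.

```python
def bonferroni_alpha(alpha, n_tests):
    """Bonferroni-adjusted per-test alpha (assumed correction method)."""
    return alpha / n_tests


adjusted_alpha = bonferroni_alpha(0.05, 30)   # 0.001666..., i.e. about .002

# p-values of the nominally significant correlations with the control variables
reported_p = {
    "IMMA total ~ SES": 0.03,
    "IMMA tonal ~ SES": 0.03,
    "IMMA tonal ~ music lessons": 0.05,
}
surviving = [name for name, p in reported_p.items() if p < adjusted_alpha]
```

Under this assumption none of the three correlations remains significant after adjustment, so `surviving` is empty.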
Associations between the Intermediate Measures of Music Audiation and the Advanced Measures of Music Audiation
Hierarchical multiple regression analysis was used to predict AMMA total score with the IMMA total score entered on the first step and SES, amount of music lessons, and IQ added on the second step. The IMMA total score accounted for only 4.0% of the variance in the AMMA total score, F(1, 72) = 3.02, p = .09. The addition of the control variables did not improve the fit of the model, Finc(3, 69) = 0.21, p = .89, accounting for an additional 0.9% of the variance in the AMMA total score. No variable contributed significantly, ps > .18. See Table 5 for more details.
Table 5. Regression table with IMMA scores, IQ, amount of music lessons, and SES as predictor variables and AMMA scores as outcome variables.
A second hierarchical multiple regression analysis was used to predict AMMA tonal score with the IMMA tonal score entered on the first step and SES, amount of music lessons, and IQ added on the second step. The IMMA tonal score accounted for only 0.3% of the variance in the AMMA tonal score, F(1, 72) = 0.23, p = .64. The addition of the control variables did not improve the fit of the model, Finc(3, 69) = 0.49, p = .69, accounting for an additional 2.1% of the variance in the AMMA tonal score. Again, no variable contributed significantly, ps > .38. See Table 5 for more details.
A third hierarchical multiple regression analysis was used to predict AMMA rhythm score with the IMMA rhythm score entered on the first step and SES, amount of music lessons, and IQ added on the second step. The IMMA rhythm score accounted for only 4.3% of the variance in the AMMA rhythm score, F(1, 72) = 3.25, p = .08. The addition of the control variables did not improve the fit of the model, Finc(3, 69) = 0.49, p = .69, accounting for an additional 2.0% of the variance in the AMMA rhythm score. Again, no variable contributed significantly, ps > .11. See Table 5 for more details.
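The F statistics of the three hierarchical models can be checked against the reported proportions of explained variance. The sketch below uses the standard formulas for the F test of a regression step and of an R² increment. The function names are illustrative, and because the input R² values are the rounded percentages from the text, the results only approximate the reported F values.

```python
def f_first_step(r2, n_predictors, df_resid):
    """F for a model with n_predictors predictors and df_resid residual df."""
    return (r2 / n_predictors) / ((1 - r2) / df_resid)


def f_change(r2_step1, delta_r2, n_added, df_resid_full):
    """F for the increment in R^2 when n_added predictors enter at step 2."""
    r2_full = r2_step1 + delta_r2
    return (delta_r2 / n_added) / ((1 - r2_full) / df_resid_full)


# (R^2 step 1, delta R^2 step 2) as reported for each criterion measure
models = {
    "total":  (0.040, 0.009),
    "tonal":  (0.003, 0.021),
    "rhythm": (0.043, 0.020),
}
for name, (r2, delta) in models.items():
    f1 = f_first_step(r2, 1, 72)     # step 1: F(1, 72)
    f_inc = f_change(r2, delta, 3, 69)  # step 2: F_inc(3, 69)
    print(f"{name}: F = {f1:.2f}, F_inc = {f_inc:.2f}")
```

The recomputed values agree with the reported statistics to within rounding error. The residual degrees of freedom (72 and 69) also imply that the regressions were based on 74 children with complete data rather than on the full sample of 89.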
Discussion
We investigated the correlation between two musical aptitude tests (IMMA and AMMA) that are based on the same structural model of musical aptitude in 9- to 13-year-old children. Additionally, we controlled for influences of music training, SES, and general cognitive abilities.
It was not possible to predict AMMA scores with the corresponding IMMA scores and the control variables entered as predictors in a hierarchical multiple regression. None of the three hierarchical multiple regression models (total, tonal, rhythm) reached significance. Furthermore, our analyses revealed that the IMMA was significantly associated with SES and amount of music training. We found no significant association between the IMMA and general cognitive abilities. The AMMA, however, was not significantly correlated with SES, music training, or general cognitive abilities.
Associations between IMMA and AMMA
The IMMA scores were not significant predictors of the AMMA scores. Furthermore, the variance in the AMMA scores explained by the corresponding IMMA scores ranged from 0.3% to 4.3% (depending on the measure). This indicates that not only is the association not significant, but that the practical significance is also trivial. This was surprising to us, and the question arises as to why these two tests were not significantly associated. First, age could play an important role in the formation of results. Participants’ age was within the extended age range of the IMMA (Gordon, 1989), but they were too young for the AMMA. Gordon himself reported successful application of the AMMA in 12-year-old children (Gordon, 2001). Our youngest participants were 9 years old, and therefore age might have been a problem. However, two points speak against an important contribution of age. On the one hand, age was not correlated at all with the AMMA. For the IMMA, only the rhythm subtest showed a significant correlation with age. However, the age range was not large and this might have contributed to the nonsignificant correlation. On the other hand, according to Gordon’s framework our sample was already in the stage of stabilized musical aptitude. Hence, influences of growing older and gaining more musical experience should be minimal. Second, differences in test difficulty might have contributed to missing significant associations. The existing difference between the means of the IMMA and the AMMA definitely suggested a difference in overall test difficulty. However, our investigation of the skewness of the distributions does not draw a supporting or conclusive picture. A direct comparison of the direction of skewness reveals that the IMMA and its subtests – if they differ from normal distribution – were skewed to the right (tonal subtest score and rhythm subtest score). Hence, if there is an effect, it indicates that the IMMA was relatively easy to work on. 
The AMMA (total test score), in contrast, was skewed to the left, which might indicate greater difficulty. These results are in accordance with reported differences in the means. However, the total score and subtest scores do not all show significant skewness. For the AMMA only the total score, and for the IMMA only the tonal and rhythm subtest scores were skewed. If the IMMA really was too easy, the subtest scores as well as the total score should have been skewed to the right. Similarly, if the AMMA was too difficult, all test scores should have shown a significant skewness to the left. Taken together, contributions of age and difficulty may (if they are involved at all) only partly explain our results. Our findings may indicate that the small to moderate associations between musical aptitude tests are not only caused by differences in structural models underlying the different tests, but that other factors might drive these low associations. If it was only an effect of different underlying structural models about musical aptitude, we should have found high and significant associations between two tests, which are based on the same model. Perhaps the small associations between musical aptitude tests are more a matter of validity problems. The tasks that both tests contain might only seem to be very similar on a superficial level. They may in fact differ, because the IMMA requests a more analytic judgment, whereas the AMMA relies more on a holistic decision about similarity. In addition, the response format might cause differences, because the required responses differ in the type of cognitive process they rely on. The response format of the IMMA is more similar to a discrimination task (same or different), whereas the response format of the AMMA includes a designation task: at first it is discrimination (same or different) but the second step requires the designation of a particular difference. 
Hence, the AMMA requests a two-step answer that might be cognitively more demanding. Beyond that, the AMMA comprises several musical features (i.e., pitch and rhythm not disentangled), whereas the IMMA presents musical features in a more isolated fashion. This fact might also contribute to differences in cognitive processing and task demand, which in turn might explain our non-significant correlations. Beyond that, it might be possible that successful test taking in the case of the IMMA and AMMA depends not only on musical aptitude, but also on other abilities.
Associations between the IMMA as well as the AMMA and confounding variables
Indeed, the IMMA and the AMMA showed differing patterns of correlations with the confounding variables, which supports this notion. Neither test was correlated with general cognitive abilities. This finding is in accordance with Gordon’s assumption that his tests should be independent of IQ. The IMMA was correlated with SES and music training, whereas the AMMA was not. These results may be interpreted as supportive evidence for the notion that different abilities are needed to perform successfully on these tests. The correlation of the IMMA with music training contradicts Gordon’s postulation that his tests assess musical aptitude in a way that is not influenced by music training. However, with an adjusted alpha level, these correlations were not significant. Thus, under a more conservative approach, Gordon’s assumption could be right.
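The effect of an adjusted alpha level can be illustrated with a minimal sketch. It assumes a Bonferroni correction, which is only one possible adjustment and is not confirmed here as the method actually used. The function name and the p-values are hypothetical placeholders, not results from this study.

```python
def bonferroni_decisions(p_values, alpha=0.05):
    """Return the Bonferroni-adjusted alpha and a per-test significance flag.

    p_values: dict mapping a test label to its unadjusted p-value.
    The adjusted threshold is alpha divided by the number of tests.
    """
    if not p_values:
        raise ValueError("p_values must contain at least one test")
    adjusted_alpha = alpha / len(p_values)
    decisions = {label: p < adjusted_alpha for label, p in p_values.items()}
    return adjusted_alpha, decisions


# Hypothetical p-values for four correlations (NOT taken from the study).
example_p = {
    "IMMA ~ SES": 0.03,
    "IMMA ~ music training": 0.02,
    "AMMA ~ SES": 0.40,
    "AMMA ~ music training": 0.55,
}

adjusted_alpha, decisions = bonferroni_decisions(example_p)
for label, significant in decisions.items():
    print(f"{label}: significant after adjustment = {significant}")
```

With these placeholder values, two correlations fall below the unadjusted threshold of .05 but not below the adjusted one, which mirrors the pattern described above.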
Taken together, our results indicate that it might be necessary to evaluate the Gordon tests again. Quite possibly, the AMMA assesses musical aptitude in a purer fashion than the IMMA does. Furthermore, our findings highlight how difficult it is to assess musical aptitude.
Limitations and future directions
When interpreting the results reported above, it is important to keep some limitations in mind. Neither the IMMA nor the AMMA was perfectly adequate for the age group under investigation. However, this problem will always be present when these two tests are compared in one sample. Since the Gordon tests were designed to measure musical aptitude across a wide age range (from 3-year-old children up to university students), the sequential application of different tests should cause no problems. It might therefore be expected that all tests would yield similar estimates of musical aptitude in a person as he or she grows older. Otherwise, the whole idea of constructing musical aptitude tests for such a large developmental span would be of no use.
What might also limit our results is the relatively small sample size. It cannot be ruled out that with a larger sample the correlations might reach significance. However, if the observed coefficients reflect the true population values, we would need a sample of at least 700 to 12,000 participants (depending on the coefficient under inspection) to reveal a significant correlation with 95% probability (Faul, Erdfelder, Buchner, & Lang, 2009; Faul, Erdfelder, Lang, & Buchner, 2007).
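The logic of this sample size estimate can be sketched with the Fisher z approximation for a single correlation. This is not the procedure of the software cited above, which uses an exact method, so the resulting numbers will differ somewhat. The function name `required_n` and the example coefficients are assumptions for illustration and are not values from this study.

```python
import math
from statistics import NormalDist


def required_n(r, alpha=0.05, power=0.95, two_tailed=True):
    """Approximate sample size needed to detect a correlation of size r.

    Uses the Fisher z approximation:
        n = ((z_alpha + z_power) / atanh(|r|))**2 + 3
    """
    if not 0 < abs(r) < 1:
        raise ValueError("r must be nonzero and strictly between -1 and 1")
    normal = NormalDist()
    tail_alpha = alpha / 2 if two_tailed else alpha
    z_alpha = normal.inv_cdf(1 - tail_alpha)  # critical value of the test
    z_power = normal.inv_cdf(power)           # quantile for the desired power
    fisher_z = math.atanh(abs(r))             # Fisher z transform of r
    return math.ceil(((z_alpha + z_power) / fisher_z) ** 2 + 3)


# Hypothetical small coefficients (NOT the coefficients observed in the study).
for r in (0.05, 0.10, 0.20):
    print(f"r = {r:.2f}: approximately n = {required_n(r)} for 95% power")
```

The sketch shows why very small coefficients demand very large samples: the required n grows roughly with the inverse square of the coefficient.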
Another limitation of our study might be the correlational design, which only allows a comparison of the IMMA and the AMMA at one point in time. It could be interesting to compare their results repeatedly in a longitudinal approach. However, the realization of such a longitudinal design remains up to future studies.
Beyond these limitations, it is also important to keep in mind that we only tested 9- to 13-year-old children, and therefore can only make conclusions about this age group. It is not clear if the same results would be obtained in children older than 13 years.
Our study has shown that measuring musical aptitude is a difficult endeavor, but reliable and valid musical aptitude tests are important. Hence, it might be useful for future research to critically evaluate and improve existing musical aptitude tests, or to create promising new measures. Our findings might thus contribute to the improvement of tests. When one plans to apply tests of a similar type across a wider age range, even small attempts to simplify the test for younger age groups (e.g., disentangling rhythm tasks from pitch tasks), as is the case for the IMMA, seem to make a crucial difference. Thus, test constructors should try to find a way of making a test suitable for younger age groups other than isolating musical features. In this regard, it might also be important to use musical material instead of artificial stimuli. Another point raised by our findings is the potential problem of assessing musical aptitude independently of musical experience (e.g., music lessons). Although Gordon aimed at assessing musical aptitude without the influence of musical experience, the IMMA seems to be related to the amount of music lessons, at least in our sample. Hence, future work should be aware of this impurity problem. On the one hand, our findings could suggest that it is not possible to measure musical aptitude purely, which comes as no surprise given reasonable definitions of musical aptitude (Gembris, 2003). On the other hand, this result highlights the importance of controlling for the influences of musical experience when constructing a musical aptitude test.
Footnotes
Acknowledgements
The authors would like to thank the participants and their parents for participating in the study.
The study was conducted in full accordance with the Ethical Guidelines of the German Association of Psychologists (DGPs). In accordance with these guidelines, informed consent was obtained from the parents of each participant.
Conflict of interest statement
The authors declare that there is no conflict of interest.
Funding
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
