Abstract
While self-efficacy is known to play an important role in music performance, the magnitudes of reported effect sizes are inconsistent. The purpose of this meta-analysis was to estimate the population effect size for (a) the relationship between self-efficacy and achievement, (b) the relationship between self-efficacy and music performance anxiety (MPA), and (c) the influence of self-efficacy interventions. A literature search identified 220 self-efficacy studies with 46 meeting the inclusion criteria. Heterogeneity among findings required the use of a random-effects model. The results revealed a medium positive effect size between self-efficacy and achievement. Moderator analysis based on age identified a significant difference between secondary school and collegiate participants, while a comparison of instrumentalists and vocalists failed to reject the null. The relationship between self-efficacy and MPA exhibited a medium negative effect size with a significant difference between secondary school and collegiate participants. Self-efficacy interventions demonstrated a substantial impact on self-efficacy beliefs. Multiple contrasts identified differences in intervention effectiveness between K-12, collegiate, and older adult participants. The absence of vocal studies limited comparisons between instrumentalists and vocalists. This study establishes benchmarks for understanding self-efficacy’s role in music performance and makes recommendations for future research to improve achievement and the well-being of musicians.
Self-efficacy is a powerful contributor to performance-based accomplishments (Bandura, 1997). In music, the quality of a performance is influenced not only by the performer’s experience and preparation, but also by their self-efficacy (Cahill Clark, 2008; Matthews, 2007; Papageorgi et al., 2011). Bandura (1997) defines self-efficacy as “beliefs in one’s capabilities to organize and execute the courses of action required to produce given attainments” (p. 3). Although self-efficacy has been associated with achievement in music performance (Hewitt, 2015; McCormick & McPherson, 2003; Zelenak, 2019), music institutions may unintentionally foster environments that diminish performance confidence (Papageorgi et al., 2010) and music teaching practices do not inherently support the development of self-efficacy (Lewis et al., 2022). Anecdotal evidence suggests that music educators can do a better job in supporting the well-being of novice and veteran musicians (Pecen et al., 2018).
Self-efficacy deficiencies affect musicians at all stages of development. Lower secondary school students (ages 12–13 years) leave music programs due to low competence beliefs (Lowe, 2012), as do secondary school students transitioning to college (Mize, 2020). Collegiate musicians’ ratings of health responsibility, physical activity, and spiritual growth reflect diminished levels of self-efficacy, which, in turn, hinder them from reaching their full musical potential (Ginsborg et al., 2009; Panebianco-Warrens et al., 2015). Elite performers identify factors that undermine self-efficacy such as lack of emotional support, competitive environments, negative performance feedback, and stress as challenges to their musical well-being (Pecen et al., 2018). The influence of self-efficacy on music performance, therefore, is worthy of study regardless of performance level.
Elevated self-efficacy empowers musicians to control their musical development (Hendricks, 2016). It raises outcome expectations (Hewitt, 2015), persistence (Fryling, 2015), motivation (Ciorba, 2009), and use of cognitive strategies (Bone, 2011; Cahill Clark, 2008), and diminishes performance anxiety (Cleary, 2013). It is unclear, however, why more music educators and institutions do not embrace the development of self-efficacy. One explanation may be the variability of effect sizes across studies. These inconsistencies bring into question the extent of self-efficacy’s role in music performance. Meta-analysis is a statistical process that summarizes the findings from individual studies (Card, 2012). A meta-analysis evaluating self-efficacy’s relationship with performance achievement, its relationship with music performance anxiety (MPA), and the effectiveness of self-efficacy interventions is needed to establish accurate effect sizes that can inform changes in practice and policy supporting the psychological well-being of musicians.
Theoretical foundations
Bandura’s (1986) social cognitive theory offers a model for human cognition and agency. Within this framework, self-efficacy is the key factor determining intentional acts. Individuals make decisions based on their beliefs in the strength of their abilities in relation to the challenges of impending tasks (Bandura, 1997). Self-efficacy is not a global self-assessment of abilities, but of abilities within specific domains of activity. Domains consist of a range of behaviors that lead to common performance outcomes rather than the duplication of specific behaviors.
Four sources of information contribute to the development of self-efficacy (Bandura, 1997): (a) enactive mastery experience—previous experiences of success or failure in an activity, (b) vicarious experience—using the accomplishments of others to understand one’s own capabilities, (c) verbal/social persuasion—opinions expressed by others regarding the individual’s capability, and (d) physiological and affective states—the degree and quality of arousal brought on by engagement in the task. The influence of this information, however, depends upon its cognitive appraisal. Contextual factors such as social, situational, and temporal circumstances affect an individual’s interpretation of this information (Bandura, 1977). Once assimilated, self-efficacy determines courses of action, degree of effort, perseverance, resiliency, positive thought patterns, and stress levels (Bandura, 1997, p. 3).
Self-efficacy and achievement
McPherson and McCormick (2000) conducted a ground-breaking investigation of 349 instrumentalists (ages 9–18 years) identifying self-efficacy as the strongest predictor of achievement on a performance exam. The authors replicated this finding in subsequent studies (McCormick & McPherson, 2003; McPherson & McCormick, 2006) as did other researchers (González et al., 2018; Ritchie & Williamon, 2012). Although studies agree on the positive relationship, the strength of the relationship (r) (i.e., effect size) varies. These differences are apparent in a comparison of findings among secondary school musicians. Hewitt (2015) identified a moderately strong relationship (r = .69), Davison (2006) reported a weaker relationship (r = .17), and Stewart (2002) encountered a modest negative relationship (r = −.06). In the same study, Hewitt (2015) examined the self-evaluation skills of this age group and discovered that 52% of the participants (middle school males n = 145, high school females n = 31) were inclined to inflate their responses. This inability to provide accurate self-evaluations may account for some of the differences in effect sizes. Meta-analysis, however, minimizes these measurement issues and isolates the influence of moderating variables such as age on effect sizes (Card, 2012).
Increases in self-efficacy promote behaviors that contribute to improvements in achievement. Secondary school and collegiate musicians with high levels of self-efficacy use advanced cognitive practice strategies and attribute their success to hard work and effort (Cahill Clark, 2008; Cremaschi, 2012; Nielsen, 2004). They are more aware of their weaknesses than those with low levels of self-efficacy and identify skills needing improvement (Hewitt, 2015). Elevated performance expectations motivate individuals to attain higher levels of achievement than similar peers with lower expectations (McPherson & McCormick, 2000).
Self-efficacy and MPA
Self-efficacy declines as anxiety from fear arousal increases (Bandura, 1982). In music performance, symptoms of MPA include increased heart rate, elevated blood pressure, physiological arousal (e.g., sweaty hands, racing heartbeat), fear of making mistakes, and decreased performance quality (e.g., memory lapses and loss of technique) (Abel & Larkin, 1990; Craske & Craig, 1984; Robson & Kenny, 2017). Various factors contribute to MPA such as performing in a concert setting, being evaluated during a performance, having prior performance breakdowns, and being a perfectionist (Robson & Kenny, 2017; Sarikaya & Kurtaslan, 2018; Sinden, 1999). Higher levels of self-efficacy, however, have been associated with lower levels of MPA (Paliaukiene et al., 2018). Increases in self-efficacy enable young musicians (Bersh, 2020) and adults (Craske & Craig, 1984) to cope with MPA. But as with achievement, the extent of self-efficacy’s relationship with MPA varies by study ranging from negligible (r = −.08) (Egilmez, 2015) to moderate (r = −.52) (Zarza-Alzugaray et al., 2020). An accurate estimate of the strength of the relationship between self-efficacy and MPA remains to be determined.
Self-efficacy interventions
Researchers have examined the influence of numerous interventions on self-efficacy. They include piano training (Bugos et al., 2016, 2019), mallet training (Bugos & Cooper, 2019), systematic vocalization (Calhoun, 2020), improvisation instruction (Davison, 2006; Hirschorn, 2011; Watson, 2008), goal setting (Mantie, 2019), and mental skills training (Gill, 2020; Kinne, 2016). Once again, the magnitude of effect sizes ranges from very effective (gchange = 1.7, Bugos et al., 2019) to ineffective (gchange = −0.14, Mantie, 2019). A closer look at these two studies reveals that Bugos et al.’s (2019) participants were older adult volunteers, while Mantie’s (2019) 6th-grade participants were required to enroll in the music class. Aside from the age difference, the enrollment conditions (volunteers vs required class) may have also influenced the participants’ receptiveness to the intervention. Concerns in other studies have been the duration of the intervention (Cremaschi, 2012; Miksza, 2015) and the longevity of the improvement (Gill, 2020). Hendricks (2009) found self-efficacy increased among high school students participating in a 3-day orchestra festival. This informal intervention provided a variety of experiences to develop self-efficacy such as chair auditions, rehearsals, and social interaction, while promoting cognitive processing through the completion of surveys and interviews. Effective interventions may not be limited to systematic programs.
Measurement instruments and techniques
Researchers have used several techniques to collect self-efficacy data. Single questions (Hewitt, 2015; McPherson & McCormick, 2000; Paliaukiene et al., 2018; Robson & Kenny, 2017), multiple questions (Cleary, 2013; McCormick & McPherson, 2003), general self-efficacy scales (Schwarzer & Jerusalem, 1995; Sherer et al., 1982) and domain-specific scales (Cahill Clark, 2008; Ritchie & Williamon, 2010; Zelenak, 2011) are a few examples. According to Bandura (1997), general self-efficacy scales are inappropriate for measuring self-efficacy in domains such as music performance unless scale items have been modified to reference actions or self-perceptions related to the domain. Bandura (2006) also recommends phrasing items with the words “can do” rather than “will do” and creating responses based on a 100-point scale broken into 10-point intervals. Problems, nevertheless, have been identified with self-report instruments. Quantitative data do not consistently align with observed behavior (Ali, 2010) and interview responses (Levine, 2019; Mantie, 2019; Mott, 2011).
Achievement in music performance is also difficult to define and quantify (Bergee & Weingarten, 2021). McPherson (1993) identified five skills that represent a broad concept of musical performance. Those skills are performing (a) rehearsed music, (b) by ear, (c) by sight-reading, (d) from memory, and (e) by improvising. He found significant (p < .05) correlations between skills ranging from r = .64 (by ear and from memory) to r = .77 (by ear and by improvising) supporting the inclusion of these skills in the same domain. Bandura (1997) recommends using this intermediate level of organization in which skills share common properties rather than duplicating specific behaviors to identify domains. This view aligns with other meta-analyses in education where the common practice is to include findings that share theoretical grounding and contain activities from the given domain (Unrau et al., 2018). Assessment of these skills, in turn, indicates the quality of the performance resulting in rankings, grades, or qualitative descriptions (McPherson & Thompson, 1998).
Meta-analysis
Meta-analysis is a process by which results from individual studies are aggregated to arrive at a population effect size. The limitations of individual studies imposed by research designs, sampling methods, methodological artifacts, and statistical power have less influence in a meta-analysis (Card, 2012). Glass (1976) is often credited with developing this analytical approach. Meta-analyses are common in medical fields but uncommon in music performance. Music scholars have used meta-analysis to summarize the relationship between music instruction and the development of academic and cognitive skills (Cooper, 2020; Gordon et al., 2015; Sala & Gobet, 2017; Standley, 2008), sight-reading (Mishra, 2014, 2016), deliberate practice (Platz et al., 2014), singing instruction (Svec, 2018), and the use of Gordon’s aptitude tests (Hanson, 2019). The only meta-analyses I found related to the current study were Varela et al.’s (2016) investigation of the relationship between self-regulation and music learning and Barros et al.’s (2022) examination of MPA among undergraduate students. I am aware of no other meta-analyses of self-efficacy in music performance.
Moderating variables
Moderating variables contribute to the heterogeneity among effect sizes (Card, 2012). Bandura (1997) proposed that self-efficacy reacts to biological changes, social events, and physical environments. These factors modify self-efficacy beliefs which are then manifested through behavior (Bandura, 1989). From this perspective, it is reasonable to examine age as a moderator since many changes occur over the life cycle. In music performance, some findings support the assertion that self-efficacy changes with age (Egilmez & Engur, 2017; Randles, 2011), while others oppose it (Akçay et al., 2021; Dempsey & Comeau, 2019; Zelenak, 2015). Inconsistencies are also evident in comparisons of MPA by age with differences occurring between younger and older musicians (Ackermann et al., 2014; Dempsey & Comeau, 2019; González et al., 2018).
Forms of expression within activity domains are differentiated by self-efficacy beliefs (Bandura, 1997). In music, vocal and instrumental activities are distinct forms of expression. Neuroimaging results indicate that vocal and instrumental performances activate shared and unique portions of the brain (Whitehead & Armony, 2018). Although few comparisons of vocal and instrumental self-efficacy exist in the literature, most report no difference (Nielsen, 2004; Sandgren, 2019; Zelenak, 2015). One study, however, claims a significant difference between instrument types (White, 2010). This claim is suspect since the data were collected using an unmodified general self-efficacy scale (Schwarzer & Jerusalem, 1993) and the comparisons were between color guard (flag ensemble) and percussion, as well as color guard and brass. Understanding the influence of age and instrument type (vocal vs instrumental) is essential to the practical application of self-efficacy research since music education programs and resources are typically categorized by age and instrument type.
The primary purpose of this meta-analysis was to summarize self-efficacy effect sizes in music performance and determine accurate estimates for the larger population. Primary questions were (RQ1) What is the magnitude of the relationship between self-efficacy and achievement in music performance? (RQ2) What is the magnitude of the relationship between self-efficacy and MPA? and (RQ3) How much influence do interventions have on self-efficacy? Secondary-level questions examined the impact of two moderators, age and instrument type, within the primary questions.
Method
Literature search
I combined the keywords “self-efficacy” and “music” using the Boolean operator “AND” to locate relevant studies. Initially, the keywords were entered into Google Scholar and then the electronic databases JSTOR, ERIC, Education Full Text, Academic Search Premier, Music Periodicals Database, and PsycINFO for published literature, as well as ProQuest Dissertations and Theses for unpublished doctoral dissertations. To identify work published in non-traditional sources, I searched the Directory of Open Access Journals. To verify results and to determine that no study was overlooked, the same keywords were entered into Journal of Research in Music Education, Musicae Scientiae, Music Education Research, International Journal of Music Education, Research Studies in Music Education, and Psychology of Music. Queries concluded on September 1, 2021, identifying 220 studies.
Inclusion and exclusion criteria
I constructed a coding manual that defined inclusion and exclusion criteria. A decision tree was used to focus the coding process (Supplemental Figure 1). To accommodate my dominant language, only studies available in English were included. The first criterion required studies to be quantitative in approach and to include effect sizes or data from which an effect size could be calculated. The significance of the findings was irrelevant. Studies reporting only significance test results were excluded. In several cases, I contacted authors of studies with missing statistics but had limited success. The second criterion required studies to investigate the construct of self-efficacy from the perspective of Bandura’s (1986) social cognitive theory. Next, only studies investigating self-efficacy for music performance were included. Studies measuring self-efficacy as a general self-perception were excluded, while studies that deviated from Bandura’s (2006) guidelines for constructing self-efficacy scales were included to increase the generalizability of the findings. Rosenthal and DiMatteo (2001) recommended mixing “apples and oranges,” suggesting “studies that are exactly the same in all respects are actually limited in generalizability” (p. 68). This rationale also supported my decision to include music majors and non-music majors as participants. It would be highly assumptive to determine each participant’s level of musical experience based on enrollment status. Finally, I used McPherson’s (1993) list (sightreading, performing rehearsed repertoire, playing from memory, playing by ear, and improvising) to identify activities as music performance. Publication status was not a criterion. Doctoral dissertations were included, but master’s degree theses were excluded. Findings from doctoral dissertations published in journals were also excluded to avoid duplication.
Coding of study characteristics and effect sizes
Studies were coded in an Excel spreadsheet to identify those meeting the inclusion criteria and for future moderator analysis. Categories in the spreadsheet were identical to the inclusion/exclusion criteria. I removed studies that did not meet the first three criteria. Remaining studies were coded for four additional characteristics: (d) self-efficacy measurement instrument, (e) sample size, (f) age, and (g) instrument type.
Reliability of coding decisions
The reliability of the coding process was examined from the intercoder and intra-coder perspectives. To establish intercoder reliability, I coded all studies and then asked a doctoral student in music education with an interest in self-efficacy to separately code 20% of them (n = 44) (Lipsey & Wilson, 2001). I numbered the studies using a random number generator (random.org). We agreed on 142 of 158 decision points rendering an overall agreement rate of 90%. To establish intra-coder reliability, I coded the studies once, and then recoded them several months later. My coding was consistent for 98% of the studies. Two studies were found to contain usable data missed in the initial coding. Conducting a more rigorous comparison such as Cohen’s kappa was not possible due to the number of choices in each category.
Statistical analysis
Meta-analyses are usually conducted using one of two models, namely, fixed-effects or random-effects. A fixed-effects model is used when methods and measurement instruments are consistent across studies. Given the diversity of participants, performance activities, and data collection methods in the current study, a random-effects model was appropriate. The random-effects model estimates a single population effect size from multiple effect size distributions (Card, 2012). It can be generalized to the larger population because it is based on population distributions rather than the findings of individual studies.
Power
I used G*Power version 3.1.9.7 a priori to identify the number of studies needed to detect a large effect size (.50), with reasonable probability (.80), given α = .05 (Cohen, 1992). Results indicated that 26 studies were needed for each question. The selection process, however, identified fewer studies for RQ1 (k = 19), RQ2 (k = 15), and RQ3 (k = 20). This lack of power increased the probability of making a Type II error (Ellis, 2010). Card (2012) proposed that meta-analyses with inadequate power have greater statistical power than the studies they are comprised of and “meta-analyses are impacted less by inadequate power than primary studies” (p. 22). I chose to accept the risk of having reduced power as being a more reasonable alternative than adjusting the inclusion criteria and compromising the integrity of the selection process.
Effect sizes
Two types of effect sizes were used in this study, the Pearson Product-Moment correlation coefficient (r), and the longitudinal standardized mean change (gchange). Pearson’s r was used to address RQ1 (self-efficacy and achievement) and RQ2 (self-efficacy and MPA), while gchange was used to answer RQ3 (influence of intervention). The calculation of Pearson’s r followed established procedures (Glass & Hopkins, 1996). To determine the longitudinal standardized mean change (gchange), I calculated the standardized mean difference (g) between independent groups and inserted the pre- and post-treatment observations in place of the group means (Lipsey & Wilson, 2001, p. 44).
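As a rough illustration of the standardized mean change described above, the sketch below divides the pre-to-post difference by a pooled standard deviation. The pooling rule (root mean square of the two standard deviations, which assumes the same participants at both time points), the function name `g_change`, and the example scores are assumptions made for illustration; they are not taken from the study or from Lipsey and Wilson (2001).

```python
import math

def g_change(mean_pre, mean_post, sd_pre, sd_post):
    """Standardized mean change: (post - pre) / pooled SD."""
    # Assumption: equal n at both time points, so the pooled SD is the
    # root mean square of the pre- and post-treatment SDs.
    sd_pooled = math.sqrt((sd_pre ** 2 + sd_post ** 2) / 2)
    return (mean_post - mean_pre) / sd_pooled

# Hypothetical scores on a 100-point self-efficacy scale.
print(round(g_change(60.0, 68.0, 10.0, 10.0), 2))  # 0.8
```

A positive value indicates that self-efficacy rose after the intervention; a negative value indicates a decline.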
Effect size adjustments
For studies that reported effect sizes divided between groups of participants (i.e., beginning and advanced, boys and girls), I used the weighted average of all participants as the cumulative effect size. To accommodate skewed sampling distributions from studies with small sample sizes, all correlation coefficients were transformed into Fisher’s z values and then back into Pearson’s r upon completion of the analysis (Glass & Hopkins, 1996). Since mean change effect sizes (gchange) are also influenced by small samples, studies with fewer than 20 participants were adjusted (Card, 2012; Hedges & Olkin, 1985; Lipsey & Wilson, 2001).
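The two adjustments above can be sketched as follows. The Fisher transformation is the standard inverse hyperbolic tangent. The small-sample correction shown is the widely cited form 1 − 3/(4N − 9); the exact correction applied in this study is not reported, so treating it as this form is an assumption, as are the function names.

```python
import math

def fisher_z(r):
    """Transform a Pearson r into Fisher's z."""
    return math.atanh(r)

def z_to_r(z):
    """Transform Fisher's z back into a Pearson r."""
    return math.tanh(z)

def se_fisher_z(n):
    """Standard error of Fisher's z for a sample of n participants."""
    return 1 / math.sqrt(n - 3)

def small_sample_correction(g, n):
    """Shrink a standardized mean effect size from a small sample.

    Assumption: the common 1 - 3 / (4N - 9) correction factor.
    """
    return g * (1 - 3 / (4 * n - 9))

# Round trip for a hypothetical correlation, and a correction for n = 12.
z = fisher_z(0.44)
print(round(z_to_r(z), 2))                        # 0.44
print(round(small_sample_correction(1.0, 12), 3))  # 0.923
```

Averaging and weighting are carried out on the z values, whose sampling distribution is close to normal, and only the final estimates are converted back to r.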
Artifact correction
Artifact correction is a technique used to improve the precision of effect sizes. The general equation for artifact corrections is ES_adjusted = ES_observed / α, where ES_adjusted is the adjusted (corrected) effect size, ES_observed is the observed (uncorrected) effect size, and α is the total correction for all study artifacts (Card, 2012, p. 130).
Unreliability and artificial dichotomization were two artifacts needing correction. Unreliability is the error generated by a measurement instrument. A correlation coefficient represents not only the correlation between constructs, but also the correlation between the unreliability and “noise” of each measure. Reliability values reported as internal consistency, interrater reliability, or test-retest reliability were used to calculate unreliability. In studies missing a reliability coefficient, I replaced the missing value with the average of the reported values. For standardized mean change effect sizes, the effect size represents the change in the construct and the change in the error. The change in error reduces the effect size, so that it is smaller than the true population effect size (Card, 2012, p. 132). The second issue, artificial dichotomization, occurs when researchers split results from a continuous variable into two groups. Hunter and Schmidt (2004) provide a calculation to transform the dichotomization back into a normal distribution (p. 36).
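A minimal sketch of the unreliability correction follows. It assumes the classical attenuation formula, in which the artifact multiplier is the square root of the product of the two measures' reliabilities, and it assumes the standard error is divided by the same multiplier. The function names and the example values are hypothetical.

```python
import math

def unreliability_artifact(rel_x, rel_y=1.0):
    """Artifact multiplier for measurement unreliability.

    Assumption: classical attenuation, sqrt(rel_x * rel_y). Leave rel_y
    at 1.0 when only one measure's reliability is being corrected.
    """
    return math.sqrt(rel_x * rel_y)

def correct_effect_size(es_observed, se_observed, artifact):
    """Apply ES_adjusted = ES_observed / artifact to the effect size and SE."""
    return es_observed / artifact, se_observed / artifact

# Hypothetical study: observed r = .40, both measures with reliability .80.
a = unreliability_artifact(0.80, 0.80)
es_adj, se_adj = correct_effect_size(0.40, 0.10, a)
print(round(es_adj, 2), round(se_adj, 3))  # 0.5 0.125
```

Because the multiplier is below 1 whenever a measure is imperfectly reliable, the corrected effect size is larger in magnitude than the observed one, and its standard error grows by the same factor.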
Multiple effect sizes
Multiple effect sizes taken from the same study violate the assumption of independence. Consequently, I computed the average effect size when multiple effect sizes were reported within the same study.
Weighting studies
Not all effect sizes are equal. Effect sizes coming from studies with larger sample sizes have smaller standard errors and greater precision. They are more desirable and given greater weight. In this study, the effect sizes were multiplied by the weight (i.e., the inverse of the squared standard error) before being aggregated (Card, 2012).
Moderators
Bandura’s (1997) theory suggests that age and form of expression (i.e., instrument type) are potential sources of heterogeneity. I reorganized the findings by age and instrument type and compared their heterogeneity value (Q) to the chi-square value and examined differences in their mean effect sizes.
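The moderator comparison can be sketched as a partition of heterogeneity: the total Q across all studies minus the sum of the Q values within each moderator group leaves the between-group portion, which is compared to a chi-square critical value with (number of groups − 1) degrees of freedom. The sketch assumes inverse-variance weights; the function names and data are hypothetical.

```python
def q_statistic(effect_sizes, standard_errors):
    """Weighted sum of squared deviations from the weighted mean."""
    weights = [1 / se ** 2 for se in standard_errors]
    mean = sum(w * es for w, es in zip(weights, effect_sizes)) / sum(weights)
    return sum(w * (es - mean) ** 2 for w, es in zip(weights, effect_sizes))

def q_between(groups):
    """groups: list of (effect_sizes, standard_errors), one per moderator level."""
    all_es = [es for group in groups for es in group[0]]
    all_se = [se for group in groups for se in group[1]]
    q_total = q_statistic(all_es, all_se)
    q_within = sum(q_statistic(es, se) for es, se in groups)
    return q_total - q_within

# Hypothetical Fisher z values for two age groups, two studies each.
secondary = ([0.5, 0.5], [0.1, 0.1])
collegiate = ([0.3, 0.3], [0.1, 0.1])
qb = q_between([secondary, collegiate])
print(round(qb, 2), qb > 3.84)  # 4.0 True (3.84 is the df = 1, alpha = .05 critical value)
```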
Estimating the random-effects model
Card (2012) identified four steps to estimate a random-effects model: (a) estimate the heterogeneity among effect sizes (p. 234), (b) estimate the population variability among effect sizes (p. 236), (c) use this population variability to generate random-effects weights for effect sizes (p. 238), and (d) estimate a random-effects mean effect size and standard error (p. 239).
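The four steps can be sketched as below. The between-study variance uses a common method-of-moments estimator truncated at zero; whether this matches the spreadsheet formulas used in the study cell for cell is an assumption, and the function name and example data are hypothetical.

```python
import math

def random_effects(effect_sizes, standard_errors):
    """Four-step random-effects estimate from effect sizes and their SEs."""
    k = len(effect_sizes)
    w = [1 / se ** 2 for se in standard_errors]
    fixed_mean = sum(wi * es for wi, es in zip(w, effect_sizes)) / sum(w)

    # (a) Heterogeneity among effect sizes.
    q = sum(wi * (es - fixed_mean) ** 2 for wi, es in zip(w, effect_sizes))

    # (b) Population variability (tau squared), truncated at zero.
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)

    # (c) Random-effects weights add tau squared to each sampling variance.
    w_star = [1 / (se ** 2 + tau2) for se in standard_errors]

    # (d) Random-effects mean effect size and its standard error.
    mean = sum(wi * es for wi, es in zip(w_star, effect_sizes)) / sum(w_star)
    se_mean = math.sqrt(1 / sum(w_star))
    return {"Q": q, "tau2": tau2, "mean": mean, "se": se_mean}

# Three hypothetical studies with equal precision.
result = random_effects([0.2, 0.4, 0.6], [0.1, 0.1, 0.1])
print({name: round(value, 3) for name, value in result.items()})
```

When Q does not exceed its degrees of freedom, tau squared is set to zero and the random-effects weights reduce to the ordinary inverse-variance weights.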
All calculations in this study were completed using Excel from Microsoft’s Office Professional Plus 2016 program suite. Although dedicated software programs are available, the guiding reference for this study (Card, 2012) recommended using Excel to increase the researcher’s understanding of the meta-analytical process.
Results
The literature search identified 220 studies with 46 meeting the inclusion/exclusion criteria. Two studies contained the results of two separate investigations. Since the location, participants, and activities in the investigations were different, I treated the investigations as independent studies increasing the total number of studies (k = 48). In studies reporting multiple effect sizes, the average effect size from the study was used in this analysis (k = 7).
Research Question 1: what is the magnitude of the relationship between self-efficacy and achievement in music performance?
Multiple studies (k = 19) examined the relationship between self-efficacy and achievement (N = 2,363) (Table 1). The studies consisted of published journal articles (k = 12) and unpublished dissertations (k = 7) (Supplemental Table 2). They utilized a variety of self-efficacy measures: single questions (k = 4), multiple questions (k = 1), modified general self-efficacy scales (k = 1), and domain-specific scales (k = 13). Many studies evaluated a combination of musical skills (k = 8) while others measured single skills (rehearsed music k = 4, sight-reading k = 2, from memory k = 2, improvisation k = 3). No studies evaluated the ability to play by ear. Achievement quality was determined by judges with predetermined criteria (k = 11), single performance grades (k = 3), standardized music performance exams (k = 2), and performance-based class grades (k = 3). Most studies followed practices from the Western European tradition (k = 15) while others were jazz-oriented (k = 4).
Table 1. Meta-analysis Summaries.
Note. CI: confidence interval. Studies with combined age groups were omitted. No vocal studies available for comparison. Only one vocal study available for comparison.
p < .001; **p < .05.
Following Card’s (2012) four steps outlined above, heterogeneity (Q = 113.72) was higher than the critical value, χ² (18) = 42.31, p < .001 (Table 1), suggesting the variability came from a source other than sampling fluctuation. The magnitude of the heterogeneity (I²) indicated that 84.2% of the observed variance was due to between-study variability and not within-study error. I combined the population variability (τ²) with the study-specific standard error to determine the random-effects weights. The final random-effects mean effect size was r = .44, p < .05, SE = 0.06, 95% CIs = [0.34, 0.54] (Supplemental Figure 2).
Age and instrument type were examined as possible sources of heterogeneity. A comparison between studies with secondary school participants (k = 10) and collegiate participants (k = 9) generated a value (Qbetween = 16.49) larger than the critical value [χ² (1) = 10.83, p < .001], confirming a difference in the strength of the relationship between each group (secondary school participants r = .55, collegiate participants r = .40). A comparison of instrumental (k = 13) and vocal participants (k = 2) failed to reject the null, Qbetween = 2.70, χ² (1) = 3.84, p > .05. Four studies were not included in this comparison because the data for instrumentalists and vocalists could not be separated.
Research Question 2: what is the magnitude of the relationship between self-efficacy and MPA?
Fifteen studies examined the relationship between self-efficacy and MPA (N = 3,200) (Table 1). They included published journal articles (k = 11) and unpublished dissertations (k = 4) (Supplemental Table 3). Domain-specific scales were used most frequently to measure self-efficacy (k = 12), while researchers also employed single questions (k = 2) and a modified general self-efficacy scale (k = 1). Many studies did not engage participants in a performance activity (k = 9), while others integrated rehearsed music performances (k = 2), memorized performances (k = 1), and combinations of performance skills (k = 3). MPA data were collected through domain-specific measures (k = 10), general anxiety measures (k = 4), and modified general anxiety measures (k = 1). Heterogeneity (Q = 48.60) was higher than the critical value, χ² (14) = 36.12, p < .001 (Table 1), suggesting the variability can be attributed to sources other than sampling fluctuation. The magnitude of heterogeneity (I²) indicated that 71.2% of the observed variance was due to between-study variability. The random-effects mean effect size was determined to be r = −.45, p < .05, SE = 0.04, 95% CIs = [−0.51, −0.38] (Supplemental Figure 3).
Age was one source of heterogeneity. A comparison between secondary school (k = 5) and collegiate participants (k = 6) generated a heterogeneity value (Qbetween = 5.59) larger than the critical value, χ² (1) = 3.84, p < .05, confirming that the magnitude of the effect size between self-efficacy and MPA was different for each group (secondary school participants r = −.40, collegiate participants r = −.50). Four studies were not included because the participants were from both secondary and collegiate populations. Moderator analysis by instrument type could not be conducted because no vocal studies met the criteria.
Research Question 3: how much influence do interventions have on self-efficacy?
Twenty studies examined the impact of interventions on self-efficacy for music performance (N = 903) (Table 1). Studies consisted of published journal articles (k = 6) and non-published dissertations (k = 14) (Supplemental Table 4). All studies used domain-specific scales to measure self-efficacy, and all offered activities that promoted the development of enactive mastery experiences as one source of self-efficacy information. Attention to the other sources varied across interventions. Heterogeneity (Q = 174.81) was higher than the critical value, χ2 (19) = 43.82, p < .001 (Table 1) suggesting that the variability in these studies can be attributed to sources other than sampling fluctuation. The magnitude of heterogeneity (I2) indicated that 89.1% of the observed variance was due to between-study variability. The random-effects mean effect size was gchange = 0.64, p < .05, SE = 0.09, 95% CIs = [0.47, 0.82].
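The magnitude-of-heterogeneity percentages reported for the three research questions can be reproduced from the reported Q values and their degrees of freedom (k − 1). The sketch assumes the usual definition of that index, (Q − df) / Q expressed as a percentage; the function name is hypothetical.

```python
def i_squared(q, df):
    """Percentage of observed variance attributed to between-study variability."""
    return max(0.0, (q - df) / q) * 100

# Q and df as reported for each research question.
reported = [("RQ1", 113.72, 18), ("RQ2", 48.60, 14), ("RQ3", 174.81, 19)]
for label, q, df in reported:
    print(label, round(i_squared(q, df), 1))
# RQ1 84.2
# RQ2 71.2
# RQ3 89.1
```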
Moderator analysis identified age as one source of heterogeneity among interventions. Multiple contrasts included three age groups: K-12 participants (k = 11), collegiate participants (k = 6), and older adult participants (k = 3). Since many sample sizes were small and the number of studies in each group was unequal, I applied the Bonferroni correction (p < .01) to determine statistical significance (Card, 2012). The contrasts identified significant differences between K-12 and collegiate participants [Qbetween = 16.69, χ² (1) = 6.64, p < .01], K-12 and older adult participants [Qbetween = 25.90, χ² (1) = 6.64, p < .01], and collegiate and older adult participants [Qbetween = 8.01, χ² (1) = 6.64, p < .01]. A comparison by instrument type was not possible because only one study with vocalists met the criteria.
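The three contrasts can be re-examined from the reported Qbetween values alone. The sketch below is illustrative and rests on two assumptions: each Qbetween is referred to a chi-square distribution with one degree of freedom (as the reported critical values imply), and the conventional Bonferroni threshold is the familywise alpha divided by the number of contrasts. The helper name `chi2_df1_p_value` is hypothetical.

```python
import math


def chi2_df1_p_value(q: float) -> float:
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom.

    For df = 1, P(X > q) = P(|Z| > sqrt(q)) = erfc(sqrt(q / 2)).
    """
    return math.erfc(math.sqrt(q / 2.0))


# Qbetween values reported in the text for the three age contrasts.
q_between = {
    "K-12 vs. collegiate": 16.69,
    "K-12 vs. older adult": 25.90,
    "collegiate vs. older adult": 8.01,
}

family_alpha = 0.05
bonferroni_alpha = family_alpha / len(q_between)  # about .0167 for three contrasts
reported_alpha = 0.01  # the stricter threshold reported in the text

p_values = {label: chi2_df1_p_value(q) for label, q in q_between.items()}
for label, p in p_values.items():
    print(f"{label}: Q = {q_between[label]:.2f}, p = {p:.5f}, "
          f"below {reported_alpha}: {p < reported_alpha}")
```

Under these assumptions, all three contrasts fall below both the conventional Bonferroni threshold (about .0167) and the stricter .01 level used in the text.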
Discussion
Successful performances are linked to feelings of sufficient preparation and positive mind-sets (Clark et al., 2014). This investigation estimated self-efficacy’s role in music performance by combining results from single studies using meta-analysis. I found the relationship between self-efficacy and achievement to exhibit a moderate effect size (r = .44). According to Cohen (1992), a medium (i.e., moderate) effect size “represents an effect likely to be visible to the naked eye of a careful observer” (p. 156). Fritz et al. (2012) introduced two statistics (probability of superiority, PS, and the percentage of nonoverlap of distributions, U1) to further assist in the interpretation of effect sizes. For an effect size of r = .44, they predict (p. 8) that a randomly selected individual with a high level of self-efficacy will demonstrate a high level of achievement 76% of the time (PS). They also propose that 55% (U1) of the achievement scores for individuals with high levels of self-efficacy will not overlap with those demonstrating low levels of achievement. From another perspective, the random-effects mean effect size demonstrates greater accuracy than the weighted mean effect size because it crosses the confidence intervals of more studies (k = 16) than the weighted mean does (k = 14) (Supplemental Figure 2).
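The PS and U1 values cited above come from a published lookup table, but they can be approximated directly. The following is a minimal sketch, assuming the standard normal-theory expressions: r is converted to d with d = 2r / √(1 − r²), PS = Φ(d / √2), and Cohen's U1 = (2Φ(d/2) − 1) / Φ(d/2). These formulas and all function names are assumptions introduced for illustration, not the procedure described by Fritz et al. (2012).

```python
import math


def phi(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def r_to_d(r: float) -> float:
    """Convert a correlation to Cohen's d (equal-group-size conversion)."""
    return 2.0 * r / math.sqrt(1.0 - r * r)


def probability_of_superiority(d: float) -> float:
    """Chance that a random member of the higher group outscores one from the lower group."""
    return phi(abs(d) / math.sqrt(2.0))


def u1_nonoverlap(d: float) -> float:
    """Cohen's U1: proportion of the two distributions that does not overlap."""
    u2 = phi(abs(d) / 2.0)
    return (2.0 * u2 - 1.0) / u2


d = r_to_d(0.44)
ps = probability_of_superiority(d)
u1 = u1_nonoverlap(d)
print(f"d = {d:.2f}, PS = {ps:.0%}, U1 = {u1:.0%}")
```

With r = .44 the sketch gives d of roughly 0.98, and PS and U1 round to the 76% and 55% quoted in the text. Because both statistics depend only on the magnitude of d, the same values apply to a negative correlation of similar size.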
The random-effects mean effect size for the relationship between self-efficacy and MPA was also moderate, but negative in direction (r = −.45) (Supplemental Figure 3). As self-efficacy increases, MPA decreases. Although Cohen’s interpretation remains the same, Fritz et al.’s (2012) recommendations are the inverse of those stated above, suggesting that individuals with higher levels of self-efficacy demonstrate lower levels of MPA 76% of the time (PS), and that 55% of the individuals with high levels of self-efficacy will not overlap with individuals demonstrating high levels of MPA (U1).
Further analysis suggests that the heterogeneity of effect sizes can be attributed to between-study variability (Table 1). Age differences were one source of heterogeneity. In the relationship with achievement, secondary school participants exhibited a stronger relationship (i.e., larger effect size) (r = .55) than collegiate participants (r = .40). This finding aligns with Bandura’s premise that self-efficacy beliefs change over the life cycle. Between these participants, the responsibilities of adulthood may weigh more heavily on the minds of collegiate students than those carried by secondary school students. This finding also supports claims that self-efficacy decreases among older students due to increasingly difficult repertoire and examination requirements (McCormick & McPherson, 2003; Randles, 2011). Finally, it is consistent with findings in which no difference was found between middle and high school participants (Dempsey & Comeau, 2019; Zelenak, 2019), since these participants are likely to live with a parent or guardian and are at a similar life stage. In the relationship between self-efficacy and MPA, collegiate participants demonstrated a stronger relationship (r = −.50) than secondary school participants (r = −.40). One interpretation of this finding may be that secondary school musicians with greater control of their MPA continue music studies in college.
Instrument type could not be verified as a significant contributor to heterogeneity in either relationship. Vocal and instrumental participants exhibited similar effect sizes in the relationship between self-efficacy and achievement. This finding aligns with other self-efficacy comparisons by instrument type (Nielsen, 2004; Sandgren, 2019; Zelenak, 2019). One weakness in this comparison may be the difference in sample sizes (instrumental k = 13 and vocal k = 2). A comparison by instrument type could not be conducted for self-efficacy and MPA.
Interventions were found to have a substantial impact on self-efficacy (gchange = 0.64). A similar increase on the Wechsler IQ Test would move an individual with a mean score of 100 from the “average” IQ category to the “high average” category (Wikipedia, n.d.). Although Cohen (1992) and Fritz et al. (2012) do not provide values to interpret gchange, this statistic is in the same family as Cohen’s d (Ellis, 2010). Fritz et al. (2012) propose that a randomly selected individual participating in an intervention demonstrating an effect size of d = 0.64 would have a 66% chance of obtaining a higher level of self-efficacy than an individual who did not participate in the intervention (PS). Fritz et al. also suggest that 38% of the distributions for individuals participating in interventions do not overlap with those not participating (U1). Both statistics highlight the advantages of participating in interventions. Differences by age, however, contributed to between-group heterogeneity. The effectiveness of interventions increased as participants grew older (Table 1). A comparison by instrument type could not be completed.
A comprehensive description of self-efficacy interventions is beyond the scope of this investigation; nonetheless, some details became apparent. All interventions included a component that engaged participants in enactive mastery experiences (Supplemental Table 4). The moderately strong influence of these interventions on self-efficacy supports Bandura’s tenet that enactive mastery experience is a powerful source of self-efficacy information. Future interventions should consider including enactive mastery experiences in their design. While the primary focus of some interventions was the development of mastery experiences (k = 9), other interventions targeted the cognitive processing of self-efficacy information (k = 11). A comparison of mean effect sizes revealed no significant difference between those developing enactive mastery experiences (M = 0.70, SD = 0.47) and those utilizing cognitive processing (M = 0.57, SD = 0.38). One rare comparison of interventions found pre-performance routines more effective than goal-setting processes in elevating self-efficacy before a performance (Tief & Gröpel, 2020).
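The article does not say which test was used to compare the two intervention types. As one plausible check, the sketch below applies an unweighted Welch t-test to the reported summary statistics. The choice of test, the hard-coded two-tailed critical value for 15 degrees of freedom (2.131), and the function name `welch_t` are all assumptions made for illustration.

```python
import math


def welch_t(m1: float, sd1: float, n1: int, m2: float, sd2: float, n2: int) -> tuple:
    """Return Welch's t statistic and its approximate degrees of freedom."""
    v1 = sd1 ** 2 / n1  # squared standard error of the first mean
    v2 = sd2 ** 2 / n2  # squared standard error of the second mean
    t = (m1 - m2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Summary statistics reported in the text.
t_stat, df = welch_t(0.70, 0.47, 9, 0.57, 0.38, 11)

# Two-tailed critical t at alpha = .05 for 15 degrees of freedom (df rounded down).
T_CRIT_DF15 = 2.131

print(f"t = {t_stat:.2f}, df = {df:.1f}")
print(f"significant at .05: {abs(t_stat) > T_CRIT_DF15}")
```

Under these assumptions t is roughly 0.67 on about 15 degrees of freedom, well short of the critical value, which is consistent with the reported absence of a significant difference.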
Although Bandura (1997) identifies age and form of expression as moderators of self-efficacy, other studies have investigated the influence of gender. Bandura (1989) attributes gender differences to “psychosocial” influences. Although not part of this study, the effect sizes found in gender studies are similarly inconsistent and worthy of systematic review. Some identify stronger self-efficacy among males (Ciftcibasi, 2020; Nielsen, 2004; Zarza-Alzugaray et al., 2020), while others identify it among females (Hendricks, 2009; Previti, 2003). Differences in musical genre are another moderator that may influence self-efficacy. Researchers are beginning to examine the role of self-efficacy in genres other than traditional Western European practices. Recent findings suggest that musicians in other genres are influenced by different motivational factors than those in traditional large ensembles (Rolandson, 2020; Schmidt & Gruber, 2023).
Limitations
Publication bias is an important consideration in meta-analysis. I conducted a moderator analysis comparing the published (k = 28) and unpublished studies (k = 20) in this analysis. Between-group heterogeneity (Qbetween = 74.49) was higher than the critical value, χ² (1) = 3.84, p < .05, indicating the presence of bias. A simple comparison of mean effect sizes, however, casts doubt on this finding (published r = .40, SD = 0.16; unpublished r = .35, SD = 0.21): the modest difference in effect sizes and the relatively large standard deviations suggest that publication bias may be minimal.
Research designs limit the interpretation of results. The current study confirmed self-efficacy’s positive relationship with achievement and negative relationship with MPA, but it cannot claim that self-efficacy causes achievement to increase or MPA to decrease due to the correlational design of most studies. Another limitation is the lack of vocal studies in this area. Moderator analysis by instrument type could not be conducted for RQ2 and RQ3. In addition, numerous qualitative studies exist that were not included in this quantitative analysis.
Conclusion
This study offers a summary of self-efficacy’s role in music performance. The goal of this investigation was to examine self-efficacy through the lens of Bandura’s social cognitive theory. From a theoretical perspective, the results are consistent with Bandura’s (1997) self-efficacy theory. They confirm the positive relationship between self-efficacy and achievement, and the negative relationship between self-efficacy and performance anxiety. In addition, they demonstrate that self-efficacy is a malleable construct and can be manipulated by interventions. These assertions provide a foundation for future studies to build on. From a practical perspective, this study establishes population effect sizes that researchers can use as benchmarks, and it corroborates the effectiveness of self-efficacy interventions. From a policy perspective, the results of this study can be used to advocate for the integration of self-efficacy development into music programs. Sun (2022) documents the positive impact of self-efficacy on psychological well-being and advocates for the development of self-efficacy by institutions and practitioners. The results from the current study confirm that improvements in performance achievement depend on more than the development of musical knowledge and skill.
How can these findings be implemented? Pre-service music teacher training and professional development workshops may provide the best opportunities for communicating this information (Mize, 2020). Disseminating effective strategies to educators and institutions is a crucial step forward. Developing teaching practices that lead to higher levels of achievement can provide the data-based evidence needed to sustain self-efficacy initiatives (Lewis et al., 2022). Other issues, however, are worth considering. Social and cultural forces have been reported to inhibit the development and manifestation of self-efficacy (Egilmez & Engur, 2017; Hendricks, 2009; Topoğlu, 2014). Alternatively, music students have shown a preference for being able to exert ownership, choice, and control over their musical development by participating in a variety of genres and formats (Hendricks & Smith, 2018). One suggestion to address these challenges is to develop interventions that reflect the cultural, community, and personal interests of the targeted population (Lashley, 2018). Contrary to the cliché, one size does not fit all.
Self-efficacy research, therefore, is not complete. This study identifies specific boundaries, but these boundaries highlight opportunities and not limitations. From a design perspective, experimental procedures built on random sampling and large sample sizes are needed to establish causality. From a participant perspective, many performing musicians develop their skills through individualized instruction and are not well represented in these studies. Longitudinal studies are essential to determining the duration and stability of self-efficacy interventions. Finally, the identification of practices and beliefs that promote the equitable development of self-efficacy among diverse musicians remains relatively unexplored. Individuals with different backgrounds and orientations may interpret musical experiences in unique and novel ways.
Effect sizes are an important statistic for the interpretation of results from individual studies and for future researchers conducting meta-analyses. The cumulative effect sizes reported in this study confirm the role that self-efficacy plays in music performance. Hopefully, greater recognition of this role will lead to increased support for the psychological needs of all musicians.
Supplemental Material
Supplemental material, sj-docx-1-pom-10.1177_03057356231222432 for Self-efficacy and music performance: A meta-analysis by Michael S Zelenak in Psychology of Music
