Abstract
The purpose of this study was to assess the effects of three choral configurations on a soprano section’s sound. The first configuration resembled a choir section without an assigned standing position, the second configuration grouped singers by timbre, and the third used acoustic-compatibility placement. Three conductors configured a university soprano section (N = 13) who were audio-recorded singing in each configuration and answered questions about their perceptions. Audio recordings were analyzed acoustically using long-term average spectra and perceptually through pitch analysis and listener perceptions. Results indicated that participants sang with significantly increased spectral energy in the acoustic-compatibility configuration (p < .001), and both singer and listener participants preferred intentional standing configurations over the random standing configuration. Findings from this study suggest that choral directors can use intentional configurations in conjunction with 2 ft intersinger spacing to improve singer comfort and overall sound.
Researchers in the fields of educational and behavioral psychology have found that seating charts or arrangements can significantly affect classroom behavior and learning outcomes (Wannarka & Ruhl, 2008). Music ensemble seating or standing arrangements have the potential to affect more than just behavior and educational outcomes: they may also affect musical outcomes, and in the case of choral ensembles, vocal health. Although directors of instrumental ensembles traditionally arrange members within sections according to ability, choral ensemble directors lack consistency in how they arrange singers within sections. Choral pedagogue Kenneth Phillips (2016) claimed that singer configuration can influence blend, intonation, and overall ensemble quality, and posited that permitting singers “to sit or stand anywhere within a section is to show little understanding for how a beautiful choral tone is developed” (p. 203). Conductor Weston Noble believed that purposefully placing singers within sections improved singer comfort, behavior, listening skills, intonation, tone, vibrato, and rhythmic precision (2005). Smith and Sataloff (2013) echoed these sentiments when they claimed that “the placement of certain voices based on the compatibility of vocal color, frequency, and formant affects tuning and choral blend” (p. 242).
Nevertheless, many conductors continue to employ configurations that are traditional, yet lack a musical basis, such as the commonly practiced height-based configuration where shorter singers stand in the front and taller in the back (Bartle, 1993; Caldwell, 2017; Rosenbaum, 2017). The invention of choral risers largely eliminated the need to place singers by height for ease of seeing the conductor, yet some practitioners continue to use height-based configurations, not for reasons of choral sound, but for visual presentation (Emmons & Chase, 2006) or venue constraints (Smith & Sataloff, 2013). Even distinguished ensembles like the Mormon Tabernacle Choir resort to height-based configurations because they create “a more uniform appearance for the cameras” (The Tabernacle Choir, 2014). Because directors often default to arranging singers by height, research findings about various types of singer configurations are limited. Several researchers studied configurations of sections (e.g., SATB) within the choir as a whole (Aspaas et al., 2004; Atkinson, 2006; Barrett, 2003; Daugherty, 1999, 2003; Lambson, 1961; Ushino, 2013; Wang, 2006), but few investigated intrasection singer configuration, or singers’ arrangement in relation to one another within their choir section or voice part.
Intrasection Singer Configuration
Pedagogues have suggested configuring singers within a voice part by strategic placement of leaders and followers. These suggestions included pairing “opposites” like strong and weak sight singers (Brinson & Demorest, 2013; Ellingboe, 2019), “strong voices” with “strong ears” (Jennings, as cited in Zabriskie, 2010), alternating vibrato rates or sizes (Haasemann, 1991; Smith & Sataloff, 2013), or placing “buffer” voices between singers with more distinctive sounds (Brinson & Demorest, 2013). Others have suggested placing “similar” voices next to each other (Phillips, 2016), or asking singers to blend toward a “model” voice (Christiansen, as cited in Knutson, 1987). Still others recommended placing the “strongest” or most “beautiful” voices in various positions depending on venue acoustics (Ellingboe, 2019; Emmons & Chase, 2006; Rosenbaum, 2017) such as the back row (Turtenwald, 2017), in the center of the section (Fredrickson, 2004), or in the front row, with weaker singers in the back (Archibeque, 2005; Phillips, 2016).
One of the most commonly referenced intrasection configuration strategies in the pedagogical literature was a timbral grouping method in which directors assigned tone colors such as light, bright, and dark (Ellingboe, 2019; Phillips, 2016; Silvey, 2016) or instrumental timbres to singers (Crowther, 2003; Christiansen, as cited in Zabriskie, 2010). Conductor John Molnar categorized his singers according to flute, string, and reed instrumental tone colors, then grouped each color into rows, claiming that his method improved choral blend (1950). Another common method was Weston Noble’s Voice Compatibility Placement, or similar voice-matching adaptations (Archibeque, 2005; Emmons & Chase, 2006; Haasemann, 1991; Silvey, 2016; Warren, 2016). Also known as acoustic-compatibility placement, Noble’s method requires that directors evaluate every possible arrangement of singers within a section to find those whose timbres, or unique overtone series, are “complementary,” “interlocking,” (Holt & Jordan, 2008) or acoustically compatible. Advocates of this method observed an acoustic phenomenon which enhanced blend (Jordan, as cited in Noble, 2005) and improved singer comfort (Noble, 2005).
Research-Based Findings on Singer Configuration
The concept of self-to-other ratio (SOR), or the proportion at which a singer can hear oneself compared with the rest of the choir, is an important variable when choosing singer configurations. Ternström (1999) claimed that an undesirable ratio of self-to-other (i.e., when a singer cannot adequately hear themselves or the rest of the ensemble) leads to intonation and tuning issues. Other researchers also discovered that various changes in the choral rehearsal environment may affect intonation, including changes in rehearsal or performance procedures (Brunkan, 2013, 2016; Cook-Cunningham & Grady, 2018; Grady, 2014b; Grady & Cook-Cunningham, 2018; Grady & Gilliam, 2020), venue acoustics (Ternström, 1999), and choral configuration (Daugherty, 2005; Ekholm, 2000; Ternström, 1999; Tocheff, 1990).
Changes in SOR, and consequently the way singers and listeners perceive choral sound, can also be caused by alterations of singer spacing or choral formation (Ternström, 1999). Daugherty concluded that singer spacing may have a greater effect on choral sound than formation (e.g., mixed vs. sectional). In experiments with both auditioned and nonauditioned high school and university ensembles, he consistently found that both singers and listeners preferred spread spacing of at least 2 ft between singers over close spacing of one inch (Daugherty, 1999, 2003; Daugherty et al., 2013; Daugherty et al., 2019). Adams (2019a) concluded that both large and chamber university ensembles also preferred spread spacing (2 ft) over moderate (1 ft) or close (1 in.) spacing, but Bonshor (2017) suggested that adult amateur choirs may prefer closer spacing than advanced groups for reasons of improved confidence and security. Researchers have also explored arranging singers by vibrato (Daffern, 2017; Folger, 2002; Osinski, 2014) and volume (Yang, 2004). No studies to date explored configurations via timbral grouping, but several studies tested Noble’s acoustic-compatibility placement.
Research on acoustic-compatibility placement has received criticism for the idiosyncratic and difficult-to-replicate nature of the experimental designs (Daugherty, 2001), but each study to date revealed positive benefits of this configuration method. Researchers suggested that acoustic-compatibility placement positively influenced listeners’ perceptions of choral sound (Ekholm, 2000; Giardiniere, 1991; Tocheff, 1990) and choral blend (Killian & Basinger, 2007; Woodruff, 2002). Singers also preferred acoustic-compatibility configurations (ACCs), possibly due to increased phonatory ease (Ekholm, 2000). Because singers may consciously alter vocal technique between choral and solo singing modes (Goodwin, 1980; Reid et al., 2007), Ekholm, Giardiniere, and Woodruff hypothesized that acoustic-compatibility placement homogenized choral sound without the need for singers to manipulate vocal production or compromise vocal health.
Although several researchers specifically investigated acoustic-compatibility methods, I found minimal research comparing different methods of configuring singers within sections. Therefore, the purpose of this study was to assess the effects of three different intrasection singer configurations on acoustic and perceptual measures of a soprano section’s overall sound. I chose the following configurations for their methodological specificity and frequent occurrence in common choral practice and pedagogy literature: (a) a baseline random configuration (RC) to represent a section without an assigned arrangement; (b) a timbre-based configuration, according to Molnar’s (1950) specifications (timbral configuration [TC]); and (c) acoustic-compatibility placement, according to Noble’s 2005 demonstration (acoustic-compatibility configuration [ACC]). The following research questions guided this investigation:
Method
Singer Participants
After gaining institutional review board approval, I recruited participants (N = 13) from a choral program at a large, Midwestern university in the U.S. Participants averaged 21 years of age (SD = 1.79, range = 18-26) with a mean of 9 years choral singing experience (SD = 6.52, range = 1-20). The majority of participants majored in music (n = 12) with voice as their primary instrument (n = 9). I recruited participants from six different choirs in an attempt to minimize the number of singers who were accustomed to singing with each other.
Conductor Panel
A panel of expert conductors (n = 3; n = 2 female, n = 1 male) placed the singers into three different configurations according to procedures specified below. The conductors averaged 38.33 years of age (SD = 2.89, range = 35-40) and 15.33 years teaching and conducting experience (SD = 8.08, range = 6-20). Two of the conductors identified as White and one as Latina. The conductors each directed one or more choirs at the university, and none had previously used the configuration methods with their own choirs. Because the panelists were chosen from the university’s choral faculty, they had some familiarity with the participants’ voices for placement purposes.
Procedures
To control for practice effects, the participants repeatedly sang the first phrase of My Country ‘Tis of Thee (from “My” to “sing”) in F major for the panel’s configuration procedures. After finalizing the placement of singers in each configuration, I audio-recorded participants performing the first verse of the familiar hymn-tune Amazing Grace in A major a cappella. The verse was repeated to reach the minimum length of 60 seconds required for LTAS analysis.
The singers followed a life-sized projection of the same conductor video for each of the recordings and viewed a 10-second clip of the video beforehand to familiarize themselves with the tempo (70 bpm). I prerecorded the conductor video using QuickTime software on a MacBook computer. The conductor, who wore black and stood in front of a white background, maintained a neutral facial expression throughout the video.
Recording Conditions
Participants wore numbered nametags and stood on color-coded markings (placed according to configuration specifications) on Wenger risers in the regular choral rehearsal room for recordings. I placed markings 2 ft (0.61 m) apart for optimal intersinger spacing (Daugherty, 1999, 2003; Daugherty et al., 2013; Daugherty et al., 2019). I positioned a Roland R05 digital recorder at a standing-conductor ear height of 5 ft 4 in. (1.62 m) and distance of 15 ft (4.57 m) away from the front row, which was deemed an appropriate distance between choir and conductor for the room size. The device recorded in .wav format at a sampling rate of 44.1 kHz (16 bits). Gain and volume controls, set manually at the start of the recording, remained consistent throughout.
Section Configurations
Random Configuration
The RC reflected a choir section in which the director had no preference about intrasection singer configuration. To avoid possible effects from familiar singers standing next to each other, I arranged singers in rows according to a randomized set of numbers. The final arrangement had four singers in Row 1, five in Row 2, and four in Row 3.
Timbral Configuration
For the TC, the conductor panel used Molnar’s (1950) instructions to classify singers based on the tone color (not resonance) of the following instruments: (a) flutes: “mellow, liquid, round, flute-like;” (b) strings: “voice quality resembles the string tone of the orchestra;” or (c) reeds: “resembles the oboe quality.” While the 13 participants sang individually, each panelist separately classified the participants’ vocal timbres. The panel unanimously agreed on five participants’ classification, had majority agreement for another five participants, and disagreed on three participants (interrater reliability = 51.28%). The panel collectively reexamined the eight participants for whom there was disagreement, and after agreeing on the timbral classification of all participants, they listened to each timbre group separately to check for misclassifications. They unanimously agreed to move one participant to a different timbre group and decided on a final TC with three “flutes” placed in Row 1, six “strings” placed in Row 2, and four “reeds” placed in Row 3 according to Molnar’s specifications.
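The text does not say how the 51.28% figure was computed. One reading that reproduces it, offered here only as an assumption, is mean pairwise percent agreement across the three panelists: each participant contributes three rater pairs, and a pair counts as an agreement when both raters chose the same timbre. The labels in this sketch are hypothetical placeholders chosen only to match the reported pattern (five unanimous, five majority, three split); they are not the panel's actual classifications.

```python
from itertools import combinations

def pairwise_agreement(ratings):
    """Share of rater pairs, pooled over participants, that chose the same label."""
    agree = 0
    total = 0
    for labels in ratings:
        for a, b in combinations(labels, 2):
            total += 1
            agree += (a == b)
    return agree / total

# Hypothetical labels (assumption): only the agreement pattern mirrors the text.
ratings = (
    [("flute", "flute", "flute")] * 5      # unanimous agreement
    + [("string", "string", "reed")] * 5   # majority agreement (two of three)
    + [("flute", "string", "reed")] * 3    # full disagreement
)

agreement = pairwise_agreement(ratings)
print(f"{agreement:.2%}")
```

Under this reading, 20 of the 39 rater pairs agree, which matches the reported percentage.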
Acoustic-Compatibility Configuration
Because Noble (2005) conceded that acoustic compatibility is largely a matter of conductor preference, in cases of conductor disagreement, I used the majority opinion for the ACC. The panel followed Noble’s acoustic-compatibility placement instructions and first listened to several pairs of “smaller-sized voices” with “minimal vibrato” before agreeing on a “model pair” to act as the foundation of the section. The panel then listened to a third participant in the middle, to the right, and to the left of the model pair before deciding on the most acoustically pleasing configuration. They continued this process, adding singers one at a time and listening to each new singer in every possible position until they placed all 13 participants. Per Noble’s recommendation, the panel saved “larger voices” for the end of the configuration process, and, in cases where placement was difficult, listened to singers in groups of three, separately from the rest of the group. The participants then sang as a group and the panel made minor adjustments until they agreed on the most “acoustically compatible” configuration. Because Noble arranged his sections by rows rather than block formation, the participants stood on the first riser row for the ACC.
Order of Recordings
To avoid confounding variables from increased singer awareness due to the involved process of Noble’s acoustic-compatibility method (2005), I intentionally recorded the configurations in the following order: random, timbral, and acoustic-compatibility.
Singer Survey
Immediately following each condition recording, participants responded to survey questions about their perceptions. On a 5-point Likert-type scale, participants rated the effects of the configurations on (a) ease of singing (“Rate the effectiveness of this configuration on ease of singing” anchored from not helpful to very helpful), (b) overall choral sound (“Rate the overall choral sound during this configuration condition” anchored from poor to excellent), and (c) their ability to hear themselves and others (“Rate how well you could hear yourself in this configuration,” and “Rate how well you could hear others in this configuration” anchored from not at all to very well). Participants also responded to open-ended questions by indicating what they liked or disliked about each configuration (“Describe what you liked or disliked about this configuration”). At the end of the recording session, participants ranked the configurations in order of preference and explained the reasoning behind their ranking (“Rank the configurations in order of preference, 1 = most preferred,” and “Describe the main reason you ranked in this order”).
Long-Term Average Spectra Measurements
The LTAS data were obtained by analyzing the recordings through KayPentax Computerized Speech Lab software. I used a window size of 512 points, a bandwidth of 86.13 Hz, and a Blackman window. The obtained data were transferred to an Excel spreadsheet prior to statistical analysis in SPSS.
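The reported bandwidth appears to follow from the recording and analysis settings: a 44.1 kHz sampling rate divided by a 512-point window gives roughly 86.13 Hz per frequency bin. The sketch below illustrates the general idea of a long-term average spectrum with a Blackman window. It is not the Computerized Speech Lab procedure: non-overlapping frames, a naive DFT, and uncalibrated dB values are all assumptions made to keep the example dependency-free, and the function names are hypothetical.

```python
import cmath
import math

SAMPLE_RATE = 44_100  # Hz, from the recording settings
WINDOW_SIZE = 512     # points, from the LTAS settings

def blackman(n_points):
    """Classic Blackman window coefficients."""
    return [
        0.42
        - 0.5 * math.cos(2 * math.pi * n / (n_points - 1))
        + 0.08 * math.cos(4 * math.pi * n / (n_points - 1))
        for n in range(n_points)
    ]

def ltas_db(samples, window_size=WINDOW_SIZE):
    """Average the windowed power spectrum over non-overlapping frames (relative dB)."""
    n_frames = len(samples) // window_size
    if n_frames == 0:
        raise ValueError("need at least one full frame of samples")
    window = blackman(window_size)
    n_bins = window_size // 2 + 1
    power_sum = [0.0] * n_bins
    for f in range(n_frames):
        frame = samples[f * window_size:(f + 1) * window_size]
        windowed = [s * w for s, w in zip(frame, window)]
        for k in range(n_bins):
            # Naive DFT: slow, but keeps the sketch free of third-party libraries.
            coeff = sum(
                x * cmath.exp(-2j * math.pi * k * n / window_size)
                for n, x in enumerate(windowed)
            )
            power_sum[k] += abs(coeff) ** 2
    # The small floor avoids log10(0) for empty bins.
    return [10 * math.log10(p / n_frames + 1e-12) for p in power_sum]

bin_width = SAMPLE_RATE / WINDOW_SIZE  # about 86.13 Hz per bin

# Synthetic 1 kHz test tone (assumption: stands in for a real recording).
tone = [math.sin(2 * math.pi * 1000 * n / SAMPLE_RATE) for n in range(4 * WINDOW_SIZE)]
spectrum = ltas_db(tone)
peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
print(round(bin_width, 2), peak_bin, round(peak_bin * bin_width, 1))
```

With real recordings, an FFT-based routine would be far faster. The point here is only how window size and sampling rate determine the frequency resolution of the averaged spectrum.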
Pitch Analysis
I used Pitch Analyzer 2.1 software on a MacBook Pro computer to determine the effect of configuration on the soprano section’s collective pitch deviation. I extracted clips from the midpoint of the [e] vowel of the first word “amazing,” and the midpoint of the [i] vowel of the last word “see” for each recording. The extracted pitches were then compared with a reference tone set to the score-notated pitch (A4 = 440 Hz) by adjusting the tone to the recorded pitches. I calculated the difference between the reference tone and each of the six perceived pitches (first and last pitches of the three recordings made in each configuration) in cents, then subtracted the deviation of each final pitch from the deviation of each initial pitch for a total deviation in cents (Cook-Cunningham & Grady, 2018; Grady, 2014b; Grady & Cook-Cunningham, 2018; Grady & Gilliam, 2020). For reliability, another researcher with choral and vocal teaching experience repeated the same procedure. Any differences within ±7 cents were counted as agreements, and obtained interrater reliability (calculated by dividing agreements by agreements plus disagreements) was .83.
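The two calculations in this procedure can be sketched as follows: the standard conversion of a frequency to cents relative to a reference, and the agreement rate between two raters using the ±7 cent tolerance. The measurement values are hypothetical, since the individual readings are not reported. The example uses six readings per rater, and five agreements out of six happens to give the same .83 as reported, though the actual counts are not stated.

```python
import math

def cents(freq_hz, ref_hz=440.0):
    """Deviation of freq_hz from ref_hz in cents (100 cents = one equal-tempered semitone)."""
    return 1200 * math.log2(freq_hz / ref_hz)

def agreement_rate(rater_a, rater_b, tolerance=7.0):
    """Agreements divided by agreements plus disagreements, within a cent tolerance."""
    agreements = sum(abs(a - b) <= tolerance for a, b in zip(rater_a, rater_b))
    return agreements / len(rater_a)

# A pitch slightly below the reference yields a negative (flat) deviation.
flat_example = cents(436.0)

# Hypothetical cent readings (assumption): first and last pitch for each of three recordings.
rater_a = [2.0, -14.0, 1.0, -36.0, 3.0, -40.0]
rater_b = [4.0, -18.0, -2.0, -30.0, 12.0, -44.0]
reliability = agreement_rate(rater_a, rater_b)
print(round(flat_example, 1), round(reliability, 2))
```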
Expert Listener Evaluations
To collect perceptual data from expert listeners, I posted a Qualtrics survey link on a choir director discussion board webpage. Respondents (N = 104; n = 74 female, n = 30 male) averaged 43 years of age (SD = 13, range = 23-81) with an average of 19 years of choral directing experience (SD = 12, range = 1-50). A majority of listeners (78%) listed voice as their primary instrument. When asked about their choral voice part, 45% of listeners identified as sopranos, 26% as altos, 16% as tenors, and 13% as basses.
After consenting to participate, listeners were instructed to “evaluate the sound of a soprano section in three different configurations” by listening to excerpts of the three recordings. To control for possible listener bias, I labeled the recordings with ambiguous symbols and did not inform the listeners of which configuration methods were used. The presentation order of recordings was randomized to help control for order effects. I instructed listener participants to evaluate the recordings in a quiet room using headphones or speakers as many times as needed before making decisions. Listener participants rated each recording on a Likert-type scale for overall choral sound (1 = poor, 5 = excellent). Afterward, they ranked each recording in order of preference using sort-and-rank procedures similar to Confredo et al. (2018) and answered open-ended questions about their preferences.
Statistical Analysis
I analyzed LTAS data using a one-way repeated measures analysis of variance (ANOVA) with post hoc Bonferroni-corrected pairwise comparisons (α = .016). Because all quantitative data from singer and listener surveys were ordinal in level, I used the nonparametric Friedman two-way ANOVA and post hoc Wilcoxon signed-rank tests with a Bonferroni adjustment (α = .016).
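The adjusted alpha follows from dividing the familywise level by the number of pairwise comparisons among the three conditions. A minimal sketch, assuming a familywise alpha of .05: the exact quotient is about .0167, which the text reports as .016. The p values in the dictionary are post hoc values reported in the Results, and the dictionary keys are descriptive labels added for illustration.

```python
from itertools import combinations

FAMILYWISE_ALPHA = 0.05  # assumption: conventional .05 level
conditions = ["RC", "TC", "ACC"]

# Three conditions give three pairwise comparisons.
pairs = list(combinations(conditions, 2))
adjusted_alpha = FAMILYWISE_ALPHA / len(pairs)

# Post hoc p values reported in the Results; keys are illustrative labels.
reported_p = {
    "singers, overall sound, TC vs RC": 0.013,
    "listeners, overall sound, TC vs RC": 0.002,
    "listeners, ranking, ACC vs RC": 0.005,
}
significant = {label: p < adjusted_alpha for label, p in reported_p.items()}

print(pairs)
print(round(adjusted_alpha, 4))
print(significant)
```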
Results
Long-Term Average Spectra
The sound of the human voice is a complex combination of the simultaneous sounding of a fundamental frequency (F₀) and partials of various intensities along a spectrum of frequencies. These partials can be amplified or dampened in certain frequency ranges which informs listeners of unique vocal timbres. The LTAS data show average spectral energy (presented as sound pressure level, measured in dB) of various frequencies (presented in Hz) over a period of time.
Figure 1 contains the LTAS contours across the entire 0 to 10 kHz spectrum for the three conditions. A visual evaluation of LTAS data from condition recordings showed that the ACC yielded the highest spectral energy, followed by the TC and the RC. According to Howard and Angus (2017), a difference of 1 dB in the energy of complex sound may constitute a just noticeable difference (JND), depending on the hearing ability of the listeners and the nature of the sound. Therefore, a difference of 1 dB may help interpret the differences between each condition’s voiced sound.

Figure 1. Long-term average spectra contours across the full 0 to 10 kHz range.
Results Across the Entire 0 to 10 kHz Spectrum
The results of a one-way repeated measures ANOVA with a Greenhouse-Geisser correction revealed significant differences in the spectral energy of the three conditions across the 0 to 10 kHz spectrum, F(1.619, 187.786) = 340.45, p < .001, η² = 0.75. Post hoc tests showed that each pairing was significantly different (p < .001), with the average spectral energy of the TC significantly higher than the RC, and the ACC significantly higher than both the TC and RC. Table 1 displays the grand means, standard errors, ranges, and pairwise comparisons between all pairings of conditions according to frequency range.
Table 1. Pairwise Comparisons of Long-Term Average Spectra Data According to Specific Frequency Ranges.
Note. In all cases, the condition with less spectral energy was subtracted from the condition with more (e.g., the TC averaged 0.89 dB more spectral energy than the RC in the 0-10 kHz range). SE = standard error; CI = confidence interval; TC = timbral configuration; RC = random configuration; ACC = acoustic-compatibility configuration.
*Indicates a significant difference in spectral energy between the two conditions (p < .001). †Indicates a just-noticeable difference (JND) between the two conditions.
Results According to Specific Frequency Regions
The ACC and TC also had significantly more spectral energy within specific frequency regions (see Table 1). Within the 2 to 4 kHz range, which includes frequencies from the “Singer’s Formant,” or the region at which the human ear is most sensitive (Fletcher & Munson, 1933), spectral energy was significantly different between conditions, F(1.378, 55.481) = 131.58, p < .001, η² = 0.86. In this region, differences in spectral energy between pairs were more pronounced, especially between the ACC and the RC, averaging a 2.58 dB difference.
Other studies have indicated that variations in high-frequency energy (above 5 kHz) of the voice spectrum may contribute to listeners’ ability to differentiate the qualities of voiced sounds (Monson et al., 2011; Ternström, 2008). A visual assessment of LTAS contours revealed that the spectral energy differences between the ACC and the other configurations were even more pronounced in the 5 to 10 kHz high-frequency energy range, including a peak of ACC spectral energy around the “second Singer’s Formant” (as referenced by Titze & Jin, 2003). These differences were statistically significant, F(1.510, 135.102) = 347.07, p < .001, η² = 0.86. The ACC had an average of 2.61 dB more spectral energy than the RC (up to 4.81 dB more) and an average of 1.70 dB more than the TC (up to 3.24 dB).
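To make the reported level differences easier to interpret, the following sketch compares each one with the 1 dB JND and converts it to a power ratio using the standard decibel relation. The differences are the averages reported in the text and the Table 1 note; the function names and the dictionary layout are illustrative assumptions.

```python
JND_DB = 1.0  # just noticeable difference cited from Howard and Angus (2017)

# Average differences reported in the text: (higher, lower, frequency range) -> dB
reported_differences_db = {
    ("TC", "RC", "0-10 kHz"): 0.89,
    ("ACC", "RC", "2-4 kHz"): 2.58,
    ("ACC", "RC", "5-10 kHz"): 2.61,
    ("ACC", "TC", "5-10 kHz"): 1.70,
}

def exceeds_jnd(diff_db, jnd_db=JND_DB):
    """True when a level difference meets or exceeds the JND."""
    return diff_db >= jnd_db

def power_ratio(diff_db):
    """Power (intensity) ratio corresponding to a level difference in dB."""
    return 10 ** (diff_db / 10)

for (higher, lower, band), diff in reported_differences_db.items():
    print(
        f"{higher} vs {lower} ({band}): {diff:.2f} dB, "
        f"{diff / JND_DB:.2f} x JND, audible={exceeds_jnd(diff)}, "
        f"power ratio={power_ratio(diff):.2f}"
    )
```

Read this way, only the full-spectrum TC versus RC average falls below the 1 dB threshold, in line with the discussion of heard differences later in the article.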
Pitch Analysis
The soprano section began in tune (±7 cents of the notated pitch; Lindgren & Sundberg, 1972) in all conditions and ended out of tune in all conditions. Overall, mean deviations in cents were as follows: (a) RC = −15.85, (b) TC = −35.86, and (c) ACC = −39.89.
Singer Perception
Average ratings, rankings, standard deviations, and results from statistical analyses on quantitative singer survey data are displayed in Table 2.
Table 2. Data From Statistical Analyses of Singer and Listener Participant Surveys.
Note. Boldface indicates the highest rated or ranked condition for each survey question. ANOVA = analysis of variance; ACC = acoustic-compatibility configuration; RC = random configuration; TC = timbral configuration.
*Indicates significant differences between all three conditions at the .05 level. **Indicates a significant difference between pairs of conditions after applying a Bonferroni correction (α = .016).
Ratings
Singers rated their ability to sing with ease for each condition on a Likert-type scale from 1 (not at all easy) to 5 (very easy). Average ratings for ease of singing were highest in the TC at 4.34, followed by the ACC at 4.23, and lowest in the RC at 3.77. However, Friedman ANOVA results revealed that these ratings were not significantly different (p = .303). Participants rated overall choral sound on a scale of 1 (poor) to 5 (excellent). The TC rating (4.54) was highest, followed by ACC (4.31) and RC (3.69). The three ratings were significantly different (p = .011), and the TC ratings were significantly higher than the RC ratings (p = .013).
Ternström (1999) found that singers have a preferred SOR, or difference in sound pressure level between self-sound and other-sound, as perceived by the singer. Although room acoustics can govern SOR preferences, singers need to hear themselves at a slightly louder (approximately 6 dB) level than singers around them (Ternström, 1999), and an imbalanced SOR may lead to problems with intonation, vocal production, and over-singing (Daugherty, 2001). To evaluate singer participants’ perceived SOR in each condition, singers rated how well they could hear themselves and others from 1 (not at all) to 5 (very well). Average singer ratings for ability to hear oneself were highest for the ACC (4.70), then the RC (4.08), with the TC (3.77) rated the lowest, but were not significantly different (p = .400). Average singer ratings for ability to hear others were highest for the ACC (4.31), followed by the TC (4.31) and RC (4.00). These were not significantly different from one another (p = .518).
Ranking
For most-preferred configuration, seven singers (54%) selected ACC, five (38%) selected TC, and one (8%) selected RC. For least-preferred configuration, nine singers (69%) selected RC, three (23%) selected TC, and one (8%) selected ACC. Results from a Friedman ANOVA indicated significant differences between rankings (p = .023), but post hoc tests showed no significant differences between paired condition rankings.
Singer Comments
To analyze the qualitative data collected from open-ended survey questions, I disaggregated the comments into positive and negative categories (Cook-Cunningham & Grady, 2018; Grady, 2014a; Grady & Gilliam, 2020). Out of 63 total comments, singers wrote nine positive and eight negative comments about the RC, 13 positive and nine negative about the TC, and 17 positive and seven negative about the ACC. Eight singers (62%) commented that they could easily hear either themselves (n = 4) or other singers (n = 4) in the RC, but no singers reported the ability to hear both themselves and others easily in the RC. For the TC, eight participants’ comments alluded to a positive change in SOR for this position, whereas five comments indicated a negative change. Eight singers (62%) commented that the change to the TC from RC improved their ability to blend (e.g., “Easy to hear others, better blend”), and six singers (46%) also commented that they “pulled back,” “sang softer,” or “strained less” during the TC than in the RC. After the ACC, 11 singers (85%) noticed a positive change in SOR, with six of those 11 stating that the ACC had the best balance of self-sound to other-sound. One participant wrote, “I felt like I could hear myself, those around me, and the overall sound of the choir.” Four singers (31%) commented that they were either vocally or physically tired from the ACC process. However, 11 singers (85%) wrote that blend was better in the ACC or that they were more focused on blending with and listening to their fellow section members.
Listener Perception
Averages, standard deviations, and results from statistical analyses on quantitative listener survey data are displayed alongside singer data in Table 2. Average ratings for overall choral sound (1 = poor, 5 = excellent) were highest for the ACC at 3.95, followed by the TC at 3.76, and the RC at 3.45, and were significantly different (p ≤ .001). Listeners rated both ACC and TC significantly higher than RC (p ≤ .001 and .002, respectively). Ranking data revealed that 46 listeners chose the TC recording as their most preferred (44%), 39 chose ACC (38%), and 19 chose RC (18%). For least preferred, 47 chose RC (45%), 32 chose ACC (31%), and 25 chose TC (24%). These rankings were significantly different (p = .002), and listeners ranked the ACC and TC significantly higher than the RC (p = .005 and .001, respectively).
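The listener ranking result can be approximately checked from the reported counts. The sketch below assumes that each of the 104 listeners gave a complete ranking with no ties, so the number of second-place votes for each recording is whatever remains after first and last choices. It then applies the standard Friedman chi-square formula for untied ranks. This is an illustration of the test, not the SPSS output used in the study.

```python
import math

N_LISTENERS = 104
N_CONDITIONS = 3

# Counts reported in the text.
first_choice = {"TC": 46, "ACC": 39, "RC": 19}
last_choice = {"TC": 25, "ACC": 32, "RC": 47}

# Assumption: complete, untied rankings, so the remainder ranked each recording second.
rank_sums = {}
for condition in first_choice:
    middle = N_LISTENERS - first_choice[condition] - last_choice[condition]
    rank_sums[condition] = (
        1 * first_choice[condition] + 2 * middle + 3 * last_choice[condition]
    )

n, k = N_LISTENERS, N_CONDITIONS
chi_sq = 12 / (n * k * (k + 1)) * sum(r ** 2 for r in rank_sums.values()) - 3 * n * (k + 1)

# With k - 1 = 2 degrees of freedom, the chi-square upper tail is exp(-x / 2).
p_value = math.exp(-chi_sq / 2)

print(rank_sums)
print(round(chi_sq, 2), round(p_value, 3))
```

Under these assumptions the statistic comes to 12.25 with a p value that rounds to .002, consistent with the reported result.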
Listener Comments
For qualitative listener data, I disaggregated comments into positive and negative categories, then separated the comments into categories based on emergent themes (Cook-Cunningham & Grady, 2018; Grady, 2014a; Grady & Gilliam, 2020). Figure 2 displays the number of both positive and negative comments from expert listeners for each condition according to categories derived from previous research and practice on choral configurations; namely blend or balance, intonation, timbre or tone color, vibrato, and vocal technique (e.g., breath support, vocal freedom, resonance; Ekholm, 2000; Giardiniere, 1991; Goodwin, 1980; Jordan as cited in Noble, 2005; Killian & Basinger, 2007; Molnar, 1950; Phillips, 2016; Reid et al., 2007). Other emergent themes included musicality (e.g., phrasing, expression, dynamics), vowel shape, diction, consonant alignment, rhythmic precision, and general comments (e.g., “my favorite”), and were fairly consistent across categories. Listeners wrote a total of 139 positive comments for the ACC, 119 positive comments for the TC, and 89 positive comments about the RC. For negative comments, listeners offered 141 for the RC, 123 for the TC, and 118 for the ACC.

Figure 2. Number of negative and positive listener comments according to category.
The category with the most comments was ensemble blend (n = 149), which closely related to listeners’ comments about timbre and vibrato. For the ACC, listeners wrote comments such as “one voice,” “voices all sound similar and like one blended section,” “even those with vibrato fit in nicely,” and “vibratos are not clashing in this one.” They also commented about “healthy, free” “well-supported,” and “resonant” vocalization for the ACC. Although there were also many positive comments about “unified,” “homogeneous” sound in the TC, many listeners noticed the separate timbral groups: “tone is less unified than in the first example—some voices seem darker, others more pointed,” and “disliked hearing individual tone colors sticking out.” Even though the recording conditions remained the same across conditions, some listeners described the TC condition as having a different recording quality, using words like “murky,” “distant,” and “further back”—a phenomenon from timbral grouping that was not mentioned in the other recordings. The RC had the most negative comments about blend and vibrato (n = 41), like “poor blend—lots of individuals popping out, mainly due to differing vibrato speeds,” “disliked the lack of blend due to excessive vibrato,” and “disliked hearing individual voices working against each other.” Some listeners liked the “bright” and “forward” tone quality of the RC, but some described it as “shrill,” “strident,” and “thin.” When asked why they selected a certain recording as most or least preferred, a total of 64% of listeners stated that blend was a factor in their decision making, and 59% said that tone color or timbre was a factor.
Discussion
In this study, I assessed the effects of three different intrasection singer configurations on acoustic and perceptual measures of a soprano section’s sound. The main findings included the following: (a) intentional singer configurations (ACC and TC) may enhance the spectral energy of a section’s sound, which may be heard as differences in loudness or timbre; (b) intentional singer configuration may influence overall pitch deviation; (c) singers prefer singing in intentional configurations, particularly the ACC, for reasons of increased ability to hear self and others and enhanced ease of singing; and (d) listeners prefer the choral sound of intentional configurations for various reasons including improved blend and tone quality. Statistical analyses revealed significant differences in preference when comparing the intentional configurations with the RC, but seldom between the two intentional configurations, suggesting that either type of purposeful configuration is better than a random one.
Long-Term Average Spectra
On average, the ACC exceeded the RC in spectral energy by double the difference needed to be heard by a listener (JND = 1 dB), and at times by more than quadruple that amount. The ACC also averaged a heard difference in spectral energy over the TC, ranging up to three times the amount needed to perceive a difference. The average spectral energy of the TC did not constitute a heard difference over the RC, but at times the difference reached three times the amount needed. Moreover, the intentional configurations (especially the ACC) had significantly higher spectral sound energy in specific frequency ranges, which may contribute to listeners’ ability to discern between choral sounds. Although LTAS data do not reveal which condition is acoustically superior, they do reveal heard differences between the three conditions, which may be recognized as differences in both perceived loudness and timbre (Monson et al., 2011). Expert listeners, who also heard these differences in loudness and timbre, substantiated these acoustic findings and preferred the recordings with the highest spectral energy. More research is needed to see whether spectral energy differences in certain frequency ranges produce a more desirable choral sound.
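To illustrate how a spectral energy difference can be expressed relative to the 1 dB JND, the following is a minimal sketch rather than the analysis procedure used in this study. It assumes that LTAS levels are already available in dB for matching frequency bins and that differences are averaged arithmetically across bins. The function name `jnd_multiples` and all numeric values are hypothetical.

```python
from statistics import mean

JND_DB = 1.0  # just-noticeable difference stated in the text (1 dB)


def jnd_multiples(ltas_a_db, ltas_b_db, jnd_db=JND_DB):
    """Express level differences (A minus B) as multiples of the JND.

    Both inputs are sequences of LTAS levels in dB for the same frequency
    bins. Returns the per-bin multiples and their arithmetic mean.
    """
    if len(ltas_a_db) != len(ltas_b_db):
        raise ValueError("LTAS sequences must cover the same frequency bins")
    per_bin = [(a - b) / jnd_db for a, b in zip(ltas_a_db, ltas_b_db)]
    return per_bin, mean(per_bin)


# Hypothetical LTAS levels (dB) in four frequency bins; NOT data from the study.
acc_db = [-20.0, -25.0, -30.0, -41.0]
rc_db = [-22.0, -27.5, -31.5, -43.0]

per_bin, mean_multiple = jnd_multiples(acc_db, rc_db)
print(per_bin)        # per-bin differences in JND units
print(mean_multiple)  # mean difference in JND units
```

In this made-up example the mean difference is two JNDs, with one bin reaching two and a half, which mirrors the kind of "on average" versus "at times" comparison described above.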
Although acoustic data showed increased spectral energy in the intentional configuration recordings, different singers and different conductor preferences may yield varying results. Woodruff (2002) found that choral conductors’ ideal vocal matches did not always align with acoustic data, so it is possible that the expert conductors’ matches in this study were more a matter of personal preference than true “acoustic matches.” For example, initial interrater agreement between the three expert conductors for Molnar’s TC was 51%, and they commented that Molnar’s vague classification instructions caused difficulties in timbral categorization, especially when their personal understanding of an instrumental timbre did not exactly align with Molnar’s definition of the timbre in a vocal context. Molnar (1950) himself acknowledged these difficulties: “It is not easy to describe just what the [timbre] difference is, but [it] is there. With careful, intent listening, the conductor can learn to recognize the three types” (p. 48). The lack of initial agreement between conductors may be a result of Molnar’s ambiguous instructions, the conductors’ lack of familiarity with the participants’ voices, or perhaps different conductor preferences. However, because the interrater agreement increased to 95% after reclassifying participants with less distinct timbres, and to 100% after utilizing Molnar’s method to check for misclassifications, the method did eventually yield three distinct timbre groups that the panel unanimously agreed were mutually exclusive.
During the ACC procedures, the expert conductors also informally commented that discerning acoustic differences between singer placements became difficult when six or more participants were singing at the same time, at which point it became advantageous to use Noble’s strategy of listening to three voices at a time. Vocal matches are a matter of conductor preference, and the conductor panel agreed that their chosen configuration was optimal for this particular group of singers. Unfortunately, there is no way to ensure that vocal matches are acoustically optimal when conductor preference plays such a key role in decision making. Researchers could examine configuration processes that are more straightforward and more clearly defined, or processes that provide more replicable measures. Studies that formally evaluate conductor perceptions to gain further insight into their preferences and decision-making rationale would be beneficial.
Pitch Analysis
Although the soprano section in this study deviated in pitch in all three conditions, deviation for each condition was less than a quarter tone (50 cents) flat. It is possible that the order of conditions contributed to the gradual increase in out-of-tune singing for the TC and ACC, so researchers should evaluate these processes on different choirs in differing orders.
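For readers less familiar with the cent scale, the standard conversion between a frequency ratio and cents is 1200 times the base-2 logarithm of the ratio, so a quarter tone is 50 cents and an octave is 1200 cents. The sketch below illustrates that conversion only. It is not the pitch-analysis procedure used in this study, and the function name `cents_deviation` and the 440 Hz reference are assumptions chosen for illustration.

```python
import math


def cents_deviation(sung_hz, target_hz):
    """Return the deviation of sung_hz from target_hz in cents.

    Negative values are flat and positive values are sharp.
    """
    if sung_hz <= 0 or target_hz <= 0:
        raise ValueError("frequencies must be positive")
    return 1200.0 * math.log2(sung_hz / target_hz)


# Hypothetical reference pitch (A4 = 440 Hz); NOT a value from the study.
target_hz = 440.0

# Frequency lying exactly one quarter tone (50 cents) below the target.
quarter_tone_flat_hz = target_hz * 2 ** (-50 / 1200)

print(round(quarter_tone_flat_hz, 2))
print(round(cents_deviation(quarter_tone_flat_hz, target_hz), 2))
```

Under this convention, any sung pitch whose deviation falls between -50 and 0 cents is flat by less than a quarter tone.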
It is also possible that the length of placement processes was fatiguing for the participants, causing their pitch to gradually deviate over time. The ACC “model pair” sang the first phrase of My Country ’Tis of Thee 105 times before the final recording, and in their comments, some singers mentioned that the ACC “process was long,” “the biggest negative is the time it takes,” “I was a bit tired from the placement singing,” and “my voice was a little rough this time around.” Although the process was repetitive, the entire session (including placement and recording procedures for all configurations) lasted approximately 37 minutes, less than half of the typical choral rehearsal duration at the university. To avoid fatigue from time-consuming processes, choral practitioners (especially those with larger ensembles) and future researchers should experiment with adapted or truncated versions of the ACC process, or configure singers on a designated day so that the process does not consume regular rehearsal time.
Singer Perceptions
Singers preferred the intentional configurations (acoustic-compatibility and timbral). They sang with the most ease and preferred their sound in the TC, yet had the best SOR in the ACC and preferred it most overall. In general, differences between the intentional configuration ratings and rankings were not statistically significant. The differences between the intentional configurations and the RC, however, were often statistically significant, revealing that singers preferred singing in the intentional configurations over the RC. Findings from this study are specific to this university soprano section and may not be generalizable to other sections or choirs with different characteristics. Researchers should test various configurations on diverse groups of singers and voice parts, including groups with various ages, experience levels, genders, and nationalities.
Singers offered insight into why they preferred intentional configurations in open-ended comments. After each condition recording, singer participants answered the question, “What, if anything, did you change for this configuration?” After the TC, singers reported that they could sing more softly and with less vocal strain than in the RC. Examples of these comments included “I did not have to sing as loud,” “I was less strained,” and “I pulled back a bit.” According to singer comments, the TC allowed participants to sing with a healthy, sustainable technique. After the ACC, singers reported an increased ability to listen to each other, and mentioned the ability to blend. Sample comments included, “I didn’t do anything differently through singing but I listened better,” and “I also felt like I didn’t use my full voice and was more focused on blending.” It appears that the ACC may have allowed singers to listen to each other more easily, which allowed them to blend while still singing with healthy technique. These findings are consistent with hypotheses that Noble’s procedures allow singers to produce a homogeneous choral sound without sacrificing healthy and efficient vocal production (Ekholm, 2000; Giardiniere, 1991; Woodruff, 2002). The findings are also substantiated by listeners who commented that the ACC recording had a “well-blended, freely produced sound.”
Ternström (1999) found that singers prefer to hear themselves slightly more than other singers in the choir (SOR). In addition, Tonkinson (1994) found that singers tend to compete vocally with surrounding noises, including the voices of surrounding singers, due to the Lombard effect. Improved SOR helps reduce over-singing which often occurs due to the Lombard effect. Singer participants reported a general ability to hear others well in all three conditions. However, singers’ ability to hear themselves was highest in the ACC and lowest in the TC. Singers reported feeling less obligated to over-sing or to strain their voices in the ACC, most likely due to the improved SOR. Because the TC places voices with similar timbre directly next to each other (Molnar, 1950), and ACC tends to place voices that are “opposites” next to each other (Noble, 2005), singers can more readily discern their voices from others’ in the ACC. This ability to hear oneself slightly louder than others in the ACC may have reduced over-singing and contributed to the majority of singer participants rating the ACC as their most-preferred configuration. This finding is of interest because the ACC recording also had the most spectral energy (a typical characteristic of quality soloistic sound), yet the singers reported enhanced ability to blend and reduced over-singing in the ACC (typical desired qualities of choral sound). Additional research with larger sample sizes is needed to collect data on singer SOR in various choral configurations.
Although ACC seemed to benefit singers in this study, Daugherty (2001) observed that acoustic-compatibility procedures in the studies of Ekholm (2000), Giardiniere (1991), and Tocheff (1990) had limitations due to their lack of universal application and replication. He suggested that acoustic-compatibility procedures may have pedagogical application if singers are also involved in choosing their optimal acoustic configuration, and Noble (2005) also mentioned that consulting singers in the process may yield positive results. Future studies could evaluate the differences in sound between conductor-placed sections and sections where singers have input on the configuration procedures. Additionally, while there are replicability limitations to Noble’s acoustic-compatibility procedures, each study to date reported positive benefits of the method. Because consistent replicability in the final product may be unachievable due to variability in conductor preferences, it is possible that the benefits from this configuration come from the process of arranging singers, rather than the final product. Researchers should test this theory to see if the positive benefits remain when the configuration procedures are process driven, rather than product driven.
Listener Perceptions
Listeners assigned the highest ratings and rankings to the intentional (ACC and TC) configurations. There were no significant differences, and therefore no clear preferences, between the two intentional configurations, but the RC was rated and ranked significantly lower in all cases. Listeners preferred the ACC and TC recordings for a variety of reasons including blend and tone color/timbre. The lack of agreement among listeners about which intentional configuration was best may be due to listener disagreement on components of ideal choral sound. For example, Killian and Basinger (2007) observed that listener participants had less consistency in evaluating blend for soprano and tenor sections. They discussed the general lack of concurrence about the term “choral blend” and concluded that one confounding variable might be that listeners confuse “vocal tone quality” with blend. Future studies could involve listeners rating the quality of specific aspects of choral sound like blend and tone quality before rating overall choral sound. Although listeners disagreed about which intentional configuration was best, there was agreement that intentional configurations yielded better overall sound than the RC.
An unforeseen limitation occurred during the recording process when one participant quietly sang an octave below the written key for the beginning of the RC recording. She soon corrected herself and sang in the notated octave for the remainder of the recording procedures. In comments about why they liked or disliked the RC recording, 17 listeners (16%) mentioned hearing this mistake (e.g., “There was an octave lower happening quietly in the background,” and “Was someone singing an octave down for the first few measures?”), but these listeners’ responses did not significantly affect the ratings. I intentionally did not have the students practice Amazing Grace before the first recording in an attempt to control for practice effects; however, researchers in future studies could limit potential error by providing a practice session on a day prior to the recording session.
It is important to note that these configurations contrast in both theory and practice. Each participant stood next to different singers in the RC and ACC, and only two singers remained adjacent between the RC and TC. When comparing the ACC and TC, one group of three “reed” singers and two pairs of “string” singers remained together in the ACC, but 46% of participants were surrounded by singers with different timbres. The fact that singer configuration differed across conditions helps explain not only the listeners’ and singers’ ability to discern differences but also the significant differences in spectral sound energy.
Results of this investigation showed that different configuration methods can significantly affect acoustic and perceptual measures of a soprano section’s sound when controlling for the variable of singer spacing. Daugherty (2001) and Adams (2019b) suggested that singer spacing affects choral sound more than singer configuration alone, yet most of the existing research they analyzed did not account for spacing when arranging singers. Results of this study demonstrate that, when singers are evenly spaced approximately 2 ft apart, configuration within choir sections can significantly affect acoustic and perceptual measures of overall choral sound. Because every choir is made up of different voice types and every conductor has a different concept of ideal choral sound, it is difficult to say that a certain configuration method will be the best for all choirs. It is important to mention, however, that intentionally arranging singers did have positive effects on spectral sound energy, as well as on singer and listener preferences, when compared with the RC, where singers had no intentional arrangement. Although further research is still needed, choral conductors and pedagogues can use intentional configuration methods and procedures to foster a choral environment where singers can phonate freely and with ease while hearing both themselves and the singers around them to create a desired, homogeneous choral sound.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
