Abstract
Introduction
Aesthetic experience, with its observable aspects and phenomenal qualities, has been the focus of philosophical aesthetics since its beginnings (Nadal & Vartanian, 2022). In recent years, the interest in aesthetic experience has acquired new momentum. No longer is aesthetic experience reduced to a recipient's cognitive understanding of the artwork. Experiential and phenomenal aspects are increasingly considered in addition to, or even instead of, cognitive appraisals. Shusterman (1997) identified four dimensions discussed within this wider, experiential context: The evaluative dimension addresses aesthetic experiences as associated with aesthetic judgments of being pleasurable and valuable; the quality of “what it is like” to have an aesthetic experience constitutes the phenomenal dimension as discussed in the philosophy of mind (Nagel, 1974); aesthetic objects have meaning and therefore a semantic dimension; finally, aesthetic experiences are discussed in relation to other types of experience, entailing a demarcational-definitional dimension. Rather than defining aesthetic experience by a specific attitude adopted by the experiencing subject, or by properties of the aesthetic object, the subject–object interrelation during aesthetic experiences is increasingly emphasized (Bertram, 2014). Encountering an artwork, the recipient develops specific bodily, perceptual, affective, and linguistic interpretative activities guided by relations of elements within the artwork; in a self-referential turn, the constellation of elements defining the artwork is articulated by means of these interpretative activities (Bertram, 2015).
Aesthetic experience is thus conceptualized as practice, which in the case of music may involve embodied experiencing through bodily movements, singing along, and affective responses. Becker (2007) viewed musical experience as a process of reenactment (Nachvollzug), which is not only characterized by its focus on the musical work but more importantly also by a recourse to extramusical conceptions. These may be articulated by involving actual or imagined bodily actions or gestures. Musical experience therefore entails a tension between music-directed focusing and accessing extramusical resources.
In the field of empirical aesthetics, several models of aesthetic experience have been developed (Berlyne, 1960; Brattico et al., 2013; Fechner, 1876; Leder et al., 2012; Leder & Nadal, 2014; Pelowski et al., 2016). None of these, however, have specifically addressed concert experience. Little attention was paid to the specific properties of the concert environment (the “concert frame,” Wald-Fuhrmann et al., 2021) and other situational contexts in which aesthetic stimuli are perceived (such as listening to music in company or alone, listening to live performances or recorded music). Even in Leder and Nadal's (2014) complex model, which conceptualizes various influencing and reciprocal factors, the interdependencies of these factors have remained vague.
Approaches considering the relations between listener, music, and context have so far focused merely on single aspects of aesthetic experience, such as musical preferences (Greasley & Lamont, 2016; LeBlanc, 1982; North & Hargreaves, 2010) or on music-induced emotions (Zentner & Scherer, 2001). A comprehensive account covering the complexity of phenomenal aspects of musical experience and how experience emerges from the association of subject- and object-related aspects and their situational and sociocultural framing is still a desideratum.
In general, there is a growing correspondence between recent approaches in philosophical aesthetics and approaches that, informed by the cognitive sciences and explicitly relying on empirical data, emphasize the roles of the body and physical environment for musical experiencing (Cox, 2016; Krueger, 2009; Schiavio et al., 2014). This corporeal turn in (music) psychology is often termed embodiment (Tschacher, Greenwood, et al., 2024; Tschacher, Tröndle, et al., 2024).
Empirical research on the experience of, and physiological responses to, music has become an important research field in psychology and systematic musicology, and has mainly focused on music- and person-related factors. Situational and framing factors were deemed less relevant than measures that have been monitored predominantly in the laboratory; only a few studies have been conducted in ecologically valid concert settings (e.g., Egermann et al., 2013; Høffding et al., 2024; McAdams et al., 2004; Stevens et al., 2014; Tschacher, Greenwood, Egermann, et al., 2023).
The much-cited theoretical framework aiming at emotional responses to music by Juslin and Västfjäll (2008) distinguished different theories of emotion induction and applied these to music. They listed these psychological mechanisms for the induction of emotion: brain-stem reflex, evaluative conditioning, emotional contagion, visual imagery, episodic memory, and musical expectancy. Only later, Juslin (2013) proposed a further emotion-induction mechanism that he termed aesthetic judgement – when music is perceived in an artistic frame like a concert, aesthetic judgement is triggered based on criteria such as beauty, expressiveness, originality, skillfulness, and typicality. This approach was recently advanced by Schindler et al. (2017), who developed a philosophically informed and empirically grounded account of aesthetic emotions and provided scales for measurement.
Recently, Stevens et al. (2014) have carried out focus-group interviews to analyze the effect of varying locations in several concert environments, calling for a combination of different approaches in audience research. Given that questionnaires and self-report measures are the traditional methods of concert researchers, they promoted the future use of physiological data, brain scans, and observational data, since “recording a range of indicators of audience response to a live performance will shed light on the interplay between sensory modes such as vision, audition, kinesthesis, and changes in physiological arousal” (Stevens et al., 2014, p. 85).
We agree with this point of view and assume that embodied physiological and emotional states are insufficiently represented by the verbal reports of standardized surveys (Barsalou, 2008; Behne, 1982; Clarke, 2005; Tröndle et al., 2014). Only a combination of different data types, including listeners’ peripheral physiological responses, body movements, and facial expressions, can provide a holistic, integrated view of experiential processes. Consequently, and to include the mentioned additional data types, we are proposing an integrative approach to studying music experience in concerts. The methodological framework described in this article was designed to realize comprehensive data acquisition.
In an earlier project, we developed a research setup to study art experience in a museum context (eMotion – mapping museum experience¹). The rationale of this methodology in a different realm of aesthetic research was a building block for the methodological framework presented here covering concert experience. An integrated research approach provides novel insights and a holistic understanding of aesthetic experience. In the following section, this methodological design adapted to concert research will be introduced in detail.
This methodology was developed to address the physiological, experiential, and behavioral dimensions of aesthetic experience. Its goal is to collect dense information on listeners’ cognitive responses to the music, such as its expressive, artistic and performative qualities, as well as self-reported responses to the music in the situation (experiences, emotions). Data are captured as simultaneous time series, which opens the potential to model phenomena of entrainment and synchrony. We consider aspects of the concert frame (spatial and architectural disposition, room acoustics, the aura and reputation of a venue) and listeners’ relating to the presence of others (being part of the audience, relation to musicians). This methodological approach was elaborated to cover three different timescales: the whole concert (timescale 1), the pieces presented in the concert (timescale 2), and specific short music passages (timescale 3).
The main objective of the methodological framework was therefore to implement data acquisition under the field conditions of real live concerts open to the public. Ecological validity and low invasiveness were major tenets of this framework. As it is a common concern that the act of measurement may alter the very data measured (the “Hawthorne effect”), we planned to include a control group that would fill out self-report scales but not provide physiological recordings. A further objective was to allow for the integration of data sources by triangulating objective and subjective data. This demanded exact timestamping of the various data sources so that self-reports could be merged with physiological measures in the post-concert survey.
Methods
The research design involved the integration of quantitative and qualitative methodologies to assess the behavior, physiology, and self-reported aesthetic-emotional experiences of concertgoers within live classical music concerts. The methodological framework was tested in 3 public pilot concerts and finally implemented in 11 public concerts of the large-scale project ECR–Experimental Concert Research.²
Overall, quantitative data acquisition in the project included the collection and analysis of physiological measurements of electrodermal activity, cardiac activity, and respiration measured by sensors attached to one hand and a breathing belt. Videos were recorded of the audiences and the music ensemble from different perspectives and formatted to allow motion-capture techniques and detection of facial expressions in the audience. High-quality audio recordings of the concerts were collected to enable audio analysis. Self-reports of listeners were assessed through questionnaires administered by electronic hand-held devices in standardized pre- and post-concert surveys.
Qualitative research methods complemented the wide range of data acquisition methods; these included focus-group discussions with audience members after the concerts and interviews with the musicians. Further, the video recordings of the audiences were also accessible to qualitative analyses.
In the following subsections, we will first introduce the participants’ journey through the various stages of data-collection procedures. Second, we provide an insight into the technical set-up to enable the methodology to be reproduced.
Participants’ Journey
Approaching Attendees. In 2022, 11 music concerts, 2 in the venue “Pierre Boulez Saal” and 9 in the “Radialsystem,” both in Berlin, were advertised in the local press, in radio broadcasts, on posters, on social media, via mailing lists, and on the concert hall websites. Interested persons could book tickets for a concert either as a regular visitor or as a study participant. For each concert, 100 seats were reserved for participants. Depending on the seat randomly chosen in the online booking system, participants were allocated to one of 88 seats prepared for physiological data collection or to one of 12 control-group seats without physiological recording. Basic study information (“this research addresses concert experience using questionnaires and optional physiological recordings; all personal information will be coded and archived without personal identifiers”) was provided on the booking websites, but no details on the research questions or methodology were given. Over the course of the 11 concerts organized in April and May 2022, close to 700 participants were included in the study.
Pre-Concert Survey. On the concert night, participants arrived at the concert hall one hour before the concert, as instructed by the booking system, registered with their tickets, and were escorted to a seat and table prepared for the pre-concert survey with questionnaires (Figure 1). About 75% of the concert audiences consisted of study participants. Due to the specific regulations in Berlin in 2022, Covid-19 rapid self-tests were offered to validate participants’ negative infection status, enabling participation without a face mask. All participants received the following materials: an information sheet and data privacy statement for informed consent to participate, and a unique five-character token as subject identification (e.g., 9eafw). The tokens were pre-generated as a random pool of unique tokens for the entire concert series. Each token was pre-printed on a badge worn by the participant during the concert evening. Each participant was equipped with a tablet computer (Apple iPad) for the questionnaires.

Questionnaire hall in Radialsystem, Berlin, 2022. Photo: Phil Dera.
After signing the consent form and receiving the token, each participant was shown how to fill out the pre-concert questionnaire on the provided tablet computer. Assistants had been trained in advance in all procedures and the sequence of methodological steps.
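To illustrate how a pool of unique five-character tokens like the one described above could be pre-generated, a minimal Python sketch follows. It is not the project's actual generator: the character set (lowercase letters and digits), the pool size, and the function name generate_token_pool are assumptions made for this example.

```python
import secrets
import string

# Assumed character set; the source only shows one example token ("9eafw").
ALPHABET = string.ascii_lowercase + string.digits


def generate_token_pool(n_tokens, length=5):
    """Return n_tokens distinct random tokens of the given length."""
    if n_tokens > len(ALPHABET) ** length:
        raise ValueError("pool size exceeds the number of possible tokens")
    pool = set()
    # A set discards accidental duplicates, so every token stays unique.
    while len(pool) < n_tokens:
        pool.add("".join(secrets.choice(ALPHABET) for _ in range(length)))
    return sorted(pool)


# Hypothetical pool size: 11 concerts x 100 participant seats.
tokens = generate_token_pool(1100)
```

Drawing all tokens once for the whole concert series, as in this sketch, guarantees that no identifier recurs across concerts.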
After completing the pre-concert questionnaire, participants proceeded into the concert hall. They were greeted by further assistants who showed them to their individually assigned seats with sensors, and control-group participants to their seats in the allocated row (Figure 2). Regular visitors independently entered the hall right before the concerts.

Participant journey – pre-concert and in-concert.
Sensor Hardware and Placement. The participants of the physiological-analysis group were provided at their seats with a custom-developed fabric glove with two electrodermal activity (EDA) electrodes. Assistants attached these to the middle and ring fingers. A blood-volume pulse (BVP) clip for recording cardiac activity was fixed to the index finger. Additionally, a piezo-electric elastic belt to measure respiration was placed around the waist over the clothing (Figure 3). All devices and sensors were manufactured by Biosignalsplux (PLUX Wireless Biosignals SA, Lisbon, Portugal).

Custom-developed fabric glove with sensors for electrodermal activity (under the black cloth around two fingers), cardiac activity (gray finger clip), and black respiration belt (over the waist). Photo: Phil Dera.
After fixing the sensors, a sensor-placement check was carried out. Using the dashboard of the developed web interface, the project engineer monitored each participant's incoming real-time data. Each of the three sensor types (for EDA, BVP, and respiration) was individually checked. Typically, a flat signal indicated that sensor placement was insufficient and needed better fixation by the respective assistant. Common issues also included loose respiration belts and loose BVP clips on thin fingers. The sensor-placement check greatly contributed to the physiological data quality. Once sensor placement was completed, the musical program of approximately 70 min started.
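In the project, the placement check was carried out visually by the engineer on the dashboard. Purely as an illustration, a flat signal could also be flagged automatically, as in the following sketch. The variability threshold min_std, the channel names, and the synthetic test signals are assumptions and would need tuning for real sensor units.

```python
import math
from statistics import pstdev


def is_flat(samples, min_std=1e-3):
    """True if the signal barely varies, suggesting poor sensor contact."""
    if len(samples) < 2:
        return True
    return pstdev(samples) < min_std


def placement_report(channels, min_std=1e-3):
    """Map each channel name to 'ok' or 'check placement'."""
    return {
        name: ("check placement" if is_flat(values, min_std) else "ok")
        for name, values in channels.items()
    }


# Two seconds of synthetic data at 200 Hz (hypothetical values).
example = {
    "EDA": [2.0] * 400,  # constant reading: flat
    "BVP": [math.sin(2 * math.pi * 1.2 * i / 200) for i in range(400)],
    "RESP": [math.sin(2 * math.pi * 0.25 * i / 200) for i in range(400)],
}
report = placement_report(example)
```

A variability criterion of this kind would only catch fully flat signals; loose belts or clips that produce noisy rather than flat data would still require visual inspection.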
The Musical Stimuli. The concert program consisted of string quintet pieces by Ludwig van Beethoven, Brett Dean, and Johannes Brahms. The music program remained the same between concerts, and the music was performed as similarly as possible. Performers in a concert were one of two professional ensembles (Yubal Ensemble or Ensemble Epitaph). One project goal was to trace back measured differences in the experiences of audience members to the respective variations of the concert frame. In each concert, one aspect of the concert format (Wald-Fuhrmann et al., 2021) was modified. The concert frame varied with respect to the performing ensemble, the concert venue, the type of moderation leading through the concert, the lighting of the stage, the spatial sound enhancement, the changed dramaturgy, and the sequence of pieces (Tröndle, Weining, Uhde, et al., 2025). One concert was a late-night event, another concert invited audience participation, and one concert was accompanied by video images. The musical program and the concert variations were developed by the Radialsystem concert planner and mirrored contemporary concert practice in the Western classical music world.
Post-Concert Survey. With the applause at the end of each concert the physiological data recording was stopped. First, regular visitors not participating in the study left the hall and venue, whereas study participants were asked to remove sensors if applicable and return to their desks outside the concert hall, where they had initially filled out the pre-concert questionnaires. Meanwhile, the physiological data of the participants were processed using a custom-developed peak detection algorithm (see below). The control-group participants also went back to their desks prepared for the post-concert questionnaires. All participants were asked to wait until the peak processing was completed (Figure 4).

Participant journey – post-concert.
As soon as peak processing was finished, participants started filling out the questionnaires provided on the tablets. They also received headphones because short music segments together with video sequences (timescale 3) were presented as part of the post-concert questionnaire (see below, Questionnaires). Some of the regular visitors and musicians were invited to take part in group interviews.
Questionnaires. The questionnaire of the pre-concert survey asked participants about sociodemographics (age, gender, education), their motivation for visiting classical concerts, frequency of concert attendance, whether they had attended the present concert in company or alone, their music affinity (knowledge of and relation to music; Tschacher et al., 2015), musical taste (listening frequency of various genres), lifestyle (Otte, 2019), personality traits (Big-Five Inventory BFI-10; Rammstedt & John, 2007) and affective state (affect-valence scale PANAVA, Schallberger, 2005).
The post-concert questionnaire assessed participants’ listening experiences and cognition during the concert (Behne, 1997; Rössel, 2011; Weining, 2022). To analyze the difference between pre-concert expectations and in-concert experiences, a group of corresponding items was included in both the pre- and post-concert questionnaires. The participants were asked to rate the concert as a whole (timescale 1) and also each musical piece separately (timescale 2) regarding their aesthetic experiences. The questionnaires will be made available online.
Furthermore, timescale 3 referred to short music segments representing specifically salient passages of the concert. The music of the just-completed concert was divided into 96 segments (mean duration, 40 s). These 96 segments were predefined by two of the authors with a strong musicological background so that each represented a musical unit following the inherent compositional logic. By analogy with language, each segment contained a single “phrase” or “sentence,” so that segment durations differed slightly in order to represent meaningful musical entities. Out of the 96 segments of each concert, eight segments were presented to each participant in the post-concert questionnaire. Three segments (one for each piece) were so-called index segments, which were predetermined and addressed the same passage across concerts for all participants. One segment was selected randomly for each participant. Four further segments were individualized for each participant based on their peak physiological responses.
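The composition of the eight segments per participant (three index segments, one random segment, four peak segments) can be sketched as follows. The source does not state how overlaps between categories were handled; this sketch assumes that index and peak segments are distinct and that the random segment is drawn from the segments not yet selected. The function name and the segment numbers are hypothetical.

```python
import random


def select_segments(index_segments, peak_segments, n_total=96, rng=None):
    """Combine index, random, and peak segments into one selection."""
    rng = rng or random.Random()
    fixed = list(index_segments) + list(peak_segments)
    if len(set(fixed)) != len(fixed):
        # Assumption of this sketch, not a rule stated in the source.
        raise ValueError("index and peak segments must not overlap")
    # Draw the random segment from all segments not already selected.
    remaining = [s for s in range(1, n_total + 1) if s not in fixed]
    return {
        "index": list(index_segments),
        "random": [rng.choice(remaining)],
        "peak": list(peak_segments),
    }


# Hypothetical segment numbers for one participant.
selection = select_segments([5, 40, 80], [12, 33, 57, 90],
                            rng=random.Random(1))
```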
Participants were asked to evaluate each of these eight segments using a stimulated-recall method. Using headphones and the tablet computer, the participants were presented with the video snippet showing this segment exactly as it was played in the concert just attended (Figure 5). After presentation of each video, participants rated the segment using items of the Aesthetic Emotions Scale (AESTHEMOS, Schindler et al., 2017).

Presentation of a video snippet to participant on a tablet computer. Photo: Phil Dera.
Follow-up Survey. After finishing the questionnaire, participants were invited to a six-week follow-up survey. Consenting participants provided their email addresses and were sent a link later. Follow-up participants were asked four open questions concerning their recollection and emotional assessment of the concert. The questions were developed in analogy to the eMotion project (Tröndle et al., 2014).
Qualitative Interviews. For the sake of methodological triangulation, qualitative interviews were conducted. Right after the concert, regular concert visitors were invited to participate in focus group interviews on their concert experience. A maximum of 3 visitors were assigned to one of 10 interviewers on the premises of the concert hall. The guided interviews took about 20 min. Additionally, the musicians were interviewed by a team of two researchers in a further focus group or individually after all concerts to assess the performers’ experiences.
Technical Set-Up
Hardware, Software, and Informational Infrastructure. The methodological objectives of the project were to establish stable and effective data collection while providing for a high degree of ecological validity. The project had several hardware/infrastructural similarities to the prior project eMotion (Tröndle et al., 2014). Importantly, the present project required the simultaneous collection and analysis of up to 88 participants’ physiological datasets in near real-time to prepare the segment ratings in the post-concert questionnaire. Decentralized processing was thus an appropriate architecture. This was achieved by employing 88 small Raspberry Pi single-board computers (Raspberry Pi 4 Model B, 4 GB; Raspberry Pi Foundation, Cambridge, UK).
Up to 88 participants had their electrodermal activity, cardiac activity, and respiration recorded while listening to the music. Only a few minutes after the end of the concert, hence also the end of recording, participants would take part in the questionnaire, which included video snippets of exactly those four “peak segments” during which each participant had shown peak physiological values. This demanded two processes running in parallel after the concert: First, the video recording of the concert had to be segmented to produce all 96 video snippets; hence, an exact logging of segments with the timestamps of segment starts and ends was needed during each concert. Second, from the raw physiological data of the sensors, meaningful physiological measures had to be extracted, namely heart rate (HR), heart-rate variability (HRV), respiration rate (RR) and skin-conductance response (SCR), to define the four peak segments. Additionally, the three index segments and one random segment had to be determined (Figure 6).
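The first of the two processes, cutting the concert video into snippets, depends only on the logged segment timestamps and the start time of the video recording. The following sketch shows the arithmetic under the assumption that all timestamps are Unix times in milliseconds. The dictionary keys id, start, and finish and the function name clip_offsets are illustrative, and the actual cutting tool is not specified in the source.

```python
def clip_offsets(segments, video_start_ms):
    """Return (segment_id, offset_s, duration_s) for each logged segment.

    offset_s is the position of the segment start relative to the
    beginning of the video file; duration_s is the snippet length.
    """
    clips = []
    for seg in segments:
        if seg["finish"] <= seg["start"]:
            raise ValueError(f"segment {seg['id']} has no positive duration")
        offset_s = (seg["start"] - video_start_ms) / 1000.0
        duration_s = (seg["finish"] - seg["start"]) / 1000.0
        clips.append((seg["id"], offset_s, duration_s))
    return clips


# Hypothetical log entry: a 40-s segment starting 10 s into the recording.
log = [{"id": 1, "start": 1_650_000_010_000, "finish": 1_650_000_050_000}]
clips = clip_offsets(log, video_start_ms=1_650_000_000_000)
```

Each resulting offset and duration pair could then be handed to a video-cutting tool to produce one snippet per segment.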

Plan of the informational infrastructure.
Physiological Recording. Each participant with physiological recordings was fitted with sensors manufactured by Biosignalsplux (Figures 7 and 8). The sensors registered three physiological signals: EDA, BVP, and respiration. The EDA sensors provided a medical-grade output of skin conductance data, a direct measure of the activity of the sympathetic nervous system. The BVP finger-clip sensor measured changes in arterial translucency using a light emitter and a light detector built into the clip housing; BVP provides information on cardiac activity. The respiration sensor consisted of a wearable belt with an integrated sensing element measuring displacement caused by the volume changes of the thorax or waist during respiratory cycles (inhaling/exhaling). Sensor data were acquired at 200 Hz using a Python script. This choice of sensors was considered an appropriate balance between acceptable invasiveness and valid capture of physiological activity in a field situation.

Custom Biosignalsplux 8-Channel Hub.

Seats in the concert hall prepared for participants with physiological measurement.
The set of three sensors was then plugged into a Biosignalsplux 8-channel hub (Figure 9). This modified hub had an extra micro-USB port that was attached via a shielded cable to the port of each participant's Raspberry Pi. The Raspberry Pis were connected via Gigabit Ethernet to the wired network. A wired network was chosen as it provided the lowest latency and higher reliability than a wireless network connection. The hubs could be programmed to start and stop data acquisition and thus stream the live data from the sensors to the attached Raspberry Pi.

Unit holder for Biosignalsplux Hub linked with sensors (left) and Raspberry Pi (right).
The 88 Raspberry Pi computers were used to interface with the hubs. They acquired data via the USB connection, stored the data in an Apache Kafka topic (free software of the Apache Software Foundation, Wakefield, USA), and computed peaks from the stored data.
A project-developed glove held the three sensors close to the participant's hand. After the pilot studies, we decided to manufacture gloves for several reasons: The cables could be integrated into them and were thus fixed in place, reducing failures and artifacts in data collection. Participants were asked to keep the hand as still as possible and rest it on the leg. As the sensors ruled out applauding by clapping hands, the team recommended showing appreciation by stamping the feet. This was adopted without problems by all audiences in the project concerts.
Video Recording. In the concert hall, 100 seats were provided with Ethernet cabling. Each row was connected to a 24-port switch. A further 14 Raspberry Pi computers were connected to cameras (Figure 10). These Raspberry Pis were attached to the trusses above the participants to record video material from an overhead view. The overhead “birds-eye” cameras also had Ethernet access. The questionnaire room and the installation were cabled and additionally provided with wireless access points for the tablet computers. The overhead cameras video-recorded the complete audience for subsequent motion capture; they were infrared-enabled cameras that worked without much light directed at the audience.

Top: Birds-eye infrared cameras connected to Raspberry Pis. Bottom: Stills taken from the birds-eye perspective (participants; ensemble).
Two Raspberry Pis, each with a 12-megapixel (12 MP) camera module and a USB sound card, were used to record the musicians on stage. The resulting videos, including the timestamp (Figure 11), were dynamically integrated into the segment ratings of the post-concert questionnaire.

Still taken from stage camera.
High-resolution infrared cameras (Geutebrück GmbH, Windhagen, Germany) were mounted above the stage facing the audience to capture the participants’ facial expressions and gestures (Figure 12; four cameras at the Pierre Boulez Saal; eight at the Radialsystem). These data were recorded with the same timestamp as was used for all other recordings (using Clapperboard software) and stored on a camera-specific server.

Still taken from the audience camera.
Servers and Dashboard for Supervision of Recordings. Two Lenovo ThinkStation P520c tower servers (Xeon W-2225; 32 GB memory, 512 GB SSD) were deployed. One was used as the live server, whereas the second, identical machine was a standby in case the first failed. The server's main task was to host the informational infrastructure, thus providing a DNS server, timestamp server (chrony), web server (Nginx), streaming system (Kafka), as well as hosting the questionnaires, videos, and dashboard. The most demanding process within this infrastructure, as mentioned, was the processing of the participants’ segment-wise physiology (timescale 3) immediately after the concert performance finished. Instead of allocating this task to the server, it was decided to move it to each of the 88 Raspberry Pis, thus distributing the processing load and avoiding bottlenecks.
The basic software requirements included presenting the questionnaires on the tablet devices; data acquisition and storage of the sensor data; processing of the sensor data and storage of segments; and the recording of the segment videos and audios and their individual integration into each participant's post-concert questionnaire. All datasets received the timestamp of the central server to synchronize the cross-analyses.
The open-source online-survey software LimeSurvey (LimeSurvey GmbH, Hamburg, Germany) was used because its flexible customization options made it possible to integrate the segment videos dynamically into the questionnaires. The questionnaires ran on a standard LAMP stack (Linux, Apache, MySQL, PHP). The questionnaire data could be exported in numerous formats for subsequent analysis.
The starting and stopping of the data acquisition of all devices was controlled via the dashboard (web interface of this application: React.js/Node.js/MongoDB). The software had two main pages in its navigation, the Devices and the Logger pages.
On the Devices page, the dashboard operator could start and stop the recording of the Raspberry Pis of the 88 hubs, the 2 stage cameras, and the 14 birds-eye cameras. The peak-segment processing was also started manually from the dashboard once all devices had stopped recording and the stage-camera videos had been converted to 4-megapixel resolution, the format used for participants’ tablet computers. This page presented the status (recording/standby state and error messages) of all accessible devices. Real-time streams of the physiology sensors and video streams of the stage and birds-eye cameras could be monitored (Figure 13). Unix timestamps (milliseconds) served as a basic clapperboard that displayed the current time as text. The Devices page was also used to synchronize the videos of the Geutebrück cameras with the other data.

Monitoring of physiology sensors on the Devices page of the dashboard.
The Logger page located in the dashboard allowed for the manual data entry (logging) of events that took place during the concert, most importantly the start and end of the 96 predefined music segments as well as the timing of pieces and movements within the pieces. All timestamps were in Unix time. As the concert progressed, the operator clicked through the segments represented as buttons (Figure 14), thus logging each segment as it was performed by the musicians in the current concert. The logged segmentation was essential for the segment rating in the post-concert questionnaire.

Logging of segments and events using the Logger page of the dashboard.
Logging determined the unique position of each of the 96 predefined segments across all pieces (Beethoven, Dean, Brahms). If the pieces were performed in a different sequence in a concert, each segment could thus still be identified. The start and finish properties represented the start and finishing times of the segment in Unix time.
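Because the logged segments and the sensor data share the same server timestamps, the samples belonging to a segment can be extracted by a simple lookup. The following sketch assumes sorted sample timestamps in Unix milliseconds and a half-open interval from start (inclusive) to finish (exclusive). The function name and the interval convention are assumptions of this example, not documented project behavior.

```python
from bisect import bisect_left


def samples_in_segment(timestamps_ms, values, start_ms, finish_ms):
    """Return the values recorded in the interval [start_ms, finish_ms).

    timestamps_ms must be sorted in ascending order and have the same
    length as values.
    """
    lo = bisect_left(timestamps_ms, start_ms)
    hi = bisect_left(timestamps_ms, finish_ms)
    return values[lo:hi]


# One second of synthetic data at 200 Hz: one sample every 5 ms.
timestamps = list(range(0, 1000, 5))
values = list(range(200))
window = samples_in_segment(timestamps, values, start_ms=100, finish_ms=200)
```

Binary search keeps the lookup fast even for a full concert, which at 200 Hz and roughly 70 min amounts to about 840,000 samples per signal.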
Peak Processing. The participants wore sensors capturing BVP, EDA, and respiration. Using the BioSPPy library (https://github.com/PIA-Group/BioSPPy), these raw signals were converted into the more salient HR, SCR, and RR signals. The BVP signal was additionally used to compute HRV with the RMSSD procedure, using an interval of 30 s. The four signals, HR, RR, SCR, and HRV, were smoothed with a median filter over a 15-s sliding window before further analyses were performed on them.
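The two generic operations named above, RMSSD and median filtering, can be written out in a few lines. The following sketch does not reproduce the BioSPPy-based pipeline of the project; it assumes that inter-beat intervals (in milliseconds) have already been extracted from the BVP signal for one 30-s window, and it lets the median window shrink at the edges of the series, which is one of several possible edge conventions.

```python
import math
from statistics import median


def rmssd(ibi_ms):
    """Root mean square of successive differences of inter-beat intervals."""
    diffs = [b - a for a, b in zip(ibi_ms, ibi_ms[1:])]
    if not diffs:
        raise ValueError("need at least two inter-beat intervals")
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def median_filter(values, window):
    """Sliding median; the window is truncated at both ends of the series."""
    half = window // 2
    return [
        median(values[max(0, i - half): i + half + 1])
        for i in range(len(values))
    ]


# Hypothetical inter-beat intervals from one window, in milliseconds.
hrv_value = rmssd([800, 810, 790, 800])

# A single outlier is removed by a three-sample median window.
smoothed = median_filter([1, 1, 9, 1, 1], window=3)
```

For a series sampled at 1 Hz, a 15-s window corresponds to window=15; at other sampling rates the window length in samples scales accordingly.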
After testing in the pilot concerts, we decided to define each participant's peaks in relative terms, and not exclusively use the 15-s sliding window for peak detection. Thus “peaks” were not necessarily those moments in which the participant's extreme value of the entire concert was reached, but moments that stood out within the local context of the respective movement. This decision was made so as not to favor, from the outset, segments located in the more “exciting” or high-tempo movements. Correspondingly, peak segments of HR, RR, SCR, and HRV were those four segments of each participant that showed the greatest deviation from their respective movement averages of HR, RR, SCR, and HRV. Detection of peaks was thus performed in segments within the same movement.
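The relative peak definition can be illustrated with a short sketch. Several details are not specified in the text and are assumed here: deviation is taken as the absolute difference between a segment mean and the mean of all segment means in the same movement, one peak segment is chosen per signal, and a segment already chosen for one signal is skipped for the next so that the four peak segments are distinct. All names and numbers are hypothetical.

```python
from collections import defaultdict


def peak_segment(segment_means, movement_of, exclude=()):
    """Return the segment deviating most from its movement average."""
    by_movement = defaultdict(list)
    for seg, value in segment_means.items():
        by_movement[movement_of[seg]].append(value)
    movement_avg = {m: sum(v) / len(v) for m, v in by_movement.items()}
    candidates = [s for s in segment_means if s not in exclude]
    return max(
        candidates,
        key=lambda s: abs(segment_means[s] - movement_avg[movement_of[s]]),
    )


def peak_segments(signals, movement_of):
    """Choose one distinct peak segment per signal (e.g., HR, SCR)."""
    chosen = {}
    for name, means in signals.items():
        chosen[name] = peak_segment(means, movement_of,
                                    exclude=set(chosen.values()))
    return chosen


# Six segments in two movements; values are invented for illustration.
movement_of = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B", 6: "B"}
hr = {1: 60, 2: 61, 3: 71, 4: 95, 5: 96, 6: 97}
scr = {1: 0.1, 2: 0.1, 3: 0.7, 4: 0.5, 5: 0.9, 6: 0.4}
peaks = peak_segments({"HR": hr, "SCR": scr}, movement_of)
```

In this example, segment 3 is selected for HR although the highest absolute heart rate occurs in segment 6, which shows how the movement-relative definition avoids favoring a generally more activating movement.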
Results
In this section, we will first sketch the main findings that have resulted from implementing the presented methodological framework to date and also point to analyses suggested or enabled by the data. Subsequently, we will address the challenges and the benefits that have come to the fore when the methodological framework was implemented.
Main Findings and Analyses
Pre- and Post-Concert Surveys. Using the self-report data of the questionnaires, statistical analyses compared the expectations before to the experiences after the concerts, which generated visitor typologies with respect to visitors’ motivations (Tröndle, Weining, Uhde, et al., 2025; Tröndle, Weining, Wald-Fuhrmann, et al., 2025). Questionnaire data also suggested distinguishing between visitors’ varying listening modes (Weining et al., 2024). Such dimensions of aesthetic experience and attitudes were established via exploratory and subsequent confirmatory factor analyses of the groups of items on aesthetic judgment and experiences contained in the post-concert questionnaire.
Qualitative Interviews. The analyses of the qualitative interviews were rooted in the Grounded Theory framework. Interviews were transcribed, then checked and corrected by a third party. Two coders generated potential categories, which were adjusted recursively. The software MAXQDA (2022, version 22.6.1) was used, and the responses were coded individually. This procedure was applied to the qualitative interviews conducted directly after the concert, and likewise to the four open questions of the e-mail-based follow-up survey (Böndel et al., 2025).
Physiological Data. The physiological time series of each participant's five electrodermal, cardiac and respiratory signals (SCR, HR, HRV, RR, RESP) were pre-processed to accord with the demands of the ensuing analyses of audience synchronies. Time series of the signals with sampling rates of 1 and 10 Hz were used to assess such synchronies, defined as the significant coordination of participants’ physiological activation induced by the music (Tschacher, Greenwood, Egermann, et al., 2023; Tschacher, Greenwood, Ramakrishnan, et al., 2023). The estimation of synchrony was realized by software for surrogate synchrony (acronym SUSY), available as R packages (Meier & Tschacher, 2021; Tschacher, 2022). The emergence of significant physiological synchronies was found in all signals except breathing behavior (RESP), both on the basis of complete concerts and each of the three pieces (timescales 1, 2). This finding and the associations between the individual tendency to synchronize and participants’ personality traits, affectivity, and aesthetic experience were replicated across concerts (Tschacher, Tröndle, et al., 2024).
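The logic of surrogate synchrony can be sketched for a single pair of time series. The code below is a simplified illustration and not the SUSY R package: the function names are hypothetical, synchrony is taken as the mean absolute cross-correlation within segments, and surrogates are generated by rotating the segment order of one series so that every segment is paired with a non-matching one, which is one simple variant of segment shuffling.

```python
import random
from statistics import mean, stdev


def pearson(x, y):
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant segment: correlation treated as zero
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (sxx * syy) ** 0.5


def segment_sync(x, y, max_lag):
    """Mean absolute cross-correlation over lags -max_lag..max_lag."""
    n = len(x)
    values = []
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            xs, ys = x[: n - lag], y[lag:]
        else:
            xs, ys = x[-lag:], y[: n + lag]
        values.append(abs(pearson(xs, ys)))
    return mean(values)


def split(series, seg_len):
    """Cut a series into consecutive, non-overlapping segments of seg_len samples."""
    return [series[i: i + seg_len]
            for i in range(0, len(series) - seg_len + 1, seg_len)]


def surrogate_synchrony(x, y, seg_len, max_lag):
    """Return (real synchrony, effect size relative to the surrogates)."""
    xs, ys = split(x, seg_len), split(y, seg_len)
    real = mean(segment_sync(a, b, max_lag) for a, b in zip(xs, ys))
    surrogates = []
    for shift in range(1, len(ys)):  # every rotation mismatches all segments
        rotated = ys[shift:] + ys[:shift]
        surrogates.append(
            mean(segment_sync(a, b, max_lag) for a, b in zip(xs, rotated)))
    sd = stdev(surrogates)
    effect_size = (real - mean(surrogates)) / sd if sd > 0 else 0.0
    return real, effect_size


# Synthetic demo: y follows x with a one-sample delay plus a little noise.
rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(300)]
y = [0.0] + [v + rng.gauss(0, 0.1) for v in x[:-1]]
real, effect_size = surrogate_synchrony(x, y, seg_len=30, max_lag=2)
```

A clearly positive effect size indicates that the coupling in the genuinely simultaneous segments exceeds what is found when segments are mismatched in time, which is the sense in which synchrony is called significant.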
Segment Data. For all concerts, team musicologists logged the music and identified the exact timestamps of 96 pre-defined music segments (timescale 3). As part of the post-concert questionnaire, a stimulated-recall method was used: each participant was presented with video and audio snippets of eight segments and rated their experience and assessment of these moments in the concert. Analyses of physiology–experience associations during music segments were performed (Tschacher et al., in press). Multiple associations between aesthetic experience and physiological measures were detected in this large sample of almost 700 participants, who provided thousands of specific ratings on music segments. For example, music segments experienced as more beautiful were accompanied by higher sympathetic activation, as indicated by higher HR and RR and lower HRV. The same held for ratings of being impressed and physically stimulated by the respective music excerpts.
Infrared Birds-Eye Cameras. The video recordings of the birds-eye cameras were used to detect body movement of participants during the music presentations. The video-based approach of motion energy analysis (MEA: Ramseyer & Tschacher, 2011) operationalizes the extent of body movement by the number of frame-to-frame pixel changes of a video. Pixel changes are quantified continuously within specific regions of interest of the video and can thus be allocated to each individual participant in the audience. The resulting movement time series were used to estimate movement synchrony among participants (Tschacher, Greenwood, Ramakrishnan, et al., 2023). Synchrony was again computed with the surrogate synchrony software SUSY in R. Motion capture also allows the amount of movement, or bodily unrest, during segments of the concerts to be assessed in ensuing analyses.
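The core of MEA, counting pixel changes between consecutive frames within a region of interest, can be sketched in a few lines. This is a minimal illustration rather than the MEA software itself: frames are represented as plain nested lists of grayscale values, and the function name, the region format, and the noise threshold are assumptions.

```python
def motion_energy(frames, roi, threshold=10):
    """Count changed pixels per frame transition inside one region of interest.

    frames:    list of 2-D grayscale frames (lists of rows of ints, 0..255)
    roi:       (top, left, bottom, right), with bottom and right exclusive
    threshold: minimal grayscale change that counts as movement (assumed value)
    """
    top, left, bottom, right = roi
    series = []
    for prev, curr in zip(frames, frames[1:]):
        changed = sum(
            1
            for r in range(top, bottom)
            for c in range(left, right)
            if abs(curr[r][c] - prev[r][c]) > threshold
        )
        series.append(changed)
    return series


# Three tiny 2x4 frames; one pixel changes in each half between frames 0 and 1.
frame0 = [[0, 0, 0, 0], [0, 0, 0, 0]]
frame1 = [[200, 0, 0, 0], [0, 0, 0, 200]]
frame2 = [[200, 0, 0, 0], [0, 0, 0, 200]]
left_seat = motion_energy([frame0, frame1, frame2], roi=(0, 0, 2, 2))
right_seat = motion_energy([frame0, frame1, frame2], roi=(0, 2, 2, 4))
```

Defining one region of interest per seat yields one movement time series per participant, which can then enter the synchrony analysis.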
Acoustic Music Information. Sound recordings of all performances were obtained using two condenser microphones in an A-B arrangement. At each venue, the recording arrangement remained constant across concerts. The audio data were transformed into audio features common in the research field of music information retrieval (MIR, Lerch et al., 2020). These features capture acoustic properties of the music and aspects of musical and sonic perception. The large collection of audio features allows meaningful associations between objective features and subjective responses to be studied. Frequently analyzed features are loudness, tempo, and measures of the frequency spectrum. These audio features, computed using the Python library librosa (McFee et al., 2015), are represented as time series, which can be aligned with the physiological time series of listeners to compute feature–physiology synchronies and coordination (Tschacher et al., in preparation).
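How an audio feature becomes a time series that can be aligned with physiology may be illustrated with a dependency-free stand-in. The project computed its features with librosa; the sketch below does not use librosa and only shows the principle with a root-mean-square amplitude per second as a crude loudness proxy. The function names, the 1 Hz resolution of the feature, and the alignment by Unix start times are assumptions for illustration.

```python
import math


def rms_per_second(samples, sample_rate):
    """Root-mean-square amplitude of each complete one-second block of samples."""
    n_seconds = len(samples) // sample_rate
    out = []
    for s in range(n_seconds):
        block = samples[s * sample_rate: (s + 1) * sample_rate]
        out.append(math.sqrt(sum(v * v for v in block) / sample_rate))
    return out


def align(feature_1hz, feature_start_unix, physio_1hz, physio_start_unix):
    """Trim two 1 Hz series to their overlapping time span.

    Start times are Unix times in seconds; offsets are rounded to whole seconds.
    """
    offset = int(round(physio_start_unix - feature_start_unix))
    if offset >= 0:
        feature_1hz = feature_1hz[offset:]   # audio started earlier
    else:
        physio_1hz = physio_1hz[-offset:]    # physiology started earlier
    n = min(len(feature_1hz), len(physio_1hz))
    return feature_1hz[:n], physio_1hz[:n]


# Toy signal at 4 samples per second: one quiet second, one loud second.
loudness = rms_per_second([0.5] * 4 + [-1.0] * 4, sample_rate=4)
audio, physio = align([1, 2, 3, 4], 100, [10, 20, 30], 102)
```

Once both series share the same sampling rate and time span, they can be passed to the same synchrony estimation that is used for pairs of physiological signals.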
Infrared Front Cameras. The images of the front cameras are analyzed both quantitatively and qualitatively. For the qualitative analyses, a coding system was developed that included all types of relevant movements and actions in the audience. Coders annotated a selection of the concert videos according to such movements using the annotation software ELAN (Max Planck Institute for Psycholinguistics, https://archive.mpi.nl/tla/elan). For each participant, the exact beginning and end time of a movement was documented together with the type of action. From the annotations, the frequency and timing of movements become available to study how listeners’ movements reflect musical affordances, or to distinguish patterns of group dynamics. The resulting annotations can thus be used in both qualitative and quantitative approaches.
For the quantitative analysis of facial expressions, the regions of interest were identified first. A Python script was then used to crop individual videos of all faces that allowed detection of facial expression (in the present dataset, 537 faces were selected; participants with masks or reflecting glasses were excluded). Many participants were recorded by more than one camera, so the best video quality for each face was identified, and duplicates were excluded. Ultimately, a total of 303 faces were analyzed using the software iMotions Affectiva (Figure 15). Based on the Facial Action Coding System (Ekman et al., 2002), Affectiva monitors muscle movement in 23 different facial action units. Seven basic emotions and five complex emotions can be derived from combinations of action units. Against the background of the questionnaire data and physiological measurements, it is possible to investigate the connection between the emotional facial expressions displayed by participants during the concert and their ratings of aesthetic emotions as well as the physiological recordings (Herget et al., 2023; Weth et al., 2015).

Automated facial expression analysis using Affectiva Software. Rectangles show “regions of interest,” i.e., participants’ faces included in the analysis (green: included faces).
Art Installation. As an extensive and complex dataset was recorded in each concert of the project, we considered that it would be rewarding for participants to view their own data flows after the concerts and compare them to other participants’ data. The installation artwork Prāna created an artistic impression of the audience's continuously changing physiological responses. Participants could view the installation on a large screen after completing the post-concert questionnaires, and each participant received a printout of the display that showed all participants’ respiration time series in grayscale, with the participant's own respiration highlighted in color (Figure 16). The artist Chandrasekhar Ramakrishnan described the display as follows: “Prāna is the Sanskrit word for breath. But it has connotations beyond just the physical act of breathing. Prāna is the unification of body and spirit. This work searches for prāna in the experiences of concertgoers. An inspiration for this work is the research of jazz drummer Milford Graves. He saw the heart as the central carrier of music, independent of culture and musical tradition. Whereas Graves examined the creation of music, this work is about its reception, and breath plays the central role. In the following scenes, the breathing of the concert participants, considered over three different time periods (piece, movement, moment), is represented abstractly as image and sound. Perhaps recurring patterns will emerge, or perhaps the experience of a concert is so individual and unique that no patterns will be visible. The sequence runs for about ten minutes.”

Printout of the artwork Prāna.
Challenges
The methodological framework comprised a large array of variables encompassing multiple components of subjective emotion and experience relevant in music listening (Scherer, 2005): cognitive appraisals (in the present methodological framework, questionnaire data on appreciation), subjective experiences (questionnaire data on experiences), physiological arousal (continuous physiological monitoring), and expressive behavior (camera recordings of musicians’ and participants’ movements, applause intensity and duration). Whereas in controlled laboratory experiments most of these components may be reliably accessible, the implementation in field contexts such as public concerts posed considerable challenges owing to the large sample of participants whose data were recorded simultaneously.
Thus, the technical development of the software, hardware, and infrastructure was demanding. Although a variety of devices for physiological data collection in research settings is available, problems were encountered related to internal time processing, synchronization with external sources, and the connectivity of the devices for real-time data analysis. We therefore collaborated with the producers of Biosignalsplux to adapt their devices to our needs. As described in the Methods section, a modified infrastructure was necessary to arrive at a stable and flexible system.
Ensuring a smooth experimental procedure within the context of a live concert evening was another major challenge. A large and trained team of temporary assistants and professional coordinators was crucial. With a ratio of one assistant per two participants, each participant was guided comfortably through the evening. This ensured a convenient personal experience for concertgoers and kept the level of ecological validity and participant satisfaction high. Queues formed only at the venue entrance, and participants could move freely in the venues except when filling out the questionnaires and being fitted with the physiological sensors. For participants, the duration of the concert evening including the surveys was about 2.5 h, which seemed to be the maximum in terms of concentration and patience. In principle, future projects should consider condensing the various steps of the methodological procedures wherever possible.
Studying the effects of variations of the concert format (i.e., the “concert frame”) on aesthetic experiences was one goal of the methodological framework. These variations were programmed with the background of state-of-the-art concert formats and programs in the field of classical concert practice. Throughout the process and by analyzing the data, we learned that not all such variations turned out as expected (Tröndle, Weining, Uhde, et al., 2025). In some cases, the variations were found too subtle to exert a clear effect. Future research may consider testing the impact of variations before their actual experimental implementation. Further, an artistic realization of a certain format idea (such as “audience participation” or “visual enhancement through lighting”) may in principle take numerous concrete forms. In our experiments, we studied the effects of only one realization per artistic idea, which does not yet allow generalization of findings.
Owing to the complexity of the hardware and software set-up, artifacts and missing data were to be expected, as in all physiological research, especially outside the lab. Countermeasures were implemented to reduce attrition to a minimum. In 2020, we therefore conducted a pilot study with three concerts to test the technical setup and design. Of 141 participants who provided informed consent, 9 had to be excluded because they were younger than 18 years or because they left the venue before the concert. The physiological recordings showed considerable degrees of missing data: 9% in respiration, 35% in electrodermal activity, and 53% in cardiac measures. This called for improved finger-clip sensors for the blood-volume measures and better fixation of electrodes, both of which were achieved in time for the final concerts. Considerable work was additionally invested in adapting the peak-detection algorithm, whose performance was found unsatisfactory in the pilot study. The resulting improvements were integrated into the 2022 setting of the regular project concerts. In other words, the rehearsal concerts of the pilot study were essential for the success of the final data acquisition.
Although the hardware and software were tested in the pilot-study concerts, the complexity of the design resulted in some unexpected dropouts and errors. Some limited data losses were caused by misunderstandings among musicians or researchers, which could have been avoided by written instructions, for instance a concise instruction manual for the logger page, and even more extensive previous training.
Concerning network design, we were confronted with problems encountered in a wireless network architecture. It was therefore decided to use a wired network instead because of its higher reliability. This decision increased the effort and costs involved in the installation, as a dedicated network installation company had to be employed to lay hundreds of meters of network cabling. Networking issues occurred particularly with the simultaneous streaming of the locally hosted segment videos to participants. Some participants found the waiting time due to peak processing unpleasant; computational load balancing may alleviate this delay problem in the future.
Some Biosignalsplux hub devices became unresponsive to dashboard calls right before the concert started. We countered this issue by generally conducting short start-and-stop tests. Unresponsive devices were thus identified and turned off and on again. The 15-min time window dedicated to sensor placement on participants was therefore found very short and should be extended.
Benefits
Despite the described challenges, and likely because these challenges were dealt with at the right time, the project arrived at a successful implementation of this methodological framework, generating a rich and complex dataset. The integration of the data by means of the visitor tokens and a computational infrastructure providing a central timestamp allowed us to relate and join the various data types. This multiplied the analytic possibilities and offered novel insights into the aesthetic and social experience of people attending a classical music concert. The presented methodology is scalable and can be readily implemented in other kinds of performances, such as opera, theater, musicals, and movies.
As described in the Methods section, only a modified infrastructure (building USB links connecting the Plux devices with Raspberry Pis) finally allowed us to develop a stable and flexible system, which could start and stop the devices at the same time and, if necessary, repeatedly; process data streams simultaneously and in parallel on the multiple Raspberry Pis; integrate the logger data; and synchronize the physiological data with the videos.
This complex methodology was able to generate individualized video sequences based on the physiological data that had just been recorded during the concert and then had undergone the peak detection process. By using high-performance servers, we managed to reduce the duration of post-concert data processing, simultaneously for up to 88 participants, to between 5 and 10 minutes. Presenting such data in time for ratings in the subsequent post-concert questionnaires is an innovation in empirical aesthetics research.
An essential methodological improvement was monitoring of the real-time data streams to the dashboard, which allowed readjustment of participants’ sensors before the presentations started. This significantly increased the number of high-quality datasets from the pilot study in 2020 to the final study in 2022. In the 2022 data acquisition, 747 adult participants were allocated to the group with physiological data acquisition and 45 to the control group without. Participants reported little distraction resulting from the sensors; the general evaluation of the concert experience in the post-concert survey showed no significant differences between the participants and the control group of participants without physiological measurements (ANOVA F(1,775) = 0.02, p = .88). Of both groups, more than 98% participated in the pre-concert survey and 94% in the post-concert survey (here, however, participation of the control group was only 87%). In the final concerts, the percentage of successful recording of physiological signals was 80% for cardiac measures HR and HRV, 92% for respiration measures (RR), and 88% for electrodermal measures (SCR). Thus, the final methodological framework considerably reduced attrition due to drop-out and missing data observed in the 2020 pilot study.
Discussion
The methodology of data acquisition described in this article allows for types of analyses that are novel for musicology. The setting of live concerts offers ample information on the collective behavior and physiology of whole audiences. As the time series data are recorded simultaneously, physiological and motor synchronies of complete audiences can be derived from all participants’ coordination of physiology and movement. Using cross-correlation algorithms, such synchrony signatures can be quantified for entire audiences but also assigned to the individual participants, their “synchrony contributions.” These latter measures allow linking objective embodied synchronies to each participant's self-reported aesthetic assessments of the presented music, thus supporting fine-grained modeling of individual associations between physiology, body movement, and aesthetic experiences. Additionally, the synchronization of single participants with time series of music properties such as loudness, tempo or pitch (MIR data) offers data for individual associations. The pre-concert survey provides data on further states and traits of participants, their affective and mood states, personality traits, and musical backgrounds and attitudes.
Establishing the methodological framework of a large research project such as the one described here demands considerable financial resources, and much effort must be devoted to team-building. A willingness to overcome unforeseen hurdles is mandatory. Hurdles may include technical and practical problems; it is especially important to cope with the restrictions of scientific disciplines and compose a truly interdisciplinary team (Tröndle et al., 2019, 2022). The funding period of the present project, ECR–Experimental Concert Research, was six years, which does not include years needed for building the core team and the preparation of the proposal, nor most of the time for data analyses and publications after the official termination of the project.
The outlined methodological framework yielded a high proportion of valid data while adhering to an ecologically valid, non-invasive approach. Looking back on the data and insights gained to date, we believe that such knowledge can only be accumulated through transdisciplinary cooperation and the collection of large and diverse datasets. The complexity of aesthetic experience must be investigated through integrative studies of real performances “in the wild,” such as in the context of a public concert hall. In this way, reliable and generalizable knowledge of concert experience, and aesthetic experience in general, comes within reach.
Footnotes
Action Editor
Alexander Refsum Jensenius, University of Oslo, RITMO Centre for Interdisciplinary Studies in Rhythm, Time, and Motion, & Department of Musicology.
Peer Review
Sara D'Amario, University of Oslo, RITMO Centre for Interdisciplinary Studies in Rhythm, Time and Motion.
Jonna Vuoskoski, University of Oslo, RITMO Centre for Interdisciplinary Studies in Rhythm, Time and Motion.
Author Contributions
MT: project management, writing; SG: setup of informational infrastructure, data integration, writing; CW: organization of data collection; CR: data integration; MW-F: musicological concert log, review; HG: review; A-KH: facial expression analysis; DH: music information retrieval; CS: review; WT: statistical analyses, writing.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Ethical Approval
The written informed consent included permission for recording and processing of participants’ data. The statement informed about privacy protection, data security, and the right to the deletion of data, including video and sound records. Permission was also requested for the usage of video and sound data for demonstration purposes including teaching and publications. The procedure adhered to the principles of the Declaration of Helsinki and ethics regulations in Germany and was approved by the Ethics Council of the Max Planck Society (#2702_12).
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was conducted within the Experimental Concert Research project, which was substantially funded by VolkswagenStiftung. The concert series was additionally supported by the Aventis Foundation (grant number 93 263).
