Flow Optimizer Framework: Validation of a Dynamic Difficulty Adjustment System for Serious Games

Abstract

Introduction:

Serious games are an important tool to overcome the low engagement and adherence to rehabilitation programs due to their repetitive nature and lack of positive reinforcement. Dynamic difficulty adjustment (DDA) systems can contribute by providing algorithms to adapt serious games, keeping players engaged and in a flow state. However, these systems are generally custom-made for specific purposes and goals, lacking the adaptability to be easily integrated into serious games. In response to this problem, we introduced the Flow Optimizer Framework (FOF), a game-agnostic DDA system developed for Unity. This framework facilitates the integration of DDA algorithms with serious games in Unity, enabling real-time monitoring and adaptation based on player state through data processing, rule-setting, and decision-making.

Materials and Methods:

First, we conducted a technical validation of the framework, assessing its performance in handling real-time data streams and its responsiveness to different scenarios. Following this validation, we evaluated its effectiveness in enhancing the flow state by conducting a usability study. Participants were presented with three different types of DDA paradigms implemented in FOF (Implicit, Explicit, and Subjective), each with different algorithms to adjust the game’s difficulty.

Results:

The results obtained showed that the implementation of a biofeedback paradigm using the player’s heart rate was the one that increased game performance the most, and participants reported this condition as the most enjoyable and fitting to their skills.

Discussion:

Overall, participants reported high usability and a strong sense of presence in the serious games implemented with FOF.

Keywords

Dynamic difficulty adjustment; Serious games; Biofeedback; Neurorehabilitation; Physiological signals; Adaptive systems

Introduction

Adapting game difficulty to the player’s needs is crucial in maintaining engagement, particularly in rehabilitation-focused serious games. One of the significant challenges in rehabilitation is the repetitive and often frustrating nature of traditional therapies, which can reduce motivation and hinder progress.² To counteract this, researchers have explored serious games: interactive applications designed to make rehabilitation more engaging and effective.² By incorporating dynamic difficulty adjustment (DDA) systems, serious games can dynamically modify challenge levels to match a player’s abilities, helping to maintain an optimal balance between challenge and skill.³ This balance is closely tied to the psychological state of flow, where players experience deep engagement and immersion, which has been linked to improved learning and performance.⁴

In serious games, DDA typically operates through two main mechanisms: performance-based adaptation, which adjusts difficulty based on player success or failure, and physiological adaptation, which responds to emotional and physiological signals such as arousal, boredom, or excitement.⁵ While performance-based DDA is widely used, it does not always capture player engagement, as individual perceptions of difficulty can vary significantly.⁶ To address this, researchers have explored real-time physiological monitoring to better assess player states and fine-tune difficulty accordingly.^7,8 However, key challenges remain, including integrating DDA into game frameworks, providing meaningful feedback, and accurately measuring the player’s flow state.^3,9

To address these challenges, biofeedback-based DDA systems must be reliable, adaptive, and seamlessly integrated across different games. In addition, accurately assessing a player’s emotional state remains a significant challenge in delivering a personalized gaming experience. While these challenges may limit the adoption of such systems, previous research has demonstrated that adaptive serious games can significantly improve player engagement and flow state.

Hence, we introduced the Flow Optimizer Framework (FOF),¹ a game-agnostic DDA system designed for Unity. FOF incorporates Python’s PyFlow visual scripting tool to implement and manage DDA algorithms efficiently. To validate its functionality, we conducted a technical performance evaluation to ensure its reliability across various scenarios. Subsequently, we tested its effectiveness and user experience through a usability study using RobotMania, a custom-developed serious game that integrates multiple feedback mechanisms.

State of the Art

DDA has emerged as a crucial mechanism for enhancing engagement in serious games by maintaining an optimal balance between challenge and skill. By dynamically modifying game parameters, DDA helps prevent negative emotional states such as boredom or anxiety, which can arise when a game is too easy or too difficult.⁷ In serious games, DDA systems dynamically assess the player’s state and modify game parameters accordingly to maintain an optimal challenge level. Adaptation mechanisms may involve adjusting task complexity, changing the time available, modifying game-specific variables, or providing adaptive feedback to guide the player’s performance.^8,10 While performance-based adaptation is widely used—where game difficulty adjusts based on the player’s success or failure—it does not always capture engagement accurately, as difficulty perception varies among individuals.¹¹

To address this, two primary DDA mechanisms have emerged: performance-based adaptation, which adjusts difficulty based on in-game performance, and physiological adaptation, which relies on real-time biofeedback to assess emotional states (e.g., arousal, stress, excitement) to trigger adaptive changes.¹⁰ Several studies have explored different DDA techniques and real-time adaptation methods.^5,7,9,12,13 However, several challenges persist, including seamlessly integrating DDA into game frameworks, providing feedback, and accurately measuring the player’s flow state.^4,5

One area of application of DDA is virtual reality (VR) exergames. Küntzer et al.¹⁴ examined the effectiveness of heart rate (HR)-based DDA in VR exergaming by dynamically adjusting gameplay challenges based on HR to optimize physical activity levels. Their findings indicated that HR-based DDA was more effective in maintaining target HR zones compared with randomized difficulty adjustments. Furthermore, participants reported improved perceived exertion and increased enjoyment, emphasizing the potential of this approach for enhancing VR-based exercise and rehabilitation programs.

Biofeedback has been increasingly integrated into serious games to dynamically adjust challenge levels based on real-time physiological signals, thereby enhancing player engagement and promoting a state of flow.^15–17 By continuously monitoring emotional states, biofeedback helps players regulate their responses to challenges, improve in-game performance, and transfer learned skills to real-world stress management scenarios.¹⁸ Standard physiological signals used in biofeedback applications include electrocardiography, electrodermal activity (EDA), photoplethysmography (PPG), electromyography (EMG), and electroencephalography. The real-time analysis of these signals provides valuable insights into stress and anxiety levels, allowing games to adapt dynamically to the player’s mental and emotional state.

An example of biofeedback in gaming is “Nevermind,” an adventure thriller game that integrates consumer-level biosensors to monitor players’ stress and fear in real time.¹⁹ The game dynamically increases its difficulty when players become anxious and reduces the challenge when they successfully regulate their stress, reinforcing emotional control through gameplay. This system demonstrated its effectiveness in creating a more immersive and emotionally responsive game experience.

Another study explored the effects of implicit and explicit biofeedback in a first-person shooter game designed to assess different biofeedback techniques.²⁰ In a two-phase experiment, the implicit biofeedback was compared with the explicit biofeedback. The study found that explicit biofeedback significantly enhanced immersion and enjoyment, as players could directly manipulate the game environment using their physiological signals (EDA and respiration). These findings suggest that explicit biofeedback interactions could be further explored in game applications to enhance player experience.

Previous research has also explored the use of DDA systems to adjust game difficulty; however, studies integrating both DDA and biofeedback in serious games remain limited. First, Bodolai et al.²¹ introduced a framework using physiological sensors to classify players’ mental states and adapt the difficulty to maintain a state of flow. Seyderhelm et al.²² developed the CASG-F framework, using real-time cognitive load measurements to adjust immersive environments in serious games. In addition, Bicalho et al.²³ proposed a Unity Engine plugin classification system, processed in a Python Application Programming Interface (API), using DDA based on the classification. Souza et al.²⁴ created the DDA-MAPEKit framework for Unity, tested in the CicloExergame, where players pedaled on a cycle ergometer while receiving real-time biofeedback on HR and oxygen saturation. Finally, Lima et al.²⁵ developed the Virtual Levada VR environment, dynamically adjusting the difficulty in real time using a Proportional-Integral-Derivative (PID) controller based on the participant’s HR, providing real-time biofeedback to help players regulate their physical effort.

In summary, while various DDA frameworks and biofeedback applications have been developed, integrating these elements into a unified, game-agnostic system remains a challenge.

Proposed Framework

This section introduces the FOF, a game-agnostic framework designed to adapt Unity games in real time, ensuring an engaging and challenging experience. It uses a hybrid approach that integrates Unity, PyFlow^a, and the Lab Streaming Layer (LSL)^b communication protocol, enabling seamless adaptation across various serious games. FOF consists of three core modules as follows: the DDA for Unity module, which is used to select and change which variables within the Unity game will be adapted; the PyFlow module to create rules to adapt the Unity game; and the LSL module to allow communication between all the modules, including all types of physiological sensors that are compatible with LSL.

In the following section, we will describe each one of the modules of the framework in detail. Figure 1 shows the overall structure of the FOF framework.

FIG. 1.

Flow Optimizer Framework (FOF) structure.

DDA for Unity

The DDA module was developed as a Unity package, allowing integration into any existing Unity game. It identifies predefined public variables within a specific C# script by assigning the script to the DDA prefab object via Unity’s inspector tab. To modify and adapt these public variables, the module uses LSL, enabling data transmission and reception. For this purpose, we incorporated the LSL4Unity^c package, which allows customization of LSL stream parameters, including stream name and data type. In addition, the FOF allows the user to start, stop, and pause data streaming. The DDA module is shown in Figure 2.

FIG. 2.

DDA module for Unity. DDA, Dynamic Difficulty Adjustment.

PyFlow

To ensure user-friendliness, particularly for individuals with limited programming experience, the FOF integrates PyFlow for rule creation. PyFlow, a Python-based visual scripting tool, enables the design of complex algorithms and data processing workflows using nodes that represent mathematical functions (Fig 9). By combining these nodes, a workflow that seamlessly connects the Unity game with the controller using LSL is formed.

The PyFlow module consists of two main components as follows: the PyMonitor Visualizer and the Controller, which work in combination to monitor input from LSL streams and adapt Unity Game variables in real time.

PyMonitor visualizer

Developed with Python’s PyQt5 and PyQtGraph, PyMonitor provides real-time visualization of LSL data (Fig. 3), a key aspect for tracking the player’s physiological responses. To ensure smooth signal rendering, it uses a one-second data buffer and memoization to avoid redundant computations. In addition, the following features were implemented in PyMonitor: saving and loading User-Interface (UI) settings, filtering the graph source, selecting the channel to display, adding or deleting graphs in the UI display, and personalizing the design parameters of the graphs.

FIG. 3.

PyMonitor Visualizer.
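
As an illustration of the buffering and memoization described above, the following minimal Python sketch keeps the most recent second of samples and caches a summary computed on an unchanged window. It is not PyMonitor's actual code: the class name `OneSecondBuffer`, the function `window_mean`, and the 4 Hz sampling rate are hypothetical and chosen only to keep the example short.

```python
from collections import deque
from functools import lru_cache


class OneSecondBuffer:
    """Keep only the most recent second of samples for one channel (hypothetical helper)."""

    def __init__(self, sampling_rate_hz: int):
        # A deque with maxlen silently drops the oldest sample once it is full.
        self.samples = deque(maxlen=sampling_rate_hz)

    def push(self, value: float) -> None:
        self.samples.append(value)

    def snapshot(self) -> tuple:
        # Tuples are hashable, so a snapshot can serve as a cache key.
        return tuple(self.samples)


@lru_cache(maxsize=32)
def window_mean(window: tuple) -> float:
    """Memoized summary: a repeated call on an identical window is served from the cache."""
    return sum(window) / len(window) if window else 0.0


# Assumed 4 Hz stream, so one second corresponds to four samples.
buffer = OneSecondBuffer(sampling_rate_hz=4)
for sample in [60.0, 62.0, 64.0, 66.0, 68.0]:
    buffer.push(sample)

latest = buffer.snapshot()     # the oldest sample (60.0) has been dropped
first = window_mean(latest)    # computed
second = window_mean(latest)   # same window, returned from the cache
```

A bounded deque keeps memory use constant over long sessions, which is the property the Long Duration Test later probes.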

Controller

The controller defines adaptation rules for real-time update of Unity variables. It receives, processes, and sends data via LSL to dynamically adjust gameplay. One key node within the Controller is the PID node, which adjusts game difficulty based on the player’s performance, using the proportional (Kp), integral (Ki), and derivative (Kd) constants to match a performance-based target. While adaptable to various game variables, it requires tuning to determine the optimal constant values.
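
The behavior of a PID node can be sketched with a textbook discrete PID controller. This is a minimal illustration under stated assumptions, not the FOF node itself: the class `PIDController`, the `clamp` helper, the `SPEED_GAIN` scaling, the speed limits, and the sample performance values are all hypothetical. The constants are the ones later reported for the Explicit condition.

```python
class PIDController:
    """Textbook discrete PID controller (illustrative only)."""

    def __init__(self, kp: float, ki: float, kd: float, setpoint: float):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.setpoint = setpoint
        self.integral = 0.0
        self.previous_error = None

    def update(self, measurement: float, dt: float = 1.0) -> float:
        error = self.setpoint - measurement
        self.integral += error * dt
        # No derivative term on the first sample, since there is no previous error yet.
        if self.previous_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.previous_error) / dt
        self.previous_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def clamp(value: float, low: float, high: float) -> float:
    return max(low, min(high, value))


SPEED_GAIN = 0.01  # hypothetical scaling from controller output to ball speed

pid = PIDController(kp=0.10, ki=0.05, kd=0.05, setpoint=50.0)
ball_speed = 1.0   # hypothetical starting speed
outputs = []
for performance in [80.0, 70.0, 60.0]:   # made-up performance percentages
    output = pid.update(performance)
    outputs.append(output)
    # Performance above the target gives a negative output, so the ball speeds up.
    ball_speed = clamp(ball_speed - SPEED_GAIN * output, 0.5, 3.0)
```

Clamping the adapted variable is a common safeguard so that a poorly tuned controller cannot push the game outside playable limits.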

LSL module

The LSL Module is responsible for real-time acquisition and transmission of physiological signals from external sensing devices to the framework. It is designed to interface with any physiological sensor that supports the LSL protocol, ensuring that the framework remains hardware-agnostic and adaptable to a wide range of biofeedback systems. This modular design allows developers to integrate diverse hardware configurations without modifying the core architecture.

In this study, the LSL Module was implemented using the emteqPRO face mask,²⁶ which employs the Emteq LSL Client for real-time data transmission and customizable signal streaming through the LSL interface. The Emteq LSL Client allows the selection of which signals to be transmitted via LSL, including:

PPG data—Raw PPG signal and proximity data from the mask’s central sensor.

HR data—Current, average, and deviation values of the user’s HR, obtained from the PPG signal and computed using the Emteq VR Manager SDK.

EMG data—Facial muscle activity from dry electrodes located at the left and right zygomaticus, orbicularis, and frontalis muscles and the corrugator muscle.

By integrating these signals, the Emteq LSL Client (Fig. 4) enables precise physiological monitoring, supporting adaptive game mechanics based on real-time emotional and physiological responses. The framework is hardware-agnostic at the software level, meaning it can integrate data from any compatible physiological sensor without modifying the core architecture; only a device-specific acquisition client that supports LSL streaming is required. Although the present implementation utilizes the emteqPRO device, the modular design of the LSL Module enables the development of similar acquisition clients for other sensing systems that support LSL streaming.

FIG. 4.

Emteq LSL Client developed to show EMG and Heart Rate data in real time. The variables to be transmitted are also customizable. LSL, Lab Streaming Layer; EMG, electromyography.

Technical Validation

Understanding software limitations is crucial to ensure data integrity and project reliability. To prevent issues like crashes or data loss, this study tested the FOF modules for stability and performance under varying conditions. The following tests validated real-time data handling:

Performance and Real-time Data Test: Assess the system’s ability to handle high data loads, ensuring smooth processing, transmission, and visualization without delays, stuttering, or crashes.

Boundary Test: Check the system’s ability to handle extreme and edge cases, for example, negative values.

Long Duration Test: Identify memory leaks or performance drops over extended periods while transmitting data between Unity and PyFlow.

PID controller tuning: As a key PyFlow node, the PID controller was tested with various constant values (Kp, Ki, Kd) to adjust difficulty based on user emotion.

For these tests, to evaluate performance, the following metrics were established:

Success of Execution time—Measure performance across extended execution periods (10 seconds to 20 minutes).

Synchronization—Evaluate data synchronization time between Unity and PyMonitor.

Consistency—Measure graphical stuttering in the PyMonitor Visualizer, categorized as follows:

0 = Very Frequent: graphical stuttering happens frequently.

1 = Common: graphical stuttering occurs occasionally.

2 = Rarely/None: graphical stuttering is rare or absent.

Data Delay—Measured latency in receiving Unity data.

PID Controller Optimization—Multiple trials were conducted to minimize the average absolute error by comparing the player’s performance, target set point, and the PID controller output under different experimental conditions.

Three PCs were used to run key software components, including a Unity-based testing program, a Stream Generator to create an LSL stream, PyFlow, and the OpenSignals software. Each PC had specific hardware configurations:

PC 1—Windows 10, Intel i7-6700 (3.40 GHz) CPU, 16 GB of RAM, and Integrated GPU.

PC 2—Windows 10, AMD Ryzen 5 3550H (2.10 GHz) CPU, 8 GB of RAM, and AMD Radeon Vega 8 graphics card.

PC 3—Windows 11, AMD Ryzen 5 3600 CPU, 16 GB of RAM, and GTX 1070 Ti graphics card.

This configuration allowed for efficient system operation, maximizing performance and maintaining data flow.

Usability Study

After ensuring the system’s technical reliability, the focus shifted to understanding its effectiveness in engaging users. A usability study was conducted to assess FOF’s ability to create engaging experiences. Participants experienced three experimental conditions, each using distinct stimuli to adapt game difficulty, with the aim of evaluating the impact on players. In this section, we describe the materials and methods used to perform the usability study.

Sample

A total of 26 university students from the University of Madeira were recruited as volunteer participants. The sample included 14 females and 12 males aged 18–39 years. Due to equipment malfunction, one male participant was removed, resulting in a final sample of 25 participants. Informed consent was obtained before data collection.

RobotMania game

The RobotMania game was developed in Unity (2022.1.9f1) to test the FOF’s capabilities and usability of adaptive gameplay. In this game, players destroy balls that spawn out of the robot’s hands, which alternate spawning with a 50% chance every second. The game lasts 5 minutes. Although two versions exist (Desktop and Head-mounted Display (HMD)), only the HMD version was used in this study (Fig. 8).

The game was designed to be engaging, challenging, and adaptive. To assess the game’s challenge, the player’s performance was measured as the percentage of balls destroyed out of the last 10 spawned. In addition, game difficulty was adjusted by modifying the following variables: Field of View (FOV) and Ball Speed. The FOV determines how much of the screen is visible during gameplay, with a higher FOV indicating lower difficulty and vice versa. The Ball Speed corresponds to how fast each spawned ball moves away from the robot’s hands.
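
The performance measure described above, the percentage of balls destroyed out of the last 10 spawned, can be sketched as a rolling window. The class `PerformanceTracker` and the sequence of outcomes are hypothetical and only illustrate the calculation; the game itself is implemented in Unity.

```python
from collections import deque


class PerformanceTracker:
    """Rolling percentage of destroyed balls over the most recent spawns (illustrative)."""

    def __init__(self, window_size: int = 10):
        # Only the last `window_size` outcomes are kept.
        self.outcomes = deque(maxlen=window_size)

    def record(self, destroyed: bool) -> None:
        self.outcomes.append(destroyed)

    def percentage(self) -> float:
        if not self.outcomes:
            return 0.0
        return 100.0 * sum(self.outcomes) / len(self.outcomes)


tracker = PerformanceTracker()
# Made-up sequence: six hits followed by six misses (12 spawns in total).
for destroyed in [True] * 6 + [False] * 6:
    tracker.record(destroyed)

# Only the last 10 spawns count: 4 hits and 6 misses.
performance = tracker.percentage()
```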

Experimental conditions

To validate the usability of the DDA framework for optimizing flow in serious games, three experimental conditions were designed. Each condition used different input and target parameters in PyFlow to dynamically adjust the difficulty of the RobotMania game.

In the Implicit condition, the participant’s HR, acquired from the emteqPRO PPG sensor, was used as the input parameter to adjust the difficulty. The game’s difficulty was adjusted automatically using the PID controller, with an HR target 20% higher than the baseline HR, as the participants needed to pedal on the ergometer bicycle. Hence, the FOV of the player decreased if the participant’s HR was far from that target (higher difficulty) and increased when it was closer to the target (lower difficulty).

For the Explicit condition, the player’s performance (percentage of balls successfully captured) was used to adjust the difficulty of the game. In this case, the setpoint for the PID controller was a performance of 50%. The ball’s speed was decreased or increased every second to drive the player’s performance toward that target.

Regarding the Subjective condition, the game’s difficulty was adjusted according to self-reported answers about how stressed the participants were. Every 30 seconds, a 5-point Likert scale was presented in the game so that the user could report their perceived stress level, with 1 being completely relaxed and 5 being stressed. The target used for the PID controller was the middle point of this stress scale, meaning that the user was neither relaxed nor stressed. The game’s difficulty was adapted by gradually changing the ball’s speed every second during the 30-second interval between prompts.

These three conditions (Table 1) enabled a comparative analysis of different approaches to adaptive difficulty, providing a comprehensive evaluation of the DDA framework’s effectiveness in serious games.

Table 1.

Description of the Experimental Conditions

Condition	Implicit	Explicit	Subjective
Performance	Based on participant’s heart rate (HR)	Based on player’s performance	Based on self-reported stress level
Set point	20% higher than baseline HR	50% of Balls caught	Middle point of the Stress scale
Output	Field of view	Ball speed	Ball speed
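
The set point of the Implicit condition in Table 1 can be sketched as follows. For brevity, this sketch replaces the PID controller used in the study with a direct linear mapping from the distance to the target HR onto the FOV, so it illustrates the direction of the adaptation rather than the actual control law. The FOV limits, the tolerance `HR_TOLERANCE_BPM`, and the example heart rates are hypothetical.

```python
MIN_FOV_DEG = 60.0        # hypothetical narrowest FOV (hardest)
MAX_FOV_DEG = 110.0       # hypothetical widest FOV (easiest)
HR_TOLERANCE_BPM = 20.0   # hypothetical distance at which the FOV is narrowest


def hr_setpoint(baseline_hr_bpm: float, increase: float = 0.20) -> float:
    """Target HR: 20% above the resting baseline, as in Table 1."""
    return baseline_hr_bpm * (1.0 + increase)


def fov_for_heart_rate(current_hr_bpm: float, baseline_hr_bpm: float) -> float:
    """Simplified stand-in for the PID: the FOV narrows as HR moves away from the target."""
    distance = abs(hr_setpoint(baseline_hr_bpm) - current_hr_bpm)
    fraction = min(distance / HR_TOLERANCE_BPM, 1.0)
    return MAX_FOV_DEG - (MAX_FOV_DEG - MIN_FOV_DEG) * fraction


target = hr_setpoint(70.0)                     # baseline of 70 bpm (made-up value)
fov_on_target = fov_for_heart_rate(84.0, 70.0)
fov_at_rest = fov_for_heart_rate(70.0, 70.0)
```

With these assumed values, a participant pedaling at the target HR sees the widest FOV, whereas a participant still at resting HR sees a narrower one.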

Self-report instruments

To assess player experience and system usability, multiple evaluation tools were used. The Game Experience Questionnaire (GEQ) measured in-game and postgame experiences, with a minor adjustment excluding one challenge-related item.²⁷ A DDA Questionnaire captured players’ perceptions of DDA’s naturalness, consistency, speed, and engagement, whereas a Condition Comparison questionnaire evaluated differences in enjoyment, challenge, and suitability across game versions. Usability was assessed using the System Usability Scale (SUS),²⁸ a 10-item questionnaire on a 5-point Likert scale. In addition, the Sense of Presence Inventory (SOPI)²⁹ and Presence Questionnaire (PQ)³⁰ measured the sense of immersion in the virtual environment, with the latter using a 7-point Likert scale but excluding sound and haptic-related subscales.

Experimental setup

Hardware

The hardware used for this study comprises the following elements: two Desktops, an HTC Vive Pro EYE (HTC Corporation, 2019, Taiwan) HMD, a Polar H10 chest band (Polar Electro, Finland), an emteqPRO (Emteq Labs, UK), and an ergometer exercise bicycle.

The HTC Vive Pro EYE is a high-end HMD with eye-tracking technology that enables precise monitoring of the users’ eye movements. It has 1440 × 1600 pixels per eye resolution with a refresh rate of 90 Hz and a 110° FOV. The HMD was connected to a desktop with Windows 10, an AMD Ryzen 7 7700 3.80 GHz processor, 32 GB of RAM, and an NVIDIA Quadro P6000 graphics card. An additional desktop with Windows 10, an Intel i7-6700 3.40 GHz processor with 16 GB of RAM, was used to receive the streaming data from Unity and control the adaptation rules.

The Polar H10 HR chest band was used to measure the participants’ HR during the baseline recording.

The emteqPRO is a wearable sensor system compatible with the HTC Vive Pro EYE HMD. It was designed to measure and analyze physiological signals in real time, providing advanced insights regarding the user’s emotional and cognitive state. It has facial EMG sensors and a PPG sensor placed on the forehead of the participants, which was used to extract the participants’ HR in real-time during the experiment. Finally, an ergometer exercise bicycle was used to induce variations in the user’s HR during one of the experimental conditions (see Table 1).

Software

In terms of software, we used the RobotMania game to test the DDA framework by presenting this game to the users in the HMD. To access the real-time data of the emteqPRO regarding the physiological signals, the EmteqVR SDK^d for Unity was combined with a custom-built LSL streamer application to calculate and send the HR value extracted from the PPG signal through LSL. All the software mentioned above ran on the desktop to which the HMD was connected.

Finally, PyFlow was used to create a node-based application to adapt the RobotMania game in real time. This application received data from the RobotMania game and the Emteq LSL streamer, allowing the creation of customizable nodes with specific functions within a workflow to change the Unity game in real time.

Experimental procedure

This experiment followed a repeated-measures, single-session design in which all participants experienced the experimental conditions described in the Experimental conditions section. Each session began with assigning a unique ID, followed by a 1-minute rest to record baseline HR using the Polar H10 chest band. Eye calibration with the HTC Vive Pro EYE was then performed. After sensor setup and calibration, participants completed a background questionnaire on prior videogame experience.

The order of experimental conditions was randomized to reduce bias in difficulty perception. After each condition, participants completed the GEQ core, GEQ post, and DDA questionnaires to assess their experience. At the end of the session, they completed the PQ, SUS, SOPI, and a condition comparison questionnaire to evaluate the overall usability and experience with the FOF system.

Results

Technical validation

This experiment aimed to identify the limits, weaknesses, and strengths of PyFlow and PyMonitor to determine their overall utility. For each evaluation criterion, statistical metrics, including averages and standard deviations, were calculated to interpret the results.

The Success of Execution time showed that all scenarios were completed without crashing. The synchronization time had an average of 1.4 seconds and a standard deviation of 0.03 seconds, and Figure 5 shows that it remained constant across the tests. The Delay/Latency had an average of 0.11 seconds and a standard deviation of 0.04 seconds, suggesting a relatively stable performance in data reception. However, the longer execution period (Fig. 6) showed a larger dispersion across the tests. In terms of Consistency, no stuttering occurred in the graphical visualization in 80% of the tests, 14% of the tests showed occasional stuttering, and 6% showed frequent stuttering (Fig. 7).

FIG. 5.

Results from the Time to Sync test for the different execution periods.

FIG. 6.

Results from the Delay time test for the different execution periods.

FIG. 7.

Results from the Consistency test.

To fine-tune the PID controller, three experiments were conducted for each experimental condition, testing various values of constants (Kp, Ki, and Kd). For the Explicit condition, the optimal constants selected were Kp = 0.10, Ki = 0.05, and Kd = 0.05. These values produced the most consistent results in achieving a 50% ball failure rate, demonstrating stable control of behavior and accuracy in achieving the desired set point. Regarding the Subjective condition, the best values found were Kp = 0.01, Ki = 0, and Kd = 0.003. Finally, for the Implicit condition, the optimal values were Kp = 0.001, Ki = 0, and Kd = 0.015. For both Subjective and Implicit conditions, the Ki constant was set to zero (Ki = 0) because higher values resulted in instability. The detailed results of the PID controller are described in Supplementary Data.

Through iterative testing, these optimized PID values were determined for each condition, ensuring precise and adaptive difficulty adjustment based on the respective stimulus.
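
The selection criterion used during tuning, the average absolute error between the measured signal and the set point, can be sketched as below. The candidate constants and the traces are invented for illustration and are not results from this study; the function `mean_absolute_error` and the dictionary `trials` are hypothetical names.

```python
def mean_absolute_error(measurements, setpoint):
    """Average absolute distance between each measurement and the set point."""
    return sum(abs(setpoint - value) for value in measurements) / len(measurements)


SETPOINT = 50.0  # Explicit condition target: 50% performance

# Invented traces for two hypothetical (Kp, Ki, Kd) candidates; not study data.
trials = {
    (0.10, 0.05, 0.05): [55.0, 48.0, 52.0, 50.0],
    (0.50, 0.05, 0.05): [70.0, 65.0, 40.0, 45.0],
}

errors = {constants: mean_absolute_error(trace, SETPOINT)
          for constants, trace in trials.items()}

# The candidate with the smallest average absolute error is retained.
best_constants = min(errors, key=errors.get)
```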

Usability study

System Usability Scale

The total score of the SUS ranges from 0 to 100, with higher scores indicating better system usability. Based on previous studies, a SUS score above 68 is considered above average, whereas a score below 68 indicates usability issues. Our system obtained an average score of 84.64 ± 10.85, indicating high perceived usability.
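
For reference, the standard SUS scoring procedure can be sketched as follows: odd-numbered items contribute the response minus 1, even-numbered items contribute 5 minus the response, and the sum is multiplied by 2.5. The function name `sus_score` and the example responses are hypothetical and are not participant data.

```python
def sus_score(responses):
    """Standard SUS scoring for ten responses on a 1-5 scale."""
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 responses")
    total = 0
    for item_number, answer in enumerate(responses, start=1):
        if not 1 <= answer <= 5:
            raise ValueError("responses must be between 1 and 5")
        if item_number % 2 == 1:
            total += answer - 1      # odd items are positively worded
        else:
            total += 5 - answer      # even items are negatively worded
    return total * 2.5


best_case = sus_score([5, 1, 5, 1, 5, 1, 5, 1, 5, 1])
neutral_case = sus_score([3] * 10)
```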

Sense Of Presence Inventory

The SOPI results are depicted in the box plot of Figure 8. The subscale with the highest score was Engagement (median [Mdn] = 3.17, range = 2.83 (1.33–4.17)), followed by Spatial Presence (Mdn = 2.61, range = 3.11 (1.06–4.17)) and Ecological Validity (Mdn = 2.33, range = 2.67 (1.17–3.83)). Finally, the Negative Effects subscale had the lowest score (Mdn = 1.50, range = 2.17 (1.00–3.17)).

FIG. 8.

Results of the Sense of Presence Inventory for each subscale.

Presence Questionnaire

Regarding the Presence Questionnaire, the results are shown in the box plot of Figure 9. In this experiment, participants reported their sense of presence inside the RobotMania game, with high values being reported for the scales Self-Evaluation of Performance (Mdn = 6.00, range = 5.50 (1.50–7.00)), Possibility–Act (Mdn = 5.50, range = 2.75 (3.50–6.25)), Realism (Mdn = 5.14, range = 3.14 (3.43–6.57)), and Possibility–Examine (Mdn = 4.67, range = 4.67 (2.00–6.67)). The lowest reported subscale was the Quality of Interface, with a median of 4.33 and a range of 4.33 (1.67–6.00).

FIG. 9.

Results of the Presence Questionnaire for each subscale.

GEQ core and post

The GEQ Core and Post were used to assess the effect of the different experimental conditions (see the Experimental conditions section) on the players’ game experience. Since the data from the questionnaires were ordinal, nonparametric statistical tests were performed for a repeated measures design. The Friedman Test was conducted to evaluate the difference in each GEQ Core and Post component between the three experimental conditions (Implicit, Explicit, and Subjective). When the Friedman Test revealed significant results between the three groups, post hoc pairwise comparisons were performed using the Wilcoxon signed-rank test, corrected with Bonferroni correction for the number of tests performed on the same variable (α_adj = 0.017).
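
The adjusted threshold follows from dividing the significance level by the number of pairwise comparisons. The sketch below assumes a family-wise level of 0.05, which is consistent with the reported α_adj but is not stated explicitly in the text; the function name `is_significant` is hypothetical.

```python
from itertools import combinations

FAMILY_ALPHA = 0.05  # assumed family-wise significance level

conditions = ["Implicit", "Explicit", "Subjective"]

# Three conditions give three pairwise comparisons.
pairs = list(combinations(conditions, 2))

# Bonferroni correction: divide alpha by the number of tests on the same variable.
alpha_adj = FAMILY_ALPHA / len(pairs)


def is_significant(p_value: float) -> bool:
    """Compare a pairwise p-value against the Bonferroni-adjusted threshold."""
    return p_value < alpha_adj
```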

For the GEQ-Core Questionnaire, we found significant differences between the three experimental conditions for the Tension factor (F_r³ = 6.19, P < 0.05) and the Challenge factor (F_r³ = 15.37, P < 0.001). Despite these significant results, the pairwise comparisons for Tension did not reveal any significant difference between the three experimental conditions. However, for the Challenge factor, participants reported a significantly higher challenge in the Explicit (Mdn = 2.00) condition compared with the Implicit (Mdn = 1.38) condition (P < 0.017, r = 0.59), and a significantly higher challenge in the Subjective (Mdn = 2.00) condition compared with the Implicit (Mdn = 1.38) condition (P < 0.001, r = 0.76). All these results are shown in the box plots of Figure 10 and Figure 11.

FIG. 10.

Results for the Tension factor of the GEQ Core questionnaire. GEQ, Game Experience Questionnaire.

FIG. 11.

Results for the Challenge factor of the GEQ Core questionnaire. *P value < 0.017, **P value < 0.001.

For the GEQ-Post Questionnaire, no significant difference was found between the three experimental conditions for any of the factors evaluated (Positive Experience, Negative Experience, Tiredness, and Returning to Reality). Nonetheless, participants reported the Implicit condition as providing a higher Positive Experience and a higher level of Returning to Reality. The Explicit condition was the one in which participants reported a more Negative Experience. In terms of Tiredness, the Explicit and Implicit conditions were reported to have the same level of tiredness after the experiment.

DDA questionnaire

With this questionnaire, our goal was to evaluate how well the DDA was implemented in our system. Overall, for all experimental conditions, most participants reported that the challenge created matched their skills and offered an engaging and organic difficulty adjustment during gameplay. The Implicit condition was considered the most suitable in terms of challenge and difficulty adjustment, as it provided the most natural difficulty adjustment. In contrast, some participants reported that the Explicit condition provided a higher challenge because the difficulty adjustment was performed too quickly. The Subjective condition was considered the most engaging condition.

Condition comparison

After the experiment, this questionnaire was presented to the participants to assess which experimental condition they preferred. Participants reported the Implicit condition as the most enjoyable, the best fit to their skills, and the version they would recommend over the others. The Explicit version was rated the most challenging, the hardest, and the least suited to the player’s skills. Finally, the Subjective version was considered the easiest to play.

Player’s score

To assess the impact of experimental conditions on player performance, we computed each player’s score based on the rate of destroyed balls. We then conducted nonparametric statistical tests to evaluate performance differences. The Shapiro–Wilk test indicated that the data did not follow a normal distribution; therefore, we applied the Friedman test, followed by post hoc Wilcoxon signed-rank tests with Bonferroni correction (α_adj = 0.017). The results (Fig. 12) showed significant differences in the players’ scores between the three experimental conditions (F_r³ = 42.32, P < 0.001). Pairwise comparisons revealed that in the Implicit condition participants had the highest score (Mdn = 99.70) compared with the Subjective (Mdn = 91.36) condition (P < 0.001, r = 0.82) and the Explicit (Mdn = 79.07) condition (P < 0.001, r = 0.87). The Subjective condition also revealed significantly higher scores compared with the Explicit condition (P < 0.001, r = 0.78).

FIG. 12.

Results for the Players’ score in each experimental condition.
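The score analysis described above (a Friedman test followed by Bonferroni-corrected Wilcoxon signed-rank comparisons) can be illustrated with a minimal, standard-library sketch. The scores below are hypothetical values for 10 imaginary participants, not the study data. The sketch assumes there are no tied or zero differences, omits the Shapiro–Wilk normality check, and computes the effect size as r = |Z|/√N from the normal approximation, which is an assumption about how r was obtained. In practice a statistics library would normally be used, for example `scipy.stats.shapiro`, `scipy.stats.friedmanchisquare`, and `scipy.stats.wilcoxon`.

```python
import math
from itertools import combinations

ALPHA = 0.05  # family-wise significance level


def friedman_test(*groups):
    """Friedman chi-square for 3 related samples (assumes no ties per participant)."""
    k, n = len(groups), len(groups[0])
    if k != 3:
        raise ValueError("this sketch only covers k = 3 conditions (df = 2)")
    rank_sums = [0.0] * k
    for row in zip(*groups):  # one row per participant
        order = sorted(range(k), key=lambda j: row[j])
        for rank, j in enumerate(order, start=1):
            rank_sums[j] += rank
    stat = 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)
    # For df = 2 the chi-square survival function has the closed form exp(-x / 2).
    return stat, math.exp(-stat / 2.0)


def wilcoxon_signed_rank(x, y):
    """Exact two-sided Wilcoxon signed-rank test (assumes no zero or tied differences)."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    w_plus = sum(rank for rank, i in enumerate(order, start=1) if diffs[i] > 0)
    total = n * (n + 1) // 2
    w = min(w_plus, total - w_plus)
    # counts[s] = number of sign assignments whose positive-rank sum equals s
    counts = [1] + [0] * total
    for rank in range(1, n + 1):
        for s in range(total, rank - 1, -1):
            counts[s] += counts[s - rank]
    p = min(1.0, 2.0 * sum(counts[: w + 1]) / 2 ** n)
    # Effect size r = |Z| / sqrt(N), with Z taken from the normal approximation.
    z = (w - total / 2.0) / math.sqrt(n * (n + 1) * (2 * n + 1) / 24.0)
    return w, p, abs(z) / math.sqrt(n)


def compare_conditions(scores):
    """Omnibus Friedman test plus Bonferroni-corrected pairwise Wilcoxon tests."""
    names = list(scores)
    stat, p = friedman_test(*(scores[c] for c in names))
    pairs = list(combinations(names, 2))
    alpha_adj = ALPHA / len(pairs)  # 0.05 / 3 comparisons, about 0.017
    posthoc = {}
    for a, b in pairs:
        w, p_pair, r = wilcoxon_signed_rank(scores[a], scores[b])
        posthoc[(a, b)] = {"W": w, "p": p_pair, "r": r,
                           "significant": p_pair < alpha_adj}
    return {"friedman": stat, "friedman_p": p,
            "alpha_adj": alpha_adj, "posthoc": posthoc}


# Hypothetical per-participant scores (percentage of destroyed balls).
scores = {
    "Implicit":   [99.1, 98.4, 99.7, 97.9, 99.9, 98.8, 99.3, 97.2, 98.0, 99.5],
    "Subjective": [91.0, 92.6, 90.1, 93.0, 89.7, 91.8, 90.6, 92.9, 88.5, 91.2],
    "Explicit":   [79.2, 80.5, 77.8, 81.1, 78.3, 76.5, 79.9, 80.0, 75.4, 78.8],
}
result = compare_conditions(scores)
```

With three conditions there are three pairwise comparisons, so the corrected threshold is 0.05/3 ≈ 0.017, matching the α_adj reported above.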

Discussion

This study introduced the FOF framework, a game-agnostic DDA system designed to enhance flow in serious games. Technical validation confirmed the framework’s reliability and its ability to process real-time physiological and game-related data efficiently across various scenarios. The validation demonstrated seamless operation, without crashes across all test scenarios. The average synchronization time between Unity and PyMonitor was 1.4 seconds, with slight increases during longer execution periods, indicating optimization opportunities for extended gameplay sessions.

A key component of the technical validation was the fine-tuning of the PID controller for adaptive difficulty adjustment. Through iterative testing, optimal PID constants were identified, ensuring that the system could effectively adjust the game’s difficulty based on the player’s physiological signals. The Explicit condition, with Kp = 0.10, Ki = 0.05, and Kd = 0.05, achieved stable results with a consistent 50% ball failure rate. In the Subjective and Implicit conditions, lower values for Kp and Kd, along with setting Ki to zero, resulted in more stable performance. This fine-tuning was essential for maintaining a balanced level of challenge, contributing to the system’s ability to maintain a state of flow.
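To make the role of the three constants concrete, the following is a minimal sketch of a discrete PID loop that uses the Explicit-condition gains reported above (Kp = 0.10, Ki = 0.05, Kd = 0.05) and a 50% ball failure rate as the setpoint. It is not the FOF implementation: the `PIDController` class, the normalized difficulty variable in [0, 1], and the toy player model in which the failure rate simply equals the difficulty are all hypothetical, chosen only to show how the loop settles on the target.

```python
from dataclasses import dataclass


@dataclass
class PIDController:
    """Textbook discrete PID controller (hypothetical helper, not the FOF API)."""
    kp: float
    ki: float
    kd: float
    setpoint: float
    integral: float = 0.0
    prev_error: float = 0.0

    def update(self, measurement: float, dt: float = 1.0) -> float:
        error = self.setpoint - measurement
        self.integral += error * dt                     # accumulated error (I term)
        derivative = (error - self.prev_error) / dt     # error trend (D term)
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def toy_failure_rate(difficulty: float) -> float:
    """Assumed player model: the share of missed balls equals the difficulty."""
    return difficulty


def simulate(steps: int = 300, start_difficulty: float = 0.1) -> list:
    """Run the loop and return the failure rate observed at each step."""
    pid = PIDController(kp=0.10, ki=0.05, kd=0.05, setpoint=0.5)
    difficulty = start_difficulty
    history = []
    for _ in range(steps):
        failure_rate = toy_failure_rate(difficulty)
        history.append(failure_rate)
        # The controller output is used directly as the next difficulty,
        # clamped to the valid range.
        difficulty = min(1.0, max(0.0, pid.update(failure_rate)))
    return history


history = simulate()
```

In this sketch the integral term is what drives the steady-state error to zero. With Ki set to zero, as reported for the Subjective and Implicit conditions, a positional loop like this one would not settle exactly on the setpoint, so those conditions presumably apply the controller output differently.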

Following the technical validation, usability testing further assessed the system’s impact on user engagement. The results revealed a high perceived usability score (84.64 ± 10.85), with players reporting a positive experience, feeling present and engaged in the game. However, interface quality was rated poorly, highlighting the need for UI refinement to improve overall usability.

The adaptive difficulty tests with the RobotMania game showed that the Implicit condition was the most suitable for balancing challenge and enjoyment. Participants achieved the highest performance scores in this condition and reported that it aligned best with their skills and provided a more personalized experience. This condition also showed the lowest levels of Tension and Challenge in the GEQ Core assessment. These findings underline the potential of using physiological feedback for personalized game difficulty adjustments to enhance player engagement and induce a state of flow.

Limitations and Future Work

While the FOF framework demonstrated strong performance during technical validation, several limitations should be noted. First, the system was tested exclusively with the custom-developed game RobotMania, which limits how far the results generalize to other game types. Future research should extend validation to a wider range of serious games to assess adaptability across genres and gameplay contexts.

Second, although the framework is designed to be hardware-agnostic at the software level through the LSL protocol, the current implementation relies on the emteqPRO device for physiological data acquisition. Integrating alternative hardware setups could help confirm the framework’s flexibility and ensure its applicability in different sensing environments. Nonetheless, any new acquisition client must support LSL streaming to be compatible with the framework.

The system also showed increases in latency and synchronization time during prolonged gameplay sessions, indicating inefficiencies that should be optimized for extended use. In addition, tuning the PID controller based on individual physiological data may further enhance adaptive performance and maintain flow states.

Finally, future studies should explore additional gameplay factors such as time pressure and multitasking demands, which could influence perceived challenge and engagement. Expanding the range of physiological signals and refining the user interface design would further contribute to a more immersive and responsive adaptive gaming experience.

Conclusion

In summary, the FOF framework offers a promising solution to the challenge of DDA, utilizing biofeedback to adapt gameplay difficulty based on real-time physiological data, foster player flow, and enhance player engagement and experience in serious games. Its modular and hardware-agnostic design enables integration with diverse sensing systems, and validation through RobotMania demonstrates the feasibility of emotion-driven adaptation. Continued refinement, testing across diverse games, and improvements in user interface design will strengthen the framework’s applicability and reliability across varied gaming contexts.

Authors’ Contributions

R.L.: Conceptualization, methodology, supervision, writing—original draft, and writing—reviewing and editing. D.B.: Conceptualization, methodology, supervision, writing—original draft, and writing—reviewing and editing. P.L.: Methodology, formal analysis, and investigation. S.B.i.B.: Methodology, project administration, supervision, and writing—reviewing and editing.

Footnotes

Author Disclosure Statement

The authors have no competing interests to declare that are relevant to the content of this article.

Funding Information

This work was supported by Fundação para a Ciência e Tecnologia (FCT) under the PhD grants 2020.06024.BD and 2021.05646.BD and by NOVA Laboratory for Computer Science and Informatics (UID/04516/NOVA) with the financial support of FCT.IP. This work was also supported by the MACbioIDi2 (INTERREG program MAC2/1.1b/352) and SIH (n°.03/C16-i03/2022) projects.

Supplemental Material

References

1. Lobo, Lima, Branco, et al. Flow optimizer: A dynamic difficulty adjustment framework for serious games in neurorehabilitation. In: 2024 IEEE 12th International Conference on Serious Games and Applications for Health (SeGAH). 2024. pp. 1–8.

2. Cyrino, Tannús, Lamounier, et al. HarpyGame: A Customizable Serious Game for Upper Limb Rehabilitation after Stroke. In: Anais Estendidos do XXII Simpósio de Realidade Virtual e Aumentada. 2020. pp. 67–8.

3. Hunicke. The case for dynamic difficulty adjustment in games. In: Proceedings of the 2005 ACM SIGCHI International Conference on Advances in Computer Entertainment Technology. New York, NY, USA: Association for Computing Machinery; 2005. pp. 429–433. (ACE ’05).

4. Brom, Buchtová, Šisler, et al. Flow, social interaction anxiety and salivary cortisol responses in serious games: A quasi-experimental study. Comput & Educ, 2014; 79:69–100.

5. Seyderhelm AJA, Blackmore. Systematic review of dynamic difficulty adaption for serious games: The importance of diverse approaches. SSRN Electron J, 2021.

6. Jawinski, Kirsten, Sander, et al. Human brain arousal in the resting state: A genome-wide association study. Mol Psychiatry, 2019; 24(11):1599–1609.

7. Andrade K de, Pasqual, Caurin GAP, et al. Dynamic difficulty adjustment with Evolutionary Algorithm in games for rehabilitation robotics. In: 2016 IEEE International Conference on Serious Games and Applications for Health (SeGAH). Orlando, FL, USA: IEEE; 2016. pp. 1–8.

8. Pezzera, Borghese. Dynamic difficulty adjustment in exer-games for rehabilitation: A mixed approach. In: 2020 IEEE 8th International Conference on Serious Games and Applications for Health (SeGAH). Vancouver, BC, Canada: IEEE; 2020. pp. 1–7.

9. Liu, Agrawal, Sarkar, Chen. Dynamic difficulty adjustment in computer games through real-time anxiety-based affective feedback. Int J Human–Computer Interact, 2009; 25(6):506–529.

10. Navarro, Sundstedt, Garro. Biofeedback methods in entertainment video games: A review of physiological interaction techniques. Proc ACM Hum-Comput Interact, 2021; 5(CHI PLAY):1–32.

11. Zohaib. Dynamic Difficulty Adjustment (DDA) in computer games: A review. Adv Human-Computer Interact, 2018; 2018:1–12.

12. Sepulveda, Besoain, Barriga. Exploring dynamic difficulty adjustment in videogames. In: 2019 IEEE CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies (CHILECON). 2019. pp. 1–6.

13. Hagelback, Johansson. Measuring player experience on runtime dynamic difficulty scaling in an RTS game. In: 2009 IEEE Symposium on Computational Intelligence and Games. 2009. pp. 46–52.

14. Küntzer, Scherer, Mentler, et al. Dynamic difficulty adjustment in virtual reality exergaming to regulate exertion levels via heart rate monitoring. 2024. 1–2.

15. Houzangbe, Christmann, Gorisse, et al. Effects of voluntary heart rate control on user engagement and agency in a virtual reality game. Virtual Real, 2020; 24(4):665–681.

16. Zelada, Gutierrez. Dynamic difficulty adjustment of video games using biofeedback. In: International Conference on Ubiquitous Computing and Ambient Intelligence. 2022. pp. 925–936.

17. Paraschos, Koulouriotis. Game difficulty adaptation and experience personalization: A literature review. Int J Human–Computer Interact, 2023; 39(1):1–22.

18. Jerčić, Sundstedt. Practicing emotion-regulation through biofeedback on the decision-making performance in the context of serious games: A systematic review. Entertain Comput, 2019; 29(December 2018):75–86.

19. , Qin, Lyu. Application of Biofeedback Technology in Human-Computer Interaction in Video Games. IOS Press; 2024.

20. Kuikkaniemi, Laitinen, Turpeinen, et al. The influence of implicit and explicit biofeedback in first-person shooter games. Conf Hum Factors Comput Syst - Proc. 2010:859–868.

21. Bodolai, Gazdi, Forstner, et al. Supervising Biofeedback-based serious games. 6th IEEE Conf Cogn Infocommunications, CogInfoCom 2015—Proc. 2016; 273–278.

22. Seyderhelm AJA, Blackmore, Nesbitt. Towards cognitive adaptive serious games: A conceptual framework. Vol. 11863 LNCS, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Springer International Publishing; 2019. 331–338.

23. Bicalho, Baffa, Feijó. A dynamic difficulty adjustment algorithm with generic player behavior classification unity plugin in single player games. ACM Int Conf Proceeding Ser, 2023:76–85.

24. Souza CHR, De Oliveira, Berretta, et al. DDA-MAPEKit: A framework for dynamic difficulty adjustment based on MAPE-K loop. ACM Int Conf Proceeding Ser, 2023:1–10.

25. Lima, Asif, Sousa, et al. Adaptive control of cardio-respiratory training in a virtual reality hiking simulation: A feasibility study. In: Proceedings of the 15th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2022) - BIOSIGNALS. SciTePress; 2022. pp. 91–99.

26. Gnacek, Broulidakis, Mavridou, et al. emteqPRO—fully integrated biometric sensing array for non-invasive biomedical research in virtual reality. Front Virtual Real, 2022; 3(March):1–17.

27. IJsselsteijn, De Kort YAW, Poels. The Game Experience Questionnaire. Eindhoven University of Technology; 2013.

28. Brooke. SUS: A “quick and dirty” usability scale. 189(3):189–194.

29. Vasconcelos-Raposo, Melo, Teixeira, et al. Adaptation and validation of the ITC - Sense of presence inventory for the Portuguese language. Int J Hum Comput Stud, 2019; 125:1–6.

30. Witmer, Singer. Measuring presence in virtual environments: A presence questionnaire. Presence, 1998; 7(3):225–240.
