Abstract
This paper explores an initial attempt to use the Unity ML-Agents toolkit to model the behavior of people evacuating from indoor fires. The virtual environment was created in the Unity game engine and populated with humanoid agents capable of moving autonomously within the scene. Each agent perceives information from the rendered environment, such as surfaces, directions, and line-of-sight depth, and uses it to navigate toward the nearest exit. Agents were trained through reinforcement learning with the Proximal Policy Optimization (PPO) algorithm, which learns a policy from the rewards and penalties assigned to their actions. We tested five different reward schemes in single-agent simulations to observe how they affect navigation behavior. Among them, the scheme referred to as mark5 produced the most plausible and efficient evacuation strategy, reaching the exit quickly while avoiding collisions. The same trained agent was then used in multi-agent settings, where its performance remained stable even with groups of up to 20 individuals. These first results suggest that Unity ML-Agents can offer a practical foundation for building more realistic and adaptive evacuation models.
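To make the idea of a reward scheme concrete, the following is a minimal sketch of how a per-step time penalty, a collision penalty, and a terminal exit reward might be combined. It is not the mark5 scheme or any of the five schemes tested here: the names `StepOutcome`, `step_reward`, and `episode_return` and all constants are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class StepOutcome:
    """What happened during one simulation step (hypothetical structure)."""
    reached_exit: bool
    collided: bool


# Illustrative constants only; these are assumptions, not values from the paper.
EXIT_REWARD = 1.0
COLLISION_PENALTY = -0.1
TIME_PENALTY = -0.001


def step_reward(outcome: StepOutcome) -> float:
    """Return the reward for a single step under this assumed scheme."""
    reward = TIME_PENALTY  # small cost on every step favors fast evacuation
    if outcome.collided:
        reward += COLLISION_PENALTY  # discourage hitting walls or other agents
    if outcome.reached_exit:
        reward += EXIT_REWARD  # bonus for reaching an exit
    return reward


def episode_return(outcomes: Iterable[StepOutcome]) -> float:
    """Sum step rewards, stopping at the first step that reaches the exit."""
    total = 0.0
    for outcome in outcomes:
        total += step_reward(outcome)
        if outcome.reached_exit:
            break  # the episode ends once the agent has evacuated
    return total
```

Under these assumed values, an agent that reaches the exit in fewer steps and with fewer collisions accumulates a higher return, which is the quantity a PPO-trained policy is optimized to increase.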
