Abstract
Swarm robotics aims to achieve robust collective behaviors through large numbers of relatively simple robots, but modeling and interpreting these emergent dynamics from real experimental data remains challenging. This work proposes an interpretable machine learning framework for modeling and analyzing swarm robot behaviors using a public IEEE DataPort dataset of swarm robotics experiments (eight robots, 200 time steps, and 1,600 labeled samples). We construct a feature-based representation of local interaction metrics (alignment, cohesion, separation, velocity, and position) and train a Random Forest classifier to recognize four behavioral phases: exploration, aggregation, formation, and foraging. The proposed classifier attains 98.12% overall accuracy and high per-class precision and recall, while feature importance and Shapley additive explanation analyses highlight alignment (31.44%) and cohesion (21.62%) as dominant behavioral drivers. Unsupervised clustering with KMeans and DBSCAN, supported by a Silhouette score of 0.2541 and an adjusted Rand index up to 0.69, reveals moderately separable latent structure consistent with the labeled phases. A Random Forest regressor further links local interaction features to global performance indicators, achieving high results on task-level outcomes. Our framework provides a unified, reproducible, and interpretable pipeline for real multi-robot data that combines classification, clustering, and regression. The results demonstrate that biologically inspired features can support accurate, explainable phase recognition and performance prediction, enabling data-driven design of swarm controllers for applications such as precision agriculture, search and rescue, and environmental monitoring.
Keywords
Introduction
Swarm robotics asks a seemingly simple question with broad implications: how can many simple robots, each with limited sensing and computation, coordinate locally yet collectively achieve robust behavior? The answer matters for tasks such as environmental monitoring, precision farming, and disaster response. Compared with single-robot systems, swarms can divide work, tolerate failures, and respond quickly to change. Progress in real-world deployments has nevertheless been slower than in simulation: hardware limitations, sim-to-real gaps, and evaluation practices that do not emphasize interpretability hinder the move from lab demos to field use.1 Data-driven methods are starting to complement rule-based and physics-based approaches; recent surveys review reinforcement learning (RL), imitation learning, and evolutionary design for swarms and call for more rigorous evaluation and deployable methods.2 Black-box models may achieve good performance, but they often obscure which cues matter and when. Measures based on collective behavior, such as alignment and cohesion, are both easy to compute on robots and simple to reason about, which makes them a natural starting point for interpretable models.3,4
As Dorigo et al. 5 note, a key scientific and engineering challenge is to understand and predict how low-level interaction rules give rise to high-level emergent behaviors in the presence of sensing noise, actuation errors, and dynamic environments.
Traditional modeling approaches in swarm robotics are often rule-based or physics-based. While these models provide intuition, they can struggle to generalize from simulation to real hardware, and they typically require manual tuning of behavioral rules. 6 As the scale and complexity of swarm systems increase, there is growing interest in data-driven approaches that learn interaction rules directly from empirical trajectories. In animal collectives, deep learning has been used to infer interaction mechanisms and predict individual decisions, for example, via deep attention networks for zebrafish schooling. 7 In a related line of work, data-based models of pairwise interactions have been combined with robotic platforms to reproduce fish schooling behaviors and study minimal interaction rules. 8 More recently, Zhang and Liu 9 proposed a biomimetic deep neural network controller trained on fish trajectories and deployed on swarm robots.
Despite this progress, three important gaps remain. First, many existing studies focus on reproducing trajectories or group-level statistics (such as polarization and interindividual distances), and do not provide explicit, interpretable classification of distinct behavioral phases (e.g., exploration vs. aggregation). Second, state-of-the-art deep learning models for collective behavior tend to act as black boxes: they achieve good predictive performance but give limited insight into which interaction features (alignment, cohesion, separation, etc.) actually drive decisions. Third, most existing works either operate purely in simulation or rely on animal trajectories, leaving a gap in interpretable, data-driven analysis of real multi-robot swarm experiments.
Novelty and significance. Unlike prior work that isolates tasks (e.g., only clustering) or relies on opaque deep models in simulation, we integrate interpretable classification, clustering, regression, and explanation on a real swarm dataset.
This paper addresses these gaps by proposing a unified and interpretable machine learning (ML) framework for modeling emergent swarm behaviors from real robot data. Using the IEEE DataPort swarm robotics dataset, 10 which contains time series of eight robots executing canonical swarm tasks, we construct a feature-based representation of local interactions and train supervised and unsupervised models to (i) classify behavioral phases, (ii) discover latent group structure, and (iii) estimate performance metrics. Our approach uses Random Forest classifiers and regressors together with KMeans and DBSCAN clustering and principal component analysis (PCA) for dimensionality reduction. Importantly, we emphasize interpretability through feature importance and Shapley additive explanation (SHAP) analyses, linking ML outputs back to biologically meaningful notions of alignment and cohesion.
The key contributions of this work are as follows:
We introduce a unified and interpretable ML framework that combines supervised classification, unsupervised clustering, and regression for swarm behavior analysis on a real multi-robot dataset (eight robots, 1,600 labeled samples across four phases).
We generate a comprehensive set of evaluation metrics for behavior recognition, including overall accuracy, per-class precision, recall, F1-score, and receiver operating characteristic (ROC)–area under the curve (AUC), and we compare our results against recent state-of-the-art learning-based models for collective behavior.
Through feature importance and SHAP analyses, we show that alignment and cohesion emerge as the dominant behavioral drivers, providing a physically and biologically grounded interpretation of the learned models.
We perform a comparative ablation of classifiers (Random Forest, support vector machine (SVM), and K-nearest neighbor (KNN)) and unsupervised methods (KMeans and DBSCAN), and we demonstrate that our Random Forest model offers a favorable tradeoff between predictive performance, robustness, and interpretability.
The overall workflow of the proposed pipeline is summarized in Figure 1. The rest of the paper is organized as follows. Section 2 reviews related work on swarm behavior modeling, learning in swarm systems, and explainability. Section 3 describes the dataset, features, and learning methods. Section 4 presents results for classification, clustering, and regression, together with a comparison against recent baselines. Section 5 describes application scenarios. We conclude with a summary and directions for future work.

Overall pipeline of the proposed machine learning framework for swarm behavior modeling.
This study introduces an ML system for swarm robotics that integrates interpretable classification, clustering, and performance estimation using empirical data from real robot swarms. Prior works typically address classification and clustering separately and often rely on purely simulated data, anomaly detection,11 or animal trajectories.7,8 In contrast, our approach jointly learns (i) a multi-class classifier for behavioral phases, (ii) unsupervised structure via KMeans and DBSCAN, and (iii) a regression model that maps local interaction features to global performance metrics, all on a single, experimentally grounded dataset of swarm robots.
Rather than relying solely on opaque deep neural networks, we employ feature-based analysis of swarm behavior (alignment, cohesion, separation, velocity, and spatial coordinates) combined with ensemble tree models and SHAP values to obtain instance-level explanations of model predictions. PCA is used to visualize behavioral structure and reduce dimensionality while preserving interpretability. This yields a pipeline that simultaneously provides phase classification, behavior discovery, and outcome prediction in a single framework, with explicit connections between learned models and established concepts from collective behavior and control theory. To the best of our knowledge, this is the first work to deliver such a unified, interpretable ML toolkit for real swarm-robot data, directly comparable to and complementary with recent deep learning models for collective motion.7,9
Related work
Swarm robotics has advanced in recent years through new ways of modeling, controlling, and explaining how groups act. We group prior work into three areas relevant to this study: (i) swarm behavior modeling and coordination, (ii) ML and data-driven methods in swarm systems, and (iii) explainable learning and feature-based behavior analysis.
Swarm behavior modeling and coordination
Swarm robotics takes cues from biological systems, emulating their self-organized and distributed coordination. Brambilla et al. 12 proposed a basic taxonomy of coordination methods that covers behaviors such as grouping, flocking, and foraging. Other studies used virtual pheromones 13 and field-based computing 14 to create new behaviors. Gandhe and Otte 15 proposed a sharing clustering process for robot swarms that can cope with communication problems and bad agents. Sayama 16 recently examined the evolution of swarm systems and provided a framework for understanding how robotic groups develop novel behaviors.
ML in swarm robotics
ML is increasingly used across many areas, 17 driven by demand for systems that can adapt and grow. Nguyen 18 reviewed swarm intelligence methods for coordinating multiple robots. Taghavian et al. 19 gave an accessible account of how ML can control swarms. RL has become common, with Blais and Akhloufi 20 examining its usefulness in swarm settings. Lai et al. 21 extended swarm intelligence to semi-supervised classification, linking exploration and exploitation in data-driven swarm control.
Evolutionary robotics is also important. For instance, Rajbhandari and Sofge 22 used neuroevolution of augmenting topologies (NEAT) to develop swarm behavior policies. Bredeche et al. 23 noted how embodied evolution supports adaptive collective intelligence. Another study 24 evolved swarm robot controllers using three different representations (neural networks, Cartesian genetic programming, and Markov brains) to solve a foraging task with two resource types requiring different computational skills (XOR logic and multiplication).
Explainability and feature-based behavior analysis
Comparison of key related works and our contribution.
ACN: asymmetric control network; ARI: adjusted Rand index; CNN: convolutional neural network; DNN: deep neural network; GAT: graph attention network; LDN: local directional network; ML: machine learning; NEAT: neuroevolution of augmenting topologies; OCSVM: one-class support vector machine; P-NeatFA: penalty-reward-based neuroevolution of augmented topologies foraging algorithm; RL: reinforcement learning; ROS 2: robot operating system 2; SHAP: Shapley additive explanation.
Swarm robotics feature definitions.
Despite this progress, many prior studies depend heavily on simulated settings or lack a complete framework that combines unsupervised clustering, supervised classification, and explainable feature attribution. In contrast, our method integrates these components in a single ML pipeline applied to real experimental swarm data, which yields new insight into behavioral phase detection and system-level coordination.
Table 1 presents a comprehensive comparison of our work against recent swarm robotics research, highlighting how our approach advances the state-of-the-art.
This section describes the pipeline we use to model swarm behavior. The steps are deliberately simple: basic preprocessing, a compact feature set, standard classifiers and clustering methods, and a small set of regression targets. Together they form a unified framework for modeling and explaining swarm behavior.
Dataset description
We use the IEEE DataPort repository, 10 which provides 1,600 observations gathered from eight robots over 200 time steps in a controlled arena. Each sample contains six measurements: alignment, cohesion, separation, velocity, and two spatial coordinates.
Each data point is assigned to one of four behavioral categories: exploration, aggregation, formation, and foraging.
Feature representation
The state of each robot at time
Table 2 provides detailed definitions of each feature.
Prior to modeling, all features are standardized using
To explore intrinsic patterns in high-dimensional data, PCA is applied (equation (3)):
To classify each behavioral phase, we train a Random Forest classifier (equation (4)):
The model computes feature importance via the average reduction in Gini impurity (equation (5)):
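As a hedged illustration of the impurity-based importance described above, the following sketch computes the Gini impurity of a node and the weighted impurity decrease produced by a single split. In a Random Forest, a feature's importance is obtained by accumulating such decreases over every node that splits on that feature and averaging across trees. The function names `gini` and `impurity_decrease` and the toy labels are assumptions made for illustration; they are not taken from the original study.

```python
from collections import Counter


def gini(labels):
    """Gini impurity 1 - sum_c p_c^2 of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def impurity_decrease(parent, left, right):
    """Weighted Gini decrease obtained by splitting `parent` into `left` and `right`."""
    n = len(parent)
    weighted_children = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(parent) - weighted_children


# Toy node holding two phases, separated perfectly by a hypothetical
# threshold on the alignment feature.
parent = ["exploration"] * 4 + ["formation"] * 4
left, right = parent[:4], parent[4:]
decrease = impurity_decrease(parent, left, right)
print(decrease)  # 0.5: the parent impurity of 0.5 drops to 0 in both children
```

A feature that repeatedly produces large decreases of this kind ends up with a high importance score, which is how a ranking such as "alignment first, cohesion second" arises.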
Unsupervised clustering is used to detect emergent groupings.
KMeans
The KMeans algorithm minimizes the within-cluster sum of squares as defined in equation (6):
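The within-cluster sum of squares can be made concrete with a small sketch. It uses the standard definition (squared Euclidean distance of each point to the centroid of its assigned cluster, summed over all points); the helper names `centroid` and `wcss` and the toy points are assumptions for illustration and do not reproduce the paper's notation.

```python
def centroid(points):
    """Coordinate-wise mean of a non-empty list of equal-length points."""
    n = len(points)
    return [sum(coords) / n for coords in zip(*points)]


def wcss(points, labels):
    """Within-cluster sum of squared distances to each cluster's centroid."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    total = 0.0
    for members in clusters.values():
        center = centroid(members)
        total += sum(
            sum((x - c) ** 2 for x, c in zip(point, center)) for point in members
        )
    return total


# Four toy points forming two tight pairs that are far apart.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
good = wcss(points, [0, 0, 1, 1])  # each pair is its own cluster
bad = wcss(points, [0, 1, 0, 1])   # clusters straddle the gap
print(good, bad)  # 4.0 100.0
```

KMeans searches for the assignment and centroids that make this quantity as small as possible, so the first labeling is preferred over the second.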
DBSCAN identifies core samples based on density. The neighborhood of a point is defined as (equation (7)):
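A minimal sketch of the neighborhood and core-sample test follows, assuming Euclidean distance and the common convention that a point counts toward its own neighborhood. The parameter names `eps` and `min_samples`, the helper names, and the toy points are illustrative assumptions rather than the settings used in the study.

```python
import math


def eps_neighborhood(points, index, eps):
    """Indices of all points within distance eps of points[index] (itself included)."""
    return [
        j for j, other in enumerate(points)
        if math.dist(points[index], other) <= eps
    ]


def core_points(points, eps, min_samples):
    """Indices whose eps-neighborhood contains at least min_samples points."""
    return [
        i for i in range(len(points))
        if len(eps_neighborhood(points, i, eps)) >= min_samples
    ]


# Three nearby points and one isolated point.
points = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (5.0, 5.0)]
cores = core_points(points, eps=1.0, min_samples=3)
print(cores)  # [0, 1, 2]; the isolated point is not a core sample
```

Points that are neither core samples nor within `eps` of one are treated as noise by DBSCAN, which is one reason the method can fragment data whose density is fairly uniform.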
The Silhouette score is used to measure how well clustering performs (equation (8)):
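To show how such a score is obtained, the sketch below implements the standard silhouette coefficient: for each point, a is the mean distance to the other members of its own cluster, b is the smallest mean distance to any other cluster, and the coefficient is (b - a) / max(a, b). The function name `mean_silhouette`, the treatment of singleton clusters as 0, and the toy data are assumptions for illustration.

```python
import math


def mean_silhouette(points, labels):
    """Mean silhouette coefficient s(i) = (b - a) / max(a, b) over all points."""
    scores = []
    for i, (p, label) in enumerate(zip(points, labels)):
        # Distances from point i to every other point, grouped by cluster label.
        by_cluster = {}
        for j, (q, other) in enumerate(zip(points, labels)):
            if i != j:
                by_cluster.setdefault(other, []).append(math.dist(p, q))
        own = by_cluster.pop(label, [])
        if not own or not by_cluster:
            scores.append(0.0)  # singleton or sole cluster: scored 0 in this sketch
            continue
        a = sum(own) / len(own)
        b = min(sum(d) / len(d) for d in by_cluster.values())
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
well_separated = mean_silhouette(points, [0, 0, 1, 1])
mixed_up = mean_silhouette(points, [0, 1, 0, 1])
```

Values close to 1 indicate compact, well-separated clusters, values near 0 indicate overlapping clusters, and negative values indicate points that sit closer to another cluster than to their own.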
To estimate swarm performance (e.g., task efficiency), we use Random Forest regression as shown in equation (9):
The model is evaluated with mean squared error (MSE) and the coefficient of determination (R²).
Combining classification, regression, and clustering, this integrated pipeline helps predict and interpret swarm behaviors, providing a solid foundation for multi-agent system analysis.
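The regression evaluation can be sketched with two standard metrics: MSE and the coefficient of determination R², the latter being a common companion metric that is assumed here. The function names and the toy completion times are hypothetical and serve only to illustrate the calculation.

```python
def mean_squared_error(y_true, y_pred):
    """Average squared difference between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical task-completion times (seconds) and model predictions.
y_true = [10.0, 12.0, 14.0, 16.0]
y_pred = [11.0, 12.0, 13.0, 16.0]
mse = mean_squared_error(y_true, y_pred)  # (1 + 0 + 1 + 0) / 4 = 0.5
r2 = r_squared(y_true, y_pred)            # 1 - 2 / 20 = 0.9
```

MSE is expressed in squared units of the target, whereas R² is unitless and reports the share of target variance explained by the model, so the two are usually read together.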
We evaluate the pipeline on the swarm dataset in terms of classification accuracy, clustering structure, and regression results. We interpret the findings through alignment and cohesion, two quantities widely used in studies of collective motion.
Behavior classification performance
The Random Forest achieves an accuracy of 98.12% across the four phases, which suggests that the feature set is expressive enough to separate the phases cleanly without heavy modeling. As illustrated in Figure 2, the confusion matrix exhibits strong diagonal dominance, indicating consistently high precision and recall for each class. Misclassifications are rare and occur mostly between similar or transitional behaviors, reflecting the inherent overlap in swarm dynamics. These findings indicate that the model generalizes well across coordination patterns and captures the underlying structure of the behavioral feature space.

Confusion matrix for behavioral phase classification.
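Per-class precision and recall follow directly from a confusion matrix, as the sketch below shows. The row/column convention (rows are true classes, columns are predicted classes), the function names, and the counts are assumptions for illustration; the counts are made up and are not the paper's results.

```python
def per_class_metrics(confusion, class_names):
    """Precision and recall per class from a square confusion matrix.

    confusion[i][j] counts samples of true class i predicted as class j.
    """
    metrics = {}
    for k, name in enumerate(class_names):
        true_positive = confusion[k][k]
        predicted_k = sum(row[k] for row in confusion)  # column sum
        actual_k = sum(confusion[k])                    # row sum
        precision = true_positive / predicted_k if predicted_k else 0.0
        recall = true_positive / actual_k if actual_k else 0.0
        metrics[name] = (precision, recall)
    return metrics


def accuracy(confusion):
    """Fraction of samples on the diagonal of the confusion matrix."""
    correct = sum(confusion[k][k] for k in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


phases = ["exploration", "aggregation", "formation", "foraging"]
# Made-up counts for illustration only.
toy_confusion = [
    [48, 2, 0, 0],
    [1, 49, 0, 0],
    [0, 0, 50, 0],
    [0, 1, 0, 49],
]
toy_metrics = per_class_metrics(toy_confusion, phases)
toy_accuracy = accuracy(toy_confusion)  # 196 / 200 = 0.98
```

Strong diagonal dominance means that both the row sums and the column sums are concentrated on the diagonal entry, which is why high accuracy here goes together with high per-class precision and recall.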
Alignment and cohesion are the most significant predictors of behavioral state, contributing 31.44% and 21.62%, respectively, according to feature importance analysis (Figure 3). These results support the biological and control-theoretic underpinnings of swarm coordination, where cohesion preserves group integrity and alignment promotes directional consensus.

Feature importance for behavioral classification using Random Forest.

Shapley additive explanation (SHAP) interaction values for key features.

Comparison of KMeans and DBSCAN clustering results.

t-SNE projection of the swarm behavior dataset into 2D space. t-SNE: t-distributed stochastic neighbor embedding; 2D: two-dimensional.

Adjusted Rand index (ARI) comparing KMeans cluster labels to true behavior labels across different numbers of clusters.

Silhouette coefficient distribution across clusters.
Figure 4 presents a Shapley additive explanation (SHAP) interaction summary, illustrating how combinations of features influence the Random Forest model’s decisions. Cohesion interacts notably with velocity and spatial coordinates, suggesting that spatial unity directly affects movement patterns. This supports the biological principle that local interactions drive emergent swarm coordination. The SHAP analysis enhances model transparency and strengthens the case for explainable AI in robotics.
KMeans and DBSCAN clustering were applied to the swarm dataset to identify latent behavioral patterns. In Figure 5, KMeans identifies four clusters that correspond to the known behavioral phases, illustrating that coordination stages can be inferred from local features without supervision. DBSCAN generated more fragmented groupings because of the comparatively uniform density of swarm states. These findings suggest that KMeans is better suited to unsupervised discovery on this dataset and that the feature space is sufficiently structured to differentiate between behavioral phases.
A t-distributed stochastic neighbor embedding (t-SNE) projection (Figure 6) further illustrates the separability of behavioral states: exploration, aggregation, formation, and foraging appear as distinct clusters. Unlike linear PCA, t-SNE captures nonlinear relationships, which adds support for the discriminative power of the selected features.
Adjusted Rand index (ARI) analysis (Figure 7) shows that the highest ARI occurs at
Silhouette analysis (Figure 8) yields a coefficient of 0.2541, indicating moderate separation among behavioral phases.
Table 3 summarizes clustering metrics, showing that KMeans outperforms DBSCAN in both Silhouette score and ARI.
Clustering evaluation metrics: KMeans versus DBSCAN.
ARI: adjusted Rand index.
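The ARI compares a cluster assignment with the reference phase labels while correcting for agreement expected by chance. The sketch below implements the standard pair-counting formula; the function name `adjusted_rand_index` and the toy labelings are assumptions for illustration.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same samples."""
    n = len(labels_true)
    # Contingency counts: how many samples share each (true, predicted) pair.
    cell_counts = Counter(zip(labels_true, labels_pred))
    sum_cells = sum(comb(c, 2) for c in cell_counts.values())
    sum_rows = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_cols = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = sum_rows * sum_cols / comb(n, 2)
    maximum = (sum_rows + sum_cols) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both labelings use a single cluster
    return (sum_cells - expected) / (maximum - expected)


truth = [0, 0, 1, 1]
relabeled = adjusted_rand_index(truth, [1, 1, 0, 0])  # same partition, renamed
crossed = adjusted_rand_index(truth, [0, 1, 0, 1])    # partition cuts across truth
```

Because the index depends only on which samples are grouped together, renaming the clusters leaves it at 1, while a partition that cuts across the reference labels scores below 0. A value of 0 corresponds to chance-level agreement.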
Table 4 compares our proposed method with state-of-the-art techniques. The analysis yields several insights.
Comparative performance analysis with state-of-the-art methods (best results).
AUC: area under the curve; ROS 2: robot operating system 2; OCSVM: one-class support vector machine; MLP: multi-layer perceptron; SVM: support vector machine; KNN: K-nearest neighbor.
Against ROS 2 anomaly detection
Kang et al. 11 used autoencoders to detect callback anomalies offline in ROS 2 systems and report the highest accuracy (99.2%). Although their method performs exceptionally well in offline analysis, with an AUC of 0.999, it is intended for post-hoc forensic analysis rather than real-time behavior classification. Our approach instead targets online swarm monitoring, achieving 98.12% accuracy with a 12 ms inference time.
Against mobile robot navigation
Our Random Forest classifier (98.12%) outperforms Sabeeh’s best multi-layer perceptron (MLP) approach (93.04%) by about 5 percentage points. 29 This improvement indicates the effectiveness of our feature engineering strategy, especially the focus on alignment and cohesion metrics.
Robustness analysis
One of the key benefits of our approach is its consistency. Kang et al.’s 11 methods, which address a different task (offline callback anomaly detection rather than behavior classification), show significant performance variance across hyperparameter configurations (e.g., one-class support vector machine (OCSVM) ranges from 66.5% to 98.8% accuracy). Across cross-validation folds, our Random Forest classifier performs consistently (standard deviation
Regression-based performance estimation
Random Forest regression accurately predicts swarm performance metrics, such as task completion time and dispersion. As shown in Figures 9 and 10, predicted values closely match ground truth with a

Regression predictions versus ground truth values.

Summary of model metrics: accuracy, mean squared error (MSE), and R².
Figure 11 compares the classification performance of the Random Forest, SVM, and KNN models. Random Forest consistently outperforms the alternatives, with the highest accuracy and the lowest variance in cross-validation (Figure 12). These results suggest that ensemble approaches are well suited to swarm behavior recognition on this feature set.

Comparison of classification models: Random Forest versus SVM and KNN. SVM: support vector machine; KNN: K-nearest neighbor.

Cross-validation accuracy distribution of Random Forest, support vector machine (SVM), and K-nearest neighbor (KNN) classifiers over five folds.

Density heatmap showing robot concentration during aggregation phase.

Robot trajectories across behavioral phases.

Heatmap of alignment, cohesion, and separation metrics over 200 time steps.

Visual illustration of alignment and cohesion mechanisms in swarm robots. Alignment drives directional consensus, while cohesion promotes group integrity.
Different coordination patterns across behavioral phases are revealed by spatial analysis using density heatmaps and trajectory plots (Figures 13 and 14). The system’s capacity to capture underlying coordination logic is demonstrated by high-density regions during aggregation and ordered trajectories during formation, which validate the learned phase labels.
Temporal heatmaps (Figure 15) of alignment, cohesion, and separation over 200 time steps reveal clear transitions between behavioral phases, further validating the discriminative power of the selected features.
Integrated discussion
The proposed framework uses classification, clustering, and regression to describe and analyze swarm behavior. Its interpretable characteristics and models enable real-time analysis and adaptive control in multi-robot systems. The visuals demonstrate that the system can successfully extract spatial dynamics and coordination patterns.
Random Forest achieved 98.12% classification accuracy across four behavioral states. Alignment (31.44%) and cohesion (21.62%) are the most predictive features. Regression using behavioral metrics achieved high results on task-level outcomes. KMeans clustering revealed separable groupings aligned with task phases.
Edge deployment potential
The framework is suitable for deployment on edge platforms such as Raspberry Pi or Jetson Nano because of its lightweight models and dimensionality-reduced features, which achieve low inference latency (
Real-world application scenarios
Numerous real-world domains can directly benefit from the ML framework created for swarm behavior modeling. According to our research, the most predictive behavioral characteristics that enable robust coordination in decentralized robot swarms are cohesion and alignment.
Agricultural automation
Swarm robotics provides scalable and cost-effective solutions to major agricultural problems such as labor shortages, soil compaction, and resource optimization. Agricultural tasks are directly correlated with the behavioral phases observed in this experiment:
New technologies such as SwarmFarm Robotics can replace large machinery in fieldwork with small, coordinated robots. The framework’s quick inference time and edge device compatibility enable real-time, distributed deployment in agriculture.
The concepts of alignment and cohesion, which are essential for efficient swarm coordination in field operations, are depicted in Figure 16.
Search and rescue
Swarm robots can use local interactions to autonomously coordinate movement in dangerous or disaster-prone environments. The behavioral phases identified in our analysis map onto such missions: aggregation close to victims or targets, distributed exploration of rubble zones, and dynamic formation control for cooperative lifting.
Environmental monitoring and space exploration
Robot swarm behavioral modeling is crucial for tasks in remote or large areas, such as aggregated sensing for pollution detection in marine environments, autonomous exploration in planetary missions, and formation flying for aerial climate monitoring.
Our SHAP and PCA analyses suggest that interpretable behavioral modeling is beneficial for these applications. Table 5 maps the four swarm behavior phases to operational roles in agricultural and environmental robotics, emphasizing the importance of cohesion and alignment as critical drivers.
Table 6 summarizes the class-wise precision and recall for the Random Forest classifier. All four behavior phases exhibit consistently high recognition rates, underscoring the model’s robustness and the discriminative power of the selected features.
Mapping of swarm behavior phases to real-world applications.
Behavior phase classification performance using Random Forest.
Figure 17 shows how the behavioral phases map directly onto real agricultural tasks in support of in-field swarm operations.

Exploration, aggregation, formation, and foraging are the behavioral stages that are mapped to agricultural field activities.
This research presents a unified ML framework for modeling and interpreting swarm robot behaviors through classification, clustering, and regression. Using a public IEEE DataPort dataset of swarm robotics experiments, we show that a feature-based Random Forest classifier can distinguish four behavioral phases with 98.12% accuracy, while feature importance and SHAP analyses confirm the central role of alignment and cohesion in driving emergent coordination. Unsupervised clustering with KMeans and DBSCAN, supported by Silhouette and ARI scores, reveals a moderately separable latent structure that is consistent with the labeled phases. Random Forest regression further demonstrates that local interaction features can accurately predict global performance indicators, achieving
Beyond predictive performance, the framework emphasizes interpretability and reproducibility. All models operate on intuitive interaction features, and the pipeline combines supervised learning, unsupervised discovery, and regression in a single workflow that can be applied to other swarm datasets. The analysis provides a bridge between biologically inspired collective behavior metrics and practical, deployable models for multi-robot systems in domains such as agriculture, search and rescue, and environmental monitoring.
Limitations and future work
The present study has several limitations that open up avenues for future research. First, the empirical evaluation is restricted to a single experimental setup (eight homogeneous robots in a circular arena performing four scripted phases) drawn from one dataset. 10 While this choice enables controlled analysis, it limits the assessment of generalization to different robot platforms, sensing modalities, and environmental conditions. Second, phase labels are derived from task-defined time intervals rather than independent human annotations, so hidden sub-phases or mixed behaviors may not be fully captured. Third, although Random Forests and SHAP values provide useful interpretability, the current framework does not yet incorporate explicit temporal models (e.g., hidden Markov models or recurrent networks) or graph-structured representations that could encode time-varying interaction graphs more directly. Fourth, clustering performance is moderate (Silhouette score = 0.2541), indicating that purely feature-based separation of phases remains challenging in some regions of the state space.
In future work, we plan to extend the framework along four directions: (i) applying the pipeline to more diverse swarm datasets, including heterogeneous robot teams and outdoor experiments; (ii) integrating temporal and graph-based models to capture richer interaction dynamics while preserving interpretability; (iii) closing the loop by using the learned models for online monitoring and adaptive control of swarms; and (iv) exploring human-in-the-loop labeling and explanation interfaces to validate and refine behavioral phase definitions. Addressing these limitations will further strengthen the role of interpretable ML as a tool for designing, analyzing, and deploying real-world swarm robotic systems.
Footnotes
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
