Object detection algorithm and graph convolutional network for graded early warning of laboratory misconduct

Abstract

With the increasing demand for laboratory safety management, precise detection and graded early warning of misconduct have become more critical. Traditional manual monitoring methods suffer from low efficiency, limited coverage, and difficulty handling complex scenarios. To address this, this paper proposes a laboratory misconduct graded early warning model based on an object detection algorithm and a Graph Convolutional Network (GCN). The model uses You Only Look Once version 9 (YOLO v9), combined with the Convolutional Block Attention Module (CBAM), to enhance key feature extraction and accurately identify misconduct. Meanwhile, the GCN explores spatial correlations between behaviors, and Gated Recurrent Units (GRU) capture temporal dynamic features to implement graded risk warning. In the experimental evaluation, the model reached a minimum loss of 0.027 after 120 iterations, lower than the comparison models, which recorded loss values of 0.24, 0.25, and 0.32. In graded early warning tests, the model reached an accuracy of 95.62%, with precision and recall exceeding 92%, clearly higher than the highest values of the comparison models at 88.21% and 88.01%. These results indicate that the model can achieve precise detection and graded early warning of laboratory misconduct, providing an intelligent solution for laboratory safety management and promoting efficient and accurate safety monitoring.

Keywords

YOLO v9, convolutional block attention mechanism, graph convolutional network, gated recurrent unit, laboratory misconduct, graded early warning

1 Background

In the context of rapid scientific and technological development, laboratories serve as key sites for research and teaching and play an important role in promoting technological progress and cultivating professional talent. However, common misconduct in laboratories can easily cause equipment overheating, short circuits, and other faults, leading to fire, electric shock, and other safety accidents that seriously threaten personnel and property safety.¹ Traditional manual monitoring methods have low efficiency and coverage, weak flexibility, and poor generalization ability, making them insufficient to meet modern laboratory safety requirements.² Object detection is a fundamental task in computer vision that focuses on recognizing and pinpointing objects within images or video frames.³ The You Only Look Once version 9 (YOLO v9) algorithm adopts a single-stage detection strategy, transforming the detection task into a regression problem and completing object classification and localization in a single neural network.⁴ The Graph Convolutional Network (GCN) can operate directly on graph-structured data, using both node and edge information, and is therefore well suited to capturing complex relationships between data.⁵ Based on this, this paper proposes a laboratory misconduct graded early warning model combining the YOLO algorithm and GCN, aiming to accurately detect and identify violations and provide a new intelligent solution for laboratory management. The innovation lies in introducing an attention mechanism, which enables YOLO to focus more on key regions and features closely related to misconduct, enhancing the algorithm's adaptability and robustness in complex scenarios. Furthermore, an adaptive graph convolution kernel is designed that automatically adjusts convolution weights and parameters according to the importance of nodes and edges in the graph data, mining potential correlations and patterns between misconduct behaviors and achieving precise classification and graded early warning.

2 Literature review

Detecting misconduct is key to preventing safety accidents, enhancing management efficiency, and improving personnel safety awareness. Researchers at home and abroad have conducted a series of practical studies. Yan et al. proposed a violation detection system based on YOLO and Convolutional Neural Networks (CNN) for safety monitoring at substation construction sites. The experiments showed that the model reached a precision of 0.852 and a recall of 0.922, satisfying the accuracy requirements for identifying actual violations in power construction.⁶ Shanti et al. proposed a real-time monitoring model for high-altitude activities based on deep learning and unmanned aerial vehicles. Tests indicated that the model reached an accuracy of 97.2% and a recall of 90.2%, with an average time of approximately 12 s to detect violations.⁷ Zhu and Yang introduced a real-time student behavior detection model for classrooms, which leveraged multi-scale feature fusion and self-calibrated convolution. Experimental results showed that the model accurately detected student behavior in class while maintaining fast real-time computation.⁸ Patwal et al. introduced a hybrid model integrating CNN and Long Short-Term Memory networks to address low efficiency in time-frequency abnormal behavior detection. Tests showed that the model achieved a detection accuracy of 92.4%, a false positive rate of 4.1%, and a single-frame processing delay within 32 ms.⁹ Patel et al. analyzed crowd behavior using an online object tracking method enhanced by motion compensation. Experimental results indicated that the technique reached a multi-object tracking accuracy of 90.2% across three datasets, ran at 38 frames per second, and achieved an abnormal behavior recognition accuracy of 89.5%.¹⁰

The YOLO algorithm has been widely applied in scenarios such as autonomous driving obstacle detection, security monitoring object tracking, industrial quality inspection defect detection, and medical image lesion screening due to its end-to-end fast detection capability and high real-time performance. Ganagavalli and Santhi proposed an abnormal activity detection model based on YOLO to identify criminal behavior. Tests showed that the area under the curve for subjects performing destructive behavior reached 0.8299, with excellent recognition accuracy across 14 types of criminal activities.¹¹ Xiao et al. proposed a daily behavior recognition model for ducks based on YOLO version 8 and attention mechanisms for animal behavior detection under low-light conditions. Experimental testing demonstrated that the model achieved an average precision of 94.8% under well-lit conditions and 93.6% under dark conditions.¹² Zhao et al. proposed a YOLO-based steel surface defect detection model. Tests indicated that the model reached an average precision of 81.1% and 75.2% on two test datasets, improving by 4.3% and 5.8% compared to comparison models.¹³ GCN has been widely applied in social network analysis, recommendation systems, chemical molecular property prediction, and biological network modeling due to its ability to capture node dependencies and aggregate neighborhood features in graph-structured data. Wang et al. proposed a three-dimensional object detection model based on self-attention GCN for point cloud detection in autonomous driving. Tests showed that the model improved average precision by 4.88%, 5.02%, and 2.79% compared to comparison models.¹⁴ Xu et al. proposed a tool wear recognition model based on GCN and cross-attention structures for high-precision machining. Test results indicated that the model achieved a weighted average F1 score of 0.987 and enabled real-time recognition of tool wear during machining.¹⁵

In summary, researchers have made considerable progress in detecting misconduct. However, studies on identifying laboratory misconduct and providing graded early warning remain limited. Therefore, this paper proposes a laboratory misconduct graded early warning model based on YOLO and GCN. The model aims to accurately identify violations during experimental processes, optimize management efficiency, and improve laboratory safety.

3 Graded early warning model based on YOLO v9 and GCN

3.1 Detection algorithm design combining YOLO v9 and attention mechanism

Laboratory environments are complex and variable, with frequent activities of personnel and equipment. Meanwhile, the types of misconduct are diverse, placing high demands on algorithm accuracy and robustness. YOLO algorithms can efficiently process a large number of video frames in a short time, quickly capturing potential misconduct and providing strong support for timely intervention.¹⁶ YOLO v9 introduces the Generalized Efficient Layer Aggregation Network (GELAN) based on YOLO v8, significantly enhancing feature extraction capability and achieving higher detection accuracy. The GELAN feature extraction process is shown in Figure 1.

Figure 1.

GELAN feature extraction process diagram.

As shown in Figure 1, GELAN first extracts features through cross-stage local expansion networks combined with arbitrary computation blocks. At the same time, the module's pooling and convolution layers downsample feature maps to obtain multi-scale features. These multi-scale feature maps are then input to the detection head to predict object categories and locations. Finally, post-processing, such as non-maximum suppression, produces the final detection results. The process optimizes the network structure through gradient path planning, achieving efficient information flow and parameter utilization. In this process, the initial convolution expression is shown in Equation (1).

\begin{aligned} x_{1} = \mathrm{conv}_{(c_{1}, c_{2})} (x_{\mathrm{in}}) \end{aligned}

(1)

In Equation (1), $x_{1}$ is the output feature tensor, $\mathrm{conv}$ represents the convolution operation, $x_{\mathrm{in}}$ represents the input feature tensor, $c_{1}$ is the number of output feature tensor channels, and $c_{2}$ represents the number of input feature tensor channels. The DOWN module then performs feature map downsampling, reducing spatial resolution and increasing channel numbers to extract multi-scale features. The final output expression is shown in Equation (2).

\begin{aligned} x_{\mathrm{out}} = \mathrm{conv}_{(c_{3}, c_{4})} (x_{\mathrm{new}}) \end{aligned}

(2)

In Equation (2), $x_{\mathrm{out}}$ represents the final output feature, $x_{\mathrm{new}}$ represents the final input feature, $c_{3}$ indicates the number of channels in the final output feature, and $c_{4}$ represents the number of channels in the final input feature. GELAN optimizes the network structure, enabling efficient propagation and aggregation of features at different levels. It reduces computation and parameters while balancing lightweight design and efficiency. However, YOLO v9 may still suffer from insufficient feature extraction in complex laboratory scenarios. Therefore, the Convolutional Block Attention Module (CBAM) is introduced to enhance YOLO v9's feature representation capability. Its channel attention uses global pooling and a multi-layer perceptron to capture channel semantic dependencies and assign higher weights to key features. Its spatial attention focuses on object regions and filters background interference, improving feature extraction for small and occluded objects and enhancing generalization under varying lighting and complex backgrounds.¹⁷,¹⁸ The CBAM optimization process is illustrated in Figure 2.

Figure 2.

Schematic diagram of the CBAM optimization process.

As illustrated in Figure 2, within CBAM, the input feature map first undergoes average and max pooling to extract feature vectors, which are then fed into a shared perceptron. The resulting outputs are summed and passed through an activation function to produce channel attention weights. These weights are applied to the original feature map channel-wise, generating the channel-enhanced feature map. This enhanced map is subsequently processed through average and max pooling, followed by a convolution operation to produce intermediate features. An activation function then generates spatial attention weights, which are applied position-wise to the channel-enhanced feature map to obtain the final enhanced feature map. The calculation of CBAM's channel attention is presented in Equation (3).

\begin{aligned} M_{c} (F) = σ (\mathrm{MLP} (\mathrm{AvgPool} (F)) + \mathrm{MLP} (\mathrm{MaxPool} (F))) \end{aligned}

(3)

In Equation (3), $M_{c}$ is the output channel attention weight, $F$ denotes the input feature map, $\mathrm{MLP}$ indicates the non-linear transformation of the two pooled descriptors by the shared multi-layer perceptron, and $σ$ represents the Sigmoid activation function. The computation of CBAM's spatial attention is shown in Equation (4).

\begin{aligned} M_{s} (F^{'}) = σ (f^{7 \times 7} ([\mathrm{AvgPool} (F^{'}); \mathrm{MaxPool} (F^{'})])) \end{aligned}

(4)

In Equation (4), $M_{s}$ represents the output spatial attention weight, $f^{7 \times 7}$ denotes a convolution with a $7 \times 7$ kernel applied to the concatenation of the two pooled maps, and $F^{'}$ is the channel-attention-adjusted feature map. In summary, the structure of YOLO v9 optimized by CBAM (CBAM-YOLO v9) is shown in Figure 3.
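As an illustration of Equations (3) and (4), the following minimal pure-Python sketch applies channel attention and then spatial attention to a small feature map. It is not the implementation used in this study: the nested-list layout, the ReLU hidden layer of the shared perceptron, the zero padding, and the names `channel_attention`, `spatial_attention`, `cbam`, `W0`, `W1`, and `kernel` are all assumptions made for the example.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def channel_attention(F, W0, W1):
    """Equation (3): sigmoid(MLP(AvgPool(F)) + MLP(MaxPool(F))).

    F  : feature map as a list of C channels, each an H x W list of lists.
    W0 : hidden-layer weights of the shared MLP, shape (C_hidden, C) (assumed).
    W1 : output-layer weights of the shared MLP, shape (C, C_hidden) (assumed).
    """
    avg = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in F]
    mx = [max(map(max, ch)) for ch in F]

    def mlp(v):
        # One hidden layer with ReLU, shared by both pooled descriptors.
        hidden = [max(0.0, sum(w * x for w, x in zip(row, v))) for row in W0]
        return [sum(w * h for w, h in zip(row, hidden)) for row in W1]

    return [sigmoid(a + m) for a, m in zip(mlp(avg), mlp(mx))]

def spatial_attention(F_prime, kernel):
    """Equation (4): sigmoid(conv([AvgPool(F'); MaxPool(F')])).

    Pooling runs across channels; kernel has shape 2 x k x k with k odd
    (k = 7 in Equation (4)); borders are zero padded (assumed).
    """
    C, H, W = len(F_prime), len(F_prime[0]), len(F_prime[0][0])
    avg = [[sum(F_prime[c][i][j] for c in range(C)) / C for j in range(W)]
           for i in range(H)]
    mx = [[max(F_prime[c][i][j] for c in range(C)) for j in range(W)]
          for i in range(H)]
    maps = [avg, mx]
    k = len(kernel[0])
    pad = k // 2
    out = [[0.0] * W for _ in range(H)]
    for i in range(H):
        for j in range(W):
            acc = 0.0
            for m in range(2):
                for di in range(k):
                    for dj in range(k):
                        y, x = i + di - pad, j + dj - pad
                        if 0 <= y < H and 0 <= x < W:
                            acc += kernel[m][di][dj] * maps[m][y][x]
            out[i][j] = sigmoid(acc)
    return out

def cbam(F, W0, W1, kernel):
    """Apply channel weights channel-wise, then spatial weights position-wise."""
    Mc = channel_attention(F, W0, W1)
    F_prime = [[[Mc[c] * v for v in row] for row in F[c]] for c in range(len(F))]
    Ms = spatial_attention(F_prime, kernel)
    return [[[F_prime[c][i][j] * Ms[i][j] for j in range(len(Ms[0]))]
             for i in range(len(Ms))] for c in range(len(F))]
```

With all weights set to zero, both attention maps equal 0.5 everywhere, so every value of the input is scaled by 0.25, which gives a quick sanity check of the two-stage weighting.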

Figure 3.

Schematic diagram of the structure of CBAM-YOLO v9.

As shown in Figure 3, CBAM-YOLO v9 extracts and preliminarily processes features through convolution, efficient layer aggregation, and adaptive convolution operations. CBAM enhances feature representation. Upsampling increases feature map resolution, and concatenation fuses features from different levels. Finally, dual detection layers complete object detection. CBAM-YOLO v9 employs a dynamic allocation strategy, expressed in Equation (5).

\begin{aligned} t = s^{α} + u^{β} \end{aligned}

(5)

In Equation (5), t represents the dynamic allocation index, s denotes the prediction score, u indicates the localization match, and $α$ and $β$ are hyperparameters controlling the prediction score and localization match. CBAM-YOLO v9's loss includes classification and regression losses. The classification loss is shown in Equation (6).

\begin{aligned} L_{1} = \frac{1}{N} \sum_{i} - [y_{i} \cdot \log (p_{i}) + (1 - y_{i}) \cdot \log (1 - p_{i})] \end{aligned}

(6)

In Equation (6), $L_{1}$ represents the classification loss, N denotes the number of samples, i represents the sample index, y indicates the actual label, and p represents the predicted probability of the sample. The regression loss is shown in Equation (7).

\begin{aligned} L_{2} = 1 - \mathrm{IoU} + \frac{ρ^{2} (b, b^{'})}{a^{2}} + ω v \end{aligned}

(7)

In Equation (7), $L_{2}$ denotes the regression loss, $\mathrm{IoU}$ is the intersection over union, $b$ and $b^{'}$ represent the predicted and true bounding boxes, $ρ$ is the distance between the centers of the predicted and true boxes, $a$ represents the diagonal length of the minimum enclosing rectangle of the predicted and true boxes, $ω$ denotes the weight coefficient, and $v$ indicates the aspect ratio consistency between the predicted and true boxes. By dynamically balancing classification and regression losses, CBAM-YOLO v9 achieves both high-confidence classification and precise localization, completing the object detection task.
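The two loss terms in Equations (6) and (7) can be sketched in a few lines of Python. This is a hedged illustration rather than the training code of this study: boxes are assumed to be `(x1, y1, x2, y2)` tuples with positive width and height, and the aspect ratio term `v` and weight `omega` follow the commonly used Complete-IoU definitions, which the text does not spell out. The function names `bce_loss` and `ciou_loss` are hypothetical.

```python
import math

def bce_loss(y_true, p_pred, eps=1e-7):
    """Equation (6): mean binary cross-entropy over N samples."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

def ciou_loss(b, b_gt, eps=1e-9):
    """Equation (7) for a predicted box b and a true box b_gt."""
    # Intersection over union.
    iw = max(0.0, min(b[2], b_gt[2]) - max(b[0], b_gt[0]))
    ih = max(0.0, min(b[3], b_gt[3]) - max(b[1], b_gt[1]))
    inter = iw * ih
    w, h = b[2] - b[0], b[3] - b[1]
    wg, hg = b_gt[2] - b_gt[0], b_gt[3] - b_gt[1]
    iou = inter / (w * h + wg * hg - inter + eps)
    # Squared distance between box centers (rho^2).
    rho2 = ((b[0] + b[2]) / 2 - (b_gt[0] + b_gt[2]) / 2) ** 2 \
         + ((b[1] + b[3]) / 2 - (b_gt[1] + b_gt[3]) / 2) ** 2
    # Squared diagonal of the minimum enclosing rectangle (a^2).
    cw = max(b[2], b_gt[2]) - min(b[0], b_gt[0])
    ch = max(b[3], b_gt[3]) - min(b[1], b_gt[1])
    a2 = cw ** 2 + ch ** 2 + eps
    # Aspect ratio consistency v and its weight omega (assumed CIoU definitions).
    v = (4 / math.pi ** 2) * (math.atan(wg / hg) - math.atan(w / h)) ** 2
    omega = v / (1 - iou + v + eps)
    return 1 - iou + rho2 / a2 + omega * v
```

For two identical boxes the regression loss is approximately zero, and for two non-overlapping boxes it exceeds 1 because the center-distance penalty is added on top of `1 - IoU`.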

3.2 Construction of the laboratory misconduct graded early warning model

Although the constructed CBAM-YOLO v9 algorithm can detect laboratory misconduct precisely, relying solely on the algorithm cannot fully explore potential correlations and risk level differences among different misconduct behaviors. When processing graph-structured data with complex relationships, GCN can construct a multi-dimensional relationship graph of elements in the laboratory scenario and aggregate node correlation information through graph convolution operations. It effectively analyzes dependencies and pattern features between misconduct behaviors, transforming isolated detection results into correlated risk information and providing key spatial association support for subsequent risk level assessment and dynamic early warning of misconduct.¹⁹ The GCN data analysis process is shown in Figure 4.

Figure 4.

Schematic diagram of the GCN data analysis process.

In Figure 4, the input layer receives the original node feature matrix, providing a proper format for subsequent layers. The hidden layer, the core of GCN, consists of multiple stacked graph convolution layers. Repeated aggregation and transformation captures local and global structural information and enables feature interaction among neighboring nodes. The output layer maps the abstract features learned by the hidden layer into specific prediction results according to task design. The graph convolution operation is expressed in Equation (8).

\begin{aligned} H (l + 1) = δ (A H (l) W (l)) \end{aligned}

(8)

In Equation (8), H represents the node feature matrix, l denotes the index of the graph convolution layer, $δ$ represents the non-linear activation function, A is the normalized adjacency matrix, and W denotes the learnable weight matrix. After multiple aggregation and transformation steps, local structure information is gradually abstracted into a global semantic representation and mapped to prediction results, with the output expressed in Equation (9).

\begin{aligned} Z = A H (l - 1) W (l - 1) \end{aligned}

(9)

In Equation (9), Z represents the output node prediction matrix. GCN constructs a multi-dimensional relationship graph that includes personnel, equipment, behaviors, and environment, and uses graph convolution to extract cross-dimensional features and hidden association patterns. This enables risk level assessment and dynamic early warning of misconduct. However, GCN alone cannot capture temporal dynamics and struggles to model the time dependencies of violations. As a temporal model, the Gated Recurrent Unit (GRU) extracts temporal features of behavior sequences through gating mechanisms, modeling the continuity of violations and improving the timeliness of early warning.²⁰ The hybrid GCN-GRU structure is shown in Figure 5.
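The graph convolution in Equations (8) and (9) can be sketched as follows. This is a minimal pure-Python example under stated assumptions: the symmetric normalization with self-loops, the ReLU activation standing in for $δ$, and the helper names `normalize_adjacency`, `matmul`, and `gcn_layer` are choices made for illustration, since the text only states that A is normalized.

```python
import math

def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, one common normalization (assumed)."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def matmul(P, Q):
    """Plain matrix product of two nested lists."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)]
            for row in P]

def gcn_layer(A, H, W, activate=True):
    """Equation (8) when activate is True; the linear output of Equation (9) otherwise."""
    Z = matmul(matmul(A, H), W)
    if activate:
        return [[max(0.0, z) for z in row] for row in Z]  # ReLU as the activation
    return Z

# Two connected nodes with one feature each.
A = normalize_adjacency([[0.0, 1.0], [1.0, 0.0]])
H1 = gcn_layer(A, [[1.0], [3.0]], [[1.0]])        # hidden layer, Equation (8)
Z = gcn_layer(A, H1, [[1.0]], activate=False)     # output layer, Equation (9)
```

In this toy graph each node averages its own feature with its neighbor's, which shows how aggregation mixes information between connected elements.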

Figure 5.

Hybrid model GCN-GRU structure diagram.

As shown in Figure 5, the GCN-GRU model takes multi-layer graph-structured data as input, extracts features through graph convolutional gating layers, and captures topological relationships and feature interactions between nodes. After each graph convolution operation, a non-linear transformation is applied to strengthen the model's capacity to capture complex features and perform deep feature extraction on the input graph data. The features processed by multiple graph convolutional gating layers are passed to a fully connected layer, which integrates and transforms the feature dimensions, maps graph-structured features to the output space, and produces the corresponding results. The graph convolution operation replaces the linear transformations in the memory gate, correlation gate, and candidate hidden state of the GRU, achieving joint learning of spatiotemporal features in graph-structured data and improving feature extraction in complex environments. The memory gate is expressed in Equation (10).

\begin{aligned} U (\overset{⌢}{t}) = σ^{'} (ϕ_{U} G [X (\overset{⌢}{t}), H (\overset{⌢}{t} - 1) + b_{U}^{'}]) \end{aligned}

(10)

In Equation (10), U represents the output of the memory gate, $\overset{⌢}{t}$ denotes the time step, $σ^{'}$ represents the activation function, $ϕ$ is the learnable parameter, G denotes the graph convolution operation, X represents the input, H is the hidden state, and $b^{'}$ represents the memory gate bias. The correlation gate is expressed in Equation (11).

\begin{aligned} r (\overset{⌢}{t}) = σ^{'} (ϕ_{r} G [X (\overset{⌢}{t}), H (\overset{⌢}{t} - 1) + b_{r}^{'}]) \end{aligned}

(11)

In Equation (11), r represents the output of the correlation gate. The candidate hidden state stores selective neuron state information, expressed in Equation (12).

\begin{aligned} C (\overset{⌢}{t}) = \tanh (ϕ_{c} G [X (\overset{⌢}{t}), (r (\overset{⌢}{t}) ⊙ H (\overset{⌢}{t} - 1)) + b_{c}^{'}]) \end{aligned}

(12)

In Equation (12), C denotes the candidate hidden state, and $⊙$ represents the Hadamard product. The new hidden state is then determined from the previous hidden state and the candidate hidden state, as shown in Equation (13).

\begin{aligned} H (\overset{⌢}{t}) = U (\overset{⌢}{t}) ⊙ H (\overset{⌢}{t} - 1) + (1 - U (\overset{⌢}{t})) ⊙ C (\overset{⌢}{t}) \end{aligned}

(13)

In Equation (13), the core is the weighted integration of the historical and candidate states through the memory gate, generating a new hidden state and achieving information retention and update. In summary, this study combines CBAM-YOLO v9 and the GCN-GRU model to construct a laboratory misconduct graded early warning model (YOLO v9-GCN-GRU) to precisely identify misconduct, explore spatial correlations and temporal dynamic features between behaviors, and implement graded risk level early warning. The YOLO v9-GCN-GRU model process is shown in Figure 6.
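A single time step of the gating in Equations (10) to (13) can be sketched in pure Python as below. The sketch makes several assumptions that the text does not fix: the graph convolution G multiplies the row-wise concatenation of the input and hidden state by a normalized adjacency matrix, the bias is added after that convolution, and the names `gate`, `gcn_gru_step`, and the `params` dictionary keys are hypothetical.

```python
import math

def matmul(P, Q):
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)]
            for row in P]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gate(A, X, H, Phi, b, act):
    """act(Phi * G[X, H] + b), where G aggregates neighbors through A."""
    concat = [x + h for x, h in zip(X, H)]   # row-wise concatenation [X, H]
    pre = matmul(matmul(A, concat), Phi)     # graph convolution, then weights
    return [[act(v + b_j) for v, b_j in zip(row, b)] for row in pre]

def gcn_gru_step(A, X, H_prev, params):
    """One time step: X is N x F_x, H_prev is N x F_h, the result is N x F_h."""
    U = gate(A, X, H_prev, params["phi_U"], params["b_U"], sigmoid)   # Eq. (10)
    r = gate(A, X, H_prev, params["phi_r"], params["b_r"], sigmoid)   # Eq. (11)
    rH = [[ri * hi for ri, hi in zip(r_row, h_row)]
          for r_row, h_row in zip(r, H_prev)]
    C = gate(A, X, rH, params["phi_c"], params["b_c"], math.tanh)     # Eq. (12)
    return [[u * h + (1 - u) * c for u, h, c in zip(u_row, h_row, c_row)]
            for u_row, h_row, c_row in zip(U, H_prev, C)]             # Eq. (13)
```

With all weights and biases at zero, both gates output 0.5 and the candidate state is zero, so the new hidden state is half of the previous one. This makes the weighted blend in Equation (13) easy to verify by hand.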

Figure 6.

YOLO v9-GCN-GRU model operation flow chart (Icon source from: iconpark.oceanengine.com).

As shown in Figure 6, in the YOLO v9-GCN-GRU model, laboratory monitoring data first enter the CBAM-YOLO v9 module. GELAN extracts multi-scale features to capture objects and scene information at different scales in video frames. CBAM attention enhances key features, highlighting features critical for misconduct detection and suppressing irrelevant information. After feature processing, extracted features are converted into graph-structured data, representing elements and their relationships as nodes and edges. GCN aggregates node associations in the graph structure to explore spatial feature correlations, while GRU captures temporal dependencies in video sequences and integrates temporal information. Finally, the fully connected layer integrates and classifies the processed features, outputting graded early warning results and achieving detection and risk-level assessment of laboratory misconduct.
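The conversion of detection results into graph-structured data can be illustrated with a small sketch. The text does not specify how edges are defined, so the rule used here (connect two detections when their box centers lie within a distance threshold) is purely an assumption, as are the function name `build_graph`, the `max_dist` parameter, and the example labels.

```python
import math

def build_graph(detections, max_dist):
    """Turn detections into node labels and a symmetric adjacency matrix.

    detections : list of dicts with a 'label' and a 'box' given as (x1, y1, x2, y2).
    max_dist   : hypothetical center-distance threshold for adding an edge.
    """
    centers = [((d["box"][0] + d["box"][2]) / 2, (d["box"][1] + d["box"][3]) / 2)
               for d in detections]
    n = len(detections)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(centers[i], centers[j]) <= max_dist:
                adj[i][j] = adj[j][i] = 1.0
    nodes = [d["label"] for d in detections]
    return nodes, adj

# Hypothetical detections from one video frame.
frame = [
    {"label": "person", "box": (0, 0, 10, 10)},
    {"label": "burner", "box": (5, 5, 15, 15)},
    {"label": "cabinet", "box": (100, 100, 110, 110)},
]
nodes, adj = build_graph(frame, max_dist=20)
```

Here the person and the burner are linked because they are close together, while the distant cabinet stays isolated. An adjacency matrix of this kind is the sort of input a graph convolution layer could then aggregate over.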

4 Performance validation of the graded early warning model based on YOLO v9 and GCN

4.1 Performance analysis of CBAM-YOLO v9 misconduct detection algorithm

To evaluate the performance of the CBAM-YOLO v9 misconduct detection algorithm, the study compared it with Faster Region-based CNN (Faster R-CNN), Squeeze and Excitation-YOLO v8 (SE-YOLO v8), and Fully Convolutional One Stage (FCOS). The experiments were conducted on a Windows 10 system using the PyTorch deep learning framework with the Adam optimizer. The CPU was an Intel Core i9-13900K at 5.8 GHz, the GPU was an NVIDIA RTX 3090 (24 GB), and the memory was 32 GB. The UCF-Crime and ShanghaiTech datasets were used. The average loss was measured, and the results are shown in Figure 7.

Figure 7.

Comparison of average loss training and testing results.

In Figure 7(a), CBAM-YOLO v9 reached an average loss of 0.015 after 120 iterations, with the overall loss curve stabilizing gradually after 60 iterations. Figure 7(b) showed that FCOS exhibited large fluctuations in average loss between 30 and 90 iterations, with the lowest average loss of 0.12. Figure 7(c) indicated that SE-YOLO v8's average loss curve stabilized after 90 iterations. Figure 7(d) revealed that Faster R-CNN reached an average loss of 0.03 at 130 iterations. In summary, CBAM-YOLO v9, by introducing GELAN and CBAM attention, precisely focused on key features of laboratory misconduct and filtered background interference, achieving faster loss convergence than other algorithms. Subsequently, the mean Average Precision (mAP) and Frames Per Second (FPS) of the four algorithms were tested, as shown in Figure 8.

Figure 8.

Comparison of mAP and FPS test results.

Figure 8 indicated that on the UCF-Crime dataset, CBAM-YOLO v9 achieved an mAP of 97.2%, while SE-YOLO v8 achieved 92.7%. CBAM-YOLO v9 reached 93 FPS on UCF-Crime, 12 FPS higher than SE-YOLO v8. On ShanghaiTech, CBAM-YOLO v9 achieved 91 FPS, 24.3 FPS higher than FCOS. By combining channel and spatial attention, CBAM enhanced feature representation for small and occluded objects, allowing CBAM-YOLO v9 to achieve both high accuracy and a high frame rate. The Area Under Curve (AUC) values and F1 scores were tested to further evaluate performance, with results shown in Figure 9.

Figure 9.

Comparison of AUC and F1 score test results.

As shown in Figure 9(a), CBAM-YOLO v9's Receiver Operating Characteristic (ROC) curve was closest to the top-left corner, with an AUC of 0.921. Figure 9(b) indicated that its F1 scores remained above 80%, reaching 93.7% at 120 iterations. In conclusion, CBAM-YOLO v9 demonstrated strong robustness and fast convergence, accurately detecting misconduct behaviors.
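For readers unfamiliar with the AUC metric, the area under an ROC curve can be approximated from a list of operating points with the trapezoidal rule. The sketch below is illustrative only; the function name `roc_auc` is hypothetical, the points are assumed to be sorted by increasing false positive rate, and the example points are not taken from this study.

```python
def roc_auc(fpr, tpr):
    """Trapezoidal area under ROC points sorted by increasing false positive rate."""
    area = 0.0
    for i in range(1, len(fpr)):
        area += (fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1]) / 2
    return area

# A perfect classifier passes through the top-left corner (FPR 0, TPR 1).
perfect = roc_auc([0.0, 0.0, 1.0], [0.0, 1.0, 1.0])
# A random classifier follows the diagonal.
chance = roc_auc([0.0, 1.0], [0.0, 1.0])
```

A curve that hugs the top-left corner approaches an area of 1, while the diagonal of a random classifier gives 0.5, which is why a curve closer to the top-left corner corresponds to a higher AUC.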

4.2 Application validation of the laboratory misconduct graded early warning model

After validating CBAM-YOLO v9, the study assessed the performance of the laboratory misconduct graded early warning model built on it. YOLO v9-GCN-GRU was compared with YOLO v9-GCN, SE-YOLO v8-Transformer (SE-YOLO v8-Trans), and Fully Convolutional One Stage-Spatial Temporal-GCN (FCOS-ST-GCN). The datasets were constructed from Lab A and Lab B of a university. The loss test results are shown in Figure 10.

Figure 10.

Comparison of loss value test results.

Figure 10(a) showed that on the Lab A dataset, YOLO v9-GCN-GRU reached a loss of 0.11 at 60 iterations, while YOLO v9-GCN, SE-YOLO v8-Trans, and FCOS-ST-GCN reached 0.20, 0.22, and 0.27, respectively. Figure 10(b) showed that on the Lab B dataset, YOLO v9-GCN-GRU reached a loss of 0.027 at 120 iterations, with the loss curve stabilizing after 60 iterations. Overall, YOLO v9-GCN-GRU achieved the lowest loss and fastest convergence compared with the other models. To further verify performance, graded accuracy and per-frame processing delay were tested, as shown in Figure 11.

Figure 11.

Comparison of accuracy and single-frame processing delay test results.

Figure 11(a) showed that at 150 iterations, YOLO v9-GCN-GRU reached a graded accuracy of 94.57%, 4.45% higher than YOLO v9-GCN and 7.13% higher than SE-YOLO v8-Trans. Figure 11(b) indicated that its average per-frame processing delay was 29.46 ms, which was 6.79 ms and 2.75 ms lower than FCOS-ST-GCN and YOLO v9-GCN, respectively. In summary, YOLO v9-GCN-GRU achieved significant advantages in graded accuracy and processing delay, enabling fast and precise early warning of misconduct. Precision and recall were tested to analyze overall performance, with results shown in Table 1.

Table 1.

Comparison of precision and recall test results.

Dataset	Model	Precision (%)	Recall (%)
Lab A	YOLO v9-GCN-GRU	92.34	91.82
	YOLO v9-GCN	86.17	86.61
	SE-YOLO v8-Trans	83.03	81.77
	FCOS-ST-GCN	81.78	80.85
Lab B	YOLO v9-GCN-GRU	93.52	92.63
	YOLO v9-GCN	88.21	88.01
	SE-YOLO v8-Trans	84.15	83.44
	FCOS-ST-GCN	82.46	81.67

Table 1 indicated that on the Lab A dataset, YOLO v9-GCN-GRU achieved a precision of 92.34%. On the Lab B dataset, its precision reached 93.52%, while YOLO v9-GCN, SE-YOLO v8-Trans, and FCOS-ST-GCN achieved 88.21%, 84.15%, and 82.46%, respectively. On the same dataset, YOLO v9-GCN-GRU improved recall by 9.19 percentage points compared with SE-YOLO v8-Trans. In conclusion, YOLO v9-GCN-GRU leveraged GCN to enhance risk correlation for graded prediction, reduced false warnings from isolated behavior detection via GRU, and shortened preprocessing time through CBAM's efficient feature extraction, maintaining high graded precision.
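The precision and recall values reported in Table 1 follow the usual definitions based on true positives, false positives, and false negatives. The sketch below shows that computation together with the F1 score; the function name `precision_recall_f1` and the example counts are hypothetical and are not taken from the Lab A or Lab B datasets.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from raw counts, with 0.0 for empty denominators."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1

# Hypothetical counts: 90 correct warnings, 10 false warnings, 10 missed violations.
p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=10)
```

Precision penalizes false warnings and recall penalizes missed violations, so reporting both, as Table 1 does, shows whether a model trades one type of error for the other.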

5 Conclusion

To address the limitations of traditional laboratory safety monitoring methods, such as low effectiveness and insufficient coverage, this study proposed a laboratory misconduct graded early warning model based on the CBAM-YOLO v9 object detection algorithm, further optimized with GRU and GCN. Experimental results showed that the proposed CBAM-YOLO v9 achieved a minimum average loss of 0.027, with the overall loss curve gradually stabilizing after 60 iterations. On the two datasets, its mAP reached 97.2% and 95.4%, while the Area Under Curve (AUC) was 0.921, and the F1 score reached 93.7%. Furthermore, experiments indicated that the YOLO v9-GCN-GRU graded early warning model achieved loss values of 0.031 and 0.027 on the two datasets. The graded accuracy exceeded 94%, with an average per-frame processing delay of 29.46 ms. Both precision and recall exceeded 92%, outperforming comparable models. Overall, the YOLO v9-GCN-GRU model demonstrated high misconduct detection rates and accurate graded early warning performance. Although the proposed model performed well, the coverage of behavior categories was limited, and its generalization capability requires further evaluation. Future work will test more categories of misconduct to improve adaptability in complex dynamic scenarios, providing more comprehensive support for intelligent laboratory safety management.

Footnotes

Acknowledgment

N/A.

Ethical approval

N/A.

Author's contribution

Yubin Wang, writing—original draft, writing—review and editing, conceptualization, methodology.

Funding

The author received no financial support for the research, authorship, and/or publication of this article.

Declaration of conflicting interests

The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data availability

All data generated or analyzed during this study are included in this article.

References

1. Zhao, DeSousa, Henriksen, et al. An assessment of laboratory safety training in undergraduate education. J Chem Educ 2024; 101: 1626–1634.

2. Dunn, Decker, Cartaya-Marin, et al. Reducing risk: strategies to advance laboratory safety through diversity, equity, inclusion, and respect. J Am Chem Soc 2023; 145: 11468–11471.

3. Sirisha, Praveen, Srinivasu, et al. Statistical analysis of design aspects of various YOLO-based deep learning models for object detection. Int J Comput Intell Syst 2023; 16: 126–154.

4. Jiang, Ergu, Liu, et al. A review of Yolo algorithm developments. Procedia Comput Sci 2022; 199: 1066–1073.

5. Mokayed, Quan, Alkhaled, et al. Real-time human detection and counting system using deep learning computer vision techniques. Artif Intell Appl 2023; 1: 221–229.

6. Yan, et al. Deep learning-based substation remote construction management and AI automatic violation detection system. IET Gener Transm Distrib 2022; 16: 1714–1726.

7. Shanti, Cho, SBG, et al. Real-time monitoring of work-at-height safety hazards in construction sites using drones and deep learning. J Saf Res 2022; 83: 364–370.

8. Zhu, Yang. Csb-yolo: a rapid and efficient real-time algorithm for classroom student behavior detection. J Real-Time Image Process 2024; 21: 140–157.

9. Patwal, Diwakar, Tripathi, et al. An investigation of videos for abnormal behavior detection. Procedia Comput Sci 2023; 218: 2264–2272.

10. Patel, Vyas, et al. Motion-compensated online object tracking for activity detection and crowd behavior analysis. Vis Comput 2023; 39: 2127–2147.

11. Ganagavalli, Santhi. YOLO-based anomaly activity detection system for human behavior analysis and crime mitigation. Signal Image Video Process 2024; 18: 417–427.

12. Xiao, Wang, Liu, et al. DHSW-YOLO: a duck flock daily behavior recognition model adaptable to bright and dark conditions. Comput Electron Agric 2024; 225: 109281–109297.

13. Zhao, Shu, Yan, et al. RDD-YOLO: a modified YOLO for detection of steel surface defects. Measurement (Mahwah N J) 2023; 214: 112776–112790.

14. Wang, Song, Zhang, et al. SAT-GCN: self-attention graph convolutional network-based 3D object detection for autonomous driving. Knowl Based Syst 2023; 259: 110080–110103.

15. Zhang, Fan, et al. Deep-learning-driven intelligent tool wear identification of high-precision machining with multi-scale CNN-BiLSTM-GCN. Adv Eng Inf 2025; 65: 103234–103257.

16. Wang, Nie, et al. Gold-YOLO: efficient object detector via gather-and-distribute mechanism. Adv Neural Inf Process Syst 2023; 36: 51094–51112.

17. Magacho, Espagne, Godin. Impacts of the CBAM on EU trade partners: consequences for developing countries. Clim Policy 2024; 24: 243–259.

18. Wang, Tan, Zhang, et al. A CBAM based multiscale transformer fusion approach for remote sensing image change detection. IEEE J Sel Top Appl Earth Obs Remote Sens 2022; 15: 6817–6825.

19. Peng, Ren, et al. Novel GCN model using dense connection and attention mechanism for text classification. Neural Process Lett 2024; 56: 144–165.

20. Guo, Yang, Wang, et al. A novel deep learning model integrating CNN and GRU to predict particulate matter concentrations. Process Saf Environ Prot 2023; 173: 604–613.