Machine learning-based modeling of noise in warehouse environments

Abstract

The expansion of industrial and logistics facilities increases the impact of environmental noise on residential areas, making it essential to apply accurate and reliable methods for noise assessment and prediction. The aim of this study is to model the dispersion of noise generated during warehouse operations and to evaluate the suitability of machine learning methods for predicting noise levels. Noise calculations were performed using Inter-Model Integration (IMMI) software according to the ISO 9613-2 methodology and the requirements of the Lithuanian hygiene standard HN 33:2011. The spatial distribution of noise was visualized in a GIS (Geographic Information Systems) environment by calculating zonal raster statistical indicators. Predictive modeling was performed using five machine learning algorithms - Random Forest, M5P, Multilayer Perceptron, SMOreg, and Linear Regression - implemented within the WEKA environment. Results indicate that noise generated by warehouse operations near the closest residential areas did not exceed the regulatory limits, with the highest noise levels observed in areas of high traffic flow. The machine learning (ML) algorithms demonstrated very high prediction accuracy - all tested models achieved a correlation coefficient (r) above 0.97. ML analysis revealed particularly high predictive accuracy when using Random Forest (r = 0.9984) and M5P (r = 0.9898) algorithms. These findings confirm that integrating zonal raster statistical indicators with machine learning methods is an effective tool for analyzing industrial noise dispersion and can be successfully applied for practical environmental noise assessment and planning purposes.

Keywords

Machine learning, noise, warehouse environments, GIS, correlation coefficient

Introduction

Environmental noise is one of the most significant environmental factors adversely affecting public health and quality of life, especially in urbanized and industrial areas. At the European Union level, noise pollution is recognized as a significant environmental issue linked to sleep disturbances, cardiovascular diseases, and a decline in population well-being.¹ Due to the rapid development of the logistics and warehousing sector, situations are increasingly encountered in which industrial facilities are located within residential areas, creating a need for accurate assessment and prediction of their noise impact.

Traditionally, environmental noise assessment is based on physical sound propagation models, such as the ISO 9613-2 methodology, which is widely applied in industrial noise calculations (ISO 9613-2, 1996). Although these models are standardized and reliable, the literature notes that they do not always accurately reflect complex spatial and nonlinear patterns of noise propagation, particularly in areas with diverse infrastructure and variable noise sources.² For this reason, increasing attention has been paid in recent years to the integration of advanced data analysis methods into noise research. Givargis and Karimi (2022) emphasize that in industrial settings, traditional linear regression often falls short compared to Artificial Neural Networks, which better capture these interdependencies.

Machine learning (ML) methods are becoming increasingly popular in environmental noise modeling, as they enable efficient processing of large datasets, identification of complex nonlinear relationships, and achievement of high predictive accuracy.³ Studies show that models such as Random Forest, Gradient Boosting, or neural networks often outperform traditional regression techniques in predicting noise levels in both urban and industrial areas.⁴

The integration of geographic information systems (GIS) and machine learning methods occupies an important place in contemporary noise research. GIS enables detailed analysis of the spatial distribution of noise, while the use of zonal raster statistical indicators provides the opportunity to summarize noise characteristics across different areas.³ Studies by Liu et al. (2021) have shown that combining traditional propagation models with the Random Forest algorithm can significantly improve the accuracy of high-resolution noise maps. Similar conclusions are reported by other authors, emphasizing that nonlinear ML models better capture the influence of extreme noise values and spatial variability on overall noise levels.

Despite the growing interest in the application of machine learning to noise modeling, most studies focus on transport or urban noise, while the prediction of noise from industrial and warehousing facilities - especially using zonal raster statistical indicators - remains insufficiently explored. While this study focuses on macro-level noise propagation from warehouse operations, it is important to note that the broader field of noise and vibration control encompasses various specialized mitigation strategies. For instance, research has addressed the modification of vibratory behavior in mechanical systems to reduce injury,⁵ the use of advanced metamaterial absorbers for wave control in infrastructure,⁶ and the development of efficient numerical methods for solving complex non-linear equations in damping systems.⁷ Furthermore, the literature emphasizes the importance of employing not only accurate but also interpretable models that facilitate a deeper understanding of noise generation mechanisms and inform evidence-based noise management decisions.⁸ The scientific novelty of this study lies in the application of zonal raster statistics as multi-dimensional predictors for machine learning, creating a highly efficient computational ‘digital twin’ of the ISO 9613-2:1996 standard. This approach allows for rapid noise assessment in large-scale industrial zones without the need for repetitive and time-consuming 3D simulations, offering a scalable framework for industrial noise management.

This study aims to model the propagation of environmental noise caused by warehouse operational activities and to evaluate the suitability of machine learning (ML) methods for predicting noise levels using spatial geographic information systems (GIS) raster-based statistical indicators.

Materials and methods

Environmental and warehouse-site noise sources

First, preliminary information on the warehouse operations and the immediate surroundings of the site was collected. The data required for noise propagation modeling were then gathered by identifying potential noise sources and assigning their noise emission parameters.

During warehouse operation at maximum production capacity, approximately 20 heavy-duty vehicles and about 130 passenger cars enter and leave the site per day (between 08:00 and 24:00), of which around 20 passenger cars per day arrive specifically at the warehouse. Employee transport and other arriving passenger vehicles are parked within the company’s premises in parking lots with a total of 55 spaces. The parking lot areas, taking into account the number of parking spaces, are treated as area noise sources.

Inside the warehouse, products are transported using electric forklifts (2 units) and electric pallet trucks (7 units). Based on data provided by various manufacturers, the sound power levels emitted to the environment by electric pallet trucks range from 66 to 70 dBA; therefore, the worst-case value of 70 dBA was assumed in the calculations.

The dominant noise sources within the building are those generated by equipment used in technological processes. These noise sources are located inside the premises; therefore, environmental noise will be effectively attenuated by internal partitions and the enclosed external building envelope, which consists of multilayer wall panels (sandwich-type) with 120 mm PIR insulation infill, glazed units, doors, and gates.

For the calculations, it was assumed that the sound insulation index of the walls is not less than Rw = 27 dBA (see Table 1).

Table 1.

Technical and acoustic parameters of the building.

Object	Wall thickness, mm	Wall type	Sound insulation
Warehouse building	120	“Sandwich” panel	Rw = 27 dBA

None of the equipment inside the building is characterized by high noise levels (no unit exceeds 85 dBA); however, as a worst-case scenario, a noise level of 85 dBA is assumed in the production area during modeling. This assumption is made because employees work near noisy equipment and, in accordance with the Order of the Minister of Social Security and Labour of the Republic of Lithuania and the Minister of Health of the Republic of Lithuania of 15 April 2005 No. A1-103/V-265 “On the Approval of Regulations on the Protection of Employees from Risks Related to Noise,” as amended on 25 June 2013 by Order No. A1-310/V-640 (Vilnius), the upper exposure action value for noise in employees’ work areas, L_EX,8h, must not exceed 85 dBA (Figure 1).

Figure 1.

Analyzed area and noise sources.

Installed on the building roof are roof-mounted fans (8 units), external cooling units for air conditioning (5 units), and external cooling units for ventilation sections (3 units). This equipment operates during daytime, evening, and nighttime periods. A diesel generator installed outdoors serves as a backup (emergency) power supply and operates approximately three times per year; therefore, its noise is not assessed. More detailed information on noise sources is provided in Table 2.

Table 2.

Noise sources.

Source location	Noise source name	Number of sources/daily flow	Noise emission level	Noise source location	Operating time
Yard	Heavy-duty vehicles (delivering and transporting stored products)	20 vehicles	-	Outdoors	08:00–24:00
	Light vehicle traffic flow	130 cars	-	Outdoors	08:00–24:00
	Passenger vehicles (55-space parking lot)	55 cars	-	Outdoors	08:00–24:00
Warehouse	Loading/unloading operations using electric forklifts	2 pcs.	79 dBA	Indoors	08:00–24:00
	Electric pallet trucks	7 pcs.	70 dBA	Indoors	08:00–24:00
	Ventilation equipment	5 pcs.	51–69 dBA	Indoors	24 hours
	Ceiling cassette units in the administrative area	29 pcs.	28 dBA	Indoors	24 hours
	Air curtains in the production area	5 pcs.	65 dBA	Indoors	24 hours
	Air conditioners in the warehouse	8 pcs.	50 dBA	Indoors	24 hours
	Roof-mounted fans	9 pcs.	68–72 dBA	Outside on the roof	24 hours
	Outdoor cooling systems	13 pcs.	30–87 dBA	Outside on the roof	24 hours

Residential environment

The nearest residential building is located approximately 73 m from the boundary of the analyzed operational site. Other residential buildings and their protected (residential) areas are situated more than 600 m away; therefore, noise level calculations were performed only for the nearest residential location (see Figure 2).

Figure 2.

The residential buildings located closest to the warehouse’s operational activities.

Assessment method

Noise from warehouse operations is evaluated using the Lday, Levening, and Lnight noise indicators.

Acoustic modeling was conducted to determine whether the warehouse’s operation could lead to exceedances of regulatory noise limits, and, if so, to select measures to prevent them (see Tables 3 and 4).

Table 3.

Terms and conditions of legal documents and recommendations⁹.

Document	Terms, recommendations
The Noise Management Act of the Republic of Lithuania, approved on 26 October 2004, No. IX–2499 (Official Gazette, 2004, No. 164–5971).	Noise limit value shall mean an average value of L_day, L_evening, or L_night, the exceeding of which causes the manager of a noise source to enforce noise prevention and/or reduction measures.
Directive 2002/49/EC of the European Parliament and of the Council of 25 June 2002 relating to the assessment and management of environmental noise.	Annex II. Assessment methods for noise indicators.
	For industrial noise: ISO 9613-2: ‘Acoustics. Abatement of sound propagation outdoors. Part 2. The general method of calculation’.
	For road traffic noise: The French national computation method ‘NMPB–Routes–96 (SETRA–CERTU–LCPC–CSTB)’, referred to in ‘Arrêté du 5 mai 1995 relatif au bruit des infrastructures routières, Journal Officiel du 10 mai 1995, Article 6’ and French standard ‘XPS 31–133’.
	The above-mentioned methodology is also recommended by the Lithuanian hygiene regulation document HN 33:2011.
Lithuanian hygiene regulation HN 33:2011: Noise limit values in residential and public buildings and their environment, approved by the Minister of Health of the Republic of Lithuania on 13 June 2011 by Order No. V-604.	This hygiene regulation determines the limit values of noise emitted by noise sources in residential and public buildings and their surroundings and is applied when assessing the impact of noise on public health.

Table 4.

Noise limit values for residential and public buildings and their environment (HN 33:2011).

Measurement sites	Time of day, h	Equivalent sound pressure level (L_AeqT), dBA	Maximum sound pressure level (L_AFmax), dBA
Dwellings in residential buildings (houses), bedrooms in public buildings, wards in inpatient health care institutions.	7–19	45	55
	19–22	40	50
	22–7	35	45
In an environment of residential buildings (houses) and public buildings (except the catering facilities and culture centers), excluding transport noise.	7–19	55	60
	19–22	50	55
	22–7	45	50

Data preparation and spatial analysis

Noise calculations were performed using the IMMI software, applying the noise sources listed in Table 2. The calculations took into account building heights, Rw values, terrain, meteorological conditions, and the noise-absorbing properties of the area. The modeled noise indicators were Lday (12 h), Levening (3 h), and Lnight (9 h). The noise generated by the analyzed site was assessed according to the limit values of HN 33:2011, which are intended for evaluating noise from industrial facilities. The assessment also considered the time of day during which the noise sources operate.
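For orientation, the geometric divergence term of ISO 9613-2 can be evaluated by hand. The sketch below is not the IMMI calculation: it keeps only the divergence attenuation for a point source, A_div = 20 log10(d/d0) + 11 dB with d0 = 1 m, and ignores the atmospheric, ground, barrier, and meteorological terms that the full model includes. The function names are illustrative, and the 73 m distance (site boundary to the nearest residential building) is used only as an example value.

```python
import math

def divergence_attenuation(distance_m, reference_m=1.0):
    """Geometric divergence A_div in dB for a point source."""
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    return 20.0 * math.log10(distance_m / reference_m) + 11.0

def level_from_point_source(sound_power_db, distance_m):
    """Receiver level when only geometric divergence is considered.

    This is a deliberate simplification (assumption): all other
    attenuation terms of the standard are set to zero.
    """
    return sound_power_db - divergence_attenuation(distance_m)

# Divergence attenuation over the example distance of 73 m.
a_div_73 = divergence_attenuation(73.0)
print(f"A_div at 73 m: {a_div_73:.1f} dB")
```

Because the remaining attenuation terms only reduce the level further, a divergence-only estimate of this kind gives a conservative upper bound for a single unobstructed source.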

The resulting noise level calculations were visualized on maps using ArcGIS Pro software, with different color intervals representing every 5 dBA. Noise dispersion was calculated at a height of 1.5 m, with a grid resolution of dx = 5 m and dy = 5 m.

The raster statistical indicators (MEAN, MIN, MAX, RANGE, and STD) were used in this study as descriptors of the spatial variability and structural characteristics of the modeled noise field. These indicators were not treated as fully independent environmental predictors, but rather as analytical variables representing local spatial noise dispersion patterns derived from the standardized acoustic model output.
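The five indicators can be illustrated with a small, self-contained sketch. It is not the ArcGIS Pro zonal statistics tool used in the study: the grids, the zone labels, the use of `None` for NoData cells, and the choice of the population standard deviation are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import fmean, pstdev

def zonal_statistics(raster, zones):
    """Return {zone_id: {MEAN, MIN, MAX, RANGE, STD}} for two equally shaped grids."""
    cells = defaultdict(list)
    for value_row, zone_row in zip(raster, zones):
        for value, zone_id in zip(value_row, zone_row):
            if value is not None:  # None marks a NoData cell (assumption)
                cells[zone_id].append(value)
    stats = {}
    for zone_id, values in cells.items():
        low, high = min(values), max(values)
        stats[zone_id] = {
            "MEAN": fmean(values),
            "MIN": low,
            "MAX": high,
            "RANGE": high - low,
            "STD": pstdev(values),  # population standard deviation (assumption)
        }
    return stats

# Hypothetical 3 x 4 noise raster (dBA) and matching zone labels.
noise = [
    [40.0, 42.0, 55.0, 57.0],
    [41.0, 43.0, 56.0, 60.0],
    [40.0, 42.0, 58.0, 59.0],
]
zone_ids = [
    ["A", "A", "B", "B"],
    ["A", "A", "B", "B"],
    ["A", "A", "B", "B"],
]
result = zonal_statistics(noise, zone_ids)
for zone_id, row in sorted(result.items()):
    print(zone_id, row)
```

Each zone then contributes one row of five predictor values, which is the tabular form that a regression learner expects.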

Application of machine learning models

The obtained statistical variables were used as input data in the WEKA software environment to determine the most accurate noise level prediction model. The machine learning analysis performed in this study has a methodological and exploratory character, aiming to evaluate the capability of ML algorithms to identify relationships between raster-derived spatial statistical indicators and modeled environmental noise levels. Five different models were applied in the study: Random Forest, M5P (model trees), Multilayer Perceptron (artificial neural networks), SMOreg, and Linear Regression.

The reliability of the models was evaluated using the correlation coefficient (r) and the mean absolute error (MAE). An attribute importance analysis was conducted using the Correlation Ranking Filter.
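Both evaluation measures follow directly from their definitions. The sketch below computes the Pearson correlation coefficient and the MAE for a short, invented list of modeled and predicted noise levels; the numbers are hypothetical and are not taken from the study.

```python
import math

def pearson_r(actual, predicted):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)

def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical modeled (actual) and ML-predicted noise levels in dBA.
actual = [40.0, 45.0, 50.0, 55.0, 60.0]
predicted = [40.5, 44.5, 50.5, 54.5, 60.5]

r = pearson_r(actual, predicted)
mae = mean_absolute_error(actual, predicted)
print(f"r = {r:.4f}, MAE = {mae:.4f} dBA")
```

The two measures are complementary: r describes how well predictions track the ordering and spread of the target, while MAE expresses the typical error in the unit of the target (dBA).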

Model training and validation

To ensure the reliability of the predictive models and to prevent overfitting, a 10-fold cross-validation procedure was employed for all machine learning algorithms in the WEKA environment. In this process, the dataset was randomly partitioned into ten equal-sized subsamples; nine subsamples were used for training the model, and the remaining single subsample was used for testing. This process was repeated ten times, with each subsample used exactly once as the validation data. The exceptionally high correlation coefficients observed (e.g., r = 0.9984 for Random Forest) are attributed to the deterministic nature of the training data generated by the ISO 9613-2 methodology. Unlike empirical field measurements, which contain stochastic environmental noise and measurement errors, the simulated data follow consistent physical laws that the ML models can learn with high precision. All input features were checked for potential data leakage to ensure that no information from the target variable was present in the training set.
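The partitioning logic of k-fold cross-validation can be sketched in a few lines. This is an illustration of the procedure, not WEKA's internal implementation; the function name and the random seed are assumptions. The sample size of 1808 matches the number of spatial objects reported in the Results, so the folds hold 180 or 181 objects each rather than being exactly equal.

```python
import random

def k_fold_indices(n_samples, k=10, seed=1):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)       # random partition, reproducible
    folds = [indices[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i, test_idx in enumerate(folds):
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, test_idx

splits = list(k_fold_indices(1808, k=10))
for fold_number, (train_idx, test_idx) in enumerate(splits, start=1):
    print(fold_number, len(train_idx), len(test_idx))
```

Every object appears in exactly one test fold, so each prediction used for the reported metrics comes from a model that did not see that object during training.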

Model implementation and hyperparameters

To ensure reproducibility, all machine learning models were implemented in the WEKA (v3.8) environment using the following hyperparameter configurations:

• Random Forest: 100 iterations (trees) were used to build the ensemble. No maximum depth limit was set for individual trees to allow the model to capture complex nonlinear relationships in the spatial data.

• M5P: This model combined a decision tree with linear regression functions at the leaves. Pruning was enabled to prevent overfitting and ensure the interpretability of the model’s logic.

• Multilayer Perceptron (MLP): The architecture consisted of a sigmoid-based hidden layer. The learning rate was set to 0.3, and the momentum to 0.2, with 500 training epochs. Input data were automatically normalized within the WEKA MLP wrapper.

• SMOreg: A Support Vector Machine for regression was implemented using a polynomial kernel. Data were normalized by default to ensure the efficiency of the hyperplane separation.

• Linear Regression: A standard least-squares approach was used, with an integrated attribute selection method (M5 method) to handle potential multicollinearity between zonal indicators.

Regarding data preprocessing, the Correlation Ranking Filter was applied before training to evaluate the predictive power of the input features. The analysis revealed that the RANGE and MAX zonal indicators had the highest correlation weights (0.93 and 0.91, respectively), confirming their suitability for the model.
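A correlation-based attribute ranking can be sketched as follows, assuming the filter scores each attribute by the absolute Pearson correlation between that attribute and the target and then sorts the scores. The feature values below are invented for illustration and do not reproduce the weights reported in the study.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def rank_by_correlation(features, target):
    """Return (name, |r|) pairs sorted from strongest to weakest correlation."""
    scores = {name: abs(pearson_r(values, target)) for name, values in features.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical zonal indicators and target noise levels (dBA).
features = {
    "RANGE": [2.0, 4.0, 6.0, 8.0, 10.0],
    "MAX": [50.0, 53.0, 52.0, 58.0, 59.0],
    "MIN": [48.0, 47.0, 49.0, 46.0, 50.0],
}
target = [45.0, 47.0, 49.0, 51.0, 53.0]

ranking = rank_by_correlation(features, target)
for name, weight in ranking:
    print(f"{name}: {weight:.4f}")
```

A ranking of this kind scores each attribute in isolation, so it does not account for redundancy among attributes that are derived from the same raster.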

Results

The calculation results show that, during the daytime, the noise level along the northern boundary of the company’s land plot may exceed the limit value by approximately 1-3 dBA. This exceedance is caused by the access/exit road to the company’s territory, which runs along the northern boundary of the plot. The calculations determined that noise generated by the equipment located inside the warehouse building and on the roof (e.g., ventilation systems) reaches a maximum level of less than 25 dBA at the nearest residential boundaries. This level is significantly below the existing ambient background noise and does not contribute to a measurable increase in the total noise load on the residential environment. The noise level at the residential building is 48.2 dBA during the daytime, 43.5 dBA in the evening, and 39.5 dBA at night (see Table 5). The noise modelling of the warehouse’s economic activity showed that the noise levels at the façades of the nearest residential building and in its surroundings comply with the limit values of HN 33:2011 (see Figure 3).

Table 5.

Calculated noise levels of the warehouse acoustic situation.

Address	Calculation height	Lday, dBA	Levening, dBA	Lnight, dBA
Northern boundary of the plot	1.5 m	56.3	51.9	48.4
Eastern boundary of the plot	1.5 m	51.2	47.7	43.8
Southern boundary of the plot	1.5 m	42.6	45.6	44.4
Western boundary of the plot	1.5 m	46.3	42.9	37.7
Residential environment	1.5 m	48.2	43.5	39.5
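The comparison with the limit values can be written out explicitly. The sketch below takes the outdoor limits for the environment of residential buildings from Table 4 (55, 50, and 45 dBA) and the calculated levels for the residential environment from Table 5; the helper name is illustrative, and a negative margin means the level is below the limit.

```python
# Outdoor limit values for the environment of residential buildings (Table 4), dBA.
LIMITS = {"Lday": 55.0, "Levening": 50.0, "Lnight": 45.0}

def margins_to_limit(levels, limits=LIMITS):
    """Return level minus limit per indicator; positive values mean exceedance."""
    return {name: round(levels[name] - limits[name], 1) for name in limits}

# Calculated levels at the residential environment (Table 5), dBA.
residential = {"Lday": 48.2, "Levening": 43.5, "Lnight": 39.5}

margins = margins_to_limit(residential)
compliant = all(value <= 0 for value in margins.values())
print(margins, "compliant:", compliant)
```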

Figure 3.

Modeled noise dispersion maps for: (a) Lday, (b) Levening, (c) Lnight.

The machine learning analysis was conducted using a sample of 1808 spatial objects derived from zonal raster statistical indicators. The dependent variable used in the analysis was the level of noise pollution. The following machine learning algorithms were applied: Random Forest, M5P, Multilayer Perceptron, SMOreg, and Linear Regression, all of which demonstrated very high predictive accuracy. In all cases, the correlation coefficient exceeded 0.97, indicating an extremely strong relationship between the independent variables and the predicted noise values.

The highest accuracy was achieved using the Random Forest algorithm, which reached a correlation coefficient of 0.9984 and a mean absolute error (MAE) of 0.0785. This demonstrates that raster statistical indicators enable highly accurate prediction of noise levels. The M5P model tree ranked second in terms of accuracy, with a correlation coefficient of 0.9898 and MAE of 0.1933.

The neural network model (Multilayer Perceptron) also showed high accuracy (correlation coefficient of 0.9886); however, its error (MAE of 0.2933) was higher than that of the tree-based algorithms. The weakest results were obtained with SMOreg (correlation coefficient of 0.9723, MAE of 0.3500) and Linear Regression (correlation coefficient of 0.9725, MAE of 0.3609): SMOreg had the lowest correlation coefficient and Linear Regression the highest error, indicating that the relationships within the analyzed data are not purely linear.

The comparative analysis performed in this study revealed a clear trend indicating that tree-based algorithms (Random Forest and M5P) describe the spatial patterns of noise pollution more effectively than linear or neural network models. This suggests that non-linear and hierarchical relationships between environmental factors have a significant influence on noise level formation.

The Random Forest algorithm stood out as the most suitable for practical prediction due to its lowest error, while the M5P model proved to be an optimal solution for result interpretation and rule analysis (see Table 6).

Table 6.

Comparison of machine learning models.

Algorithm	Correlation coefficient	Mean absolute error (MAE)
RandomForest	0.9984	0.0785
M5P (model tree)	0.9898	0.1933
MultilayerPerceptron	0.9886	0.2933
SMOreg	0.9723	0.3500
LinearRegression	0.9725	0.3609

The attribute importance analysis conducted using the Correlation Ranking Filter revealed that the value range (RANGE) has the greatest influence on noise pollution levels, with a correlation weight of 0.9316. The second most important indicator was the maximum value (MAX), with a weight of 0.9063.

The structure of the M5P model tree showed that the primary data-splitting criterion is RANGE ≤ 11.45. This enables the identification of two fundamentally different zones: areas with low value dispersion and areas with high dispersion.

In high-dispersion zones, the model becomes particularly sensitive to maximum values (MAX), indicating that extreme noise peaks in such areas play a decisive role in determining the overall noise level. In low-dispersion zones, predictions were more accurate, with minimal error.
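The idea of a model tree, a split followed by a separate linear model in each branch, can be sketched with a single fixed split. This is a strong simplification and not the M5P algorithm itself, which selects splits automatically, prunes and smooths the tree, and fits multivariate linear models in the leaves. Only the threshold of 11.45 comes from the text; the data rows, the choice of MAX as the single leaf predictor, and the resulting coefficients are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def fit_one_split_tree(rows, threshold=11.45):
    """rows: (RANGE, MAX, noise) tuples. Fit one linear leaf on MAX per branch."""
    low = [r for r in rows if r[0] <= threshold]
    high = [r for r in rows if r[0] > threshold]
    return {
        "threshold": threshold,
        "low": fit_line([r[1] for r in low], [r[2] for r in low]),
        "high": fit_line([r[1] for r in high], [r[2] for r in high]),
    }

def predict(tree, range_value, max_value):
    """Route the sample by RANGE, then apply the leaf's linear model to MAX."""
    branch = "low" if range_value <= tree["threshold"] else "high"
    slope, intercept = tree[branch]
    return slope * max_value + intercept

# Hypothetical training rows: (RANGE, MAX, noise level), all in dBA.
rows = [
    (3.0, 42.0, 41.0), (5.0, 46.0, 45.0), (8.0, 50.0, 49.0),
    (12.0, 60.0, 54.0), (15.0, 64.0, 56.0), (20.0, 70.0, 59.0),
]
tree = fit_one_split_tree(rows)
print(predict(tree, 4.0, 44.0), predict(tree, 18.0, 66.0))
```

Because each leaf is an explicit linear equation, the fitted rules can be read directly, which is the interpretability advantage attributed to M5P in this study.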

The obtained results confirm that the spatial patterns of noise pollution can be modeled with very high accuracy using zonal raster statistical indicators and advanced machine learning methods. Tree-based algorithms proved to be especially effective, as they are capable of capturing both non-linear relationships and clear rule-based logical structures.^10–14

Discussion

The noise pollution model proposed in this study, based on the application of spatial raster statistical indicators and machine learning methods, demonstrated high predictive accuracy, particularly through the use of Random Forest and the M5P decision tree. Analysis of the results revealed that non-linear models are more effective at capturing complex noise dispersion patterns than linear models, aligning with observations by other researchers regarding the superiority of non-linear methods in noise modeling.¹⁵

Similar to our study, Random Forest-based methods are frequently identified as effective for predicting noise levels. Research combining GIS with Random Forest and Gradient Boosting exhibits higher predictive accuracy than traditional regression techniques, especially when spatial and topographical variables are incorporated into the model.⁴ This indicates that such models better capture complex spatial noise dispersion patterns and may be more useful in practical urban noise studies.¹⁶

Another significant study² introduced a hybrid noise prediction method that combines dispersion models with machine learning models, such as Random Forest, to improve predictions of spatial noise distribution and reduce model errors. This confirms the model’s effectiveness in complex spatial environments where traditional dispersion models may be inaccurate due to complex environmental conditions.¹⁷

The literature also provides examples of applying deep neural networks to model noise propagation in three-dimensional environments. For instance, a study utilizing a model based on convolutional layers achieved a low error rate in noise pressure prediction while significantly reducing computation time compared to traditional methodologies.⁸ Although such deep learning methods often require larger datasets and computational resources, their ability to process complex spatial and contextual relationships represents an important direction for development.¹⁸

Another aspect concerns the inclusion of infrastructural factors in noise models. Studies using street view imagery in conjunction with ML models have found that incorporating street environment features improves the predictive accuracy of noise models. Non-linear models like XGBoost can detect critical spatial relationships that traditional analysis methods miss.¹⁹ This suggests that multispectral variables, moving beyond simple dispersion statistics, can significantly improve noise predictions, particularly in urban contexts.

While many authors emphasize the application of ML methods to noise modeling, it is important to note that data quantity and quality are essential prerequisites for success.^3,20,21 In most cases, large, diverse, and accurately measured noise monitoring samples are required to achieve high accuracy.^1,15,22 In our study, although high predictive indicators were achieved, the accuracy of the ML models also partly depends on the quality of the spatial variables and their alignment with real-world noise source characteristics.

It is important to acknowledge that the raster statistical indicators used as predictor variables were derived from the same modeled noise raster used to define the target variable. Consequently, a degree of redundancy and spatial autocorrelation between predictors and target values is inherently present in the dataset. Therefore, the very high predictive accuracy achieved by several ML models partly reflects these mathematical and spatial relationships rather than completely independent predictive capability.

Future research should consider additional data sources, such as meteorological data or infrastructure parameters, which could further enhance the model’s ability to reflect real noise conditions.^2,8,23

In summary, the application of machine learning methods in noise dispersion modeling is an intensively researched field, where neural networks and hybrid solutions are frequently used to improve prediction quality and produce better spatial noise maps.

Although the machine learning models in this study were trained using noise data simulated according to the ISO 9613-2 methodology, this approach serves as a critical first step in validating the integration of zonal raster statistics with predictive algorithms. The use of standardized simulation data allowed for a controlled environment to evaluate the models’ ability to capture non-linear noise propagation patterns without the interference of complex background noise, such as the high-intensity traffic from the nearby A1 highway. This methodology aligns with recent trends where hybrid models are trained on established physical propagation outputs to improve computational efficiency in environmental planning. While field measurements remain the ultimate empirical benchmark, the current analysis demonstrates that ML can effectively replicate the regulatory assessment logic required by standard HN 33:2011.

While the ML models in this study approximate a deterministic dataset generated by the ISO 9613-2 methodology, their added value lies in computational efficiency and diagnostic interpretability. Traditional acoustic modeling in software like IMMI requires detailed 3D spatial data and significant computational time for large-scale territories. In contrast, the trained ML models developed in this study can provide near-instantaneous noise predictions using only 2D zonal raster statistics as inputs. This makes them a highly efficient “proxy model” for rapid urban planning and preliminary environmental impact assessments.

Furthermore, by utilizing feature importance analysis, this approach reveals the underlying spatial drivers of noise distribution - such as the critical role of the RANGE indicator - which traditional deterministic models do not explicitly quantify. Thus, the ML integration serves not just as a replication of ISO 9613-2, but as a diagnostic tool that enhances the understanding of spatial-acoustic relationships.

The uniqueness of this study lies in the integration of zonal raster statistical indicators with interpretable machine learning models. This approach allows not only for highly accurate noise pollution forecasting but also reveals the role of internal spatial dispersion and extreme values in the process of noise formation.

Study limitations and future perspectives

Despite the high predictive accuracy achieved, this study has several limitations that must be acknowledged. First, the ML models were trained and validated exclusively on simulated data generated by the ISO 9613-2 methodology. While this approach is valid for creating a “digital twin” of standardized noise propagation, it does not account for real-world uncertainties such as fluctuating meteorological conditions or transient background noise that field measurements would capture. Therefore, the reported accuracy reflects the models’ ability to replicate a deterministic standard rather than empirical reality. Second, the current model’s generalization is limited to the specific spatial configuration of the analyzed warehouse site. While the methodological framework - utilizing zonal raster statistics - is highly scalable and could be applied to other industrial contexts, the specific model weights and decision rules (e.g., in the M5P tree) are site-dependent. Future research should focus on integrating multi-site data and in situ acoustic measurements to enhance the models' robustness and evaluate their performance across diverse environmental settings.

Conclusion

Noise dispersion modeling results indicate that the noise generated by the warehouse’s economic activities does not exceed the limit values set by HN 33:2011 at the nearest residential environment (located approximately 73 m away). The noise levels at the residential building reached 48.2 dBA during the day, 43.5 dBA in the evening, and 39.5 dBA at night. The contribution of internal equipment and rooftop units remained negligible, with calculated levels below 25.0 dBA at the receptors.

Random Forest was identified as the most accurate algorithm (r = 0.9984, MAE = 0.0785), making it the most suitable for practical noise level forecasting. The M5P model tree was designated as the optimal solution for result interpretation and rule analysis (r = 0.9898). Tree-based algorithms (Random Forest and M5P) described noise dispersion more effectively than linear or neural models due to their ability to capture non-linear and hierarchical relationships among environmental factors.

Attribute importance analysis revealed that the range of values (RANGE) with a correlation weight of 0.9316 and the maximum value (MAX) with a correlation weight of 0.9063 have the greatest influence on noise pollution levels.

The integration of zonal raster statistical indicators and advanced ML methods is an effective tool for industrial noise analysis. This methodology allows not only for accurate noise prediction but also for a better understanding of its formation mechanisms, which is crucial when planning noise management measures.

The high correlation coefficients achieved (r > 0.97) indicate that machine learning models can closely reproduce the results of ISO 9613-2:1996 calculations and thus function as a “digital twin” of the standard, providing a significantly faster alternative for noise prediction during the planning stages of industrial facilities.

These findings are based on a single warehouse case study. While the developed methodological framework, which uses zonal raster statistics, is scalable and adaptable, the specific predictive rules and noise dispersion patterns identified are site-specific. Therefore, further research involving a wider range of industrial facilities is required to generalize these conclusions to broader industrial contexts.

Despite the reliance on simulated data for model training, this study establishes a robust methodological framework for spatial noise analysis. Future research will focus on integrating in situ noise measurements to further calibrate the models and assess the variance between theoretical propagation and real-world acoustic environments.

Footnotes

Declaration of conflicting interests

The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author received no financial support for the research, authorship, and/or publication of this article.

ORCID iD

Vytaute Juodkiene

References

1. European Commission. Directive 2002/49/EC of the European Parliament and of the Council of 25 June 2002 relating to the assessment and management of environmental noise. Official Journal of the European Communities 2002.

2. Liu, Oiamo, Rainham, et al. Integrating random forests and propagation models for high-resolution noise mapping. Environ Res 2021; 195: 110905. https://doi.org/10.1016/j.envres.2021.110905

3. Oiamo, Davies, Rainham, et al. Using spatial statistics and GIS-based metrics to characterize environmental noise exposure. Int J Environ Res Publ Health 2020; 17(9): 3146.

4. Almansi, Ujang, Azri, et al. Traffic noise prediction model using GIS and ensemble machine learning: a case study at Universiti Teknologi Malaysia (UTM) campus. Environ Sci Pollut Res Int 2024; 31(51): 60905–60926. https://doi.org/10.1007/s11356-024-35243-0

5. Afsharfard, Jafari, Rad, et al. Modifying vibratory behavior of the car seat to decrease the neck injury. J Vib Eng Technol 2023; 11: 1115–1126. https://doi.org/10.1007/s42417-022-00627-4

6. Jafari, Afsharfard. Metamaterial bi-stable vibration absorbers for railway tracks: experimental study of flexural wave control, 2025.

7. Afsharfard, Farshidianfar. An efficient method to solve the strongly coupled nonlinear differential equations of impact dampers. Arch Appl Mech 2012; 82: 977–984. https://doi.org/10.1007/s00419-011-0605-1

8. Tan, Bao, et al. Machine learning-based prediction of drone noise propagation in three-dimensional urban environments. J Acoust Soc Am 2025; 158(2): 946–957. https://doi.org/10.1121/10.0038749

9. Juodkiene, Rekus. Noise dispersion modelling in the planned logistics warehouse and residential area. Civ Environ Eng 2025; 0(0), University of Žilina. https://doi.org/10.2478/cee-2026-0035

10. Ahmadi, et al. Monitoring and modeling of industrial noise pollution using machine learning techniques: a data-driven approach. Environ Monit Assess 2022; 194(8): 582.

11. Givargis, Karimi. A comparative study of artificial neural networks and multiple linear regression for industrial noise prediction. J Environ Health Sci Eng 2022; 20(1): 115–128.

12. International Organization for Standardization. ISO 9613-2:1996. Acoustics - attenuation of sound during propagation outdoors - part 2: general method of calculation. Geneva: ISO, 1996.

13. Lithuanian hygiene regulation HN 33:2011: noise limit values in residential and public buildings and their environment, approved by the minister of health of the Republic of Lithuania on June 13, 2011, by order no. V-604.

14. Torija, Self. A machine learning approach for the assessment of environmental noise impact. Appl Acoust 2023; 202: 109144.

15. Bande, Aris, Yusof MFM, Shith ARM, Noor. Comparison of machine learning algorithms for environmental noise mapping in smart cities. Appl Soft Comput 2022; 121: 108745.

16. Krizan, Gasparovic, Jogun. Spatial analysis and modeling of urban noise using advanced machine learning regression. Remote Sens 2023; 15(4): 1023.

17. Park, Chung. Hybrid modeling for industrial noise assessment combining computational fluid dynamics and random forests. Sustain Energy Technol Assessments 2024; 62: 103567.

18. Verma, Singh. Deep learning in environmental acoustics: future trends and challenges. Artif Intell Rev 2025; 58(1): 45–68.

19. Saeed, Al-Sarray, Al-Sultany. Utilizing XGBoost and GIS for highly accurate spatial noise distribution forecasting. Expert Syst Appl 2023; 215: 119230.

20. Zhang, Tian, Zhao, Liu, Tao, Wang, Guo, Luo. Hybrid modeling of urban noise using land use regression and machine learning: a global perspective. Sci Total Environ 2024; 912: 168942.

21. Nguyen. Data quality assessment for machine learning-based acoustic models in industrial zones. Process Saf Environ Prot 2024; 178: 245–259.

22. Wang. The role of high-resolution spatial data in machine learning-based noise mapping. Sustain Cities Soc 2025; 108: 105421.

23. Chen, Liao. Improving environmental noise prediction using multi-source data fusion and ensemble learning. J Clean Prod 2025; 412: 137384.