Abstract
With rapid industrialization and urbanization worldwide, air quality is deteriorating at an unprecedented rate and poses a substantial threat to humans and the environment. This raises the need to monitor and forecast air quality levels effectively in real time. Conventional air quality monitoring stations are built on centralized architectures that involve high latency, power-hungry communication technologies, costly sensors, and decision making with moderate accuracy. To address the limitations of the existing systems, we propose a smart Air Quality Monitoring and Forecasting system that combines Fog Computing with IoT and Deep Learning (DL). The system is a three-layered architecture with the Sensing layer first, the Fog Computing layer in between, and the Cloud Computing layer at the end. Fog Computing is a new-generation paradigm that brings storage, computation, and networking to the edge of the IoT network and reduces network latency. A DL-based BiLSTM (Bidirectional Long Short-Term Memory) model is deployed in the Fog Computing layer. The proposed system aims at real-time monitoring and accurate air quality forecasting to support decision making and aid timely prevention and control of pollutant emissions by alerting the stakeholders when a dangerous Air Quality Index (AQI) is expected. Experimental results show that the BiLSTM model, which takes the meteorological parameters into account, achieves better predictive performance than the baseline models in terms of MAE and RMSE. A proof of concept realizing the proposed system is elaborated in the paper.
Keywords
Introduction
India currently has around 731 manual and 150 real-time air quality monitoring stations across 70 cities. Because the number of real-time monitoring stations is relatively low according to the analysis of the CPCB (Central Pollution Control Board), the Indian Government is planning to double the number of real-time monitoring stations by 2024. This has also motivated us to develop a real-time system embracing state-of-the-art technologies.
Most of the existing works of air quality monitoring and prediction are based on centralized architectures involving high latency, communication technologies demanding high power, sensors involving high costs, and decision-making with moderate accuracy. To address these limitations, we propose a distinct Air Quality Monitoring and Forecasting system embracing Fog Computing with IoT, LPWAN, Deep Learning (DL), and Cloud Computing.
Owing to the proliferation of IoT applications, sending a large volume of data from end devices to the cloud layer for storage, analysis, and decision making would be infeasible due to the latencies from round-trip propagation delay, bandwidth constraints, and computation overhead. To address these limitations, a new computing paradigm called Fog Computing [4], introduced by Cisco, aims to provide services at the edge of the IoT network. Instead of directly transferring the data to the cloud, Fog Computing enables storage, computation, decision making, and data management in the proximity of data sources to reduce the network traffic and the computational load of Cloud Computing. Fog Computing supports low latency, distributed computing, location awareness, and minimized bandwidth consumption, and improves real-time decision-making. Thus, the proposed system utilizes the Fog Computing architecture with IoT, where the sensor node undertakes the data acquisition task and transmits the data to the Fog Computing layer using LPWAN technology. Low-Power Wide-Area Network (LPWAN) technologies are a promising solution for connecting distributed Fog Computing nodes in IoT networks, offering long range, low cost, and low power consumption. Previous research shows that deep learning technologies forecast air quality more accurately than typical machine learning methods. To guarantee good prediction accuracy and low response latency, a DL-based BiLSTM method is employed on the fog node to accurately forecast the pollutants PM2.5, PM10, CO, NO2, SO2, and O3, which determine the Air Quality Index (AQI). The results reveal that the BiLSTM method has good predictive performance compared with existing baseline Machine Learning (ML) and Deep Learning (DL) methods, namely Decision Tree Regression (DTR), Long Short-Term Memory (LSTM), Convolutional LSTM (Conv-LSTM), Multilayer Perceptron (MLP), Gated Recurrent Units (GRU) and Bidirectional GRU (BiGRU).
Furthermore, the model considers the meteorological parameters, as they play a vital role in influencing the pollutant values. Accurate forecasting is crucial to provide early warnings and enhance pollution management systems. Cloud Computing offers long-term storage of data collected from the Fog layer, supports computation-intensive tasks, and responds to client requests. The paper elaborates on the proposed fog-enabled architecture and presents the proof of concept.
The main contributions of the paper are as follows: Proposed a Fog Computing enabled IoT architecture for Air Quality monitoring and forecasting. Designed a real-time Air Quality monitoring system to acquire the air pollutants and meteorological data. Developed a smart fog gateway and identified a suitable model to accurately forecast air quality and deploy on the fog node. Developed a user interface using cloud services to display the air quality trends to the users.
The rest of the paper is organized as follows. Section 2 discusses the related works of air quality monitoring and forecasting in the recent years. Section 3 presents in detail the functions and construction methods of the proposed Fog Computing enabled system. Section 4 details the system implementation and compares the results of the model’s prediction performance. Finally, Section 6 provides the conclusion of this work.
Related studies
The related works and technologies discussed in this section are carefully analyzed and considered in the work to solve the issues of the existing IoT-based air quality monitoring and prediction methods.
Air quality monitoring and related requirements
Air quality monitoring systems are classified as indoor or outdoor based on the location of the event’s occurrence. Outdoor refers to air pollution in open and industrial areas. Indoor, on the other hand, denotes air pollution in a closed space such as a home, office, or shopping store. Because the scenarios and the nature of the pollutants differ between outdoor and indoor monitoring, the requirements of the monitoring system vary, as listed in Table 1. Our work focuses on outdoor monitoring.
Requirements of Air pollution monitoring
As a standard for measuring and expressing air quality, government agencies have established a quantitative tool called the Air Quality Index (AQI). AQI describes the degree of air pollution and provides easy-to-read and understandable information to the public [5]. AQI emphasizes the seriousness of air pollution and the health impacts it poses, especially to vulnerable groups like children, the elderly, and people who are suffering from cardiovascular or respiratory disorders. AQI is determined by measuring six major pollutants, i.e., Ozone (O3), Carbon Monoxide (CO), Sulphur Dioxide (SO2), Nitrogen Dioxide (NO2), Particulate Matter 2.5 (PM2.5), and Particulate Matter 10 (PM10). AQI transforms the pollutant values into a single index value, placing the pollution situation in one of six categories. Each country has its own method for assessing and defining the AQI levels. The categories of air quality adopted for India, along with the associated health impacts, are depicted in Table 2. AQI values are represented in six categories ranging from good to severe: Good, Satisfactory, Moderate, Poor, Very poor and Severe. A higher AQI indicates a higher level of contamination, with higher concentrations of some pollutants, and a greater health concern.
AQI levels and remarks
The individual AQI (sub-index) for each pollutant’s concentration, $I_1, I_2, \ldots, I_n$, is calculated with the following linear interpolation equation:
\[ I = \frac{I_{high} - I_{low}}{c_{high} - c_{low}} \, (C - c_{low}) + I_{low} \]
where
$c_{high}$ - Breakpoint concentration greater than or equal to the given concentration.
$c_{low}$ - Breakpoint concentration smaller than or equal to the given concentration.
$I_{high}$ - AQI value corresponding to $c_{high}$.
$I_{low}$ - AQI value corresponding to $c_{low}$.
$C$ - Concentration of the pollutant.
The highest of all the individual values is the overall AQI of the location at a given point in time, i.e.,
\[ \text{Overall AQI} = \max\{I_1, I_2, \ldots, I_n\} \]
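The interpolation and the maximum rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the paper: the breakpoint rows in `BREAKPOINTS` and the category limits in `CATEGORY_LIMITS` are assumed, illustrative values, and the function names are hypothetical. The official breakpoint tables of the relevant agency should be used in practice.

```python
# Breakpoint rows: (c_low, c_high, i_low, i_high).
# ASSUMPTION: illustrative values only, not an official breakpoint table.
BREAKPOINTS = {
    "PM2.5": [(0, 30, 0, 50), (30, 60, 50, 100), (60, 90, 100, 200),
              (90, 120, 200, 300), (120, 250, 300, 400), (250, 380, 400, 500)],
    "PM10": [(0, 50, 0, 50), (50, 100, 50, 100), (100, 250, 100, 200),
             (250, 350, 200, 300), (350, 430, 300, 400), (430, 510, 400, 500)],
}

# ASSUMPTION: upper AQI limit of each category; anything above is "Severe".
CATEGORY_LIMITS = [(50, "Good"), (100, "Satisfactory"), (200, "Moderate"),
                   (300, "Poor"), (400, "Very poor")]


def sub_index(pollutant, concentration):
    """Linear interpolation between the two breakpoints that bracket C."""
    for c_low, c_high, i_low, i_high in BREAKPOINTS[pollutant]:
        if c_low <= concentration <= c_high:
            slope = (i_high - i_low) / (c_high - c_low)
            return slope * (concentration - c_low) + i_low
    raise ValueError(f"{pollutant} concentration {concentration} is out of range")


def overall_aqi(concentrations):
    """Overall AQI is the maximum of the individual sub-indices."""
    return max(sub_index(p, c) for p, c in concentrations.items())


def category(aqi):
    """Map an AQI value to one of the six categories."""
    for upper, name in CATEGORY_LIMITS:
        if aqi <= upper:
            return name
    return "Severe"


if __name__ == "__main__":
    sample = {"PM2.5": 45.0, "PM10": 175.0}  # hypothetical readings in ug/m3
    aqi = overall_aqi(sample)
    print(round(aqi, 1), category(aqi))
```

With the assumed tables, the PM10 reading dominates the sample, so the overall index follows its sub-index rather than an average of the two pollutants.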
With IoT increasingly becoming a part of our daily life, there is a discernible increase in the number of smart objects generating large volumes of data. However, sending all the data directly to the centralized cloud poses challenges, including latencies from round-trip propagation delay, bandwidth constraints, and computation overhead. To address some of the shortcomings of cloud computing, the Fog Computing [6] concept was introduced to bring cloud services to the edge of the IoT network. Fog Computing is a distributed computing paradigm that enables local storage and computation, real-time decision making, and data management closer to the data sources. The Fog Computing layer is an intermediary layer between the IoT devices and the cloud layer. The devices in the Fog Computing layer are called fog nodes. Fog nodes, such as gateways, switches, routers, base stations, and smartphones, have limited resources in terms of processing, computation, memory size, network, and storage. Their capacity is smaller than that of cloud servers and typically higher than that of end devices, but still constrained. Fog nodes located one or two hops away from the IoT devices filter out irrelevant data, partially or completely process the sensor data, and minimize the amount of data sent to cloud platforms for processing, analysis, and storage. Fog Computing significantly reduces the service latency, enables better utilization of resources, minimizes bandwidth consumption, and offers improved levels of services and outcomes for IoT deployments.
Characterization of fog computing
Security: A level of protection is provided to ensure safety and facilitate trusted transactions. FC nodes store sensitive data before transmission of data to the cloud layer.
Cognition: Fog layer nodes are aware of the end-users’ objectives. FC addresses cognition by providing solutions based on contextual requirements.
Agility: FC delivers greater business agility and offers opportunities for developers to develop applications with the right tools and deploy them. It plays a vital role in the growth of mobile devices and other new services.
Latency: The fog layer performs filtering, processing, analytics, and other time-sensitive tasks on fog nodes such as gateways and routers located in the proximity of the end devices. This meets the needs of IoT applications with stringent time requirements.
Efficiency: Because of its vicinity, the fog layer is tightly integrated with the end devices, supporting enhanced performance and efficiency.
These advantages serve as enablers of new services and business models for networking and computing.
A few studies that investigated Fog Computing are as follows. Santos et al. [7] investigated Fog Computing for anomaly detection in a smart city. The system achieved a faster response time when the approaches were implemented on the fog layer than on the traditional centralized cloud layer. Nicholas et al. [8] developed a smart fog-enabled gateway and enhanced the end-to-end interaction between the end device and the cloud. Tuli et al. [9] reported experimental results in which the Fog Bus settings improved latency, energy consumption, network usage, and the CPU usage of the computing infrastructure. Dutta et al. [10] proposed a fog-based IoT setup, and the results proved to be advantageous in terms of delay over the cloud-only setting. Furthermore, embracing Fog Computing for aggregation and compression of data reduced the amount of data sent to the cloud.
With the growth of networked devices, Fog Computing has gained emphasis in recent times making IoT systems more efficient and scalable. The related works above reveal that introducing Fog Computing in IoT applications offers benefits in terms of bandwidth, latency, local storage, and processing. With this consideration, Fog Computing is embraced in the proposed system.
Low Power Wide Area Network (LPWAN) technologies
Widely installed short-range connectivity technologies like Wi-Fi, Zigbee, and Bluetooth are not suitable for long-range scenarios. Although 5G solutions based on cellular technology can provide extensive coverage, they consume a lot of energy. In contrast, Low Power Wide Area Networks (LPWAN) have gained popularity owing to their low-power, long-range, and low-cost communication [11]. LPWAN technologies did not even exist as early as the beginning of 2013. Nevertheless, with the rapid growth of the IoT industry and its requirements, LPWAN has gained emphasis in recent days and is one of the fastest-growing fields of IoT networks. Table 3 summarizes the characteristics and differences of popular LPWAN technologies like Sigfox, LoRa, DASH7, and Narrow Band IoT (NB-IoT) that emerged in both licensed and unlicensed markets.
Comparison of LPWAN technologies
Mekki et al. [12] and Sinha et al. [13] discussed and compared LoRa, Sigfox, and NB-IoT. The studies disclose that LoRa and Sigfox have very good characteristics in terms of capacity, battery life, and cost. Meanwhile, Tan et al. [14] suggested that NB-IoT has advantages in terms of QoS, latency, range, and performance. However, NB-IoT spectrum resources are very limited and costly, and NB-IoT consumes more power and offers lower battery life than LoRa. Osman et al. [15] mentioned that Sigfox has a long communication range of 50 km, which reduces the number of transmissions and thereby saves energy. Sigfox aims for a more global solution, while LoRa enables developers to construct a secure and private network. LoRa and DASH7 use the 868 MHz band. Although both use the same frequency band, the maximum communication range of DASH7 is lower than that of LoRa. Furthermore, DASH7 uses Gaussian Frequency-Shift Keying (GFSK) modulation, whereas LoRa uses a proprietary spread spectrum modulation owned by Semtech. In the comparison of LPWAN solutions, LoRa is considered a suitable option for deploying city-scale applications.
LoRa is an emerging LPWAN solution to transfer and receive data over 2-20 kilometers at low cost. The name LoRa derives from its long-range functionality. Although it supports only very low data rates, it consumes less power, which increases the battery life of the sensor nodes. It uses a spread spectrum approach that helps against interference and noise. Tests in urban areas have shown that a communication range of 3 km can be achieved in noisy environments. The LoRaWAN protocol is designed with very good built-in security features based on AES cryptography in comparison with other IoT solutions.
A few works featured LoRa under real-world settings. Carlsson et al. [16] explored the performance of LoRa and further discussed its capabilities, performance, and limitations in urban and rural settings. El Chall et al. [17] presented an analysis of LoRa and determined that its transmission range is 8 km in urban areas. A recent study by Basford et al. [18] on the long-term implementation of LPWAN technologies concluded that LoRa is the most viable choice in a smart city monitoring framework. A similar study by Sanchez-Gomez et al. [19] reached the same conclusion, suggesting LoRa for several smart city use-cases. This section provided an overview and a comparative analysis of the essential parameters of LPWAN technologies and identified LoRa as an efficient communication technology to adopt in order to enhance the performance of the proposed Air Quality Monitoring system.
The urge to monitor Air Quality levels effectively stems from the dangers that poor air quality brings to humans and the environment. Real-time Air Quality monitoring plays a significant role in laying a foundation for accurate air quality forecasting. We review the related works on air quality monitoring systems that consider the primary air quality attributes.
Lai X et al. [20] implemented a low-cost IoT based air quality monitoring system. The system monitored the concentration of six air pollutants and achieved immediate predictions. The hardware part of the system included Raspberry Pi, sensor network, and Wi-Fi and the software part included the cloud system. They employed ML based models on the edge device to avoid transmission delays caused by bandwidth and network connection limitations. Furthermore, they mentioned including the external environmental factors in the future work to enhance the accuracy of the system. Duangsuwan, Sarun, et al. [21] focused on the development of smart sensors to track air pollution in Bangkok, Thailand. The parameters monitored include PM10, CO, CO2, noise level, and O3. The sensor node uses the Linkit Smart 7688 microcontroller. The processed data on the sensor node is shared with the next layer using NB-IoT, and AQI is determined for the collected data. The main limitation is that the resources of the NB-IoT spectrum are costly. Kim SH, et al. [22] proposed an IoT-based atmospheric monitoring system to effectively observe air quality levels. The developed system was installed in Changwon National University and the Haman industrial complex area. The system uses the LTE network, and the scaling can cause the network cost to increase.
Santos et al. [23] presented an edge computing-based anomaly detection approach for Air Quality Monitoring in smart cities. The implementation was performed on Antwerp’s City of Things platform, where a set of sensors were integrated, and mounted on BPost’s delivery cars to gather real-time air quality information. They deployed the ML based unsupervised anomaly detection models on the fog node to send timely alerts to the citizens when unusual Air Quality levels are detected. Furthermore, the results verified that using Fog Computing over the centralized Cloud Computing architecture improved the response time of the application. Wang D et al. [24] designed an air quality monitoring system to achieve a home weather station allowing the user to access the data on his mobile. A tree-based network is created with the Zigbee, and the data forwarded from sub nodes collecting the sensor data is transmitted to the LM3S8962 gateway, and finally uploaded to the server. Toma C et al. [25] provided a solution for pollution monitoring in a smart city. The parameters collected by the system included CO2, CO, NH4, CH4, toluene, temperature, humidity, pressure, altitude, and noise decibels. They discussed the technologies and platforms used within the solution and claimed their system offered benefits in terms of security, reliability, availability, and scalability. Senthilkumar et al. [26] proposed an embedded-based intelligent air pollution monitoring system. The embedded system collects the data and forwards it to the fog node. The functionality of the system was demonstrated on experimentation under various settings. However, they failed to use LPWAN technology to communicate with fog nodes. Rebeiro-Hargrave A et al. [27] designed and developed a Mega Sense Cyber Physical System to monitor urban air quality levels in real-time. The system aggregated the collected data and presented privacy-aware maps and graphs. 
The citizens of Helsinki received history profiles, personalized advice, and pollution hot spot maps through the mobile application. Their future work aims to integrate the system into mobile platforms such as drones. Baiocchi et al. [28] explored IoT concepts with smart systems to track real-time pollution around the City of Tacoma. The work demonstrates how edge devices can ease the load on the cloud. They embraced IoT concepts, edge computing, and the Microsoft Azure Framework, and demonstrated the effectiveness of the system. Furthermore, they discussed the pros and cons of integrating edge computing and cloud computing capabilities in the system.
Although the existing works summarized in Table 4 propose mature and robust air quality monitoring systems, the majority of them investigate very few pollutant parameters, fail to embrace gateways as the fog node, ignore the meteorological parameters that have a direct correlation with pollutant concentrations, and do not effectively integrate LPWAN technologies and Fog Computing to meet the requirements of remote connections, low latency, and real-time decision making. Hence, the proposed system considers these limitations and effectively integrates Fog Computing, LoRa, and Cloud Computing technologies to develop a smart fog-enabled Air Quality monitoring system, as in Table 5, assuring measurement accuracy with minimum cost.
Summary of air quality monitoring systems
Proposed system
Air quality forecasting or prediction refers to estimating the concentration of pollutants for future time steps based on real-time data and trends in historical data. Predicting air pollutant concentrations has gained attention in recent research. It provides important information to the public and helps reduce emissions and enhance pollution management. To make a good prediction, fitting an appropriate model is important. However, accurate prediction is still a challenge because of the complex trends in the data. The common methods for air quality prediction encompass traditional statistical models and data-driven methods such as Machine Learning and Deep Learning methods.
Yu R et al. [29] proposed a Random Forest approach to predict the AQI for urban sensing systems. The model forecasts the AQI of Shenyang based on the data from air quality monitoring stations. The model outperformed a single Decision Tree (DT), Naïve Bayes, Logistic Regression, and ANN. However, the amount of data used for modelling is limited. Taneja S et al. [30] discussed the prediction of pollutant trends for future years using Linear Regression and Multilayer Perceptron (MLP) methods. They predict that the pollutants NO2 and O3 are likely to increase, while SO2 and CO levels would follow past trends and PM10 is likely to increase drastically. The main limitation is that the models look only at the linear relationships in the data, are easily affected by outliers, and fail to model non-linear patterns in the data. The meteorological parameters are ignored while modelling the data. Ameer S et al. [31] performed air quality prediction using four regression techniques and presented a comparative analysis to find the best model to accurately forecast air quality in terms of data size and processing time. The techniques included DTR, Random Forest Regression (RFR), MLP regression and Gradient Boosting Regression (GBR). The results indicated that RFR is the most effective methodology for predicting pollution for data sets of various sizes, locations, and features. Furthermore, RFR had the lowest error rate and performed well in detecting the peak values among the four approaches. Bisht M et al. [32] proposed an intelligent air pollution prediction system using Extreme Learning Machine (ELM) to predict the AQI for five pollutants. The AQI values predicted by ELM are close to the actual values for five out of seven days in the test set. ELM resulted in much faster learning compared to backpropagation-based FFNN. ELM-based prediction was found to have greater accuracy than the existing SAFAR system. Rekhi JK et al. 
[33] analyzed the time series air pollution dataset of New Delhi and forecasted monthly future values of two major pollutants SO2 and NO2 using ARIMA. However, ARIMA fails to effectively capture the sudden changes in exhaust emissions. Zheng H et al., 2018 [34] developed a novel multiple kernel learning-based approach with support vector classifier as the base learner for air quality forecasting. The proposed method forecasted severe air pollution and outperformed other models including Random Forest, LSTM, Multilayer Perceptron and ARIMA. Leong WC et al. 2020 [35] proposed a support vector machine to model the air pollution index. The study investigated the kernel functions model parameters. SVM using radial basis function (RBF) kernel function effectively solved the problem of complex air pollution index modeling. However, SVM algorithm is not suitable for large datasets and does not perform well when the data set has more noise. Bai Y, et al., 2016 [36] constructed a W-BPNN model using wavelet technique and back propagation neural network (BPNN) to forecast daily air pollutants (NO2, PM10, and SO2) concentrations. The results revealed that the W-BPNN model had better prediction performance for the three pollutants than mono-BPNN model. However, the work neglected data pre-processing and optimization of parameters, leading to some bias in modelling. Yan R et al. [37] developed a multiple-hour and multiple-site forecasting model using CNN-LSTM. The CNN-LSTM exhibited better performance compared with the BPNN and the CNN. Li X et al. [38] presented a novel long short-term memory neural network extended (LSTME) model to predict air pollutant concentrations. The model captured long time dependencies and automatically determined the optimum time lags. LSTME outperformed the spatiotemporal deep learning (STDL) model, the time delay neural network (TDNN) model, ARMA, SVR, and the traditional LSTM. 
Compared with the RNN model, the LSTME exhibited better prediction performance. The experiments revealed that DL-based models exhibited better prediction performance compared to traditional shallow models such as SVR, ARMA and TDNN. Based on the related works summarized in Table 6, most studies consider only one or a small number of pollutants while modelling the data, and a few works ignore the meteorological variables that have a direct correlation with the pollutants. Furthermore, the works that use statistical models like SMA, WMA, and ARIMA for short-term forecasting exhibit simplicity, good performance, and interpretability and are computationally less expensive, but they fail to capture non-linear patterns. Researchers have therefore investigated ML-based approaches that overcome the non-linear limitations and uncertainties to achieve better accuracy. In the Indian context, many air quality forecasting works have employed various traditional ML and DL methods. Compared with the ML algorithms, DL methods show higher prediction accuracy. Although the traditional ML models are widely employed, they are not suitable for capturing long-term dependencies in time series data. DL, in contrast, uses the structure of ANN, with capabilities of non-linear mapping, robustness, self-adaptation, and modeling of complex relationships between input and output variables, to effectively capture the trends of air quality. The main research gap is that the works fail to employ forecasting models with good predictive performance on the Fog nodes to support real-time decision-making and low latency. Hence, the proposed system employs a DL-based model with good predictive performance on the fog node, considering all six primary pollutants and the meteorological parameters, to predict air quality accurately.
Summary of Air Quality Prediction models
The proposed Fog Computing based Air Quality Monitoring and Prediction (AQMP) system is a three-layered architecture with the Sensing layer first, Fog Computing layer in between, and Cloud Computing layer at the end. It integrates multiple modules to effectively monitor, store and predict air quality in real-time. All the layers communicate via LoRa and Wi-Fi. An overall view of the system architecture is depicted in Fig. 1, and the description of the proposed system is explained below.

Architecture of the proposed air quality monitoring prediction system.
The first layer is the sensing layer, comprising the resource-constrained sensor nodes to acquire the data from the environment and transmit it to the Fog Computing layer. The components of this layer are elucidated below.
Sensors to monitor air pollutants and meteorological parameters
The Fog Computing layer uses a Raspberry Pi 3 Model B+ running Raspbian OS as the fog node [39, 40]. This development board offers low cost, low power consumption, adequate storage, and good processing, networking, and computing capabilities to serve as the fog node. There are many expensive industrial-standard boards with greater computation power that can serve as the edge node, but a cost-effective Raspberry Pi is preferred as the fog node in many IoT applications.
The Raspberry Pi 3 Model B has a BCM2837 quad-core processor (4x ARM Cortex-A53, 1.2 GHz), 4 USB ports, and 40 GPIO pins. The Raspberry Pi does not have internal flash memory, so it uses an SD card for storage. The GPIO pins on the board support only digital inputs, so an ADC is required to connect analog sensors. With its built-in quad-core 1.2 GHz 64-bit CPU, the Raspberry Pi offers good processing and computing capabilities and is thus preferred as the fog node in many IoT applications. The nodes of the previous layer share the data with the nearest fog nodes without directly uploading it to the cloud. Data is received on the Raspberry Pi (fog node) by the LoRa receiver.
Fog computing services
The services to manage the resources and the ingested data on the fog node in the proposed system are explained below.
Fog enables the processing and operation of data closer to the source. It interacts with the two adjacent layers. The services rendered by the fog layer, as described above, make the fog node a smart fog gateway. Finally, after processing and prediction on the fog node, the results and monitoring data are sent to the cloud using the MQTT protocol.
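The hand-off from the fog node to the cloud can be sketched as follows. This is only an illustration under stated assumptions: the topic layout, the payload fields, and the function names are hypothetical and are not taken from the paper. The actual MQTT transport is injected as a `publish` callable, so the sketch stays independent of any particular client library.

```python
import json
from typing import Callable, Dict, Tuple


def build_message(node_id: str, timestamp: str,
                  readings: Dict[str, float],
                  forecast: Dict[str, float]) -> Tuple[str, str]:
    """Bundle measured and forecast values into one JSON payload.

    ASSUMPTION: the topic scheme and field names are illustrative only.
    """
    topic = f"aqmp/{node_id}/telemetry"
    payload = json.dumps(
        {"node": node_id, "time": timestamp,
         "readings": readings, "forecast": forecast},
        sort_keys=True,
    )
    return topic, payload


def forward_to_cloud(publish: Callable[[str, str], None], node_id: str,
                     timestamp: str, readings: Dict[str, float],
                     forecast: Dict[str, float]) -> Tuple[str, str]:
    """Build the message and hand it to the injected MQTT publish function."""
    topic, payload = build_message(node_id, timestamp, readings, forecast)
    publish(topic, payload)
    return topic, payload


if __name__ == "__main__":
    sent = []  # stand-in for a broker connection
    forward_to_cloud(lambda t, p: sent.append((t, p)),
                     "fog-01", "2020-04-30T23:45:00",
                     {"PM2.5": 41.2}, {"PM2.5": 43.0})
    print(sent[0][0])
    print(sent[0][1])
```

In a deployment, `publish` would wrap the publish call of whichever MQTT client is connected to the cloud broker. Keeping it as a parameter also makes the payload logic easy to test without a network connection.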
Cloud computing layer
The Cloud Computing layer is the end layer, supporting long-term storage and complex processing of air pollution data collected over sustained periods.
This layer acquires the data from the fog computing system, stores it in the database, and provides feedback results and data visualization to the clients. The work chooses the AWS platform, as it provides secure IoT services. The APIs are made available to share data with third parties.
System implementation
A proof-of-concept of the proposed system entailing the hardware and software implementation is detailed in this section. The hardware module mainly includes the Sensing Node and the Fog Computing device. The software component includes the prediction model and the cloud. The summary of tools and technologies to develop the proposed system is described in Table 8.
Summary of the tools and technologies used to develop Fog Computing based Air Quality Monitoring and Prediction system
A customized PCB is developed to monitor and predict air quality in real-time. The components of the hardware and other configurations are explained in Section 3.1. The hardware of the system necessarily includes the Sensor node that acquires the data and the Fog Computing node that processes the data. The system is tested successfully in the lab environment as in Fig. 2.

Air quality monitoring and prediction system.
The data acquired from the monitoring system is transmitted to the Fog Computing layer using LoRa. Further, a suitable air quality prediction model is identified and deployed on the fog node. The air quality data is further transmitted to and stored in the AWS cloud. A detailed description is presented below:
Air quality prediction model
The intelligence is introduced on the Fog node by deploying a Deep Learning model to forecast pollutants concentration for future time steps. Initially, the data gathered from the Central Pollution Control Board (CPCB) monitoring station [41], situated at SIDCO Kurichi in Coimbatore city is used train the data with the model and deploy it on the Fog node. The CPCB dataset is presented in 15-minute intervals from 00 : 00 on June 1, 2019, to 23 : 59 on April 30, 2020. The investigative variables acquired from the location include PM2.5, PM10, CO, NO2, SO2, O3, Temperature, Pressure, Relative Humidity, Solar Radiation, Wind speed, and Wind direction. The work considers the meteorological parameters as they are responsible for the dispersion and transformation of pollutants in the atmosphere and have a direct correlation with the pollutant’s values.
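As a quick sanity check on the size of such a dataset, the number of 15-minute slots in the stated period can be computed directly. The sketch below assumes a gap-free series, so it gives an upper bound on the number of records rather than the actual count; the function name is hypothetical.

```python
from datetime import datetime, timedelta


def expected_samples(start: datetime, end: datetime, step: timedelta) -> int:
    """Count the sampling instants start, start + step, ... that fall on or before end."""
    return (end - start) // step + 1


if __name__ == "__main__":
    start = datetime(2019, 6, 1, 0, 0)     # 00:00 on June 1, 2019
    end = datetime(2020, 4, 30, 23, 59)    # 23:59 on April 30, 2020
    n = expected_samples(start, end, timedelta(minutes=15))
    print(n)                               # 335 days x 96 slots per day
```

The period spans 335 days (2020 is a leap year), each with 96 quarter-hour slots, which gives at most 32160 records per variable before any missing values are accounted for.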
To analyze the basic statistical characteristics of the six pollutants, such as minimum, maximum, extremum, mean, median, standard deviation, skewness, and kurtosis, a descriptive analysis of the data is presented in Table 9. PM2.5, PM10, SO2, NO2, and O3 are measured in μg/m3, and CO is measured in mg/m3.
Descriptive Statistics of the exploratory variables in the dataset
4.2.1.1. Experimental parameters
Features that show a strong relationship with the dependent variables should be chosen, as they are important for forecasting. A lack of relevant variables could prevent the model from approximating the dynamics between the predictors and the predicted variables. If there are numerous input parameters, a dimensionality reduction method would be needed. Based on the findings from the literature review, we choose an optimal number of influential parameters for air quality modeling. The parameters include PM2.5, PM10, CO, NO2, SO2, O3, Temperature, Relative Humidity, Pressure, Solar Radiation, Wind speed, and Wind direction.
4.2.1.2. Data pre-processing
Time-series data often contain missing points because of sensor failures or network problems at the monitoring station during recording. Data pre-processing is a key step in developing ANN models as it improves the representation of the collected data. Incorrect or incomplete data may cause discontinuities and introduce significant bias when modelling the data. In our work, outliers are eliminated, and the data is cleaned by filling the missing values with the average value of the data in the same period. Min-max scaling is performed to linearly transform each attribute of the data into the range 0 to 1, which helps to avoid model sensitivity to extreme values. The scaled data is then fed into the network for prediction, and the results are re-scaled to obtain the actual values. Minimal data pre-processing is sufficient for Deep Learning.
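The pre-processing steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, missing readings are assumed to be marked as None, and the caller is assumed to pass in only the values that belong to the same period.

```python
def fill_missing_with_mean(values):
    """Replace None entries with the mean of the observed entries.

    Assumption: `values` holds the readings of one period, so the mean
    stands in for "the average value of the data in the same period".
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("series has no observed values")
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


def min_max_scale(values):
    """Linearly map values into [0, 1]; also return the bounds for re-scaling."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A constant series carries no range information; map it to zeros.
        return [0.0 for _ in values], lo, hi
    return [(v - lo) / span for v in values], lo, hi


def inverse_scale(scaled, lo, hi):
    """Map scaled predictions back to the original units."""
    return [s * (hi - lo) + lo for s in scaled]


if __name__ == "__main__":
    pm25 = [10.0, None, 30.0]  # illustrative readings, not real data
    filled = fill_missing_with_mean(pm25)
    scaled, lo, hi = min_max_scale(filled)
    print(filled, scaled, inverse_scale(scaled, lo, hi))
```

The bounds returned by min_max_scale would need to come from the training data and be reused at prediction time, so that the forecasts are re-scaled consistently.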
4.2.1.3 Model deployment
A multivariate Bidirectional LSTM model is deployed on the fog node to predict future trends in the concentrations of the primary air pollutants. Long Short-Term Memory (LSTM) is a form of artificial Recurrent Neural Network (RNN) used to model the sequential relationship between past, present, and future data [42]. LSTM models are particularly effective at preserving both long-term and short-term information. Furthermore, LSTM mitigates the exploding and vanishing gradient problems seen in RNNs through the concept of memory cells and controlling gates. It is capable of learning long-term temporal dependencies across time steps and predicting sequences. The generalization ability of LSTM allows it to outperform traditional neural networks. Each LSTM unit has an input gate, a forget gate, an output gate, and a memory cell. The memory cells allow LSTMs to accumulate information over the steps of a sequence, which helps them do well in long-term tasks [43]. The model includes input, recurrent, and output layers that process the data at each time step. In a bidirectional LSTM, a second hidden layer is added that processes the data in the reverse direction. The Bidirectional LSTM model thus runs the inputs in both directions, one from past to future and the other from future to past. Because it uses context from both directions, BiLSTM can be more effective than a unidirectional LSTM.
The architecture of a typical LSTM layer is briefly described as follows.
–Input Layer: By concatenating pollutant concentrations and meteorological factors, this layer creates a dense embedding.
–Recurrent layer: This layer generates hidden representations of vector
–Output Layer: This is the final layer that outputs the pollutant concentration values $y_t$ for the future time step considered.
LSTM comprises multiple LSTM units. Each LSTM unit has an input gate, a forget gate, an output gate, and a memory cell. The inputs to the LSTM cell at any time step are the current input $x_t$, the previous hidden state $h_{t-1}$, and the previous memory state $m_{t-1}$. The LSTM architecture used in this paper is defined by the equations below.
–Memory cell ($m_t$):
–Input gate ($i_t$): a gate that determines which values from the input update the memory state.
An overview of the LSTM cell is shown in Fig. 3, where $x_t$, $h_t$, and $m_t$ represent the input vector of the model at time t, the LSTM output at time t, and the memory cell state, respectively. The symbol * represents element-wise multiplication, tanh is the hyperbolic tangent function, and σ is the logistic sigmoid function. $i_t$, $o_t$, and $f_t$ are the input, output, and forget gates, respectively. $c_t$ is the input update value; $b_i$, $b_o$, $b_m$, and $b_f$ are the bias vectors; and W are the weight matrices.
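For reference, the standard LSTM update can be written with the symbols defined above, where $m_{t-1}$ denotes the previous memory cell state. This is a sketch of the textbook formulation and not necessarily the exact form used in this work; the concatenation $[h_{t-1}, x_t]$ and the per-gate weight matrices $W_i$, $W_f$, $W_o$, and $W_m$ are assumptions.

```latex
\begin{align}
i_t &= \sigma\left(W_i\,[h_{t-1}, x_t] + b_i\right) \\
f_t &= \sigma\left(W_f\,[h_{t-1}, x_t] + b_f\right) \\
o_t &= \sigma\left(W_o\,[h_{t-1}, x_t] + b_o\right) \\
c_t &= \tanh\left(W_m\,[h_{t-1}, x_t] + b_m\right) \\
m_t &= f_t * m_{t-1} + i_t * c_t \\
h_t &= o_t * \tanh\left(m_t\right)
\end{align}
```

In this form the forget gate scales the previous memory state, the input gate scales the candidate update, and the output gate controls how much of the memory state is exposed as the hidden state.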

Overview of LSTM cell at time t.
The bidirectional structure is applied to the LSTM to make a Bidirectional LSTM (BiLSTM) as shown in Fig. 4.

Bidirectional LSTM structure.
In BiLSTM, the two hidden sequences forward
By using BiLSTM, trends in air quality data can be effectively captured from past and future dependency information.
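A common way to express the bidirectional combination is sketched below. It is an assumption rather than the formulation of this work: the forward and backward hidden states are taken to be concatenated, and the output weights $W_y$ and bias $b_y$ are hypothetical names.

```latex
\begin{align}
\overrightarrow{h}_t &= \mathrm{LSTM}\left(x_t, \overrightarrow{h}_{t-1}\right) \\
\overleftarrow{h}_t &= \mathrm{LSTM}\left(x_t, \overleftarrow{h}_{t+1}\right) \\
y_t &= W_y\,[\overrightarrow{h}_t ; \overleftarrow{h}_t] + b_y
\end{align}
```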
The time-series Air Quality dataset is split into a training set and a test set comprising 80% and 20% of the instances, respectively. The training set is transformed into time-series samples using a sliding window approach as in Algorithm 1. The data, window size, and step length are the inputs to the algorithm, which outputs the X and Y used to train the models.
The number of samples in a batch is the window size for each epoch in training. The model takes a sequence of the data with the chosen window size as the input and generates the output for the future time steps in multiple batches. The training data input and output are split based on the sliding window procedure as in Algorithm 2,
where data is a one-by-d matrix, d is the length of the data, win_size is the number of observations in the sliding window, step_length is the number of steps ahead to forecast, X is the training input, and Y is the training output.
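The sliding window procedure can be sketched as follows, using the parameter names defined above. This is a minimal illustration under stated assumptions, not a reproduction of the algorithm listings: the data is treated as a plain Python list, the window advances by one observation at a time, and the 80/20 split is assumed to be chronological. The helper name chronological_split is hypothetical.

```python
def chronological_split(data, train_fraction=0.8):
    """Split a series into train and test parts without shuffling (assumption)."""
    cut = int(len(data) * train_fraction)
    return data[:cut], data[cut:]


def sliding_window(data, win_size, step_length):
    """Build (X, Y) pairs from a series.

    Each X entry holds `win_size` consecutive observations and the matching
    Y entry holds the `step_length` observations that follow it.
    """
    X, Y = [], []
    last_start = len(data) - win_size - step_length
    for start in range(last_start + 1):
        end = start + win_size
        X.append(data[start:end])
        Y.append(data[end:end + step_length])
    return X, Y


if __name__ == "__main__":
    series = list(range(1, 11))  # illustrative values only
    train, test = chronological_split(series)
    X, Y = sliding_window(train, win_size=3, step_length=1)
    print(X[0], Y[0])
```

For multivariate data, each element of the list could itself be a row of pollutant and meteorological values; the slicing logic stays the same.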
Based on this, the BiLSTM model is trained, fitted, and deployed on the smart fog gateway. Using the real-time data acquired from the developed air quality monitoring system at the same location, the BiLSTM model deployed on the smart fog gateway predicts the pollutant values for the coming hours, and the AQI is determined. The AQI is then assigned to one of the categories in Table 2. If the AQI category is very poor or severe, the early warning module on the fog node issues warnings to the stakeholders. In such events, the fog node also configures the end node to sample data more frequently to support a more detailed inference.
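The early warning logic described above can be sketched as follows. The category bands are an assumption based on the Indian National AQI scale, which Table 2 is presumed to follow; the function names and the two sampling intervals are hypothetical and chosen only for illustration.

```python
# Upper bound of each AQI band (assumed: Indian National AQI categories).
AQI_BANDS = [
    (50, "Good"),
    (100, "Satisfactory"),
    (200, "Moderate"),
    (300, "Poor"),
    (400, "Very Poor"),
]

ALERT_CATEGORIES = {"Very Poor", "Severe"}


def aqi_category(aqi):
    """Map an AQI value to its category label."""
    if aqi < 0:
        raise ValueError("AQI cannot be negative")
    for upper, label in AQI_BANDS:
        if aqi <= upper:
            return label
    return "Severe"


def fog_node_action(predicted_aqi, normal_interval_s=900, alert_interval_s=300):
    """Decide whether to warn stakeholders and how often the end node samples.

    The intervals are hypothetical placeholders, not values from the system.
    """
    category = aqi_category(predicted_aqi)
    warn = category in ALERT_CATEGORIES
    return {
        "category": category,
        "warn": warn,
        "sampling_interval_s": alert_interval_s if warn else normal_interval_s,
    }


if __name__ == "__main__":
    print(fog_node_action(350))
    print(fog_node_action(120))
```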
4.2.1.4 Performance evaluation metrics
To evaluate the accuracy of the forecasting models, two statistical metrics, RMSE and MAE, are considered. Both quantify the difference between the actual and predicted values. The equations defining the metrics are listed below.
–Root Mean Square Error (RMSE): RMSE is a statistical parameter used to determine the accuracy of a model. A smaller RMSE indicates a lower error.
–Mean Absolute Error (MAE): MAE is another statistical measure of the average performance of the model. Smaller values of MAE are desired.
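Both metrics follow their standard definitions: RMSE is the square root of the mean squared difference between actual and predicted values, and MAE is the mean of the absolute differences. A minimal sketch, with hypothetical function names and illustrative inputs:

```python
import math


def _check_lengths(actual, predicted):
    """Both series must be non-empty and of equal length."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("series must be non-empty and of equal length")


def rmse(actual, predicted):
    """Root Mean Square Error: sqrt of the mean squared error."""
    _check_lengths(actual, predicted)
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean Absolute Error: mean of the absolute errors."""
    _check_lengths(actual, predicted)
    absolute = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(absolute) / len(absolute)


if __name__ == "__main__":
    actual = [3.0, 5.0]      # illustrative values only
    predicted = [0.0, 1.0]
    print(rmse(actual, predicted), mae(actual, predicted))
```

Because RMSE squares the errors before averaging, it penalizes large deviations more heavily than MAE, which is why the two are usually reported together.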
The air quality data transmitted from the Fog Computing layer over the MQTT protocol is received by AWS IoT Core and stored in Amazon DynamoDB, a NoSQL database. A Lambda function is created using the AWS Lambda console, and an HTTP API is created using the API Gateway console. Whenever a user calls the HTTP API, API Gateway forwards the request to the Lambda function, which fetches the data from DynamoDB and returns it to the front-end of the website, as highlighted in Fig. 5, where the Air Quality data is presented to the users.
Result and analysis

Air quality monitoring and prediction system.
We conducted a series of experiments to evaluate the prediction approach proposed in the paper. First, the influence of meteorological features on air quality prediction is validated. Second, the prediction performance of the Bidirectional LSTM model in the proposed system is verified by a comparative analysis with the baseline models for predicting six pollutants as presented in Table 10. The baseline models include LSTM, Convolutional LSTM (Conv-LSTM), Multilayer Perceptron (MLP), Decision Tree Regression (DTR), Gated Recurrent Units (GRU), and Bidirectional GRU (BiGRU). The prediction performance of the models is assessed using MAE and RMSE, which compare the actual and predicted values. The parameters of the models are optimized using grid search.
Prediction performance of models without considering meteorological parameters
The hyperparameters play a vital role in the performance of Deep Learning models. The BiLSTM network is designed with 200 neurons in the first hidden layer and six neurons in the output layer for predicting pollutant values. With the same configuration, over-fitting occurred when the number of neurons exceeded 200. The models are trained for 100 epochs. To avoid overfitting, a dropout of 0.2 is used between the layers, which provided good results. The Adam variant of stochastic gradient descent is chosen as the optimizer, as it is a good option for RNNs. Deciding the window size is important because a smaller window cannot provide sufficient information for the model inputs, while a larger window increases the computational complexity and makes it harder to learn the important features. To minimize the computing costs, we chose a window size of 194 (two days of data). There is no fixed rule for choosing the best hyperparameter values, so they are identified by trial and error. We also aim to determine and optimize the prediction model's hyperparameters to achieve the best accuracy. The models are tested with multiple network configurations to find the network with the most accurate results.
The influence of meteorological parameters on pollutants is validated by comparing the results of the prediction models without and with the meteorological parameters, as in Tables 10 and 11, respectively. The average RMSE and MAE values in Table 10, which does not consider meteorological parameters, are significantly higher for all the models than those in Table 11, which considers them. This implies that models trained on the dataset including the meteorological parameters, as in Table 11, have better prediction performance, confirming the influence of meteorological parameters on the pollutant prediction results.
Prediction performance of different models considering the meteorological parameters
Further, analyzing the results of Tables 10 and 11, it is observed that the ML-based DTR model has larger MAE and RMSE values, reflecting its poor predictive performance. The variants of RNN (LSTM, BiLSTM, Conv-LSTM, GRU, and BiGRU) show better results than the Feed Forward network model (MLP) and DTR. Furthermore, comparing the variants of RNN, BiLSTM has the lowest total MAE and RMSE and performs well on all pollutant factors except CO and Ozone. BiGRU has the second smallest average RMSE and MAE values for CO and Ozone, indicating good performance. In addition, Conv-LSTM has a performance close to that of BiGRU. The comparison results suggest that RNN models are more suitable for air quality forecasting. The fitting curve showing the predicted values of the BiLSTM model against the actual values is displayed in Fig. 6. The orange line indicates the predicted values, while the blue line represents the actual values. The predicted values of the BiLSTM model closely follow the actual pollutant values, which helps in determining the AQI effectively. The results show that BiLSTM captures not only the local trends but also the long-term temporal dependencies of pollutant concentrations and meteorological parameters in the data. BiLSTM accurately predicts the fluctuations in pollutant concentration and delivers stable performance. To sum up the comparison, BiLSTM exhibits superior performance over the baseline models, as indicated by the RMSE and MAE values in Table 11. Our model comparison results are consistent with those of related studies, which find that DL-based methods are more suitable for air quality prediction.

BiLSTM forecast results vs actual values of six primary pollutants.
The analysis of the results in Table 11 reveals that the BiLSTM model is a suitable model that achieves more accurate air quality predictions by integrating historical air quality data and meteorological data. The BiLSTM model is deployed on the fog node, air quality predictions are made for future time steps, and users are warned if the AQI breaches the threshold. The historical data is stored in the cloud, and clients access the real-time data through API calls.
In this paper, an Air Quality Monitoring and Prediction System is developed leveraging Fog Computing and Deep Learning. A low-cost real-time air quality monitoring system acquires the primary pollutants along with the meteorological parameters and transmits them to the fog node using LoRa for local processing and accurate forecasting to provide early warnings. The paper also reviewed and analyzed the state-of-the-art air quality prediction methods for tackling the highly dynamic pollution forecasting problem, which plays a vital role in making informed decisions and in supporting early warning and control mechanisms to mitigate poor air quality. Further, to determine an accurate model, seven models are compared, and the results verify that BiLSTM outperformed the other methods in forecasting pollutant concentrations for the coming hours when the meteorological parameters are considered. The BiLSTM model is then deployed on the fog node to make air quality predictions for the coming hours. Thus, the fog node is embedded with intelligence, making it a smart fog gateway. The smart fog gateway forwards the real-time air quality data to the cloud, and users access the data through cloud services. Predicting air quality on the smart fog gateway helps to determine the AQI for the coming hours in real-time and can help the city planning committee coordinate road traffic signals and promote the use of public transport for commuting. In turn, the proposed Air Quality Monitoring and Prediction System aims to bring environmental, economic, and social benefits by leveraging state-of-the-art technologies to promote a sustainable smart city. In the future, we plan to optimize Deep Learning methods on Fog Computing devices for long-term predictions.
