Abstract
Malodorous gases harm both the environment and human health. Accurately assessing this impact requires identifying the specific malodorous components and their concentrations. To enable qualitative and quantitative identification of the components of malodorous gas, an electronic nose system is developed in this paper. Both principal component analysis (PCA) and linear discriminant analysis (LDA) were used to reduce the dimensionality of the collected data. The reduced-dimensional data were then fed to a support vector machine (SVM) and a backpropagation (BP) neural network for classification, and the recognition results were compared. For qualitative recognition, this paper selects LDA combined with the BP neural network after comparison. Experiments show that the qualitative recognition rate of this method can reach 100% in this study, and the small amount of data remaining after LDA dimensionality reduction speeds up pattern recognition. For quantitative identification, this paper carries out concentration prediction experiments with partial least squares (PLS) and BP neural networks. The experiments show that the average relative error of the trained BP network is within 6%. Finally, a quantitative analysis of malodorous compound gas with this system shows that the maximum relative error of this method is only 4.238%. This system has higher accuracy and faster recognition speed than traditional methods.
Introduction
Volatile organic compounds (VOCs) seriously endanger human health and have become one of the most critical problems of environmental pollution [1]. Therefore, developing a compound odor detection system with strong real-time performance and online detection capability is a crucial research direction in odor detection. Current odor detection methods include the olfactory test, instrumental analysis, and the electronic nose [2–5].
Olfactory testing is also known as sensory analysis. The method uses the human nose as a detector. It evaluates and measures odors through the reaction of olfactory personnel to the odor, which can directly judge the impact of odor pollution on humans and the environment. According to the degree of odor stimulation to the human olfactory organs, the odor intensity grade can be obtained by comparing the odor intensity classification table. Different countries and regions have different odor intensity classifications. China divides the odor intensity into six grades [6]. The classification of odor intensity in China is shown in Table 1.
China’s odor level
Unlike substance concentration, odor concentration is dimensionless. The three-point comparison odor bag method is the most common method of olfactory testing. Its measurement process is as follows: prepare three odorless bags, two of which are filled with odorless air, while the third is filled with odorous gas diluted with odorless air at a certain dilution ratio. An olfactory team of more than six panelists compares the three bags and identifies the odorous bag; the odorous gas is then diluted step by step and smelled again until its concentration falls below the olfactory threshold of the panelists, at which point dilution stops. After the test is completed, the odor concentration is obtained from the average threshold of the members of the olfactory discrimination group [7].
The advantages of the olfactometry method are simple operation, fast measurement speed, relatively high accuracy, good measurement repeatability, and no cross-contamination of samples. The disadvantage is that the measurement results are easily affected by the olfactory discriminator and the environment. Sometimes the results do not objectively reflect the actual situation, so they are difficult to unify [8]. The key to olfactory testing is to strengthen the technical training of olfactory personnel. They should fully understand the odor characteristics of typical odorous substances, be able to distinguish various odors, and have high professional ethics and a sense of responsibility so that the influence of subjective human factors is avoided as much as possible.
Instrumental methods are mainly used for the qualitative and quantitative analysis of single malodorous substances [9]. Commonly used precision analytical instruments include the gas chromatograph, gas chromatograph/mass spectrometer, high-performance liquid chromatograph, and ultraviolet-visible spectrophotometer. In reality, an odor is usually a compound odor composed of various odorous substances. For example, the odor produced by landfills contains dozens of component gases such as hydrogen sulfide, ammonia, and methyl mercaptan [10]. However, the traditional instrumental analysis method cannot detect the compound odor as a whole. In addition, the method requires special equipment that is expensive and slow to produce results; the equipment is large and inconvenient to carry; and it requires sampling in advance, so online testing is not possible. In order to evaluate the harm of odor to the environment and human beings, it is necessary to understand the odor composition so as to find the odor source [11] and prevent the generation of odor at the root. Ideally, instrumental analysis should report not only the chemical composition and concentration of the malodor but also results consistent with the human sense of smell. Therefore, the electronic nose analysis method has received more and more attention.
The electronic nose (EN) is an artificial olfactory system that analyzes, identifies, and detects complex malodors. It can obtain the characteristics of odors and accurately identify different odors and their concentrations based on these characteristics [12]. Compared with the olfactory test method, the electronic nose analysis method has high reproducibility and short sampling time and is not easily affected by subjective factors. Compared with the instrumental analysis method, it can measure compound malodors, its sampling and analysis times are short, and its cost is low. In 1994, Gardner published a review article on the electronic nose in which he formally defined the concept: “The electronic nose is a system composed of a sensor array with partial specificity combined with a corresponding pattern recognition algorithm, used to identify gases or odors of single or complex components” [13]. Lamagna et al. [14] used an electronic nose to identify pollutants in polluted rivers. Yang et al. [15] adopted an electronic nose to distinguish VOCs with different components. Chung et al. [16] in South Korea introduced a vehicle air quality system that uses a back-propagation neural network to raise concentration alarms for harmful gases. Blaschke et al. [17] in Germany combined human senses and MEMS sensor arrays to classify air quality according to different human sensations, and they realized the testing of carbon monoxide, nitrogen dioxide, and benzene. These examples show that the electronic nose can be used for the detection of malodorous gas and applies to various scenarios. Two main factors affect the detection performance of the electronic nose: one is the performance and category of the sensors in the detection system, and the other is the pattern recognition method of the system. W. Zhang et al. designed an electronic nose for alcohol detection based on a specific logic control. The system has low cost and robust reliability but has a high noise level and average sensitivity [18]. S. Licen et al. applied self-organizing mapping to electronic noses to establish frequency-intensity-duration odor signatures for surrounding air types [19]. J. Zhang et al. designed a detection system composed of six MOS sensors for indoor gas, especially kitchen gas. They combined it with the BP-ANN (back-propagation neural network) model to improve the anti-interference ability of the system, but this method did not achieve quantitative detection of mixed gas [20]. J. C. R. Gamboa et al. conducted experiments on the detection performance of the electronic nose using the support vector machine (SVM) method combined with three deep learning models. The experimental results show that the method can improve the detection speed of the system [21]. L. Feng et al. proposed an augmented convolutional neural network to mitigate sensor drift in electronic nose systems [22]. H. Bakiler et al. proposed a long short-term memory method to improve the success rate of metal oxide gas sensor classification and regression for different gas concentrations, thereby improving the accuracy of subsequent pattern recognition [23].
As seen from the above examples, the pattern recognition method used by the system is the key factor affecting detection accuracy. The support vector machine method has been verified and can be applied to the analysis of electronic nose detection information. However, this method still has problems such as mediocre anti-interference ability and a detection speed that needs improvement. This paper summarizes the research on gas sensors, selects five sensors of three different types to form a sensor array, and proposes a BP neural network combined with the LDA dimensionality reduction method to process the collected gas information. The proposed method improves the performance of the electronic nose and realizes the qualitative analysis and quantitative detection of complex malodorous gases. This paper also compares the method with the traditional SVM method. The experiments show that the proposed method can realize the qualitative and quantitative detection of malodorous compound gas and has high accuracy and detection efficiency.
Experimental Equipment
The experimental odor detection system comprises the sensor array, gas circuit structure, data acquisition circuit, signal conditioning circuit, control module, communication circuit, power module, and PC. During the experiment, the gas enters the reaction chamber of the sensor array through the gas path structure and is discharged from the gas outlet. The gas information captured by the sensor is converted into an electrical signal by the data acquisition system and then sent to the host computer for recording and analysis. The structure diagram of the odor detection system is shown in Fig. 1.

Schematic diagram of the compound malodor detection system.
Since a single sensor cannot effectively identify a compound odor gas, a sensor array consisting of multiple sensors is used to obtain odor gas information.
Gas sensors generally have the following performance requirements: (1) cross-sensitivity and broad-spectrum response characteristics; (2) good stability, repeatability, and dynamic characteristics; (3) response and recovery times that are as short as possible, which helps shorten the test time; (4) low power consumption and small size, so as to reduce the volume of the electronic nose and realize miniaturization; (5) ease of use and maintenance.
Gas-sensitive sensors generate electrical signals by adsorption, reaction, and desorption with the sample gas, which changes the sensor properties (e.g., conductivity) [24], and information about the sample gas can be obtained by measuring the electrical signal. At present, the commonly used gas sensor types in the field of the electronic nose are metal oxide semiconductor sensor (MOS), electrochemical sensor, quartz crystal microbalance (QCM), surface acoustic wave sensor (SAW), and photoionization detector (PID) [25–29].
This system requires real-time detection, miniaturization, and high measurement accuracy. After comparing the sensors on the market, the photoionization detector is selected as the primary sensor, and the metal oxide semiconductor sensor is used as the auxiliary sensor. Since most odorous substances contain hydrogen sulfide, a dedicated hydrogen sulfide sensor is added; an electrochemical sensor is chosen for this purpose because of its strong selectivity. Based on the above analysis, the sensor selection is shown in Table 2.
Sensor selection list
The gas path structure in the experimental system includes the gas path switching part, the gas processing part, and the sensor reaction part. It is mainly composed of pipelines, gas stop valves, three-way valves, dryers, diaphragm pumps, mass flow controllers, and pressure stabilizers. The upper part of Fig. 1 is the gas path structure of the compound malodor detection system. In order to meet the requirement that the pipeline does not adsorb gas, the pipeline adopts an austenitic stainless steel pipe. The gas switching part consists of two gas shut-off valves and a three-way valve, which are used to switch between clean air and sample gas. The dryer installed at the gas inlet ensures that the injected gas is dry and reduces the influence of humidity on the sensor array. The diaphragm pump in the gas circuit provides power for the gas flow and keeps the whole system in a negative pressure state so that the gas flows into the gas chamber. A gas mass flow controller is also installed in the gas circuit. It adjusts and stabilizes the gas flow rate, reduces the influence of the flow rate on the sensor array, and prevents the sensor array from heating when the flow rate is too fast or responding insufficiently when the flow rate is too slow. A large number of experiments show that the response of the sensor array to malodorous gas is most obvious when the flow rate is 1 L/min, so the flow rate of the whole system is controlled at 1 L/min. The stage before the sensor array is equipped with a gas storage tank that acts as a pressure stabilizer for the malodorous gas flowing into the sensor reaction gas chamber. The sensor array is sealed in the reaction gas chamber to reduce the influence of the external environment on the sensors.
The data acquisition system of the experimental system is mainly composed of sensor signal acquisition and AD sampling. The five sensors used in this paper can be divided into three categories according to the signal output: the resistance-type sensors TGS2600 and TGS2602, the current-type sensor NE-H2S, and the voltage-type sensors PID-A1 and PID-AH. Sensor signal acquisition converts the detected gas signal into an electrical signal (resistance, current, or voltage). The electrical signal is then converted into a digital signal by the AD module, which is convenient for PC processing.
Experimental gas configuration
This paper uses the dynamic gas distribution method to configure the required concentration of malodorous gas. The gas concentration is expressed as a volume concentration in ppm (10⁻⁶). The dynamic gas distribution method can realize the precise mixing of different gases, configure odorous compound gases with different concentrations of components, and provide a variety of gas concentrations to be measured so that the gas distribution work and the experiment are carried out synchronously. The structure of the gas distribution system is shown in Fig. 2.

Gas distribution system.
The steps for preparing the gas distribution are as follows: (1) Calculate the dilution gas and test gas flow rates corresponding to the target gas distribution concentration. MFC1 is used for the dilution gas, MFC2 is used for the test gas, and the host computer controls the flow rate of gas in each MFC according to the total gas distribution or output time. (2) Adjust the external gas supply pressure: the pressure of pressure-reducing valve 1 is controlled at about 0.2 MPa, and the pressure of pressure-reducing valve 2 is controlled between 0.4 and 0.8 MPa. (3) Open electromagnetic stop valves 1 and 2, then open stop valve 3 to clear the pipeline. (4) Set the flow rates of gas 1 and gas 2 as required. (5) After the gas is configured, to stop the gas distribution system quickly, close outlet stop valve 3 first and then set the MFCs of the two intake channels to 0.
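The flow calculation in the first step can be illustrated with a minimal sketch. It assumes ideal mixing of a test-gas cylinder of known concentration with a zero-concentration dilution gas; the function name `mfc_flows`, the 100 ppm cylinder concentration, and the 1 L/min total flow used in the example are illustrative assumptions, not values reported for the gas distribution system.

```python
def mfc_flows(target_ppm, source_ppm, total_flow_l_min=1.0):
    """Return (dilution_flow, test_flow) in L/min for a target volume concentration.

    Assumes the dilution gas contains none of the analyte and the gases mix ideally,
    so target_ppm = source_ppm * test_flow / total_flow.
    """
    if source_ppm <= 0 or not 0 <= target_ppm <= source_ppm:
        raise ValueError("target must lie between 0 and the source concentration")
    test_flow = total_flow_l_min * target_ppm / source_ppm   # MFC2 set point
    dilution_flow = total_flow_l_min - test_flow             # MFC1 set point
    return dilution_flow, test_flow


# Hypothetical example: 5 ppm from an assumed 100 ppm cylinder at 1 L/min total flow.
dilution, test = mfc_flows(target_ppm=5, source_ppm=100)
```

Under these assumptions the test gas contributes 5% of the total flow and the dilution gas the remaining 95%.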
Qualitative and quantitative analyses were carried out for three malodorous gases: methyl sulfide (C2H6S), ethyl acetate (C4H8O2), and the mixture of these two gases. Methyl sulfide and ethyl acetate were prepared at eight and nine concentrations, respectively: methyl sulfide (1 ppm, 2 ppm, 3 ppm, 5 ppm, 10 ppm, 15 ppm, 20 ppm, 25 ppm) and ethyl acetate (1 ppm, 2 ppm, 3 ppm, 5 ppm, 10 ppm, 15 ppm, 20 ppm, 30 ppm, 40 ppm). Methyl sulfide and ethyl acetate were also mixed in different proportions to obtain nine kinds of mixed gases.
This system uses metal oxide semiconductor sensors, so it must be warmed up for half an hour before each operation. When detecting a certain concentration of the gas to be tested, the experimental steps are as follows. Sensor warm-up: before testing the experimental gas, power the sensor array to preheat it and simultaneously inject clean air to clean the sensor array. This process lasts for half an hour. Data collection: a single data acquisition consists of three parts. ① Baseline process: first, inject clean air for 50 seconds to clean the entire gas circuit so that the response of the sensor array reaches a steady state. ② Sampling process: inject the sample gas for 50 seconds and acquire and record the response curve of the sensor array at a sampling frequency of 10 Hz. ③ Cleaning process: inject clean air for 100 seconds to clean the entire gas circuit and the sensor gas chamber so that the sensor returns to the baseline in preparation for the next measurement. Experimental data are acquired by cyclically repeating steps ①, ②, and ③. Data recording: the reaction amount and rate are extracted from the recorded sensor response curve as characteristic values, and the data are saved on the PC for subsequent pattern recognition. Pattern recognition: the pattern recognition steps are shown in Fig. 3.
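The timing of one acquisition cycle can be sketched as follows. The sketch assumes that all three phases (50 s baseline, 50 s sampling, 100 s cleaning) are logged at the same 10 Hz rate, which the text states only for the sampling phase; the function name `split_cycle` and the array layout (time samples by sensors) are illustrative assumptions.

```python
import numpy as np

FS_HZ = 10                                         # sampling frequency (assumed for all phases)
BASELINE_S, SAMPLING_S, CLEANING_S = 50, 50, 100   # phase durations in seconds


def split_cycle(cycle):
    """Split one recorded cycle (n_samples x n_sensors) into its three phases."""
    cycle = np.asarray(cycle)
    n_base = BASELINE_S * FS_HZ
    n_samp = SAMPLING_S * FS_HZ
    n_clean = CLEANING_S * FS_HZ
    if cycle.shape[0] != n_base + n_samp + n_clean:
        raise ValueError("cycle length does not match the 50/50/100 s protocol")
    baseline = cycle[:n_base]
    sampling = cycle[n_base:n_base + n_samp]
    cleaning = cycle[n_base + n_samp:]
    return baseline, sampling, cleaning


# Placeholder recording: 200 s at 10 Hz for a five-sensor array.
cycle = np.zeros((2000, 5))
baseline, sampling, cleaning = split_cycle(cycle)
```

With these durations a cycle holds 2000 samples per sensor, of which 500 belong to the sampling phase from which the characteristic values would be extracted.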

Pattern recognition step.
Feature Dimensionality Reduction and Feature Selection
Four eigenvalues can be extracted from the response curve of each sensor. In this experiment, a sensor array composed of five sensors is used for data acquisition, so 1 × 20-dimensional data are obtained in each experiment. The measurement is repeated 20 times for each gas concentration, giving 20 × 20-dimensional data per concentration. The whole experiment covers 25 different gas concentrations, so a total of 500 × 20-dimensional data are obtained. This data set is large for subsequent pattern recognition and contains much redundant information. Therefore, feature dimension reduction and feature selection are needed so that lower-dimensional data can represent the raw data, which is convenient for pattern recognition.
Because sensors differ in performance, the eigenvalues of different sensors often show different characteristics, so the eigenvalues extracted from each test sample do not necessarily play a positive role in classification, and some have a negative impact. In order to increase the proportion of useful information in the feature matrix, it is necessary to filter the feature values. Two types of eigenvalues are extracted for each sensor: one is the response amount, and the other is the response rate. This paper uses two dimensionality reduction methods, PCA and LDA, to extract and compress the electronic nose data, reduce the data to two dimensions, and select the most suitable eigenvalues.
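The two reductions can be compared with a short sketch using scikit-learn. The data here are synthetic stand-ins (eight concentration classes, 20 repeats, 20 features), not the measurements of this paper, and the variable names are illustrative. The contribution rate is taken as the sum of `explained_variance_ratio_` over the two retained components, which is one reasonable reading of the term.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Synthetic stand-in data: 8 classes x 20 repeats, 20 features per sample.
rng = np.random.default_rng(0)
n_classes, n_repeats, n_features = 8, 20, 20
centers = rng.normal(scale=5.0, size=(n_classes, n_features))
X = np.vstack([c + rng.normal(size=(n_repeats, n_features)) for c in centers])
y = np.repeat(np.arange(n_classes), n_repeats)

# PCA is unsupervised: it only sees X.
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X)
pca_contribution = pca.explained_variance_ratio_.sum()

# LDA is supervised: it also uses the class labels y.
lda = LinearDiscriminantAnalysis(n_components=2)
X_lda = lda.fit_transform(X, y)
lda_contribution = lda.explained_variance_ratio_.sum()
```

Plotting `X_pca` and `X_lda` coloured by `y` would give scatter diagrams of the same kind as the PCA and LDA analysis charts discussed in this section.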
The analysis was carried out with different concentrations of methyl sulfide, as an example.
As a preliminary step, the reaction amount in the front section is taken as the characteristic value for analysis; Fig. 4 shows the corresponding PCA and LDA analysis diagrams. The contribution rate of the first two principal components in the PCA analysis was 96.17%, and the contribution rate in the LDA analysis was 98.71%. The PCA chart shows that one sample of 3 ppm methyl sulfide overlaps with 5 ppm methyl sulfide, and the 15 ppm, 20 ppm, and 25 ppm samples are relatively close and cannot be distinguished well. The LDA chart shows that the 15 ppm, 20 ppm, and 25 ppm samples still cannot be distinguished well. However, samples of the same concentration cluster more tightly than in the PCA analysis, so the effect of LDA analysis is better than that of PCA analysis.

Analysis of methyl sulfide characterized by front-section reaction amount.
In order to find the most suitable eigenvalues, the following were each analyzed as eigenvalues: the reaction amount in the front section; the reaction amount in the back section; the rate in the front section; the rate in the back section; the reaction amount in the front and back sections; the rate in the front and back sections; the reaction amount and rate in the front section; the reaction amount and rate in the back section; and the reaction amount and rate in the front and back sections. Table 3 shows the eigenvalues and contribution rates of the first two principal components after PCA analysis for each choice. Figure 5 is a PCA analysis diagram of the different eigenvalues. As can be seen from Table 3, after dimensionality reduction the first two principal components represent the original data information well for every choice except the front and back rate, with a cumulative contribution rate above 85%. As shown in Fig. 5, the best differentiation is achieved when the reaction amount and rate in the front section, or the reaction amount and rate in the front and back sections, are used as characteristic values. In these cases the different concentrations of methyl sulfide have the highest degree of discrimination, and the cumulative contribution rates are 89.18% and 85.86%, respectively.
Principal component contribution rate of PCA with different eigenvalues

PCA of different eigenvalues of methyl sulfide.
Figure 6 is an LDA analysis graph of different eigenvalues. As can be seen from Fig. 6, when the front-section rate, the back-section rate, the front-section reaction amount, the front-and-back-section rate, the front-section reaction amount and rate, the back-section reaction amount and rate, the back-section reaction amount, and the front-and-back-section reaction amount and rate are used as eigenvalues, the contribution rates of the first two components are 82.36%, 98.7%, 98.45%, 88.26%, 92.27%, 97.63%, 98.76%, and 94.31%, respectively. Except for the front-section rate, the first two components of the other eigenvalues reflect the original sample information well. In addition, when the reaction amount and rate of the front section are taken as the characteristic values, the different concentrations of methyl sulfide are best separated.

LDA analysis of different eigenvalues of methyl sulfide.
To sum up, both PCA and LDA analysis have the best effect when the reaction amount and rate of the front section are used as eigenvalues.
Taking the reaction amount and rate of the front section as characteristic values, the PCA and LDA analysis diagrams of ethyl acetate shown in Fig. 7 were obtained. As shown in Fig. 7, the cumulative contribution rate of the first two principal components of the PCA analysis is 82.37%, which cannot fully reflect the original information. The cumulative contribution rate of the first two components of the LDA analysis is 97.79%, which reflects the original information well and distinguishes the different concentrations of ethyl acetate.

Ethyl acetate analysis chart characterized by front-section reaction amount and rate.
The gas mixture analysis diagram is shown in Fig. 8. In the PCA analysis diagram of the gas mixture, different gas samples overlap, and the samples of the same gas are relatively scattered. In both the two-dimensional and the three-dimensional PCA analysis diagrams, the three gases are linearly inseparable, whereas LDA clearly distinguishes them. In the LDA analysis of the mixed gas, the samples of different gases are well separated from each other, and the samples of the same gas are relatively concentrated. This shows that LDA is more effective in identifying gas mixtures. Therefore, the LDA method can be given priority when reducing dimensionality.

Gas mixture analysis chart characterized by front-section reaction amount and rate.
It has been determined that the reaction amount and rate of the front section are used as characteristic values. Not only do the eigenvalues affect the pattern recognition results, but the choice of classification algorithm also affects the quality of the classification results. In order to facilitate analysis and comparison, the data set is processed in three ways. The support vector machine and the BP neural network are used as classifiers on the original data set, on the data set after PCA dimensionality reduction, and on the data set after LDA dimensionality reduction, so as to compare the classification effects of the different schemes. We use the classification recognition rate (number of correct recognitions/total number of samples) to evaluate classifier performance: the higher the recognition rate, the better the classifier’s performance.
Option 1: Use a support vector machine or BP neural network directly on the original data set for classification.
Option 2: First, use principal component analysis for feature extraction and then use a support vector machine or BP neural network for classification.
Option 3: First, use linear discriminant analysis for feature extraction, and then use a support vector machine or BP neural network for classification.
Twenty parallel experiments were conducted for each gas composition and concentration, and 16 of the 20 samples were randomly selected as training data. The remaining four samples were used as test data for subsequent prediction. Therefore, there are 400 training samples and 100 test samples. Test samples 1 to 32 are methyl sulfide, 33 to 64 are ethyl acetate, and 65 to 100 are mixed gas.
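The split described above can be sketched as follows, assuming each sample carries a group label that identifies its gas composition and concentration. The function name `split_per_concentration` and the fixed random seed are illustrative choices; the random selection actually used in the experiments is not specified.

```python
import numpy as np


def split_per_concentration(groups, n_train=16, seed=0):
    """Randomly pick n_train samples per group for training; the rest are test data."""
    rng = np.random.default_rng(seed)
    groups = np.asarray(groups)
    train_idx, test_idx = [], []
    for g in np.unique(groups):
        idx = rng.permutation(np.flatnonzero(groups == g))  # shuffle this group's indices
        train_idx.extend(idx[:n_train])
        test_idx.extend(idx[n_train:])
    return np.sort(train_idx), np.sort(test_idx)


# 25 gas conditions x 20 repeated measurements = 500 samples.
groups = np.repeat(np.arange(25), 20)
train_idx, test_idx = split_per_concentration(groups)
```

Splitting within each group keeps every concentration represented in both sets, which gives the 400/100 partition stated in the text.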
Classification based on SVM
The parameters of the SVM model change according to the training data, so there is no need to set them in advance. The main parameters affecting the SVM model are the penalty parameter c and the kernel function parameter g. After the kernel function is determined, these parameters are optimized by cross-validation and grid search. Commonly used kernel functions are the linear kernel, the polynomial kernel, the sigmoid kernel, and the radial basis function (RBF). The RBF kernel has the advantage of fast learning speed and is the most widely used kernel function. This paper uses RBF as the kernel function, creates a model from the training data set, and finally predicts the test data set with the created model. The model performance is judged by the classification accuracy rate (the ratio of correctly classified samples to the total number of samples): the higher the accuracy rate, the better the classification effect, and vice versa. The flowchart of the SVM prediction model is shown in Fig. 9.
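A minimal sketch of this procedure with scikit-learn is given below. The data are synthetic stand-ins for three gas classes, and the search grid (powers of two for c and g) and the five-fold cross-validation are common defaults assumed here, not settings reported in this paper. In scikit-learn the penalty parameter c is `C` and the kernel parameter g is `gamma`.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data: 3 gas classes, 4 features per sample.
rng = np.random.default_rng(0)
centers = rng.normal(scale=4.0, size=(3, 4))
X_train = np.vstack([c + rng.normal(size=(40, 4)) for c in centers])
y_train = np.repeat(np.arange(3), 40)
X_test = np.vstack([c + rng.normal(size=(10, 4)) for c in centers])
y_test = np.repeat(np.arange(3), 10)

# Grid over the penalty parameter C and the RBF kernel parameter gamma (assumed ranges).
param_grid = {
    "svc__C": 2.0 ** np.arange(-5, 6, 2),
    "svc__gamma": 2.0 ** np.arange(-7, 4, 2),
}
model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
search = GridSearchCV(model, param_grid, cv=5)
search.fit(X_train, y_train)

# Recognition rate = correctly classified test samples / total test samples.
recognition_rate = search.score(X_test, y_test)
```

After fitting, `search.best_params_` holds the selected pair of parameters, and the refitted best model is used for the test-set prediction.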

SVM prediction model flow chart.
The test results are shown in Table 4.
SVM test results based on different dimensionality reduction methods
It can be seen from Table 4 that the accuracy rate of SVM for identifying the original data is 94%, and the accuracy rate for identifying the data processed by PCA is 92%; that is, the accuracy of SVM recognition decreases after PCA processing. The SVM pattern recognition method based on LDA has the highest recognition accuracy and identifies the gas category with 100% accuracy, so the LDA-SVM classifier has the best classification effect. Both PCA and LDA are commonly used dimensionality reduction methods. The difference is that PCA is an unsupervised method: it projects the data into a low-dimensional space through an orthogonal transformation and retains the components with the largest variance. LDA is a supervised method that uses the class labels of the data. Therefore, in theory PCA does not separate classes as well as LDA, as demonstrated in Table 4.
Different dimensionality reduction methods combined with BP neural network were used to identify three types of gases. The design of the BP neural network mainly includes the determination of the number of network layers, the determination of the number of neurons in the input layer, the hidden layer, and the output layer, and the determination of the transfer function between the hidden layer and the output layer.
According to the actual situation of the data and the number of expected outputs, the BP neural network designed in this paper includes an input layer, a hidden layer, and an output layer. The number of neurons in the input layer is determined by the type of analysis data; in this system it is 4 or 8. The number of neurons in the output layer is determined by the number of gas categories to be recognized; in this system there are 3 gas categories, so the output layer has 3 neurons. Determining the number of neurons in the hidden layer is crucial to the performance of the entire neural network, because too many or too few hidden neurons will degrade the network’s learning ability. With too many hidden neurons, the training time becomes longer, the system complexity increases, and overfitting may occur. With too few, the network cannot meet the training requirements and cannot correctly identify the test samples. The number of hidden layer neurons is often determined using the following formula, where m represents the number of neurons in the input layer, n represents the number of neurons in the output layer, and l represents the number of neurons in the hidden layer.
The number of hidden layer neurons for the BP neural network can be estimated from the above formula. The optimal number is then determined experimentally by varying the number of neurons and observing the performance of the BP network. The logsig function is selected as the transfer function of the neurons in the hidden layer and the S-function (sigmoid) as the transfer function in the output layer. The flow chart of BP network identification is shown in Fig. 10.
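The three-layer structure can be sketched with scikit-learn's `MLPClassifier`, which is swapped in here for illustration and is not the implementation used in this paper. The sketch assumes 4 input features, 3 outputs, a hidden layer of 6 neurons (an arbitrary illustrative size), and sigmoid-type transfer functions in both layers; the data are synthetic. Passing one-hot targets makes scikit-learn use a logistic output unit per class, so each output lies in [0, 1].

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in data: 3 gas classes, 4 input features per sample.
rng = np.random.default_rng(0)
centers = rng.normal(scale=4.0, size=(3, 4))
X = np.vstack([c + rng.normal(size=(40, 4)) for c in centers])
labels = np.repeat(np.arange(3), 40)
Y = np.eye(3, dtype=int)[labels]   # one-hot targets [1,0,0], [0,1,0], [0,0,1]

net = MLPClassifier(
    hidden_layer_sizes=(6,),       # one hidden layer; 6 neurons is an assumed size
    activation="logistic",         # sigmoid transfer function in the hidden layer
    solver="lbfgs",
    max_iter=2000,
    random_state=0,
)
net.fit(X, Y)

# One row per sample, one sigmoid output per gas category.
outputs = net.predict_proba(X[:5])
```

In practice the hidden layer size would be varied around the value suggested by the empirical formula and the best-performing setting kept, as described above.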

BP network identification flow chart.
When the BP network is used for qualitative identification, the expected output is 0 or 1: an output of 0 means that the gas is not present, and an output of 1 means that it is present. However, because the data are normalized before qualitative identification, the actual outputs are not the integers 0 or 1 but numbers in the interval [0, 1]. We stipulate that an actual output value greater than 0.8 is treated as 1 and a value less than 0.2 is treated as 0. The output of the BP network represents three different types of gases, denoted by [y1, y2, y3], and the expected outputs are:
Methyl sulfide: [y1,y2,y3] = [1,0,0]
Ethyl acetate: [y1,y2,y3] = [0,1,0]
Mixed gas: [y1,y2,y3] = [0,0,1]
To identify a sample as methyl sulfide, the following conditions must be met simultaneously: y1 > 0.8, y2 < 0.2, and y3 < 0.2.
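The decision rule built from the 0.8 and 0.2 thresholds can be sketched as a small decoding function. The function name `decode_output` is illustrative, and returning `None` for outputs that fall between the thresholds or that do not match exactly one category is an assumption made here; how such ambiguous outputs are handled is not stated.

```python
import numpy as np

LABELS = ("methyl sulfide", "ethyl acetate", "mixed gas")


def decode_output(y, high=0.8, low=0.2):
    """Map a network output [y1, y2, y3] to a gas label, or None if ambiguous."""
    y = np.asarray(y, dtype=float)
    bits = np.full(y.shape, -1)    # -1 marks an output between the two thresholds
    bits[y > high] = 1
    bits[y < low] = 0
    # Require every output to be decided and exactly one output to be 1.
    if (bits == -1).any() or bits.sum() != 1:
        return None
    return LABELS[int(np.argmax(bits))]


label = decode_output([0.93, 0.04, 0.08])
```

For the example output the first component exceeds 0.8 and the other two are below 0.2, so the sample is decoded as methyl sulfide.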
The recognition results of the BP network are shown in Table 5. With either the original data or the data after LDA dimensionality reduction as the input of BP, the recognition rate reaches 100%. With the data after PCA dimensionality reduction as the input, the recognition rate is 95%.
BP neural network test results based on different dimensionality reduction methods
LDA combined with SVM or BP gave better recognition than PCA combined with SVM or BP. Although using the original data directly as the BP input also reaches a 100% recognition rate, using the LDA-reduced data reduces the amount of data and the computational cost while preserving correct classification.
The output results of the LDA-based BP neural network for the test samples are shown in Fig. 11. The mean square error of the BP network output is 9.79e-05, which is less than the set error of 0.0001.

Qualitative identification results of BP neural network based on LDA.
In Fig. 11(c), 5/10 denotes a gas mixture sample composed of 5 ppm methyl sulfide and 10 ppm ethyl acetate. Table 5 and Fig. 11 show that the predictions of the BP neural network on the test samples are consistent with the actual composition, which verifies the feasibility of the LDA-based BP neural network for qualitative analysis.
Different dimensionality reduction methods and classification methods will affect the pattern recognition accuracy. In this study, the LDA dimensionality reduction method performs better, and the SVM and BP network recognition results are similar.
The concentrations of methyl sulfide and ethyl acetate were predicted with partial least squares (PLS) and with a BP neural network. The two models are compared by their average relative errors on the test samples, and the better one is selected as the quantitative pattern recognition algorithm for the odor detection system.
Concentration prediction based on partial least squares
A partial least squares regression model was established for methyl sulfide and for ethyl acetate. Each model was evaluated using the p-value, the correlation coefficient R², and the average relative error. Figure 12 shows the PLS concentration predictions.

PLS concentration prediction.
When PLS predicts the concentration of methyl sulfide, the p-value is 0.000, which is less than 0.05, and the correlation coefficient R² is 0.9735, indicating that the model fits the data well overall. However, the average relative error of the training samples is 34.53%, which is large and shows that the concentration prediction of this model is poor. Fig. 12(a) shows that the observed data deviate markedly from the fitted curve, and that the relative error is especially large for low-concentration methyl sulfide. The reason is that the electronic nose responds only weakly to low gas concentrations, which amplifies both the measurement error and the prediction error. The relative error decreases with increasing concentration but remains high. These results show that the partial least squares method is not well suited to predicting methyl sulfide concentration.
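The effect of concentration on relative error can be made concrete with a short Python sketch. The concentrations and the constant 0.5 ppm offset are hypothetical values chosen for illustration, not measurements from this study; the function names are likewise assumptions.

```python
def relative_errors(actual, predicted):
    """Relative error |predicted - actual| / actual for each sample."""
    return [abs(p - a) / a for a, p in zip(actual, predicted)]

def average_relative_error(actual, predicted):
    errs = relative_errors(actual, predicted)
    return sum(errs) / len(errs)

# Hypothetical concentrations in ppm; every prediction is off by the same 0.5 ppm
actual = [1.0, 5.0, 10.0, 20.0]
predicted = [a + 0.5 for a in actual]

errs = relative_errors(actual, predicted)
for a, e in zip(actual, errs):
    print(f"{a:5.1f} ppm  relative error {100 * e:5.1f}%")
print(f"average relative error {100 * average_relative_error(actual, predicted):.2f}%")
```

The same absolute deviation produces a much larger relative error at 1 ppm than at 20 ppm, so a model can show a high R² and still have a large average relative error when low-concentration samples are poorly predicted.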
When PLS predicts the concentration of ethyl acetate, the p-value is 0.000, the correlation coefficient R² is 0.9887, and the average relative error of the training samples is 10.87%, which is relatively small. Fig. 12(b) shows that the training samples are evenly distributed around the fitted curve, that the relative error between the fitted and actual values is small, and that the fitted output correlates well with the actual values. The relative error between the predicted and actual values decreases as the gas concentration increases: PLS regresses well at higher ethyl acetate concentrations, but its prediction accuracy at low concentrations is poor. Overall, the prediction accuracy of PLS is higher for ethyl acetate than for methyl sulfide, possibly because the electronic nose responds well and with high sensitivity to ethyl acetate.
The partial least squares method combines the advantages of multiple linear regression and principal component regression. It is particularly useful when the variables are strongly correlated with one another and when the number of samples is smaller than the number of variables. However, partial least squares is essentially a linear regression, so it cannot establish a suitable model for nonlinear data. When partial least squares regression is used to predict the concentration of VOCs, the regression model for ethyl acetate is relatively good, but the model for methyl sulfide is not ideal. Because the relationship between the electronic nose signal and concentration is primarily nonlinear, linear regression methods such as PLS cannot accurately predict the concentration of VOCs. To improve prediction accuracy, nonlinear methods such as neural networks should be used.
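The limitation of a linear model on a nonlinear sensor response can be illustrated with a small Python sketch. This is not PLS itself: it is a single-variable ordinary least-squares line used as a stand-in for any linear regression. The data are synthetic, and the square-root (power-law) relationship between concentration and response is an assumption made only for this example.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = a * x + b for one variable."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    a = sxy / sxx
    return a, my - a * mx

# Synthetic data: concentration in ppm and an assumed power-law sensor response
conc = [1.0, 2.0, 5.0, 10.0, 20.0]
resp = [c ** 0.5 for c in conc]

a, b = fit_line(resp, conc)  # linear model predicting concentration from response
pred = [a * r + b for r in resp]
rel_err = [abs(p - c) / c for p, c in zip(pred, conc)]

for c, e in zip(conc, rel_err):
    print(f"{c:5.1f} ppm  relative error {100 * e:6.1f}%")
```

In this synthetic case the straight line cannot follow the curvature of the response, and the largest relative error occurs at the lowest concentration, which mirrors the behaviour described for methyl sulfide.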
Similar to qualitative analysis, the training samples are first selected to train the BP network, and then, the test samples are used to test and validate the BP network.
The concentration prediction results of the BP neural network established for methyl sulfide and ethyl acetate are shown in Fig. 13.

BP neural network concentration prediction.
The BP neural network predicts low-concentration VOCs better than the partial least squares method. For methyl sulfide, the maximum prediction error was 8.98% and the average relative error was 3.24%; for ethyl acetate, the average relative error was 5.43%. Both average errors are well below those of the corresponding PLS models, indicating that concentration prediction based on the BP neural network is the more accurate approach.
The key to the quantitative detection of the mixed gas is to classify the gas correctly and then to predict the concentrations with the BP neural network models established for the single gases. The mixed gas test data obtained are shown in Table 6.
Quantitative identification results for the gas mixture
Table 6 shows that, in the quantitative identification of the gas mixture by the BP neural network, the maximum prediction error is 4.238% for methyl sulfide and 4.138% for ethyl acetate. The BP neural network therefore achieves the expected performance in the quantitative detection of a gas mixture. The experimental results show that the detection and data processing method proposed in this paper performs well and, compared with existing methods, offers smaller equipment, better portability, and faster online detection. At present, infrared spectroscopy is commonly used to detect mixed gases. Y. Shi et al. used infrared spectroscopy combined with a principal component regression algorithm to rapidly measure the octane content of gasoline [30], which showed that infrared spectroscopy is very suitable for the quantitative analysis of fluid components. However, when this method is applied to mixed gases, its analysis speed and stability need to be improved. More recently, H. Ling et al. proposed the NAS-DPCA-PLS prediction model to improve quantitative analysis by infrared spectroscopy [31]; it improved the prediction accuracy of gas mixture identification, but the stability and anti-interference ability of the system were not analyzed. Because a diverse sensor array was selected, the system in this paper is very sensitive to the composition of the gas, so it has good anti-interference ability and can accurately analyze the content of each component of the mixed gas.
There is a tremendous demand for detecting malodorous compound gases in the air, especially VOCs. However, detection faces several problems: the qualitative identification of malodorous compound gas is difficult, the amount of data to be processed is large, the malodor content is hard to quantify, and online detection is not possible. This paper designs a dedicated qualitative and quantitative detection device for malodorous compound gas. Compared with the traditional olfactory test method, the system is faster, more objective, more accurate, and easier to operate. Compared with instrumental analysis, it is more convenient and can measure malodorous compound gas directly without bulky precision equipment.
In this study, we selected the sensor types of the sensor array according to the composition of the compound malodorous gas and designed a comparative experiment with 2 dimensionality reduction schemes and 3 pattern recognition schemes. We first compared the PCA and LDA dimensionality reduction methods using ethyl acetate and methyl sulfide as test gases. Based on how well the concentrations were separated, we concluded that LDA was more effective than PCA. Experiments were then conducted to analyze different feature values using the PCA and LDA methods, and the response magnitude and rate of the earlier and later segments were chosen as the feature values. For the qualitative identification of gas, this paper applies the support vector machine method and the BP neural network method to the original data set, the data set after PCA dimension reduction, and the data set after LDA dimension reduction. Experiments show that the BP neural network performs better than SVM in qualitative gas recognition. In this study, the accuracy of the BP neural network in the qualitative identification of malodorous gas reached 100%. In addition, after dimension reduction by LDA, the data set is smaller and cheaper to process, so the qualitative identification of malodorous gas is faster. Therefore, the compound malodorous gas detection system constructed in this paper adopts the BP neural network combined with LDA dimension reduction for qualitative gas recognition. For the quantitative analysis of gas, this paper uses two methods, partial least squares and a BP neural network, to conduct quantitative detection experiments for methyl sulfide and ethyl acetate.
The experimental results show that the BP neural network is better than the partial least squares method for concentration prediction. Finally, a quantitative detection experiment on the mixed malodorous gas was carried out. It shows that the error of the method based on the BP neural network does not exceed 4.3%, which meets the detection requirements of the mixed odor detection system. In summary, a qualitative and quantitative detection system for compound odor has been designed and implemented, and the feasibility and accuracy of LDA combined with BP neural network recognition have been demonstrated experimentally. In the next stage, we will study how to improve the system’s anti-interference performance and detection speed, and we will design additional sensor arrays and pattern recognition methods to optimize the system.
In general, this paper describes the construction and selection of sensor arrays, compares the pattern recognition performance of the PCA and LDA dimension reduction methods and of the SVM and BP neural network methods, and achieves stable and fast qualitative and quantitative detection of mixed gases. However, the system is designed mainly for VOC gas mixtures; to detect other gases, sensor types would need to be added or removed. We have demonstrated the high performance of the BP neural network combined with LDA and analyzed its advantages and shortcomings relative to other methods in the same field, and we hope this provides useful references and ideas for researchers in related fields.
