Abstract
Demand forecasting of auto parts is an essential part of inventory control in the automotive supply chain. Short-term auto parts demand data are non-stationary, strongly random, locally abrupt, and non-linear, which makes them difficult to predict accurately. To address this, this paper proposes a combined prediction model based on EEMD-CNN-BiLSTM-attention. First, the model uses ensemble empirical mode decomposition (EEMD) to decompose the original data into a series of intrinsic mode functions (IMFs) and a residual term in order to extract more feature information. It then uses the CNN-BiLSTM-attention model to predict each component separately, and the component predictions are summed to obtain the final prediction. The attention mechanism automatically assigns weights to the BiLSTM hidden layer states to distinguish the importance of different time steps, which reduces the loss of historical information and highlights critical historical time points. The prediction results are output through the fully connected layer. We then conduct an experimental analysis on collected short-term auto parts demand data. The experimental results show that the proposed model has smaller errors and higher prediction accuracy than the nine comparison models, which verifies the effectiveness of the EEMD-CNN-BiLSTM-attention model for short-term parts demand forecasting.
Introduction
For the automotive supply chain, accurately grasping the demand for parts is essential to inventory control, and timely and accurate supply from parts suppliers is a critical link in keeping the manufacturer (OEM) in a good state of production operation. Demand forecasting for parts and other materials is affected by many internal and external factors across the member enterprises of the supply chain. There are many kinds of parts, and unexpected situations such as urgent orders, equipment failures that change production capacity, and process changes occur frequently. The host plant cannot adjust the production plan in real time, so the plan becomes disconnected from actual production and the production schedule is difficult to control [1]. Parts demand forecasting is critical not only for automotive companies to reduce costs and prevent inventory shortages, but also for coordinating activities such as purchasing, production, and inventory in the automotive supply chain so as to minimize the impact of the bullwhip effect [2].
Historical demand data from automobile manufacturer A in China show several products whose demand time series contain a large number of zero values, that is, intermittent demand in the general sense. Intermittent demand is non-smooth and non-continuous: it does not occur in every period, and the size of the demand that does occur changes constantly. In other words, short-period auto parts demand data are characterized by non-stationarity, strong randomness, abrupt local changes, and non-linearity, which makes accurate forecasting difficult. This complex form of demand poses many difficulties for demand forecasting and inventory control, because most classical forecasting methods and models are difficult to apply to this demand pattern. Practice at automotive equipment manufacturing enterprise A shows that, because of poor demand forecasting, product shortages and surpluses occur from time to time. Shortages lead to loss of profit and of market expansion opportunities, and surpluses waste a great deal of the enterprise's human, material, and financial resources.
Intermittent demand arises in many industries, especially for commodities related to spare parts, such as heavy machinery and electronics, process industries, the automotive industry, durable goods spare parts, telecommunication systems, and aircraft maintenance service parts [3]. Because intermittent demand is common in practice, forecasting it is of great importance for many companies: even a slight improvement in forecasting can yield significant cost savings [4].
In addition, in complex equipment manufacturing companies such as vehicle manufacturers, goods with intermittent demand account for a significant share of total inventory value, and the inventory value of accessories typically accounts for more than 60% of total inventory costs. To balance the production cycle and reduce production and inventory costs while meeting the individual needs of customers, more and more enterprises are adopting the assemble-to-order (ATO) production model. Under the ATO model, manufacturers can meet the customized needs of customers while ensuring quick delivery of products, which helps enterprises adapt. The ATO model has been widely used and promoted in many automobile manufacturing enterprises. For example, Toyota proposed “to produce goods in demand”, and after implementing the ATO production model it achieved not only a combination of low cost and high quality but also a multi-variety, low-volume production plan. According to Toyota statistics, of the 364,000 vehicles produced by the company in three months, there were four basic models and 32,100 model variants, with an average production of 11 units of the same variant, a maximum of 17 units, and a minimum of 6 units.
In the ATO model, manufacturers need to assemble different quantities and types of parts to fulfill a customer order. The lack of any one part can cause the entire order to fail, so it is crucial to keep a reasonable number of each part in stock. Inventory control of parts depends on demand forecasting, and the result of demand forecasting often determines whether the parts inventory is reasonable. If manufacturers can accurately forecast the demand for parts and set the parts inventory level based on that forecast, they can reduce the parts inventory backlog and thus the parts inventory cost.
Demand-driven theory has long existed and is widely accepted by manufacturing enterprises. It holds that the formation, existence, and reconfiguration of the supply chain are based on particular market demands, and that user demand is the driving source for the information flow, service flow, and capital flow of the supply chain. In the supply chain management model, the operation of the supply chain is order-driven: user demand orders drive commodity purchase orders, commodity purchase orders drive product manufacturing orders, and product manufacturing orders drive raw material purchase orders. Demand forecasting is therefore increasingly essential.
Therefore, accurately capturing the change characteristics of different time dimensions at the micro level and predicting short-period parts logistics demand can improve the timeliness, accuracy, and economy of the parts supply chain. Accurate parts demand prediction is also of great significance for improving the production efficiency of the host factory's production plant, making more reasonable production plans, ensuring lean management, and improving the operational efficiency of automotive production enterprises [5].
Demand forecasting, as the first line of defense in the supply chain, is a core part of supply chain management. Many experts and scholars in academia and industry have therefore long worked on such product inventory management issues and have begun to apply new forecasting techniques to product demand forecasting.
To deal effectively with non-stationary time series demand data, some researchers have proposed decomposing the data first and reconstructing the results after forecasting, which introduces time-frequency analysis methods from the signal processing field into research on forecasting auto parts with non-stationary demand.
Therefore, in this paper, we propose to combine the decomposition method with machine learning algorithms, predicting each decomposed modal component separately, which can achieve high prediction accuracy.
In contrast to existing studies, the main contributions of this paper are as follows: (1) Unlike most studies, which apply forecasting models to longer time series, this paper forecasts short-period, irregular auto parts demand, whose data are non-stationary, strongly random, locally abrupt, and non-linear. (2) Unlike traditional statistical methods and machine learning methods, this paper starts from the two perspectives of data feature decomposition and feature extraction, combines decomposition methods with deep learning, and proposes a combined EEMD-CNN-BiLSTM-Attention prediction model to address the large direct prediction error and the prediction lag caused by non-linear, non-stationary original demand. First, the original data are decomposed into N intrinsic mode functions (IMFs) and one trend term (Res) by ensemble empirical mode decomposition. Each component contains local features of the original data at a different time scale, and no basis function needs to be set in advance. CNN-BiLSTM-Attention prediction is then performed for each component separately, and the component predictions are summed to obtain the final prediction.
The rest of the paper is arranged as follows. Section 2 reviews the literature on the related models. Section 3 describes the research methods and data collection. Section 4 covers data processing. Section 5 analyzes examples and experimental results based on the preceding methods and models. Section 6 offers conclusions, policy implications, and limitations.
Review of the literature
Demand forecasting of auto parts
Currently, the existing literature on demand forecasting methods for automotive parts, spare parts, materials, etc., is divided into four main categories as follows:
(1) Mathematical and statistical models. Statistical methods include autoregressive (AR), autoregressive moving average (ARMA), autoregressive integrated moving average (ARIMA), generalized autoregressive conditional heteroskedasticity (GARCH), the grey GM(1,1) model [6], linear discriminant analysis (LDA), etc. Chen et al. adapted the ARMA model to forecast demand for two auto parts, “painting materials” and “oil filters”. The results showed that the model has some practical value and high prediction accuracy, but when the time series fluctuates or is unstable, the ARMA prediction is not satisfactory [7]. Hu et al. used the grey prediction model GM(1,1) to forecast the demand for magnesium materials and corrected the residuals using neural networks [8]. The results showed that the average forecast error was reduced by 6.5%, but the model lacked a combination of regression analysis and time series forecasting [9]. The above models are statistical and rest mainly on assumptions of linearity and normal distribution. They consider only simple interactions between variables and predict time series demand data through simple model assumptions, so they perform well only on less volatile data. Because parts demand is non-linear and non-stationary, they often suffer from low prediction accuracy.
(2) Machine learning models. Compared with traditional statistical methods, artificial intelligence methods can better handle non-linear, discontinuous, and non-stationary dynamic time series of auto parts demand [10]. Yu et al. used an artificial neural network to forecast demand for spark plugs, an auto part; it improved prediction accuracy compared with the ARMA model but required tuning a large number of parameters before obtaining good results [11]. Artificial neural networks (ANN), as an adaptive information processing method that can capture non-linear relationships between complex data, have been widely used to fit complex automotive supply chain systems. Tang et al. fed inventory information and material attribute information into a BP neural network to forecast sales demand as a time series and to analyze the enterprise material demand budget [12]. Chen et al. also proposed a regression-Bayesian-back propagation neural network forecasting model, which has higher prediction accuracy and better robustness than the ARMA model [13]. Shallow machine learning can capture the non-linear pattern of parts demand reasonably and effectively, which further improves prediction accuracy. However, it has limited capability in data feature learning and generalizes poorly.
With the rapid development of big data and computer technology, research on demand forecasting of automotive parts increasingly uses deep learning methods, which are gradually becoming a research hotspot [14]. Convolutional neural networks (CNN), recurrent neural networks (RNN), and attention mechanisms (Attention) are typical deep learning algorithms. CNN reduces neural network complexity through sparse interaction, parameter sharing, and equivariant representation, which gives the model excellent feature extraction ability and noise resistance. RNNs are mainly used to describe the dynamic behavior of time series data, but they suffer from vanishing and exploding gradients and lack long-term memory capability [15].
To overcome these drawbacks, Hochreiter and Schmidhuber introduced the long short-term memory (LSTM) neural network, which enables RNNs to use long-term temporal information effectively [16]. Demand forecasting is one of the primary decision-making tasks for companies, and Khan and Saqib predicted store demand with the help of deep learning models [17]. Minglu Chen et al. constructed a combined forecasting model based on RF and long short-term memory networks and conducted experiments using battery sales and aftermarket data for auto parts. The results showed that the model outperforms both the single models and the combined ARIMA-LSTM model, but the weights of the sub-models need further optimization [18]. Deep learning models have a deeper structure, emphasize feature learning, and can describe the complex correlation between input and output more accurately. Their predictions for time series data are usually better than those of mathematical statistical models and support vector machines, and they are easy to implement, but their prediction accuracy for non-stationary data with strong randomness still needs to be improved. LSTM networks have a special memory capacity and gate structure. BiLSTM implements a bi-directional recurrent structure with forward and backward propagation based on LSTM, which increases the connection between data streams. In recent years, the attention mechanism, as an efficient resource allocation mechanism, has gradually become a research hotspot in speech recognition, image recognition, machine translation, etc. Attention models can also be integrated to capture multi-level saliency factors and improve model training efficiency [19]. CNN, BiLSTM, and Attention models each have their own advantages and disadvantages.
BiLSTM is suitable for time series problems: it can overcome vanishing and exploding gradients and better capture the bi-directional temporal features between data. However, it needs to train a large number of parameters, which hinders training efficiency. Parameter sharing in CNN can overcome this weakness. Therefore, in this paper, CNN, BiLSTM, and Attention are combined to predict time-series automotive parts demand data.
(3) Combination forecasting models of two or more methods. Since each single forecasting model has different strengths and weaknesses, combined forecasting can draw on the advantages of several models to reduce forecasting risk and improve forecasting accuracy [20]. Combined forecasting models take advantage of the strengths of each constituent model, thus improving forecasting accuracy [21]. For example, Mehdizadeh proposed a forecasting model that integrates categorical inventory analysis and rough set theory for demand forecasting of automotive parts, but because it relies on conventional forecasting methods, its forecasting accuracy needs further improvement [22]. Wang et al. proposed an error compensation mechanism that uses a rolling long short-term memory neural network to capture non-linear data and then compensates the residuals with a rolling autoregressive moving average model; however, the mechanism only reduces the maintenance cost of the forecasting model and the time for model correction and training [23]. In the field of combined parts demand forecasting, Yang et al. established a non-negative variable weight combination model based on ARIMA, multiple regression, and SVR for demand forecasting of parts in the automotive aftermarket, and verified experimentally that the model has higher forecasting accuracy than a single model, but did not consider the selection of the optimal sub-model [24]. Chandriah et al. proposed a recurrent neural network/long short-term memory (RNN/LSTM) approach with an improved Adam optimizer for spare parts demand forecasting. In this model, the weight vectors are generated separately and the weights are optimized using the improved Adam algorithm. The experimental results show that the improved RNN/LSTM method achieves the minimum error compared with existing methods and can be used well for predicting automotive parts demand [25].
The integration of EEMD and other forecasting methods
However, because the raw demand data contain a lot of noise, the data usually need to be pre-processed to improve prediction performance as much as possible. The decomposition algorithm proposed by Norden E. Huang addresses this difficulty. Empirical mode decomposition (EMD) is advantageous in dealing with non-stationary data: it decomposes the signal so that the non-stationary data become stationary components. However, EMD still has limitations, chiefly mode mixing and endpoint effects [26]. In response, Wu et al. proposed ensemble empirical mode decomposition (EEMD), a noise-assisted data analysis algorithm, to solve the mode mixing problem [27, 28]. Currently, prediction models that incorporate the EEMD decomposition algorithm fall broadly into two types: “decomposition algorithm-statistical model” and “decomposition algorithm-machine learning”.
For the “decomposition algorithm-statistical model” type, the statistical model is simple and easy to operate. Jiang et al. proposed a high correlation-based EMD-VAR wind speed prediction model for multiple adjacent measurement points. The empirical mode decomposition (EMD) method is used to remove the noise from the raw data and obtain the IMF components. For each IMF component of the spatial group, a corresponding vector autoregressive model is developed. The final prediction is obtained by aggregating the predictions of the IMF components [29]. Liu et al. developed three single models (ARIMA, BP, and SVM) and three hybrid models (EEMD-ARIMA, EEMD-BP, and EEMD-SVM) to predict short-term urban water consumption and optimize the operation of urban smart water distribution stations [30]. However, because statistical models assume linear or normally distributed data, it is difficult to improve their prediction accuracy further.
In the “decomposition algorithm-machine learning” type, Yu et al. combined empirical mode decomposition with neural network models to forecast world crude oil futures prices [31]. Shao et al. proposed an integrated model based on long short-term memory neural networks with whale bionic optimization for short-term load forecasting. The original signal is decomposed into multiple feature components by ensemble empirical mode decomposition, and each feature component is fed into the bionic optimization combined model for prediction. The bionic whale algorithm is used to keep the long short-term memory neural network from falling into local optima and to improve the accuracy of parameter optimization [32]. Amanollahi et al. used ensemble empirical mode decomposition together with a generalized regression neural network (EEMD-GRNN) and an adaptive neuro-fuzzy inference system (ANFIS), a mixture of non-linear models, to predict urban PM2.5 concentrations [33]. He et al. proposed an integrated approach to wind power prediction based on ensemble empirical mode decomposition (EEMD) and least absolute shrinkage and selection operator-quantile regression neural network (LASSO-QRNN) models. The model integrates data pre-processing techniques, feature selection methods, prediction models, and data post-processing techniques. It markedly improves prediction accuracy and effectively quantifies the uncertainty of the prediction process [34]. Wang and Xu proposed a combined EEMD-PSO-SVM prediction model for effectively predicting rainfall in the Yellow River basin [35]. The above literature shows that when machine learning algorithms predict the decomposed modal components, this “decomposition first, ensemble later” approach achieves higher prediction accuracy.
Although the above combined models have achieved good prediction results, the latest models combining deep learning and decomposition are rarely used in automotive parts demand forecasting.
Research methodology
Ensemble empirical modal decomposition (EEMD)
Empirical mode decomposition (EMD) tends to suffer from defects such as mode mixing and the lack of a uniform stopping criterion for the iteration when decomposing raw data. To solve these problems, Huang and Wu proposed the ensemble empirical mode decomposition (EEMD) algorithm [27]. EEMD adds Gaussian white noise to the original signal, decomposes the resulting noisy signal, and averages the modal components over repeated trials to obtain the true modes.
The specific steps of EEMD are as follows:
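(1) Add a white noise signal $\zeta_j(t)$ that follows a normal distribution to the original signal $q(t)$ to obtain the new signal $q'(t)$:

$$q'(t) = q(t) + \zeta_j(t) \tag{1}$$

where $\zeta_j(t)$ is the white noise signal, $j = 1, 2, \cdots, M$, and $M$ is the number of trials.

(2) Decompose the signal $q'(t)$ to obtain multiple intrinsic mode functions $I_i(t)$ and a trend term $\mathrm{res}(t)$:

$$q'(t) = \sum_{i=1}^{N} I_i(t) + \mathrm{res}(t) \tag{2}$$

(3) Repeat steps (1) and (2), adding a different white noise realization of the same intensity each time. Writing $q'_j(t)$ for the noisy signal of the $j$-th trial:

$$q'_j(t) = \sum_{i=1}^{N} I_{ij}(t) + \mathrm{res}_j(t), \quad j = 1, 2, \cdots, M \tag{3}$$

(4) Because the added white noise has zero mean, averaging the IMF components over the $M$ trials cancels the noise and gives the final IMF components:

$$\bar{I}_i(t) = \frac{1}{M} \sum_{j=1}^{M} I_{ij}(t) \tag{4}$$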
During the decomposition, the number of iterations (noise-added trials) has a significant impact on the overall quality of the mode decomposition. In this paper, the white noise standard deviation is set to 0.2 and the number of white noise iterations is 200. The decomposition yields N IMF components, each of which is predicted separately in the next stage.
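The ensemble part of steps (1) to (4) can be sketched in plain Python. This is only an illustration under stated assumptions: `toy_decompose` is a hypothetical stand-in that splits a series into a detail part and a moving-average trend, and it is not real EMD sifting; scaling the noise by the standard deviation of the signal (`noise_ratio`) is also an assumed convention. Real EMD can return a different number of IMFs per trial, which this sketch does not handle.

```python
import random


def toy_decompose(signal, window=3):
    """Hypothetical stand-in for EMD sifting (NOT real EMD):
    returns a detail part and a moving-average trend."""
    n = len(signal)
    half = window // 2
    trend = []
    for t in range(n):
        lo, hi = max(0, t - half), min(n, t + half + 1)
        trend.append(sum(signal[lo:hi]) / (hi - lo))
    detail = [s - m for s, m in zip(signal, trend)]
    return [detail, trend]  # the two components sum back to the input


def ensemble_decompose(signal, decompose, trials=200, noise_ratio=0.2, seed=0):
    """Add white noise, decompose, repeat, then average the components."""
    rng = random.Random(seed)
    mean = sum(signal) / len(signal)
    std = (sum((s - mean) ** 2 for s in signal) / len(signal)) ** 0.5
    sums = None
    for _ in range(trials):
        # Step (1): Gaussian white noise scaled to the signal (assumed convention).
        noisy = [s + rng.gauss(0.0, noise_ratio * std) for s in signal]
        # Step (2): decompose the noisy copy.
        components = decompose(noisy)
        if sums is None:
            sums = [[0.0] * len(signal) for _ in components]
        # Step (3): accumulate the components of every trial.
        for k, component in enumerate(components):
            for t, value in enumerate(component):
                sums[k][t] += value
    # Step (4): average each component over all trials.
    return [[value / trials for value in component] for component in sums]


demand = [0, 2, 0, 5, 1, 0, 3, 0, 4, 0]  # made-up intermittent series
components = ensemble_decompose(demand, toy_decompose)
rebuilt = [sum(values) for values in zip(*components)]
```

Because the added noise has zero mean, `rebuilt` stays close to `demand` after averaging, which is the property step (4) relies on.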
CNN is one of the most widely used algorithms in deep learning because it implicitly extracts features from training data during learning, has efficient feature extraction capability, and avoids reliance on empirical knowledge for feature extraction. Hence, CNN has many applications in prediction research [36, 37]. A CNN is mainly composed of convolutional layers and pooling layers. The convolutional layer is the core of the network and is mainly used to extract and map features of the input data: the input is convolved with different convolutional kernels, which extract non-linear local features of the automotive parts demand data. The pooling layer compresses the extracted features and retains the more important feature information to improve generalization. If the CNN contains multiple convolutional layers, the number of feature parameters they output is large, so the pooling layer is often used to subsample the convolutional features, which reduces the number of parameters and helps prevent overfitting. A fully connected layer is usually placed at the end of the CNN to reduce unnecessary feature loss and to integrate all features into the final output. The basic structure of CNN is shown in Fig. 1.

Fig. 1. CNN structure diagram.
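The convolution, activation, and pooling operations described above can be illustrated with a minimal pure-Python sketch. The kernel values and the demand numbers are made up for illustration; a trained CNN learns its kernels from data.

```python
def conv1d(series, kernel, bias=0.0):
    """Valid 1-D convolution (cross-correlation, as in most deep learning libraries)."""
    k = len(kernel)
    return [
        sum(series[i + j] * kernel[j] for j in range(k)) + bias
        for i in range(len(series) - k + 1)
    ]


def relu(values):
    """ReLU activation: negative values become zero."""
    return [max(0.0, v) for v in values]


def max_pool1d(values, size=2):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(values[i:i + size]) for i in range(0, len(values) - size + 1, size)]


demand = [1.0, 3.0, 2.0, 5.0, 4.0]                   # made-up demand values
features = relu(conv1d(demand, kernel=[1.0, -1.0]))  # local feature map
pooled = max_pool1d(features, size=2)                # compressed features
```

Here the kernel `[1.0, -1.0]` responds to drops between consecutive periods, and pooling halves the length of the feature map while keeping the strongest response in each window.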
Hochreiter et al. first proposed the LSTM neural network, a novel recurrent network architecture that aims to solve the vanishing gradient problem when training on long sequences [16]. By learning long-term correlations in the time-series data, LSTM helps improve the accuracy of short-term automotive parts demand prediction and lets the network converge better and faster. The long-period modes produced by EEMD have a significant trend component. Compared with other artificial intelligence methods such as support vector machines (SVMs), LSTM can learn this long-term trend component better. In addition, the long-term trend component of the auto parts demand sequence is highly non-linear and non-stationary, which makes it unsuitable for fitting with general econometric models [38]. LSTM enhances long-term memory by introducing three gates (the forget gate, the input gate, and the output gate) as logic control units that maintain and update the cell state, which mitigates the vanishing and exploding gradient problems of RNNs. The internal structure of the LSTM unit is shown in Fig. 2.

Fig. 2. LSTM unit structure.
(1) The forget gate determines which information is forgotten from the cell state at time t - 1, as shown in Equation (5). The forget gate reads the hidden layer state h(t-1) at time t - 1 and the input sequence x(t) of the prediction model at time t, i.e., historical auto parts demand data, and outputs a value between zero and one, with one indicating that the information is completely retained and zero indicating that it is completely discarded. In the standard LSTM formulation:

$$f(t) = \sigma\left(W_f \cdot [h(t-1), x(t)] + b_f\right) \tag{5}$$

where f(t) is the forget gate state at time t, $W_f$ and $b_f$ are the weight and bias of the forget gate respectively, σ is the sigmoid activation function, h(t-1) denotes the output vector at time t - 1, and [•] denotes the vector concatenation operator.

(2) The input gate reads the input x(t) at time t and determines the information stored in the neuron. The temporary state of the memory cell is then computed and used to update the cell state. In the standard LSTM formulation:

$$i(t) = \sigma\left(W_i \cdot [h(t-1), x(t)] + b_i\right) \tag{6}$$

$$\tilde{C}(t) = \tanh\left(W_c \cdot [h(t-1), x(t)] + b_c\right) \tag{7}$$

$$C(t) = f(t) \otimes C(t-1) + i(t) \otimes \tilde{C}(t) \tag{8}$$

where i(t) is the input gate state at time t, controlling the amount of information passed from x(t) to C(t), $W_i$ and $b_i$ are the weight and bias of the input gate respectively, $W_c$ and $b_c$ are the weight matrix and bias term of the cell state respectively, tanh is the hyperbolic tangent activation function, and ⊗ is the Hadamard product.

(3) The output gate selects the important information to output from the current state. The sigmoid layer first decides which part of the neuron state needs to be output; the neuron state then passes through the tanh layer and is multiplied by the output of the sigmoid layer to give the output value h(t), which is also the input value of the next hidden layer. The output gate is calculated as shown in Equations (9) and (10). In the standard LSTM formulation:

$$o(t) = \sigma\left(W_o \cdot [h(t-1), x(t)] + b_o\right) \tag{9}$$

$$h(t) = o(t) \otimes \tanh\left(C(t)\right) \tag{10}$$

where o(t) is the output gate state at time t, and $W_o$ and $b_o$ are the weight and bias of the output gate respectively.
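The three gates described above can be sketched as a single time step of a standard LSTM cell in plain Python. This is a minimal sketch, assuming the usual formulation in which every gate reads the concatenation of h(t-1) and x(t); the parameter dictionary and its key names (`Wf`, `bf`, and so on) are hypothetical, and the all-zero parameters are chosen only to make the arithmetic easy to follow.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(weights, bias, vector):
    """Compute W . v + b for a weight matrix given as a list of rows."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM time step; p holds hypothetical weights and biases."""
    z = h_prev + x_t  # list concatenation [h(t-1), x(t)]
    f_t = [sigmoid(a) for a in affine(p["Wf"], p["bf"], z)]      # forget gate
    i_t = [sigmoid(a) for a in affine(p["Wi"], p["bi"], z)]      # input gate
    c_hat = [math.tanh(a) for a in affine(p["Wc"], p["bc"], z)]  # temporary state
    # New cell state: keep part of the old state, add part of the candidate.
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_hat)]
    o_t = [sigmoid(a) for a in affine(p["Wo"], p["bo"], z)]      # output gate
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]           # hidden output
    return h_t, c_t


hidden, inputs = 1, 1
zeros = {name: [[0.0] * (hidden + inputs)] * hidden
         for name in ("Wf", "Wi", "Wc", "Wo")}
zeros.update({name: [0.0] * hidden for name in ("bf", "bi", "bc", "bo")})
h1, c1 = lstm_step([3.0], [0.0], [2.0], zeros)
```

With all parameters at zero, every gate evaluates to 0.5 and the temporary state to 0, so the new cell state is half of the previous one (`c1 == [1.0]`) and the output is `0.5 * tanh(1.0)`.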
The bi-directional long short-term memory network (BiLSTM) is an improvement on the traditional one-way LSTM. BiLSTM combines a forward LSTM layer and a backward LSTM layer, both of which affect the output. The one-way LSTM can make full use of historical data and avoid the long-range dependency problem. BiLSTM additionally takes in sequence information from the opposite direction, which further improves prediction accuracy. The BiLSTM network implements a two-way recurrent structure with forward and backward propagation on top of the LSTM network, which increases the connection between data streams and better explores the temporal relationships between data features [39]. The BiLSTM structure is shown in Fig. 3.

Fig. 3. BiLSTM neural network structure.
The hidden layer update states of the forward LSTM, the backward LSTM, and the final output of the BiLSTM proceed as follows:
where $x_1, x_2, x_3, \cdots, x_t$ denote the input data at each time step from $t_1$ to $t_i$ ($i \in [1, t]$); $A_1, A_2, A_3, \cdots, A_t$ and $B_1, B_2, B_3, \cdots, B_t$ denote the hidden states of the forward and backward LSTM iterations, respectively; $Y_1, Y_2, Y_3, \cdots, Y_t$ denote the corresponding output data; $\omega_1, \omega_2, \omega_3, \cdots, \omega_t$ denote the weights of each layer; and $f_1$, $f_2$, $f_3$ denote the activation functions between the different layers.
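The bidirectional structure can be sketched in plain Python as two recurrent passes over the same sequence, one from first to last and one from last to first, with the two hidden states paired at each time step. This is a sketch under assumptions: `tanh_cell` is a hypothetical stand-in (a plain tanh recurrent cell with fixed weights, not an LSTM), and the function names are illustrative only.

```python
import math


def run_direction(inputs, step, h0):
    """Run a recurrent cell over the inputs and collect every hidden state."""
    h, states = h0, []
    for x in inputs:
        h = step(x, h)
        states.append(h)
    return states


def bidirectional(inputs, forward_step, backward_step, h0=0.0):
    """Pair the forward state A_i with the backward state B_i at each time step."""
    forward = run_direction(inputs, forward_step, h0)
    # The backward pass reads the sequence in reverse; its states are then
    # flipped back so that index i again refers to time step i.
    backward = run_direction(inputs[::-1], backward_step, h0)[::-1]
    return list(zip(forward, backward))


def tanh_cell(x, h, w_x=0.5, w_h=0.5):
    """Hypothetical stand-in recurrent cell (plain tanh RNN, not an LSTM)."""
    return math.tanh(w_x * x + w_h * h)


states = bidirectional([1.0, 2.0, 3.0], tanh_cell, tanh_cell)
```

At each time step the forward state summarizes the inputs up to that step and the backward state summarizes the inputs from that step onward; an output layer would then combine the two, for example through a weighted sum followed by an activation.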
Attention originated from simulating the attentional characteristics of the human brain, and the method was first applied in image processing [40]. In deep learning, the attention mechanism assigns weights according to the importance of different features, giving greater weights to critical content and smaller weights to other content; this differentiated weighting improves the efficiency of information processing. The structure of the attention mechanism unit is shown in Fig. 4.

Fig. 4. Attention mechanism unit structure.
The attentional state transition process is shown in Equations (14), (15), (16), and (17).
where $a_n$ is the attention weight of the BiLSTM hidden layer output $h_t$ with respect to the current input, $y_1, y_2, y_3, \cdots, y_t$ is the input sequence, and $h_1, h_2, h_3, \cdots, h_t$ are the hidden layer state values corresponding to the input sequence $y_1, y_2, y_3, \cdots, y_t$.
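The weighting idea can be illustrated with a minimal Python sketch of attention pooling over hidden states. It assumes one common form, a softmax over scalar scores followed by a weighted sum, which may differ in detail from Equations (14) to (17); the `score` argument is a hypothetical placeholder for the learned scoring function.

```python
import math


def attention_pool(hidden_states, score):
    """Softmax-weighted sum of hidden states; `score` maps a state to a scalar."""
    scores = [score(h) for h in hidden_states]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]  # attention weights, summing to 1
    size = len(hidden_states[0])
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states))
        for d in range(size)
    ]
    return weights, context


states = [[1.0, 0.0], [3.0, 2.0]]  # made-up hidden states for two time steps
uniform_w, uniform_c = attention_pool(states, score=lambda h: 0.0)
focused_w, focused_c = attention_pool(states, score=lambda h: h[0])
```

With equal scores the result is the plain average of the hidden states; when one time step scores higher, its weight grows and the pooled vector moves toward that state, which is how critical historical time points are emphasized.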
CNN can acquire local trend features of data sequences, and BiLSTM can acquire their long-term dependencies; its bidirectional recurrent structure of forward and backward propagation increases the connection between data streams, so it can better learn the interaction features between data. The attention mechanism, in turn, lets the learning process focus on the important features and improves the generalization ability of the model. On this basis, a deep learning prediction model combining CNN, BiLSTM, and the attention mechanism is designed in this paper.
In the CNN-BiLSTM-Attention model, CNN, BiLSTM, and Attention are connected in sequence. The network has three parts: an input layer, a hidden layer, and an output layer. The hidden layer performs four main types of computation: three CNN layers, a BiLSTM layer, an Attention layer, and a fully connected layer. Because auto parts demand data are periodic and sequential, a one-dimensional CNN (1DCNN) is used, and its output is passed through the ReLU activation function, which is given by the following equation:

$$\mathrm{ReLU}(x) = \max(0, x)$$
The input to the BiLSTM network layer is the local trend features extracted by 1DCNN. The BiLSTM network layer mines the patterns of all extracted features and outputs the hidden layer states to the attention layer. The input states of the attention layer, $h_1, h_2, h_3, \cdots, h_t$, come from the last fully connected layer in the BiLSTM hidden network, which can fully learn and capture the new information output by the preceding network structure. The final prediction is then output through the fully connected layer.
We propose a combined EEMD-CNN-BiLSTM-Attention prediction model; the specific implementation process is shown in Fig. 5. First, the collected auto parts demand data are preprocessed and divided into training and test sets. Second, the EEMD-CNN-BiLSTM-Attention combined prediction model is constructed. The EEMD algorithm decomposes the auto parts demand data into multiple modal components, and a convolutional neural network then extracts the internal features of the data. The convolutional layer performs non-linear local feature extraction, and the pooling layer uses max pooling to retain the key feature information. The BiLSTM hidden layer learns the local features extracted by the CNN and iteratively extracts more complex global features from them by changing the internal dynamics of the CNN. The features generated by the BiLSTM hidden layer are used as the input of the Attention mechanism, which automatically weights the temporal information extracted by the BiLSTM hidden layer according to its importance, so that the time-series properties of the auto parts demand data are exploited more effectively and deep temporal correlations are explored. The Attention mechanism can effectively reduce the loss of historical information and highlight the information at critical historical time points, weakening the influence of redundant information on the demand prediction results. The output of the Attention layer is then used as the input of the fully connected layer, which outputs the prediction result for each modal component. Finally, the predictions of the individual modal components are linearly summed to obtain the final prediction result.
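The decompose, predict and sum pattern of this process can be sketched as follows. This is only an illustration under stated assumptions: `toy_decompose` is a hypothetical stand-in that splits a series into a moving-average trend and a remainder (it is not EEMD), and `naive_forecast` is a placeholder for the per-component CNN-BiLSTM-Attention predictor.

```python
import numpy as np

def toy_decompose(series, window=5):
    """Hypothetical stand-in for EEMD: split into a remainder and a moving-average trend."""
    kernel = np.ones(window) / window
    trend = np.convolve(series, kernel, mode="same")
    remainder = series - trend
    return [remainder, trend]              # the components add back up to the series

def naive_forecast(component, horizon):
    """Placeholder predictor: repeat the last observed value of the component."""
    return np.full(horizon, component[-1])

def forecast_by_components(series, horizon):
    """Forecast each component separately, then sum the component forecasts."""
    components = toy_decompose(series)
    per_component = [naive_forecast(c, horizon) for c in components]
    return np.sum(per_component, axis=0)   # linear sum of component forecasts

demand = np.array([12.0, 15.0, 9.0, 20.0, 18.0, 14.0, 22.0, 17.0])
prediction = forecast_by_components(demand, horizon=3)
```

Because the components sum back to the original series, summing their forecasts yields a forecast on the original demand scale.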

EEMD-CNN-BiLSTM-Attention flowchart.
In summary, the combined EEMD-CNN-BiLSTM-Attention prediction model applies the ideas of decomposition and integration. Decomposition simplifies the prediction task, while integration forms a consistent prediction for the original data, so that the regular features of the extracted IMF and residual components are reflected in the prediction model and the prediction performance is further improved.
Data description
In this chapter, the actual demand data (sampling period of 90 min) for the body wiring harness component of a well-known domestic automotive company, from September 1, 2022 to September 30, 2022, is selected for short-term automotive component demand forecasting. The historical data from September 9, 2022 to September 25, 2022 is chosen as the training set, and the demand data from September 26, 2022 to September 30, 2022 is used as the forecast data set. As can be seen from Fig. 6, the short-term auto parts demand data exhibit obvious non-stationarity, strong randomness, abrupt local changes, and non-linearity.

Raw data for body wiring harness demand, September 9, 2022 to September 25, 2022.
To improve the accuracy and accelerate the convergence of the algorithm, this paper normalizes the original data, mapping the original values to the interval [0, 1] with the following conversion function:
Where $X_i$ is the actual measured data of the $i$-th sample point.
Where $y^*$ is the normalized demand forecast and $x^*$ is the actual auto parts demand forecast obtained after inverse normalization.
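The conversion functions are missing from this text. Mapping values to [0, 1] is conventionally done with min-max scaling, so the sketch below assumes that form together with its inverse; the function names are illustrative and not taken from the paper.

```python
import numpy as np

def minmax_normalize(x):
    """Scale x to [0, 1]; also return the (min, max) needed to undo the scaling."""
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(), x.max()
    if x_max == x_min:
        raise ValueError("a constant series cannot be min-max scaled")
    return (x - x_min) / (x_max - x_min), (x_min, x_max)

def minmax_denormalize(y, bounds):
    """Map normalized values back to the original demand scale."""
    x_min, x_max = bounds
    return np.asarray(y, dtype=float) * (x_max - x_min) + x_min

demand = [120, 80, 200, 160]
scaled, bounds = minmax_normalize(demand)
restored = minmax_denormalize(scaled, bounds)
```

In practice the bounds are usually computed on the training set only and then reused for the test set, so that no information from the forecast period leaks into training.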
To assess the prediction accuracy, three commonly used metrics, root mean squared error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE), are selected as model evaluation metrics in this paper.
Where $y'(i)$ and $y(i)$ are the predicted and true demand values at time $i$, respectively, and $n$ is the total number of test samples.
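The metric equations are not reproduced in this text. Assuming the standard definitions, with $y'(i)$ the prediction and $y(i)$ the true value, the three metrics can be computed as in the following sketch. MAPE is returned in percent and is undefined when a true value is zero, which matters for intermittent demand.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))

def mae(y_true, y_pred):
    """Mean absolute error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_pred - y_true)))

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    if np.any(y_true == 0):
        raise ValueError("MAPE is undefined when a true value is zero")
    return float(np.mean(np.abs((y_pred - y_true) / y_true)) * 100.0)

y_true = [100.0, 200.0, 400.0]
y_pred = [110.0, 180.0, 400.0]
scores = {"RMSE": rmse(y_true, y_pred),
          "MAE": mae(y_true, y_pred),
          "MAPE": mape(y_true, y_pred)}
```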
Analysis of model prediction performance
(1) Raw data decomposition
The EMD decomposition of the collected short-term automotive parts demand data yields seven IMF components and one residual component, as shown in Fig. 7. The EEMD decomposition of the same data yields eight IMF components and one residual component, as shown in Fig. 8.

EMD decomposition result of auto parts demand data.

EEMD decomposition result of auto parts demand data.
Obvious modal mixing can be observed in the EMD result in Fig. 7. EEMD overcomes this shortcoming, as shown in Fig. 8, by decomposing the auto parts demand data into eight modal components with different characteristics and frequencies and one trend term. The modal components are arranged from high to low frequency, and the average amplitude varies from small to large, showing apparent multi-scale characteristics. According to Fig. 8, IMF1∼IMF3 have short average periods and are high-frequency sequences, whereas the average periods of IMF4∼IMF8 gradually lengthen and these are all low-frequency sequences. The time series are then converted into a dataset suitable for supervised learning by a sliding window, so that the dataset can be fed into the model for training and subsequent prediction. The model can then be further optimized according to its performance on the test set.
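The sliding-window conversion can be sketched as follows. The function name `make_windows` and the one-step-ahead target are assumptions made for illustration; the window length of 6 mirrors the time step reported in the parameter settings.

```python
import numpy as np

def make_windows(series, window):
    """Turn a 1-D series into (X, y): each row of X holds `window` past values,
    and y holds the value that immediately follows them."""
    series = np.asarray(series, dtype=float)
    if len(series) <= window:
        raise ValueError("series must be longer than the window")
    X = np.stack([series[i:i + window] for i in range(len(series) - window)])
    y = series[window:]
    return X, y

component = np.arange(10.0)   # stand-in for one IMF component
X, y = make_windows(component, window=6)
```

Each decomposed component would be windowed in this way before being passed to its own predictor.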
(2) Prediction model parameter setting
Hyperparameters are prerequisites for determining the structure of a neural network model.
First, the main structural parameters of the convolutional neural network are determined. Based on existing research experience, the number of CNN layers was selected through repeated experiments, with prediction accuracy judged by MAPE and RMSE. When the number of CNN layers exceeds two, the model overfits, so this paper sets the number of CNN layers to 2. The convolutional neural network model performs best when the number of convolution kernels (filters) is 64, the convolution kernel size is 3, the convolution padding is 1, the activation function is ‘Relu’, the pooling window size (pool size) is 3, and the number of neurons (units) is 64.
Second, the number of hidden layer units in the BiLSTM neural network model is determined. Because the number of hidden layer units has an important impact on prediction accuracy, this chapter searches for the best value of this parameter. Determining the number of hidden nodes and hidden layers remains difficult, and no mature theoretical method exists, so the optimal setting has to be found experimentally by decreasing or increasing the number of layers and the number of neurons per layer. Based on existing research experience, 4, 8, 16, 32, and 64 hidden layer units are tested in this paper. The other parameters are set as follows: the number of training iterations is 1000, the initial learning rate is 0.002, and the test set accounts for 30%. After several experiments, the prediction model is found to perform best with two hidden layers and 16 hidden layer units. In addition, the attention-based model requires an appropriate time step, and a time step of six is chosen, so four hidden layer units are selected as the hidden layer of BiLSTM in this paper. Furthermore, the dropout rate for BiLSTM is 0.2, the dropout rate for Flatten is 0.5, the optimization function is ‘Adam’, the batch size is 32 and the number of epochs is 50.
To further verify the effectiveness of the model proposed in this paper, several other automotive parts demand forecasting models are selected for comparison: (1) single forecasting models: the BP neural network, CNN, ARIMA and BiLSTM models; (2) two-method combined forecasting models: I. CNN-LSTM and CNN-BiLSTM; II. EMD-BiLSTM and EEMD-BiLSTM; (3) a three-method combined forecasting model: EMD-CNN-BiLSTM-attention. To visualize the gap between the predictions of the different algorithms and the actual values, the prediction results are grouped as above and shown in Fig. 9. Each subplot also contains the actual values and the results of the algorithm proposed in this paper for comparison.

Comparison of short-term auto parts demand forecasts.
To assess the prediction accuracy more precisely, this paper compares the RMSE, MAE and MAPE of all the above models. According to the results reported in Table 1, EEMD-CNN-BiLSTM-attention performs significantly better than the other nine models on all three metrics, with an RMSE of 26.8631, an MAE of 19.9575 and a MAPE of 17.5164%, indicating that the short-term automotive parts demand forecasting model proposed in this paper has high accuracy and good forecasting performance. In addition, the two pairs of models that use data decomposition, EMD-BiLSTM and EEMD-BiLSTM, and EMD-CNN-BiLSTM-attention and EEMD-CNN-BiLSTM-attention, outperform the forecasting models without data decomposition, which indicates the importance of data decomposition for short-term auto parts demand that is highly non-stationary and stochastic. Besides, the prediction performance of EEMD-BiLSTM exceeds that of EMD-BiLSTM, and the prediction performance of EEMD-CNN-BiLSTM-attention exceeds that of EMD-CNN-BiLSTM-attention, indicating that EEMD can effectively resolve the modal mixing that occurs in the EMD decomposition process.
Comparison of forecasting errors of each model
To further verify the effectiveness of the proposed prediction model and examine its generalization capability, the model is applied to a second data set. This paper collects 1081 demand records for a lamp-class part from an auto parts enterprise, sampled at intervals of 90 min. The newly collected demand data are analyzed with all the prediction models described above. In this experiment, in the decomposition of the original demand data, the standard deviation of the white noise is set to 0.05 and the ensemble number is set to 100. The number of CNN layers is 2. The number of convolution kernels (filters) is 128, the convolution kernel size is 4, the convolution padding is 1, the activation function is ‘Relu’, the pooling window size (pool size) is 6, and the number of neurons (units) is 64. The number of training iterations is 1000, the initial learning rate is 0.001, and the test set accounts for 30%. The dropout rate for BiLSTM is 0.3, the dropout rate for Flatten is 0.5, the optimization function is ‘Adam’, the batch size is 32 and the number of epochs is 20.
According to the results reported in Table 2, for the short-term demand data of the lamp-class parts, the EEMD-CNN-BiLSTM-attention model built in this paper predicts significantly better than the other prediction models. This further demonstrates the effectiveness and superiority of the proposed model for short-term auto parts demand prediction.
Comparison of forecasting errors of each model—data of car lamp parts
Conclusions
Accurate automotive parts demand forecasting is the basis for enterprises to make capacity planning and inventory optimization decisions, and it is also a critical link for automotive companies in achieving full-lifecycle smart manufacturing and improving after-market service effectiveness. However, in actual business, parts planning is often linked to newly launched projects or to sporadic demand arising from a lack of parts at maintenance sites. As a result, the demand time series typically has an intermittent and blocky distribution and lacks clear cyclical characteristics, which makes it more difficult to extract sufficient fluctuation patterns from such series and degrades forecasting performance. How to extract the inherent evolutionary laws from intermittent time series and forecast their trends accurately is an urgent need in the current parts management of manufacturing enterprises, and it also has evident theoretical research value.
In the face of the severe damage caused to the automotive supply chain by demand uncertainty and supply shortage, it is urgent to build a safe and orderly automotive supply chain. Short-term auto parts demand data is characterized by intermittency, non-linearity, non-stationarity and strong randomness. This paper proposes a combined EEMD-CNN-BiLSTM-attention prediction model for data with these features. Through experimental validation and analysis, the following conclusions are drawn.
From the two perspectives of data feature decomposition and feature extraction, the decomposition method is organically combined with deep learning. The EEMD algorithm decomposes the original data into a finite number of components, the CNN effectively extracts spatial features, the BiLSTM neural network extracts bi-directional temporal features of the sequence data, and the attention mechanism selectively focuses on the hidden layer states, so that the time-series properties of the demand data are fully explored and deep temporal correlations are obtained. The attention mechanism can also effectively reduce the loss of historical information and highlight information at key historical time points, reducing the impact of redundant information on the demand prediction results.
Management insights
(1) Improving the accuracy of parts demand forecasting can ensure smooth order assembly work. The result of parts demand forecast reflects the manufacturer’s ability to grasp the market product orders. The more accurate the parts demand forecast, the more comprehensive the manufacturer’s grasp of market information. Only by accurately predicting the demand for parts can we make timely responses to customer orders and improve customer satisfaction.
(2) Reduce parts inventory backlog and improve parts inventory management efficiency. Parts inventory is the core issue for manufacturers in the ATO model; a reasonable level of parts inventory can ensure assembly while reducing the capital tied up, thus ensuring the economic benefits of the enterprise.
(3) Under the ATO model, parts are diverse and common, and manufacturers control inventory according to the characteristics of each type of part combined with demand forecasting results, reducing idle resources. The so-called idle resources refer to the backlog of parts that are not commonly used, and too much inventory of such parts will form a kind of idleness and waste. A reasonable amount of inventory for each type of component will create greater economic value for the company while increasing inventory turnover.
Research gaps and future prospects
The research in this paper does not go far enough, and many aspects of short-term, intermittent demand forecasting need to be studied in depth.
(1) Due to limited data, the input characteristics considered in this paper for demand forecasting include only the size of demand, the time interval between demand occurrences, the cumulative number of periods in which demand is continuously zero, and the number of working days in the forecast week. The actual influencing factors go beyond these characteristics, so the accuracy of intermittent demand forecasting should improve if more effective influencing factors can be added in future studies. Furthermore, we will optimise the hyperparameters of the proposed model using an optimisation algorithm to make the prediction results more accurate.
(2) This paper only forecasts demand based on the demand time series of intermittent products, but in practical applications, demand forecasting and inventory control are closely related. If demand forecasting and inventory control can be studied as a system, there will be a greater degree of improvement for enterprises to reduce inventory and procurement costs and improve the overall efficiency of the supply chain.
Footnotes
Author contributions
Conceptualization, Kai Huang and Jian Wang; methodology, Kai Huang and Jian Wang; validation, Kai Huang and Jian Wang; formal analysis, Kai Huang and Jian Wang; data curation, Kai Huang and Jian Wang; writing—original draft preparation, Kai Huang and Jian Wang; writing—review and editing, Kai Huang. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by the National Social Science Fund of China (18BGL018).
Institutional review board statement
Not applicable.
Informed consent statement
Not applicable.
Data availability statement
The forecasting data used to support the results of this study has not been provided because it is private data of enterprises.
Conflicts of interest
The authors declare no conflict of interest.
