Abstract
Anomalies in parallel power loads are processed by a fast density peak clustering technique that combines the strengths of the Canopy and K-means algorithms within Apache Mahout’s distributed machine-learning environment. The study uses Apache Hadoop’s tools for data storage and processing, including HDFS and MapReduce, to manage and analyze big data. The preprocessing phase uses Canopy clustering to speed up the initial partitioning of data points, which K-means then refines to improve clustering performance. Experimental results confirm that using Canopy as an initial step markedly reduces the computational effort needed to process the large volume of parallel power load abnormalities. The hybrid algorithm, enabled by distributed machine learning through Apache Mahout, was implemented to minimise the time needed to handle the massive scale of the detected parallel power load abnormalities. Data vectors are generated based on the time needed, sequential and parallel candidate feature data are obtained, and the data rate is combined. After classifying the time set using Canopy with the K-means algorithm and the vector representation weighted by factors, the clustering impact is assessed using purity, precision, recall, and
Keywords
Introduction
Electric load data contain many outliers because of the inherent uncertainty in the numerous signals that make up the information of the electric power automation system [1]. The reliability of historical data is crucial to the success of power load characteristic analysis and load forecasting [2, 3, 4]. As a result, outliers in historical data must be handled properly. In [5], a fast decomposition orthogonal transformation state estimation algorithm is used to find and eliminate inaccurate readings. In [6], wavelet singularity detection is used to rectify and smooth the load readings. Furthermore, [7] applies the Kalman filter technique. Reference [8] proposes using a neural network for identification and adjustment to find and remove outliers from a large set of historical data, substituting predicted values where appropriate. A power load separation technique based on the data’s entropy is proposed in [9, 10].
The small dataset is the US Department of Energy’s residential electricity load data, published on a publicly available website with energy-related information [11]. It consists of power load statistics for 936 residential consumers, recorded every 60 minutes (24 measurements per day) throughout the year. The large dataset consists of electricity data from a research Perspective Energy Experiment using Intelligent Meters [12]. It includes daily load curve information for more than 6,000 consumers from 2009 to 2010. The large dataset’s power consumption is more intricate and varied than that of the small dataset, and it is the most important application of data clustering [13, 14, 15, 16]. The k-means algorithm is used to simplify communication in the analysis of power load data for both tiers of a multi-layered clustering model [17, 18, 19]; local clustering of temporal data using adaptive k-means is subsequently imported into a larger model.
Reference [20] developed a method for real-time monitoring of subsynchronous control interactions in power systems using Improved Intrinsic Time Scale Decomposition. This method provides critical insights into complex system interferences, which are crucial for maintaining system stability and performance. Complementing this, reference [21] made significant strides in autonomous vehicle technology by improving anomaly detection using a denoising variational transformer, which is key for the interpretability and reliability of self-driving cars. Furthermore, an extensive review of potential sensor data anomalies in autonomous vehicles underlines the need for robust sensors to ensure vehicle safety and optimize efficiency [22, 23].
When weighted time-domain and frequency-domain data are combined, affinity propagation clustering is employed to produce the clustering results [24, 25, 26]. The literature [27] also provides a related divide-and-conquer strategy that combines the adaptive k-means and density peaks clustering techniques. The literature [28] suggested a hybrid two-step strategy that creates several sub-clusters before combining them using K-Medoids. However, data reduction and multi-level clustering alleviate the big data curse at the expense of the intricate data structure of raw time series data. In particular, substituting derived features for the original data reduces the interpretability of the patterns found and nearly inevitably results in an inaccurate grouping. Irrational data segmentation may lead to the same unwanted outcomes for multi-level methods, particularly at the first level of clustering, because each starting data block must always determine a large number of cluster centers, some of which may differ significantly from the actual centers.
This loss of global information affects the analysis of the final clustering results. Therefore, it could be more sensible to increase the parallelism of clustering algorithms so that they can accept the raw dataset directly as input. References [29, 30] adopt an improved algorithm for cluster data mining to handle abnormalities and inaccurate readings in the load characteristic curve. The authors proposed a negative selection technique that generates a detector set to identify abnormalities rather than employing the dataset directly. Results show that the algorithm outperforms the conventional approach in prediction accuracy, maintenance requirements, and convergence rate. Compared with prediction methods based on neural networks, it is also more flexible and gives better results. Therefore, it is necessary to extract data that can represent the main content of the dataset to obtain results in the required time. This paper retains power system monitoring data in three categories (consumer power consumption, weather data, and power generation) and uses them as candidate feature data after deduplication [31, 32]. The calculation of this vector needs to be improved. In order to verify the effectiveness of the canopy
The contributions of this paper are as follows
Hybrid system development: the paper presents a system that combines the Canopy and K-means algorithms to detect anomalous patterns in energy consumption. Detector implementation: the paper provides a detector that precisely measures the degree to which power usage is anomalous. Instead of using high-dimensional daily load curves, the approach describes how customers’ power usage varies from day to day using daily load characteristics. The hybrid approach appears to be superior to previous detection algorithms in identifying instances of anomalous power consumption.
Deploying the hybrid can greatly decrease the time and materials needed to carry out assessments, resulting in cost savings. The system improves accuracy and makes better use of available resources. The hybrid is therefore more accurate and applicable to a wider range of power consumption conditions.
The paper is organized as follows
The organization of the remaining sections in this paper is visualized through the following flowchart [35]:
Materials and methods
Implementation process of K-means and Canopy algorithms under the Hadoop platform
Hadoop’s scalable and reliable big data storage and processing infrastructure makes it ideal for managing massive datasets in a distributed setting [34]. For handling large amounts of data and applying machine learning, Hadoop and Apache Mahout are closely linked: Hadoop provides the underlying distributed storage and processing infrastructure, and Mahout builds on it by offering scalable machine learning algorithms [36]. Because Mahout is compatible with Hadoop’s distributed computing features, machine learning methods can be applied to massive datasets, and Mahout takes advantage of the parallelism and fault tolerance provided by Hadoop to execute them efficiently. While Apache Hadoop is primarily used for its distributed processing capabilities, Apache Mahout is an open-source library that provides a collection of machine learning algorithms and tools such as clustering, classification, and collaborative filtering. With these additional capabilities and its user friendliness, Apache Mahout on Hadoop is a more compelling option for machine learning and data processing jobs than Hadoop alone. The capabilities that Apache Mahout adds to Hadoop are outlined in Table 1 [37, 38].
The capabilities of Apache Mahout that are lacking in Hadoop
Dataset
The term “multi-level clustering techniques” refers to an approach for analyzing multi-level data structures. In the context of power consumption analysis, these methods are applied to a sizable dataset from the Malaysian Electricity Department, specifically from the utility company Tenaga Nasional Berhad (TNB). The residential electricity load statistics, including energy-related information, are assumed to come from TNB’s Department of Electricity. With information on the power load of 10,000 residential consumers, this dataset is significant. The data is gathered at regular 60-minute intervals every day for a whole year [12], so precise power load statistics are available for these residential consumers for each day within the chosen timeframe. The dataset was gathered as part of a study that used smart meters, instruments that track and measure patterns in electricity consumption, to analyze energy consumption. In this instance, data from more than 10,000 users was gathered from 2015 through 2022 [13, 14, 15, 16]. For researchers looking to examine home electricity usage trends over a prolonged period, this large dataset offers a wealth of data. By using multi-level clustering algorithms, researchers can study several levels of data structure hierarchy and gain insight into the patterns and trends within the dataset. The dataset is particular to a research experiment and might not be available to the general public. Anyone interested in accessing and analyzing it must normally go through the proper channels, such as working with the Malaysia Electricity Department or the Department of Electricity at TNB.
The K-means and Canopy algorithms can be implemented in the Hadoop environment using Apache Mahout. Canopy is run first to create the initial k clusters, and the K-means algorithm is then initialized with these k clusters to produce the desired clustering outcome. The main steps depicted in the schematic of Fig. 1 are as follows:
(1) Data preparation: the power load dataset is first converted to the Mahout vector format.
(2) Canopy clustering: Canopy is used to find the initial clusters, and the results are saved to a file or folder.
(3) K-means and hybrid clustering: the K-means and Hybrid CKMA algorithms take their input from the results of the Canopy clustering stage. The hybrid algorithm’s output is kept separate from the K-means output.
(4) Sensitivity peak clustering: the K-means and hybrid clustering results are used as input for the fast sensitivity peak clustering method, and the final results are written to a new directory.
(5) Parallel processing: clustering is accelerated by dividing the task among several computers, or nodes.
(6) Analysis and visualization: the output of the clustering procedures is examined and displayed visually so that outliers can be picked out.
The following schematic diagram is a simplified representation of the Apache Mahout architecture for identifying power load anomalies utilizing the Canopy, K-Means, and Hybrid algorithms.
The process framework using Apache Mahout for detecting power load abnormalities based on Canopy with K-means and Hybrid algorithms.
The Canopy algorithm is a fast approximate clustering technique. Its advantage is speed: the clusters can be obtained by traversing the data only once. The resulting clusters are coarse, however, so a further method is needed to obtain accurate cluster results [39]. The basic process of the Canopy algorithm is as follows [40]:
(1) Determine the two distance thresholds of the canopy, namely T1 and T2, where T1 > T2.
(2) Take any data object from the dataset and calculate the distance between it and all Canopy centres.
(3) If no canopy currently exists, take the data object as a Canopy centre and delete it from the dataset. Otherwise, go to (4).
(4) If the distance from the data object to a Canopy centre is within T2, add it to that canopy and delete it from the dataset. Because the data object is close to this canopy, it can no longer serve as the centre of another canopy.
(5) If the distance between the data object and a Canopy centre is within T1 but outside T2, the data object is also added to the canopy. However, it is not deleted from the dataset at this time and will participate in the next round of the clustering process.
(6) If the distance from the data object to all Canopy centres is beyond T1, it is regarded as a new Canopy centre and is deleted from the dataset.
(7) Repeat (2) to (6) until all data objects in the dataset are assigned to a canopy.
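The Canopy procedure can be sketched in a few lines of Python. This is a minimal illustration that assumes the common centre-driven formulation (pick a centre, collect everything within T1, drop everything within T2); it is not necessarily the exact variant used by the authors or by Apache Mahout. The function names, the Euclidean distance, and the sample points are assumptions made for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def canopy(points, t1, t2, distance=euclidean):
    """Group points into canopies; t1 is the loose threshold, t2 the tight one."""
    if t1 <= t2:
        raise ValueError("t1 must be greater than t2")
    remaining = list(points)
    canopies = []
    while remaining:
        centre = remaining.pop(0)      # any remaining object starts a new canopy
        members = [centre]
        kept = []
        for p in remaining:
            d = distance(centre, p)
            if d <= t1:
                members.append(p)      # within T1: the object joins this canopy
            if d > t2:
                kept.append(p)         # outside T2: still a candidate for later canopies
        remaining = kept
        canopies.append((centre, members))
    return canopies


# Hypothetical 2-D feature vectors, for illustration only.
sample = [(1.0, 1.0), (1.2, 0.9), (5.0, 5.0), (5.1, 4.8), (9.0, 1.0)]
for centre, members in canopy(sample, t1=3.0, t2=1.0):
    print(centre, len(members))
```

Because objects between T2 and T1 stay in the candidate list, canopies may overlap, which is why the result is only a rough partition that a later stage has to refine.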
K-means algorithm
The core idea of the K-means algorithm is to iteratively divide all data objects into k clusters so that objects in the same cluster have high similarity, while objects in different clusters have low similarity. The basic process of the K-means algorithm is as follows [41]:
(1) Input the dataset and the number of clusters k.
(2) Select k initial cluster centres.
(3) Calculate the distance from each data object to the centre of each cluster, and assign the object to the cluster with the closest centre.
(4) Recalculate the average of all data objects in each cluster as the new cluster centre.
(5) Repeat (3) and (4) until the cluster centres no longer change or the maximum number of iterations is reached.
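The K-means iteration can be illustrated with a short, dependency-free Python sketch. It assumes Euclidean distance and takes the initial centres as an argument; the function names and the sample data are hypothetical and chosen only to show the assign-and-update loop.

```python
import math


def assign(points, centres):
    """Label each point with the index of its nearest centre."""
    return [min(range(len(centres)), key=lambda i: math.dist(p, centres[i]))
            for p in points]


def kmeans(points, initial_centres, max_iter=100):
    """Alternate assignment and centre update until the centres stop moving."""
    centres = [tuple(map(float, c)) for c in initial_centres]
    labels = assign(points, centres)
    for _ in range(max_iter):
        new_centres = []
        for i, old in enumerate(centres):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                new_centres.append(tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                new_centres.append(old)    # an empty cluster keeps its previous centre
        if new_centres == centres:         # converged: no centre moved
            break
        centres = new_centres
        labels = assign(points, centres)
    return centres, labels


# Hypothetical data: two well-separated groups.
data = [(1.0, 1.0), (1.0, 2.0), (8.0, 8.0), (9.0, 8.0)]
print(kmeans(data, [(0.0, 0.0), (10.0, 10.0)]))
```

The quality and speed of convergence depend on the initial centres, which is the motivation for seeding K-means with Canopy output.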
Proposed Hybrid (CKMA) algorithm methods
The proposed methodology begins with preprocessing the dataset. After power load abnormalities are identified in a parallel dataset, the data is written to the Hadoop Distributed File System (HDFS), converted to sequence files, and finally read from HDFS for processing and analysis. As a preparatory step, the Canopy algorithm is applied to the extracted data to select suitable starting points. Then, after the data is represented as feature vectors, the k-means clustering algorithm uses the Canopy output as a guide to define the most informative clusters within the load dataset. The dataset’s feature weight is based on the vector execution model and is used to measure the difference between the parallel and sequential modes. The clustering process of Canopy with the K-means algorithm is shown in Fig. 2.
Proposed clustering process using Canopy with K-means algorithm.
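The two-stage idea of the hybrid can be sketched as follows. This is an illustrative single-machine version under stated assumptions: each K-means seed is taken to be the mean of the objects within T1 of a canopy centre, the number of clusters k equals the number of canopies, and distances are Euclidean. The names `canopy_seeds` and `ckma` are hypothetical and do not correspond to Mahout classes.

```python
import math


def nearest(point, centres):
    """Index of the centre closest to the given point."""
    return min(range(len(centres)), key=lambda i: math.dist(point, centres[i]))


def mean(points):
    """Component-wise mean of a non-empty list of points."""
    return tuple(sum(v) / len(points) for v in zip(*points))


def canopy_seeds(points, t1, t2):
    """One Canopy pass; each seed is the mean of the objects within t1 of a centre."""
    remaining, seeds = list(points), []
    while remaining:
        centre = remaining.pop(0)
        members = [centre] + [p for p in remaining if math.dist(centre, p) <= t1]
        seeds.append(mean(members))
        remaining = [p for p in remaining if math.dist(centre, p) > t2]
    return seeds


def ckma(points, t1, t2, max_iter=50):
    """Canopy seeding followed by K-means refinement; k is the number of canopies."""
    centres = canopy_seeds(points, t1, t2)
    for _ in range(max_iter):
        labels = [nearest(p, centres) for p in points]
        updated = []
        for i, old in enumerate(centres):
            members = [p for p, lab in zip(points, labels) if lab == i]
            updated.append(mean(members) if members else old)
        if updated == centres:
            break
        centres = updated
    return centres, [nearest(p, centres) for p in points]


# Hypothetical load feature vectors forming two groups.
pts = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.2), (5.0, 5.0), (5.2, 4.8), (4.8, 5.2)]
print(ckma(pts, t1=3.0, t2=1.5))
```

Because the seeds already sit near dense regions, the refinement loop typically needs few iterations, which is the effect the hybrid relies on to reduce running time.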
Each customer’s load curve matrix is as Eq. (1).
Where
Five indicators can be used to describe the typical electrical usage behavior for every client
Characteristics of daily load
The object p’s k-means neighborhood is defined as depicted in Eq. (2) using the definitions of k-means neighbourhood in Eq. (3).
Where,
The analysis of the eigenvalue of the matrix of correlation
Where the eigenvectors of
Where,
Where, The behaviour of element
The local distribution matrix is reconstructed as Eq. (7).
Where
Each object’s local outlier score is calculated as Eq. (10). Where
Where,
In general, the proposed execution model uses the hybrid Canopy-K-Means Algorithm (CKMA) within the Hadoop platform to process power consumption data efficiently. First, the power load data is formatted for compatibility with Mahout. A rapid initial clustering with the Canopy technique then establishes preliminary cluster centers. These centers seed the subsequent K-Means clustering, which refines the clusters for greater precision. Fast sensitivity peak clustering then applies the results of the previous steps to better isolate anomalous patterns. This multi-stage clustering process is performed in parallel, using the distributed computing power of Hadoop to manage and analyze the large-scale data effectively. The final clustering results are then visualized, providing a clear depiction of normal versus anomalous consumption patterns. Throughout this process, performance is rigorously evaluated using metrics like purity, precision, recall, and
Big data clustering process based on canopy 
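The way one K-means iteration can be split across nodes may be easier to see in a small simulation. The sketch below imitates the MapReduce pattern on a single machine: each "mapper" handles one data split and emits per-centre partial sums and counts, and a "reducer" merges them into new centres. It is only an assumption-laden illustration of the pattern, not Mahout's actual job code; the function names and data are hypothetical.

```python
import math


def map_partition(partition, centres):
    """Mapper: per-centre (coordinate sums, count) for one data split."""
    partial = {}
    for p in partition:
        i = min(range(len(centres)), key=lambda j: math.dist(p, centres[j]))
        total, count = partial.get(i, ((0.0,) * len(p), 0))
        partial[i] = (tuple(t + x for t, x in zip(total, p)), count + 1)
    return partial


def reduce_partials(partials, centres):
    """Reducer: merge the partial sums and divide by the counts to get new centres."""
    merged = {}
    for partial in partials:
        for i, (total, count) in partial.items():
            old_total, old_count = merged.get(i, ((0.0,) * len(total), 0))
            merged[i] = (tuple(a + b for a, b in zip(old_total, total)), old_count + count)
    return [
        tuple(v / merged[i][1] for v in merged[i][0]) if i in merged else c
        for i, c in enumerate(centres)
    ]


def parallel_kmeans_step(partitions, centres):
    """One K-means iteration; each partition could be processed on a different node."""
    return reduce_partials([map_partition(part, centres) for part in partitions], centres)


# Two hypothetical data splits, as if stored in two HDFS blocks.
splits = [[(1.0, 1.0), (9.0, 9.0)], [(3.0, 1.0), (7.0, 9.0)]]
print(parallel_kmeans_step(splits, [(0.0, 0.0), (10.0, 10.0)]))
```

Only the small per-centre summaries travel between the map and reduce stages, so the cost of an iteration is dominated by the local distance computations, which scale with the number of nodes.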
The clustering algorithm can be evaluated using internal, external, and relative validity evaluation criteria [42]. The clustering outcomes are evaluated in this article primarily using four external assessment criteria [43, 44].
Purity: It is an easy-to-understand evaluation indicator. Each cluster is assigned to the category that occurs most frequently in it, and purity is the number of correctly assigned objects divided by the total number of objects N, as given by Eq. (11).
Where
Precision: It measures the proportion of objects of a particular category in each cluster and is calculated by Eq. (12).
Where TP (true positive) is the decision to correctly assign two similar data objects to the same cluster, and FP (false positive) is the decision to wrongly assign two dissimilar data objects to the same cluster.
Recall: It measures the degree to which each cluster contains all objects of a particular category and is calculated by Eq. (13).
Where FN (false negative) is the decision to assign two similar data objects to different clusters.
Among them,
In the practical clustering project described, real power consumption data, inclusive of metrics such as peak and off-peak usage, is standardized and uploaded to the Hadoop Distributed File System (HDFS). Utilizing Apache Mahout, the Canopy algorithm swiftly determines initial cluster centers, exploiting Hadoop’s distributed processing for data scalability. These centers seed the K-Means algorithm within Mahout, which iteratively refines clusters. Evaluating these clusters with purity and
However, this example serves to illustrate how to apply these evaluation metrics in practice
The plot shows 8 observations that have been clustered into 3 groups. Each observation is represented by an ‘x’ and is colored according to the cluster it has been assigned to, with the color bar on the right indicating the cluster numbers.
This clustering could have been the result of a hybrid clustering approach using K-Means initialized by Canopy cluster centers. In an actual application, the observations would represent data points with features extracted from the power load data, and the clusters could represent different typical load profiles or potential anomalies.
The plotted data is generated randomly for this example, and the clustering is performed using the K-Means algorithm. In practice, the Canopy algorithm would first be used to quickly generate rough clusters that serve as initial centroids for the K-Means algorithm, which then refines the clustering. This two-step process is particularly useful for large datasets as it can significantly reduce the time K-Means would otherwise take to converge on large datasets.
The visualization aids in interpreting the clustering results, where the proximity of points and their colors represent the grouping determined by the algorithm. In the context of power load analysis, such clustering might help in identifying patterns of usage that correspond to normal behavior or various types of anomalies or inefficiencies.
Given the previous plot details, we will create a hypothetical example with 8 observations and 3 clusters
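One way such a hypothetical example could be evaluated is sketched below. The true categories and cluster assignments are invented for illustration and are not the authors' data. Purity is computed by majority vote per cluster, and precision and recall use the pair-counting definitions of TP, FP, and FN, written here as TP / (TP + FP) and TP / (TP + FN), which is the standard form assumed for this sketch.

```python
from collections import Counter
from itertools import combinations


def purity(true_labels, cluster_labels):
    """Fraction of objects that belong to the majority category of their cluster."""
    clusters = {}
    for t, c in zip(true_labels, cluster_labels):
        clusters.setdefault(c, []).append(t)
    correct = sum(Counter(members).most_common(1)[0][1] for members in clusters.values())
    return correct / len(true_labels)


def pair_counts(true_labels, cluster_labels):
    """Count TP, FP and FN over all pairs of objects."""
    tp = fp = fn = 0
    for i, j in combinations(range(len(true_labels)), 2):
        same_class = true_labels[i] == true_labels[j]
        same_cluster = cluster_labels[i] == cluster_labels[j]
        if same_cluster and same_class:
            tp += 1                    # similar pair placed in the same cluster
        elif same_cluster:
            fp += 1                    # dissimilar pair placed in the same cluster
        elif same_class:
            fn += 1                    # similar pair split across clusters
    return tp, fp, fn


def precision_recall(true_labels, cluster_labels):
    """Pairwise precision and recall of a clustering."""
    tp, fp, fn = pair_counts(true_labels, cluster_labels)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical ground truth and clustering for 8 observations and 3 clusters.
truth = ["a", "a", "a", "b", "b", "b", "c", "c"]
found = [0, 0, 1, 1, 1, 1, 2, 2]
print(purity(truth, found), precision_recall(truth, found))
```

In this invented case one observation of category "a" lands in the cluster dominated by "b", which lowers all three scores below 1.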
In the provided clustering plot, outliers or abnormal data points are typically those that lie a significant distance away from the clusters of other points. Outliers can be identified as points that do not group well with any cluster or are far away from their cluster centers.
Based on the figure, there do not appear to be any significant outliers. All points seem to be relatively close to others within the same color group, indicating they are close to their respective cluster centroids. However, the point in the cyan color at the bottom left (with the lowest values on both Feature1 and Feature2) could potentially be considered an outlier within its cluster due to its distance from the other points in the same cluster.
It’s important to note that outlier detection depends on the context and specific criteria used to define what is considered “normal” within the dataset. In a power load dataset, for instance, what constitutes an outlier would depend on typical load profiles, the expected variability in power usage, and other domain-specific factors. In statistical terms, a point may be considered an outlier if it lies more than 1.5 times the interquartile range (IQR) below the first quartile or above the third quartile of the dataset.
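The interquartile-range rule mentioned above can be written as a short function. This sketch uses the Python standard library's `statistics.quantiles` with the inclusive method, which is one of several quartile conventions, and the hourly load values are hypothetical.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR below Q1 or above Q3."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical hourly load readings (kWh) with one suspicious spike.
readings = [10, 12, 11, 13, 12, 11, 10, 40]
print(iqr_outliers(readings))
```

The multiplier `k` controls how strict the rule is; a domain-specific threshold for power load data would have to be chosen from typical load profiles rather than taken from this default.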
Experimental set-up and results
Experiments setup
The experiments provide an overview of a setup for parallel data mining. They also demonstrate an intelligent cloud computing framework for smart monitoring of power systems that uses a hybrid of Canopy clustering and K-Means clustering with Apache Mahout.
For dataset preparation, a power load dataset was obtained and preprocessed into a compatible format by normalization. For the Hadoop cluster configuration, a Hadoop cluster was set up to handle the size of the dataset and the computational requirements. For the Apache Mahout installation, Apache Mahout was installed and configured on the Hadoop cluster, and the fast density peak clustering algorithms were implemented using Mahout’s MapReduce framework. The performance of the algorithms in detecting power load abnormalities was evaluated as a function of parameters such as the number of clusters and the distance thresholds. The input data was then formatted as a Hadoop Sequence File, and the MapReduce job for the algorithms was submitted to the Hadoop cluster. For the performance evaluation, the clustered results generated by the algorithms were retrieved from the output paths specified in the MapReduce job and used to evaluate the algorithms in terms of clustering quality and detection accuracy metrics. The results were then compared and analyzed to gain insight into power load abnormalities.
Experimental and discussion
K-means is a well-known partitioning-based clustering technique, valued for its simplicity and effectiveness in solving clustering problems on unlabeled data (i.e., data without defined categories or groups), as shown in Fig. 4 [45, 46].
Evaluation of the clustering effect of the Canopy, K-means, and Canopy 
The clustering time on big data is compared for the Canopy and K-means algorithms under two execution models, sequential and parallel, as follows:
Canopy alone takes approximately 0.36803 seconds in sequential mode and 0.36118 seconds in parallel mode on multiple nodes. K-means without Canopy takes about 0.0099415 seconds in sequential mode and 0.0083332 seconds in parallel mode on multiple nodes. Canopy with K-means takes about 0.0081674 seconds in sequential mode and 0.0073397 seconds in parallel mode. The time and node requirements for implementing parallel k-means via Hadoop were presented along with the results. The time needed to complete the k-means technique can be reduced by using Canopy clustering as a preliminary step. These results show that the k-means algorithm in parallel mode works well with the Canopy method, and that the outcomes depend on the size of the Hadoop cluster.
Time comparisons between running the Sequential & Parallel of the Canopy, K-Means and Canopy with k-means algorithms.
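The reported timings can be turned into speedup figures with simple arithmetic. The sketch below uses only the six values quoted in the text; the definition of speedup as sequential time divided by parallel time is the usual convention and is assumed here, and the helper names are illustrative.

```python
# Running times in seconds, as reported in the text: (sequential, parallel).
timings = {
    "Canopy": (0.36803, 0.36118),
    "K-means": (0.0099415, 0.0083332),
    "Canopy with K-means": (0.0081674, 0.0073397),
}


def speedup(sequential, parallel):
    """Ratio of sequential to parallel running time (values above 1 mean faster)."""
    return sequential / parallel


def time_saved_percent(sequential, parallel):
    """Relative reduction in running time, in percent."""
    return (1 - parallel / sequential) * 100


for name, (seq, par) in timings.items():
    print(f"{name}: speedup {speedup(seq, par):.3f}, "
          f"time saved {time_saved_percent(seq, par):.1f}%")
```

The same helpers can compare algorithms within one mode, for example parallel K-means against parallel Canopy with K-means, to quantify the benefit of Canopy seeding.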
The evaluation metrics used to assess the performance of the clustering algorithms are purity, precision, recall, and
The assessment criteria employed in this research have shown that the hybrid approach and the Canopy and K-Means algorithms are effective for clustering the dataset on the power system. The hybrid Canopy
Hybrid approach, K-means, and Canopy algorithms evaluation metrics for each K range from 30 to 120
The hybrid Canopy with K-Means algorithm outperforms the standalone Canopy and K-Means algorithms with regard to precision and recall. Furthermore, the Precision-Recall (PR) curves of the three algorithms follow a consistent upward and downward trend. Raising the detection threshold increases the number of outliers the algorithms can pick up without compromising precision. To catch more out-of-the-ordinary consumption, one could lower the judgement threshold, although doing so would reduce detection accuracy. We compare the three approaches’ performance at the sweet spot where Precision and Recall are equal by analysing the crossing points of their respective PR curves on the large dataset. Finding trends in energy consumption becomes more challenging as the dataset grows larger and more users with varying energy habits are included. The PR curves of the three algorithms are compared for the large dataset in Fig. 6a–c. Area Under the Curve (AUC) data show that the hybrid Canopy
Comparison PR between Canopy, Hybrid and K-means using the large datasets.
Figure 7a–c compares the PR curves of the three detection techniques for small datasets and the results of a PR experiment with these algorithms. Higher Area Under the Curve (AUC) values reveal that the hybrid Canopy and K-Means-based detection method outperforms the individual Canopy and K-Means techniques. The hybrid Canopy and K-Means algorithm has an AUC of 1.00
Comparison PR between Canopy, Hybrid and K-means using the small datasets.
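How a PR curve and its break-even point (the "sweet spot" where precision and recall are equal) can be derived from anomaly scores is sketched below. The scores and labels are hypothetical, all scores are assumed to be distinct, and at least one labelled anomaly is assumed to exist; the function names are illustrative.

```python
def pr_points(scores, labels):
    """Return (threshold, precision, recall) for each cut-off, highest score first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    positives = sum(labels)
    tp = fp = 0
    points = []
    for i in order:
        if labels[i]:
            tp += 1                    # a true anomaly is flagged at this cut-off
        else:
            fp += 1                    # a normal sample is flagged at this cut-off
        points.append((scores[i], tp / (tp + fp), tp / positives))
    return points


def break_even(points):
    """Cut-off at which precision and recall are closest to each other."""
    return min(points, key=lambda pt: abs(pt[1] - pt[2]))


# Hypothetical anomaly scores; label 1 marks a true anomaly.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]
print(break_even(pr_points(scores, labels)))
```

Lowering the threshold moves along the list of points toward higher recall and, usually, lower precision, which is the trade-off described for the judgement threshold.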
Canopy, Hybrid and K-means based on ROC curves with different 
By contrasting the actual and predicted fault locations, we were able to evaluate the performance of the clustering techniques. The detection method’s efficiency is highly sensitive to the choice of k. Figure 8 illustrates the ROC curves for different
The outcomes indicate that the Hybrid algorithm outperforms the Canopy and K-Means methods in detecting aberrant electricity usage patterns, even for large datasets. The ROC curves show that the Hybrid algorithm has improved detection performance, with greater TPR values at lower FPR levels. In addition, the optimum points of the Canopy-based algorithms are relatively close to one another, unlike the optimum point of the K-Means method. The Hybrid and Canopy approaches have a greater recall value at the optimum point than the K-Means algorithm, indicating that they are more accurate at detecting true positives. It is important to note, though, that the proposed approach may pose greater challenges than the other two algorithms for consumers who use electricity in unconventional ways. This indicates that the proposed method may require further development or customization for certain applications, such as identifying anomalous electricity usage patterns for customers with specific energy consumption habits. Thus, it appears that the Hybrid technique has potential for improving the precision and efficiency of detecting unusual patterns in electrical consumption. Additional research in this area may have significant benefits for the energy industry and its customers. When compared to the Canopy and K-Means methods for
Large dataset, the AUC of Canopy, Hybrid and K-means with different
values tested
CKMA, CA and KMA based on ROC curves with different 
Figure 9a–c shows that the TPR is greater than the FPR at the sweet spot of the ROC curve when
Small dataset, the AUC of Canopy, Hybrid and K-means with different
values tested
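For readers who want to reproduce AUC figures of this kind, the following sketch shows how an ROC curve and its area can be computed from anomaly scores. It assumes distinct scores, at least one positive and one negative sample, and trapezoidal integration; the data are hypothetical and the function names are illustrative.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) points obtained by lowering the threshold step by step."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    positives = sum(labels)
    negatives = len(labels) - positives
    tp = fp = 0
    points = [(0.0, 0.0)]
    for i in order:
        if labels[i]:
            tp += 1
        else:
            fp += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Trapezoidal area under a curve given as (x, y) points sorted by x."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


# Hypothetical anomaly scores; label 1 marks a true anomaly.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]
print(auc(roc_points(scores, labels)))
```

An AUC of 1.00 corresponds to a ranking in which every true anomaly scores higher than every normal sample, while 0.5 corresponds to a random ranking.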
Comparison between proposed study and literature methods
Based on the results of our literature review, we have compared several popular algorithms. In this study, we tested how well a method that combines Canopy and K-Means for base detection could perform. We compared this method against both Canopy and K-Means separately. Three algorithms – Local Matrix Reconstruction (LMR), Local Outlier Factor (LOF), and a Gaussian Kernel Function Improved LOF Algorithm (GKLOF) – were used in the studies we reviewed. Our results are summarized in Table 6, which compares the Canopy, K-Means, and Hybrid algorithms to the LOF, LMR, and GKLOF algorithms across a variety of criteria. Precision values achieved for different values of k clustering for both small and large datasets are listed in Table 6. In this work, we focus on contrasting the efficiency of the Canopy and K-Means algorithms with that of the hybrid-base detection technique, Canopy
1. Small data performance
According to the literature benchmark, the hybrid algorithm outperforms the Canopy and K-Means algorithms in detection performance. This shows that the hybrid algorithm has a stronger ability to discriminate between normal and aberrant patterns of electricity usage in small datasets. The LMR-based detection algorithm, on the other hand, is highlighted in the literature benchmark and consistently achieves a precision score of 1.00 for any value of k in the small dataset. This shows that anomalous consumption patterns are consistently and precisely detected by the LMR algorithm.
2. Large data performance
Even when used on big datasets, the hybrid method continues to outperform the Canopy and K-Means methods. This suggests that even when dealing with a larger population of clients with varied power usage patterns, the hybrid algorithm’s performance is still reliable. In addition, a review of the literature shows that the performance of the Grid-basis on K-Neighbor GKLOF, LOF, and LMR algorithms varies when handling large datasets. Performance for the LOF approach falls more precipitously than for the LMR method, while the GKLOF algorithm performs fairly consistently. Our research shows that for both small and large datasets, the hybrid-based detection approach outperforms the Canopy and K-Means techniques. The usefulness of the LMR-based detection approach, especially for small datasets, is highlighted by these results, which are consistent with earlier research. They also show the shortcomings of the LOF and GKLOF algorithms, particularly when dealing with big datasets and small
Conclusion
To find unusual patterns in power consumption, this paper offers a hybrid technique that combines the Canopy and K-means algorithms. The investigation has provided several important observations. First, a powerful detector that measures the level of abnormality in each sample of power consumption has been created and put into practice. It does this by using the Hybrid algorithm at the neighbours level. Second, the study focuses on characterizing daily load characteristics rather than high-dimensional daily load curves, making it simpler to compare load data gathered at various sampling rates. Third, the suggested approach outperforms the other two detection algorithms with regard to detection precision and parameter sensitivity. It might considerably cut down on the time and materials needed to conduct inspections while increasing overall accuracy. The necessary algorithms and threshold settings can also be used to determine the threshold amount for abnormal line losses in a power system. Subsequent studies should aim to improve the understanding of threshold creation in various circumstances in order to advance the area.
Footnotes
Acknowledgments
The researchers would like to thank the Research lab of the FTSM-UKM, in Malaysia and the University of Fallujah in Iraq for their assistance.
Conflict of interest
The authors declare no conflict of interest.
Funding
Universiti Kebangsaan Malaysia (UKM) financed this research under UKM Grant Code: FRGS/1/2021/ICT07/UKM/02/1.
