Abstract
Online services have advanced to the point where they make our lives much easier, but many problems must still be solved to make these services safer for consumers. Numerous transactions are conducted daily, and much personal information is published and shared on e-commerce and social media platforms. This makes security, privacy, and reliability barriers to overcome. One of these problems is detecting credit card fraud, because thieves who steal credit card information try to make their transactions look legitimate. Imbalanced data is a common problem in machine learning that impairs the performance of the classifiers used in real-world systems, for example in anomaly detection and fraudulent transaction detection. The term “data imbalance” refers to a sample distribution that is skewed towards a particular class. Software failure prediction datasets fall into the same category, because non-defective software modules dominate them. The main objective of this paper is to address the imbalance of the credit card fraud dataset in order to enhance the detection accuracy of machine learning algorithms. This paper provides a fraud detection model that uses Particle Swarm Optimization (PSO) based oversampling of the minority class to solve the imbalanced dataset problem, and compares it with a Genetic Algorithm (GA) based technique. The Random Forest (RF) algorithm is used as the classifier and is evaluated in terms of sensitivity, specificity, and accuracy. The experiments achieved accuracies of 99.3% and 99.4% for GA and PSO, respectively, within seconds. Experiments show that the proposed methods outperform other methods, as evidenced by the higher classification accuracy obtained.
Introduction
Technological advances and the development of new payment methods have brought society many benefits, establishing the credit card as the most popular payment method for online and offline purchases. However, this has also increased card fraud. As e-commerce has grown in popularity, online payment has become more widespread, and the security of Internet transactions has become increasingly vital. According to a study conducted by the Network Security Team of the Beijing Municipal Public Security Bureau and the Hunting Network Platform Company, the number of cybercriminals reached 1.6 million, and the market for online scams reached 110 billion yuan [1]. Automatic credit card fraud detection is a complicated machine learning problem, and several strategies have been proposed to solve it. The lack of adequate online payment mechanisms and of user trust in those making electronic payments, together with issues with the perceived security of the payment mechanism, contributed to the low demand for online purchases. As a result, this proposal examined current commercial fraud protection strategies to identify flaws and proposed a model for e-commerce fraud detection [2].
Financial fraud is increasingly prevalent in financial markets over the Internet, from buying and selling on merchant websites like Amazon and eBay to Automated Teller Machine (ATM) transactions. ATM card fraud detection has become a data mining task for several fundamental reasons. Fraudsters are more organized, professional, and accurate in their actions and approaches, and ATM card fraud datasets are significantly skewed. ATM card fraud detection performance is significantly affected by the sampling method applied to the dataset, the variables chosen, and the detection algorithm used. The banking and digital payment industries are being disrupted by ATM card fraud, and machine learning is quickly becoming the industry standard for risk mitigation. The topic of fraud defense has received a lot of attention recently, since fraud detection is a serious problem that affects applications in a variety of disciplines. Figure 1 shows the main processes of the fraud detection system. Researching, designing, testing, and improving fraud detection algorithms require extensive information about the domain and its specific problems [3]. This paper presents an electronic payment system that uses optimization algorithms such as GA and PSO to detect fraudulent transactions. The main objective of the suggested system is to provide high detection accuracy in a short period and at a low cost.

General representation of the fraud detection system.
The main problem is that machine learning models perform much worse when the training data is not evenly distributed across classes. The most representative samples of minority groups are first isolated through the unique research method of selective oversampling. Many datasets lack balance, so that entries belonging to some categories are numerous while others are rare. The performance of a classifier is affected by the unbalanced nature of such datasets. Also, the accuracy of instance selection approaches is poor when dealing with imbalanced data. Therefore, there is a strong need for an algorithm that oversamples the minority class more accurately. In this paper, a new oversampling technique based on PSO is proposed. It performs better than GA up-sampling techniques in terms of precision and accuracy.
The rest of this paper is organized as follows. Section 2 introduces background on credit card fraud. Section 3 reviews the related works. The proposed model is discussed in detail in Section 4. Section 5 discusses the experiments and performance measures. Finally, a brief conclusion and recommendations for further research are presented in the last section.
Credit card fraud has become a widespread phenomenon due to technological advances. We can make any purchase anywhere with a credit card, and protecting these transactions from scammers is a challenging problem for researchers to solve. It is also possible to argue that economic fraud has expanded dramatically as global communication has improved. The annual losses resulting from these fraudulent operations could amount to billions of dollars. Therefore, it is essential to enhance the performance of machine learning methods. In this section, the machine learning techniques applied in the proposed models are discussed in detail. The general structure of the fraud detection system is shown in Fig. 1.
Random Forest (RF)
The RF classification algorithm is used as the classifier in the fraud detection system. The popularity of decision tree models in data mining can be attributed to their algorithmic simplicity and their versatility in processing various data attributes. However, single-tree models are sensitive to the specific training data and overfit easily. Ensemble approaches, which combine a set of individual classifiers to address these problems, are more accurate than single classifiers. RF is one of these ensemble approaches. It is a collection of tree predictors in which each tree depends on an independently sampled random dataset, and all trees in the forest have the same distribution. The capacity of RF is determined by the strength of the individual trees and the correlation between them: the higher the strength of a single tree and the lower the correlation between trees, the better the performance of the random forest. The randomization of the trees arises because they are built from bootstrap samples and select a subset of attributes from the data at random. The training data for each tree consists of bootstrap samples drawn randomly, with replacement, from the original training set [4]. The algorithm randomly selects a subset of attributes at each internal node and calculates the centers of the different data classes at the current node, as shown in Fig. 2.

Representation of random forest.
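The two sources of randomness described for RF, bootstrap sampling of rows and random selection of attributes, can be illustrated with a minimal sketch. This is not the implementation used in the paper: the function names are hypothetical, the square-root rule for the subset size is a common default that is assumed here, and the data is synthetic.

```python
import numpy as np

def bootstrap_sample(X, y, rng):
    """Draw len(X) rows with replacement (one tree's training set)."""
    idx = rng.integers(0, len(X), size=len(X))
    return X[idx], y[idx]

def random_feature_subset(n_features, rng):
    """Pick about sqrt(n_features) distinct feature indices (assumed rule)."""
    k = max(1, int(np.sqrt(n_features)))
    return rng.choice(n_features, size=k, replace=False)

def majority_vote(tree_predictions):
    """Combine per-tree 0/1 labels (rows = trees) into one label per sample."""
    votes = np.asarray(tree_predictions)
    # For binary labels, a mean above 0.5 means class 1 wins the vote.
    return (votes.mean(axis=0) > 0.5).astype(int)

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 9))            # synthetic stand-in features
y = np.array([0] * 8 + [1] * 2)         # synthetic labels
Xb, yb = bootstrap_sample(X, y, rng)
features = random_feature_subset(X.shape[1], rng)
```

Each tree would be grown on its own `(Xb, yb)` sample and would consider only a fresh `features` subset at every split, which is what keeps the correlation between trees low.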
GA is an evolutionary technique that is used here for oversampling. In a GA, a chromosome represents a solution, and new solutions are generated using the crossover and mutation processes. Crossover fuses two chromosomes into one, and mutation modifies a chromosome to produce a new solution. Each solution is then evaluated using the objective function, and any solution that does not meet the criteria is removed. The remaining solutions repeat the mutation and crossover procedure until a stop condition is met, such as a limit on the execution time or the number of iterations. The evolutionary algorithm can be applied to overcome the imbalance of the credit card fraud dataset. The oversampling method produces offspring based on the nearest neighbor, resulting in less diverse samples. Using inheritance theory, new samples are generated for the population and lie within the minority border. The parents involved in each generation are retained in the population so that all the information of the minority samples is kept even after the offspring are generated [5]. The GA-based oversampling technique is shown in Fig. 3.

GA based oversampling technique.
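The generic GA loop described above (crossover, mutation, evaluation, and a stop condition) can be sketched as follows. The single-point crossover, the Gaussian mutation, the mutation rate, and the toy objective are all illustrative assumptions rather than the settings used in the paper.

```python
import random

def crossover(parent_a, parent_b, rng):
    """Single-point crossover: fuse two chromosomes into one child."""
    point = rng.randint(1, len(parent_a) - 1)
    return parent_a[:point] + parent_b[point:]

def mutate(chromosome, rng, rate=0.1, scale=0.05):
    """Perturb each gene with probability `rate` by small Gaussian noise."""
    return [g + rng.gauss(0.0, scale) if rng.random() < rate else g
            for g in chromosome]

def evolve(population, fitness, generations=20, seed=0):
    """Repeat crossover and mutation, keeping the fittest individuals."""
    rng = random.Random(seed)
    size = len(population)
    for _ in range(generations):  # stop condition: fixed iteration count
        children = []
        for _ in range(size):
            a, b = rng.sample(population, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        # Parents stay in the pool, so a solution is only dropped
        # when a fitter individual displaces it.
        pool = population + children
        pool.sort(key=fitness, reverse=True)
        population = pool[:size]
    return population

def toy_fitness(chromosome):
    """Hypothetical objective: prefer chromosomes close to the origin."""
    return -sum(g * g for g in chromosome)

init_rng = random.Random(1)
start = [[init_rng.uniform(-1, 1) for _ in range(4)] for _ in range(10)]
final = evolve(start, toy_fitness)
```

Keeping parents in the selection pool mirrors the remark that parents are retained in the population; the best fitness therefore never decreases from one generation to the next.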
Credit card datasets often have an unbalanced number of training samples. For example, the fraud class has only a few samples, whereas the normal class has thousands. As a result of this mismatch, the misclassification rate is exceptionally high, and accuracy suffers. Before classification, the training samples should be balanced to improve the classifier’s precision. The well-known PSO technique is used to oversample the training data for each class. Eberhart and Kennedy [6] presented PSO as an evolutionary algorithm. PSO-based learning is one of the most common evolutionary optimization strategies due to its simplicity and ease of convergence. Assume that the search space is $D$-dimensional; the swarm is the population, and a particle is the $i$-th individual of this population. The particle is represented by a $D$-dimensional vector $X_i = (x_{i1}, x_{i2}, \ldots, x_{iD})$. The velocity of this particle is described by another vector $V_i = (v_{i1}, v_{i2}, \ldots, v_{iD})$. The best previously visited position of the $i$-th particle is defined as $P_i = (p_{i1}, p_{i2}, \ldots, p_{iD})$. The purpose is then to discover the best position of particle $X_i$ using the update equations (1) and (2).
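The update equations themselves are not reproduced in this text. For reference, the standard form of the PSO update is sketched here under assumptions: the inertia weight $w$, the acceleration coefficients $c_1$ and $c_2$, the uniform random numbers $r_1, r_2 \in [0, 1]$, the iteration index $t$, and the swarm-best position $P_g = (p_{g1}, \ldots, p_{gD})$ are symbols introduced for this sketch and are not defined in the surrounding text.

```latex
\begin{align}
v_{id}^{t+1} &= w\, v_{id}^{t} + c_1 r_1 \left(p_{id} - x_{id}^{t}\right) + c_2 r_2 \left(p_{gd} - x_{id}^{t}\right) \tag{1} \\
x_{id}^{t+1} &= x_{id}^{t} + v_{id}^{t+1} \tag{2}
\end{align}
```

Setting $w = 1$ recovers the formulation without an inertia term; whether the paper uses an inertia weight is not stated here.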
Detecting credit card fraud is one of the most pressing problems today, which has led numerous researchers to attempt to address it using various algorithms and methodologies. This section will discuss recent research on detecting fraudulent transactions to improve the model performance and secure future transactions.
Kamaruddin et al. [7] developed a hybrid architecture for classification in the big data paradigm on a Spark cluster that integrates a self-Associative Neural Network (AANN) and PSO. They could correctly classify 85.8% of fraudulent credit card transactions on average. In [8], the authors created a model based on the Adaptive Neuro-Fuzzy Inference System (ANFIS). The Teaching-Learning-Based Optimization (TLBO) and PSO algorithms are used to fine-tune the ANFIS parameters. The goal of this methodology is to increase network performance and reduce computational complexity.
S. Xuan et al. [9] examined the performance of two different types of random forest models. The first is the traditional random forest, which chooses a subset of attributes at random for each interior node. Furthermore, the approaches in different classes are estimated. The second is based on the Classification And Regression Tree (CART), which splits the dataset at each node. They compared machine learning techniques (RF, SVM, and ANN) for detecting credit card fraud. The experiments showed that RF is more accurate than SVM at detecting fraud but less accurate than ANN. The first algorithm reached 91.96%, while the second achieved 96.77%. Moreover, Yee et al. [10] tested several supervised classifiers for credit card fraud detection, including Bayesian (K2), Tree Augmented NB (TAN), Logistics, J48, and NB classifiers. The data wrangling step was performed using Principal Component Analysis (PCA). The accuracies of all classifiers were above 95.0%. The limitations were that they used a simulated dataset and that the data required manual labeling. Another contribution [11] used a deep autoencoder to build a one-of-a-kind bipartite graph taxonomy and evaluate the experimental comparability of user behavior. This deep integration could uncover fraud blocks. Not only does DeepFD outperform previous benchmarking methodologies, but it is also significantly more effective at detecting various barriers to duplication regularly.
Devi et al. [12] created a model based on a random forest algorithm to increase fraud detection effectiveness. The model was tested on two benchmark datasets, and its performance on both datasets is above 80%. Amusan et al. [13] created a hybrid technique called Counter Propagation Neural Network and GA (CPNN-GA) to detect anomalies in online transaction systems. It combines the GA and artificial neural networks. The GA is an efficient algorithm for fine-tuning the parameters of the classification solution to obtain a low false alarm rate and a high fraud detection rate. Experiments show that both CPNN (84.42) and GA (89.42) are accurate, but CPNN-GA (95.58) is more accurate than either. Alghofaili et al. [14] tested the Long Short-Term Memory (LSTM) approach. They improved a standard financial fraud detection system on an actual dataset, and the model detects fraud patterns accurately and quickly, with 99.95% accuracy in less than a minute. The authors in [15] created the Dual Sequential Variational Autoencoders (DuSVAE) model for fraud detection. Two sequential variational autoencoders use sequential input data to construct a concentrated representation vector, which is then supplied to the classifier to classify the transactions as genuine or fraudulent. Saheed et al. [16] applied the GA as a feature selection technique consisting of two phases. The first phase selects the eight most suitable attributes, referred to as first priority features. In the second phase, another set of eight features is reviewed and selected, referred to as second priority features. NB, RF, and SVM were used to detect fraud in the German credit card dataset, which is an imbalanced dataset. Lastly, in [17], GA tuned the hyperparameters of the AdaBoost (AB), RF, Decision Tree (DT), Logistic Regression (LR), and SVM classifiers to detect fraud effectively.
The reported accuracy, precision, recall, and F1 score suggest that the genetic algorithm can produce superior results in a shorter period than the compared approaches. The related works on credit card fraud detection are summarized in Table 1.
Related works of credit card fraud detection
This section describes the experiments and the proposed framework of the fraud detection model. In the proposed method, evolutionary algorithms are used to oversample the minority class in order to detect fraudulent transactions. The proposed oversampling strategy aims to improve failure prediction performance on unbalanced datasets. The proposed fraud detection model was designed and developed to classify customer transactions as genuine or fraudulent based on an optimization technique. The data was collected from the “BankSim” dataset, available online at Kaggle. The data was preprocessed (selection of independent variables, data cleaning, and data splitting). The data was found to be unbalanced, and GA- and PSO-based oversampling algorithms were applied to address this shortcoming. Finally, the model was tested, evaluated, and checked for suitability. The proposed fraud detection model is shown in Fig. 4.

The proposed model for fraud detection.
The implementation scenario of credit card fraud detection is as follows:
1. The credit card transaction dataset is obtained from Kaggle.
2. The data is prepared and normalized.
3. The data is split into training and test sets.
4. The class distribution is examined and found to be unbalanced, so the target classes are oversampled.
5. The oversampled dataset is created using the GA and PSO techniques. To examine performance, an RF model is fitted to the oversampled dataset.
6. The two oversampling methods are compared using accuracy, precision, specificity, and confusion matrices.
The scenario is shown in Fig. 5.
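The preparation steps of this scenario (normalizing, splitting, and examining the class distribution) can be sketched as follows. This is an illustration on synthetic data, not the paper's code: min-max scaling, the stratified split, the 20% test fraction, and all function names are assumptions.

```python
import numpy as np

def min_max_normalize(X):
    """Scale every column to [0, 1]; constant columns map to 0."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)
    return (X - lo) / span

def stratified_split(X, y, test_fraction, rng):
    """Split so that each class keeps the same share in train and test."""
    test_idx = []
    for label in np.unique(y):
        members = rng.permutation(np.flatnonzero(y == label))
        n_test = int(round(test_fraction * len(members)))
        test_idx.extend(members[:n_test])
    test_mask = np.zeros(len(y), dtype=bool)
    test_mask[test_idx] = True
    return X[~test_mask], X[test_mask], y[~test_mask], y[test_mask]

def class_distribution(y):
    """Return {label: count} so the imbalance can be inspected."""
    labels, counts = np.unique(y, return_counts=True)
    return dict(zip(labels.tolist(), counts.tolist()))

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))          # synthetic stand-in features
y = np.array([0] * 990 + [1] * 10)      # 1% synthetic "fraud" labels
X_train, X_test, y_train, y_test = stratified_split(
    min_max_normalize(X), y, test_fraction=0.2, rng=rng)
print(class_distribution(y_train))      # minority class is heavily outnumbered
```

Only the training portion would then be passed to the oversampling step, so that the test set keeps the original class distribution.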
There are no missing values in the “BankSim” dataset, so there is no need to fill in missing information. However, the dataset contains categorical (object) data types. Therefore, using the Scikit-Learn package, we apply data encoding to the categorical data columns. The gender and age columns contain erroneous data. First, we replace the values “E” and “U” in the “Gender” column with “M”. Second, we change the “U” value in the “Age” column to the number 7. Figure 6 shows these columns before and after preprocessing. The cleaned dataset is then divided into two categories, fraud and genuine, to prepare it for the oversampling process.

The implementation scenario of credit card fraud detection.

Preprocessing of features: a) Age before processing. b) Age after processing. c) Gender before processing. d) Gender after processing.
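The cleaning and encoding steps described above can be sketched in plain Python. The rows, column names, and category values below are hypothetical examples, and the integer encoding by sorted order is an assumption meant to resemble a typical label encoder rather than the exact Scikit-Learn call used in the paper.

```python
# Hypothetical raw rows; the column names and values are illustrative.
rows = [
    {"age": "2", "gender": "F", "category": "es_health"},
    {"age": "U", "gender": "E", "category": "es_travel"},
    {"age": "4", "gender": "U", "category": "es_health"},
]

def clean_row(row):
    """Apply the two fixes described in the text to one record."""
    cleaned = dict(row)
    if cleaned["gender"] in ("E", "U"):
        cleaned["gender"] = "M"          # replace "E" and "U" with "M"
    # Replace the unknown age marker "U" with the number 7.
    cleaned["age"] = 7 if cleaned["age"] == "U" else int(cleaned["age"])
    return cleaned

def label_encode(values):
    """Map each distinct category to an integer code (sorted order)."""
    codes = {v: i for i, v in enumerate(sorted(set(values)))}
    return [codes[v] for v in values], codes

cleaned = [clean_row(r) for r in rows]
category_codes, mapping = label_encode([r["category"] for r in cleaned])
```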
The “BankSim” dataset has an unbalanced number of training samples. As a result of this mismatch, the misclassification rate is extremely high, and accuracy suffers. Before classification, the training samples must be balanced to improve the accuracy of the classifier. The GA is used to oversample the training samples for each class. In this phase, the genetic algorithm crossover operator is added to the sample synthesis process: to produce new individuals, the crossover operator mixes the properties of two samples. The pseudocode of the algorithm is shown in Algorithm 1.
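Algorithm 1 is not reproduced in this text. A minimal sketch of crossover-based synthesis of minority samples is given here under assumptions: the uniform crossover mask, the small Gaussian mutation, the function name `ga_oversample`, and the synthetic data are illustrative choices, and no fitness-based selection step is included.

```python
import numpy as np

def ga_oversample(X_min, n_new, rng, mutation_scale=0.01):
    """Create n_new synthetic minority rows by crossover plus light mutation."""
    n, d = X_min.shape
    synthetic = np.empty((n_new, d))
    for k in range(n_new):
        i, j = rng.choice(n, size=2, replace=False)   # two distinct parents
        mask = rng.random(d) < 0.5                    # uniform crossover mask
        child = np.where(mask, X_min[i], X_min[j])    # mix parent features
        child = child + rng.normal(0.0, mutation_scale, size=d)  # mutation
        synthetic[k] = child
    return synthetic

rng = np.random.default_rng(0)
X_majority = rng.normal(0.0, 1.0, size=(200, 4))     # synthetic "genuine"
X_minority = rng.normal(3.0, 0.5, size=(10, 4))      # synthetic "fraud"
X_new = ga_oversample(X_minority, len(X_majority) - len(X_minority), rng)
X_min_balanced = np.vstack([X_minority, X_new])      # parents are retained
```

Because every child takes each feature from one of its two parents, the synthetic rows stay close to the region spanned by the original minority samples.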
PSO based oversampling
A critical step is determining the amount of oversampling required per class, because this has a significant impact on the accuracy of the classifier. The PSO algorithm is used in this investigation to oversample the minority samples. The training dataset is oversampled as follows:
The PSO is initialized by specifying the balanced degree. Then, a new dataset is created by adding the generated samples to the original data, as shown in Algorithm 2.
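Algorithm 2 is not reproduced in this text, so the following is only a sketch of how a particle swarm could generate minority samples. Everything specific in it is an assumption: the fitness function (close to minority rows, far from majority rows), the values of `w`, `c1`, and `c2`, the clipping to the minority bounding box, and the reading of `balance_degree` as the target ratio of minority to majority samples.

```python
import numpy as np

def nearest_distance(points, reference):
    """Euclidean distance from each point to its closest row in `reference`."""
    diff = points[:, None, :] - reference[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2)).min(axis=1)

def fitness(points, X_min, X_maj):
    """Assumed objective: close to minority rows, far from majority rows."""
    return nearest_distance(points, X_maj) - nearest_distance(points, X_min)

def pso_oversample(X_min, X_maj, rng, balance_degree=1.0,
                   iters=15, w=0.7, c1=1.5, c2=1.5):
    """Return synthetic minority rows found by a small particle swarm."""
    n_new = int(balance_degree * len(X_maj)) - len(X_min)
    d = X_min.shape[1]
    lo, hi = X_min.min(axis=0), X_min.max(axis=0)     # minority bounding box
    # Each particle starts at a random minority row plus small noise.
    start = X_min[rng.integers(0, len(X_min), n_new)]
    pos = np.clip(start + rng.normal(0.0, 0.1, (n_new, d)), lo, hi)
    vel = np.zeros((n_new, d))
    best_pos, best_fit = pos.copy(), fitness(pos, X_min, X_maj)
    for _ in range(iters):
        g = best_pos[best_fit.argmax()]               # swarm-best position
        r1, r2 = rng.random((n_new, d)), rng.random((n_new, d))
        vel = w * vel + c1 * r1 * (best_pos - pos) + c2 * r2 * (g - pos)
        pos = np.clip(pos + vel, lo, hi)              # stay inside the box
        fit = fitness(pos, X_min, X_maj)
        improved = fit > best_fit
        best_pos[improved] = pos[improved]
        best_fit[improved] = fit[improved]
    return best_pos

rng = np.random.default_rng(0)
X_majority = rng.normal(0.0, 1.0, size=(200, 4))     # synthetic "genuine"
X_minority = rng.normal(3.0, 0.5, size=(10, 4))      # synthetic "fraud"
X_synthetic = pso_oversample(X_minority, X_majority, rng)
X_min_balanced = np.vstack([X_minority, X_synthetic])
```

The personal-best positions are returned as the synthetic samples and appended to the original minority rows, matching the description of adding the generated samples to the original data.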
Classification
The RF ensemble learning algorithm is used to train the model at this point. After partitioning the data, the model was trained on 80% of the data. The fraud detection procedure has two significant parts: the training phase and the testing phase. In the training phase, the classifier learns from data with class labels. In the testing phase, the produced model is used to process new data that was not seen during training. The probability values produced by the model are compared to identify fraudulent transactions. The result of the training and testing process is the trained model’s prediction of fraudulent transactions.
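The training and testing phases can be sketched with Scikit-Learn, which the paper reports using. The synthetic data, the number of trees, the random seeds, and the 0.5 probability threshold are assumptions made for this illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Synthetic, already balanced stand-in for the oversampled transactions.
X = np.vstack([rng.normal(0.0, 1.0, (300, 4)), rng.normal(2.5, 1.0, (300, 4))])
y = np.array([0] * 300 + [1] * 300)     # 0 = genuine, 1 = fraud

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)   # 80% for training

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)                            # training phase

fraud_probability = model.predict_proba(X_test)[:, 1]  # P(class == 1)
predicted = (fraud_probability >= 0.5).astype(int)     # assumed threshold
test_accuracy = (predicted == y_test).mean()           # testing phase
```

Lowering the threshold would flag more transactions as fraud, trading a higher sensitivity for more false positives.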
Results and discussions
This section discusses the results of the two experiments. First, the environment used in all experiments will be shown. Second, the dataset used in experiments will be listed. Third, the evaluation metrics will be explained. Fourth, the performance of the two experiments will be discussed. Finally, a performance comparison is made on the same subset of data results.
Environment specifications
The following three tools were used to implement the proposed experiments. The first is Google Colab [18], a cloud-based hosted Jupyter notebook service that provides free access to computational resources such as Graphics Processing Units (GPUs). The second is Python, a widely used programming language. The third is a set of Python libraries for machine learning and data analysis: Matplotlib, Scikit-Learn, Pandas, and NumPy.
Dataset
The “BankSim” dataset [19] is used, which is based on a sample of aggregated real-world transaction data provided by a single Spanish bank and uses a multi-agent-based simulation technique. The initial bank details comprise thousands of transactional data entries from November 2012 to April 2013. Thousands of transactions corresponding to card purchases were simulated over 180 days in this dataset. There are 7,200 records in the data collection classified as “Fraud”. The remaining 587,443 are assigned “genuine”. Table 2 shows the features of the dataset transaction.
Feature details of the “BankSim” dataset
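The degree of imbalance follows directly from the two record counts stated for the dataset. The short calculation below uses only those counts; the variable names are illustrative, and the balancing target assumes oversampling the fraud class up to the size of the genuine class over the full dataset rather than a training split.

```python
fraud, genuine = 7_200, 587_443          # counts stated in the text
total = fraud + genuine                  # 594,643 transactions
fraud_rate = fraud / total               # about 0.0121, i.e. roughly 1.2%
imbalance_ratio = genuine / fraud        # about 81.6 genuine per fraud
needed_for_balance = genuine - fraud     # 580,243 synthetic fraud samples
```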
Accuracy, precision, sensitivity, and specificity are the numerical measures used to assess performance. They are computed from four counts: true positives (TP) and true negatives (TN) are the numbers of positive and negative samples that were correctly classified, false positives (FP) are negative samples misclassified as positive, and false negatives (FN) are cases classified as negative that are truly positive [20].
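The standard definitions of these measures in terms of the four confusion-matrix counts can be written as a short function. The counts passed in the example are hypothetical and chosen only to keep the arithmetic easy to follow; they are not results from the paper.

```python
def classification_metrics(tp, tn, fp, fn):
    """Standard confusion-matrix metrics; fraud is the positive class."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),   # also called recall
        "specificity": tn / (tn + fp),
    }

# Hypothetical counts, not taken from the paper's confusion matrices.
metrics = classification_metrics(tp=90, tn=880, fp=20, fn=10)
```

With these counts, accuracy is 970/1000, precision is 90/110, sensitivity is 90/100, and specificity is 880/900.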
The confusion matrices of the two proposed methods are shown in Fig. 7.

Confusion matrices of the proposed oversampling techniques: a) confusion matrix of the GA technique and b) confusion matrix of the PSO technique.
This experiment aims to demonstrate whether the Genetic algorithm is more helpful in oversampling the fraud samples.
The performance and processing time of GA-based oversampling with the Random Forest classifier are measured. The data is split into 70% (training) and 30% (testing and validation). The classifier accuracy is reported for the original unbalanced dataset and for the balanced dataset obtained with the proposed technique. The GA oversampling technique improved fraud detection accuracy by approximately 4 percentage points: it achieved an accuracy of 99.3% on balanced data compared with 95.37% on unbalanced data, as shown in Table 3.
Results of GA-based oversampling technique
Likewise, regarding sensitivity, the GA achieved 99.8% in balanced data and 85.72% in unbalanced data, a difference of 14%.
The fraud rate in the original dataset is about 1.2% (7,200 of 594,643 transactions), indicating a significant data imbalance issue. This experiment investigates the relationship between the performance of a model and the proportion of legitimate and fraudulent transactions. To compare the performance of the two approaches, we use the same evaluation metrics. The PSO-based oversampling technique improved fraud detection accuracy by about 4.03 percentage points, as shown in Table 4: the PSO algorithm was 99.4% accurate on balanced data and 95.37% accurate on unbalanced data. Precision was 98.3% on balanced data and 76.05% on unbalanced data.
Results of PSO-based oversampling technique
The main goal of this work is to create a reliable oversampling approach. The performance of the suggested oversampling techniques, based on GA and PSO, is studied using the RF classifier and compared with existing oversampling methods. In fraud detection, we would like to have a high true positive count and a low false positive count simultaneously. Fraudulent transactions are conventionally treated as the positive class. Results show that the false positive rate is very high for skewed data when no oversampling technique is used. The number of false positives increases when supervised machine learning classifiers misclassify legitimate (normal) samples as fraudulent. Because the boundary a classifier draws between fraud and non-fraud data is soft, a model that produces a high true positive count may still misclassify legitimate transactions that lie near this boundary as fraudulent, which increases the number of false positives. The suggested model outperformed the other oversampling strategies since it limits the false positives.
The results of the proposed experiments are compared with those of two other research contributions: the standard K2 technique [21] and GA without preprocessing procedures [22]. As indicated in Table 5, the proposed models had the best accuracy, with 99.4% and 99.3% for PSO and GA, respectively, while K2 reached 99.27% and GA without preprocessing had the lowest accuracy, at 97.6%. This is an increase of 1.8 percentage points in accuracy compared to the previous study. As a result, the proposed models compare favorably. Also, the related works ignored the execution time.
Comparison between different oversampling techniques
As can be seen in Fig. 8, the GA and the PSO obtained the best results in terms of precision, with the PSO having a slightly higher value, while the GA obtained the highest value in terms of sensitivity. Overall, PSO was superior.

Comparison between different oversampling techniques.
Overall, the results showed that the PSO-based oversampling technique outperformed the GA-based technique in terms of the fitness value obtained. However, GA executes faster than PSO because GA converges faster than PSO [23], as shown in Table 6.
Execution time of GA and PSO based oversampling techniques
Although credit card fraud detection has attracted research attention, it still has some drawbacks. The unbalanced number of transactions in the different classes is a key drawback. This paper contributes to solving the problem of the unbalanced credit card fraud detection dataset. The RF classifier was used to classify transactions as fraudulent or genuine. The experiments use the Scikit-Learn library, with data coming from Kaggle’s online credit card transaction dataset. To address the aforementioned issue, oversampling of the credit card data is performed using two evolutionary algorithms: GA and PSO were used to oversample the lower dimensional training samples. The performance of the GA and PSO algorithms in detecting fraudulent transactions was compared. Both produced good results in a short time in terms of accuracy, precision, sensitivity, and specificity. The results showed that the accuracy of the classifier can be improved by oversampling the training samples appropriately.
In future work, deep learning models for detecting fraudulent transactions can be used to evaluate the performance of the oversampling techniques. This work could also be extended to other fraud datasets, and various metaheuristic techniques could be employed along with alternative distance metrics to generate oversampled data. A GA can also be used to find the optimal hyperparameters that give the best accuracy.
Acknowledgement
The authors are thankful to the Deanship of Scientific Research at University of Bisha for supporting this work through the Fast-Track Research Support Program.
