Newton’s second law based PSO for feature selection: Newtonian PSO

Abstract

High dimensional data have a very large number of features, but not all features are useful. Irrelevant and redundant features may even reduce classification accuracy. Feature selection is the process of selecting a subset of relevant features to decrease the dimensionality of the data. When applied to high dimensional datasets (Big Data), feature selection methods face many challenges, so it is pertinent to devise new methods or revamp existing ones. In this study, a new method, ‘Newtonian particle swarm optimization (NPSO)’, is proposed. The proposed method uses Newton’s second law of motion to update the learning mechanism of PSO. In NPSO, a particle learns not only from the position but also from the mass and acceleration of neighboring particles. The proposed method is mathematically validated at equilibrium using eigenvalues. Further, the proposed method is applied to high dimensional microarray gene expression datasets. NPSO is also compared with other state-of-the-art feature selection methods. The number of selected features, classification accuracy and dimension reduction are used to appraise the proposed method. Mathematical validation and experimental results support the merits of the proposed method for feature selection. This paper also presents a classwise analysis of the SRBCT, Brain1, 11-Tumor and 14-Tumor datasets. As the number of classes increases, dimension reduction increases but classification accuracy decreases.

Keywords

Particle swarm optimization; law of motion; big data; microarray gene expression; cancer data; feature selection; classification accuracy

1. Introduction

High dimensional data means that the number of features is high compared with the number of samples. In real-world situations, relevant features are often unknown a priori [1]. A relevant feature is neither irrelevant nor redundant to the target concept. Irrelevant and redundant features are not useful for classification, and they may even reduce classification performance because of the large search space, a problem known as the curse of dimensionality [2]. Feature selection addresses this problem by selecting only the relevant features and discarding the irrelevant ones. Feature selection is a difficult task because of complex interactions among features: an individually relevant feature may become redundant when working together with other features [2]. The task is also challenging because of the large search space. It is reported in the literature that nature inspired methods give promising results on high dimensional data [3]. Nature inspired techniques are well known for their global search ability. Particle swarm optimization (PSO) is a nature inspired computing method based on swarm intelligence [4]. PSO is computationally inexpensive because of its simple calculations.

Despite its many advantages, PSO has some bottlenecks that cause convergence to local optima. Several variants of PSO have been reported in the literature to address these drawbacks. A brief survey of PSO for dimensionality reduction has therefore been conducted, and it is concluded from the survey that the velocity update equation of PSO needs some reconsideration. Hence, in this paper a novel Newtonian PSO (NPSO) method is proposed, which is based on Newton’s law of motion. In NPSO, the velocity update equation is rewritten in terms of the mass, position and time step of the local and global best particles. NPSO is mathematically validated using the equilibrium concept of a dynamic system, which depends on the eigenvalues. The proposed method is applied to gene expression microarray cancer datasets to select the most informative genes. A comparative study is also carried out to validate the proposed method.

This paper is organized as follows. Section 2 presents the basic PSO and a survey of PSO variants. Section 3 provides the problem definition and proposes the novel NPSO method. Section 4 presents the proposed algorithm. Section 5 covers the analysis of the proposed algorithm. Section 6 deals with the simulation part of this paper. Section 7 shows the classwise analysis. Finally, Section 8 concludes the entire work.

2. Particle swarm optimization

Particle swarm optimization (PSO) is a nature inspired method that is computationally less expensive and faster than other methods [4]. PSO starts with a population of particles. The particles move in the search space to find the optimal solution; each particle is updated based on its own experience and the experience of its neighboring particles. Each particle is described by position and velocity vectors. The position vector of the i^th particle is X_i = (x_i1, x_i2, . . . , x_id), where d is the dimensionality of the search space. The velocity of the i^th particle is V_i = (v_i1, v_i2, . . . , v_id). Two terms, pbest and gbest, represent personal and global learning respectively. pbest (pb_i) is the personal best position obtained by the i^th particle, defined as pb_i = (pb_i1, pb_i2, . . . , pb_id). gbest (gb) is the global best position obtained by the swarm, defined as gb = (gb₁, gb₂, . . . , gb_d). The position and velocity of each particle are updated using Equations (1) and (2). $x_{id}^{new} = x_{id}^{old} + v_{id}^{new}$ (1)

$v_{id}^{new} = v_{id}^{old} + b_{1} * r_{1} * ({pb}_{id} - x_{id}^{old}) + b_{2} * r_{2} * ({gb}_{d} - x_{id}^{old})$ (2) where r₁ and r₂ are random values drawn from a uniform distribution on (0,1), and b₁ and b₂ are the cognitive and social parameters.

In 1998, an inertia weight (ω, written as w in the equations) was added to the velocity update Equation (2); the modified equation is given as Equation (3).

$v_{id}^{new} = w * v_{id}^{old} + b_{1} * r_{1} * ({pb}_{id} - x_{id}^{old}) + b_{2} * r_{2} * ({gb}_{d} - x_{id}^{old})$ (3)
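As an illustration only, Equations (1) and (3) can be sketched in Python for a single particle. The function name `pso_step` and the optional `rng` argument are assumptions made for this sketch; the default values w = 0.5 and b₁ = b₂ = 2 are the settings reported in the experimental setup of this paper.

```python
import random


def pso_step(x, v, pb, gb, w=0.5, b1=2.0, b2=2.0, rng=random):
    """One inertia-weight PSO update for a single particle.

    x, v, pb and gb are equal-length lists: position, velocity,
    personal best position and global best position.
    """
    new_x, new_v = [], []
    for d in range(len(x)):
        # Fresh uniform random numbers in [0, 1) for every dimension.
        r1, r2 = rng.random(), rng.random()
        # Equation (3): inertia + cognitive term + social term.
        vd = w * v[d] + b1 * r1 * (pb[d] - x[d]) + b2 * r2 * (gb[d] - x[d])
        new_v.append(vd)
        # Equation (1): move the particle with the new velocity.
        new_x.append(x[d] + vd)
    return new_x, new_v
```

When a particle already sits on both its personal best and the global best, the two learning terms vanish and only the inertia term w * v remains.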

2.1. Variants of PSO

This section describes different variants of PSO implemented in various fields of research.

2.1.1. Binary PSO (BPSO)

BPSO [5] is a binary version of PSO for solving discrete problems. The position of each particle is encoded in binary form: the values of x_id, pb (pbest) and gb (gbest) are 0 or 1. In BPSO, the velocity is transformed using a sigmoid function according to Equation (4), and the position vector is updated according to Equation (5). $sig (v_{id}) = \frac{1}{1 + e^{- v_{id}}}$ (4) $x_{id} = \begin{cases} 1, & \text{if } rand < sig (v_{id}) \\ 0, & \text{otherwise} \end{cases}$ (5) where rand is a uniformly distributed random number in the range (0,1).
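The BPSO position rule of Equations (4) and (5) can be sketched as follows. This is a minimal illustration; the function names and the `rng` argument are assumptions of the sketch, not part of BPSO as published.

```python
import math
import random


def sigmoid(v):
    """Equation (4): squash a velocity into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def bpso_position(velocity, rng=random):
    """Equation (5): set each bit to 1 with probability sig(v_id)."""
    return [1 if rng.random() < sigmoid(v) else 0 for v in velocity]
```

A large positive velocity makes a bit very likely to be 1 (feature selected), and a large negative velocity makes it very likely to be 0.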

Catfish BPSO [6] was developed to avoid trapping in a local optimum. Discrete PSO is applied to discrete and combinatorial optimization problems [7]. Discrete PSO has a high success rate in solving integer programming problems compared with other methods. It has quick convergence and better performance.

2.1.2. Neighborhood learning PSO

Neighborhood search PSO uses one local and two global neighborhood search strategies. It involves two operations: first, it generates three trial particles for each particle in the swarm using the local and global search strategies; then, the best among the three trial particles and the current particle is selected.

Guaranteed convergence PSO (GCPSO) has an additional particle that searches the region around the current best position [8]. Neighborhood GCPSO models the structure of social networks [9]. To implement neighborhoods in the standard PSO, the velocity update equation of basic PSO is replaced with Equation (6).

$v_{id}^{new} = X (v_{id}^{old} + b_{1} * r_{1} * ({pb}_{id} - x_{id}^{old}) + b_{2} * r_{2} * ({gb}_{d} - x_{id}^{old}))$ (6)

2.1.3. Quantum-Behaved PSO

In QBPSO [10], implicit space decomposition is adopted: the whole swarm is divided into several sub-swarms. The particles in each sub-swarm search different areas of the whole search space, which maintains the diversity of the search. Agarwal & Ranjan [11–13] have also proposed quantum behaved ternary particle swarm optimization on the MapReduce platform. Li et al. [14] have also developed a MapReduce quantum based PSO for complex optimization problems.

2.1.4. Fuzzy logic based PSO

In fuzzy PSO [15], the position of a particle is defined using a matrix of membership values to solve the travelling salesman problem. Fuzzy K-means PSO [16], fuzzy adaptive turbulent PSO [17], fuzzy adaptive catfish PSO [18] and PSO with a fuzzy controller [19] are some of the fuzzy logic and PSO hybrid methods applied in different fields of research. In fuzzy rule binary PSO (FRBPSO), the strength of fuzzy logic is used for decision making in PSO based feature selection [20]. In FRBPSO, the position bit of a particle is updated using a fuzzy inference system.

2.1.5. Multi objective PSO

In multiobjective optimization (MO) problems, multiple objective functions are optimized independently of each other and the best solution for each function is obtained separately, which results in a group of alternative solutions [21]. This group of alternative solutions is called the Pareto optimal solution set. Non-dominated sorting PSO (NSPSO) is used for better multiobjective optimization [22].
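To make the notion of a Pareto optimal set concrete, the sketch below filters a list of objective vectors down to its non-dominated members. It uses the standard dominance definition and assumes that every objective is to be minimized; the function names are illustrative and are not taken from NSPSO.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly
    better in at least one (all objectives are minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

For example, among the points (1, 5), (2, 2), (3, 3) and (5, 1), only (3, 3) is dominated (by (2, 2)), so the other three form the Pareto set.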

2.1.6. Accelerated particle swarm optimization

The basic PSO is based on the local and global best experience. The individual best is used to solve nonlinear problems and to increase the diversity of the search. However, diversity can also be introduced using some randomness. Therefore, accelerated PSO (APSO) [23] uses only the global best.

2.1.7. Some other PSO variants

Some other PSO variants are shown in Table 1.

Table 1
Some other PSO variants

PSO variant	Description
Niche PSO [24]	Solves multimodal problems
Regrouping PSO [25]	Deals with the stagnation problem
Random Drift PSO [26]	Uses the concepts of thermal motion and drift motion of electrons
Nondominated sorting PSO [27]	Used for better multiobjective optimization
Immunity-enhanced PSO [28]	Reported as an efficient method for damage identification
FastPSO [29]	Maintains the classification performance
RapidPSO [29]	Reduces the computational time
Barebones PSO [30]	Good performance on optimization problems
Unified PSO [31]	Combines the local and global versions of PSO
Fully informed PSO [32]	Improves the performance of PSO in multiobjective problems
Comprehensive learning PSO [33]	Each particle is attracted by the previous positions of different particles

Some merits and demerits of different variants of PSO are summarized in Table 2.

Table 2

A brief summary of merits and demerits of basic variants of PSO

PSO variant	Merits	Demerits
BPSO	Binary version of PSO for discrete problems.	Local trapping, stagnation, uncertainty due to many random variables.
Catfish PSO	Useful for discrete and combinatorial optimization problems; attempts to solve the problem of local trapping. It has quick convergence.	A very basic search mechanism is used; many random variables lead to uncertainty in the search.
Neighborhood learning PSO	Additional search methods are used to search around the global best position.	Local trapping could happen.
Quantum Behaved PSO	Different search strategies are used to divide the search space into several small search spaces; many variants have utilized the quantum concept at the time of particle generation.	Uncertainty in the search result due to different parameters.
Fuzzy logic based PSO	The uncertainty problem is handled using fuzzy logic; fuzzy logic is used to optimize PSO parameters.	Local optimum solution; not able to handle multiobjective problems.
Multiobjective PSO	Solves multiobjective problems.	Focus is only on multiobjective optimization.
Accelerated PSO	Global best is used for the search with some random diversity.	Local trapping, stagnation and uncertainty in the results.

Many methods other than PSO are also used effectively for feature selection. Zang et al. [34] have developed the fuzzy rough set-based information entropy for feature selection in a heterogeneous dataset.

Li et al. [35] have proposed a knowledge reduction framework for incomplete decision contexts by constructing a discernibility matrix and its associated Boolean function.

Cai et al. [36] have computed the type-1 and type-2 characteristic matrices for calculating the second and sixth lower and upper approximations in dynamic covering approximation spaces.

3. Proposed method

3.1. Problem description

Let X be a dataset defined as X = {x₁, x₂, . . . , x_N}, where N is the number of samples. The i^th sample of the dataset is denoted as x_i ∈ R^D, where D is the dimensionality of the dataset. The associated class labels of the dataset are y_i ∈ {1, 2, . . . , c}, where c is the number of classes.

The objective of the proposed method is to select the most relevant features with the highest fitness in terms of classification. To achieve this goal, a PSO method based on Newton’s second law, called Newtonian particle swarm optimization (NPSO), is proposed. The aim of the proposed method is to improve the PSO based search for relevant features, which in turn optimizes the k-nearest neighbor (k-NN) based classification model. It is a wrapper feature selection method in which feature selection is done using NPSO and validation is done using a k-NN based classifier.
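The wrapper idea can be illustrated with a small fitness function that scores a binary feature mask by leave-one-out k-NN accuracy. This is a sketch under stated assumptions: the paper does not give the value of k or the distance measure, so k = 1 and squared Euclidean distance are assumed here, and the function name is hypothetical.

```python
def loocv_knn_accuracy(samples, labels, mask):
    """Leave-one-out accuracy (in percent) of a 1-NN classifier that
    only sees the features whose mask bit is 1.

    Assumptions of this sketch: k = 1, squared Euclidean distance,
    and at least two samples.
    """
    selected = [d for d, bit in enumerate(mask) if bit == 1]
    if not selected:
        return 0.0  # an empty feature subset cannot classify anything
    correct = 0
    for i, xi in enumerate(samples):
        best_j, best_dist = None, float("inf")
        for j, xj in enumerate(samples):
            if j == i:
                continue  # leave sample i out of its own training set
            dist = sum((xi[d] - xj[d]) ** 2 for d in selected)
            if dist < best_dist:
                best_j, best_dist = j, dist
        if labels[best_j] == labels[i]:
            correct += 1
    return 100.0 * correct / len(samples)
```

In a wrapper method, each particle position is such a mask, and this accuracy serves as the particle's fitness.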

3.2. Problem in existing variants of PSO

The standard PSO velocity update, Equation (3), has three components. The first is the old velocity; the second and third are learning from the particle itself and from neighboring particles respectively. The second and third components are the main contributors to the change of velocity; therefore, Equation (3) is rewritten as Equation (7). $v_{id} (t + 1) = w * v_{id} (t) + ∆v$ (7) Here Δv is the change in velocity over the period of time. The assumption made by previous variants of PSO is that the change in velocity (Δv = $b_{1} * r_{1} * ({pb}_{id} - x_{id}^{old}) + b_{2} * r_{2} * ({gb}_{d} - x_{id}^{old})$ ) occurs in unit time. Therefore, when the difference in position of the personal best and global best from the current position is divided by the unit time, the result equals the difference of position itself. Hence, the effect of time is invisible in the equation.

On the other hand, according to the law of motion, v = u + at, where v, u and at are the new velocity, the old velocity and the change in velocity due to acceleration. Comparing Equation (3) with this equation shows that Equation (3) does not follow the law of motion, because it does not consider the acceleration of each particle while calculating the new velocity. There must be some acceleration, because the velocity is not constant and changes in every iteration.

Therefore, to incorporate time and acceleration into the PSO equation, NPSO is proposed to enhance the strength of PSO by utilizing Newton’s law of motion.

3.3. Newtonian particle swarm optimization (NPSO)

Newton’s second law states that the acceleration of an object depends on two variables: the net force (F) acting upon the object and the mass (m) of the object. The acceleration depends directly on the net force acting upon the object and inversely on the mass of the object. As the force acting upon an object is increased, the acceleration of the object is also increased, as shown in Equation (8). $F = m * a$ (8) The instantaneous acceleration of any object at time t is given by Equation (9). $a = \lim_{\Delta t \to 0} \frac{\Delta v}{\Delta t} = \frac{dv}{dt}$ (9) Writing a in terms of $\frac{dv}{dt}$ gives Equation (10). $F = m * (\frac{dv}{dt})$ (10) The acceleration is computed as the ratio of the change in the velocity vector to the time elapsed, as in Equation (11). $\frac{dv}{dt} = \frac{∆v}{∆t}$ (11) Substituting $\frac{dv}{dt}$ in Equation (10) gives Equation (12) in terms of the change in velocity and the change in time, which is rearranged as Equation (13). $F = m * (\frac{∆v}{∆t})$ (12) $∆v = \frac{F * ∆t}{m}$ (13)

Substituting the change in velocity from Equation (13) into Equation (7) gives a new velocity update in terms of the force (F), the duration of the period of time (∆t) and the mass (m_i) of the i^th particle in the d^th dimension, Equation (14). $v_{id} (t + 1) = w * v_{id} (t) + \frac{F^{id} * ∆t}{m_{i}}$ (14) $∆v = b_{1} * r_{1} * [{pb}_{id} - x_{id}] + b_{2} * r_{2} * [{gb}_{d} - x_{id}]$ (15) Dividing both sides of Equation (15) by ∆t gives Equation (16), which defines the acceleration in Equation (17). $\frac{∆v}{∆t} = \frac{b_{1} * r_{1} * [{pb}_{id} - x_{id}] + b_{2} * r_{2} * [{gb}_{d} - x_{id}]}{∆t}$ (16) $a^{id} = \frac{b_{1} * r_{1} * [{pb}_{id} - x_{id}] + b_{2} * r_{2} * [{gb}_{d} - x_{id}]}{∆t}$ (17) To define ∆t more specifically, two terms, p_timestep and g_timestep, are introduced; they are the time steps for the cognitive and social learning components respectively. p_timestep is defined as the time difference between the current iteration and the iteration at which the previous personal best position was obtained: p_timestep = (current_iteration_number – previous_personal_best_iteration_number).

g_timestep is defined as the time difference between the current iteration number and the iteration number at which the previous global best was obtained: g_timestep = (current_iteration – iteration_of_previous_gb). $a^{id} = \frac{b_{1} * r_{1} * [{pb}_{id} - x_{id}]}{p_{timestep}} + \frac{b_{2} * r_{2} * [{gb}_{d} - x_{id}]}{g_{timestep}}$ (18) The first and second terms in Equation (18) are the accelerations due to cognitive (personal experience) and social (neighbor’s experience) learning respectively, which can be written as Equations (19) and (20). $a_{pb}^{id} = \frac{b_{1} * r_{1} * [{pb}_{id} - x_{id}]}{p_{timestep}}$ (19) $a_{gb}^{id} = \frac{b_{2} * r_{2} * [{gb}_{d} - x_{id}]}{g_{timestep}}$ (20) The mass of the cognitive component is denoted as pb_mass and the mass of the social component is denoted as gb_mass. Hence, the force applied on a particle is given by Equation (21). $F^{id} = \frac{{pb}_{mass} * b_{1} * r_{1} * [{pb}_{id} - x_{id}]}{p_{timestep}} + \frac{{gb}_{mass} * b_{2} * r_{2} * [{gb}_{d} - x_{id}]}{g_{timestep}}$ (21) Substituting Equation (21) into Equation (14) gives the final velocity update of NPSO, Equation (22). Since the velocity update is calculated over one unit iteration, from t to (t + 1), ∆t is taken as unit time in Equation (14).

$v_{id} (t + 1) = w * v_{id} (t) + \frac{1}{m_{i}} (\frac{{pb}_{mass} * b_{1} * r_{1}}{p_{timestep}} * [{pb}_{id} - x_{id}] + \frac{{gb}_{mass} * b_{2} * r_{2}}{g_{timestep}} * [{gb}_{d} - x_{id}])$ (22)
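Equation (22) can be sketched in Python as follows. This is an illustrative reading of the equation rather than the authors' implementation: the function name, the `rng` argument and the lower bound of 1 on both time steps (added to avoid division by zero) are assumptions of the sketch.

```python
import random


def npso_velocity(v, x, pb, gb, m_i, pb_mass, gb_mass,
                  p_timestep, g_timestep,
                  w=0.5, b1=2.0, b2=2.0, rng=random):
    """Velocity update of Equation (22) for one particle.

    v, x, pb and gb are equal-length lists. Time steps are clamped to
    at least 1, which is an assumption of this sketch.
    """
    p_dt = max(1, p_timestep)
    g_dt = max(1, g_timestep)
    new_v = []
    for d in range(len(x)):
        r1, r2 = rng.random(), rng.random()
        # Equation (21): force from the cognitive and social components.
        force = (pb_mass * b1 * r1 * (pb[d] - x[d]) / p_dt
                 + gb_mass * b2 * r2 * (gb[d] - x[d]) / g_dt)
        # Equation (14) with unit time: inertia plus force over mass.
        new_v.append(w * v[d] + force / m_i)
    return new_v
```

With all masses and time steps equal to 1, the update reduces to the standard inertia-weight velocity update of Equation (3); a larger time step since the last improvement weakens the pull towards the corresponding best position.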

4. Proposed algorithm

The complete algorithm of the proposed NPSO is shown in Algorithm 1, which gives the step by step process of NPSO. The velocity equation of basic PSO depends on the personal best and global best of the particles. In NPSO, the velocity update equation depends on the personal best position and mass, p_timestep, the global best position, the global best mass and g_timestep. Algorithm 1 starts with the initialization of a population of agents (particles), which are uniformly distributed; it then evaluates each particle’s position using LOOCV-kNN (leave-one-out cross validation using a k-NN classifier). If a particle’s current position is better than its previous best position, the personal best is updated. The global best particle is then determined, the particle velocities are updated and the particles are moved to their new positions.
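The bookkeeping of best positions, time steps and masses described above can be sketched with one helper that serves both the personal and the global best. The record layout (a dictionary with the keys shown) and the function name are hypothetical, and how a particle's mass is chosen is not specified here, so it is simply passed in.

```python
def update_best(record, position, fitness, iteration, mass):
    """Update a best-so-far record in place when `fitness` is at least
    as good as the stored one; return True if an update happened.

    `record` is a dict with the hypothetical keys "position",
    "fitness", "iteration", "timestep" and "mass".
    """
    if fitness >= record["fitness"]:
        # Time step: current iteration minus iteration of the previous best.
        record["timestep"] = max(1, iteration - record["iteration"])
        record["position"] = list(position)
        record["fitness"] = fitness
        record["iteration"] = iteration
        record["mass"] = mass
        return True
    # Otherwise the best position, time step and mass stay unchanged.
    return False
```

Calling the helper once with a particle's personal record and once with the swarm's global record mirrors the two `if` blocks of Algorithm 1.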

Table 3
Summary of experimental gene expression datasets

Datasets	Samples	Features	Classes
SRBCT	83	2308	4
DLBCL	77	5469	2
Brain1-Tumor	90	5920	5
Brain2-Tumor	50	10367	4
Leukemia1	72	5327	3
Leukemia2	72	11225	3
Prostate Tumor	102	10509	2

5. Analysis of algorithm

This section has two main objectives. The first is to find the convergence point of NPSO through equilibrium and convergence analysis, and the second is to find the time complexity of the algorithm. The mathematical approach of Trelea [37, 38] is used for the analysis.

5.1. Equilibrium and convergence analysis

Consider the NPSO velocity update equation in one dimension, since each dimension is updated independently of the others in NPSO. The velocity update Equation (22) of NPSO is rewritten in the generalized one dimensional form of Equation (23). For deterministic analysis, the parameters are set to b1 = b2 = b, r1 = r2 = r, pb = gb = p, pb_mass = gb_mass = m_i = m and p_timestep = g_timestep = 1. $v (t + 1) = ω * v (t) + b * r (p - x)$ (23) In any binary variant of PSO, including NPSO, only two values of position are possible in a particular dimension, i.e. 1 or 0. Therefore, x (t) is taken from the set {0, 1}. $x (t) = c$ (24) where c ∈ {0, 1}. The matrix form of Equations (23) and (24) is as follows. $Y (t + 1) = AY (t) + Bp$ (25) $Y (t) = [\begin{matrix} v (t) \\ x (t) \end{matrix}], A = [\begin{matrix} ω & - br \\ 0 & 0 \end{matrix}], B = [\begin{matrix} br \\ c \end{matrix}]$ (26)

Algorithm 1 Newtonian PSO for feature selection

Input: High dimensional data S = (S₁, S₂, …, S_N), S_i ∈ R^d, with associated classes C = (C₁, C₂, …, C_N)

Output: Fitness of gbest (fit_gb), position of gbest (gb)

Initialize the position of particles (x), velocity (v), pbest (pb), gbest (gb), fitness of pb (fit_pb) and fitness of gb (fit_gb)

Initialize the mass (m), p_timestep, g_timestep, inertia weight (w) and number of particles (p)

while (Iter <= MaxIter) do

for (j = 1 to p) do

Calculate fitness of particle j (fit_j) using LOOCV-kNN

end for

for (i = 1 to p) do

if (fit_i >= fit_pb_i) then

pb_i = x_i

Calculate p_timestep

pb_mass = m_i

else

No change in pb, p_timestep and m_i

end if

if (fit_i >= fit_gb) then

gb = pb_i

Calculate g_timestep

gb_mass = m_i

else

No change in gb, g_timestep and m_i

end if

for (d = 1 to number of features) do

/* Update the particle */

$\begin{array}{l} x (t + 1) = x (t) + v (t + 1) \\ v_{id} (t + 1) = w * v_{id} (t) + \frac{1}{m_{i}} (\frac{{pb}_{mass} * b_{1} * r_{1} * [{pb}_{id} - x_{id}]}{p_{timestep}} + \frac{{gb}_{mass} * b_{2} * r_{2} * [{gb}_{d} - x_{id}]}{g_{timestep}}) \\ x_{id} (t + 1) = \begin{cases} 1, & \text{if } v_{id} (t + 1) \geq 0.5 \\ 0, & \text{if } v_{id} (t + 1) < 0.5 \end{cases} \end{array}$

end for

end for

Iter = Iter + 1

end while

Now, calculate the eigenvalues of the system from the characteristic matrix A − λI. $[\begin{matrix} ω - λ & - br \\ 0 & - λ \end{matrix}]$ (27) Setting its determinant to zero gives $(ω - λ) (- λ) = 0$ (28) The eigenvalues of this system are λ = ω and λ = 0. An equilibrium is a point where no change in a dynamic system is observed in the absence of any external force (p = constant). Hence, Y (t + 1)_eq = Y (t)_eq, and Equation (25) can be written as follows. $Y (t)_{eq} = AY (t)_{eq} + Bp$ (29) $IY (t)_{eq} - AY (t)_{eq} = Bp$ (30) $[\begin{matrix} (1 - ω) & br \\ 0 & 1 \end{matrix}] [\begin{matrix} v (t) \\ x (t) \end{matrix}] = [\begin{matrix} brp \\ c \end{matrix}]$ (31) $(1 - ω) v (t) + brx (t) = brp$ (32) $v (t) = \frac{br (p - c)}{(1 - ω)}$ (33) $x (t) = c$ (34) From Equations (33) and (34), for fixed b, r and ω the velocity converges to br (p − c)/(1 − ω), which is proportional to (p − c), where c ∈ {0, 1}.

Dynamic systems theory says that the equilibrium of a system with respect to time depends on the eigenvalues of the dynamic system. Both eigenvalues, {0, ω} with ω = 0.5, are less than one in magnitude. Therefore, the system moves to a stable equilibrium point.
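The equilibrium of Equation (33) can be checked numerically by iterating the one dimensional update of Equation (23) with the position held at x = c. The sketch below is illustrative only; the function names and the sample values r = 0.5 and v(0) = 10 are assumptions, while b = 2 and ω = 0.5 are the settings used in the experiments.

```python
def equilibrium_velocity(b, r, p, c, omega):
    """Equation (33): fixed point of v(t+1) = omega*v(t) + b*r*(p - c)."""
    return b * r * (p - c) / (1.0 - omega)


def iterate_velocity(v0, b, r, p, c, omega, steps):
    """Apply Equation (23) repeatedly with the position fixed at x = c."""
    v = v0
    for _ in range(steps):
        v = omega * v + b * r * (p - c)
    return v


# A is upper triangular, so its eigenvalues are its diagonal entries
# omega and 0. With |omega| < 1 the distance to the fixed point shrinks
# by a factor omega at every step.
v_eq = equilibrium_velocity(b=2.0, r=0.5, p=1, c=0, omega=0.5)
v_60 = iterate_velocity(10.0, b=2.0, r=0.5, p=1, c=0, omega=0.5, steps=60)
```

Here `v_eq` evaluates to 2.0 and `v_60` agrees with it to within floating point precision, in line with the stability argument above.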

5.2. Time complexity

The time complexity of the proposed method depends mainly on the time taken to compute the new velocity and position of each dimension of each particle. Let I be the total number of iterations, D the initial dimension and P the total number of particles. Then the time complexity of the proposed method is O(IDP). In Algorithm 1, the while loop terminates when the maximum number of iterations (I) is reached, the outer for loop runs over the total number of particles (P) and the inner for loop runs over the dimensions (D). Therefore, the total complexity of the proposed method is the product of I, P and D.

5.3. Space complexity

The space complexity of the proposed method is O(DP), where D is the initial dimension and P is the total number of particles, since the algorithm only stores a matrix holding the entire particle population, which is called the cost matrix.

6. Experiment and results

6.1. Datasets

The performance of the proposed NPSO is tested on different microarray gene expression datasets. Table 3 summarizes the experimental datasets. SRBCT is characterized by a large amount of feature information. It consists of small round cells that appear blue on staining. The frequency of these tumors is higher in children than in adults. It has 83 samples and 2308 features. The SRBCT dataset has four subcategories of small round blue cell tumors: Ewing’s sarcoma (EWS: 29 samples), Burkitt’s lymphoma (BL: 11 samples), neuroblastoma (NB: 18 samples) and rhabdomyosarcoma (RMS: 25 samples). DLBCL has 77 samples and 5469 features. It has two classes: diffuse large B-cell lymphoma (DLBCL: 58 samples) and follicular lymphoma (FL: 19 samples). Brain1-Tumor has 90 samples and 5920 features. Brain2-Tumor has 50 samples and 10367 features.

Leukemia1 has 72 samples and 5327 features. Leukemia2 has 72 samples and 11225 features. Prostate Tumor has 102 samples and 10509 features. It has two classes: normal tissue (normal: 50 samples) and prostate tumor (tumor: 52 samples). All the datasets have been downloaded from http://www.gems-system.org/.

6.2. Experimental setup

6.2.1. Experiment 1

This experiment examines whether NPSO selects highly informative features from all datasets. To this end, NPSO is applied to the SRBCT, DLBCL, Brain1-Tumor, Brain2-Tumor, Leukemia1, Leukemia2 and Prostate Tumor datasets. Because NPSO is a heuristic search method, it is applied five times on each dataset.

6.2.2. Experiment 2

NPSO is also compared with other state-of-the-art methods. The classification accuracy obtained with all the features (kNN without feature subset selection) is also reported to demonstrate the benefit of dimensionality reduction. The proposed method is compared with kNN [43], IBPSO [44], correlation coefficient [45], evolutionary computing [46], genetic algorithm [47], ReliefF [48], information gain [49], MIBPSO, FRBPSO and EVPSO.

Correlation coefficient, ReliefF and information gain are filter feature selection methods. Filter methods only rank the features according to their scores. Therefore, the top ranked features are selected and given to kNN to find the classification accuracy. The number of features selected from the ranking is kept equal to the number of features selected by NPSO-kNN. This configuration enables a simple comparison and makes it possible to investigate the discriminative power of the subset selected by each individual filter method. The evolutionary computing and genetic algorithm methods are run using the Weka tool. Weka provides a (μ, λ) evolutionary algorithm with random initialization, a binary tournament selection operator, a single point crossover operator, bit flip mutation and generational replacement.

In this experiment, IBPSO, MIBPSO, FRBPSO, EVPSO and NPSO are wrapper feature selection methods in which kNN is used as the classifier to calculate the classification accuracy. The features selected by NPSO are also given to an SVM classifier to check their strength. All the simulations have been conducted on Matlab 7.11.1.866 R2010b, Licence No. 691568. For the filter feature selection methods (information gain, correlation coefficient and ReliefF), Weka 3.7.13 has been used. The number of features kept by the filter methods equals the number selected by NPSO, which keeps the comparison simple and shows the informative strength of a given number of features. The maximum number of iterations used in NPSO is 150 and the number of particles in the swarm is set to 40. The cognitive and social parameters are both set to 2. The inertia weight is taken as ω = 0.5. This parameter setting has already been optimized in the literature by Shi and Eberhart [50] and was also used by Chung et al. [44] for PSO based feature selection. In all the wrapper feature selection methods (IBPSO, MIBPSO, FRBPSO and EVPSO), the same parameter set is used along with leave-one-out cross validation.
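The filter protocol (rank the features, then keep the top k, with k set to the number of features chosen by NPSO) can be sketched as below. The ranking score used here is the absolute Pearson correlation between a feature and numeric class labels; this is a simple stand-in chosen for illustration and is not the Weka implementation used in the experiments. Both function names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either
    sequence is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)


def top_k_features(samples, labels, k):
    """Indices of the k features with the highest absolute correlation
    to the labels, best first."""
    n_features = len(samples[0])
    scores = [abs(pearson([s[d] for s in samples], labels))
              for d in range(n_features)]
    ranked = sorted(range(n_features), key=lambda d: scores[d], reverse=True)
    return ranked[:k]
```

The selected indices would then be passed to the classifier in the same way as a feature subset chosen by a wrapper method.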

6.3. Simulation results

6.3.1. Experiment 1

NPSO is a heuristic nature inspired search method. Therefore, it has been applied five times on all datasets. The classification accuracy of the selected features with leave-one-out cross validation (LOOCV) using kNN is reported in Table 4. Table 5 shows the classification accuracy obtained using SVM with the features selected by NPSO.

NPSO-kNN is the proposed NPSO with leave-one-out cross validation (LOOCV) using kNN, and NPSO-SVM means that the features selected by NPSO are given to the SVM. Therefore, the number of selected features is the same in both tables.

Table 4
Classification accuracy obtained after feature selection from microarray datasets using NPSO-kNN

Dataset		Run1	Run2	Run3	Run4	Run5	Avg
SRBCT	Classification Accuracy	97.59	98.90	97.59	97.59	98.80	98.09
	Selected Features	58	109	58	58	109	82
DLBCL	Classification Accuracy	97.40	98.70	97.40	98.70	97.40	97.92
	Selected Features	72	82	76	76	72	75
Brain1-Tumor	Classification Accuracy	95.56	96.67	95.56	96.67	95.56	96.00
	Selected Features	69	64	59	64	69	65
Brain2-Tumor	Classification Accuracy	90	92	90	92	90	90.8
	Selected Features	61	52	54	52	61	56
Leukemia1	Classification Accuracy	98.61	97.22	97.22	98.61	97.22	97.78
	Selected Features	97	95	82	81	82	87
Leukemia2	Classification Accuracy	95.83	97.22	95.83	97.22	97.22	96.66
	Selected Features	125	130	120	134	130	128
Prostate Tumor	Classification Accuracy	98.04	97.05	98.04	98.04	97.05	97.64
	Selected Features	140	125	92	110	125	118

Table 5

Classification accuracy obtained after feature selection from microarray datasets using NPSO-SVM

Dataset		Run1	Run2	Run3	Run4	Run5	Avg
SRBCT	Classification Accuracy	100	100	100	100	100	100
	Selected Features	58	109	58	58	109	82
DLBCL	Classification Accuracy	100	100	100	100	100	100
	Selected Features	72	82	76	76	72	75
Brain1-Tumor	Classification Accuracy	97.78	97.78	96.67	97.78	96.67	97.34
	Selected Features	69	64	59	64	69	65
Brain2-Tumor	Classification Accuracy	92	94	92	94	92	92.8
	Selected Features	61	52	54	52	61	56
Leukemia1	Classification Accuracy	100	100	100	100	100	100
	Selected Features	97	95	82	81	82	87
Leukemia2	Classification Accuracy	98.61	100	98.61	100	98.61	99.17
	Selected Features	125	130	120	134	130	128
Prostate Tumor	Classification Accuracy	100	100	100	100	100	100
	Selected Features	140	125	92	110	125	118

6.3.2. Experiment 2

Table 6 shows the results of the comparative study, which reveal the performance of NPSO.

The reported results reveal that NPSO, in combination with kNN and SVM respectively, performs as well as or better than previously available feature selection methods on the same datasets. Table 6 shows the benefit of feature selection: classification accuracy increases compared with classification using all features (for example, 91.57 with kNN on SRBCT). The highest classification accuracy with the minimum number of features is shown in bold in Table 6. Since correlation coefficient, information gain and ReliefF are filter feature selection methods, the number of selected features for these methods is kept the same as the number of features obtained using NPSO. This makes it possible to investigate the classification benefit given by a limited number of features. NPSO shows the highest reduction in dimension (average dimension reduction 98.59%) and the highest information (average classification accuracy 98.56%) conveyed by the selected features.
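The dimension reduction percentages can be reproduced from the feature counts. The formula below is the usual one and is assumed here, since it is not stated explicitly; the function name is illustrative.

```python
def dimension_reduction(total_features, selected_features):
    """Percentage of the original features that were discarded."""
    return 100.0 * (total_features - selected_features) / total_features


# SRBCT: 2308 features in total, 82 selected on average by NPSO.
srbct_reduction = dimension_reduction(2308, 82)
```

For SRBCT this gives about 96.45, which agrees with the tabulated 96.44 up to rounding.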

Table 6
Average classification accuracy, number of selected features, reduction in dimension using different feature selection methods and NPSO

Dataset		kNN	IBPSO	MIBPSO3	EVPSO	FRBPSO	Evolutionary Computing	Genetic Algorithm	Information Gain	Correlation Coefficient	ReliefF	NPSO kNN	NPSO SVM
SRBCT	Classification Accuracy	91.57	97.59	98.05	95.66	98.19	78.313	81.9277	98.79	92.77	90.36	98.09	100
	Selected Features	all	1124	275	1050	213	173	96	82	82	82	82	82
	Dimension Reduction	0	51.29	88.08	54.50	90.77	92.50	95.84	96.44	96.44	96.44	96.44	96.44
DLBCL	Classification Accuracy	87	94.81	96.16	78.86	96.49	77.9221	79.2208	94.8052	96.1039	97.4026	97.92	100
	Selected Features	all	2697	282	1445	105	89	81	75	75	75	75	75
	Dimension Reduction	0	50.68	94.82	73.57	98.08	98.37	98.51	98.51	98.50	98.62	98.62	98.62
Brain1-Tumor	Classification Accuracy	86.67	89.99	89.88	89.77	90.77	83.333	81.111	88.888	85.55	86.667	96.00	97.34
	Selected Features	all	2924	1160	3354	803	123	267	65	65	65	65	61
	Dimension Reduction	0	50.60	80.40	78.93	86.43	97.92	98.90	98.90	98.90	98.90	98.90	98.96
Brain2-Tumor	Classification Accuracy	70	81.2	85.4	78.7	87.6	62	60	88	62	74	90.8	92.8
	Selected Features	all	4983	1207	3547	662	770	1484	60	60	60	56	56
	Dimension Reduction	0	51.93	88.35	65.78	93.61	92.57	85.68	99.42	99.42	99.42	99.43	99.43
Leukemia1	Classification Accuracy	87.50	97.59	98.05	94.99	98.89	51.3889	88.33	94.44	95.83	94.44	97.78	100
	Selected Features	all	2643	987	2233	825	12	454	96	96	96	87	87
	Dimension Reduction	0	50.38	81.47	58.08	85.51	99.77	91.47	97.12	97.12	97.12	98.21	98.21
Leukemia2	Classification Accuracy	70	97.22	87.22	89.44	97.50	86.11	87.5	97.22	94.44	97.22	96.66	99.17
	Selected Features	all	4958	2257	3023	1028	427	56	229	229	229	128	114
	Dimension Reduction	0	55.38	79.89	73.07	90.84	96.19	99.50	96	96	96	98.41	98.62
Prostate Tumor	Classification Accuracy	76.47	91.14	90.39	85.66	92.43	92.43	76.47	60.78	93.137	93.1373	97.64	100
	Selected Features	all	1029	426	5321	418	274	36	214	214	214	118	118
	Dimension Reduction	0	57.27	86.98	64.75	90.60	96.47	95.52	98.88	98.60	98.60	99.42	99.42
	Average Classification Accuracy	81.32	92.79	92.16	87.58	94.53	73.65	76.80	85.96	88.68	91.49	96.42	98.47
	Average Dimension Reduction	0	56.13	84.89	70.36	91.64	96.49	95.07	96.50	96.92	97.40	98.54	98.59

7. Classwise analysis of datasets

This section presents a classwise analysis of the datasets using the proposed method. The classwise analysis has been performed on the SRBCT, Brain1-Tumor, 11-Tumor and 14-Tumor datasets. Information on the SRBCT and Brain1-Tumor datasets is available in Table 3. The number of classes, features and samples of the 11-Tumor and 14-Tumor datasets is shown in Table 7. The main objective of the classwise analysis is to check the performance of NPSO as the number of classes increases. To perform this analysis, several datasets with different numbers of classes are generated from each dataset. For example, SRBCT has four classes; hence, three new datasets are generated from the SRBCT dataset. In the two-class SRBCT dataset only two classes are retained, resulting in 54 of the 83 samples; similarly, in the three-class SRBCT dataset only three classes are retained, resulting in 65 of the 83 samples. Finally, all 83 samples are used for the four-class analysis. All the other datasets are split in the same way to generate sets of multiclass datasets.
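The construction of the classwise datasets can be sketched as below. This is a minimal sketch under stated assumptions: the helper `classwise_subsets`, the class names A to D, the order in which classes are retained, and the per-class sizes are hypothetical, chosen only so that the cumulative sample counts match the 54, 65 and 83 samples quoted above for SRBCT.

```python
def classwise_subsets(samples, labels, class_order):
    """Yield (k, kept_samples, kept_labels) for k = 2 .. number of classes.

    For each k, only the samples belonging to the first k classes in
    class_order are retained.
    """
    for k in range(2, len(class_order) + 1):
        kept = set(class_order[:k])
        idx = [i for i, label in enumerate(labels) if label in kept]
        yield k, [samples[i] for i in idx], [labels[i] for i in idx]

# Hypothetical class sizes (assumption) that add up to the counts in the text.
class_sizes = {"A": 29, "B": 25, "C": 11, "D": 18}
labels = [name for name, size in class_sizes.items() for _ in range(size)]
samples = list(range(len(labels)))  # stand-ins for expression profiles

subset_sizes = {
    k: len(kept_samples)
    for k, kept_samples, _ in classwise_subsets(samples, labels, ["A", "B", "C", "D"])
}
print(subset_sizes)  # {2: 54, 3: 65, 4: 83}
```

A dataset with c classes yields c - 1 subsets in this scheme, which agrees with the three SRBCT datasets, ten 11-Tumor datasets and twenty-five 14-Tumor datasets reported in this section.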

Table 7
Summary of classwise analysis datasets

Datasets	Samples	Features	Classes
11-Tumor	174	12534	11
14-Tumor	308	15010	26

Table 8 shows the classwise average classification accuracy of NPSO on SRBCT. The average classification accuracy of NPSO on the two-class dataset is 100% and the average dimension reduction is 89.60%. Dimension reduction is defined as (total number of features - selected features) / (total number of features), expressed as a percentage. The average classification accuracy on the three-class SRBCT data is 100% and the dimension reduction is 94.34%. On the four-class dataset (the complete SRBCT dataset), the average classification accuracy is 98.09% and the dimension reduction is 96.44%.
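The dimension reduction measure defined above can be computed with a few lines of code. The function name `dimension_reduction` and the input checks are illustrative assumptions; the example values are the two-class SRBCT figures from Table 8.

```python
def dimension_reduction(total_features, selected_features):
    """Percentage of features removed: 100 * (total - selected) / total."""
    if total_features <= 0:
        raise ValueError("total_features must be positive")
    if not 0 <= selected_features <= total_features:
        raise ValueError("selected_features must lie between 0 and total_features")
    return 100.0 * (total_features - selected_features) / total_features

# Two-class SRBCT: 240 features selected out of 2308.
reduction = dimension_reduction(2308, 240)
print(f"{reduction:.2f}")  # 89.60
```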

Table 8

Average classwise classification accuracy obtained using NPSO on SRBCT dataset

Class	Samples	Features	Average classification accuracy (%)	Selected features	Dimension reduction (%)
Two Class	54	2308	100	240	89.60
Three Class	65	2308	100	130	94.34
Four Class	83	2308	98.09	82	96.44

Table 9 shows the average classwise classification accuracy on the Brain1-Tumor dataset. Brain1-Tumor has 70 samples in the two-class dataset, 80 samples in the three-class dataset, 84 samples in the four-class dataset and 90 samples in the five-class dataset. The total number of features in every dataset is 5921. Table 9 shows that as the number of classes increases, classification accuracy decreases but dimension reduction increases.

Table 9

Average classwise classification accuracy obtained using NPSO on Brain1-Tumor dataset

Class	Samples	Features	Average classification accuracy (%)	Selected features	Dimension reduction (%)
Two Class	70	5921	99.36	620	89.23
Three Class	80	5921	99.29	465	92.45
Four Class	84	5921	98.81	284	95.09
Five Class	90	5921	96	65	98.90

Table 10 shows the classwise analysis of the 11-Tumor dataset. It contains ten datasets generated from the 11-Tumor dataset. The two-class and three-class datasets achieve 100% classification accuracy with considerable dimension reduction. As the number of classes in the 11-Tumor dataset increases, the performance of NPSO in terms of classification accuracy decreases, but considerable dimension reduction is achieved.

Table 10

Average classwise classification accuracy obtained using NPSO on 11-Tumor dataset

Class	Samples	Features	Average classification accuracy (%)	Selected features	Dimension reduction (%)
Two Class	35	12534	100	2304	81.67
Three Class	61	12534	100	2276	81.94
Four Class	84	12534	99.48	1988	84.13
Five Class	96	12534	99.07	1091	91.29
Six Class	107	12534	98.22	890	92.89
Seven Class	114	12534	97.80	786	93.72
Eight Class	140	12534	97.49	423	96.67
Nine Class	146	12534	97.01	245	98.04
Ten Class	160	12534	96.56	221	98.32
Eleven Class	174	12534	93.96	157	98.74
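The trend described for the 11-Tumor dataset can be checked directly against the values in Table 10. The sketch below copies the two columns from the table; the helper names `is_non_increasing` and `is_non_decreasing` are illustrative assumptions.

```python
# Columns of Table 10, ordered from two classes to eleven classes.
accuracy = [100, 100, 99.48, 99.07, 98.22, 97.80, 97.49, 97.01, 96.56, 93.96]
reduction = [81.67, 81.94, 84.13, 91.29, 92.89, 93.72, 96.67, 98.04, 98.32, 98.74]

def is_non_increasing(values):
    """True if no value is larger than the one before it."""
    return all(a >= b for a, b in zip(values, values[1:]))

def is_non_decreasing(values):
    """True if no value is smaller than the one before it."""
    return all(a <= b for a, b in zip(values, values[1:]))

accuracy_falls = is_non_increasing(accuracy)
reduction_rises = is_non_decreasing(reduction)
print(accuracy_falls, reduction_rises)  # True True
```

Both checks hold for every consecutive pair of rows, so the accuracy never rises and the dimension reduction never falls as classes are added.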

Table 11 shows the classwise analysis of the 14-Tumor dataset. The 14-Tumor dataset has twenty-six different tumor classes; hence, twenty-five different datasets are generated from it for the classwise analysis. Up to three classes, NPSO achieves 100% classification accuracy. As the number of classes increases, dimension reduction increases.

NPSO is proposed with the aim of dimension reduction. The classwise analysis shows that NPSO is able to reduce the dimension of very complex datasets such as the 14-Tumor dataset, which has 26 classes and 15010 features. This analysis demonstrates the merits of NPSO for dimension reduction of high-dimensional datasets.

Table 11

Average classwise classification accuracy obtained using NPSO on 14-Tumor dataset

Class	Samples	Features	Average classification accuracy (%)	Selected features	Dimension reduction (%)
Two Class	44	15010	100	11235	25.14
Three Class	64	15010	100	8956	40.36
Four Class	79	15010	98.1	7623	49.12
Five Class	101	15010	97.03	5678	62.21
Six Class	112	15010	96.88	4321	71.26
Seven Class	122	15010	96.31	2134	85.84
Eight Class	133	15010	95.86	2012	86.21
Nine Class	150	15010	95.67	1721	88.59
Ten Class	161	15010	95.65	1572	89.56
Eleven Class	172	15010	95.06	1245	91.70
Twelve Class	187	15010	94.92	989	93.41
Thirteen Class	198	15010	94.69	782	94.79
Fourteen Class	218	15010	94.23	520	96.60
Fifteen Class	223	15010	93.95	467	97.45
Sixteen Class	232	15010	93.32	392	97.56
Seventeen Class	239	15010	93.12	319	97.94
Eighteen Class	250	15010	93.03	311	97.95
Nineteen Class	256	15010	93.01	305	97.98
Twenty Class	263	15010	92.91	267	98.28
Twenty one Class	269	15010	92.67	211	98.66
Twenty Two Class	274	15010	92.44	208	98.71
Twenty Three Class	287	15010	92.37	192	98.78
Twenty Four Class	297	15010	91.14	189	98.79
Twenty Five Class	300	15010	91.01	167	98.88
Twenty Six Class	308	15010	90.56	156	98.96

8. Conclusion

The proposed novel variant of PSO for feature selection (Newtonian PSO) incorporates Newton’s second law of motion into PSO. The velocity equation is therefore rewritten to take into account the mass and acceleration of the particle itself and of its neighboring particles. After incorporating these factors, the PSO velocity update equation satisfies all aspects of Newton’s law of motion. The proposed method is mathematically validated and also applied to gene expression microarray datasets for gene selection. The mathematical validation shows that the proposed method converges to the equilibrium point. Experimental results show that NPSO achieves both accurate classification and sufficient dimensionality reduction. This indicates that NPSO preferentially selects the most informative features, which in turn boosts the performance of the classifier. From the classwise analysis it is clear that NPSO is able to achieve very good dimension reduction with considerable classification accuracy on high-dimensional complex datasets.

