Abstract
Wireless Sensor Networks is a complex network with millions of small-scale sensor nodes, working together to detect certain physical phenomena. Sensor nodes are operated by battery therefore the major concern is energy efficiency. Clustering is an effective technique to decrease the energy depletion in the network. However, choosing the optimum Cluster Heads is an NP-Hard problem. This paper proposes an unequal clustering technique that selects probationary Cluster Heads through fuzzy logic and the optimization of this probationary Cluster Heads is done through Harmony Search Algorithm (HSA). The proposed algorithm exhibits the dynamic capability of fuzzy logic and high search efficiency of HSA that extends the network lifespan. The findings are simulated against traditional clustering protocols and compared. The findings obtained show that the protocol proposed is performing superior in terms of network lifespan prolongation and other metrics.
Keywords
Introduction
Background
Wireless Sensor Networks (WSN) is composed of low-cost, very compact sensor nodes for detecting and sensing any event. They have the data collection and forwarding capability. In the sensing area, millions of nodes are deployed at random. WSN plays a critical role in several applications such as weather forecasting, environmental monitoring, disaster management, prevention of natural disaster, border protection, smart cities, and so on [1].
WSN consists of several nodes with one or more than one base station (BS). The sensor node consists of the sensing, communication, processing unit, and power unit. The sensing unit tracks the physical phenomenon, processes the sensed data through processing unit and transmits it to BS via the communication unit. Sensor nodes are inadequate in resources like memory, bandwidth and the most important is battery. Once deployed all the nodes in the network the battery can hardly be replaced or recharged [2]. Despite this, there are several other issues like coverage, synchronization, connectivity, fault tolerance, security, and localization problems.
Energy consumption is highly required in an effective manner to boost overall network efficiency. Routing protocol design should be energy intensive, simple and scalable to different environmental conditions.
Different routing and clustering protocols are developed in the literature that is an energy-efficient algorithm. Clustering is a technique that decreases the number of transmissions substantially by aggregating the data from the cluster members. The communication cost is proved higher than the sensing as well as the processing cost [3]. Clustering splits-up the entire network into small different groups. A chief is selected among the group that is known as Cluster Head (CH) [4]. It has three main responsibilities: one is to gather the data from the group members, second is to aggregate the collected data and eliminate redundancy and thirdly, the information collected is sent to the BS. When the clusters have an equivalent number of members then it is known as equal clustering while when the clusters have different members then it is known as unequal clustering [5].
When the nodes are distributed uniformly, equal clustering had a better result than unequal clustering. CHs near to the sink dies early as against the CHs far from the sink. This is known as the hotspot problem. This problem is resolved by unequal clustering. Clusters near the sink are small in this solution as compared to the clusters distant from the sink. Small clusters mean cluster having fewer members, and thus less traffic within the cluster.
Computational intelligence techniques such as Particle Swarm Optimization (PSO) [6], fuzzy logic [8], Harmony search algorithm [9] and Ant Colony Optimization (ACO) [7] used for the selection of an optimal number of cluster heads. These techniques proved to be energy efficient and help in prolongation of network lifetime. Hence, because WSN has a strong constraint, designing the clustering algorithm using computational intelligence techniques is important.
Some of the similar clustering algorithms are been proposed in [29, 30].
This paper suggests an unequal clustering algorithm with harmony search and fuzzy logic algorithm. Fuzzy logic is used to pick probationary CHs. Unequal probationary clusters are constructed and harmony search is used to final optimal cluster heads.
Authors contribution
The main contributions in this paper are: Probationary unequal size clusters are generated using fuzzy logic. The fuzzy logic is given three inputs. Unequal clusters are formed by considering the same three parameters. Optimal cluster heads are elected by the harmony search algorithm. Energy-efficient and load-balanced clusters are formed. Comparative study with existing approaches.
Structure of the paper
Section 2 discusses the related work done in similar direction. Section 3 illustrates existing network aware and the energy models. The detailed proposed model is explained in section 4. Section 5 illustrates the results of simulation for the proposed protocol. The article is concluded with section 6 followed by future works.
Related works
One of the most commonly used protocol is Low Energy Adaptive Clustering Hierarchy (LEACH) [4].
Other protocols are also developed to overcome existing protocol and improve the performance of LEACH protocol. LEACH-Centralized (LEACH-C) [10] is a protocol that considers node’s residual energy while CH selection and is capable of self-organizing and rotating CHs to balance the load among nodes. Energy-Efficient Heterogeneous Clustering (EEHC) [11] protocol is based on heterogeneous networks. G. Smaragdakis et al. [12] has proposed a protocol using heterogeneity of networks.
These days, computational intelligence or evolutionary algorithms are used to remove the difficulties in clustering [13]. Above discoursed algorithms tries to find an optimal solution for the problem. To achieve an optimal solution, these algorithms keep on iterating until an optimal solution is found. Selection of the optimal number of cluster heads is an NP-hard problem [14].
A genetic algorithm (GA) based algorithm [15] for data routing is proposed. It uses relay nodes. A fitness function is defined and to select the individuals, the roulette-wheel selection method is used. The mutation process is done by selecting a vital node from the relay nodes.
S. K. Gupta et al. [16] has proposed a routing algorithm using genetic algorithm. To select the individuals, tournament selection method is used instead of roulette-wheel selection method [17].
Ant colony optimization (ACO) depicts ant’s behavior from nature in search of food. ACO prolongs the lifetime of the network and used for multipath routing [18]. On the other hand particle swarm optimization (PSO) observes the behavior of a flock of birds. PSO is used in routing in [14, 19].
Energy can also be preserved by the use of fuzzy logic. An algorithm is proposed in [20] in which three parameters are inputted into the fuzzy system and one output is calculated. These three parameters are concentration, energy, and centrality. A ‘chance’ is calculated for each node which indicates the probability of a node to become CH. Cluster Head Election mechanism using Fuzzy logic (CHEF) [8] is an algorithm similar to [20] in which only two parameters are inputted into the system namely, residual energy and local distance.
A music-based technique is proposed in [9] for clustering. To obtain optimized clusters, an objective function is defined which is based on energy and intracluster distance. Probationary CHs are picked based on reminant energy only. Load balancing is not considered hence energy depletion is higher.
The issue at the hot spot is a conventional problem where the node located near to sink dis permanently. To overcome this issue unequal clustering algorithms are proposed. H. Bagci, and A. Yazici [21] tried to solve this issue using energy aware unequal clustering technique to reduce the hotspot issue and improve the network lifespan. S. A. Sert et al. [22] proposed an multi objective clustering algorithm using fuzzy logic using remaining energy of node distance of BS and density.
Several metaheuristic techniques used for clustering are considered in [23]. Other application areas which involve the usage of metaheuristic techniques are also discussed.
FLIHSBC [24] is proposed to enlarge the lifespan of the network. It is based on equality. The fuzzy logic approach selects the probationary cluster heads while the harmony search algorithm optimizes the count of heads in the clusters. This protocol suffers from energy hole problem.
The problem of Energy hole can be solved using unequal clustering technique as proposed in [25].
Until now, the proposed protocol doesn’t balance the load between the network nodes. Some protocol takes load balancing into consideration while other protocols are energy intensive. The proposed protocol takes into account both by the optimization of the selection of CHs.
System model
Model and network hypothesis
In the proposed protocol, sensor nodes are randomly scattered over the area of interest. A unique id is received by each sensor node. They are similar in nature. All nodes have same energy as well as equal sensing range. The depletion of energy causes the nodes to death [26, 27]. The Base Station (BS) position remains fixed after deployment is done. The Base Station knows the node’s position and having infinite energy. The nodes do not know the global topology; only they have information about the neighborhood. MAC layer’s communication environment is ideal.
Proposed protocol
Clustering is a mechanism used to decrease the dissipation of energy in the network. It uses two stages: first stage is the Cluster Head Competition stage and in second stage is Cluster Build stage.
In proposed protocol, first probationary cluster heads are selected which is employed by the use of fuzzy logic using three parameters. Then these probationary cluster heads are optimized to select the final cluster heads. Optimization is done through harmony search algorithm. Then cluster formation is done.
The proposed protocol uses fuzzy logic for the selection of probationary cluster heads. Three inputs are given to the fuzzy system. They are residual energy, node’s which are distance to the BS. The definitions of these three parameters are given below:
Residual energy: The energy of a node at any instance. Since, cluster head’s responsibility is to gather all the members’ data, aggregate the data and then transmits to the BS. It consumes more energy than the other members in the system. Hence, higher residual energy nodes are picked as cluster heads. This improves the network lifetime.
Node’s Centrality: this tells how much a node is in the center to other nodes. This is an important parameter as it significantly reduces the intracluster communication cost. If a node that is more central to the cluster is selected as CH, then the members are not needed to dissipate more energy in the transmission of the data to the CH.
Distance to the Base Station: It is a measure of the distance amid the nodes to the base station.
At any instance, the sensor node will be in any of the following five states: Initial state, competing for the state, final CH state, CH state, probationary, and member state.
Information of updates phase
Each sensor node contains a table known as information table that contains node ID, node’s energy, node’s distance from the base station, neighbor’s information, etc. Node_Info_Msg is transmitted by the node within its communication range, get all the neighbors information and updates its information table. In this way, each sensor node gets all its neighbors information. This Node_Info_Msg is transmitted periodically to update the latest information about the nodes.
Probationary cluster build phase
In this phase, probationary Custer heads formed by electing probationary cluster heads using fuzzy logic approach. Three parameters are inputted to the system namely, Node Centrality, Residual Energy, and Distance to BS. An output ‘chance’ is calculated that is the probability of a sensor node for becoming a probationary cluster head. Figure 1 explained the fuzzy system of the proposed protocol.

Fuzzy system for the proposed protocol.
Each input and output variables have three fuzzy linguistic variables. Trapezoidal and triangular membership functions are the most common membership functions for the fuzzy system. Boundary fuzzy linguistic variables have trapezoidal membership function while the middle fuzzy linguistic variables have triangular membership functions as shown in Figs. 2–5. In fuzzy sets, each elements is mapped to [0, 1] by membership functions.

Membership Function Energy’s.

Membership Function Centrality.

Dist-BS Membership Function.

Chance Membership Function.
The method of Mamdami fuzzy inference translates the crisp values to the fuzzy linguistic variables. Fuzzy IF-THEN rules are interpreted to produce the output from the input variables. Table 2 shows the fuzzy IF-THEN rules table for the proposed protocol. The method used for finding the crisp output is center of Area (CoA).
Gives a description of the function of the messages used in the proposed protocol
Fuzzy IF-THEN rules
After obtaining the crisp output, each sensor node has a value chance. Furthermore, a competition radius is calculated for each node that is given by Equation 2.
Each node compares its chance value within its competition range. The sensor node having a high value of chance within its range will convert into Probationary CH and its state becomes Probationary_CH. The non-CH nodes will become member nodes.
Probationary CH broadcast ADV_CH_MSG to advertise its cluster head membership. The member nodes to the nearest Cluster Heads broadcast a Join_CH_MSG. Probationary cluster heads then finally transmit Accept_MSG.
The numbers of clusters are optimized through an optimization technique known as Harmony Search Algorithm. It is a musician-inspired technique. The musician seeks a perfect harmonic condition. In order to get perfect harmony, they tried several iterations of pitches. The Harmony Search Algorithm has five phases in it, namely, In first phase, algorithm parameters and optimization problem are initialized. In second phase, Harmony Memory is initialized. In third phase, create a new harmony from the initial harmony. In fourth phase, Harmony Memory is updated. In fifth phase, go to phase 3 before you reached the termination criteria.
Harmony Search Algorithm includes other performance parameters like Harmony Memory Considering Rate (HMCR) Pitch Adjusting Rate (PAR). Harmony Memory Size (HMS)
The optimization problem is:
Equation 3 shows the function to achieve the optimum.
Choosing CH would be easier if the value of this function is lower.
The various functions are described as follows:
–The average distance from all neighboring nodes: Cluster member nodes send their sensed data to their respective cluster heads in intra-cluster communication.
Where, k = number of clusters and k j =amount of jth cluster neighbors.
–Average sink distance: Energy hole problem occurs mainly near the sink. Distance of CH from the sink plays a major part in the election of CHs. By getting plenty of CHs near to the sink avoids the energy hole problem.
–Energy Ratio: It is measured as the energy consumed by all the sensor nodes and by all the heads of the cluster.
Where N is the total number of nodes in the network.
Using the proposed algorithm the optimal cout of nodes are defined. The rest of the probationary CH nodes will become member nodes and join nearest cluster heads. After the final cluster formation, cluster heads create a schedule using Time Division Multiple Access (TDMA) for their respective clusters.
The performance evaluation of the proposed protocol is performed in this section. Many criteria are used for clustering protocol evaluations. Many of the metrics are given as: First Node Death (FND): The count of rounds is represented before the occurrence of first node being death within the network. Half Node Death (HND): the number of rounds until the decease of 50% of the node. 70% Node Dead (LND): It is described as the number of rounds before the decease of 70% of the nodes in the network occurs. Energy consumption: This is the amount of energy that nodes in the WSN deplete. The number of dead nodes per round: It is calculated as the total nodes that have depleted their energy. Packets sent to BS: It is based on the total packets transmitted by the nodes to the Base Station.
Simulation setup
The proposed protocol is simulated in MATLAB. Diverse scenarios are created to evaluate the effectiveness of the proposed protocol. The scenarios that are created are based on the count of nodes and the initial energy of the nodes. By varying this, a total of 9 scenarios are created that are as follows:
These nodes are randomly deployed over an area of 100 *100 with BS at the center.
The proposed protocol is compared with some well-known clustering protocols. They are LEACH [4], MOFCA [22], and FLIHSBC [24]. Table 3 listed the parameters for the simulation
Simulation parameters
Simulation parameters
The lifetime of Wireless Sensor Networks can be represented in several ways. The number of rounds until which the death of sensor nodes occurs is defined as a lifetime. The first node death does not affect the overall operation as the nearby nodes give the same data, but it degrades the quality of the data. When 50% of the network nodes die, then it more degrades the quality of the data. When 70% of the total nodes in the network dies then it worsens the quality of the network. When all the nodes in the network die, then the network stops operating. Here, for the proposed protocol, FND, HND and the 70% of the nodes dead round is taken into consideration. The FND, HND and 70% of the nodes dead round for the scenarios is shown in Figs. 6–8. It is clear from the figure that the proposed protocol lengthens the lifetime of the network. In scenario 1, at 4505th round the first node dies in proposed protocol while in LEACH it is at 542nd round, MOFCA it is at 819th round and in FLIHSBC it is at 4270th round. Similarly, in all scenarios, the proposed protocol performs better due to the optimal selection of CH by harmony search algorithm. Furthermore, Half Node Death round and 70 % of the nodes death round are also increased by the proposed protocol when compared with other algorithms. Probationary CHs are also selected through the proper use of all the variables of nodes. Hence, the lifespan of the network is increased by the proposed protocol.

FND for all scenarios.

HND for all scenarios.

70% of the nodes death round for all scenarios.
The count of nodes deceased is less (all scenarios) in the proposed protocol as shown in Figs. 9–17. Reduction in the cost of intracluster communication, with the selection of optimal clusters, leads to a higher number of alive nodes or we may conclude that less dead nodes. The lifetime of the network is dependent on the number of deceased nodes. therefore, the fewer dead nodes, the longer the lifespan of the network.

Comparison fog count of dead Nodes vs count of Rounds (Scenario 1).

Count of dead Nodes vs Count of of Rounds (Scenario 2).

Count of dead Nodes vs Count of Rounds (Scenario 3).

Count of dead Nodes vs Count of Rounds (Scenario 4).

Number of dead Nodes vs Count of Rounds (Scenario 5).

Count of dead Nodes vs Count of Rounds (Scenario 6).

Count of dead Nodes vs Count of Rounds (Scenario 7).

Count of dead Nodes vs Count of Rounds (Scenario 8).

Count of dead Nodes vs Count of Rounds (Scenario 9).
The energy consumed by the node in the network is one of the metric to assess the performance of the proposed algorithm. The energy depleted by the nodes in the network is less in the proposed protocol as shown in Figs. 18–26. This is due to the reason that optimal cluster heads are chosen from probationary cluster heads. LEACH does not consider residual energy during CH selection. It only selects CHs randomly hence deplete more energy. The intracluster communication cost is also less in proposed protocol hence, consumes less energy. The proposed protocol performs superior in terms of energy efficiency.

Energy Consumed vs Count of Rounds (Scenario 1).

Energy Consumed vs Count of Rounds (Scenario 2).

Energy Consumed vs Count of Rounds (Scenario 3).

Energy Consumed vs Count of Rounds (Scenario 4).

Energy Consumed vs Count of Rounds (Scenario 5).

Energy Consumed vs Count of Rounds (Scenario 6).

Energy Consumed vs Count of Rounds (Scenario 7).

Energy Consumed vs Count of Rounds (Scenario 8).

Energy Consumed vs Count of Rounds (Scenario 9).
The proposed protocol supplied the base station with more successful packets than rest of the protocols, as shown in Fig. 27. The count of packets transmitted efficiently is a criterion for determining the efficacy of the proposed protocol.

Comparison of Count of Packets transmitted for all scenarios.
The procedure suggested balances the network load consumption, thereby lengthening the FND. LEACH protocol has the worst results in all the metrics, since it has no control over the number of CHs chosen. Therefore, the energy use is low. MOFCA balances the load but considers no suitable CHs. In addition to this, FLIHSBC optimizes CHs but intra-cluster communication overhead is increased.
A distinct unequal clustering protocol is proposed in this paper. Probationary cluster heads are chosen by the degree, the distance of the node to the BS, and how central the node is i.e. centrality. These parameters are fed into the fuzzy system. The probationary cluster heads are more optimized with Harmony Search Algorithm. This optimization algorithm selects the optimal clusters. The cost of intracluster communication is lower in the proposed protocol. The protocol proposed balances the load between nodes as well. The problem with energy hole is also solved via the unequal clustering. The proposed protocol is simulated and contrasted with some well-known protocols of similar and unequal clustering protocols. The results obtained proved that the proposed protocol performs in every metrics used to evaluate the output. The proposed protocol is energy-efficient, and balances the load by unequal clustering. We will seek to strengthen the proposed protocol in the future by considering the improvement in topology and security. Mobile nodes can be considered too.
