Abstract
Data acquisition problem on large distributed wireless sensor networks (WSNs) is considered as a challenge in the growth of Internet of Things (IoT). Recently, the combination of compressive sensing (CS) and routing techniques has attracted much attention of researchers. An open question of this combination is how to integrate these techniques effectively for specific tasks. On the other hand, CS data reconstruction process is considered as one of the CS challenges because it requires to recover N data from only M measurement where M< <N. Through this paper, we propose a new scheme for data gathering in IoT based heterogeneous WSN that includes a new effective Deterministic Clustering using CS technique (DCCS) to handle the data acquisition problem. DCCS reduces the total overhead computational cost needed to self-organize WSN using a simple approach and then uses CS at each sensor node to decrease the overall energy consumption and increase the network lifetime. The proposed scheme includes also an effective CS reconstruction algorithm called Random Selection Matching Pursuit (RSMP) to improve the recovery process at the base station (BS). RSMP adds a random selection process during the forward step to give the opportunity for more columns to be selected as an estimated solution in each iteration. The simulation results show that the proposed scheme succeeds to minimize the overall network power consumption and prolong the network lifetime beside provide better performance in CS data reconstruction.
Keywords
Introduction
Internet of Things (IoT) is considered as a primary bridge connecting between physical and human world [45]. IoT becomes one of the significant and attractive fields of researches through which researchers hope to control all everyday usages via the Internet [1]. IoT elements include different objects from the technological environment, or even living organisms as well as devices that are already deeply embedded in technological environments such as smartphones or cars [6]. By integrating computational abilities in all kinds of things and living organisms, it will be possible providing a big leap in many sectors: Health, military, home, entertainment etc. [2, 3]. IoT consists of combinations between different technologies such as cloud computing, WSNs, big data and data information [5, 7]. Nowadays, WSNs widely used in various fields. Since IoT contains huge numbers of connected sensor nodes, WSNs are indeed able to integrate into IoT [42, 44]. The main task for IoT components (such as sensors, phones and RFID tags) is to sense, collect and store data then send them to the base station (BS) [41]. On the other side, data acquisition problem on large distributed WSNs is considered impediment in the further development of the IoT technology [8, 9].
It is highly required to find creative techniques that prolong the network lifetime [43]. Various techniques have been proposed such as routing protocols or data compression schemes [10–16, 62–68]. The data compression methods can be used to minimize the overall wireless channels transmitted data. This technique can be used to reduce the energy consumed by nodes communication which is considered as one of the major power consumers in IoT.
In the perspective of the compression, CS has been given as a sufficient method for signal sampling and compression [34]. In CS, the signal can be successfully sampled less than the rate of Nyquist theory if the signal is sparse by natural or by transformer [34]. In CS, the signal is sampled with compress simultaneously, rather than sample then compress like the other traditional compression techniques [10]. It can be reconstructed without any significant loss in the information [11, 12]. The general CS framework equation can be expressed as the following:
Here, the compressed samples vector y ∈ R
M
, with M << N, Φ is an M × N CS matrix, which is random matrices like Gaussian or Bernoulli distribution matrix in mostly CS methods, and signal vector x ∈ R
N
. In this system, ∥x ∥ 0 = S < M < N [13, 14]. Based on the CS method, if the sensors reading x is S-sparse, i.e., x has just S basis vectors where (N - S) are zeros and only S are nonzero. Then ∥L ∥ 1 norm minimization solution as shown in Equation (2) can be used to reconstruct x using only M-measurements [37].
In the context of routing algorithms, routing is considered as the most important communication paradigms that can optimize energy consumption in WSNs [26]. Designing suitable routing protocols for WSNs is considered as an important issue [21]. The hierarchical cluster-based routing protocol is considered as the most efficient protocol in terms of scalability and energy efficiency for WSNs [38]. In the Hierarchical protocols, sensor nodes are organized in groups called clusters [4]. In each cluster, one node acts as an aggregation point which called cluster head (CH) and cluster members. Each CH aggregates the data received from its cluster members. Finally, the BS receives these aggregated data from the CHs. In this case, the total data amount which is transmitted to the BS can be significantly reduced [27].
The integration between routing protocols and the CS method can help to solve the data acquisition problem [16]. However, finding an efficient way to integrate routing protocols and compressive data aggregation is an NP-complete problem [15]. Through this paper, we propose an effective Deterministic Clustering using CS technique (DCCS) for heterogeneous WSNs to handle the data acquisition problem. DCCS reduces the total overhead computational cost needed to self-organize WSNs using a simple approach and then using CS at each sensor node to minimize the overall network energy consumption and increases the network lifetime. Secondly, we propose an efficient reconstruction algorithm called Random Selection Matching Pursuit (RSMP) to improve the recovery process at the BS side using CS. RSMP adds a random selection process during the forward step to give the opportunity for more columns to be selected as an estimated solution in each iteration. The proposed algorithms are validated by simulations with respect to power consumption and network lifetime.
Our contributions can be highlighted in the following points: We propose a new DCCS algorithm that distributes the network into a fixed number of clusters in each round to overcome the stability problem and optimize energy consumption. Electing a CH depends on the residual energy of sensor nodes as in [39]. In order to decrease the total energy consumption, DCCS divides the communication process into an intra-cluster and inter-cluster communication process. During the intra-cluster process, DCCS organizes the nodes in each cluster into a chain using the proposed chain construction procedure. In the inter-cluster process, using the proposed chain construction procedure, DCCS organizes the CHs in a chain considering CHs as cluster members with the BS as CH. To enhance the CS data gathering and reconstruction process, DCCS allows the BS to dynamically change the measurement matrix depending on the network status. To improve the reconstruction process at the BS, the RSMP algorithm is proposed. RSMP adds a random selection process to the forward step in order to give the chance for all columns to be selected as an estimated solution in each round. Finally, we provide a comprehensive simulation results using randomly generated data and real data. The results prove that the performance of the proposed scheme exceeds existing baseline algorithms in terms of network lifetime, energy consumed and reconstruction error.
The rest of the paper is structured as follows: in Section II, the related works are explained. Section III presents the new algorithms. Section IV presents the simulation validation. Section V provides the conclusion of our research paper.
During the last few years, IoT applications such as surveillance and monitoring applications have attracted a number of researchers. Nodes energy constraints problem led to several routing protocols for WSNs [23]. For example, Deterministic Energy-efficient Clustering (DEC) algorithm is proposed for WSNs [39]. In DEC, WSNs nodes are divided into clusters with a specific CH. DEC has declared itself as a robust and more stable algorithm comparing with other protocols [4, 40] because it selects a fixed number of CHs in each round instead of using probabilistic-based models. Additionally, DEC uses only residual energy of the sensor nodes as the eligibility criterion to optimize energy consumption for WSNs.
The novel energy-efficient clustering protocol (NEECP) [38] aims to maximize the WSNs lifetime. The basic idea of NEECP is that the CH selection process depends upon the initial energies and residual of all nodes. The possible CH can be chosen on the root of the sensing range of the nodes. NEECP uses the chain approach for aggregating and collecting data in each cluster. In [46], Stability Improved LEACH (SILEACH) is proposed to improve the stability of LEACH by considering the residual energy, distances between the nodes and the CH and the distances between the nodes and the BS in CH selection process. In [47], the authors presented Stability Election Protocol (SEP) in which the CH selection process is based on weighted election probabilities using the remaining energy of each sensor node. A stable energy efficient clustering protocol for WSNs (SEECP) is proposed in [48], where the energy load balancing among the nodes is achieved by selecting the CHs in a deterministic fashion. Enhanced Threshold Sensitive Stable Election Protocol (ETSSEP) for heterogeneous WSN is proposed in [49]. ETSSEP is based on dynamically changing CH election probability and it elects CHs on the basis of the residual energy level of nodes and the minimum number of clusters per round.
The routing protocols above do not consider data compression so that they cannot satisfy the huge data traffic of WSNs [19]. It is effective to apply compression before transmitting data to reduce total power consumption by a sensor node [27]. For these reasons, the latest researches have explained that using CS scheme in WSNs can significantly decrease the total number of data gathered and improve WSNs performance [14, 50–55].
Compressive Data Gathering (CDG) [32] is the first work that introduced CS to WSNs. CDG combines routing techniques with CS for reducing the overall network energy consumption. The main problem of CDG is that it only uses the CS idea without any analysis. The proposed Efficient CS based routing technique (ECST) in [27] prolongs the network lifetime using compressive methods to compress sensors reading before sending them to BS, however, ECST does not take into consideration the effect of compression matrix on sensors data. The work in [14] aimed to reduce energy consumption by combining routing algorithms and compression techniques.
In [28], the authors proposed a CS scheme for collecting data in large-scale WSNs. [30] utilized signals data correlation to reduce the sampling ratio. In [31], the authors proposed a scheme to reconstruct losing data by exploiting the correlation between different nodes data. In [18], the authors combined between CS and tree routing protocols. However, it only succeeds to minimize the total forwarding energy consumption, but on the other side, it increases the power consumed by leaves and intermediate nodes. To solve the tree routing problem, the authors of [13] used a CS scheme in a hybrid way in which only parent nodes do CS operation but this solution is suitable only for small networks, however, cluster-based schemes are more efficient for large networks [19].
In [20], the authors integrated between CS and clustering based schemes and suggested an ideal number of clusters that can led to minimize the power consumption. In [21], the authors proposed a CS hybrid method that is integrated with the clustering method and studied the relationship between the cluster size and the number of transmissions in the hybrid CS method. In [51], the authors proposed a new Cluster Size Load Balancing for CS algorithm (CSLB-CS) which could achieve optimal utilization of CS method in an IoT-based WSN. The protocol in [22] combines between random walk (RW) routing protocol and CS to prolong the network lifetime. In [23], the authors proposed a distributed multi-chain compressive sensing-based routing algorithm (DMC-CS) in which all nodes are organized in a number of chains with a leader. Each chain leader collects the CS compressed samples from its members and then sends them to the BS. Although this algorithm succeeds to increase the network lifetime, it is costly because BS needs to know the distances among all sensors.
Multi-path routing protocol (RMPGST-IoT) schema proposed in [49]. It employs the Gray System Theory (GST) where routes with highest rank are selected for concurrent transmission of data, using a specific routine based on GST. A performance evaluation method of WSNs based on GST is proposed in [50]. In [51], the authors proposed a trust model with incentive mechanism to evaluate the reliability of nodes. The proposed model combines fuzzy sets with GST to evaluate every node’s trust credibility based on direct trust and indirect trust relationship. Only those with higher trust values can be chosen to forward packets. Those untrustworthy nodes with lower trust values will be detected and excluded from the trust list. Also, Integration of GST and fuzzy is employed in [52] to find the most optimal Gray-Fuzzy routing strategy and reduce the time complexity effectively.
The concept of linear Diophantine fuzzy sets (LDFSs) is a new approach for modeling uncertainties in decision analysis. Due to the addition of reference or control parameters with membership and non-membership grades, LDFS is more flexible and reliable than existing concepts of intuitionistic fuzzy sets (IFSs), Pythagorean fuzzy sets (PFSs), and q-rung orthopair fuzzy sets (q-ROFSs) [53, 54].
The concept of spherical linear Diophantine fuzzy sets (SLDFSs) introduced in [55] with the inclusion of reference or control parameters. A SLDFS with parameterizations process is very helpful for modeling uncertainties in the multi-criteria decision making (MCDM) process. SLDFSs can classify a physical system with the help of reference parameters. SLDFSs have various real-life applications towards digital image processing, network systems, vote casting, electrical engineering, medication, and selection of optimal choice.
Multi Criteria Decision Making Method was employed in WSN in various work, e.g., [56, 57]. Effective Data Acquisition Protocol for Multi-Hop Heterogeneous WSNs Using Compressive Sensing (EDACP-CS) algorithm was proposed in [24]. EDACP-CS combines a clustering-based algorithm and CS method in which the CH selection depends on distances from the BS and sensor residual energy. However, EDACP-CS suffers from the overhead cost of computation related to CH search. The work proposed in [9] was the first work studied the CS with IoT from the viewpoint of data-compressed sampling. The main problem of the protocol in [9] is that it applies CS without considering the organization of the sensors in order to send or receive the data from and to the BS. In the context of CS reconstruction problem, Orthogonal Matching Pursuit (OMP) [35] determines the greatest magnitude values in Φ T r index during each iteration, where, r represents the residual of y. Then, the least squares (LS) problem is solved. The works in [58, 59] proposed two algorithms based on OMP where the Stagewise OMP (StOMP) algorithm in [58] is an enhancement of OMP. StOMP selects more than one column to enhance the forward step of OMP; then utilizes these columns to solve the LS problem. While in [59], OMP is enhanced by grouping the inner-products having identical magnitudes into sets; then the set with the largest energy is determined. The algorithms [35, 59] do not have a backward step as they fall under the category of irreversible greedy algorithms. The advantage of the backward step is to recover from wrong selection that might have occurred during the forward step. On the other hand, reversible greedy algorithms e.g., IHT [60], CoSaMP [17], SP [33] and FBP [29] employ backward step to eliminate wrong selection added during the forward step. However, as a result of measurement noises, the MF does not usually give the indices of all correct columns. Indeed, the correct indices may not be selected because they give small correlation according to Equation (4).
As discussed above, the related algorithms suffer from non-stability because they used probabilistic-based models. In addition to, there is no efficient mechanism to regularly check the suitability of the selected measurement matrix in each round to decide either to change it or not and this was the motivation behind our new work in this paper. This paper proposes an efficient CS scheme to improve the performance of IoT based WSNs, prolong the network lifetime and improve the reconstruction process. The proposed scheme consists of two algorithms: Deterministic Clustering using CS protocol (DCCS) and Random Selection Matching Pursuit (RSMP). The details of the proposed algorithms can be shown during the next sections.
The used notations through the paper are listed in Table 1.
Notions Description
Notions Description
Recently, IoT technologies have attracted many researchers in the area of wireless networks. However, sensors energy constraints, designing effective sensor data aggregation techniques and managing a large amount of information are considered the major challenges facing IoT technologies. In this section, we describe our new scheme that integrates CS with an efficient routing technique. The proposed scheme includes two algorithms: (1) Deterministic Clustering using CS (DCCS) that divides the WSN into a number of clusters. In each cluster, a CH is selected according to sensor residual energy. DCCS organizes each cluster into a chain to start CS data gathering. Moreover, it allows BS to dynamically change the measurement matrix if it is not still suitable for the network and (2) Random Selection Matching Pursuit (RSMP) for data reconstruction. RSMP adds random selection during the column selection to increase the chance of finding the correct columns in each round and improve the reconstruction performance. Below, we give the details of the two algorithms.
In the DCCS, a heterogeneous IoT based WSN is considered where each sensor node is in one of the following classes: normal, advanced and superior with their names refer to energy level. The DCCS algorithm diagram ispresented in Fig. 1. DCCS comprises of two phases: Data compression and Setup.

DCCS Algorithm.
DCCS executes this phase only once in the first round. The main goal of this phase is to collect all sensor data X without using CS at the BS with minimum energy consumption. Setup phase consists of three steps: Cluster Head Selection, Clusters Construction, and Learning. In CH selection, depending on the residual energy (RE) of nodes, the BS selects a fixed number of nodes to be CHs. In clusters construction, CH organizes its cluster members in a chain. Finally, in learning step, using the received data from CHs, BS generates the CS matrix and calculate the threshold value.
The setup phase procedure will be as follows: CH Selection: in this step, as in [39], BS selects a fixed number of nodes (NCH) to be the CHs based on the RE of every node such that the priority is given to the nodes with higher RE. After the first round, each cluster is responsible dynamically for selecting a new CH. This scenario reduces the computation cost accompanying with CH search in the present protocols. The selected CHs transmit their own information to all other nodes to select the nearby CH and then start Clusters Construction step. Clusters Construction: Once the selected CHs (set C nodes) advertise themselves as CHs, the non-CHs nodes start to construct clusters by sending a join-request message (JRM) to the closest CH
i
, where i = 1, 2, …, N
CH
. This JRM contains (1) node identification (Node - ID), the selected CH identification (CH - id), the node residual energy (Node - RE), and the node’ location (Node - Loc). The DCCS divides the network into N
CH
clusters each cluster has CH and members. In order to decrease the whole network power consumption to transmit data per round in each cluster, the DCCS organizes the member nodes within each cluster into chain(s). Every CH
i
,, i = 1, 2, …, N
CH
applies the Initialization and Update steps to construct ChainList (s). The procedures for this step are shown in Algorithm 1.
Update Step: CH i decides the place of nearest unselected neighbor node C j to node C1 in ChainList i , by comparing the distance between C1, C j and any consecutive nodes in ChainList i . If the distance between C1 and C j is less than D where D the distance between C j and any consecutive, then C i adds it to the end of ChainList i . Otherwise C j will be added between the consecutive nodes that have minimum distance to C j , for example, if C r and C k are consecutive nodes in the ChainList i and if dis (C j , C last )> dis (C j , C r ) and dis (C j , C last ) > dis (C j , C k ), then node C j will be added between C j and C k . Otherwise, node C j will be added to the end of the ChainList i after node C last , where C last the last
Node is added to ChainList i and dis (C j , C k ) is the distance between C j and C k . CH i repeats the previous Update Step to include all its members in ChainList i . If a member dies in a ChainList i , then ChainList i will be reconstructed bypass the dead node.
By applying the previous steps, each node will send and receive the minimum distance, i.e., DCCS can reduce the energy consumption for each node.
3.
The BS generates this matrix using a random seed ɛ, and then broadcasts ɛ to the entire network. To develop the seed selection process, DCCS uses the following scenario: During the intra-cluster process, every CH collects a non-CS data X from its chain members and then fuses these data. CHs start inter-cluster communication process (Algorithm 1) by creating a chain among them, considering the BS as a CH. Using this chain, BS collects all sensors data, and thenit finds the best ɛ that gives the minimum error. The BS uses this minimum error as the threshold β. Finally, the BS sends ɛ to the entire network to use during the Data Compression phase.
Data Compression Phase will be executed starting from the second round. It consists of four steps: CS Data Gathering within intra-cluster and inter-cluster, Reconstruction, Dynamic Re-Generate the Random Seed and CH rotation. At the end of this phase, DCCS re-executes Algorithm 1 to create the new clusters with the new CHs. The details of these steps are illustrated below.
[1]
As described in the previous phase, there are N
CH
clusters, every cluster has CH
i
and chain members are organized in ChainList
i
such that each ChainList
i
= [c0, c1 … , clast-1, c
last
]. CS gathering data in each intra-cluster will be as follows: Last node c
last
in the ChainList
i
uses the global received seed ɛ from the BS to generate α
C
last
. c
last
Computes its compress vector (measurement) y
C
last
= α
C
last
d
C
last
, where d
C
last
is the reading of sensor c
last
, and then transmits the measurement y
C
last
to its previous neighbor node clast-1 in the ChainList
i
. Node clast-1 uses the same global seed ɛ to generate α
C
last-1
and compute its measurement y
C
last-1
= α
C
last-1
d
C
last-1
, and then transmits the summation vector y
C
last
+ y
C
last-1
to the previous node clast-2. Once clast-2 receives y
C
last
+ y
C
last-1
, it computes its value y
C
last-2
, adds it to y
C
last
+ y
C
last-1
and then sends the summation value to the previous node in ChainList
i
and so on till the CH
i
.
Now each CH i has received the compressed vectory i = [y c 0 , yc1c1, …, y c last ] from their corresponding cluster members. Through inter-cluster communication, DCCS applies the same scenario used in Algorithm 1 to organize the CHs in a chain and consider them as cluster members of a cluster with the BS as CH. The same approach isused by both CHs and cluster members in a new cluster for sending their data to the BS. The communication process both inter-cluster and intra-cluster are shown in Fig. 2.

Inter and Intra cluster communication processes.
[2]
Using the received compress vectors y = [y1, y2, y3, …, y i ], where i = [1, 2, …, N CH ], BS generates the CS matrix depending on the predefined random seed ɛ. After that, the BS reconstructs the original data x0 of every cluster. In order to improve this step, Random Selection Matching Pursuit (RSMP) is proposed. The main process of RSMP will be described in Section B.
[3]
During the setup phase, the BS generates M × N CS matrix using random seed ɛ and then broadcasts ɛ to the entire network. In each round, every sensor node transmits and receives a vector M × 1 whatever the number of alive nodes in each round as displayed in CS Gathering data within intra-cluster and inter-cluster Step. This leads to the increment of the average power consumption and also has a negative reflection in the reconstruction process.
To overcome this problem, DCCS dynamically changes the CS matrix whenever the network status changes, i.e.
DCCS gives the ability to dynamically change the CS matrix depending on the network status and the number of alive nodes in each round.
In this case, the CS matrix size decreases as the number of alive nodes decreases, i.e., DCCS can successfully decrease the overall power consumption. The BS can obtain the number of dead nodes in every cluster from each CH. Each CH can simply use a HELLO message to know the number of dead nodes in its cluster in each round. The process of this step can be summarized as follows: The BS compares the latest reconstructed data X′ with X and judges whether to re-generate depending on the error value of ɛ = X -- X′. If it goes beyond a predefined threshold β, it means that there is a change in network status, the BS regenerates new ɛ. Otherwise, no need to change last seed.
[4]
Every CH i checks the piggy-backed CM-REs information to decide whether it remains as CH or gives up their roles to any node in their clusters with the highest RE and assign these nodes as the new CHs. This step prevents WSNs from dying earlier by distributing energy consumption loads.
In this section, a new reconstruction technique called Random Selection Matching Pursuit (RSMP) is proposed. RSMP can be used by the BS to reconstruct the sensors readings again. It is a reversible greedy algorithm in the sense that it has reversible construction, the support set can be pruned (backward step) to remove the unreliable elements that chosen in the past (forward step). Before describing RSMP algorithm, some operations that used in the algorithm are defined as following:
During the forward step, most CS reconstruction greedy algorithms used Matched Filter Detection (MF) operation Φ′y to calculate the correlation between matrix Φ columns and the Sampled Measurement Vector y. Then, Equation (4) is used to select the set of indices corresponding to the n largest amplitude components of Φ′y. The size of n is different in each algorithm, for example: n = 1, . . S, and 2S in Orthogonal Measurement Sampling (OMP) [35], Subspace Pursuit (SP) [33] and COSAMP [17] algorithms respectively. However, as a result of measurement noises, the MF does not usually give the indices of all correct columns. Indeed, the correct indices may not be selected because they give small correlation according to Equation (4). To solve this drawback, RSMP proposes a random technique to the selection process in the forward step to increase the probability to find the correct columns indices in each iteration. The process of RSMP algorithm is given in Algorithm 2. RSMP includes four steps: Initialization, Forward, Backward and Update. The description of the four main steps are as follows:
[1]
[2]
RSMP uses a simple way to improve this step. Instead of selecting the indices of largest amplitude components in the set F only, in each iteration, RSMP selects S + q columns where q is the random selection size. RSMP firstly selects the largest S components in F (H = supp (F, S)) to create set H and then uses Equation (5) to select q random components from setF (R = Rand (F, q)), and creates set R to overcome the interested case in which the correct columns do not give high correlation. Indeed, the probability to find the correct columns in both cases is increased. RSMP set q = M/2 -- S depends on the fact that the CS reconstruction problem can be resolved if the sparsity level S ⩽ M/2 [33]. Finally, RSMP uses the union set U = HR between set H and set R to expand the estimated set T and start the next step.
[4]
This section provides the results of the simulation for evaluating the performance of our work. We divide this section into three parts: (1) in the first part, DCCS algorithm is evaluated in terms of a network lifetime (the first node die) and average power consumption, (2) in the second part, we evaluate RSMP reconstruction algorithm in terms of reconstruction error (Average Normalized Mean Squared) and compare its performance results with Orthogonal Matching Pursuit (OMP), COSAMP, Forward-Backward Pursuit (FBP) [29], Subspace Pursuit (SP) [33] and E-OMP algorithms [25], (3) Finally, the dynamic re-generate random seed step is evaluated in terms of average power consumption and reconstruction error.
In this section, MATLAB environment is used to perform the simulation. In this part, the network region size is 100m × 100m, and the total number of sensor nodes ranges from 50 to 200 nodes with increment of 50 nodes and the BS is located in the middle of the network. The DCCS algorithms is evaluated in case of homogenous and heterogeneous networks.
•
•
In this case, DCCS performance is evaluated and compared with DMC-CS [23] and EDACP-CS [24]. In addition to, we use the same energy parameters used in [4] to send an l-bit message for a distance d. The radio power consumption is:
In order to obtain this message, the radio expend is:
The parameters which are used during this simulated model can be expressed as following: E
elec
= 50nJ/bit, ɛ
fs
= 10pJ/bit/m2,
Figure 4 shows the network lifetime of DCCS, EDACP-CS and DMC-CS. In EDACP-CS and DMC-CS, the death of the first node is earlier than DCCS, and also Fig. 4 illustrates the effectiveness of the DCCS algorithm in prolonging network lifetime than EDACP-CS and DMC-CS algorithms. This is because DCCS uses a fixed number of CHs (NCH) in each round, not the case by the others, which leads to achieving the stability of energy consumption among the nodes. Additionally, in DCCS the BS takes the role to select the CHs only in the first round and then dynamically change the CHs, which considerably decreases the overhead cost of computation associated with CH search comparing with others. DCCS reduces the transmitted CS measurement samples in each cluster which dynamically depends on the network status rather than using a fixed number of CS measurement samples in each round as in the other algorithms.

Flow chart of DCCS Data Compression Phase.

Network lifetime in DCCS, DMC-CS and EDACP-CS.
Figure 5 shows the network lifetime and the number of alive nodes per round in DCCS, EDACP-CS and DMC-CS. It is clear that the first and last nodes in DCCS live more than those of EDACP-CS and DMC-CS, which means that DCCS reduces the energy consumption overall sensors. This is because DCCS reduces the power consumption for each node by organizing cluster members in a chain such that each node sends and receives only from the nearest node, which is not the considered by EDACP-CS algorithm. During the chain construction, DCCS rearranges all nodes in the chain when it adds a new node to the chain list to take in consideration the distances between this node and the others in the chain, rather than adding only the nearest node to the last node in the chain like DMC-CS. In Fig. 6, it is clear that DCCS succeeds to decrease the average energy consumption compared to EDACP-CS and DMC-CS algorithms because DCCS dynamically re-generates CS matrix which is not considered by the others.

Number of alive nodes as a function of number of rounds.

Average energy consumption in DCCS, EDACP-CS and DMC-CS.
In this section, we aim to evaluate the proposed algorithm in the case of a heterogeneous network. In this case, we assume that the total energy of the network is 102 J and the sensor nodes are divided into advanced, intermediate and normal according to their residual energy. DCCS performance is evaluated and compared with ETSSEP [49], SEECP [48] and SILEACH [46] based on CS method. It is clear from Fig. 7 that DCCS still provides good performance in terms of network lifetime extension compared to ETSSEP, SEECP and SILEACH algorithms because the dynamic CS matrix regeneration process in DCCS utilizes the CS matrix in an effective way to reduce the total number of transmitted data which leads to reduce the transmission energy consumption. On the other hand, the ETSSEP-CS, SEECP-CS, and SILEACH-CS algorithms use a fixed CS matrix in each iteration which may become not suitable for the network after a number of iterations. The same effect can be noticed in Fig. 8 where DCCS outperforms the other algorithms in terms of network lifetime in case of the death of half-nodes.

Network lifetime (First node dies) in DCCS, ETSSEP-CS, SEECP-CS and SILEACH-CS.

Network lifetime (half of node die) in DCCS, ETSSEP-CS, SEECP-CS and SILEACH-CS.
From Fig. 9, we can conclude that DCCS succeeds to reduce the total energy consumption compared with the ETSSEP-CS, SEECP-CS and SILEACH-CS algorithms because DCCS divides the network into a number of clusters and inside each cluster, it uses the proposed chain construction algorithm to organize the cluster members into a chain. Moreover, DCCS uses the same proposed chain construction algorithm to organize the CHs in chain using BS as a CH for transmission to the BS.

Residual Energy in DCCS, ETSSEP-CS, SEECP-CS and SILEACH-CS.
In this section, we evaluate the performance of RSMP reconstruction algorithm and compare its results with OMP, COSAMP, SP, FBP and E-OMP algorithms. First, we use RSMP algorithm to reconstruct the signals collected by the Intel Berkeley Research lab (real datasets) [36]. The whole process repeated over 500 times on randomly generated sparse samples S. Second, RSMP is applied to reconstruct computer-generated signals in which its nonzero coefficients are drawn from the uniform and Gaussian distributions. Finally, RSMP performance is measured over signal noise observations. The reconstruction is investigated by Gaussian matrix Φ with size M×N, where N = 256 and M = 128.
RSMP algorithm reconstruction’s performance is compared with different others reconstruction algorithms in term of Average Normalized Mean Squared Error (ANMSE), which is defined as the average ratio
In this section, we use RSMP algorithm to reconstruct the signals collected by Intel Berkeley Research lab [36].
Figure 10 shows the effectiveness of RSMP algorithm in terms of reconstructing the temperature signals. RSMP achieves similar performance in reconstructing the humidity signals as shown in Fig. 11. In Fig. 12, we illustrate the distribution of relative recovery error for different sensor nodes. It is clear that RSMP algorithm surpasses the performance of the other greedy algorithms, i.e., the COSAMP, OMP, EOMP, FBP and SP, respectively.

Reconstruction results over Intel temperature signals.

Reconstruction results over Intel humidity trace.
In this part of simulation, uniform and Gaussian distributions are used to draw the non- zero values of the sparse signal and the sparse level S ranges from 5 to 60. In Fig. 13 where the sparse signal’s non-zeros values are drawn from uniform distribution, RSMP algorithm clearly provides lower ANMSE comparing to COSAMP, OMP, E-OMP, FBP and SP. In addition, ANMSE for RSMP algorithm is started to increase only when S > 56 while it increases when S > 42, S≥30, S≥44, S≥38 and S≥35 for COSAMP, OMP, E-OMP, FBP and SP algorithms respectively as shown in Fig. 13. Figure 14 shows ANMSE results when the sparse signal’s non-zero values are drawn from Gaussian distribution. In Fig. 14, RSMP algorithm still provides lowest ANMSE result comparing to COSAMP, OMP, EOMP, FBP and SP, as S > 59, S≥46, S > 34, S > 49, S > 47 and S > 45, respectively.

Performance of six different reconstruction algorithms for temperature signals.

Reconstruction results over sparsity for uniform sparse signals.

Reconstruction results over sparsity for Gaussian sparse vector.
This part of simulation aims to test RSMP reconstruction performance when different measurement vector lengths- M are used with two different CS matrices: Gaussian and Bernoulli distribution matrices as shown in Figs. 15 and 16 respectively. To achieve this aim, sparse signals drawn from uniform distribution with length N = 120 is used and M values ranges from 10 to 60 with step size 1. From those figures, we observe that RSMP algorithm still provides the lowest ANMSE values comparing to the others.

Reconstruction results over Gaussian matrix by different lengths of M.

Reconstruction results over Bernoulli matrix by different lengths of M.
In this part, we add some noise equal to 10- 4 to the original uniform as well as in Gaussian distributions signal where N = 256 and M = 128. The CS matrix Φ is drawn from the Gaussian distribution. The sparsityS levels are from 10 to 60 with step size 1.
Figures 17 and 18 depict the reconstruction errors for the noisy Gaussian and uniform sparse signals. We observe that RSMP algorithm produces less error than COSAMP, OMP, E-OMP, FBP and SP. In summary, RSMP algorithm improves the reconstruction process with higher performance than COSAMP, OMP, E-OMP, FBP and SP algorithms. This is because in each iteration RSMP gives the chance to the columns which do not give the largest values in MF process to be chosen.

Reconstruction results for noisy uniform sparse signals.

Reconstruction results for noisy Gaussian sparse signals.
In this test, a network with 100 m×100 m region size and the number of sensor nodes ranges from 50 to 200 nodes with increments of 50 and the BS is located at (x = 50, y = 50).
Figure 19 shows the effect of network status changing on the reconstruction process. It is clear that the reconstruction errors of both DCCS dynamic and DCCS-static are the same when the network is stable, i.e., the number of dead nodes is equal to zero. However, the reconstruction error increases when the number of dead nodes is zero in the case of DCCS-static because DCCS-static uses a fixed CS matrix whatever the number of alive nodes in each round. On the other hand, DCCS-dynamic uses the threshold β as the best reconstruction error and then in each round, compares the reconstruction error with β.. If the error is larger than β, the old matrix is not suitable and the DCCS-dynamic regenerates another one. In this case, DCCS-dynamic keeps the reconstruction error stable whatever the number of dead nodes as shown in Fig. 19.

ANMSE as a function of number of died nodes in each round.
Figure 20 shows the ability of the DCCS-dynamic algorithm in the dynamic mode to increase the number of alive nodes in each round compared to DCCS in static mode. According to the DCCS-dynamic scenario, the number of measurement samples transmitted in intra-cluster or inter-cluster communication is decreased while the number of dead nodes is increased.

Average energy consumption in DCCS-dynamic and DCCS-static.
The main goal of IoT components is to obtain information about any event correctly. However, there are some problems which impede the way to achieve this goal such as sensor batteries constraints and managing large amount of data acquisition. To solve these problems, this work has proposed a new CS scheme for IoT based WSN and explained how this scheme can be used to compress and reduce the total data traffic through the network. The proposed work consists of two algorithms, first, one is called DCCS algorithm in which the BS selects a fixed number of cluster heads according to RE of each node. Then, DCCS organizes each cluster members into a chain before starting the CS data gathering. In each round, the BS checks the suitability of the measurement matrix to the network to decide either to change or not. The second algorithm is called RSMP which can successfully reconstruct the original data at the BS. The proposed work achieved our goals to prolong the network lifetime and improve the reconstruction performance. Simulation results proved that our proposed scheme is an effective data collection tool for decreasing energy consumption in network, prolonging network lifetime and reducing reconstruction error. As apart of Future work, we are planning to apply DCCS for improving Wireless Body Area Network performance. In addition, we plan to combine between RSMP algorithm and one of swarm algorithms for more improving to the reconstruction process.
