Efficient Data Gathering in Wireless Sensor Networks Based on Matrix Completion and Compressive Sensing

Abstract: Gathering data in an energy-efficient manner is an important design challenge in wireless sensor networks. Sensor readings generally exhibit both intra-signal temporal correlation and inter-signal spatial correlation. In this paper, we therefore use low-rank matrix completion theory to exploit the spatial correlation and compressive sensing theory to exploit the temporal correlation. Our method, dubbed MCCS, significantly reduces the amount of data that each sensor must send through the network to the sink, and thus prolongs the lifetime of the whole network. Experiments using real datasets demonstrate the feasibility and efficacy of MCCS.


I. INTRODUCTION
In wireless sensor networks (WSNs), deployed sensor nodes periodically collect readings and send them to sinks (or base stations) via wireless channels. Because sensor nodes have limited computation capability and energy, it is desirable to design simple and energy-efficient data gathering methods that reduce the energy consumption of each sensor.
The energy consumed in sending and receiving data is one of the most important factors in WSNs. Various methods have been proposed to achieve data compression, including traditional source coding and distributed source coding [1]. Besides those methods, another promising class is Compressive Sensing (CS) [2-3] based approaches.

CS theory proves that if a discrete signal $x_N \in \mathbb{R}^N$ can be sparsely represented as $\theta_N$ under a transform basis $\Psi_{N \times N}$ (e.g., a wavelet basis), we can exactly recover $x_N$ from $M$ measurements $y_M$ ($M < N$) by solving the $\ell_1$ minimization problem

$$\min \|\theta_N\|_{\ell_1} \quad \text{s.t.} \quad y_M = \Phi_{M \times N} x_N = \Phi_{M \times N} \Psi_{N \times N} \theta_N \qquad (1)$$

In (1), $\Phi_{M \times N}$ is the sensing matrix and $\theta_N$ is a sparse vector with only a few non-zero elements. In this paper, we use a sparse binary matrix as the sensing matrix, which reduces energy consumption while achieving a competitive data compression ratio [4].
Vuran et al. [5] pointed out that in wireless sensor networks, the phenomena observed by sensors are highly correlated in both space and time. These correlations make the sensor readings sparse under the wavelet transform (see Fig. 2 in the experiment section for an example from a real-world dataset), which satisfies the sparsity requirement of CS theory.
In [6][7], both inter-signal and intra-signal correlations are considered to reduce the communication cost of each sensor. Each sensor sends its CS-compressed measurements to the sink individually, and the sink recovers the original readings from the measurements of all sensors using a Joint Sparsity Model (JSM).
In [8], the authors proposed the Compressive Data Gathering (CDG) framework, which exploits inter-signal sparsity in a multi-hop, distributed fashion for large, dense wireless sensor networks. In [9], a hybrid CS aggregation scheme was proposed to further reduce the communication overhead of sensors. However, this kind of method must recompute the spanning tree whenever a node joins or leaves.
The temporal correlation within each signal and a hierarchical cluster structure were exploited in [10]. In this method, the cluster heads of each layer perform CS reconstruction and use the recovered readings to form new, shorter CS measurements. The disadvantage is that each cluster head must perform CS reconstruction, which is computationally expensive.
Matrix Completion (MC) is the theory of recovering a full data matrix from a subset of its entries. Recently, Candès et al. proved that if the data matrix is low rank or approximately low rank, the full matrix can be recovered from an incomplete set of entries [11]. Observing that sensor readings can form an approximately low-rank data matrix [5], Cheng et al. proposed the EDCA scheme, which applies low-rank matrix completion theory to the data gathering problem in WSNs [12]. Since EDCA samples only part of the readings on each sensor, the energy cost of sampling, which had not been investigated before, can also be reduced.
Inspired by EDCA and by the power of intra-signal temporal correlation, in this paper we combine, for the first time, matrix completion and compressive sensing theory to further compress the sensor readings.

SPECIAL FOCUS PAPER EFFICIENT DATA GATHERING IN WIRELESS SENSOR NETWORKS BASED ON MATRIX COMPLETION AND COMPRESSIVE…
This paper first presents our matrix completion and compressive sensing method, called MCCS, and then demonstrates its performance gains on a real deployed dataset. Finally, we conclude and outline future research directions.

II. PROPOSED SCHEME
The fundamental idea of our proposed method is straightforward. Each sensor node randomly samples readings, compresses the collected readings using CS, and then sends out the compressed data. The base station first performs CS reconstruction and then applies matrix completion to obtain the readings of all the sensors.

A. Encoding Algorithm at Each Sensor Node
Our encoding aims to reduce energy cost in all possible aspects, including the cost of sampling, the cost of communication and the computational complexity. The details of the encoding algorithm at each sensor are as follows.
Step 1: Each sensor node randomly generates a binary sampling position vector $P_N$ with only $q$ ($q < N$) non-zero entries. Here, we call $q/N$ the sample ratio.
Step 2: Each sensor node then scans the binary sampling position vector and takes a sample only when the corresponding entry is non-zero. At the end, the sampled readings form a vector $x_q \in \mathbb{R}^q$.
At this step, MC-based compression is applied. Each node samples only $q$ readings instead of $N$, which results in a compression ratio of $q/N$. Thus the energy cost of sampling is reduced [12].
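Steps 1 and 2 can be illustrated with a minimal Python sketch. The function names, the synthetic `fake_sensor` trace, the RNG seed and the values N = 250 and q = 75 are assumptions chosen for illustration; they are not part of the original scheme description.

```python
import random


def make_sampling_positions(n, q, rng):
    """Step 1: binary position vector of length n with exactly q ones."""
    positions = [0] * n
    for idx in rng.sample(range(n), q):
        positions[idx] = 1
    return positions


def sample_readings(read_sensor, positions):
    """Step 2: take a reading only in slots whose position entry is non-zero."""
    return [read_sensor(t) for t, flag in enumerate(positions) if flag]


def fake_sensor(t):
    """Hypothetical sensor: a slowly rising temperature trace."""
    return 20.0 + 0.01 * t


rng = random.Random(7)          # illustrative seed
N, q = 250, 75                  # sample ratio q/N = 0.3
P = make_sampling_positions(N, q, rng)
x_q = sample_readings(fake_sensor, P)
print(len(P), sum(P), len(x_q))
```

Only the q selected slots trigger a sensor read in this sketch, which is where the saving in sampling energy would come from.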
Step 3: The first time, each sensor node generates and stores the same sparse binary matrix $B_{p \times q}$ ($p < q$) using the seed $K$ pre-shared among all sensors and sinks. Each column of $B_{p \times q}$ contains only a small number $d$ ($p > d \geq 1$) of non-zero elements at random locations. Also, we call $p/q$ the CS compression ratio.
Step 4: Each sensor node obtains the CS projections $y_p$ from $x_q$ by the operation

$$y_p = B_{p \times q} x_q$$

After sending out $y_p$ and the sampling position vector $P_N$ (in bit mode), the node goes back to Step 1.
At this step, since our measurement matrix is a sparse binary matrix, the CS compression involves only simple addition operations. That is why we select a sparse binary matrix instead of the widely used Gaussian matrix.
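The following Python sketch illustrates Steps 3 and 4 under stated assumptions: the matrix is stored column by column as the row indices of its ones, Python's `random.Random` stands in for whatever seeded generator the nodes and the sink actually share, and the values of p, q, d and the seed are illustrative only.

```python
import random


def make_sparse_binary_matrix(p, q, d, seed):
    """Step 3: for each of the q columns, the d distinct rows that hold a 1."""
    rng = random.Random(seed)
    return [sorted(rng.sample(range(p), d)) for _ in range(q)]


def project(columns, p, x_q):
    """Step 4: y_p = B x_q, computed with additions only."""
    y = [0.0] * p
    for rows, value in zip(columns, x_q):
        for r in rows:
            y[r] += value   # each 1 in B contributes one addition
    return y


K = 2024                        # hypothetical pre-shared seed
p, q, d = 30, 75, 3             # CS compression ratio p/q = 0.4
B_node = make_sparse_binary_matrix(p, q, d, K)
B_sink = make_sparse_binary_matrix(p, q, d, K)   # the sink rebuilds the same B
x_q = [float(i) for i in range(q)]               # placeholder readings
y_p = project(B_node, p, x_q)
print(B_node == B_sink, len(y_p))
```

Because every column holds exactly d ones, the projection costs d additions per sampled reading and no multiplications, which matches the motivation given for choosing a sparse binary matrix.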

Although $x_q$ is randomly sampled from $N$ consecutive readings, we find that, owing to the temporal correlation, $x_q$ is still sparse under a certain transform basis (Fig. 3 shows an example from a real dataset). That is why we can still use CS to compress the readings after the matrix completion based compression, and this is the main contribution of this paper. In the end, each sensor only needs to send out $p$ values instead of $N$, which results in a total compression ratio of $p/N$.
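As a quick check of how the two stages compose, the short Python sketch below computes q, p and the total ratio p/N. The helper name `mccs_sizes` and the rounding to whole samples are assumptions; the input values (N = 250, sample ratio 0.3, CS compression ratio 0.4) are the ones used in the experiment section.

```python
def mccs_sizes(n, sample_ratio, cs_ratio):
    """Return (q, p, p/n) for n readings per node and the two stage ratios."""
    q = round(sample_ratio * n)   # readings kept by random sampling
    p = round(cs_ratio * q)       # CS projections actually transmitted
    return q, p, p / n


q, p, total = mccs_sizes(250, 0.3, 0.4)
print(q, p, total)   # 75 30 0.12
```

The total ratio is simply the product of the two stage ratios, (q/N)(p/q) = p/N.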

B. Recovering Algorithm at Sink Node
At the beginning, the sink node generates and stores the same sparse binary matrix $B_{p \times q}$ using the shared seed $K$.
After receiving the CS projections $y_p$ from each node, the sink node reconstructs the partial readings $x_q$ by solving an $\ell_1$ minimization problem:

$$\min \|\theta_q\|_{\ell_1} \quad \text{s.t.} \quad y_p = B_{p \times q} x_q = B_{p \times q} \Psi_{q \times q} \theta_q$$

Here $\Psi_{q \times q}$ is a transform basis under which $x_q$ is sparsely represented as $\theta_q$. Suppose $\hat{\theta}_q$ is the solution of this convex optimization problem; then the recovered partial readings are $\hat{x}_q = \Psi_{q \times q} \hat{\theta}_q$. With proper values of $p$ and $q$, the error between $\hat{x}_q$ and $x_q$ can be very small.
After recovering from CS compression, the sink node uses $\hat{x}_q$ and the binary sampling position vector $P_N$ from each sensor node to form an incomplete readings matrix $\hat{X}_{J \times N}$, where $J$ is the number of sensor nodes and each row contains zeros and the corresponding $\hat{x}_q$. Owing to the spatial correlation, the full readings matrix underlying $\hat{X}_{J \times N}$ is approximately low rank. That means we can recover the full readings matrix by solving the convex optimization problem [11]:

$$\min \|X\|_* \quad \text{s.t.} \quad X_{ij} = \hat{X}_{ij}, \quad (i, j) \in \Omega$$

Here, $\|\cdot\|_*$ is the nuclear norm, and the set $\Omega$ contains the positions of the partially sampled readings.
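The assembly of the incomplete matrix and of the observed set can be sketched in Python as follows. This covers only the bookkeeping between the two recovery stages, not the $\ell_1$ or nuclear norm solvers themselves; the function name and the toy inputs are hypothetical.

```python
def build_incomplete_matrix(recovered, positions):
    """Scatter each node's recovered partial readings back to their time slots.

    recovered[j] is the list of CS-recovered readings of node j and
    positions[j] is that node's binary sampling position vector.
    Returns the zero-filled matrix and the list of observed (row, col) pairs.
    """
    matrix, omega = [], []
    for j, (x_hat, p_vec) in enumerate(zip(recovered, positions)):
        if len(x_hat) != sum(p_vec):
            raise ValueError(f"node {j}: reading count does not match positions")
        values = iter(x_hat)
        matrix.append([next(values) if flag else 0.0 for flag in p_vec])
        omega.extend((j, t) for t, flag in enumerate(p_vec) if flag)
    return matrix, omega


# Toy example: J = 2 nodes, N = 4 time slots, q = 2 samples per node.
positions = [[1, 0, 1, 0], [0, 1, 1, 0]]
recovered = [[21.0, 22.0], [30.0, 31.0]]
X_hat, omega = build_incomplete_matrix(recovered, positions)
print(X_hat)
print(omega)
```

A matrix completion solver would then take the matrix and the observed positions as input and fill in the unobserved entries.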

III. EXPERIMENT RESULTS
We use the real-world dataset from the Intel Berkeley Research Lab [11] to evaluate the efficiency of our proposed method. The dataset contains temperature, humidity, light and voltage values collected every 31 seconds from 54 distributed sensor nodes between February 28th and April 5th, 2004. In our experiment, we select the temperature values from 46 nodes on March 1st, 2004 (eight of the 54 nodes have very few values). These traces form a matrix $X_{46 \times 250}$, in which each row holds the readings of one sensor.
As an example, consider the row of $X_{46 \times 250}$ that belongs to sensor node 10. Fig. 2 shows its 250 coefficients after a 6-level 9/7 wavelet de-correlation. Only 5 coefficients have an absolute value larger than 0.05, which means the readings are very sparse in the wavelet domain. More importantly, Fig. 3 shows that even after random sampling from the original readings (sample ratio $q/N = 0.5$), the partially sampled readings are still sparse in the wavelet domain.
In this experiment, we compare our MCCS method only with the closely related MC method [12]. We use the CVX toolbox [14] for CS reconstruction and the TFOCS toolbox [15] for matrix completion recovery.
We run 100 simulation iterations and report the average results. In each simulation, we compute an error matrix by comparing the recovered matrix with the original matrix $X_{46 \times 250}$. Fig. 4 and Fig. 5 show the mean value and the standard deviation of the error at different sample ratios $q/N$ for the MC method and our MCCS method. The results show that, with almost the same accuracy, MCCS compresses the readings further than the MC method. For example, at the same sample ratio of 0.3, if we set the CS compression ratio $p/q$ to 0.4, our total compression ratio $p/N$ is 0.12, which is much lower than that of the MC method.

IV. CONCLUSION
In this paper, we proposed MCCS, a method for gathering data in an energy-efficient manner for sensor nodes. Our method fully exploits the low-rank and sparse nature of the readings among sensor nodes and can achieve a high compression ratio. Experiments on a real-world dataset demonstrate the efficiency of the method.
Our future work includes: 1) comparing with other CS-based methods in the literature; 2) running more experiments on other WSN deployment datasets with other CS reconstruction and MC recovery algorithms; 3) applying DCS-OMP [7] to the CS reconstruction procedure to see whether we can further reduce the number of samples required for effective recovery; and 4) modifying our method to accommodate abnormal readings.

Figure 4. Mean value with different sample ratio of original temperature readings from sensor node 10