An RSSI-based Wireless Sensor Network Localization Algorithm with Error Checking and Correction

Abstract: This paper studies in detail the wireless sensor network localization algorithm based on the received signal strength indicator (RSSI). Because the algorithm produces large errors in the ranging and localization of nodes, this paper corrects and compensates those errors to improve the coordinate accuracy of the node. The improved node localization algorithm performs error checking and correction on the anchor node and on the node to be measured, respectively, so that the received signal strength value of the node to be measured is closer to the real value. It corrects the weighting factor by using the measured distance between communication nodes to make the coordinates of the node to be measured more accurate. It then calculates the mean localization deviation from the anchor nodes close to the node to be measured and compensates the coordinate error. Simulation experiments show that the localization algorithm with error checking and correction proposed in this paper improves the localization accuracy by 5%-6% compared with the weighted centroid algorithm based on RSSI.


Introduction
The wireless sensor network (WSN) is currently a popular information acquisition and processing technology and a research hotspot. WSN technology integrates sensor, wireless network communication and embedded computing technologies and transmits data through WSN nodes. A large number of inexpensive sensor nodes are scattered randomly in a detection region, where they self-organize via wireless communication into an intelligent network system that can sense and acquire various target information in the region at any time. In a WSN, the geographical locations of nodes are very important to the whole network: knowing where an event occurred and where the monitored target is located is the basis for real-time monitoring by the sensor network. Therefore, an appropriate localization algorithm is needed to obtain the location information of each communication node in the WSN.
In recent years, many researchers at home and abroad have carried out in-depth studies on WSN localization algorithms. Juan [1] proposed a distributed WSN localization method based on local spatial constraints, which estimates the coordinates of a sensor node by iteratively solving a set of local spatial constraints and then continually updates the estimate based on the adjacent nodes within communication range. ThuL [2] proposed a compressed sensing localization method based on a deterministic sensing matrix, which models multiple sensor nodes as a sparse matrix in a discrete region, recovers the noisy measurements using the received signal strength information, and then locates the target. Jian [3] proposed a cluster-based WSN localization algorithm, which uses the cluster structure and a global system to represent the logic of the network and reduces measurement errors based on the multi-hop probability, so as to improve the localization accuracy of nodes. Qiao Xin et al. [4] proposed using a percentage to correct the average hop distance of anchor nodes in DV-Hop, and used the differential particle swarm optimization algorithm to optimize the location coordinates of the nodes to be measured. Wen Jiangtao et al. [5] analyzed the shortcomings of the classical DV-Hop algorithm, used a quantization model to represent node errors and finally applied a weighting factor to correct the location coordinates of the nodes. Cui Fayi [6] proposed the quadrilateral localization algorithm, which selects the four anchor nodes with the least path loss, performs ranging with each combination of three of them and takes the average of the coordinates obtained from the qualified triangles as the location coordinate of the node to be measured, which improves the accuracy to a certain extent.
By comparing different localization algorithms, this paper finds that the RSSI-based localization algorithm offers high accuracy and requires no additional hardware. This paper therefore further studies the RSSI-based WSN localization algorithm, aiming to reduce the measurement errors in the ranging and localization of sensor nodes and to improve the localization accuracy of the nodes to be measured.

2 Analysis of the anchor node localization algorithm

Establishment of the network model
At present, popular network models include the free space propagation model, the Hata model, the log-distance distribution model and the log-distance path loss model. This algorithm integrates all the above models. The free space propagation model is as follows:
$$Loss = 32.44 + 10k\log_{10} d + 10k\log_{10} f \tag{1}$$
where $d$ is the distance from the signal source in km, $f$ is the frequency in MHz, and $k$ is the path attenuation factor. Due to the complexity of the application environment and the variability of node performance, there is some gap between the resulting radio propagation path loss and the actual value, but the log-distance path loss model can reduce the gap. The following formula is used to calculate the path loss when the anchor node receives information:
$$PL(d) = PL(d_0) + 10k\log_{10}\left(\frac{d}{d_0}\right) + X_\sigma \tag{2}$$
where $PL(d)$ represents the path loss after distance $d$; $PL(d_0)$ is the path loss at the reference distance $d_0$; $X_\sigma$ represents a Gaussian-distributed random variable with a mean of 0 and a standard deviation between 4 and 10; and $k$ ranges from 2 to 4. Taking the reference distance as 1 m and substituting it into the model gives the value of $PL(d_0)$. When each anchor node receives the signal, the signal strength is then calculated as:
$$RSSI = P + G - PL(d) \tag{3}$$
In the above equation, $P$ represents the transmission power and $G$ represents the antenna gain. The absolute distance error produced by a deviation of the RSSI value decreases as the distance from the node to the signal source shortens. This is because, under the influence of $X_\sigma$, the same RSSI fluctuation corresponds to a larger absolute distance error at longer range, where the RSSI changes only slowly with distance.
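The log-distance path loss model and the received-strength relation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the simulation code used in the paper: the function names, the 1 m reference distance default and the numeric values in the usage note (0 dBm transmit power, 0 dB gain, 40 dB reference loss) are all hypothetical.

```python
import math
import random


def path_loss_db(d_m, pl_d0_db, k, d0_m=1.0, sigma_db=0.0, rng=None):
    """Log-distance path loss: PL(d0) + 10*k*log10(d/d0) + Gaussian noise (dB)."""
    # Noise is only drawn when a positive standard deviation is requested.
    noise = (rng or random).gauss(0.0, sigma_db) if sigma_db > 0 else 0.0
    return pl_d0_db + 10.0 * k * math.log10(d_m / d0_m) + noise


def rssi_dbm(d_m, tx_power_dbm, gain_db, pl_d0_db, k, **kw):
    """Received strength: transmit power plus antenna gain minus path loss."""
    return tx_power_dbm + gain_db - path_loss_db(d_m, pl_d0_db, k, **kw)


def distance_from_rssi(rssi, tx_power_dbm, gain_db, pl_d0_db, k, d0_m=1.0):
    """Invert the noise-free model to turn an RSSI reading into a distance."""
    pl = tx_power_dbm + gain_db - rssi
    return d0_m * 10.0 ** ((pl - pl_d0_db) / (10.0 * k))
```

With the illustrative values above and k = 3, a node 10 m away reads -70 dBm, and inverting that reading returns 10 m. Shifting the reading by 1 dB moves the inverted distance further at 10 m than at 2 m, which matches the observation that the absolute ranging error shrinks as the node gets closer to the source.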

Distance estimation based on RSSI
In the ideal case, the received signal strength indicator between two nodes decreases as the distance increases: the longer the distance, the smaller the strength. The two nodes with the minimum received signal strength are therefore the farthest apart. When all the RSSI values in the entire network are collected, find the minimum value in the set $C$ of the RSSI values, as shown in the following formula:
$$RSSI_{min} = \min\{RSSI \mid RSSI \in C\} \tag{4}$$
Then the distance between the nodes corresponding to $RSSI_{min}$ is the longest, denoted as $d_{max}$, and the distance $d(v_i, v_j)$ between the two nodes corresponding to the edge $e(v_i, v_j)$ is denoted as:
[Formula (5)]
Since $d_{max}$ represents the maximum distance between communicable nodes in the entire network, two nodes that are any farther apart cannot receive each other's signals and so cannot communicate; $d_{max}$ is therefore close to the communication radius $R$ of the wireless sensor network node. The value of $d_{max}$ is thus taken to be approximately equal to $R$ in the algorithm, and the estimated distance of the edge $e(v_i, v_j)$ is obtained as follows:
[Formula (6)]
The estimated distance can be obtained by the above equation, but in practice $R \ge d_{max}$, so the estimated distance between these communicable anchor nodes is greater than or equal to the actual distance between them, i.e., $d_{est} \ge d_{ij}$. Such an approximate estimate will result in some error in the localization algorithm.
In addition, the localization algorithm also eliminates $k$ when it finds the relative relationship between the localization nodes. Here $k$ represents the scale factor between the path length and the path loss in the wireless signal propagation model, and its value differs from one environment to another. Once this scale factor is eliminated, the distance between nodes can be estimated without knowing the value of $k$. The distance estimation algorithm can therefore be applied in any environment, and it can also be used to deduce the value of $k$, as shown below:
[Formula (7)]
where $R$ represents the communication radius of the sensor localization node. This result can also serve as a way to characterize an unknown environment when the sensor nodes are deployed in an unknown area.
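One way to see how the scale factor can drop out is to assume the noise-free log-distance model RSSI(d) = A - 10k log10(d), with A the strength at 1 m, and to equate the weakest link with the communication radius R. The sketch below follows that assumption only; it is not necessarily the paper's own expression, and `rssi_at_1m`, `infer_exponent` and `estimate_distance` are hypothetical names.

```python
import math


def infer_exponent(rssi_min, rssi_at_1m, radius_m):
    """Exponent k that places the weakest link (rssi_min) exactly at radius R."""
    return (rssi_at_1m - rssi_min) / (10.0 * math.log10(radius_m))


def estimate_distance(rssi, rssi_min, rssi_at_1m, radius_m):
    """Link distance with d_max taken as R; k cancels out of the ratio."""
    # Under the assumed model:
    # (A - rssi) / (A - rssi_min) = log10(d) / log10(R)
    ratio = (rssi_at_1m - rssi) / (rssi_at_1m - rssi_min)
    return radius_m ** ratio
```

If the longest observed link is actually shorter than R, this estimate comes out larger than the true distance, which is consistent with the remark that the estimated distance is greater than or equal to the actual one.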

Error correction analysis of anchor nodes
The error of the beacon node is obtained by comparing the distance measured by RSSI with the actual distance. Take an anchor node with known coordinates as the node to be measured, calculate its coordinates by the trilateration method and then compare them with the actual coordinates to obtain the errors. If both errors are taken into account during the localization of an anchor node, the impact of the interference factors in the network can be reduced. Suppose the adaptive weighted correction coefficient $\mu$ for the RSSI ranging of the anchor node is as follows:
[Formula (8)]
The adaptive weighted factor in Formula (8) represents the weight of each RSSI measured value and is computed from $d_i$, the measured distance between the anchor nodes to be corrected, where $i$ stands for the number of anchor nodes used for correction. For each anchor node in the entire network, the location information of the other anchor nodes can always be used to correct its RSSI measured value. $\mu$ reflects the RSSI measurement accuracy of the anchor nodes. The corrected distance between the anchor nodes to be measured can be expressed as:
[Formula (9)]
Since the adaptive weighted correction coefficient $\mu$ reflects the error generated by the anchor node in measuring the RSSI value, it can improve the accuracy of the distance measurement between localization nodes. However, this correction still cannot remove errors caused by unfavorable factors such as obstruction and partition in the trilateration, so anchor node information needs to be used for further correction. The coordinate errors of the anchor node can be expressed as:
$$e_{xj} = x_j - x'_j, \qquad e_{yj} = y_j - y'_j \tag{10}$$
where $x_j$ represents the actual X-coordinate of the j-th anchor node; $x'_j$ represents the calculated X-coordinate of the j-th anchor node; $y_j$ represents the actual Y-coordinate of the j-th anchor node; and $y'_j$ represents the calculated Y-coordinate of the j-th anchor node.
The calculated coordinate errors of the anchor nodes are the localization errors of the network in which the anchor nodes are located. The network localization errors can be expressed as:
$$E_x = \frac{1}{n}\sum_{j=1}^{n} e_{xj}, \qquad E_y = \frac{1}{n}\sum_{j=1}^{n} e_{yj} \tag{11}$$
where $e_{xj}$ denotes the X-coordinate error of the j-th node, $e_{yj}$ denotes the Y-coordinate error of the j-th node, and $n$ is the number of anchor nodes over which the errors are averaged. The network localization error is thus the mean value of the anchor node coordinate localization errors, so the final coordinates of the anchor node to be measured can be expressed as:
$$x = x_c + E_x, \qquad y = y_c + E_y \tag{12}$$
where $x_c$ is the X-coordinate of the anchor node to be measured obtained by the trilateration method and $y_c$ is its calculated Y-coordinate. The above analysis shows that the distance between anchor nodes is measured using the known anchor nodes. $\mu$ represents the measurement error coefficient of the anchor nodes, and it is used only to correct the distances calculated from anchor node measurements.
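The mean-error compensation step described above can be illustrated with a short Python sketch. It assumes the error is defined as actual minus calculated coordinates, so the compensation is added to the raw estimate; that sign convention and the function names are assumptions made for illustration, and the calculated anchor coordinates are taken as given.

```python
def mean_anchor_error(actual, calculated):
    """Mean (x, y) deviation between true and re-localized anchor coordinates."""
    n = len(actual)
    # Assumed convention: error = actual - calculated.
    ex = sum(a[0] - c[0] for a, c in zip(actual, calculated)) / n
    ey = sum(a[1] - c[1] for a, c in zip(actual, calculated)) / n
    return ex, ey


def compensate(estimate, mean_error):
    """Shift a raw trilateration estimate by the mean anchor error."""
    return estimate[0] + mean_error[0], estimate[1] + mean_error[1]
```

For example, if every re-localized anchor lands 1 m too far in x and 0.5 m too far in y, the mean error is (-1, -0.5), and a raw estimate of (6, 5.5) is pulled back to (5, 5).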

3 Error analysis of the node to be measured

Errors in distance estimation
Errors in measurement of the RSSI value. The ideal RSSI value between two sensor nodes i and j that can communicate with each other is denoted as $RSSI_{ij}$, and the measured RSSI value is denoted as $RSSI'_{ij}$. Here a Gaussian random variable is used to model the signal interference in the real environment and obtain a simulated value. $RSSI'_{ij}$ can be expressed by the formula below:
$$RSSI'_{ij} = RSSI_{ij} + X_\sigma \tag{13}$$
where $X_\sigma$ is a Gaussian random variable with a mean of 0 and a variance of $\sigma^2$; the larger $\sigma$ is, the greater the error in $RSSI'_{ij}$. When $\sigma$ is constant, $X_\sigma$ varies within the corresponding range. When the distance between i and j increases, the received energy value $RSSI_{ij}$ becomes smaller. As can be seen from Formula (13), when $RSSI_{ij}$ is small, the error of the corresponding $RSSI'_{ij}$ is relatively greater, so in practice we may take the mean of several RSSI values between two communicable nodes to reduce the error caused by RSSI ranging.
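The benefit of averaging repeated RSSI readings can be illustrated with a small Python simulation. The sketch assumes additive zero-mean Gaussian noise with standard deviation `sigma`; the function names and the sample values in the usage note are hypothetical.

```python
import random
import statistics


def measure_rssi(ideal_rssi, sigma, rng):
    """One noisy reading: the ideal value plus zero-mean Gaussian noise."""
    return ideal_rssi + rng.gauss(0.0, sigma)


def averaged_rssi(ideal_rssi, sigma, n_samples, rng):
    """Mean of n_samples independent readings taken on the same link."""
    return statistics.fmean(
        measure_rssi(ideal_rssi, sigma, rng) for _ in range(n_samples)
    )
```

Calling `averaged_rssi(-70.0, 2.0, 100, random.Random(0))` gives a value much closer to -70 than a typical single reading, because the standard deviation of the mean falls as one over the square root of the number of samples.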
Assume that the true distance between i and j is $d_{ij}$ and that the estimated distance obtained through calculation is $d'_{ij}$. Then, according to Formula (5), the distance between two communicating nodes can be obtained, as shown in Formula (14):
[Formula (14)]
Here we can approximately take $d_{max} \approx R$, where $R$ represents the communication radius of the sensor localization node, but in reality $d_{max} \le R$, so this approximation also introduces some error into the estimated distance. If $RSSI_{ij}$ and $RSSI_{min}$ are known after one RSSI measurement, then replacing $d_{max}$ with $R$ can be treated as an error in the maximum distance between communicable nodes. Under the assumption
[Formula (15)]
Formula (14) can be rewritten as follows:
[Formula (16)]
Formula (16) gives the error produced by approximating $d_{max}$ with $R$. The formula also shows that this error itself involves the measurement error of the RSSI value, so errors accumulate in the calculation of $d'_{ij}$.
Errors produced by the approximation of the communication radius R of WSN nodes and the maximum distance between communicable nodes. The relationship model between RSSI and distance can be expressed as a quadratic polynomial, i.e., $RSSI = a \times length^2$, where $a$ is the coefficient to be determined and $length$ is the distance. In this model, there are sensitive and non-sensitive regions; the sensitive region is the range of RSSI values within which a small change in RSSI causes a large change in the distance. When the measured RSSI value falls within the sensitive region, the error of the calculated distance is larger; otherwise, the error is smaller.

Errors in location calculation
After all the estimated distances are obtained, the coordinates of all nodes to be measured are solved at once through the planning method. The algorithm therefore has no clear boundary between the location calculation and the localization algorithm; instead, it interleaves the two (the cross method). Using different planning solutions will result in different localization errors.
Signals are attenuated in the process of propagation, so ranging through the RSSI value produces distance errors, and as a result the three circles do not always intersect at one point in the trilateration method. The actual situation may be similar to the result shown in Figure 1. It can be seen from Figure 1 that the actual measurement is somewhat different from the ideal result because it is affected by environmental factors.
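When the ranged circles do not meet at a single point, a common workaround is to solve the circle equations in the least-squares sense. The Python sketch below shows one such linearized solver; it is an illustrative stand-in, not necessarily the solver used in this paper, and the function name is hypothetical.

```python
def trilaterate(anchors, distances):
    """Least-squares position from three or more anchors and ranged distances."""
    (x0, y0), d0 = anchors[0], distances[0]
    # Subtracting the first circle equation from each of the others
    # leaves linear rows of the form a*x + b*y = c.
    rows = []
    for (xi, yi), di in zip(anchors[1:], distances[1:]):
        a = 2.0 * (xi - x0)
        b = 2.0 * (yi - y0)
        c = d0**2 - di**2 + xi**2 - x0**2 + yi**2 - y0**2
        rows.append((a, b, c))
    # Normal equations (A^T A) p = A^T c, solved with Cramer's rule.
    saa = sum(a * a for a, _, _ in rows)
    sab = sum(a * b for a, b, _ in rows)
    sbb = sum(b * b for _, b, _ in rows)
    sac = sum(a * c for a, _, c in rows)
    sbc = sum(b * c for _, b, c in rows)
    det = saa * sbb - sab * sab
    if abs(det) < 1e-12:
        raise ValueError("anchors are collinear; position is not determined")
    return (sac * sbb - sab * sbc) / det, (saa * sbc - sab * sac) / det
```

With exact distances the solver returns the true position; with noisy distances it returns the point that best satisfies the linearized equations, so a position is still produced when the three circles fail to intersect.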
The external environment also has a relatively large impact on the RSSI value, as there are always unstable factors. Different application environments produce different error interference, and the application environment of the localization nodes contains many variable factors that affect the wireless signal transmission of the nodes.
In summary, both the location calculation error and the localization calculation error can be addressed in error checking and correction through the planning problem solution method.

Design of the localization algorithm for nodes to be measured with error checking and correction
The localization of anchor nodes described above mainly serves to assist the localization of the nodes to be measured. The node to be measured needs to send its localization data frame to complete the localization. The function of the anchor node is to broadcast its own coordinates; the function of the node to be measured is to store the anchor node coordinates, accumulate distances and execute the localization algorithm.
http://www.i-joe.org

Fig. 2. Flowchart of the localization algorithm for nodes to be measured
The flowchart of the localization algorithm for the nodes to be measured is shown in Figure 2. After the anchor nodes have formed the network, they broadcast their localization data frames in order through a sequence trigger mechanism, which sends out one data frame every 2 seconds. Each data frame is broadcast only once to prevent collisions between large volumes of data, and the TTL value is used to limit the broadcast range and prevent data flooding. When the localization data frame of an anchor node arrives at a node to be measured, the node checks whether it has already received a data frame from that anchor. If it has, it compares the cumulative distances of the two data frames and keeps the frame with the smaller value, so that the cumulative sum of RSSI values and the TTL are continuously updated. Once the node to be measured has received data frames from 3 or more anchor nodes, the localization algorithm can be executed. At this point, the coordinates of local anchor nodes need to be selected. The greater the hop count, the greater the cumulative error of the RSSI distance, so the localization data frames of the anchor nodes with lower hop counts should be selected where possible to ensure the localization accuracy of the node to be measured in the network.
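The frame-handling logic of the flowchart can be sketched as follows. This is a simplified reading of the description above, not the paper's implementation: the `Frame` fields, the `UnknownNode` class and its method names are hypothetical, and the distance of the last link is assumed to be supplied by the RSSI ranging step.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    anchor_id: int
    anchor_xy: tuple
    cum_distance: float  # RSSI-derived distance accumulated along the path
    hops: int
    ttl: int


@dataclass
class UnknownNode:
    best: dict = field(default_factory=dict)  # anchor_id -> best Frame so far

    def receive(self, frame, link_distance):
        """Keep the frame if its path is the shortest seen; return what to forward."""
        updated = Frame(frame.anchor_id, frame.anchor_xy,
                        frame.cum_distance + link_distance,
                        frame.hops + 1, frame.ttl - 1)
        known = self.best.get(frame.anchor_id)
        if known is not None and known.cum_distance <= updated.cum_distance:
            return None  # an equal or shorter path is already stored
        self.best[frame.anchor_id] = updated
        # Forward only while the TTL allows it, to bound the broadcast range.
        return updated if updated.ttl > 0 else None

    def ready(self):
        """Localization can start once three or more anchors have been heard."""
        return len(self.best) >= 3

    def select_anchors(self, count=3):
        """Prefer anchors reached over the fewest hops, then the shortest distance."""
        ranked = sorted(self.best.values(), key=lambda f: (f.hops, f.cum_distance))
        return ranked[:count]
```

Sorting by hop count first reflects the point that the cumulative RSSI distance error grows with the number of hops.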

RSSI ranging correction model analysis
RSSI localization is a technique that calculates the location of a node from the propagation model of radio frequency signals. It estimates the distance between the node to be measured and multiple anchor nodes from the degree of signal attenuation, and then estimates the coordinates of the node to be measured from the calculated distances. In practice, radio waves are subject to diffraction and multipath effects, which reduce the localization accuracy. We set the coordinates of the anchor nodes as $(x_i, y_i)$ and the coordinates of the node to be measured as $(x, y)$; together with the surrounding anchor nodes, the node determines a set of circles, as shown in the following formula:
$$(x - x_i)^2 + (y - y_i)^2 = d_i^2 \tag{17}$$
where $d_i$ is the estimated distance between the node to be measured and the i-th anchor node. If the interference of noise is not considered, the intersection of these circles gives the coordinates of the node to be measured, as shown in Figure 3.
Fig. 3. Intersections of circles
The locations of the anchor nodes are fixed, so the distance between any two anchor nodes is constant, and relative to the plane coordinate system, the node to be measured lies on the hyperbola with the two anchor nodes as its foci, as shown in Formula (18):
$$\sqrt{(x - x_i)^2 + (y - y_i)^2} - \sqrt{(x - x_1)^2 + (y - y_1)^2} = d_i - d_1 \tag{18}$$
where $d_i$ is the estimated distance to the i-th anchor node, i=2,3…M, and M represents the number of anchor nodes in the network. The localization performance is affected by the accuracy of the path loss model. If the loss model estimates the propagation distance accurately, the localization accuracy increases; otherwise, it is poor. Here we use an RSSI-based circular correction method to address the accuracy problem. The overall design is as follows. First, the node to be measured receives the data frames sent by the anchor nodes around it and obtains their RSSI values; it then sorts the received RSSI values, finds the three surrounding anchor nodes with the strongest RSSI values, and calculates its coordinates using trilateration and other related algorithms. Second, the anchor node with the highest RSSI value is masked, and the remaining two anchor nodes together with the anchor node with the next-highest RSSI value form a triangular region. The new coordinates of the node are then calculated through the localization algorithm, and the offset direction of the node can be determined. After that, the anchor node with the highest RSSI value is again removed from the remaining anchor nodes. Each time the node shifts by a certain distance, the localization is repeated to achieve a higher localization accuracy.
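The anchor selection used by the circular correction method can be sketched as a sliding window over the anchors ranked by RSSI. The sketch below covers only the selection of successive triples; the source does not spell out how the successive position estimates are combined, so that part is left out. The function name and the `rounds` parameter are hypothetical.

```python
def anchor_triples(rssi_by_anchor, rounds=3):
    """Yield successive anchor triples, dropping the strongest anchor each round."""
    # Rank anchor identifiers from strongest to weakest RSSI.
    ranked = sorted(rssi_by_anchor, key=rssi_by_anchor.get, reverse=True)
    for start in range(rounds):
        triple = ranked[start:start + 3]
        if len(triple) < 3:
            break  # not enough anchors left to form a triangle
        yield tuple(triple)
```

For readings `{"A": -50, "B": -60, "C": -55, "D": -70, "E": -65}` the triples are (A, C, B), then (C, B, E), then (B, E, D); each triple would be passed to the trilateration step in turn.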

Deployment of communication nodes
In order to test the performance of the proposed localization algorithm, the improved localization algorithm and several other localization algorithms are simulated and verified in Matlab. First, all the anchor nodes and the nodes to be measured are randomly distributed in a 100 m × 100 m square monitoring region, with the communication radius of the radio frequency being R and the experimental path loss coefficient n set to 3. To account for the impact of obstacles and reflection in the actual environment, Gaussian noise is introduced with its mean square deviation set to 2. As shown in Fig. 4, 200 sensor nodes are randomly and uniformly distributed in the square simulation area, including 60 anchor nodes and 140 nodes to be measured. The blue o indicates the nodes to be measured, and the red * indicates the anchor nodes. The improved localization algorithm and the comparative localization algorithms are each applied to the monitoring region to understand the characteristics of each algorithm and the performance improvement.
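The deployment described above can be reproduced with a few lines of Python. The paper's simulation was run in Matlab, so this is only an illustrative stand-in; the function name, the seed and the choice of treating the first 60 generated nodes as anchors are assumptions.

```python
import random


def deploy(n_total=200, n_anchors=60, side_m=100.0, seed=0):
    """Scatter nodes uniformly in a side_m x side_m square and split off the anchors."""
    rng = random.Random(seed)
    nodes = [(rng.uniform(0.0, side_m), rng.uniform(0.0, side_m))
             for _ in range(n_total)]
    # Positions are independent draws, so taking the first n_anchors
    # as anchors is equivalent to picking anchors at random.
    return nodes[:n_anchors], nodes[n_anchors:]
```

With the defaults this returns 60 anchor positions and 140 positions of nodes to be measured, and fixing the seed makes a run repeatable when comparing algorithms on the same layout.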

Simulation of the improved algorithm
Following the design and deployment of the monitoring region, sensor nodes are randomly distributed in the square monitoring region. Each node has its own communication range and scans for other nodes within that range to communicate with. Data are exchanged between nodes through communication. A sensor node can obtain its location relations with the other nodes within its communication radius, and then uses the appropriate ranging method to build the neighbor relation graph and determine whether the anchor nodes around it satisfy the localization requirements. Node localization is then carried out, as shown in Figure 5. In the localization process, the most important evaluation criterion for an algorithm is the average localization error, so the average localization error needs to be normalized. The average error formula is as follows:
$$\bar{e} = \frac{1}{N R}\sum_{i=1}^{N}\sqrt{(x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2} \tag{19}$$
where $N$ is the number of nodes to be measured, $R$ is the communication radius, $(x_i, y_i)$ is the actual position of the node to be measured and $(\hat{x}_i, \hat{y}_i)$ is its measured position. The localization error of each node can therefore be obtained from the estimated and actual locations of the node to be measured, as shown in Figure 6. Figure 6 shows the localization error diagram of the improved localization algorithm in the square simulation region. Through calculation of the neighbor relationships between nodes, the coordinates of the nodes to be measured are determined and the corresponding errors are obtained. The red * denotes the anchor nodes, the blue o denotes the nodes to be measured and the blue lines are the localization errors of the nodes.
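The normalized average localization error described above (the mean Euclidean error divided by the communication radius) can be computed as in the following Python sketch; the function name is hypothetical.

```python
import math


def normalized_mean_error(actual, estimated, radius_m):
    """Mean Euclidean localization error, expressed as a fraction of the radius R."""
    total = sum(math.dist(a, e) for a, e in zip(actual, estimated))
    return total / (len(actual) * radius_m)
```

For two nodes with errors of 5 m and 0 m and a radius of 20 m, the function returns 0.125, i.e. an average error of 12.5% of the communication radius.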

Experimental results analysis
The triangular centroid algorithm based on RSSI, the triangular weighted centroid algorithm based on RSSI and the localization algorithm with error checking and correction proposed in this paper are compared in terms of anchor node number and communication radius through simulation tests. Figure 7 shows the variation curves of the average localization errors of the three localization algorithms under different anchor node numbers. With the total number of nodes at 100 and the communication radius at 20 m, the number of anchor nodes is increased gradually from 10 to 50. The figure shows that, as the number of anchor nodes increases, the average localization errors of the nodes to be measured decrease. With 10 anchor nodes, the average localization error of the original localization algorithm is around 49% while that of the improved algorithm proposed in this paper is around 37%. With 50 anchor nodes, the average localization error of the original algorithm is around 22% and that of the improved one is about 9%. In summary, the improved algorithm proposed in this paper reduces the average localization error by about 12 percentage points relative to the original algorithm, and its average localization error is about 5 percentage points lower than those of other improved algorithms.

Conclusions
Node localization is a key technology in wireless sensor network applications and is applied in many areas of people's lives. In recent years, with the continuous development of intelligent network and communication technology, the accurate localization of target objects has become very important to network monitoring as a whole, and its research value has attracted wide attention from scholars. Because the traditional localization algorithm is relatively inefficient and its localization error is large, this paper puts forward some improvements to it. The improved algorithm improves the localization efficiency and maintains stable operation, but more importantly, it reduces the measurement errors in ranging and localization and improves the localization accuracy of WSN nodes.