1. Introduction
Internet of Things (IoT) is emerging with an expectation of billions of connected devices to the Internet. These devices utilize low-power wireless communication technologies enabling long battery life and reliable end-to-end transmissions [
1]. Capitalizing on this development, industries shift towards the Industrial Internet of Things (IIoT) through concepts such as Wireless Sensor Networks (WSN). Utilizing WSN decreases installation cost as compared to infrastructures for wired sensor networks [
2]. According to [
3], the cost of installing cables across an industrial plant can run from USD 100s/ft to USD 1000s/ft.
In 2016 the Institute of Electrical and Electronics Engineers (IEEE) 802.15.4 low-rate wireless standard [
4] was amended with a new Time-Slotted Channel Hopping (TSCH) media access control targeting industrial applications. The IPv6 over the TSCH mode of IEEE 802.15.4e (6TiSCH) working group [
5] was started by Internet Engineering Task Force (IETF) with an intent to enable IP traffic for IIoT utilizing TSCH. In addition, 6TiSCH follows IETF Deterministic Network (DetNet) working group architecture [
6], aiming to standardize layer three traffic in terms of networks requiring deterministic capabilities. Wireless networks have a higher probability of packet loss compared to wired due to interference and channel disturbance. Hence, 6TiSCH recommends packet replication in its architecture in order to increase the reliability [
5]. However, in WSN packet replication causes higher energy consumption [
7] and, thus, reduced network lifetime. Industrial networks face tough lifetime requirements that WSN struggles to meet with packet replication.
This paper proposes and investigates a novel technique utilizing a Reverse Packet Elimination (RPE) algorithm. Packet replication is used to increase the reliability of the network. In addition, RPE limits the negative effect of reduced network lifetime as the replicated packets are cancelled when the original packet successfully reached its destination. This is achieved as the RPE enables the sink to send a packet down the alternative path to eliminate the replicated packet from being transmitted upstream. The energy consumption is reduced as a RPE packet is smaller than an upstream packet, and hence utilizing less resources to traverse the path. In addition, this paper examines the effects of introducing an adjustable replication delay, where the transmission of the replicated packet is delayed. The motivation for this is to avoid transmitting the replication packet if the original packet has arrived at its destination (the sink).
3. Related Work
Our work relates to the effort of increasing the robustness of WSN conducted by The Leapfrog Collaboration (LFC) [
16,
17]. This work was recently extended in [
18] which proposes PAREO where packet retransmission, replication, and overhearing is evaluated in combination. However, to the best of our knowledge, no related work exists in terms of a reverse mechanism or replication delay. Although, in [
19] the authors visit the idea of a reverse packet mechanism in Ethernet, with the reversing of packet elimination to free bandwidth upstream. However, the proposal is not further discussed or investigated.
Packet replication and elimination techniques in WSN increases energy consumption significantly. Such a rise in energy consumption is expected as the number of packets traversing the network at minimum doubles. This is shown in [
7], where the authors implement disjoint paths and transmit a replicated packet at each path. In short, the source duplicates the packet and the sink drops the last arriving packet. The authors show that packet replication decreased packets lost with almost 90%, however, the energy consumption in their scenario increased with 86.3%.
Leapfrog Collaboration (LFC) utilizes the same principle as [
7] by dropping the last arriving packet. However, LFC implements replication and dropping at each hop. By taking advantage of promiscuous overhearing, LFC utilizes a Leapfrog Beacon which enables all nodes to make the same cell available for receiving (RX). Opening the same cell allows all nodes except the node transmitting to listen for data, and LFC uses the overhearing function to double the probability that the packet will traverse over the DODAG network. As a result, delay decreases significantly as the packet propagates the favorable path. If no nodes are within sensing range, normal retransmission schemes are used. LFC achieves minimum end-to-end reliability of 99.1% for all simulations, with the lowest packet delivery ratio (PDR) link quality at 70% [
17]. However, with a drawback as the energy consumption increases. LFC reduces the average delay by 41% compared to schemes with four retransmission attempts [
17].
4. Reverse Packet Elimination
This paper proposes a novel Reverse Packet Elimination (RPE) algorithm that aims at increasing reliability while limiting the negative effect of reduced network lifetime. Similar to [
7] two disjoint paths are created from the source to the sink. The motivation for using packet replication is to increase the reliability of the WSN. It is assumed, as in [
7], that the scheduler establishes the paths or tracks based on routing and connectivity information obtained through the control plane links [
20]. The disjoint paths or tracks may be computed by a centralized PCE running, e.g., the Traffic Aware Scheduling Algorithm (TASA) as proposed by [
21]. The source node replicates its data packet, that is one candidate packet for each path. However, the node transmits initially only one of the packets and detains the second. The following description discusses the motivation and criterion for delaying the second packets. Forwarding two packets increases the traffic load and thus the energy usage in the network. Furthermore, as the packets have the same size, they have the same drop probability over lossy links (links with high Bit Error Rate (BER)). In addition, adding traffic increases the funneling effect. The first packet that arrives at the sink, triggers the sink to send a small RPE-cancelling packet down the alternative path from the path of the packet.
The RPE-cancelling packet searches for the copy that is propagating upstream and cancels it, i.e., stop further forwarding of the packet in order to conserve energy. The basic structure of the RPE algorithm is packet replication and enabling the sink to cancel the second packet. However, a second feature is introduced, this is a delay of transmitting the second packet. Depending on the requirements for delay and robustness, the second copy can be withheld by an adjustable time delay
. This is depicted in
Figure 2.
An example network illustrating RPE is shown at the right hand side of
Figure 3. The source node is node number 7 and the sink is node number 0. From the source to the sink two disjoint paths are shown, as Path A and Path B. The source node holds initially two copies of the packet (the original and the replicated packet). Our proposed RPE method is described in six steps:
1. A PCE schedules 6TiSCH tracks in daisy chains with two disjoint paths, Path A and Path B. Path A is by design scheduled to send the first packet. As shown in
Figure 3, blue cells make up a track from source to sink (0).
2. The source node replicates the data packet and sends it along Path A. Next, the sender schedules the copied packet to be transmitted along path B, which is disjoint from A. Both paths have explicit routes towards the sink as seen in
Figure 4a. The copied packet along Path B is delayed dependent on the upper bound latency of the application. The motivation for delaying this packet will be discussed further in the following text.
3. The two packets propagate along the two disjoint paths towards the sink. As depicted in
Figure 4b, the packet arriving last is discarded if both packets arrive at the sink before the sink could send an RPE-cancelling packet.
4. If they do not arrive simultaneously, the sink sends an RPE-cancelling packet down the alternative path from path of the arriving packet as shown in
Figure 5a. For example, if the Path A packet arrives first, the RPE-cancelling packet is sent down Path B and vice versa.
5. The RPE-cancelling packet traverses towards the source. In each visiting node, the packet triggers a search in the node’s queues for a copy of the packet that arrived at the sink. As illustrated in
Figure 5b. If a packet is located with an identical sequence number, this packet is dropped.
6. If an RPE-cancelling packet arrives at the source (looping back) before the replicated packet is sent, the source revokes the transmission of the copied packet.
4.1. Packet Size
In our investigations, we utilize a max packet size of 127 bytes for data packets and the RPE-cancelling packets are set to 23 bytes. 23 bytes is the minimum packet size of an 802.15.4 frame carrying the necessary headers and two bytes containing the RPE-cancelling payload.
Link quality is defined by its BER and the BER differs for each link according to its propagation characteristics. Assuming the BER of each bit is independent, if there is a bit error in a packet, the packet is dropped. PER in Equation (
1) denotes the packet error rate for a packet with n bits.
Writing packet reception rate (PRR) as PRR = 1 − PER, the following relation is found between the PRR of two packets with different sizes (short and long packets, namely
and
), with both packets experience the same BER.
where
and
are the numbers of bits in the short packet and long packet respectively. Equation (
2) shows that the packet size matters in terms of PRR. Since the cancelling packet is significantly smaller than the larger upstream packet it has some advantages; it has higher PRR. Furthermore, the smaller RPE-cancelling packets require less transmission time as they contain fewer bits than a large data packet. This causes reduced energy consumption. Finally, due to higher PRR, the time the radio is active is lower, energy consumption decreases due to fewer retransmission attempts. From Equation (
2) we find that if a 127 bytes data packet experiences a PRR = 70%, then a 23 bytes RPE-cancelling packet experiences a PRR = 93.7% under the same BER condition.
To put this into perspective, assume the topology in
Figure 4 where the source sends replicated packets along Path A and Path B. After a while, the link quality between 1 and 3 deteriorates from 70% to 20% for the 127 bytes packet. Assume that a 127 bytes packet then has a
, a 23 bytes RPE-cancelling packet have
at the same link. This means that theoretically, a RPE-cancelling packet can traverse from 1 to 3 and stop the large packet from utilizing all its retransmission attempts. In essence, this decreases resource utilization at nodes when a change in link quality occurs. As mentioned earlier, smaller packets consume less energy than larger packets. The main contributor to energy consumption in WSN is transmitting and receiving packets [
22]. Based on this understanding the energy consumption of transmitting and receiving a packet is proportional to the packet size. Thus, transmitting short RPE-cancelling packets requires less energy than longer data packets.
4.2. Selecting Delays
This section discusses delaying the replicated data packet. Clearly, this depends on the requirements imposed on the network by the system. For example, a WSN in a control loop must supply the controller with data within a time frame in order to keep the process stable. Assume the maximum permitted delay imposed on the network is , where we have reduced the number to safeguard the system. The maximum delay is a design criterion for the delay between the source and the sink, . If the delay from source to sink meets the design criteria (), but with limited margin, the advantage of RPE is as described above. However, if there is a margin, for example, that the delay between source and sink permits that the replicated packet is delayed a time, , before it is transmitted then the second advantage of RPE comes to play. RPE provides energy savings and preserves the robustness when the replicated packet can be delayed a time before it is transmitted. This means that if the first packet is lost, the replicated packet meets the deadline as it reaches the sink. In order to reduce the traffic, RPE enables the sink to cancel unnecessary transmission of the replication packets by sending RPE-cancelling packets. As pointed out earlier, this reduces the energy consumption in the WSN. In addition, it reduces the funneling effect caused by added data traffic imposed on the nodes close to the sink (the topic is presented in the following text). Next, we discuss the time delay, .
Selected replication delays are as follows: 1, 8, 816, 1624. As mentioned earlier is always given in slots, and it is the delay from the time the first copy was sent. Our assessment of delaying the replication packet is based on three scenarios. Firstly, we assess the situation where both packets are transmitted without waiting for the cancelling packet. Secondly, we assess the situation where the delay is larger than the round-trip time from the source to the sink. This means that the replication packet is delayed so much that the first packet reaches the sink and the cancelling packet reaches back to the source. In our example, this equals eight slots. However, if the first packet experiences delay caused by retransmission, the second packet is transmitted. Delaying the replication packet further takes retransmission into account. Thirdly, we assess the situation where the replication packet is delayed to the extent that both the path A packet and the RPE-cancelling packet have exhausted all their retransmission attempts.
When
slot, the source always sends two copies. This delay has the lowest latency (for both packets) since both packets have the shortest scheduled delay. Moreover, both packets traverse Path A and Path B towards the sink with a time difference of a slot as illustrated in
Figure 6. As can be seen in the figure, the RPE-cancelling packet, from the sink towards the source, must wait for one slot. The reason is that according to the scheduler, the sink must first listen for incoming packets in the
-cell (Path B). If the replicated packet did not arrive, the next cell, marked as
, is allocated to the RPE-cancelling packet. Clearly, switching these two cells is an option, saving one transmission from node 2, at the expense of increasing the delay along path B by one slot.
The next delay is
= 8,
Figure 7 depicts the scheduler.
= 8 slots are selected based on the number of hops between source and sink, since there are 4 hops to sink and 4 hops in return to the source. As can be seen, the Path B packet will not be sent if the Path A packet arrives successfully according to the scheduler, and the reverse path back to the source do likewise. When
= 8 slots, the Path A packet has enough time to reach the sink and trigger a message to stop the source from transmitting the replicated packet.
If , the replicated packet is scheduled with a delay of at least one slotframe later. A delay of = 1624 slots is the maximum latency of the packet looping back to the source in this topology. The delay equals 16 slotframes. In essence, the Path A packet is given the time to utilize all transmission attempts before the Path B packet is transmitted. A transmission fails if a packet was lost or its acknowledgement (ACK) fails during transmission. Hence, nodes have several attempts to ensure successful transmission. However, if a transmission fails in the current configuration, a node has to wait until the next slotframe iteration before trying again. Finally, = 824 slots was selected to investigate the difference between the short and the long delays. With = 824 slots, the delay is cut from 16 slotframes to eight slotframes. The delays are topology specific except = 1 slot.
4.3. Path Considerations
RPE assumes a routing protocol capable of providing upward routes for data and replicated data, as well as downward routes for RPE-cancelling packets. The RPL routing protocol which is part of the 6TiSCH suite offers both these capabilities. Since the downward RPE-cancelling packet always originates at the sink node, both storing and non-storing RPL mode may be utilized with RPE.
The RPL preferred parent selection needs to be adapted such that the source node can employ both Path A and Path B to forward traffic. Such a mechanism is implemented in [
18]. Selection of different parents, namely Preferred Parent and Alternative Parent, is in itself a field to be studied, and is not treated further here. An example of work in this direction can be found [
23].
RPE requires routing and accompanying schedules to yield disjoint paths through the network. Investigation into this assumption, such as, e.g., the requirement on network density, is deferred to future work. As described above, for our evaluation we assume a centralized PCE as envisioned in, e.g., 6TiSCH to provide this capability. This centralized strategy seems to be an intuitive solution for these cases since it has an end-to-end awareness such that disjoint paths can be guaranteed. The 6TiSCH hop-by-hop scheduling model [
5] may also be feasible since it has the same end-to-end property.
6. Performance Results
Our hypothesis is that RPE increases the WSN’s reliability without substantial decreased network lifetime. Presumably, RPE can only support system where its critical time delay compared to the WSN round trip time (from source to sink) meets the requirements presented here. This section assesses the performance of RPE in light of these challenges. As seen in
Figure 8, nodes are named according to their path and hop count, such as A2 or B1.
6.1. Reliability
The RPE method is investigated for a WSN with links that experience packet drops. Following the same investigations as LFC [
16], we assess links with PRR 70% increasing in increments of 10% until a 100% PRR. RPE, with
= 8 slots, is compared with the Single Path scenario. It was found that all the scenarios involving packet replication showed a negligible difference in reliability as from the sinks point of view the delay
does not affect the PRR. Hence, in
Figure 9 SP is only compared to RPE
for simplicity.
Figure 9 shows that RPE increases reliability compared to Single Path. Single Path was found to have a reliability of 88.24% when the link quality was 70%. At the same link quality, RPE increases the reliability to 98.65%. Moreover, the reliability of WSN with links with link quality of 80% shows significant improvements as Single Path had a packet reception ratio of 97.7% compared to RPE with a success rate of 99.95%. Lastly, at 90% link quality, the gap in reliability shrinks to a difference of 0.09% as RPE delivers a 100.0% delivery ratio. However, there is packet loss involved, and over time statistically, a packet will be dropped.
6.2. Distribution of RPE-Cancelling Packets
Clearly, our RPE method creates additional traffic in the network compared to a single path. In order to investigate the negative effects of increased traffic, we investigated where the packets are eliminated.
Figure 10 and
Figure 11 presents the findings. They show where, along the paths, the RPE packets are eliminated.
Figure 10 shows how different PRR values change the elimination location. The reduced link quality causes the elimination to spread across all nodes.
Figure 11 shows at which node the replicated packet is eliminated when the link quality is 80%. In the figure various RPE delays are compared with RPE Overprovisioning (OP). It shows how
= 816 and
= 1624 yielded similar results. However,
= 816 experiences that a few packets were dropped along Path B, but over 30 runs the average value recorded was less than 0.1%. Although, at node B3 an average of 4 packets were eliminated accounting for 0.24% of the drops. Summarizing,
= 816 and
= 1624 shows 99.76% and 100% packet elimination occur at the sink. Delay effects are discussed in more detail in the following text.
6.3. Latencies
This section presents the simulation results with regards to latency in data sent from the source to the sink. To provide a baseline,
Table 2 shows how the maximum latency increase with longer replication delays and minimum theoretical increase when overprovisioning.
The simulated latencies show the average, minimum, maximum, and the 99th percentile. Maximum latency presents the highest value recorded for all runs. What is more, maximum latencies listen in
Table 3 equals theoretical maximum latencies.
The latencies of the 99th percentile are presented in
Table 4. Single Path has a higher 99th percentile for all link qualities compared to RPE with
= 1 and
= 8. At PRR = 90% the difference is 1.89 s between RPE with
= 1 and Single Path. Increasing
from one slot to eight slots increases 99th percentile latency with 40 ms. This is interesting as the difference in sending the replicated packet is increased from 10 ms to eight times as high at 80 ms, but 99th percentile latency is only increased with 40 ms. Decreasing the link quality increases the 99th percentile latencies for all scenarios. What is more, the difference between
= 1 and
= 8 is fairly similar for each value of PRR (70%, 80%, and 90%).
6.4. Node Lifetime
The network lifetime is an important parameter; it has, however, several definitions [
25]. Here we adopt the definition that the network lifetime equals the time until the energy of the first node depletes. In our simulations, we have utilized a battery size of 2821.5 mAh.
Figure 12 presents the average node lifetimes in years for all simulations. Due to the extra TX/RX cell when Overprovisioning (OP), Path A has a significantly lower battery lifetime compared to Path B. Comparing Path A of OP with alternative RPE shows that OP has the lowest lifetime. It shows an imbalance as Path A is over-utilized and Path B not utilized. Dual Path (DP) also has worse performance at each node, compared to all simulations using RPE.
All simulations show a correlation between lifetime and where the upstream copy was eliminated. = 816, = 1624 and OP has the highest number of packets eliminated at the source. Hence, the source has the highest lifetime, as it does not send the replicated packet as often as for the other scenarios. It is very interesting to note the decreased load on node A1 when = 1. This is due to the frequency of alternating drops at sink, A1 and B1. On the other hand, it causes the source to have the lowest battery life of all RPE scenarios. The reason for the lowest battery life at the source is as mentioned in the selection of delays, that the source always sends both copies.
We found that for all simulations the first node to run out of power is A1, except in
= 1 and
= 8 were in approximately
simulations B1 had the shortest lifetime.
Figure 13 presents the lowest node lifetimes. As seen previously in
Figure 11, no elimination was registered when using
= 816 or
= 1624, causing traffic to behave as DP and SP. This is reflected in the lowest lifetime as for both scenarios it is within 7 days of SP and DP. The effects of OP are clear with A1 running out of battery 338 days before any other version due to the over-utilization at Path A. As mentioned in
Section 4.1, packet size matters, and the effects of utilizing RPE is clear on the node lifetimes.
6.5. Funneling Effect
The funneling effect occurs as the nodes closer to the sink have a higher load than nodes further away, due to the concentration of traffic on fewer nodes. This is shown in
Figure 14 where the average lifetime at A1 and B1 for all versions using packet replication are listed. The short delay,
= 1, return the evenest distribution between A1 and B1 since packet elimination occurs at a higher frequency at these nodes as seen in
Figure 14. In addition,
= 8 decreases the funneling effect as well, but not as much as
= 1. On the other hand, increasing replication delay
decreases the load on Path B. Moreover, overprovisioning has the lowest latency, resulting in the highest load on Path A, and lowest load on Path B. As previously mentioned, this is due to the over-utilization on Path A. The funneling effect can be seen for all nodes in
Figure 14 where lifetime increases with distance from the sink.
6.6. Average Energy Consumption
Energy consumption is critical in industrial WSN. As stated in the introduction, our hypothesis is that RPE reduces energy consumption. The energy consumption is found from the current consumption, since the voltage is constant. Different WSN designs produce different average current consumption as topology influences energy consumption. The average accumulated current consumption is shown in
Figure 15. The effects of packet replication are clear, as shown in
Figure 15, Single Path utilizes 159 mAh less than second-place
= 1624. This illustrates the energy cost of replication in general.
The effects of overprovisioning are seen with the highest accumulated current consumption at 1225 mAh. This increase is due to the extra cells, which causes more idle listening than the other scenarios.
Another key point is that the RPE scenarios have a lower average current consumption compared to Dual Path. A decrease in average current consumption is important as this topology does not represent a complete network. In fact, real network data traverse from other paths as well. Hence, decreased current consumption decreases due to reduced load on certain nodes due to funneling effects.
7. Discussions
Our analysis follows the methods used by LFC, and as LFC we investigate only a section of a WSN. A complete WSN with several paths puts requirements on the network design as two paths must be available for critical traffic flows. Our RPE method imposes only the requirement that the two paths are disjoint. Paths may hold a different number of nodes; this difference will alter the performance in terms of finding the optimal delay of the replication packet. Clearly, the method depends on the requirements imposed on the WSN in terms of delay tolerances and the performance of the WSN. Our results show optimal delay equals the time delay from the source to the sink and back along the second path. In our network, this was a delay of eight slots. The reason is that in an environment with no packet losses, the tracks allow a packet to loopback and stop the source from sending the replicated packet. Based on the results with 80% link quality, = 8 achieves the best traits of all replication delays. Comparing our RPE with scenarios Dual Path and Single Path with PRR = 80%, the lowest node lifetime increased with 1.74%. Next, the average latency is reduced with 39.1% and reliability increases from 97.7% to 99.95%, when comparing our RPE with Single Path. A natural disadvantage with the RPE proposal is the increase in average current consumption compared to using a Single Path. This is due to extra tracks and packets traversing disjoint paths. However, packet replication is recommended by DetNet architecture and RPE decreases the average current consumption compared to Dual Path, increases with 18% compared to Single Path. Finally, if low average latency is critical, it is recommended to utilize Overprovisioning. Overprovisioning has a 71.2% lower average latency than . However, this comes with an 18.34% reduction in the lowest node lifetime.
8. Conclusions
A novel method, Reverse Packet Elimination (RPE), for packet replication is proposed and analyzed in 6TiSCH. RPE delays the replicated packet from being transmitted and implements a method for cancelling the transmission of the replication packet if the original packet reached its destination. RPE enables WSN with higher reliability, lower average latency and increased node lifetime compared to traditional WSN. Clearly, as with any packet replication method the traffic increases and thus the average energy consumption in the network increases. However, the method of cancelling the replication packet proves effective as it reduces unnecessary transmissions. Related work Leapfrog Collaboration (LFC) finds a 99.1% worst case reliability, with the disadvantage of increasing energy consumption. Our method has similar reliability as LFC, as we got worst-case reliability of 97.7%. Another key point, RPE decreases the funneling effect, thus reducing the energy consumption of nodes closer to the sink and increasing the WSN lifetime. In fact, the lowest node lifetime increased by 1.74% compared to not utilizing any packet replication. However, a drawback of RPE versus LFC is a slight increase in average latency due to the delayed replication packet. This implies that RPE is not optimized for WSN with the most extreme latency requirements. LFC reports a 41% reduction in average latency compared to standard retransmission schemes, while we found that RPE has a 39.1% reduction with equal retransmission attempts. Moreover, LFC is by design dependent on high node density. RPE requires only two disjoint paths to the sink. This makes RPE less dependable on topology and presumably suitable for other WSNs.
9. Future Work
The proposed RPE method uses dedicated RPE-cancelling packets for signaling the successful arrival of the packet. Alternative signaling to reduce overhead may be explored such as piggybacking the cancelling-information on other traffic. Clearly, the solution depends on the latency requirements and whether there is any communication from the receiver to the source. Furthermore, the efficiency of cancellations may be improved since the current solution mandates one signal per replicated packet. Lastly, as proposed in RAW [
15], the necessity to utilize a mechanism such as RPE could be evaluated continuously by the forwarding plane.
Identifying and selecting paths to use with RPE clearly has impact on its performance. This subject was outside of our current scope yet requires investigation before a full implementation can be made. It should also include utilizing RPE over two paths with differing characteristics such as link quality and hop count.
The evaluation of RPE needs to be broadened to include more realistic propagation models and different traffic patterns, as well as varied topologies. These could include larger networks to observe scalability, and topologies from [
18] for comparison with PAREO mechanisms. Experiments in real-world scenarios and testbeds would also be valuable.
Our implementation is limited to two disjoint paths, but extra paths could be implemented if the network has a denser topology. This would increase reliability further with a drawback of increased energy consumption.
In the current design, the traffic is always scheduled at Path A. However, implementing load balancing decreases energy consumption by distributing the load at both paths.