
Collaborative Reinforcement Learning Based Unmanned Aerial Vehicle (UAV) Trajectory Design for 3D UAV Tracking

Published: 01 December 2024

Abstract

In this paper, the problem of using one active unmanned aerial vehicle (UAV) and four passive UAVs to localize a target UAV in 3D space in real time is investigated. In the considered model, each passive UAV receives reflection signals from the target UAV, which are initially transmitted by the active UAV. The received reflection signals allow each passive UAV to estimate the signal transmission distance, which is then reported to a base station (BS) that estimates the position of the target UAV. Due to the movement of the target UAV, each active and passive UAV must optimize its trajectory to continuously localize the target UAV. Meanwhile, since the accuracy of the distance estimation depends on the signal-to-noise ratio of the transmitted signals, the active UAV must also optimize its transmit power. This problem is formulated as an optimization problem whose goal is to jointly optimize the transmit power of the active UAV and the trajectories of both the active and passive UAVs so as to maximize the positioning accuracy of the target UAV. To solve this problem, a Z function decomposition based reinforcement learning (ZD-RL) method is proposed. In contrast to value function decomposition based RL (VD-RL), the proposed method learns the probability distribution of the sum of future rewards rather than only its expected value, which enables a more accurate estimate of that expectation and, in turn, yields a better transmit power for the active UAV and better trajectories for both the active and passive UAVs, thereby improving the positioning accuracy of the target UAV. Simulation results show that the proposed ZD-RL method can reduce the positioning errors by up to 39.4% and 64.6% compared to VD-RL and independent deep RL methods, respectively.
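
To make the contrast between ZD-RL and VD-RL concrete, the sketch below illustrates how a distributional (quantile-based) joint return can be formed from per-agent return distributions, with the expected joint return read off the distribution rather than regressed directly as a single scalar. This is a minimal illustrative sketch, not the authors' implementation; the additive quantile combination rule and all names (N_AGENTS, N_QUANTILES, agent_quantiles) are assumptions chosen only for exposition.

    # Minimal sketch (not the paper's implementation) of the idea behind
    # Z function decomposition: each agent keeps a distribution over its
    # future return, represented by quantile values, and the joint return
    # distribution is formed by combining the per-agent quantiles. The
    # expected joint return is then computed from that distribution,
    # instead of being estimated directly as one scalar as in VD-RL.
    import numpy as np

    N_AGENTS = 5       # assumption: 1 active + 4 passive UAVs, as in the considered model
    N_QUANTILES = 32   # assumption: resolution of each per-agent return distribution

    rng = np.random.default_rng(0)

    # Per-agent quantile estimates of the local return Z_i(s, a_i);
    # random placeholders standing in for per-agent network outputs.
    agent_quantiles = np.sort(rng.normal(size=(N_AGENTS, N_QUANTILES)), axis=1)

    # Joint return distribution via an additive quantile mixture (assumed
    # combination rule): the k-th joint quantile is the sum of the agents'
    # k-th quantiles.
    joint_quantiles = agent_quantiles.sum(axis=0)

    # Expected joint return, i.e. the quantity VD-RL would estimate directly.
    expected_return = joint_quantiles.mean()

    # Spread information that a scalar value function cannot provide.
    return_std = joint_quantiles.std()

    print(f"E[Z_tot] ~ {expected_return:.3f}, std ~ {return_std:.3f}")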

Cited By

• Decentralized Navigation With Heterogeneous Federated Reinforcement Learning for UAV-Enabled Mobile Edge Computing, IEEE Transactions on Mobile Computing, vol. 23, no. 12, pp. 13621–13638, Dec. 2024. DOI: 10.1109/TMC.2024.3439696


            Information & Contributors

            Information

            Published In

            cover image IEEE Transactions on Mobile Computing
            IEEE Transactions on Mobile Computing  Volume 23, Issue 12
            Dec. 2024
            4601 pages

            Publisher

            IEEE Educational Activities Department

            United States

            Publication History

            Published: 01 December 2024

            Qualifiers

            • Research-article
