
Collaborative Reinforcement Learning Based Unmanned Aerial Vehicle (UAV) Trajectory Design for 3D UAV Tracking

Published: 01 December 2024

Abstract

In this paper, the problem of using one active unmanned aerial vehicle (UAV) and four passive UAVs to localize a target UAV in 3D space in real time is investigated. In the considered model, each passive UAV receives reflection signals from the target UAV, which are initially transmitted by the active UAV. The received reflection signals allow each passive UAV to estimate the signal transmission distance, which is then reported to a base station (BS) that estimates the position of the target UAV. Due to the movement of the target UAV, each active and passive UAV must optimize its trajectory to continuously localize the target UAV. Meanwhile, since the accuracy of the distance estimation depends on the signal-to-noise ratio of the transmitted signals, the active UAV must also optimize its transmit power. This problem is formulated as an optimization problem whose goal is to jointly optimize the transmit power of the active UAV and the trajectories of both the active and passive UAVs so as to maximize the positioning accuracy of the target UAV. To solve this problem, a Z function decomposition based reinforcement learning (ZD-RL) method is proposed. In contrast to value function decomposition based RL (VD-RL), the proposed method learns the probability distribution of the sum of future rewards rather than only its expected value, which enables a more accurate estimate of that expectation and, in turn, yields a better transmit power for the active UAV and better trajectories for both the active and passive UAVs, thereby improving the positioning accuracy of the target UAV. Simulation results show that the proposed ZD-RL method can reduce the positioning errors by up to 39.4% and 64.6% compared to VD-RL and independent deep RL methods, respectively.
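
To make the contrast between ZD-RL and VD-RL concrete, the sketch below illustrates how a distributional (quantile-based) joint return can be formed from per-agent return distributions, with the expected joint return read off the distribution rather than regressed directly as a single scalar. This is a minimal illustrative sketch, not the authors' implementation; the additive quantile combination rule and all names (N_AGENTS, N_QUANTILES, agent_quantiles) are assumptions chosen only for exposition.

    # Minimal sketch (not the paper's implementation) of the idea behind
    # Z function decomposition: each agent keeps a distribution over its
    # future return, represented by quantile values, and the joint return
    # distribution is formed by combining the per-agent quantiles. The
    # expected joint return is then computed from that distribution,
    # instead of being estimated directly as one scalar as in VD-RL.
    import numpy as np

    N_AGENTS = 5       # assumption: 1 active + 4 passive UAVs, as in the considered model
    N_QUANTILES = 32   # assumption: resolution of each per-agent return distribution

    rng = np.random.default_rng(0)

    # Per-agent quantile estimates of the local return Z_i(s, a_i);
    # random placeholders standing in for per-agent network outputs.
    agent_quantiles = np.sort(rng.normal(size=(N_AGENTS, N_QUANTILES)), axis=1)

    # Joint return distribution via an additive quantile mixture (assumed
    # combination rule): the k-th joint quantile is the sum of the agents'
    # k-th quantiles.
    joint_quantiles = agent_quantiles.sum(axis=0)

    # Expected joint return, i.e. the quantity VD-RL would estimate directly.
    expected_return = joint_quantiles.mean()

    # Spread information that a scalar value function cannot provide.
    return_std = joint_quantiles.std()

    print(f"E[Z_tot] ~ {expected_return:.3f}, std ~ {return_std:.3f}")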

Cited By

• Decentralized Navigation With Heterogeneous Federated Reinforcement Learning for UAV-Enabled Mobile Edge Computing, IEEE Transactions on Mobile Computing, vol. 23, no. 12, pp. 13621–13638, Dec. 2024. DOI: 10.1109/TMC.2024.3439696


            Information & Contributors

            Information

            Published In

            cover image IEEE Transactions on Mobile Computing
            IEEE Transactions on Mobile Computing  Volume 23, Issue 12
            Dec. 2024
            4601 pages

            Publisher

            IEEE Educational Activities Department

            United States

            Publication History

            Published: 01 December 2024

            Qualifiers

            • Research-article
