Learning to Communicate with Reinforcement Learning for an Adaptive Traffic Control System

Simon Vanneste,
Gauthier de Borrekens,
Stig Bosmans,
Astrid Vanneste,
Kevin Mets,
Siegfried Mercelis,
Steven Latré &
Peter Hellinckx

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 343)

Included in the following conference series:

International Conference on P2P, Parallel, Grid, Cloud and Internet Computing

Abstract

Recent work in multi-agent reinforcement learning has investigated inter-agent communication that is learned simultaneously with the action policy in order to improve the team reward. In this paper, we compare independent Q-learning (IQL), which uses no communication, with differentiable inter-agent learning (DIAL), which learns a communication protocol, on an adaptive traffic control system (ATCS). In a real-world ATCS, it is impossible to present the full state of the environment to every agent, so in our simulation each agent receives only a limited observation of that state. The ATCS is simulated in the Simulation of Urban MObility (SUMO) traffic simulator as two connected intersections. Each intersection is controlled by an agent that can change the direction of the traffic flow. Our results show that a DIAL agent outperforms an independent Q-learner in both training time and maximum achieved reward, as it is able to share relevant information with the other agents.
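
To illustrate how DIAL keeps the communication channel learnable, the sketch below shows the discretise/regularise unit (DRU) from the original DIAL formulation. During training the real-valued message is perturbed with Gaussian noise and passed through a logistic function, so gradients can flow from the receiving agent back to the sender. During execution the message is thresholded to a discrete bit. This is a minimal sketch and not the implementation used in this paper: the function name `dru`, the noise level `noise_std=2.0`, and the two-channel example message are assumptions made for illustration.

```python
import numpy as np


def dru(message, training, noise_std=2.0, rng=None):
    """Discretise/regularise unit applied to a real-valued message.

    noise_std is an assumed value chosen for illustration only.
    """
    message = np.asarray(message, dtype=float)
    if training:
        # Training: add Gaussian noise, then squash with a logistic function.
        # The output stays continuous, so gradients can pass between agents.
        if rng is None:
            rng = np.random.default_rng()
        noisy = message + rng.normal(0.0, noise_std, size=message.shape)
        return 1.0 / (1.0 + np.exp(-noisy))
    # Execution: threshold at zero to obtain one discrete bit per channel.
    return (message > 0).astype(float)


# Hypothetical two-channel message produced by one intersection agent.
raw_message = np.array([-1.5, 0.3])
train_out = dru(raw_message, training=True, rng=np.random.default_rng(0))
exec_out = dru(raw_message, training=False)
```

An independent Q-learner has no such channel: each agent updates its own value estimates from its local observation and the team reward alone, which is the baseline the abstract compares against.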

Acknowledgements

This work was supported by the Research Foundation Flanders (FWO) under Grant Number 1S94120N and Grant Number 1S12121N. We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan Xp GPU used for this research.

Author information

Authors and Affiliations

IDLab, Faculty of Applied Engineering, University of Antwerp - imec, Sint-Pietersvliet 7, 2000, Antwerpen, Belgium
Simon Vanneste, Gauthier de Borrekens, Stig Bosmans, Astrid Vanneste, Siegfried Mercelis & Peter Hellinckx
IDLab, Department of Computer Science, University of Antwerp - imec, Sint-Pietersvliet 7, 2000, Antwerpen, Belgium
Kevin Mets & Steven Latré

Corresponding author

Correspondence to Simon Vanneste.

Editor information

Editors and Affiliations

Dept of Info and Communication Engg, Fukuoka Institute of Technology, Fukuoka, Japan
Leonard Barolli

About this paper

Cite this paper

Vanneste, S. et al. (2022). Learning to Communicate with Reinforcement Learning for an Adaptive Traffic Control System. In: Barolli, L. (eds) Advances on P2P, Parallel, Grid, Cloud and Internet Computing. 3PGCIC 2021. Lecture Notes in Networks and Systems, vol 343. Springer, Cham. https://doi.org/10.1007/978-3-030-89899-1_21

DOI: https://doi.org/10.1007/978-3-030-89899-1_21
Published: 20 October 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-89898-4
Online ISBN: 978-3-030-89899-1
eBook Packages: Engineering, Engineering (R0)
