DOI: 10.1145/3378184.3378194
research-article

deep-MARLIN: Using Deep Multi-Agent Reinforcement Learning for Adaptive Traffic Light Control

Published: 17 February 2020

Abstract

Almost every major city in the world faces significant economic losses caused by traffic congestion. In this context, it has already been shown that Adaptive Traffic Light Control (ATLC) can be an effective way to improve a variety of traffic-related metrics. The ATLC problem can be modeled in various ways, with reinforcement learning being one of the most promising frameworks. In particular, Multi-Agent Reinforcement Learning (MARL) can be a suitable approach for learning to adaptively control the traffic of realistic road networks. Among MARL algorithms, Multi-Agent Reinforcement Learning for Integrated Network (MARLIN) stands out: it has been shown to be particularly well suited to ATLC and to produce remarkable results. MARLIN models the multi-agent framework as a stochastic game, which provides an explicit coordination mechanism for the agents. However, in MARLIN the feasible size of the state and action spaces is limited, and the features of the state representation must be hand-crafted. To overcome these limitations, this study combines the algorithm with function approximation using artificial neural networks. The resulting algorithm, deep-MARLIN, is explained and benchmarked in a large and realistic traffic environment using Simulation of Urban MObility (SUMO). The results indicate that, compared to MARLIN, deep-MARLIN converges faster to a policy that produces lower average vehicle delay.
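
To make the idea of replacing a Q-table with a function approximator concrete, the following is a minimal sketch and not the authors' implementation. It assumes a single intersection agent that scores joint actions with one neighbor using a tiny one-hidden-layer network trained by a semi-gradient Q-learning step. The class `TinyQNetwork`, the helper `joint_index`, the four-element state vector, the layer sizes, the learning rate, and the reward value are all hypothetical choices made for illustration; the abstract does not specify any of them.

```python
import random


def joint_index(own_action, neighbor_action, n_neighbor_actions):
    """Flatten an (own, neighbor) action pair into a single output index."""
    return own_action * n_neighbor_actions + neighbor_action


class TinyQNetwork:
    """Hypothetical one-hidden-layer ReLU network: state vector -> joint-action Q-values."""

    def __init__(self, state_dim, n_joint_actions, hidden=8, lr=0.05, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.gauss(0.0, 0.1) for _ in range(hidden)] for _ in range(state_dim)]
        self.b1 = [0.0] * hidden
        self.w2 = [[rng.gauss(0.0, 0.1) for _ in range(n_joint_actions)] for _ in range(hidden)]
        self.b2 = [0.0] * n_joint_actions
        self.lr = lr

    def _forward(self, state):
        # Hidden layer with ReLU, then a linear output layer.
        hidden = [
            max(0.0, self.b1[j] + sum(x * row[j] for x, row in zip(state, self.w1)))
            for j in range(len(self.b1))
        ]
        q = [
            self.b2[k] + sum(h * row[k] for h, row in zip(hidden, self.w2))
            for k in range(len(self.b2))
        ]
        return hidden, q

    def q_values(self, state):
        return self._forward(state)[1]

    def td_update(self, state, action, reward, next_state, gamma=0.95):
        """One semi-gradient Q-learning step; returns the TD error before the update."""
        hidden, q = self._forward(state)
        target = reward + gamma * max(self.q_values(next_state))  # treated as a constant
        td_error = target - q[action]
        step = self.lr * td_error
        # Gradient of Q[action] w.r.t. each hidden pre-activation, taken
        # before the output weights are modified.
        grad_hidden = [self.w2[j][action] if hidden[j] > 0 else 0.0
                       for j in range(len(hidden))]
        for j, h in enumerate(hidden):
            self.w2[j][action] += step * h
        self.b2[action] += step
        for j, g in enumerate(grad_hidden):
            self.b1[j] += step * g
            for i, x in enumerate(state):
                self.w1[i][j] += step * x * g
        return td_error


# Usage with made-up numbers: 2 own phases x 2 neighbor phases, reward = negative delay.
net = TinyQNetwork(state_dim=4, n_joint_actions=2 * 2)
state = [1.0, 0.5, 0.0, 0.2]  # hypothetical features, e.g. normalized queue lengths
action = joint_index(own_action=1, neighbor_action=0, n_neighbor_actions=2)
net.td_update(state, action, reward=-1.0, next_state=state)
```

The point of the sketch is only the contrast with a table: the number of parameters depends on the layer sizes rather than on the number of distinct states, so the state can be a raw feature vector instead of a hand-discretized index.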

Cited By

  • (2023) eMARLIN: Distributed Coordinated Adaptive Traffic Signal Control with Topology-Embedding Propagation. Transportation Research Record: Journal of the Transportation Research Board. DOI: 10.1177/03611981231184250. Online publication date: 11-Jul-2023
  • (2023) Incremental Reinforcement Learning with Prioritized Sweeping for Traffic Signal Control. 2023 IEEE 8th International Conference on Intelligent Transportation Engineering (ICITE), 21-28. DOI: 10.1109/ICITE59717.2023.10733874. Online publication date: 28-Oct-2023

    Published In

    APPIS 2020: Proceedings of the 3rd International Conference on Applications of Intelligent Systems
    January 2020
    214 pages
    ISBN:9781450376303
    DOI:10.1145/3378184
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Adaptive Traffic Light Control
    2. Artificial Intelligence
    3. Game Theory
    4. Machine Learning
    5. Multi-Agent
    6. Neural Networks
    7. Reinforcement Learning
    8. Stochastic Game

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    APPIS 2020

    Article Metrics

    • Downloads (Last 12 months): 23
    • Downloads (Last 6 weeks): 3
    Reflects downloads up to 21 Dec 2024
