
Libsignal: an open library for traffic signal control

Published in: Machine Learning

Abstract

This paper introduces a library for cross-simulator comparison of reinforcement learning models in traffic signal control tasks. The library implements recent state-of-the-art reinforcement learning models with extensible interfaces and unified cross-simulator evaluation metrics. It supports the simulators commonly used in traffic signal control tasks, including Simulation of Urban MObility (SUMO) and CityFlow, and multiple benchmark datasets for fair comparisons. We conducted experiments to validate our implementations of the models and to calibrate the simulators so that experiments from one simulator can serve as a reference for the other. Based on the validated models and calibrated environments, this paper compares and reports the performance of current state-of-the-art RL algorithms across different datasets and simulators. This is the first time these methods have been compared fairly on the same datasets across different simulators.



Data availability

All datasets are publicly available at https://github.com/DaRL-LibSignal/LibSignal/tree/master/data/raw_data.

Code availability

All code associated with this paper is publicly available from https://github.com/DaRL-LibSignal/LibSignal.

Notes

  1. http://sumo.sourceforge.net.

  2. https://cityflow-project.github.io/.

  3. http://www.yunqiacademy.org/poster

References

  • Ault, J., & Sharon, G. (2021). Reinforcement learning benchmarks for traffic signal control. In 35th Conference on neural information processing systems datasets and benchmarks track (Round 1).

  • Brockman, G., Cheung, V., Pettersson, L., Schneider, J., Schulman, J., Tang, J., & Zaremba, W. (2016). OpenAI Gym.

  • Cao, M., Li, V. O., & Shuai, Q. (2022). A gain with no pain: Exploring intelligent traffic signal control for emergency vehicles. IEEE Transactions on Intelligent Transportation Systems, 23(10), 17899–17909.


  • Chen, C., Wei, H., Xu, N., Zheng, G., Yang, M., Xiong, Y., Xu, K., & Li, Z. (2020). Toward a thousand lights: Decentralized deep reinforcement learning for large-scale traffic signal control. In Proceedings of the AAAI conference on artificial intelligence (vol. 34, pp. 3414–3421).

  • Chu, T., Wang, J., Codecà, L., & Li, Z. (2019). Multi-agent deep reinforcement learning for large-scale traffic signal control. IEEE Transactions on Intelligent Transportation Systems, 21(3), 1086–1095.


  • Devailly, F.-X., Larocque, D., & Charlin, L. (2021). Ig-rl: Inductive graph reinforcement learning for massive-scale traffic signal control. IEEE Transactions on Intelligent Transportation Systems, 23(7), 7496–7507.


  • Kheterpal, N., Parvate, K., Wu, C., Kreidieh, A., Vinitsky, E., & Bayen, A. (2018). Flow: Deep reinforcement learning for control in sumo. EPiC Series in Engineering, 2, 134–151.


  • Lopez, P.A., Behrisch, M., Bieker-Walz, L., Erdmann, J., Flötteröd, Y.-P., Hilbrich, R., Lücken, L., Rummel, J., Wagner, P., & Wießner, E. (2018). Microscopic traffic simulation using sumo. In 2018 21st international conference on intelligent transportation systems (ITSC) (pp. 2575–2582). IEEE.

  • Ma, J., & Wu, F. (2020). Feudal multi-agent deep reinforcement learning for traffic signal control. In Proceedings of the 19th international conference on autonomous agents and multiagent systems (AAMAS) (pp. 816–824).

  • Oroojlooy, A., Nazari, M., Hajinezhad, D., & Silva, J. (2020). Attendlight: Universal attention-based reinforcement learning model for traffic signal control. Advances in Neural Information Processing Systems, 33, 4079–4090.


  • Peng, X.B., Andrychowicz, M., Zaremba, W., & Abbeel, P. (2018). Sim-to-real transfer of robotic control with dynamics randomization. In 2018 IEEE international conference on robotics and automation (ICRA) (pp. 3803–3810). IEEE.

  • Raeis, M., & Leon-Garcia, A. (2021). A deep reinforcement learning approach for fair traffic signal control. In 2021 IEEE international intelligent transportation systems conference (ITSC) (pp. 2512–2518). IEEE.

  • Rasheed, F., Yau, K.-L.A., Noor, R. M., Wu, C., & Low, Y.-C. (2020). Deep reinforcement learning for traffic signal control: A review. IEEE Access, 8, 208016–208044.


  • Reinforcement Learning for Traffic Signal Control. https://traffic-signal-control.github.io/. Accessed 22 May 2022.

  • Rizzo, S.G., Vantini, G., & Chawla, S. (2019). Reinforcement learning with explainability for traffic signal control. In 2019 IEEE intelligent transportation systems conference (ITSC) (pp. 3567–3572). IEEE.

  • Rizzo, S.G., Vantini, G., & Chawla, S. (2019). Time critic policy gradient methods for traffic signal control in complex and congested scenarios. In Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery & data mining (pp. 1654–1664).

  • Terry, J., Black, B., Grammel, N., Jayakumar, M., Hari, A., Sullivan, R., Santos, L. S., Dieffendahl, C., Horsch, C., Perez-Vicente, R., et al. (2021). Pettingzoo: Gym for multi-agent reinforcement learning. Advances in Neural Information Processing Systems, 34, 15032–15043.


  • Tran, T.V., Doan, T.-N., & Sartipi, M. (2021). Tslib: A unified traffic signal control framework using deep reinforcement learning and benchmarking. In 2021 IEEE international conference on big data (Big Data) (pp. 1739–1747). https://doi.org/10.1109/BigData52589.2021.9671993

  • Wang, M., Wu, L., Li, J., & He, L. (2021). Traffic signal control with reinforcement learning based on region-aware cooperative strategy. IEEE Transactions on Intelligent Transportation Systems, 23(7), 6774–6785.


  • Wei, H., Chen, C., Zheng, G., Wu, K., Gayah, V., Xu, K., & Li, Z. (2019). Presslight: Learning max pressure control to coordinate traffic signals in arterial network. In Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery & data mining (pp. 1290–1298).

  • Wei, H., Xu, N., Zhang, H., Zheng, G., Zang, X., Chen, C., Zhang, W., Zhu, Y., Xu, K., & Li, Z. (2019). Colight: Learning network-level cooperation for traffic signal control. In Proceedings of the 28th ACM international conference on information and knowledge management (pp. 1913–1922).

  • Wei, H., Zheng, G., Gayah, V., & Li, Z. (2019). A survey on traffic signal control methods. arXiv preprint arXiv:1904.08117

  • Wei, H., Zheng, G., Yao, H., & Li, Z. (2018). Intellilight: A reinforcement learning approach for intelligent traffic light control. In Proceedings of the 24th ACM SIGKDD international conference on knowledge discovery & data mining (pp. 2496–2505).

  • Wei, H., Zheng, G., Gayah, V., & Li, Z. (2021). Recent advances in reinforcement learning for traffic signal control: A survey of models and evaluation. ACM SIGKDD Explorations Newsletter, 22(2), 12–18.


  • Wu, L., Wang, M., Wu, D., & Wu, J. (2021). Dynstgat: Dynamic spatial-temporal graph attention network for traffic signal control. In Proceedings of the 30th ACM international conference on information & knowledge management (pp. 2150–2159).

  • Xiong, Y., Zheng, G., Xu, K., & Li, Z. (2019). Learning traffic signal control from demonstrations. In Proceedings of the 28th ACM international conference on information and knowledge management (pp. 2289–2292).

  • Xu, B., Wang, Y., Wang, Z., Jia, H., & Lu, Z. (2021). Hierarchically and cooperatively learning traffic signal control. In Proceedings of the AAAI conference on artificial intelligence (vol. 35, pp. 669–677).

  • Yau, K.-L.A., Qadir, J., Khoo, H. L., Ling, M. H., & Komisarczuk, P. (2017). A survey on reinforcement learning models and algorithms for traffic signal control. ACM Computing Surveys (CSUR), 50(3), 1–38.


  • Yen, C.-C., Ghosal, D., Zhang, M., & Chuah, C.-N. (2020). A deep on-policy learning agent for traffic signal control of multiple intersections. In 2020 IEEE 23rd international conference on intelligent transportation systems (ITSC) (pp. 1–6). IEEE.

  • Zang, X., Yao, H., Zheng, G., Xu, N., Xu, K., & Li, Z. (2020). Metalight: Value-based meta-reinforcement learning for traffic signal control. In Proceedings of the AAAI conference on artificial intelligence (vol. 34, pp. 1153–1160).

  • Zhang, H., Feng, S., Liu, C., Ding, Y., Zhu, Y., Zhou, Z., Zhang, W., Yu, Y., Jin, H., & Li, Z. (2019). Cityflow: A multi-agent reinforcement learning environment for large scale city traffic scenario. In The world wide web conference (pp. 3620–3624).

  • Zhang, H., Liu, C., Zhang, W., Zheng, G., & Yu, Y. (2020). Generalight: Improving environment generalization of traffic signal control via meta reinforcement learning. In Proceedings of the 29th ACM international conference on information & knowledge management (pp. 1783–1792).

  • Zhao, W., Queralta, J.P., & Westerlund, T. (2020). Sim-to-real transfer in deep reinforcement learning for robotics: A survey. In 2020 IEEE symposium series on computational intelligence (SSCI) (pp. 737–744). IEEE.

  • Zheng, G., Xiong, Y., Zang, X., Feng, J., Wei, H., Zhang, H., Li, Y., Xu, K., & Li, Z. (2019). Learning phase competition for traffic signal control. In Proceedings of the 28th ACM international conference on information and knowledge management (pp. 1963–1972).

  • Zheng, G., Zang, X., Xu, N., Wei, H., Yu, Z., Gayah, V., Xu, K., & Li, Z. (2019). Diagnosing reinforcement learning for traffic signal control. arXiv preprint arXiv:1905.04716. https://doi.org/10.48550/ARXIV.1905.04716


Funding

The work was supported by NSF award #2153311.

Author information

Authors and Affiliations

Authors

Contributions

All authors contributed to the experimental design. Code and experiments are written and conducted by HM, XL, and HW. Documentation is provided by HM, XL, and LD. The manuscript was written by HM, XL, BS, and HW. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Bin Shi or Hua Wei.

Ethics declarations

Conflicts of interest

Not applicable.

Ethical approval

Not applicable.

Consent to participate

Hao Mei agrees to participate. Xiaoliang Lei agrees to participate. Longchao Da agrees to participate. Bin Shi agrees to participate. Hua Wei agrees to participate.

Consent for publication

Hao Mei agrees that his individual data and image are published. Xiaoliang Lei agrees that her individual data and image are published. Longchao Da agrees that his individual data and image are published. Bin Shi agrees that his individual data and image are published. Hua Wei agrees that his individual data and image are published.

Additional information

Editors: Emma Brunskill, Minmin Chen, Omer Gottesman, Lihong Li, Yuxi Li, Yao Liu, Zonging Lu, Niranjani Prasad, Zhiwei Qin, Csaba Szepesvari, Matthew Taylor

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Appendix

1.1 A.1: Documentation and license

LibSignal is open source and free to use and modify under the GNU General Public License 3. The code and documentation are available on GitHub at https://darl-libsignal.github.io/. The embedded traffic datasets are distributed with their own licenses from Reinforcement Learning for Traffic Signal Control (2022) and Ault & Sharon (2021), both under the GNU General Public License 3. SUMO is licensed under EPL 2.0 and CityFlow under Apache 2.0. All experiments can be reproduced from the source code, which includes all hyper-parameters and configurations. The authors will bear all responsibility in case of violation of rights, ensure access to the data, and provide the necessary maintenance.

1.2 A.2: Details of world class

The World class extracts and integrates information from the simulator and passes it to the Agent. Specifically, in the initialization phase, the World creates an engine for the user-specified simulator and reads the road network from the atomic file (in .json or .net.xml format), then creates the Intersection objects, the info_functions object, and other variables needed to describe and process the information. In the training or evaluation phase, it calls the step() function to interact with the simulator and then updates the information.

  • Intersection class. Intersection is the basic component of the World. All of the information, such as roads and phases, is stored in variables of the Intersection class.

  • info_functions object. In the World class, we provide an info_functions object to help retrieve information from the different simulator environments and update it after each simulator step. The info_functions contain state information, including lane_count, lane_waiting_count, lane_waiting_time_count, pressure, and phase, as well as metrics including throughput, average_travel_time, lane_delay, and lane_vehicles. These info_functions are later called by the Generator class, which passes the information into the Agent.

  • step() function. This is another common function shared between the different World classes. It takes in actions returned from the Agent class and passes them into the simulator for the next step of execution. An action is either sampled from the action space for exploration or calculated from the model after optimization. Generally, the action space contains eight phases; however, in highly heterogeneous traffic structures the action space may differ, and it is then provided by the simulators, whose action parameters are taken from configuration files.
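The components above can be summarized in a minimal Python skeleton. This is a hedged sketch based only on the descriptions in this appendix: the engine method names (set_tl_phase, next_step, and the get_* queries) follow CityFlow-style conventions, and the exact class and attribute names in LibSignal may differ.

```python
# Hedged sketch of a World-like wrapper around a simulator engine.
# Names follow the appendix text, not necessarily LibSignal's exact API.

class Intersection:
    """Basic component of the World: stores roads, phases, etc."""
    def __init__(self, inter_id, roads, phases):
        self.id = inter_id
        self.roads = roads
        self.phases = phases


class World:
    def __init__(self, engine, roadnet):
        self.eng = engine  # engine for the user-specified simulator
        # Build Intersection objects from the parsed road network
        self.intersections = [
            Intersection(i["id"], i["roads"], i["phases"])
            for i in roadnet["intersections"]
        ]
        # info_functions map names to callables that query the engine
        self.info_functions = {
            "lane_count": lambda: self.eng.get_lane_vehicle_count(),
            "lane_waiting_count": lambda: self.eng.get_lane_waiting_vehicle_count(),
            "average_travel_time": lambda: self.eng.get_average_travel_time(),
        }

    def step(self, actions):
        """Pass the Agent's actions into the simulator and advance one step."""
        for inter, phase in zip(self.intersections, actions):
            self.eng.set_tl_phase(inter.id, phase)
        self.eng.next_step()
        # Updated information is then retrieved via the info_functions
        return {name: fn() for name, fn in self.info_functions.items()}
```

In this sketch the Agent never touches the engine directly: it only sees the dictionary returned by step(), which is what makes the same Agent code portable across simulators.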

Listing 6 presents an example of creating a World for the CityFlow simulator environment, in three sections: (1) initialization; (2) creation of Intersections, roads, lanes, and other necessary parameters; and (3) definition of info_functions to facilitate information retrieval.

Listing 6: An example of creating a World class

1.3 A.3: Validation

To validate the performance of our PyTorch re-implementations, we compare four models originally implemented in TensorFlow. Figure 5 shows the convergence curves of MAPG, PressLight, IDQN, and CoLight in both the training and testing phases, which are not provided in Sect. 4.1. The final performance in Table 10 shows that all four re-implementations are consistent with their original TensorFlow implementations.

Fig. 5: Convergence curves of models implemented in their original form (TensorFlow) and in LibSignal (PyTorch). The y-axis is the testing result w.r.t. average travel time (in seconds)

Table 10 Best episode performance w.r.t. average travel time (in seconds). The performance of models is consistent under TensorFlow and PyTorch

1.4 A.4: Network conversion

LibSignal currently includes nine converted and calibrated datasets; their road networks are shown in Fig. 6. The other CityFlow1x1 datasets have road network structures similar to the CityFlow1x1 dataset that appears in the full paper, and are therefore not shown here.

Fig. 6: Road networks in different simulators for calibration

1.5 A.5: Calibration steps

To validate that the performance of the algorithms is consistent in both SUMO and CityFlow, we calibrate the simulators in the following aspects:

  • Calibration from SUMO to CityFlow: To make the conversion of complex networks from SUMO compatible with CityFlow, we redesign the original conversion files from Zhang et al. (2019) as follows: (1) For .rou files in SUMO that specify only the source and destination intersections and omit the intermediate roads, SUMO's router command-line tool should be applied to generate full routes before converting them into CityFlow's .json traffic flow file. (2) We treat all intersections without traffic signals in SUMO as "virtual" nodes in CityFlow's .json road network file. (3) We keep the time intervals of red and yellow signals the same in SUMO and CityFlow. (4) SUMO supports dynamic routing of vehicles, which CityFlow does not; all simulations under SUMO in LibSignal currently disable dynamic routing. (5) Because the phases in the SUMO environment cannot always be fully transferred to phases in the CityFlow environment (SUMO provides more abundant phases than CityFlow), we modify the judgment conditions of the phase transformation to reduce the resulting differences between simulators.

  • Calibration from CityFlow to SUMO: (1) The vehicles in CityFlow's traffic flow file need to be sorted by departure time, because SUMO's traffic file requires each vehicle to depart no earlier than the preceding one. (2) The vehicle type should be indicated explicitly in order to limit the maximum speed of each vehicle.

Table 11 Performance of agents in CityFlow and SUMO on additional datasets that are not shown in Sect. 4.2 with best and second best performance highlighted

1.6 A.6: Supplementary results

We conduct experiments on all nine datasets and also provide best-episode results, full convergence curves, and standard deviations of the performance on the four datasets in the full paper.

1.6.1 A.6.1: Other comparison studies on datasets not shown in full paper

Table 11 shows the performance on the other five datasets. It shows that PressLight and IDQN are the most stable algorithms most of the time.

1.6.2 A.6.2: Convergence curves of Table 8

Figure 7 shows the full convergence curves over 2000 episodes for the IPPO and MAPG agents. The results show that, compared to Q-learning agents, Actor-Critic agents struggle to converge on some large or complex datasets, requiring more than ten times the convergence time of Q-learning methods.

Fig. 7: Full convergence curves of Table 8

1.6.3 A.6.3: Result of best episode

Table 12 gives the best-episode numbers for all datasets. It supports the conclusion that PressLight, followed by IDQN, has the best sample efficiency compared with the other algorithms.

Table 12 The episode achieving the best result for different agents and datasets

1.6.4 A.6.4: Performance on the benchmark with standard deviations

Table 13 shows the standard deviation of the performance on the four datasets in the full paper.

Table 13 The standard deviations of Table 8

1.7 A.7: Extension to other simulators

LibSignal is a cross-simulator library for traffic control tasks. Currently, we support the most commonly used simulators, CityFlow and SUMO, and our library is open to new simulation environments. CBEngine is a new simulator that served as the simulation environment in the KDD Cup 2021 City Brain Challenge (see footnote 3) and is designed for executing traffic control tasks on large traffic networks. We integrate this new simulator into our traffic control framework to extend LibSignal's usage to other simulation environments. We show the performance of MaxPressure, SOTL, FixedTime, and IDQN under CBEngine in Table 14.

Table 14 Performance on CBEngine simulator

1.8 A.8: Hyperparameters

Table 15 provides the parameters of each algorithm, the training environment, and the hardware parameters of the server.

Table 15 Hyperparameters of models, servers and training

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Mei, H., Lei, X., Da, L. et al. Libsignal: an open library for traffic signal control. Mach Learn 113, 5235–5271 (2024). https://doi.org/10.1007/s10994-023-06412-y

