Abstract
This paper presents a real-life application of deep reinforcement learning to the order dispatching problem of Getir, a Turkish ultra-fast delivery company. Before applying off-the-shelf reinforcement learning methods, we define the specific problem at Getir and describe one of the solutions the company has implemented. We discuss the novel aspects of Getir’s problem relative to state-of-the-art order dispatching studies and highlight the limitations of Getir’s solution. The company’s overall aim is to deliver to as many customers as possible within 10 minutes. Orders arrive throughout the day, and centralized warehouses in the regions decide whether an incoming order should be served or canceled depending on their couriers’ shifts and status. We use Deep Q-networks to learn the warehouses’ actions, i.e., accepting or canceling an order, directly from the state dimensions. We design the networks with two different rewards. We conduct empirical analyses using real-life data provided by Getir, both to generate training samples and to assess the models’ performance over a selected 30-day period with a total of 9880 orders. The results indicate that the proposed models generate policies that outperform the rule-based heuristic employed in practice.
Notes
Due to privacy agreements with the company, we do not disclose the exact numbers for the queue size and delivery time limits in this paper.
Acknowledgments
This research is partly funded by Getir Perakende Lojistik A.S., Istanbul, Turkey.
Cite this article
Kavuk, E.M., Tosun, A., Cevik, M. et al. Order dispatching for an ultra-fast delivery service via deep reinforcement learning. Appl Intell 52, 4274–4299 (2022). https://doi.org/10.1007/s10489-021-02610-0