Deep Reinforcement Learning Based Optimal Route and Charging Station Selection
Figure 1. Overall architecture for the electric vehicle charging navigation system (EVCNS). EV: electric vehicle; EVCS: electric vehicle charging station; ITS: intelligent transport system.
Figure 2. Information flow for system operation of the proposed EVCNS.
Figure 3. Deep reinforcement learning based EV charging navigation system.
Figure 4. Overview of the training process.
Figure 5. Flowchart of charging station selection.
Figure 6. Random distribution of arrival times of EV charging requests: (a) uniform distribution; (b) normal distribution.
Figure 7. Cumulative reward progress during the training process: (a) convergence of cumulative rewards; (b) convergence of total travel time.
Figure 8. Average travel time of 100 EVs according to distributions: (a) uniform distribution; (b) normal distribution.
Figure 9. Average travel distance for different numbers of EVs according to distributions: (a) uniform distribution; (b) normal distribution.
Figure 10. Average travel time for different numbers of EVs according to distributions: (a) uniform distribution; (b) normal distribution.
Abstract
1. Introduction
- A model-free deep reinforcement learning (DRL) based optimal route and charging station selection (RCS) algorithm is proposed to overcome the uncertainty of traffic conditions and dynamically arriving EV charging requests.
- The RCS problem is formulated as a Markov Decision Process (MDP) with unknown transition probabilities.
- The performance of the proposed DRL-based RCS algorithm is compared with conventional algorithms in terms of travel time, waiting time, charging time, driving time, and distance under various distributions and numbers of EV charging requests.
2. Related Work
3. DRL Based Route and Charging Station Selection Algorithm
3.1. System Architecture
3.1.1. Electric Vehicle (EV)
3.1.2. Electric Vehicle Charging Station (EVCS)
3.1.3. Intelligent Transport System (ITS) Center
3.1.4. Electric Vehicle Charging Navigation System (EVCNS) Center
3.2. System Operation
- Step 1: An EV that needs a charging service requests the route and charging station selection service from the EVCNS center. The request is transmitted to the EVCNS center through wireless communication technologies (a sketch of the exchanged messages follows this list).
- Step 2: The EVCNS center continuously receives monitoring information from the EVCSs (number of charging vehicles, number of waiting vehicles, etc.) and road traffic conditions (road states, average velocity, etc.) from the ITS center.
- Step 3: Based on the information received from the EVCSs and the ITS center, the EVCNS center recommends the optimal route and charging station to the requesting EV.
- Step 4: The EV confirms the recommended charging station and sends a confirmation message with the reservation information to the EVCNS center. The EVCNS center stores the reservation information for use with subsequent charging requests.
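The message contents exchanged in Steps 1 and 3 can be pictured as below. This is a minimal sketch under our own naming; the field names are illustrative, chosen to mirror the request attributes described in Section 3.4.1 (origin, destination, request time, current and required SOC), not a wire format from the paper.

```python
from dataclasses import dataclass

# Hypothetical Step 1 message from the EV to the EVCNS center.
@dataclass
class ChargingRequest:
    origin: int          # source node index in the road network
    destination: int     # destination node index
    request_time: float  # time the request is issued (h)
    soc_current: float   # current state of charge, in [0, 1]
    soc_required: float  # required state of charge, in [0, 1]

# Hypothetical Step 3 reply from the EVCNS center to the EV.
@dataclass
class Recommendation:
    evcs_index: int              # index k of the recommended EVCS
    route: list                  # node indices: origin -> EVCS k -> destination
    expected_travel_time: float  # driving + waiting + charging time (h)
```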
3.3. Electric Vehicle Charging Navigation System
3.3.1. Traffic Preprocess Module (TPM)
3.3.2. Charging Preprocess Module (CPM)
3.3.3. Feature Extract Module (FEM)
3.3.4. Route & Charging Station Selection Module (RCSM)
3.4. Deep Q Network for Route and Charging Station Selection
3.4.1. Markov Decision Process Modeling
- System States: The state s_t, given in Equation (9), combines the arrived EV charge request with the information for each EVCS: the sets of expected arrival time, waiting time, driving time, and driving distance for each EVCS. The EV charge request, given in Equation (10), consists of the starting point, the destination location, the request time, the request time interval, the current SOC, and the required SOC. Since charging requests arrive dynamically, the time difference between the previous request and the current request is provided as an additional feature. We assume that a fixed number of EV charge requests arrives within the operation time T, because the state space would be infinite if requests continued to arrive indefinitely. The sets of expected waiting time, driving time, and driving distance summarize the expected outcome of selecting each EVCS, which keeps the state dimension from growing rapidly while still representing the current environment.
- Action and Transition Probability: The EVCNS takes an action a_t for each state s_t. The action is the index of an EVCS and implies the route, planned by the FEM, from the origin to the destination via the selected EVCS k. The action space is the set of K EVCSs. Because real-time traffic conditions and the charging behavior of other EVs are uncertain, the state transition probabilities are unknown, which motivates the model-free DRL approach.
- Reward: The reward function is divided into two parts, one for the terminal state and one for non-terminal states, where the terminal state is reached when the operation time T expires. For a non-terminal state, the reward is defined from the expected travel time obtained by selecting the action with the corresponding EVCS, because even after the EV chooses that EVCS the actual travel time has not yet been revealed; it depends on real-time traffic conditions and the charging behavior of other EVs. In the terminal state, the actual travel times of all EV requests are revealed, and the difference between the actual and expected travel times defines the reward (see the sketch after this list for one plausible reading).
- Action-Value Function: The action-value function Q^π(s_t, a_t) denotes the expected sum of discounted future rewards obtained by taking action a_t in state s_t and thereafter following policy π.
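To make the formulation concrete, here is a minimal sketch of the state construction and reward under our own naming and sign conventions (travel time enters the reward negatively, so maximizing reward minimizes travel time). It is a reconstruction consistent with the description above, not the paper's exact Equations (9) and (10); the `req` argument follows the hypothetical `ChargingRequest` message sketched in Section 3.2.

```python
import numpy as np

K = 3  # number of EVCSs (Table 3)

def build_state(req, prev_request_time,
                exp_arrival, exp_waiting, exp_driving, exp_distance):
    """State s_t: the arrived charge request plus, for each of the K EVCSs,
    the expected arrival time, waiting time, driving time, and driving
    distance (length-K arrays computed by the FEM/CPM modules)."""
    request_features = np.array([
        req.origin, req.destination, req.request_time,
        req.request_time - prev_request_time,   # request time interval
        req.soc_current, req.soc_required,
    ], dtype=np.float32)
    return np.concatenate([request_features, exp_arrival, exp_waiting,
                           exp_driving, exp_distance]).astype(np.float32)

def reward(expected_travel_time, actual_totals=None, expected_totals=None,
           terminal=False):
    """Reward as described above, under our sign convention: for a
    non-terminal state, the negative expected travel time of the chosen
    EVCS; at the terminal state, once actual travel times are revealed,
    the negative gap between total actual and total expected travel time."""
    if not terminal:
        return -expected_travel_time
    return -(np.sum(actual_totals) - np.sum(expected_totals))
```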
3.4.2. Training of DQN
Algorithm 1. Training process of DQN.
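The body of Algorithm 1 did not survive in this copy. As a placeholder, the following is a minimal sketch of a standard DQN training step with experience replay and a periodically updated target network [26], wired to the hyperparameters in Table 4; the network architecture (two hidden layers of 64 units) and the state dimension are our assumptions, not the paper's.

```python
import random
from collections import deque
import numpy as np
import tensorflow as tf

# Hyperparameters from Table 4; everything else is assumed for illustration.
GAMMA, LR, BATCH_SIZE = 0.99, 0.01, 256
NUM_EPOCHS, TARGET_UPDATE_PERIOD, TRAIN_THRESHOLD = 7000, 10, 5000
STATE_DIM, K = 18, 3  # 6 request features + 4 features x 3 EVCSs (assumed)

def build_qnet():
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=(STATE_DIM,)),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(K),            # one Q-value per EVCS
    ])

q_net, target_net = build_qnet(), build_qnet()
target_net.set_weights(q_net.get_weights())
optimizer = tf.keras.optimizers.SGD(learning_rate=LR)
replay = deque(maxlen=50_000)                # experience replay memory

def train_step():
    """One gradient step on a random minibatch from the replay memory."""
    s, a, r, s2, done = map(np.array, zip(*random.sample(replay, BATCH_SIZE)))
    # Bellman target: r + gamma * max_a' Q_target(s', a') for non-terminal s'.
    target = (r + GAMMA * target_net(s2).numpy().max(axis=1)
              * (1.0 - done)).astype(np.float32)
    with tf.GradientTape() as tape:
        q = tf.reduce_sum(q_net(s) * tf.one_hot(a, K), axis=1)
        loss = tf.reduce_mean(tf.square(target - q))
    grads = tape.gradient(loss, q_net.trainable_variables)
    optimizer.apply_gradients(zip(grads, q_net.trainable_variables))
```

In such a loop, each transition (s, a, r, s', done) is appended to `replay` during the M = 7000 training epochs; `train_step` runs once the memory exceeds the training threshold of 5000 samples, and the target network copies the online weights every 10 epochs.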
3.4.3. EVCS Selection Using Trained DQN
Algorithm 2. Route and charging station selection.
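As with Algorithm 1, only the caption survives here. The sketch below follows the flow of Figure 5 in spirit: the trained network scores every EVCS for the current state, and the EVCNS returns the best feasible one with its precomputed route. The feasibility mask (filtering out EVCSs unreachable with the current SOC) and all names are our assumptions.

```python
import numpy as np

def select_route_and_evcs(q_net, state, feasible, routes):
    """Score each EVCS with the trained DQN and return the best feasible
    one together with its precomputed route (origin -> EVCS k -> destination).
    `feasible` is a boolean mask of length K; `routes[k]` comes from the FEM."""
    q_values = q_net(state[None, :]).numpy()[0]  # one Q-value per EVCS
    q_values[~feasible] = -np.inf                # never pick unreachable EVCSs
    k = int(np.argmax(q_values))
    return k, routes[k]
```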
4. Performance Evaluation
4.1. Simulation and Training Setup
4.2. Performance Evaluation
- Second, the Minimum Travel Time (MTT) strategy aims to minimize the total travel time, including driving, waiting, and charging time, similar to the proposed algorithm. This strategy selects the EVCS, and the corresponding route, with the minimum expected travel time [24].
5. Conclusions
Author Contributions
Funding
Conflicts of Interest
Nomenclature
Sets and indices
- Network topology of the road network
- Set of vertices, which represent intersections or end points of roads
- Set of edges, which are the roads between adjacent nodes
- Set of EVCSs
- Charging request of an EV at time step t
- Set of links from the origin to the destination via EVCS k
- Set of links from the origin to an EVCS

Parameters
- Learning rate
- Energy consumption rate (kWh/km)
- Discount factor
- Charging power of an EVCS
- Charging efficiency

Variables
- Energy consumption of a link
- Maximum battery capacity of an EV
- Estimated charging energy
- Location of the source
- Location of the destination
- Required state of charge
- Current state of charge
- State of charge when the EV arrives at EVCS k
- Distance of a link
- Charging request time
- Driving time to traverse link l at time step t
- Total driving time of the route via EVCS k
- Expected arrival time at EVCS k
- Estimated charging time at EVCS k
- Expected waiting time at EVCS k
- Average road velocity between adjacent nodes at time step t
- Weight value
References
1. Ghosh, A. Possibilities and Challenges for the Inclusion of the Electric Vehicle (EV) to Reduce the Carbon Footprint in the Transport Sector: A Review. Energies 2020, 13, 2602.
2. Zhang, J.; Yan, J.; Liu, Y.; Zhang, H.; Lv, G. Daily electric vehicle charging load profiles considering demographics of vehicle users. Appl. Energy 2020, 274, 115063.
3. Lee, W.; Schober, R.; Wong, V.W.S. An Analysis of Price Competition in Heterogeneous Electric Vehicle Charging Stations. IEEE Trans. Smart Grid 2019, 10, 3990–4002.
4. Liu, C.; Chau, K.T.; Wu, D.; Gao, S. Opportunities and Challenges of Vehicle-to-Home, Vehicle-to-Vehicle, and Vehicle-to-Grid Technologies. Proc. IEEE 2013, 101, 2409–2427.
5. Silva, F.C.; Ahmed, M.A.; Martínez, J.M.; Kim, Y.-C. Design and Implementation of a Blockchain-Based Energy Trading Platform for Electric Vehicles in Smart Campus Parking Lots. Energies 2019, 12, 4814.
6. Tan, J.; Wang, L. Real-Time Charging Navigation of Electric Vehicles to Fast Charging Stations: A Hierarchical Game Approach. IEEE Trans. Smart Grid 2015, 8, 846–856.
7. Yang, H.; Deng, Y.; Qiu, J.; Li, M.; Lai, M.; Dong, Z.Y. Electric Vehicle Route Selection and Charging Navigation Strategy Based on Crowd Sensing. IEEE Trans. Ind. Inform. 2017, 13, 2214–2226.
8. Yang, J.-Y.; Chou, L.-D.; Chang, Y.-J. Electric-Vehicle Navigation System Based on Power Consumption. IEEE Trans. Veh. Technol. 2016, 65, 5930–5943.
9. Guo, Q.; Xin, S.; Sun, H.; Li, Z.; Zhang, B. Rapid-Charging Navigation of Electric Vehicles Based on Real-Time Power Systems and Traffic Data. IEEE Trans. Smart Grid 2014, 5, 1969–1979.
10. Jin, C.; Tang, J.; Ghosh, P. Optimizing Electric Vehicle Charging: A Customer's Perspective. IEEE Trans. Veh. Technol. 2013, 62, 2919–2927.
11. Zhang, X.; Peng, L.; Cao, Y.; Liu, S.; Zhou, H.; Huang, K. Towards holistic charging management for urban electric taxi via a hybrid deployment of battery charging and swap stations. Renew. Energy 2020, 155, 703–716.
12. Zhang, Z.; Zhang, D.; Qiu, R.C. Deep reinforcement learning for power system: An overview. CSEE J. Power Energy Syst. 2019, 6, 213–225.
13. Luong, N.C.; Hoang, D.T.; Gong, S.; Niyato, D.; Wang, P.; Liang, Y.-C.; Kim, D.I. Applications of Deep Reinforcement Learning in Communications and Networking: A Survey. IEEE Commun. Surv. Tutor. 2019, 21, 3133–3174.
14. Lei, L.; Tan, Y.; Zheng, K.; Liu, S.; Zhang, K.; Shen, X. Deep Reinforcement Learning for Autonomous Internet of Things: Model, Applications and Challenges. IEEE Commun. Surv. Tutor. 2020, 22, 1722–1760.
15. Nguyen, T.T.; Reddi, V.J. Deep Reinforcement Learning for Cyber Security. arXiv 2019, arXiv:1906.05799.
16. Mason, K.; Grijalva, S. A review of reinforcement learning for autonomous building energy management. Comput. Electr. Eng. 2019, 78, 300–312.
17. Lee, S.; Choi, D.-H. Reinforcement Learning-Based Energy Management of Smart Home with Rooftop Solar Photovoltaic System, Energy Storage System, and Home Appliances. Sensors 2019, 19, 3937.
18. Kim, S.; Lim, H. Reinforcement Learning Based Energy Management Algorithm for Smart Energy Buildings. Energies 2018, 11, 2010.
19. Wan, Z.; Li, H.; He, H.; Prokhorov, D. Model-Free Real-Time EV Charging Scheduling Based on Deep Reinforcement Learning. IEEE Trans. Smart Grid 2019, 10, 5246–5257.
20. Sadeghianpourhamami, N.; Deleu, J.; Develder, C. Definition and Evaluation of Model-Free Coordination of Electrical Vehicle Charging with Reinforcement Learning. IEEE Trans. Smart Grid 2020, 11, 203–214.
21. Wang, S.; Bi, S.; Zhang, Y.J.A. Reinforcement Learning for Real-time Pricing and Scheduling Control in EV Charging Stations. IEEE Trans. Ind. Inform. 2019, 17, 849–859.
22. Qian, T.; Shao, C.; Wang, X.; Shahidehpour, M. Deep Reinforcement Learning for EV Charging Navigation by Coordinating Smart Grid and Intelligent Transportation System. IEEE Trans. Smart Grid 2020, 11, 1714–1723.
23. Eklund, P.W.; Kirkby, S.; Pollitt, S. A dynamic multi-source Dijkstra's algorithm for vehicle routing. In Proceedings of the IEEE Australian and New Zealand Conference on Intelligent Information Systems, Adelaide, Australia, 18–20 November 1996; pp. 329–333.
24. Cao, Y.; Zhang, X.; Wang, R.; Peng, L.; Aslam, N.; Chen, X. Applying DTN routing for reservation-driven EV charging management in smart cities. In Proceedings of the 2017 13th International Wireless Communications and Mobile Computing Conference (IWCMC), Valencia, Spain, 26–30 June 2017; pp. 1471–1476.
25. Sutton, R.S.; Barto, A.G. Reinforcement Learning: An Introduction, 2nd ed.; MIT Press: Cambridge, MA, USA, 2018.
26. Mnih, V.; Kavukcuoglu, K.; Silver, D.; Rusu, A.A.; Veness, J.; Bellemare, M.G.; Graves, A.; Riedmiller, M.; Fidjeland, A.K.; Ostrovski, G.; et al. Human-level control through deep reinforcement learning. Nature 2015, 518, 529–533.
27. Mo, W.; Yang, C.; Chen, X.; Lin, K.; Duan, S. Optimal Charging Navigation Strategy Design for Rapid Charging Electric Vehicles. Energies 2019, 12, 962.
28. Cerna, F.V.; Pourakbari-Kasmaei, M.; Romero, R.A.; Rider, M.J. Optimal delivery scheduling and charging of EVs in the navigation of a city map. IEEE Trans. Smart Grid 2017, 9, 4815–4827.
29. Luo, L.; Gu, W.; Zhou, S.; Huang, H.; Gao, S.; Han, J.; Wu, Z.; Dou, X. Optimal planning of electric vehicle charging stations comprising multi-types of charging facilities. Appl. Energy 2018, 226, 1087–1099.
30. TensorFlow Framework. Available online: https://www.tensorflow.org/ (accessed on 5 November 2019).
31. Xia, F.; Chen, H.; Chen, L.; Qin, X. A Hierarchical Navigation Strategy of EV Fast Charging Based on Dynamic Scene. IEEE Access 2019, 7, 29173–29184.
32. Cao, Y.; Liu, S.; He, Z.; Dai, X.; Xie, X.; Wang, R.; Yu, S. Electric Vehicle Charging Reservation under Preemptive Service. In Proceedings of the 2019 1st International Conference on Industrial Artificial Intelligence (IAI), Shenyang, China, 23–27 July 2019; pp. 1–6.
| Ref. | Contributions | Main Entities | Decision |
|---|---|---|---|
| [6] | An integrated EV charging navigation system based on a hierarchical game approach | EV terminals, EVCSs, power system operation center (PSOC), and EVNS | EVs decide when to charge and which EVCS to select |
| [7] | EV route selection and charging navigation based on crowd sensing | EVs, charging stations, decision-making center | The decision-making center sends charging navigation and route selection decisions to EVs |
| [8] | EV navigation system based on vehicular ad hoc networks (VANETs) | EVs, charging stations, traffic information center | The traffic information center analyzes the traffic information and plans routes accordingly |
| [9] | Rapid charging navigation strategy based on real-time traffic data and power grid status | EVs, charging stations, ITS center, power system control center (PSCC) | The ITS and PSCC make decisions without data from the EV side; EV owners are not required to upload information |
| [11] | Hybrid charging management framework for the optimal choice between battery charging and swapping for urban electric taxis | Electric taxis, charging stations, battery swapping stations, global controller | The global controller selects a proper charging/swapping station and enables charging reservation |
| [22] | EV charging navigation based on deep reinforcement learning | EVs, charging stations, ITS center | The EV driver decides based on received EVCS charging prices, waiting times, and road velocities |
| Current work | EV navigation algorithm based on deep reinforcement learning | EVs, charging stations, ITS center, and EVCNS center | The EVCNS center sends charging navigation and route selection decisions to EVs |
| Ref. | Objective | EV Problem | Method | Challenges |
|---|---|---|---|---|
| [18] | Min. operation energy cost | Charging/discharging schedules of a building | RL, Q-learning | Unknown future information (load, energy prices, and amount of energy generation) |
| [19] | Min. charging cost | Charging/discharging schedules of an EV | DRL, DQN, LSTM | Randomness in traffic conditions, the user's commuting behavior, and the pricing process |
| [20] | Min. cost of charging a group of EVs | EV charging scheduling of EVCSs | RL, fitted Q-iteration | Curse of dimensionality due to the continuity and scale of the state and action spaces |
| [21] | Max. profit of EVCS | Optimal pricing and charging scheduling | RL, SARSA | Random EV arrivals and departures |
| [22] | Min. total cost of an EV | Navigate an EV to an EVCS | DRL, DQN | Uncertainty in traffic conditions, charging price, and waiting time |
| Current work | Min. total travel time of multiple EVs | Navigate multiple EVs to destinations via EVCSs | DRL, DQN | Uncertainty in traffic conditions and randomly arriving requests |
| Parameter | Value |
|---|---|
| Max. battery capacity | 54.75 kWh |
| Initial SOC | Uniform(0.2, 0.4) |
| Required SOC | 0.9 |
| Energy consumption rate | 0.16 kWh/km |
| Number of EVCSs | 3 |
| Number of charging poles | 2 |
| Charging power | 60 kW |
| Charging efficiency | 0.9 |
| Number of nodes | 39 |
| Number of links | 134 |
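The parameters above imply simple energy bookkeeping for the expected-time features used by the algorithm. The sketch below reconstructs the standard relations suggested by the nomenclature (SOC at arrival, estimated charging energy, estimated charging time); the function names and exact forms are our assumptions.

```python
BATTERY_KWH = 54.75   # max. battery capacity (Table 3)
CONSUMPTION = 0.16    # energy consumption rate, kWh/km
CHARGE_KW = 60.0      # charging power of an EVCS
EFFICIENCY = 0.9      # charging efficiency

def soc_at_arrival(soc_now: float, distance_km: float) -> float:
    """SOC when the EV reaches EVCS k after driving distance_km."""
    return soc_now - CONSUMPTION * distance_km / BATTERY_KWH

def charging_time_h(soc_arrival: float, soc_required: float) -> float:
    """Estimated charging time at EVCS k: the energy needed to raise the
    SOC to the required level, divided by the effective charging power."""
    energy_kwh = (soc_required - soc_arrival) * BATTERY_KWH
    return energy_kwh / (CHARGE_KW * EFFICIENCY)
```

For example, an EV that leaves at SOC 0.3 and drives 20 km arrives at roughly SOC 0.24 and then needs about 0.67 h (40 min) to reach the required SOC of 0.9 at a 60 kW pole with 0.9 efficiency.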
| Parameter | Value | Parameter | Value |
|---|---|---|---|
| Number of epochs, M | 7000 | Target network update period | 10 |
| Discount factor | 0.99 | Batch size | 256 |
| Learning rate | 0.01 | Training threshold | 5000 |
| Objective | EVCS Selection Function |
|---|---|
| Min. distance | Select the EVCS k (and route) with the minimum expected driving distance |
| Min. travel time | Select the EVCS k (and route) with the minimum expected total travel time (driving + waiting + charging) |
| Min. waiting time | Select the EVCS k with the minimum expected waiting time |
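For reference, the three benchmark rules in Table 5 reduce to arg-min choices over the per-EVCS estimates maintained by the EVCNS. A minimal sketch (names and array layout are ours; each argument is a length-K array of per-EVCS estimates):

```python
import numpy as np

def select_min_distance(distance):                        # Minimum Distance
    return int(np.argmin(distance))

def select_min_travel_time(driving, waiting, charging):   # Minimum Travel Time
    return int(np.argmin(driving + waiting + charging))

def select_min_waiting_time(waiting):                     # Minimum Waiting Time
    return int(np.argmin(waiting))
```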
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).