Abstract
In intelligent transportation, autonomous driving relies heavily on robust path planning. This study introduces a Deep Reinforcement Learning algorithm for Path Planning (DRL-PP) to address limitations in current methods. To navigate complex environments, DRL-PP is designed to select optimal actions while limiting overfitting. The algorithm employs neural networks to identify advantageous actions for each state, constructing an optimized trajectory from origin to destination. A key innovation is the enhanced reward function, which integrates goal-specific information to dynamically calibrate the reward signal, significantly improving decision-making efficiency. Empirical evaluations show that DRL-PP stabilizes reward accumulation and reduces the number of exploration iterations required to converge. Comparative analysis confirms that the proposed algorithm consistently outperforms benchmark models in navigation performance, offering a robust foundation for the continued advancement of autonomous vehicle technology.
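The abstract does not reproduce the paper's exact reward formulation. As a rough illustration of what folding goal-specific information into the reward can look like, the following Python sketch shapes each step reward by the change in Euclidean distance to the target on a grid. All names and constants (GOAL, R_GOAL, shaped_reward, etc.) are illustrative assumptions, not the authors' implementation.

```python
import math

# Illustrative constants (assumptions, not values from the paper).
GOAL = (9, 9)          # target cell on a 10x10 grid
R_GOAL = 10.0          # bonus for reaching the target
R_COLLISION = -10.0    # penalty for hitting an obstacle
R_STEP = -0.1          # small per-step cost to discourage wandering

def dist_to_goal(cell):
    """Euclidean distance from a grid cell to the goal."""
    return math.hypot(GOAL[0] - cell[0], GOAL[1] - cell[1])

def shaped_reward(prev_cell, next_cell, collided):
    """Step reward that folds goal-specific information (remaining
    distance to the target) into the signal: moves that reduce the
    distance are rewarded, moves that increase it are penalized."""
    if collided:
        return R_COLLISION
    if next_cell == GOAL:
        return R_GOAL
    # Positive when the agent gets closer to the goal, negative otherwise.
    progress = dist_to_goal(prev_cell) - dist_to_goal(next_cell)
    return R_STEP + progress

# Example: stepping from (2, 3) to (3, 3) moves toward the goal and earns
# a small positive reward; stepping to (1, 3) moves away and is penalized.
print(shaped_reward((2, 3), (3, 3), collided=False))
print(shaped_reward((2, 3), (1, 3), collided=False))
```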