Abstract
To address the long planning times and lengthy paths that traditional path planning algorithms produce when a robotic manipulator must avoid obstacles, a motion planning method based on deep reinforcement learning (DRL) is proposed. First, a mathematical model of the manipulator and its motion environment is constructed, the DOBOT manipulator and its operating environment are built in PyBullet, and the parameters required for DRL, including the reward function, the action variables, and the state variables, are defined. Second, given the characteristics of static obstacle avoidance, the deep deterministic policy gradient (DDPG) algorithm is adopted and motion simulation experiments are carried out. The simulation results show that, compared with the rapidly-exploring random tree (RRT) algorithm and an improved RRT algorithm, the proposed DDPG-based method achieves some improvement in both planning time and path length. Finally, the effectiveness of the DDPG algorithm for obstacle avoidance in a variety of obstacle environments is verified on a DOBOT manipulator in the laboratory.
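The abstract mentions a reward function and a path-length comparison without giving their form. As a rough illustration only, the following Python sketch shows one common way such a reward could be shaped for static obstacle avoidance, together with a path-length metric of the kind used to compare planners. The function names, the spherical obstacle model, and every threshold and weight are assumptions made for this sketch, not values or definitions taken from the paper.

```python
import math


def distance(a, b):
    """Euclidean distance between two 3-D points."""
    return math.dist(a, b)


def reward(ee_pos, goal_pos, obstacle_pos, obstacle_radius=0.05,
           goal_tol=0.01, collision_penalty=-10.0, goal_bonus=10.0):
    """Hypothetical shaped reward; returns (reward, episode_done).

    The obstacle is modeled as a sphere, and all numeric defaults are
    illustrative assumptions rather than the paper's settings.
    """
    # Collision with the static obstacle: penalize and end the episode.
    if distance(ee_pos, obstacle_pos) <= obstacle_radius:
        return collision_penalty, True
    d_goal = distance(ee_pos, goal_pos)
    # End effector within tolerance of the goal: reward and end the episode.
    if d_goal <= goal_tol:
        return goal_bonus, True
    # Otherwise, a dense term that grows as the end effector nears the goal.
    return -d_goal, False


def path_length(waypoints):
    """Total length of a piecewise-linear end-effector path."""
    return sum(distance(p, q) for p, q in zip(waypoints, waypoints[1:]))
```

In a setup like this, the dense distance term gives the DDPG agent a learning signal at every step, while the terminal bonus and penalty mark success and collision; `path_length` could then be applied to the end-effector trajectories of each planner to compare them on equal terms.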