Deep Reinforcement Learning-based Trajectory Planning for Manipulator Obstacle Avoidance

Cao Yi; Guo Yinhui; Li Lei; Zhu Baiyu; Zhao Zhihua

doi:10.16578/j.issn.1004.2539.2023.12.006

Theory·Research | Updated: 2023-12-21

- Journal of Mechanical Transmission Vol. 47, Issue 12, Pages: 40-46 (2023)
- Author affiliation: College of Mechanical and Electrical Engineering, Henan University of Technology, Zhengzhou 450001, Henan, China
- Published: 15 December 2023
- Received: 21 September 2022
- Revised: 12 November 2022
Cao Yi, Guo Yinhui, Li Lei, et al. Deep Reinforcement Learning-based Trajectory Planning for Manipulator Obstacle Avoidance[J]. Journal of Mechanical Transmission, 2023, 47(12): 40-46. DOI: 10.16578/j.issn.1004.2539.2023.12.006.

Abstract

To address the long planning times and lengthy paths produced by traditional path planning algorithms during manipulator obstacle avoidance, a motion planning method based on deep reinforcement learning (DRL) is proposed. First, a mathematical model of the manipulator and its motion environment is constructed, the DOBOT manipulator and its operating environment are built in PyBullet, and the reward function, action variables, and state variables required for DRL are defined. Second, given the characteristics of static obstacle avoidance, the deep deterministic policy gradient (DDPG) algorithm is adopted and motion simulation experiments are conducted. The simulation results show that, compared with the rapidly-exploring random tree (RRT) algorithm and an improved RRT algorithm, the proposed DDPG algorithm achieves some improvement in both planning time and path length. Finally, the effectiveness of the DDPG algorithm for obstacle avoidance is verified on a DOBOT manipulator in the laboratory under several obstacle environments.
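The abstract mentions that a reward function has to be defined for the DRL agent but does not give its form. As a rough illustration only, the sketch below shows one common way such a reward is shaped for reaching a target while avoiding static obstacles: a penalty proportional to the distance to the target, a large penalty on collision, and a bonus on arrival. The function name `step_reward`, the point-obstacle model, and every threshold and weight are assumptions made for this example, not values or definitions from the paper.

```python
import math

# All thresholds and weights below are illustrative assumptions,
# not values taken from the paper.
GOAL_TOLERANCE = 0.02       # metres; within this radius the target counts as reached
SAFETY_RADIUS = 0.05        # metres; closer than this to an obstacle counts as a collision
GOAL_BONUS = 100.0
COLLISION_PENALTY = -100.0
DISTANCE_WEIGHT = 1.0


def step_reward(end_effector, target, obstacles):
    """Return (reward, done) for one simulation step.

    end_effector, target: (x, y, z) positions.
    obstacles: list of (x, y, z) centre points of static obstacles
               (a simplification; real obstacles have volume).
    """
    goal_distance = math.dist(end_effector, target)
    nearest_obstacle = min(
        (math.dist(end_effector, o) for o in obstacles), default=math.inf
    )

    if nearest_obstacle < SAFETY_RADIUS:
        return COLLISION_PENALTY, True    # collision ends the episode
    if goal_distance < GOAL_TOLERANCE:
        return GOAL_BONUS, True           # target reached, episode ends
    # Dense shaping term: the penalty shrinks as the end effector nears the target.
    return -DISTANCE_WEIGHT * goal_distance, False
```

A dense distance term like this is often chosen because a reward given only on arrival tends to be too sparse for an off-policy method such as DDPG to learn from quickly; whether the authors made the same choice cannot be determined from this page.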

Keywords

Manipulator; Deep reinforcement learning; Obstacle avoidance path planning; Deep deterministic policy gradient algorithm
