This paper proposes FPRL-ART2 (Foremost-Policy Reinforcement Learning based ART2 neural network), an ART2 neural network based on foremost-policy reinforcement learning, and describes its learning algorithm. To support online learning, FPRL-ART2 selects, in the mapping from states to action values, the first action that receives a reward, rather than the action with the optimal action value as in methods such as 1-step Q-Learning. The ART2 neural network stores the classification patterns, and its weights are strengthened or weakened through reinforcement learning, which is how learning takes place. FPRL-ART2 is then applied to the collision avoidance problem for mobile robots. Simulation experiments show that introducing FPRL-ART2 reduces the number of collisions between the mobile robot and obstacles and yields good collision avoidance performance.
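
The contrast between foremost-policy selection and greedy selection over action values can be illustrated with a minimal Python sketch. The function names (`greedy_action`, `foremost_action`, `update_weight`), the reward convention (a positive value means the action was rewarded), and the additive, clipped weight update with a rate of 0.1 are all assumptions made for illustration. The abstract does not specify them, and this is not the authors' implementation.

```python
def greedy_action(action_values):
    """Pick the action with the highest value, as a 1-step Q-Learning agent would."""
    return max(range(len(action_values)), key=lambda a: action_values[a])


def foremost_action(actions, try_action):
    """Pick the first action that earns a reward (hypothetical reading of 'foremost policy').

    try_action(a) is an assumed callback returning the reward observed for action a.
    Returns None if no action is rewarded.
    """
    for a in actions:
        if try_action(a) > 0:
            return a
    return None


def update_weight(weight, reward, rate=0.1):
    """Strengthen the weight on reward, weaken it otherwise (assumed rule), clipped to [0, 1]."""
    if reward > 0:
        return min(1.0, weight + rate)
    return max(0.0, weight - rate)


# Toy example: three candidate actions with assumed values and rewards.
action_values = [0.1, 0.4, 0.9]
rewards = [-1.0, 1.0, 1.0]

best = greedy_action(action_values)                         # scans all values
first = foremost_action(range(3), lambda a: rewards[a])     # stops at first reward
print(best, first)
```

In this toy setting the greedy rule has to compare every action value before choosing, while the foremost rule stops as soon as one action is rewarded, which is the property the abstract associates with online learning.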