For the multi-satellite collaborative dynamic task planning problem, heuristic-based re-planning algorithms have mostly been used in the past; however, because heuristic strategies depend on the specific tasks, solution optimality suffers. Noting that the history of collaborative planning influences subsequent collaborative planning, this paper proposes a hybrid learning algorithm combining policy-iteration-based multi-agent reinforcement learning with transfer learning to find an approximately optimal policy for this problem. The multi-agent reinforcement learning method represents each satellite's policy as a neural network, and uses co-evolution to iteratively search for policy-network individuals with the optimal topology and connection weights. Because stochastically arriving observation task requests invalidate previously learned policies, transfer learning converts a historical policy into the initial policy for the current problem, accelerating multi-satellite collaborative task planning while preserving planning quality. Simulation experiments and their analysis show that the proposed algorithm adapts well to dynamically and stochastically arriving task requests.
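The approach above can be illustrated with a minimal sketch. This is not the paper's implementation: the genome encoding, fitness function, toy task data, and function names (`evolve`, `mutate`, `policy`) are all simplified assumptions. It shows only the two core ideas in miniature: a co-evolution-style loop that searches policy-network weights, and transfer of the historical best policy as the seed population when new task requests arrive.

```python
import random

random.seed(0)

def make_genome(n_in=4, n_out=2):
    # Minimal genome: direct input->output connections with weights.
    # (The paper also evolves topology; only weights are evolved here.)
    return {(i, o): random.uniform(-1, 1) for i in range(n_in) for o in range(n_out)}

def policy(genome, obs, n_out=2):
    # Evaluate the policy network: weighted sums over connections,
    # then pick the highest-scoring action (a hypothetical task slot).
    scores = [0.0] * n_out
    for (i, o), w in genome.items():
        scores[o] += w * obs[i]
    return scores.index(max(scores))

def fitness(genome, tasks):
    # Hypothetical planning reward: +1 whenever the chosen action
    # matches the task's label.
    return sum(policy(genome, obs) == label for obs, label in tasks)

def mutate(genome, sigma=0.3):
    # Perturb one connection weight (weight-only mutation for brevity).
    child = dict(genome)
    key = random.choice(list(child))
    child[key] += random.gauss(0, sigma)
    return child

def evolve(tasks, seed_genome=None, pop_size=20, gens=30):
    # Transfer learning: when a historical policy is given, seed the
    # population from it instead of starting from random genomes.
    pop = [mutate(seed_genome) if seed_genome else make_genome()
           for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=lambda g: fitness(g, tasks), reverse=True)
        elite = pop[: pop_size // 4]                      # keep the best quarter
        pop = elite + [mutate(random.choice(elite))       # refill via mutation
                       for _ in range(pop_size - len(elite))]
    return max(pop, key=lambda g: fitness(g, tasks))

# Toy observation/label pairs standing in for observation task requests.
tasks = [([random.random() for _ in range(4)], i % 2) for i in range(20)]
best = evolve(tasks)

# New task requests arrive: re-plan starting from the transferred policy.
new_tasks = [([random.random() for _ in range(4)], (i + 1) % 2) for i in range(20)]
adapted = evolve(new_tasks, seed_genome=best)
```

Seeding `evolve` with `best` is the transfer step: the search resumes near a previously good region of policy space rather than from scratch, which is what lets re-planning converge faster when requests change.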