To address multi-user channel and power resource allocation in cognitive radio, this paper proposes a multi-agent reinforcement learning method based on user clustering and variable learning rates. First, hierarchical processing separates channel selection from power control, and channels are allocated by a fast optimal search combined with user-count balancing. Second, the multi-user power control problem is modeled within a stochastic game framework; K-means user clustering reduces the number of users taking part in the game and lowers the environmental complexity faced by each user, while a variable Q-learning rate and a variable policy learning rate further promote convergence of the multi-agent reinforcement learning. Simulation results show that the method makes the power states and the total payoff of multiple users converge effectively and brings overall performance to a suboptimal (near-optimal) level.
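The abstract does not give the update rule behind the variable Q-learning rate and policy learning rate. As a rough illustration only, the sketch below assumes a WoLF-PHC-style rule ("win or learn fast" policy hill climbing), a known variable-learning-rate technique swapped in here and not necessarily the one used in the paper. The class name, the discrete power-level actions, the decay schedule for the Q-learning rate, and all constants are hypothetical.

```python
import random


class VariableRateAgent:
    """One user choosing among discrete power levels (hypothetical setup)."""

    def __init__(self, n_states, n_actions, gamma=0.9, epsilon=0.1,
                 delta_win=0.01, delta_lose=0.04, alpha_min=0.05, seed=0):
        self.n_actions = n_actions
        self.gamma = gamma
        self.epsilon = epsilon
        self.delta_win = delta_win      # small policy step when "winning"
        self.delta_lose = delta_lose    # larger policy step when "losing"
        self.alpha_min = alpha_min      # floor for the Q-learning rate
        self.rng = random.Random(seed)
        uniform = 1.0 / n_actions
        self.q = [[0.0] * n_actions for _ in range(n_states)]
        self.pi = [[uniform] * n_actions for _ in range(n_states)]      # current policy
        self.pi_avg = [[uniform] * n_actions for _ in range(n_states)]  # average policy
        self.visits = [0] * n_states

    def act(self, s):
        # Explore uniformly with probability epsilon, else sample from the policy.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        return self.rng.choices(range(self.n_actions), weights=self.pi[s])[0]

    def update(self, s, a, reward, s_next):
        self.visits[s] += 1
        n = self.visits[s]

        # Variable Q-learning rate: decays with visits, bounded below (assumed schedule).
        alpha = max(self.alpha_min, 1.0 / (1.0 + 0.1 * n))
        target = reward + self.gamma * max(self.q[s_next])
        self.q[s][a] += alpha * (target - self.q[s][a])

        # Running average of the policy used in this state.
        for b in range(self.n_actions):
            self.pi_avg[s][b] += (self.pi[s][b] - self.pi_avg[s][b]) / n

        # Variable policy learning rate: "winning" if the current policy has a
        # higher expected Q-value than the average policy, otherwise "losing".
        v_pi = sum(p * q for p, q in zip(self.pi[s], self.q[s]))
        v_avg = sum(p * q for p, q in zip(self.pi_avg[s], self.q[s]))
        delta = self.delta_win if v_pi > v_avg else self.delta_lose

        # Hill climbing: shift probability mass toward the greedy action.
        best = max(range(self.n_actions), key=lambda b: self.q[s][b])
        for b in range(self.n_actions):
            if b != best:
                step = min(self.pi[s][b], delta / (self.n_actions - 1))
                self.pi[s][b] -= step
                self.pi[s][best] += step
```

In a multi-user setting, each user (or each cluster representative, if clustering is used to shrink the game) would hold one such agent and call `act` and `update` once per time slot with a reward derived from its own payoff. The reward definition and the state encoding are left open here because the abstract does not specify them.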