Papers related to TD(λ)
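In recent years, policy gradient methods in reinforcement learning have attracted wide attention for their good convergence properties. This work studies natural gradient algorithms in the average-reward model; with regard to existing algorithms, when estimating the gradient ......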
A promising approach to learn to play board games is to use reinforcement learning algorithms that can learn a game posi......