论文部分内容阅读
One of the main problems in cooperative multiagent learning is that the joint action space grows exponentially with the number of agents. In this paper, we investigate a sparse representation of the coordination dependencies between agents to employ roles and context-specific coordination graphs to reduce the joint action space. In our framework, the global joint Q-function is decomposed into a number of local Q-functions. Each local Q-function is shared among a small group of agents and is composed of a set of value rules. We propose a novel multiagent Q-learning algorithm which learns the weights in each value rule automatically.We give empirical evidence to show that our learning algorithm converges to the same optimal policy with a significantly faster speed than traditional multiagent learning techniques.