This paper proposes an approximate algorithm that solves interactive dynamic influence diagrams (I-DIDs) segment by segment, based on the principle of behavioral equivalence. The lower-level I-DID model is first decomposed into sub-segments, each containing several time slices. The first segment is solved to obtain a policy tree for each model, and the policy trees are merged according to behavioral equivalence to form a policy graph. This graph serves as the initial model of the next segment, which is then solved in turn. The process is repeated until the last segment is finished, yielding a complete policy graph that guides the agent in deciding whether to update its models. Finally, experiments and algorithm comparisons are carried out on the multi-agent tiger problem. The results verify the effectiveness of the proposed algorithm in terms of both the quality of the model solution and the size of the model space.
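The segmentation and merging steps described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: it assumes that two models count as behaviorally equivalent when their policy trees are identical, and it leaves out the I-DID solver entirely. The names `split_horizon`, `canonical`, and `merge_equivalent`, the tuple encoding of a policy tree, and the tiger-problem labels (`L` for listen, `OL`/`OR` for opening a door, `GL`/`GR` for growl observations) are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Tuple

# Assumed encoding: a policy tree is (action, {observation: subtree});
# a leaf has an empty observation dict.
PolicyTree = Tuple[str, Dict[str, "PolicyTree"]]


def split_horizon(horizon: int, segment_len: int) -> List[range]:
    """Split time slices 0..horizon-1 into consecutive sub-segments."""
    return [
        range(start, min(start + segment_len, horizon))
        for start in range(0, horizon, segment_len)
    ]


def canonical(tree: PolicyTree) -> tuple:
    """Return a hashable form of a policy tree, independent of dict order."""
    action, children = tree
    return (
        action,
        tuple(sorted((obs, canonical(sub)) for obs, sub in children.items())),
    )


def merge_equivalent(trees: Dict[str, PolicyTree]) -> List[List[str]]:
    """Group model names whose policy trees are identical.

    Each group stands for one merged node set in the policy graph.
    """
    classes: Dict[tuple, List[str]] = {}
    for name, tree in trees.items():
        classes.setdefault(canonical(tree), []).append(name)
    return list(classes.values())


# Hypothetical two-step policy trees for three candidate models.
models: Dict[str, PolicyTree] = {
    "m1": ("L", {"GL": ("OR", {}), "GR": ("OL", {})}),
    "m2": ("L", {"GR": ("OL", {}), "GL": ("OR", {})}),  # same policy as m1
    "m3": ("L", {"GL": ("L", {}), "GR": ("L", {})}),
}

segments = split_horizon(horizon=7, segment_len=3)
groups = merge_equivalent(models)
```

Here `m1` and `m2` fall into one group, so only two distinct models would be carried into the next segment instead of three. In the approach described above, that reduced set would seed the solution of the following segment.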