A MULTI-AGENT LOCAL-LEARNING ALGORITHM UNDER GROUP ENVIRONMENT

Source: Journal of Electronics (China)
In this paper, a local-learning algorithm for multi-agent systems is presented, based on the fact that an individual agent performs only local perception and local interaction in a group environment. In individual learning, an agent adopts a greedy strategy to maximize its reward when interacting with the environment. In group learning, local interaction takes place between each pair of agents. A local-learning algorithm for choosing and modifying agents' actions is proposed to improve the traditional Q-learning algorithm, covering zero-sum games and general-sum games with a unique equilibrium or multiple equilibria. The algorithm is proved to be convergent, and its computational complexity is lower than that of Nash-Q. Additionally, a grid-game test indicates that, with this local-learning algorithm, the local behaviors of agents can spread globally.
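For orientation, the traditional single-agent Q-learning baseline that the abstract sets out to improve can be sketched as below. This is only a minimal illustration of the standard update rule with greedy action selection, not the paper's local-learning algorithm, whose details are not given in the abstract. The function names, the grid-style states, the action labels, and the values of `alpha` and `gamma` are all assumptions made for the example.

```python
from collections import defaultdict


def greedy_action(q, state, actions):
    """Return the action with the highest Q-value in `state`.

    Ties are broken in favour of the action listed first in `actions`.
    """
    return max(actions, key=lambda a: q[(state, a)])


def q_update(q, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.9):
    """Apply one standard Q-learning update and return the new Q-value.

    alpha (learning rate) and gamma (discount factor) are illustrative
    defaults, not values taken from the paper.
    """
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical grid-game step: move right from cell (0, 0) to (0, 1)
# and receive a reward of 1.0. All Q-values start at zero.
q = defaultdict(float)
actions = ["up", "down", "left", "right"]
new_value = q_update(q, (0, 0), "right", 1.0, (0, 1), actions)
chosen = greedy_action(q, (0, 0), actions)
```

After this single step, the Q-value for moving right from (0, 0) has increased from zero, so a greedy agent in that cell would now choose "right". In a multi-agent setting such as the one the abstract describes, each agent's reward would also depend on the actions of the agents it interacts with, which this single-agent sketch does not model.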