Reinforcement Learning from Algorithm Model to Industry Innovation: A Foundation Stone of Future Art

Source: ZTE Communications (English edition)
Reinforcement learning (RL) algorithms have been studied for several decades and have become a paradigm for sequential decision-making and control. The development of reinforcement learning, especially in recent years, has enabled it to be applied in many industrial fields, such as robotics, medical intelligence, and games. This paper first introduces the history and background of reinforcement learning, and then surveys its industrial applications and open-source platforms. It then focuses on the successful applications from AlphaGo to AlphaZero and on future reinforcement learning techniques. Finally, it considers artificial intelligence for complex interaction (e.g., stochastic environments, multiple players, selfish behavior, and distributed optimization) and concludes with highlights and an outlook on future general artificial intelligence.