Greedy feature replacement for online value function approximation

Source: Journal of Zhejiang University-Science C(Computers & Electro

Authors: Feng-fei ZHAO, Zheng QIN, Zhuo SHAO, Jun FANG, Bo-yan REN

Affiliations: Department of Computer Science and Technology, Tsinghua University; School of Software, Tsinghua Univ

Publication date: 2014, Issue 3

Keywords: replacement, automatically, greedy, handle, guarantees, expanded, selecting, replace, fa

Excerpt from the paper

Reinforcement learning (RL) in real-world problems requires function approximations that depend on selecting appropriate feature representations. Representational expansion techniques can make linear approximators represent value functions more effectively; however, most of these techniques work well only for low-dimensional problems. In this paper, we present greedy feature replacement (GFR), a novel online expansion technique for value-based RL algorithms that use binary features. Given a simple initial representation, the feature representation is expanded incrementally. New feature dependencies are added automatically to the current representation, and conjunctive features are used to replace current features greedily. The virtual temporal difference (TD) error is recorded for each conjunctive feature to judge whether the replacement can improve the approximation. Correctness guarantees and computational complexity analysis are provided for GFR. Experimental results in two domains show that GFR achieves much faster learning and has the capability to handle large-scale problems.
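
As a rough illustration of the setting described above, the following sketch shows a linear value estimate over binary features, a TD(0) update, and the enumeration of pairwise conjunctive features as replacement candidates. It is not the GFR algorithm from the paper: the function names, the dictionary-based weights, and the accumulated absolute TD error used as a candidate score are all assumptions made for this example, and the score is only a placeholder for the paper's virtual TD error, which the abstract does not define.

```python
from itertools import combinations


def value(weights, active):
    """Linear value estimate: sum of the weights of the active binary features."""
    return sum(weights.get(f, 0.0) for f in active)


def td_update(weights, active, reward, next_active, gamma=0.9, alpha=0.1):
    """One TD(0) step for a linear approximator with binary features.

    Returns the TD error so the caller can record it.
    """
    delta = reward + gamma * value(weights, next_active) - value(weights, active)
    for f in active:
        weights[f] = weights.get(f, 0.0) + alpha * delta
    return delta


def conjunctions(active):
    """Pairwise conjunctive features of the active base features (hypothetical helper)."""
    return list(combinations(sorted(active), 2))


def record_candidate_error(scores, active, delta):
    """Accumulate |TD error| per candidate conjunction.

    Placeholder score only; it stands in for, and is not, the paper's virtual TD error.
    """
    for pair in conjunctions(active):
        scores[pair] = scores.get(pair, 0.0) + abs(delta)


# Example: features 0 and 2 are active, the next state activates feature 1.
weights, scores = {}, {}
delta = td_update(weights, [0, 2], reward=1.0, next_active=[1])
record_candidate_error(scores, [0, 2], delta)
```

In this sketch, a candidate conjunction with a large accumulated score would be one to consider as a replacement for its parent features; the actual replacement criterion and its guarantees are those given in the paper, not here.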
