Direct heuristic dynamic programming based on an improved PID neural network

Source: Journal of Control Theory and Applications
In this paper, an improved PID neural network (IPIDNN) structure is proposed and applied to the critic and action networks of direct heuristic dynamic programming (DHDP). As one of the online learning algorithms of approximate dynamic programming (ADP), DHDP has demonstrated its applicability to large state and control problems. Theoretically, the DHDP algorithm requires access to full state feedback in order to obtain solutions to the Bellman optimality equation. Unfortunately, it is not always possible to access all the states in a real system. This paper proposes a solution: an IPIDNN configuration is used to construct the critic and action networks so as to achieve output feedback control. Since this structure can estimate the integrals and derivatives of measurable outputs, more system states are utilized and thus better control performance is expected. Compared with the traditional PIDNN, this configuration is flexible and easy to expand. Based on this structure, a gradient descent algorithm for the IPIDNN-based DHDP is presented. Convergence issues are addressed both within a single learning time step and for the entire learning process. Some important insights are provided to guide the implementation of the algorithm. The proposed learning controller has been applied to a cart-pole system to validate the effectiveness of the structure and the algorithm.
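The idea of recovering extra state information from a measurable output can be illustrated with a minimal sketch. The code below is not the IPIDNN structure described in the paper; it only shows, under simple assumptions, how proportional, integral, and derivative terms might be estimated from a sampled output signal. The function name `pid_features`, the rectangular integration rule, and the backward-difference derivative are all illustrative choices, not details taken from the source.

```python
def pid_features(outputs, dt):
    """Estimate (proportional, integral, derivative) terms for each sample.

    Hypothetical helper for illustration only: `outputs` is a sequence of
    measured output samples and `dt` is the assumed fixed sampling period.
    """
    features = []
    integral = 0.0
    previous = outputs[0]  # makes the derivative zero at the first sample
    for y in outputs:
        integral += y * dt                # rectangular (Euler) integration
        derivative = (y - previous) / dt  # backward difference
        features.append((y, integral, derivative))
        previous = y
    return features


if __name__ == "__main__":
    # A ramp output sampled every 0.5 s.
    for p, i, d in pid_features([0.0, 1.0, 2.0], dt=0.5):
        print(p, i, d)
```

In an output feedback setting, triples of this kind could serve as inputs to a critic or action network in place of states that cannot be measured directly. How the paper's IPIDNN actually computes and weights these terms is not specified in the abstract.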