Papers related to reinforcement learning
Reinforcement learning based energy efficient robot relay for unmanned aerial vehicles against smart
Unmanned aerial vehicles (UAVs) with limited energy resources, severe path loss, and shadowing to the ground base stations......
We tackle the online 3D bin packing problem (3D-BPP), a challenging yet practically useful variant of the classical bin p......
In this paper, we introduce the Anderson acceleration technique developed to be applied to reinforcement learning tasks. W......
The H∞ control method is an effective approach for attenuating the effect of disturbances on practical systems, but it ......
Underwater optical imaging produces images with high resolution and abundant information and hence has outstanding advan......
Given a collection of parameterized multi-robot controllers associated with individual behaviors designed for particular......
Reinforcement learning is one of the fastest growing areas in machine learning, and has obtained great achievements in bi......
Reinforcement learning is about learning agent models that make the best sequential decisions in unknown environments. I......
Actor-Critic Reinforcement Learning and Application in Developing Computer-Vision-Based Interface Tr
This paper synchronizes control theory with computer vision by formalizing object tracking as a sequential decision-mak......
Abnormal or drastic changes in the natural environment may lead to unexpected events, such as tsunamis and earthquakes, wh......
The popularity of IEEE 802.11-based wireless local area networks (WLANs) has increased significantly in recent years and resul......
Self-driving car navigation with obstacle avoidance is a hot topic in the academic world. The goal of this problem is......
Path planning for autonomous vehicles is a hot topic in the academic world. The goal of this problem is to design a vehi-......
Depression detection is a significant issue for human well-being. Conventional diagnosis of depression requires a face-to......
Recently, interactive character animations in computer games rely mainly on motion-captured or carefully crafted moti......
Least-Squares Temporal Difference Learning with Eligibility Traces based on Regularized Extreme Lear
The task of learning the value function under a fixed policy in continuous Markov decision processes (MDPs) is considered.......
When some rodent mammals are foraging or looking for a target, the positions and headings in their brain cells are ......
Uncovering representations and algorithms of decision making by model-based analysis of striatal neu
The striatum is a major input site of the basal ganglia, which play an essential role in decision making....
Development of a Deep Learning Model for Binding Affinity Prediction and Fragment-based de novo Drug
The traditional drug design and discovery methods were time-consuming and expensive, which largely reduced the efficiency......
Over the past decade, deep learning (DL) has achieved remarkable success in various artificial intelligence (AI) research are......
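The energy characteristics of excimer lasers for lithography are critical in the lithography process of integrated circuits and directly affect the precision of the lines exposed by the lithography machine. To realize, in measuring these energy characteristics, the energy......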
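To address the problem that current remote sensing object detection methods can only identify the category and location of remote sensing targets and cannot generate text descriptions related to the content of remote sensing images, this work proposes one based on attention and reinf......
In the 5G era, mobile devices generate massive amounts of data, most of which is multimedia content. Transmitting multimedia content at this scale over wireless networks consumes a large amount of wireless spectrum resources, which in turn leads to......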
Revisiting the ODE Method for Recursive Algorithms:Fast Convergence Using Quasi Stochastic Approxima
Several decades ago, Profs. Sean Meyn and Lei Guo were postdoctoral fellows at ANU, where they shared interest in recursive......
Single Exposure to Cocaine Impairs Reinforcement Learning by Potentiating the Activity of Neurons in
Plasticity in the glutamatergic synapses on striatal medium spiny neurons (MSNs) is not only essential for behavioral ad......
One of the hallmarks of human society is the ubiquitous interactions among individuals. Indeed, a significant portion of h......
Deep reinforcement learning (DRL), which combines deep learning with reinforcement learning, has achieved great success ......
With the rapid development of artificial intelligence (AI) technology and its successful application in various fields, mod......
Sepsis treatment is a highly challenging effort to reduce mortality in hospital intensive care units since the treatment......
Multi-agent reinforcement learning (MARL) has long been a significant research topic in both machine learning and control......
In this paper, a general resource distribution game with a hierarchical structure on the bipartite graph is proposed. In t......
It is shown that we can control spatiotemporal chaos in the Frenkel-Kontorova (FK) model by a model-free control method ......
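This paper, from the aspects of form traveler load calculation, the construction process, bearing and temporary fixation construction, form traveler installation and testing, closure segment construction, formwork fabrication and installation, rebar installation, and concrete pouring and curing......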
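The term reinforcement learning comes from behavioral psychology, a discipline that regards learning as a trial-and-error process for mapping environment states to actions. This property of reinforcement learning necessarily increases the intelligent sys......
The article introduces the reinforcement learning model, presents four main reinforcement learning algorithms (dynamic programming, the Monte Carlo method, temporal-difference learning, and Q-learning), and points out the differences between them......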
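The unichain or communicating MDP assumption of average-reward reinforcement learning is dropped; based on eligibility traces and the discounted-reward SARSA(λ) algorithm, traditional average-reward reinforcement learning is generalized, and......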
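The paper briefly introduces multi-agent technology and information fusion systems, applies multi-agent technology to information fusion systems, improves the models and methods used in them, and proposes a multi-ag......
As applications of multi-mobile-robot coordination systems move toward unknown environments, some path planning methods that depend on an environment model are no longer applicable, whereas using reinforcement learning to interact directly with the environment does not require prior......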
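By introducing a negotiation protocol, analyzing proposal forms and the negotiation process, and combining multi-attribute utility theory with sequential decision processes, this paper proposes an open, dynamic, learning-mechanism-supporting......
This paper proposes a game-theory-based function optimization algorithm. The algorithm maps the search space of the optimization problem to the strategy profile space of a game, and maps the optimization objective function to the game's utility func......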
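The problems of discounted reinforcement learning are analyzed, comparative experiments on discounting are carried out for the SARSA(λ) algorithm on MDPs, and the influence of the average-reward constant on the undiscounted SARSA(λ) algorithm is discussed.......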