Optimal Policies for Quantum Markov Decision Processes

Source: International Journal of Automation and Computing (English edition)
The Markov decision process (MDP) offers a general framework for modelling sequential decision making where outcomes are random. In particular, it serves as a mathematical framework for reinforcement learning. This paper introduces an extension of the MDP, namely the quantum MDP (qMDP), that can serve as a mathematical model of decision making about quantum systems. We develop dynamic programming algorithms for policy evaluation and for finding optimal policies for qMDPs in the finite-horizon case. The results obtained in this paper provide some useful mathematical tools for reinforcement learning techniques applied to the quantum world.
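
For orientation, the following is a minimal sketch of finite-horizon dynamic programming (backward induction) for a classical MDP, the setting that the qMDP extends. It is not the paper's qMDP algorithm: the function name `backward_induction`, the table layouts `P[a][s][t]` and `R[a][s]`, and the two-state example are illustrative assumptions, and a quantum version would presumably replace probability tables with quantum states and operations.

```python
def backward_induction(P, R, horizon):
    """Finite-horizon dynamic programming for a classical MDP (illustrative sketch).

    P[a][s][t]: assumed probability of moving from state s to state t under action a.
    R[a][s]:    assumed expected immediate reward for taking action a in state s.
    Returns (V, policy): V[s] is the optimal expected total reward over `horizon`
    steps starting in s; policy[k][s] is an optimal action at step k (0-indexed).
    """
    n_actions = len(P)
    n_states = len(P[0])
    V = [0.0] * n_states  # value when no steps remain
    policy = []
    for _ in range(horizon):
        new_V, step_policy = [], []
        for s in range(n_states):
            # Expected return of each action: immediate reward plus value of the successor.
            q = [
                R[a][s] + sum(P[a][s][t] * V[t] for t in range(n_states))
                for a in range(n_actions)
            ]
            best = max(range(n_actions), key=lambda a: q[a])
            new_V.append(q[best])
            step_policy.append(best)
        V = new_V
        policy.append(step_policy)
    policy.reverse()  # rules were computed from the last step backwards
    return V, policy


# Hypothetical two-state example: action 0 stays, action 1 switches state;
# acting while in state 1 earns reward 1, acting in state 0 earns nothing.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: switch
]
R = [
    [0.0, 1.0],  # action 0
    [0.0, 1.0],  # action 1
]
V, policy = backward_induction(P, R, horizon=2)
```

In this toy instance the first-step rule is to switch out of state 0 and to stay in state 1, since only state 1 yields reward. The policy is time-dependent, which is characteristic of the finite-horizon case.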