This talk concerns finite continuous-time Markov decision processes (CTMDPs) under the long-run average variance (AV) minimization criterion, whose goal is to find a policy with minimum AV within the class of expected average optimal policies. We first show, by means of examples, that in general an AV minimization policy may exist. Furthermore, in order to obtain an AV minimization policy, we then consider a special but important class of CTMDPs that is larger than the class of ergodic CTMDPs. Using a martingale technique, we prove that the AV criterion for this class of CTMDPs can be transformed into an equivalent expected average criterion, and thus the existence and calculation of an AV minimization policy are obtained by a policy iteration algorithm in a finite number of iterations.
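As a rough illustration of the last step only, the sketch below runs a textbook policy iteration for the expected average criterion on a small finite CTMDP. It is not the algorithm of the talk: it assumes a unichain model with reward maximization, it does not carry out the martingale-based transformation of the AV criterion (the abstract does not give its form), and every name and number in it (`policy_iteration`, `RATES`, `REWARDS`) is hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def evaluate(policy, rates, rewards):
    """Return (gain g, bias h) solving r_f - g + Q_f h = 0 with h[0] = 0."""
    n = len(policy)
    a, b = [], []
    for i in range(n):
        q = rates[i][policy[i]]
        # Unknowns are (g, h[1], ..., h[n-1]); h[0] is pinned to 0.
        a.append([1.0] + [-q[j] for j in range(1, n)])
        b.append(rewards[i][policy[i]])
    x = solve_linear(a, b)
    return x[0], [0.0] + x[1:]


def policy_iteration(rates, rewards, tol=1e-9):
    """Average-reward policy iteration (assumes every policy is unichain)."""
    n = len(rates)
    policy = [0] * n
    while True:
        g, h = evaluate(policy, rates, rewards)

        def score(i, a):
            return rewards[i][a] + sum(rates[i][a][j] * h[j] for j in range(n))

        improved = False
        for i in range(n):
            best = max(range(len(rates[i])), key=lambda a: score(i, a))
            # Switch only on strict improvement, so the loop cannot cycle.
            if score(i, best) > score(i, policy[i]) + tol:
                policy[i] = best
                improved = True
        if not improved:
            return policy, g


# Hypothetical toy model: RATES[i][a][j] is the transition rate from state i
# to state j under action a (each row sums to zero); REWARDS[i][a] is the
# reward rate.
RATES = [[[-1.0, 1.0], [-3.0, 3.0]], [[2.0, -2.0]]]
REWARDS = [[1.0, 0.5], [4.0]]

policy, gain = policy_iteration(RATES, REWARDS)
print(policy, gain)
```

On this toy model the iteration stops after two evaluations, selecting the second action in state 0 with a long-run average reward of about 2.6. Because the policy changes only on a strict improvement and there are finitely many stationary deterministic policies, the loop terminates after finitely many iterations, which mirrors the finiteness claim above.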