Abstract
In this paper we develop the convex analytic approach to discounted discrete-time Markov decision processes (DTMDPs) in Borel state and action spaces with N constraints. Unlike classic discounted models, we allow a non-constant discount factor. After defining and characterizing the corresponding occupation measures, we rewrite the original constrained DTMDP as a convex program over the space of occupation measures, which we show to be convex and compact. In particular, we prove that every extreme point of the space of occupation measures can be generated by a deterministic stationary policy for the DTMDP. We then prove that the resulting convex program admits a solution expressible as a convex combination of N+1 extreme points of the space of occupation measures. As a consequence, the original constrained DTMDP admits a randomized stationary optimal policy.
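To make the central object concrete, the following is a minimal sketch (not from the paper) of a discounted occupation measure on a hypothetical 2-state, 2-action finite MDP with a constant discount factor; the names `P`, `pi`, `beta`, and `d0` are illustrative assumptions. It computes mu(x, a) = sum_t beta^t * Pr(x_t = x, a_t = a) under a randomized stationary policy by truncated forward iteration; the total mass is 1/(1 - beta).

```python
# Hypothetical finite example (illustrative only): occupation measure of a
# randomized stationary policy, computed by truncating the discounted series.
beta = 0.9                            # constant discount factor for simplicity
P = {                                 # P[x][a][y] = transition prob. x --a--> y
    0: {0: [1.0, 0.0], 1: [0.0, 1.0]},
    1: {0: [1.0, 0.0], 1: [0.0, 1.0]},
}
pi = {0: [0.5, 0.5], 1: [1.0, 0.0]}   # randomized stationary policy pi(a | x)
d0 = [1.0, 0.0]                       # initial state distribution
d = d0[:]                             # state distribution at time t

mu = {(x, a): 0.0 for x in (0, 1) for a in (0, 1)}
coef = 1.0                            # beta^t
for _ in range(500):                  # beta^500 is negligible, so truncate here
    for x in (0, 1):
        for a in (0, 1):
            mu[(x, a)] += coef * d[x] * pi[x][a]
    d = [sum(d[x] * pi[x][a] * P[x][a][y] for x in (0, 1) for a in (0, 1))
         for y in (0, 1)]
    coef *= beta

total = sum(mu.values())              # equals 1/(1 - beta) up to truncation
```

Up to truncation error, `mu` also satisfies the characterizing flow equation sum_a mu(y, a) = d0(y) + beta * sum_{x,a} mu(x, a) * P(x, a, y); it is such linear constraints that turn the constrained DTMDP into a convex program over occupation measures.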
Zhang, Y. Convex analytic approach to constrained discounted Markov decision processes with non-constant discount factors. TOP 21, 378–408 (2013). https://doi.org/10.1007/s11750-011-0186-8