Optimal policies for constrained average-cost Markov decision processes

González-Hernández, Juan; Villarreal, César E.

doi:10.1007/s11750-009-0110-7

Optimal policies for constrained average-cost Markov decision processes

Original Paper
Published: 23 July 2009

Volume 19, pages 107–120, (2011)
Cite this article

TOP Aims and scope Submit manuscript

Juan González-Hernández¹ &
César E. Villarreal²

190 Accesses
7 Citations
Explore all metrics

Abstract

We give mild conditions for the existence of optimal solutions for a Markov decision problem with average cost, under m constraints of the same kind, in Borel actions and states spaces. Moreover, there is an optimal policy that is a convex combination of at most m+1 deterministic policies.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Existence and Uniqueness of Quasi-stationary Distributions for Symmetric Markov Processes with Tightness Property

Article 17 January 2019

Symmetric Markov Processes with Tightness Property

Stochastic differential games with controlled regime-switching

Article 31 May 2024

References

Altman E (1999) Constraint Markov decision processes. Chapman & Hall/CRC, Boca Raton
Google Scholar
Ash RB (1972) Real analysis and probability. Academic Press, London
Google Scholar
Beutler FJ, Ross KW (1985) Optimal policies for controlled Markov chains with a constraint. J Math Anal Appl 112:236–252
Article Google Scholar
Billingsley P (1995) Probability and measure, 3rd edn. Wiley–Interscience, New York
Google Scholar
Billingsley P (1999) Convergence of probability measures, 2nd edn. Wiley–Interscience, New York
Book Google Scholar
Borkar VS (1994) Ergodic control of Markov chains with constraints—the general case. SIAM J Control Optim 32:176–186
Article Google Scholar
Bourbaki N (1969) Éléments de mathématique, chapitre IX. Hermann, Paris
Google Scholar
Collins EJ, McNamara JM (1998) Finite-horizon dynamic optimization when the terminal reward is a concave function of the distribution of the final state. Adv Appl Probab 30:122–136
Article Google Scholar
Feimberg EA, Shwartz A (1996) Constrained discounted dynamic programming. Math Oper Res 21:922–945
Article Google Scholar
Frid EB (1972) On optimal strategies in control problems with constraints. Theory Probab Appl 17:188–192
Article Google Scholar
González-Hernández J, Hernández-Lerma O (2005) Extreme points of sets randomized strategies in constrained optimization and control problems. SIAM J Optim 15(4):1085–1104
Article Google Scholar
Haviv M (1996) On constrained Markov decision processes. Oper Res Lett 19(1):25–28
Article Google Scholar
Hernández-Lerma O, González-Hernández J (2000) Constrained Markov control processes in Borel spaces: the discounted case. Math Methods Oper Res 52:271–285
Article Google Scholar
Hernández-Lerma O, Lasserre JB (1996) Discrete-time Markov control processes: basic optimality criteria. Springer, New York
Google Scholar
Hernández-Lerma O, González-Hernández J, López-Martínez RR (2003) Constrained average cost Markov control processes in Borel space. SIAM J Control Optim 42(2):442–468
Article Google Scholar
Hinderer K (1970) Foundations of non-stationary dynamic programming discrete-time parameter. Lecture notes oper res math syst, vol 33. Springer, Berlin
Google Scholar
Hu Q, Yue W (2008) Markov decision processes with their applications. Springer, New York
Google Scholar
Karr AF (1983) Extreme points of certain sets of probability measures with applications. Math Oper Res 8:74–85
Article Google Scholar
Kurano M, Nakagami J, Huang Y (2000a) Constrained Markov decision processes with compact state and action spaces: the average case. Optimization 48(2):255–269
Article Google Scholar
Kurano M, Yasuda M, Nakagami J, Yoshida Y (2000b) A fuzzy treatment of uncertain Markov decision processes: average case. In Proceedings of ASSM2000 International Conference on Applied Stochastic System Modeling, Kyoto, pp 148–157
Loève M (1977) Probability theory, vol 1, 4th edn. Springer, New York
Google Scholar
Munkres JR (1975) Topology: a first course. Prentice–Hall, New Jersey
Google Scholar
Phelps RR (1966) Lectures on Choquet’s theorem. Van Nostrand, New York
Google Scholar
Piunovskiy AB (1993) Control of random sequences in problems with constraints. Theory Probab Appl 38(4):751–762
Article Google Scholar
Piunovskiy AB (1997) Optimal control of random sequences in problems with constraints. Kluwer Academic, Dordrecht
Google Scholar
Piunovskiy AB, Khametov VM (1991) Optimal control by random sequences with constraints. Math Notes 49:654–656
Google Scholar
Sennott LI (1991) Constrained discounted Markov decision chains. Probab Eng Inf Sci 5:463–475
Article Google Scholar
Sennott LI (1993) Constrained average cost Markov decision chains. Probab Eng Inf Sci 7:69–83
Article Google Scholar
Tanaka K (1991) On discounted dynamic programming with constraints. J Math Anal Appl 155:264–277
Article Google Scholar
Winkler G (1988) Extreme points of moment sets. Math Oper Res 13:581–587
Article Google Scholar
Yoshida K (1978) Functional analysis, 5th edn. Springer, Berlin
Google Scholar
Yushkevich AA (1997) The compactness of a policy space in dynamic programming via an extension theorem for Carathéodory functions. Math Oper Res 22:458–467
Article Google Scholar

Download references

Author information

Authors and Affiliations

Departamento de Probabilidad y Estadística, Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas, Universidad Nacional Autónoma de México, Apartado postal 20-726, Admon. No. 20, delegación Álvaro Obregón, 01000, Mexico City, D.F., Mexico
Juan González-Hernández
Posgrado en Ingeniería de Sistemas, Facultad de Ingeniería Mecánica y Eléctrica, Universidad Autónoma de Nuevo León, Ciudad Universitaria, 66450, San Nicolás de los Garza, N.L., Mexico
César E. Villarreal

Authors

Juan González-Hernández
View author publications
You can also search for this author in PubMed Google Scholar
César E. Villarreal
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to César E. Villarreal.

Rights and permissions

Reprints and permissions

About this article

Cite this article

González-Hernández, J., Villarreal, C.E. Optimal policies for constrained average-cost Markov decision processes. TOP 19, 107–120 (2011). https://doi.org/10.1007/s11750-009-0110-7

Download citation

Received: 12 December 2008
Accepted: 02 July 2009
Published: 23 July 2009
Issue Date: July 2011
DOI: https://doi.org/10.1007/s11750-009-0110-7

Keywords

Mathematics Subject Classification (2000)

90C40

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Optimal policies for constrained average-cost Markov decision processes

Abstract

Access this article

Similar content being viewed by others

Existence and Uniqueness of Quasi-stationary Distributions for Symmetric Markov Processes with Tightness Property

Symmetric Markov Processes with Tightness Property

Stochastic differential games with controlled regime-switching

References

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Keywords

Mathematics Subject Classification (2000)

Navigation

Optimal policies for constrained average-cost Markov decision processes

Abstract

Access this article

Similar content being viewed by others

Existence and Uniqueness of Quasi-stationary Distributions for Symmetric Markov Processes with Tightness Property

Symmetric Markov Processes with Tightness Property

Stochastic differential games with controlled regime-switching

References

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Mathematics Subject Classification (2000)

Search

Navigation