Skip to main content
Log in

Optimal policies for constrained average-cost Markov decision processes

  • Original Paper
  • Published:
TOP Aims and scope Submit manuscript

Abstract

We give mild conditions for the existence of optimal solutions for a Markov decision problem with average cost, under m constraints of the same kind, in Borel actions and states spaces. Moreover, there is an optimal policy that is a convex combination of at most m+1 deterministic policies.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

References

  • Altman E (1999) Constraint Markov decision processes. Chapman & Hall/CRC, Boca Raton

    Google Scholar 

  • Ash RB (1972) Real analysis and probability. Academic Press, London

    Google Scholar 

  • Beutler FJ, Ross KW (1985) Optimal policies for controlled Markov chains with a constraint. J Math Anal Appl 112:236–252

    Article  Google Scholar 

  • Billingsley P (1995) Probability and measure, 3rd edn. Wiley–Interscience, New York

    Google Scholar 

  • Billingsley P (1999) Convergence of probability measures, 2nd edn. Wiley–Interscience, New York

    Book  Google Scholar 

  • Borkar VS (1994) Ergodic control of Markov chains with constraints—the general case. SIAM J Control Optim 32:176–186

    Article  Google Scholar 

  • Bourbaki N (1969) Éléments de mathématique, chapitre IX. Hermann, Paris

    Google Scholar 

  • Collins EJ, McNamara JM (1998) Finite-horizon dynamic optimization when the terminal reward is a concave function of the distribution of the final state. Adv Appl Probab 30:122–136

    Article  Google Scholar 

  • Feimberg EA, Shwartz A (1996) Constrained discounted dynamic programming. Math Oper Res 21:922–945

    Article  Google Scholar 

  • Frid EB (1972) On optimal strategies in control problems with constraints. Theory Probab Appl 17:188–192

    Article  Google Scholar 

  • González-Hernández J, Hernández-Lerma O (2005) Extreme points of sets randomized strategies in constrained optimization and control problems. SIAM J Optim 15(4):1085–1104

    Article  Google Scholar 

  • Haviv M (1996) On constrained Markov decision processes. Oper Res Lett 19(1):25–28

    Article  Google Scholar 

  • Hernández-Lerma O, González-Hernández J (2000) Constrained Markov control processes in Borel spaces: the discounted case. Math Methods Oper Res 52:271–285

    Article  Google Scholar 

  • Hernández-Lerma O, Lasserre JB (1996) Discrete-time Markov control processes: basic optimality criteria. Springer, New York

    Google Scholar 

  • Hernández-Lerma O, González-Hernández J, López-Martínez RR (2003) Constrained average cost Markov control processes in Borel space. SIAM J Control Optim 42(2):442–468

    Article  Google Scholar 

  • Hinderer K (1970) Foundations of non-stationary dynamic programming discrete-time parameter. Lecture notes oper res math syst, vol 33. Springer, Berlin

    Google Scholar 

  • Hu Q, Yue W (2008) Markov decision processes with their applications. Springer, New York

    Google Scholar 

  • Karr AF (1983) Extreme points of certain sets of probability measures with applications. Math Oper Res 8:74–85

    Article  Google Scholar 

  • Kurano M, Nakagami J, Huang Y (2000a) Constrained Markov decision processes with compact state and action spaces: the average case. Optimization 48(2):255–269

    Article  Google Scholar 

  • Kurano M, Yasuda M, Nakagami J, Yoshida Y (2000b) A fuzzy treatment of uncertain Markov decision processes: average case. In Proceedings of ASSM2000 International Conference on Applied Stochastic System Modeling, Kyoto, pp 148–157

  • Loève M (1977) Probability theory, vol 1, 4th edn. Springer, New York

    Google Scholar 

  • Munkres JR (1975) Topology: a first course. Prentice–Hall, New Jersey

    Google Scholar 

  • Phelps RR (1966) Lectures on Choquet’s theorem. Van Nostrand, New York

    Google Scholar 

  • Piunovskiy AB (1993) Control of random sequences in problems with constraints. Theory Probab Appl 38(4):751–762

    Article  Google Scholar 

  • Piunovskiy AB (1997) Optimal control of random sequences in problems with constraints. Kluwer Academic, Dordrecht

    Google Scholar 

  • Piunovskiy AB, Khametov VM (1991) Optimal control by random sequences with constraints. Math Notes 49:654–656

    Google Scholar 

  • Sennott LI (1991) Constrained discounted Markov decision chains. Probab Eng Inf Sci 5:463–475

    Article  Google Scholar 

  • Sennott LI (1993) Constrained average cost Markov decision chains. Probab Eng Inf Sci 7:69–83

    Article  Google Scholar 

  • Tanaka K (1991) On discounted dynamic programming with constraints. J Math Anal Appl 155:264–277

    Article  Google Scholar 

  • Winkler G (1988) Extreme points of moment sets. Math Oper Res 13:581–587

    Article  Google Scholar 

  • Yoshida K (1978) Functional analysis, 5th edn. Springer, Berlin

    Google Scholar 

  • Yushkevich AA (1997) The compactness of a policy space in dynamic programming via an extension theorem for Carathéodory functions. Math Oper Res 22:458–467

    Article  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to César E. Villarreal.

Rights and permissions

Reprints and permissions

About this article

Cite this article

González-Hernández, J., Villarreal, C.E. Optimal policies for constrained average-cost Markov decision processes. TOP 19, 107–120 (2011). https://doi.org/10.1007/s11750-009-0110-7

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11750-009-0110-7

Keywords

Mathematics Subject Classification (2000)

Navigation