A survey of recent results on continuous-time Markov decision processes

Abstract

This paper is a survey of recent results on continuous-time Markov decision processes (MDPs) with unbounded transition rates, and reward rates that may be unbounded from above and from below. These results pertain to discounted and average reward optimality criteria, which are the most commonly used criteria, and also to more selective concepts, such as bias optimality and sensitive discount criteria. For concreteness, we consider only MDPs with a countable state space, but we indicate how the results can be extended to more general MDPs or to Markov games.
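The discounted criterion mentioned in the abstract can be made concrete with a small numerical sketch. When transition rates are bounded, uniformization converts a continuous-time MDP into an equivalent discrete-time MDP, and value iteration then produces a discounted-optimal stationary policy. The model below (admission control in an M/M/1 queue, with all rates, costs, and the state-space truncation) is an illustrative assumption of this sketch, not an example from the survey; note in particular that it uses bounded rates and a finite truncation of the countable state space, whereas the survey's focus is precisely on the unbounded case.

```python
# Illustrative sketch (not from the survey): discounted-optimal admission
# control for an M/M/1 queue, solved by uniformization + value iteration.
# All parameter values are assumptions chosen for the example.

N = 50               # truncation of the countable state space {0, 1, ...}
lam, mu = 1.0, 1.2   # arrival and service rates (bounded here by design)
R, h = 5.0, 1.0      # reward per accepted arrival, holding cost rate
alpha = 0.1          # continuous-time discount rate

Lam = lam + mu              # uniformization constant >= total exit rate
beta = Lam / (alpha + Lam)  # equivalent discrete-time discount factor

def q_value(V, x, a):
    # One-step Bellman value of action a (1 = accept, 0 = reject)
    # in state x on the uniformized discrete-time chain.
    r = R * lam * a - h * x          # reward rate r(x, a)
    up = x + 1 if a else x           # accepted arrival
    down = max(x - 1, 0)             # service completion (self-loop at 0)
    ev = (lam * a * V[up] + mu * V[down]
          + (Lam - lam * a - mu) * V[x]) / Lam
    return r / (alpha + Lam) + beta * ev

# Value iteration; at the truncation boundary x = N only rejection is allowed.
V = [0.0] * (N + 1)
for _ in range(2000):
    V = [max(q_value(V, x, a) for a in ((0, 1) if x < N else (0,)))
         for x in range(N + 1)]

# Recover the stationary policy: accept iff it is at least as good as rejecting.
policy = [int(x < N and q_value(V, x, 1) >= q_value(V, x, 0))
          for x in range(N + 1)]
```

For these parameters the computed policy is of threshold type: arrivals are accepted in short queues, where the reward outweighs the discounted holding cost, and rejected in long ones. The truncation is what makes the computation finite; the survey's results address exactly the regime this sketch avoids, where rates and rewards are unbounded and no such finite reduction applies directly.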

References

  • Albright S.C. and Winston W. (1979). A Birth-Death Model of Advertising and Pricing. Advances in Applied Probability 11, 134–152.

  • Allen L.J.S. (2003). An Introduction to Stochastic Processes with Applications to Biology. Pearson Education.

  • Anderson W.J. (1991). Continuous-Time Markov Chains. Springer.

  • Bailey N.T.J. (1975). The Mathematical Theory of Infectious Diseases and Its Applications. Griffin.

  • Bartholomew D.J. (1973). Stochastic Models for Social Processes, 2nd Edition. Wiley.

  • Bellman R. (1957). Dynamic Programming. Princeton University Press.

  • Berge C. (1963). Topological Spaces. Macmillan.

  • Bertsekas D.P. (2001). Dynamic Programming and Optimal Control, Vol. II, 2nd Edition. Athena Scientific.

  • Blackwell D. (1962). Discrete Dynamic Programming. Annals of Mathematical Statistics 33, 719–726.

  • Cao X.R. (2005). Basic Ideas for Event-Based Optimality of Markov Systems. Discrete Event Dynamic Systems: Theory and Applications 15, 169–197.

  • Cao X.R. and Guo X.P. (2006). Continuous-Time Markov Decision Processes with n-Potential Optimality Criteria. Preprint.

  • Cao X.R. and Zhang J.Y. (2007). The n-th Order Bias Optimality for Multi-Chain Markov Decision Processes. IEEE Transactions on Automatic Control. In press.

  • Dekker R. and Hordijk A. (1988). Average, Sensitive and Blackwell Optimal Policies in Denumerable Markov Decision Chains with Unbounded Rewards. Mathematics of Operations Research 13, 395–420.

  • Dekker R. and Hordijk A. (1992). Recurrence Conditions for Average and Blackwell Optimality in Denumerable State Markov Decision Chains. Mathematics of Operations Research 17, 271–289.

  • Doshi B.T. (1976). Continuous-Time Control of Markov Processes on an Arbitrary State Space: Discounted Rewards. Annals of Statistics 4, 1219–1235.

  • Dynkin E.B. and Yushkevich A.A. (1979). Controlled Markov Processes. Springer.

  • Feinberg E.A. and Shwartz A. (2002). Handbook of Markov Decision Processes. Kluwer.

  • Feller W. (1940). On the Integro-Differential Equations of Purely Discontinuous Markoff Processes. Transactions of the American Mathematical Society 48, 488–515.

  • Fisher L. (1968). On the Recurrent Denumerable Decision Process. Annals of Mathematical Statistics 39, 424–434.

  • Gale D. (1967). On Optimal Development in a Multi-Sector Economy. Review of Economic Studies 34, 1–19.

  • Guo X.P. (2006). Continuous-Time Markov Decision Processes with Discounted Rewards: The Case of Polish Spaces. Mathematics of Operations Research (to appear).

  • Guo X.P. and Cao X.R. (2005). Optimal Control of Ergodic Continuous-Time Markov Chains with Average Sample-Path Rewards. SIAM Journal on Control and Optimization 44, 29–48.

  • Guo X.P. and Hernández-Lerma O. (2003a). Continuous-Time Controlled Markov Chains. Annals of Applied Probability 13, 363–388.

  • Guo X.P. and Hernández-Lerma O. (2003b). Continuous-Time Controlled Markov Chains with Discounted Rewards. Acta Applicandae Mathematicae 79, 195–216.

  • Guo X.P. and Hernández-Lerma O. (2003c). Drift and Monotonicity Conditions for Continuous-Time Controlled Markov Chains with an Average Criterion. IEEE Transactions on Automatic Control 48, 236–245.

  • Guo X.P. and Hernández-Lerma O. (2003d). Zero-Sum Games for Continuous-Time Markov Chains with Unbounded Transition and Average Payoff Rates. Journal of Applied Probability 40, 327–345.

  • Guo X.P. and Hernández-Lerma O. (2005a). Nonzero-Sum Games for Continuous-Time Markov Chains with Unbounded Discounted Payoffs. Journal of Applied Probability 42, 302–320.

  • Guo X.P. and Hernández-Lerma O. (2005b). Zero-Sum Continuous-Time Markov Games with Unbounded Transition and Discounted Payoff Rates. Bernoulli 16, 1009–1029.

  • Guo X.P. and Liu K. (2001). A Note on Optimality Conditions for Continuous-Time Markov Decision Processes with Average Cost Criterion. IEEE Transactions on Automatic Control 46, 1984–1989.

  • Guo X.P. and Rieder U. (2006). Average Optimality for Continuous-Time Markov Decision Processes in Polish Spaces. Annals of Applied Probability 16, 730–756.

  • Guo X.P. and Zhu W.P. (2002a). Denumerable State Continuous-Time Markov Decision Processes with Unbounded Cost and Transition Rates Under the Discounted Criterion. Journal of Applied Probability 39, 233–250.

  • Guo X.P. and Zhu W.P. (2002b). Denumerable State Continuous-Time Markov Decision Processes with Unbounded Cost and Transition Rates Under Average Criterion. ANZIAM Journal 43, 541–557.

  • Haviv M. and Puterman M.L. (1998). Bias Optimality in Controlled Queuing Systems. Journal of Applied Probability 35, 136–150.

  • Hernández-Lerma O. (1994). Lectures on Continuous-Time Markov Control Processes. Aportaciones Matemáticas, Vol. 3, Sociedad Matemática Mexicana, Mexico City.

  • Hernández-Lerma O. and Govindan T.E. (2001). Nonstationary Continuous-Time Markov Control Processes with Discounted Costs on Infinite Horizon. Acta Applicandae Mathematicae 67, 277–293.

  • Hernández-Lerma O. and Lasserre J.B. (1996). Discrete-Time Markov Control Processes: Basic Optimality Criteria. Springer.

  • Hernández-Lerma O. and Lasserre J.B. (1999). Further Topics on Discrete-Time Markov Control Processes. Springer.

  • Hernández-Lerma O. and Romera R. (2004). The Scalarization Approach to Multiobjective Markov Control Problems: Why Does it Work? Applied Mathematics and Optimization 50, 279–293.

  • Hilgert N. and Hernández-Lerma O. (2003). Bias Optimality Versus Strong 0-Discount Optimality in Markov Control Processes with Unbounded Costs. Acta Applicandae Mathematicae 76, 215–235.

  • Hordijk A. and Yushkevich A.A. (1999a). Blackwell Optimality in the Class of Stationary Policies in Markov Decision Chains with a Borel State and Unbounded Rewards. Mathematical Methods of Operations Research 49, 1–39.

  • Hordijk A. and Yushkevich A.A. (1999b). Blackwell Optimality in the Class of All Policies in Markov Decision Chains with a Borel State and Unbounded Rewards. Mathematical Methods of Operations Research 50, 421–448.

  • Hordijk A. and Yushkevich A.A. (2002). Blackwell Optimality. In: Feinberg E.A. and Shwartz A. (eds.), Handbook of Markov Decision Processes. Kluwer, 231–267.

  • Hou Z.T. and Guo X.P. (1998). Markov Decision Processes. Science and Technology Press of Hunan, Changsha, China. (In Chinese.)

  • Howard R.A. (1960). Dynamic Programming and Markov Processes. Wiley.

  • Hu Q. (1992). Discounted and Average Markov Decision Processes with Unbounded Rewards: New Conditions. Journal of Mathematical Analysis and Applications 171, 111–124.

  • Hu Q. (1996). Continuous Time Markov Decision Processes with Discounted Moment Criterion. Journal of Mathematical Analysis and Applications 203, 1–12.

  • Iosifescu M. and Tautu P. (1973). Stochastic Processes and Applications in Biology and Medicine, Vol. II: Models. Springer.

  • Jaskiewicz A. (2004). On the Equivalence of Two Expected Average Cost Criteria for Semi-Markov Control Processes. Mathematics of Operations Research 29, 326–338.

  • Kakumanu P. (1971). Continuously Discounted Markov Decision Models with Countable State and Action Spaces. Annals of Mathematical Statistics 42, 919–926.

  • Kakumanu P. (1972). Nondiscounted Continuous-Time Markov Decision Processes with Countable State and Action Spaces. SIAM Journal on Control 10, 210–220.

  • Kakumanu P. (1975). Continuous Time Markov Decision Processes with Average Return Criterion. Journal of Mathematical Analysis and Applications 52, 173–188.

  • Kakumanu P. (1977). Relation Between Continuous and Discrete Markovian Decision Problems. Naval Research Logistics Quarterly 24, 431–439.

  • Kato T. (1966). Perturbation Theory for Linear Operators. Springer.

  • Kermack W.O. and McKendrick A.G. (1927). Contributions to the Mathematical Theory of Epidemics. Proceedings of the Royal Society A 115, 700–721.

  • Kitayev M.Yu. (1985). Semi-Markov and Jump Markov Controlled Models: Average Cost Criterion. Theory of Probability and Its Applications 30, 272–288.

  • Kitayev M.Yu. and Rykov V.V. (1995). Controlled Queueing Systems. CRC Press.

  • Lasserre J.B. (1988). Conditions for the Existence of Average and Blackwell Optimal Stationary Policies in Denumerable Markov Decision Processes. Journal of Mathematical Analysis and Applications 136, 479–490.

  • Lefèvre C. (1979). Optimal Control of the Simple Stochastic Epidemic with Variable Recovery Rates. Mathematical Biosciences 44, 209–219.

  • Lefèvre C. (1981). Optimal Control of a Birth and Death Epidemic Process. Operations Research 29, 971–982.

  • Leizarowitz A. (1996). Overtaking and Almost-Sure Optimality for Infinite Horizon Markov Decision Processes. Mathematics of Operations Research 21, 158–181.

  • Lembersky M.R. (1974). On Maximal Rewards and ε-Optimal Policies in Continuous Time Markov Chains. Annals of Statistics 2, 159–169.

  • Lewis M.E., Ayhan H. and Foley R.D. (1999). Bias Optimality in a Queue with Admission Control. Probability in the Engineering and Informational Sciences 13, 309–327.

  • Lewis M.E., Ayhan H. and Foley R.D. (2002). Bias Optimal Admission Policies for a Nonstationary Multiclass Queueing System. Journal of Applied Probability 39, 20–37.

  • Lewis M.E. and Puterman M.L. (2001). A Note on Bias Optimality in Controlled Queueing Systems. Journal of Applied Probability 37, 300–305.

  • Lewis M.E. and Puterman M.L. (2002). A Probabilistic Analysis of Bias Optimality in Unichain Markov Decision Processes. IEEE Transactions on Automatic Control 46, 96–100.

  • Lippman S.A. (1975). Applying a New Device in the Optimization of Exponential Queueing Systems. Operations Research 23, 667–710.

  • Lund R.B., Meyn S.P. and Tweedie R.L. (1996). Computable Exponential Convergence Rates for Stochastically Ordered Markov Processes. Annals of Applied Probability 6, 218–237.

  • Mangel M. (1985). Decision and Control in Uncertain Resource Systems. Academic Press.

  • Massy W.F., Montgomery D.B. and Morrison D.G. (1970). Stochastic Models of Buying Behavior. MIT Press.

  • Meyn S.P. and Tweedie R.L. (1993). Stability of Markovian Processes III: Foster-Lyapunov Criteria for Continuous-Time Processes. Advances in Applied Probability 25, 518–548.

  • Miller B.L. (1968). Finite State Continuous Time Markov Decision Processes with an Infinite Planning Horizon. Journal of Mathematical Analysis and Applications 22, 552–569.

  • Miller B.L. and Veinott A.F. (1969). Discrete Dynamic Programming with a Small Interest Rate. Annals of Mathematical Statistics 40, 366–370.

  • Piunovskii A.B. (1998). A Controlled Jump Discounted Model with Constraints. Theory of Probability and Its Applications 42, 51–72.

  • Piunovskii A.B. (2004). Multicriteria Impulsive Control of Jump Markov Processes. Mathematical Methods of Operations Research 60, 125–144.

  • Prieto-Rumeau T. (2006). Blackwell Optimality in the Class of Markov Policies for Continuous-Time Controlled Markov Chains. Acta Applicandae Mathematicae 92, 77–96.

  • Prieto-Rumeau T. and Hernández-Lerma O. (2005a). The Laurent Series, Sensitive Discount and Blackwell Optimality for Continuous-Time Controlled Markov Chains. Mathematical Methods of Operations Research 61, 123–145.

  • Prieto-Rumeau T. and Hernández-Lerma O. (2005b). Bias and Overtaking Equilibria for Zero-Sum Continuous-Time Markov Games. Mathematical Methods of Operations Research 61, 437–454.

  • Prieto-Rumeau T. and Hernández-Lerma O. (2006a). Bias Optimality for Continuous-Time Controlled Markov Chains. SIAM Journal on Control and Optimization 45, 51–73.

  • Prieto-Rumeau T. and Hernández-Lerma O. (2006b). A Unified Approach to Continuous-Time Discounted Markov Control Processes. Morfismos 10 (to appear).

  • Prieto-Rumeau T. and Hernández-Lerma O. (2006c). Ergodic Control of Continuous-Time Markov Chains with Pathwise Constraints. Preprint.

  • Prieto-Rumeau T. and Hernández-Lerma O. (2006d). Variance Minimization and the Overtaking Optimality Approach to Continuous-Time Markov Control Chains. Preprint.

  • Puterman M.L. (1974). Sensitive Discount Optimality in Controlled One-Dimensional Diffusions. Annals of Probability 2, 408–419.

  • Puterman M.L. (1994). Markov Decision Processes. Wiley.

  • Qiu Q., Wu Q. and Pedram M. (2001). Stochastic Modeling of a Power-Managed System: Construction and Optimization. IEEE Transactions on Computer Aided Design 20, 1200–1217.

  • Ramsey F.P. (1928). A Mathematical Theory of Saving. Economic Journal 38, 543–559.

  • Ross S.M. (1970). Applied Probability Models with Optimization Applications. Holden-Day.

  • Rykov V.V. (1966). Markov Sequential Decision Processes with Finite State and Decision Space. Theory of Probability and Its Applications 11, 302–311.

  • Schäl M. (1992). On the Second Optimality Equation for Semi-Markov Decision Models. Mathematics of Operations Research 17, 470–486.

  • Sennott L.I. (1999). Stochastic Dynamic Programming and the Control of Queueing Systems. Wiley.

  • Serfozo R.F. (1979). An Equivalence Between Continuous and Discrete Time Markov Decision Processes. Operations Research 27, 616–620.

  • Sladký K. (1978). Sensitive Optimality Criteria for Continuous Time Markov Processes. Transactions of the Eighth Prague Conference on Information Theory, Statistical Decision Functions and Random Processes (Prague, 1978), Vol. B, 221–225.

  • Song J.S. (1987). Continuous-Time Markov Decision Programming with Non-Uniformly Bounded Transition Rates. Scientia Sinica 12, 1258–1267. (In Chinese.)

  • Tadj L. and Choudhury G. (2005). Optimal Design and Control of Queues. TOP 13, 359–412.

  • Taylor H.M. (1976). A Laurent Series for the Resolvent of a Strongly Continuous Stochastic Semi-Group. Mathematical Programming Study 6, 258–263.

  • Veinott A.F. (1966). On Finding Optimal Policies in Discrete Dynamic Programming with No Discounting. Annals of Mathematical Statistics 37, 1284–1294.

  • Veinott A.F. (1969). Discrete Dynamic Programming with Sensitive Discount Optimality Criteria. Annals of Mathematical Statistics 40, 1635–1660.

  • Vidale M.L. and Wolfe H.B. (1957). An Operations Research Study of Sales Response to Advertising. Operations Research 5, 370–381.

  • von Weizsäcker C.C. (1965). Existence of Optimal Programs of Accumulation for an Infinite Horizon. Review of Economic Studies 32, 85–104.

  • Wickwire K. (1977). Mathematical Models for the Control of Pests and Infectious Diseases: A Survey. Theoretical Population Biology 11, 182–238.

  • Wu C.B. (1997). Continuous Time Markov Decision Processes with Unbounded Reward and Non-Uniformly Bounded Transition Rate Under Discounted Criterion. Acta Mathematicae Applicandae Sinica 20, 196–208.

  • Ye L., Guo X.P. and Hernández-Lerma O. (2006). Existence and Regularity of Nonhomogeneous Q(t)-Processes under Measurability Conditions. Preprint.

  • Yosida K. (1980). Functional Analysis, 6th Edition. Springer.

  • Yushkevich A.A. (1973). On a Class of Strategies in General Markov Decision Models. Theory of Probability and Its Applications 18, 777–779.

  • Yushkevich A.A. (1977). Controlled Markov Models with Countable State and Continuous Time. Theory of Probability and Its Applications 22, 215–235.

  • Yushkevich A.A. (1994). Blackwell Optimal Policies in a Markov Decision Process with a Borel State Space. Mathematical Methods of Operations Research 40, 253–288.

  • Yushkevich A.A. (1997). Blackwell Optimality in Continuous in Action Markov Decision Processes. SIAM Journal on Control and Optimization 35, 2157–2182.

  • Yushkevich A.A. and Feinberg E.A. (1979). On Homogeneous Markov Model with Continuous Time and Finite or Countable State Space. Theory of Probability and Its Applications 24, 156–161.

References

  • Bather J. (1976). Optimal Stationary Policies for Denumerable Markov Chains in Continuous Time. Advances in Applied Probability 8, 148–155.

  • Cao X.-R. (2003a). Semi-Markov Decision Problems and Performance Sensitivity Analysis. IEEE Transactions on Automatic Control 48, 758–769.

  • Cao X.-R. (2003b). A Sensitivity View of Markov Decision Processes and Reinforcement Learning. In: Gong W. and Shi L. (eds.), Modeling, Control and Optimization of Complex Systems. Kluwer, 261–283.

  • Cao X.-R. (2003c). From Perturbation Analysis to Markov Decision Processes and Reinforcement Learning. Discrete Event Dynamic Systems 13, 9–39.

  • Cao X.-R. (2004). The Potential Structure of Sample Paths and Performance Sensitivities of Markov Systems. IEEE Transactions on Automatic Control 49, 2129–2142.

  • Cao X.-R. (2005). Basic Ideas for Event-Based Optimality of Markov Systems. Discrete Event Dynamic Systems: Theory and Applications 15, 169–197.

  • Cao X.-R. and Zhang J.Y. (2007). The nth-Order Bias Optimality for Multichain Markov Decision Processes. IEEE Transactions on Automatic Control (to appear).

  • Dijk N.V. (1993). Queueing Networks and Product Forms: A System Approach. Wiley.

  • Dynkin E.B. and Yushkevich A.A. (1979). Controlled Markov Processes. Springer.

  • Puterman M.L. (1994). Markov Decision Processes. Wiley.

  • Sennott L.I. (1999). Stochastic Dynamic Programming and the Control of Queueing Systems. Wiley.

References

  • Hu Q. (1990). CTMDP and Its Relationship with DTMDP. Chinese Science Bulletin 35, 710–714.

  • Hu Q. (1992). Discounted and Average Markov Decision Processes with Unbounded Rewards: New Conditions. Journal of Mathematical Analysis and Applications 171, 111–124.

  • Hu Q. (1996). Continuous Time Markov Decision Processes with Discounted Moment Criterion. Journal of Mathematical Analysis and Applications 203, 1–12.

  • Hu Q., Liu J. and Yue W. (2003). Continuous Time Markov Decision Processes: Discounted Total Reward. International Journal of Pure and Applied Mathematics 7, 147–175.

  • Hu Q. and Wang J. (1998). Continuous Time Markov Decision Processes with Nonuniformly Bounded Rate: Expected Total Rewards. Optimization 43, 219–233.

  • Serfozo R.F. (1979). An Equivalence Between Continuous and Discrete Time Markov Decision Processes. Operations Research 27, 616–620.

References

  • Haviv M. and Puterman M.L. (1998). Bias Optimality in Controlled Queuing Systems. Journal of Applied Probability 35, 136–150.

  • Lewis M.E., Ayhan H. and Foley R.D. (1999). Bias Optimality in a Queue with Admission Control. Probability in the Engineering and Informational Sciences 13, 309–327.

  • Lewis M.E., Ayhan H. and Foley R.D. (2002). Bias Optimal Admission Policies for a Nonstationary Multiclass Queueing System. Journal of Applied Probability 39, 20–37.

References

  • Borkar V.S. (2004). Controlled Diffusion Processes. Probability Surveys 2, 213–244.

  • Cao X.R. and Guo X.P. (2004). Partially Observable Markov Decision Processes with Reward Information. Proceedings of the 43rd IEEE Conference on Decision and Control, 4393–4398.

  • Hernández-Lerma O. (1989). Adaptive Markov Control Processes. Springer.

  • Hernández-Lerma O. and Lasserre J.B. (1996). Discrete-Time Markov Control Processes: Basic Optimality Criteria. Springer.

  • Hernández-Lerma O. and Lasserre J.B. (1999). Further Topics on Discrete-Time Markov Control Processes. Springer.

  • Howard R.A. (1960). Dynamic Programming and Markov Processes. MIT Press.

  • Kaelbling L.P., Littman M.L. and Moore A.W. (1996). Reinforcement Learning: A Survey. Journal of Artificial Intelligence Research 4, 237–285.

  • Neyman A. and Sorin S. (2001). Stochastic Games and Applications. NATO Science Series, 570.

  • Puterman M.L. (1994). Markov Decision Processes. Wiley.
Additional information

Research partially supported by grants NSFC, DRFP and NCET.

Research partially supported by CONACyT (Mexico) Grant 45693-F.

About this article

Cite this article

Guo, X., Hernández-Lerma, O., Prieto-Rumeau, T. et al. A survey of recent results on continuous-time Markov decision processes. TOP 14, 177–261 (2006). https://doi.org/10.1007/BF02837562
