Documat


A Verification Theorem for Indexability of Real-State Restless Bandit Models

    1. [1] Universidad Carlos III de Madrid

      Madrid, Spain

  • Location: BEIO, Boletín de Estadística e Investigación Operativa, ISSN 1889-3805, Vol. 39, No. 3, 2023, pp. 25-37
  • Language: Spanish
  • Abstract
    • This article aims to present, concisely and accessibly for non-specialist readers, the main contributions of the author's paper “A Verification Theorem for Threshold-Indexability of Real-State Discounted Restless Bandits”, Mathematics of Operations Research, vol. 45, no. 2, 465–496, 2020, which received the SEIO – Fundación BBVA 2020 Award in the category “Best methodological contribution in Operations Research”.

  • References
    • Bellman, R. 1956. “A Problem in the Sequential Design of Experiments.” Sankhyā: The Indian Journal of Statistics 16: 221–29.
    • Bertsimas, D., and J. Niño-Mora. 1996. “Conservation Laws, Extended Polymatroids and Multiarmed Bandit Problems; a Polyhedral Approach to...
    • Dance, C. R., and T. Silander. 2015. “When Are Kalman-Filter Restless Bandits Indexable?” In 29th Conference on Neural Information Processing...
    • Dance, C. R., and T. Silander. 2019. “Optimal Policies for Observing Time Series and Related Restless Bandit Problems.” J. Mach. Learn. Res....
    • Gittins, J. C. 1979. “Bandit Processes and Dynamic Allocation Indices (with Discussion).” J. Roy. Statist. Soc. Ser. B 41: 148–77.
    • Gittins, J. C. 1989. Multi-Armed Bandit Allocation Indices. Chichester, UK: Wiley.
    • Gittins, J. C., and D. M. Jones. 1974. “A Dynamic Allocation Index for the Sequential Design of Experiments.” In Progress in Statistics (Proceedings...
    • Heilmann, W. -R. 1977. “Linear Programming of Dynamic Programs with Unbounded Rewards.” Meth. Oper. Res. 24: 94–105.
    • Hernández-Lerma, O., and J. B. Lasserre. 1999. Further Topics on Discrete-Time Markov Control Processes. New York, NY, USA: Springer.
    • Kleinrock, L. 1965. “A Conservation Law for a Wide Class of Queueing Disciplines.” Naval Res. Logist. Quart. 12: 181–92.
    • Krishnamurthy, V. 2016. Partially Observed Markov Decision Processes: From Filtering to Controlled Sensing. Cambridge, UK: Cambridge University...
    • La Scala, B. F., and W. Moran. 2006. “Optimal Target Tracking with Restless Bandits.” Digit. Signal Process. 16: 479–87.
    • Lippman, S. A. 1975. “On Dynamic Programming with Unbounded Rewards.” Management Sci. 21: 1225–33.
    • Liu, K., and Q. Zhao. 2010. “Indexability of Restless Bandit Problems and Optimality of Whittle Index for Dynamic Multichannel Access.” IEEE...
    • Mahajan, A., and D. Teneketzis. 2008. “Multi-Armed Bandit Problems.” In Foundations and Applications of Sensor Management, edited by A. O....
    • Moran, W., S. Suvorova, and S. Howard. 2008. “Application of Sensor Scheduling Concepts to Radar.” In Foundations and Applications of Sensor...
    • Niño-Mora, J. 2001. “Restless Bandits, Partial Conservation Laws and Indexability.” Adv. Appl. Probab. 33: 76–98.
    • Niño-Mora, J. 2002. “Dynamic Allocation Indices for Restless Projects and Queueing Admission Control: A Polyhedral Approach.” Math. Program....
    • Niño-Mora, J. 2006. “Restless Bandit Marginal Productivity Indices, Diminishing Returns and Optimal Control of Make-to-Order/Make-to-Stock...
    • Niño-Mora, J. 2007. “Dynamic Priority Allocation via Restless Bandit Marginal Productivity Indices (with Discussion).” TOP 15: 161–98.
    • Niño-Mora, J. 2011. “Conservation Laws and Related Applications.” In Wiley Encyclopedia of Operations Research and Management Science, edited... Keskinocak, J. P. Kharoufeh, and J. C. Smith. New York, NY, USA: Wiley. https://doi.org/10.1002/9780470400531.eorms0186.
    • Niño-Mora, J. 2015. “A Verification Theorem for Indexability of Discrete Time Real State Discounted Restless...
    • Niño-Mora, J. 2020. “A Verification Theorem for Threshold-Indexability of Real-State Discounted Restless Bandits.”...
    • Niño-Mora, J. 2023. “Markovian Restless Bandits and Index Policies: A Review.” Mathematics 11 (1639).
    • Papadimitriou, C. H., and J. N. Tsitsiklis. 1999. “The Complexity of Optimal Queuing Network Control.” Math. Oper. Res. 24: 293–305.
    • Thompson, W. R. 1933. “On the Likelihood That One Unknown Probability Exceeds Another in View of the Evidence of Two Samples.” Biometrika...
    • Tsitsiklis, J. N. 1994. “A Short Proof of the Gittins Index Theorem.” Ann. Appl. Probab. 4: 194–99.
    • Van Nunen, J. A. E. E., and J. Wessels. 1978. “A Note on Dynamic Programming with Unbounded Rewards.” Management Sci. 24: 576–80.
    • Varaiya, P. P., J. C. Walrand, and C. Buyukkoc. 1985. “Extensions of the Multiarmed Bandit Problem: The Discounted Case.” IEEE Trans. Automat....
    • Washburn, R. B. 2008. “Application of Multi-Armed Bandits to Sensor Management.” In Foundations and Applications of Sensor Management, 153–75....
    • Weber, R. R. 1992. “On the Gittins Index for Multiarmed Bandits.” Ann. Appl. Probab. 2: 1024–33.
    • Weber, R. R., and G. Weiss. 1990. “On an Index Policy for Restless Bandits.” J. Appl. Probab. 27: 637–48.
    • Wessels, J. 1977. “Markov Programming by Successive Approximations with Respect to Weighted Supremum Norms.” J. Math. Anal. Appl. 58: 326–35.
    • Whittle, P. 1980. “Multi-Armed Bandits and the Gittins Index.” J. Roy. Statist. Soc. Ser. B 42: 143–49.
    • Whittle, P. 1988. “Restless Bandits: Activity Allocation in a Changing World.” In A Celebration of Applied Probability, edited by J. Gani,...
    • Zhao, Q. 2020. Multi-Armed Bandits: Theory and Applications to Online Learning in Networks. Morgan & Claypool.
