
Documat


A verification theorem for indexability of real-state restless bandit models

    1. José Niño-Mora, Universidad Carlos III de Madrid, Madrid, Spain

  • Published in: BEIO, Boletín de Estadística e Investigación Operativa, ISSN 1889-3805, Vol. 39, No. 3, 2023
  • Language: Spanish
  • Abstract
    • This article aims to present, in a concise form accessible to non-specialist readers, the main contributions of the author’s paper “A Verification Theorem for Threshold-Indexability of Real-State Discounted Restless Bandits”, Mathematics of Operations Research, vol. 45, no. 2, 465–496, 2020, which was awarded the 2020 SEIO – Fundación BBVA Prize in the category of “Best methodological contribution in Operations Research”.

  • Bibliographic references
    • Bellman, R. 1956. “A Problem in the Sequential Design of Experiments.” Sankhyā: The Indian Journal of Statistics 16: 221–29.
    • Bertsimas, D., and J. Niño-Mora. 1996. “Conservation Laws, Extended Polymatroids and Multiarmed Bandit Problems; a Polyhedral Approach to...
    • Dance, C. R., and T. Silander. 2015. “When Are Kalman-Filter Restless Bandits Indexable?” In 29th Conference on Neural Information Processing...
    • Dance, C. R., and T. Silander. 2019. “Optimal Policies for Observing Time Series and Related Restless Bandit Problems.” J. Mach. Learn. Res....
    • Gittins, J. C. 1979. “Bandit Processes and Dynamic Allocation Indices (with Discussion).” J. Roy. Statist. Soc. Ser. B 41: 148–77.
    • Gittins, J. C. 1989. Multi-Armed Bandit Allocation Indices. Chichester, UK: Wiley.
    • Gittins, J. C., and D. M. Jones. 1974. “A Dynamic Allocation Index for the Sequential Design of Experiments.” In Progress in Statistics (Proceedings...
    • Heilmann, W.-R. 1977. “Linear Programming of Dynamic Programs with Unbounded Rewards.” Meth. Oper. Res. 24: 94–105.
    • Hernández-Lerma, O., and J. B. Lasserre. 1999. Further Topics on Discrete-Time Markov Control Processes. New York, NY, USA: Springer.
    • Kleinrock, L. 1965. “A Conservation Law for a Wide Class of Queueing Disciplines.” Naval Res. Logist. Quart. 12: 181–92.
    • Krishnamurthy, V. 2016. Partially Observed Markov Decision Processes: From Filtering to Controlled Sensing. Cambridge, UK: Cambridge University...
    • La Scala, B. F., and W. Moran. 2006. “Optimal Target Tracking with Restless Bandits.” Digit. Signal Process. 16: 479–87.
    • Lippman, S. A. 1975. “On Dynamic Programming with Unbounded Rewards.” Management Sci. 21: 1225–33.
    • Liu, K., and Q. Zhao. 2010. “Indexability of Restless Bandit Problems and Optimality of Whittle Index for Dynamic Multichannel Access.” IEEE...
    • Mahajan, A., and D. Teneketzis. 2008. “Multi-Armed Bandit Problems.” In Foundations and Applications of Sensor Management, edited by A. O....
    • Moran, W., S. Suvorova, and S. Howard. 2008. “Application of Sensor Scheduling Concepts to Radar.” In Foundations and Applications of Sensor...
    • Niño-Mora, J. 2001. “Restless Bandits, Partial Conservation Laws and Indexability.” Adv. Appl. Probab. 33: 76–98.
    • Niño-Mora, J. 2002. “Dynamic Allocation Indices for Restless Projects and Queueing Admission Control: A Polyhedral Approach.” Math. Program....
    • Niño-Mora, J. 2006. “Restless Bandit Marginal Productivity Indices, Diminishing Returns and Optimal Control of Make-to-Order/Make-to-Stock...
    • Niño-Mora, J. 2007. “Dynamic Priority Allocation via Restless Bandit Marginal Productivity Indices (with Discussion).” TOP 15: 161–98.
    • Niño-Mora, J. 2011. “Conservation Laws and Related Applications.” In Wiley Encyclopedia of Operations Research and Management Science, edited by J. J. Cochran, L. A. Cox, P. Keskinocak, J. P. Kharoufeh, and J. C. Smith. New York, NY, USA: Wiley. https://doi.org/10.1002/9780470400531.eorms0186.
    • Niño-Mora, J. 2015. “A Verification Theorem for Indexability of Discrete Time Real State Discounted Restless...
    • Niño-Mora, J. 2020. “A Verification Theorem for Threshold-Indexability of Real-State Discounted Restless Bandits.” Math. Oper. Res. 45: 465–96.
    • Niño-Mora, J. 2023. “Markovian Restless Bandits and Index Policies: A Review.” Mathematics 11: 1639.
    • Papadimitriou, C. H., and J. N. Tsitsiklis. 1999. “The Complexity of Optimal Queuing Network Control.” Math. Oper. Res. 24: 293–305.
    • Thompson, W. R. 1933. “On the Likelihood That One Unknown Probability Exceeds Another in View of the Evidence of Two Samples.” Biometrika...
    • Tsitsiklis, J. N. 1994. “A Short Proof of the Gittins Index Theorem.” Ann. Appl. Probab. 4: 194–99.
    • Van Nunen, J. A. E. E., and J. Wessels. 1978. “A Note on Dynamic Programming with Unbounded Rewards.” Management Sci. 24: 576–80.
    • Varaiya, P. P., J. C. Walrand, and C. Buyukkoc. 1985. “Extensions of the Multiarmed Bandit Problem: The Discounted Case.” IEEE Trans. Automat....
    • Washburn, R. B. 2008. “Application of Multi-Armed Bandits to Sensor Management.” In Foundations and Applications of Sensor Management, 153–75....
    • Weber, R. R. 1992. “On the Gittins Index for Multiarmed Bandits.” Ann. Appl. Probab. 2: 1024–33.
    • Weber, R. R., and G. Weiss. 1990. “On an Index Policy for Restless Bandits.” J. Appl. Probab. 27: 637–48.
    • Wessels, J. 1977. “Markov Programming by Successive Approximations with Respect to Weighted Supremum Norms.” J. Math. Anal. Appl. 58: 326–35.
    • Whittle, P. 1980. “Multi-Armed Bandits and the Gittins Index.” J. Roy. Statist. Soc. Ser. B 42: 143–49.
    • Whittle, P. 1988. “Restless Bandits: Activity Allocation in a Changing World.” In A Celebration of Applied Probability, edited by J. Gani,...
    • Zhao, Q. 2020. Multi-Armed Bandits: Theory and Applications to Online Learning in Networks. Morgan & Claypool.
