

Replicability in science and the transformative role of the knockoff statistical methodology

  • Vásquez, Alejandro Román [1]; Escarela Pérez, Gabriel [1]; Núñez-Antonio, Gabriel [1]; Márquez Urbina, José Ulises [2]
    1. [1] Universidad Autónoma Metropolitana, México
    2. [2] Mathematics Research Center, México

  • Location: SahuarUS: Revista Electrónica de Matemáticas, ISSN-e 2448-5365, Vol. 8, No. 1, 2024 (Issue dedicated to: Eleventh Issue), pp. 1-22
  • Language: Spanish
  • DOI: 10.36788/sah.v8i1.148
  • Abstract
    • An important aspect of science is the replicability of scientific findings. This article examines some of the fundamental causes that contribute to the lack of replicability, focusing the analysis on a crucial component: statistics and selective inference. Starting from the challenges inherent in multiple hypothesis testing in high-dimensional settings, one strategy for addressing the replicability problem is the implementation of model-X knockoffs. This methodology stands out for generating synthetic variables that imitate the original ones, making it possible to distinguish effectively between genuine and spurious associations while simultaneously controlling the false discovery rate in finite-sample settings. The technical aspects of model-X knockoffs are described in this work, highlighting their scope and limitations. The effectiveness of the methodology is illustrated with success stories such as the estimation of tumor purity, genome-wide association analysis, the identification of prognostic factors in clinical trials, the determination of risk factors associated with long COVID, and variable selection in crime-rate studies. These concrete examples demonstrate the practical utility and versatility of model-X knockoffs across diverse areas of research. Without a doubt, this approach makes an original contribution to the current challenges surrounding replicability, marking a significant milestone in improving the reliability and robustness of scientific evidence.
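
    The abstract describes the knockoff procedure only at a high level. As a purely illustrative complement, the following is a minimal sketch, in Python with numpy and scikit-learn, of the two steps it alludes to: generating synthetic copies (knockoffs) of the covariates and selecting variables with the knockoff+ threshold at a target false discovery rate q. The sketch assumes Gaussian covariates with a known covariance matrix and uses the equicorrelated construction together with a lasso coefficient-difference statistic; the helper names gaussian_knockoffs and knockoff_select are hypothetical and do not come from the article.

      import numpy as np
      from sklearn.linear_model import LassoCV

      def gaussian_knockoffs(X, Sigma, rng):
          # Equicorrelated Gaussian knockoffs (assumes Sigma is a correlation matrix):
          # X_knock | X ~ N(X - X Sigma^{-1} D, 2 D - D Sigma^{-1} D),
          # with D = diag(s) and s = min(2 * lambda_min(Sigma), 1).
          p = Sigma.shape[0]
          s = min(2.0 * np.linalg.eigvalsh(Sigma).min(), 1.0) * 0.999
          D = s * np.eye(p)
          Sigma_inv = np.linalg.inv(Sigma)
          mu = X - X @ Sigma_inv @ D
          V = 2.0 * D - D @ Sigma_inv @ D
          L = np.linalg.cholesky(V + 1e-10 * np.eye(p))
          return mu + rng.standard_normal(X.shape) @ L.T

      def knockoff_select(X, X_knock, y, q=0.10):
          # Feature statistics W_j = |beta_j| - |beta_knock_j| from a lasso fit on
          # the augmented matrix [X, X_knock]; variables whose W_j exceeds the
          # knockoff+ threshold are selected, targeting a false discovery rate of q.
          p = X.shape[1]
          beta = LassoCV(cv=5).fit(np.hstack([X, X_knock]), y).coef_
          W = np.abs(beta[:p]) - np.abs(beta[p:])
          for t in np.sort(np.abs(W[W != 0])):
              if (1 + np.sum(W <= -t)) / max(1, np.sum(W >= t)) <= q:
                  return np.where(W >= t)[0]
          return np.array([], dtype=int)

      # Toy example: 10 truly relevant covariates out of 50.
      rng = np.random.default_rng(0)
      n, p = 500, 50
      Sigma = 0.3 ** np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
      X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
      y = X[:, :10] @ np.ones(10) + rng.standard_normal(n)
      print(sorted(knockoff_select(X, gaussian_knockoffs(X, Sigma, rng), y, q=0.10)))

    When the assumed covariate distribution is correct, this selection rule controls the false discovery rate at level q in finite samples regardless of how y depends on X, which is what allows the methodology to separate genuine from spurious associations.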

