Abstract
In genetics, a genome-wide association study (GWAS) analyses single-nucleotide polymorphisms (SNPs) across the genome. The analysis is performed on a large set of individuals, usually classified as cases and controls. The study of differences between the SNP chains of the two groups is known as pathway analysis. It allows the researcher to go beyond univariate results such as per-SNP p-values and their representation in Manhattan plots. Pathway analysis makes it possible to detect weaker single-variant signals that univariate tests can miss, and it also helps in understanding the molecular mechanisms linked to certain diseases and phenotypes. This research proposes a new algorithm based on evolutionary computation that is capable of finding significant pathways in GWAS data. Its performance has been tested on synthetic data sets created with a genomic data simulator developed ad hoc for this purpose.
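The abstract does not specify the algorithm's encoding, operators, or fitness function, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a toy simulator (`simulate_genotypes`), a Welch-style t statistic on the summed minor-allele count as fitness, fixed-size SNP subsets as candidate "pathways", and simple elitist selection with set-based crossover and single-SNP mutation. All function names, parameters, and default values are hypothetical.

```python
import random
import statistics


def simulate_genotypes(n_cases, n_controls, n_snps, causal,
                       shift=0.15, maf=0.3, seed=0):
    """Toy simulator (assumption): genotypes are minor-allele counts 0, 1 or 2.

    SNPs listed in `causal` get a higher allele frequency in cases.
    """
    rng = random.Random(seed)

    def person(is_case):
        row = []
        for j in range(n_snps):
            p = maf + shift if (is_case and j in causal) else maf
            # Two independent allele draws per SNP.
            row.append(int(rng.random() < p) + int(rng.random() < p))
        return row

    cases = [person(True) for _ in range(n_cases)]
    controls = [person(False) for _ in range(n_controls)]
    return cases, controls


def fitness(subset, cases, controls):
    """Assumed fitness: |Welch t| of the summed allele count over the subset."""
    a = [sum(row[j] for j in subset) for row in cases]
    b = [sum(row[j] for j in subset) for row in controls]
    se = (statistics.variance(a) / len(a)
          + statistics.variance(b) / len(b)) ** 0.5
    return abs(statistics.mean(a) - statistics.mean(b)) / se if se > 0 else 0.0


def evolve(cases, controls, n_snps, k=3, pop_size=30,
           generations=40, mut_rate=0.3, seed=1):
    """Elitist evolutionary search over SNP subsets of fixed size k."""
    rng = random.Random(seed)

    def score(subset):
        return fitness(subset, cases, controls)

    pop = [tuple(sorted(rng.sample(range(n_snps), k)))
           for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        history.append(score(pop[0]))
        elite = pop[: pop_size // 2]  # the best half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            p1, p2 = rng.sample(elite, 2)
            # Crossover: draw k SNPs from the union of both parents.
            pool = sorted(set(p1) | set(p2))
            child = set(rng.sample(pool, k))
            # Mutation: swap one SNP for one not currently in the child.
            if rng.random() < mut_rate:
                child.discard(rng.choice(sorted(child)))
                child.add(rng.choice(
                    [j for j in range(n_snps) if j not in child]))
            children.append(tuple(sorted(child)))
        pop = elite + children
    best = max(pop, key=score)
    return best, score(best), history


if __name__ == "__main__":
    cases, controls = simulate_genotypes(200, 200, 40, causal={3, 17, 29})
    best, best_score, _ = evolve(cases, controls, n_snps=40)
    print("best subset:", best, "fitness:", round(best_score, 2))
```

Because the best half of each population is carried over unchanged, the best fitness never decreases between generations. Whether the search recovers the simulated causal SNPs depends on the effect size, the sample size, and the random seed.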
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Díez Díaz, F., Sánchez Lasheras, F., de Cos Juez, F.J., Martín Sánchez, V. (2019). Evolutionary Algorithm for Pathways Detection in GWAS Studies. In: Pérez García, H., Sánchez González, L., Castejón Limas, M., Quintián Pardo, H., Corchado Rodríguez, E. (eds) Hybrid Artificial Intelligent Systems. HAIS 2019. Lecture Notes in Computer Science, vol 11734. Springer, Cham. https://doi.org/10.1007/978-3-030-29859-3_10
DOI: https://doi.org/10.1007/978-3-030-29859-3_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-29858-6
Online ISBN: 978-3-030-29859-3
eBook Packages: Computer Science, Computer Science (R0)