Abstract
Autism Spectrum Disorder (ASD) is an etiologically and clinically heterogeneous neurodevelopmental disorder with more than 800 putative risk genes. This heterogeneity, coupled with the low penetrance of most ASD-associated mutations, makes it challenging to identify the relevant genetic determinants of ASD. We developed a semi-supervised machine learning method for gene scoring and classification, based on network propagation using a variant of the random walk with restart algorithm, to identify and rank genes according to their association with known ASD-related genes. The method combines information from protein-protein interactions with sets of positive (disease-related) and negative (disease-unrelated) genes. Our results indicate that the proposed method can classify held-out known disease genes in a cross-validation setting with good performance (area under the receiver operating characteristic curve \(\sim \)0.85, area under the precision-recall curve \(\sim \)0.8 and Matthews correlation coefficient 0.57). A set of top-ranking novel candidate genes identified by the method was significantly enriched for pathways related to synaptic transmission and ion transport, and for specific neurotransmitter-associated pathways previously shown to be associated with ASD. Most of the novel candidate genes were found to be targeted by de novo single nucleotide variants in ASD patients.
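The abstract does not state the exact update rule, so the following is only a minimal sketch of network propagation with signed seeds. It assumes the common label-spreading form of random walk with restart, F ← αSF + (1 − α)Y, where S is the symmetrically normalised adjacency matrix and Y holds +1 for positive genes, −1 for negative genes and 0 for unlabelled genes. The function name `propagate`, the value α = 0.8, and the toy gene identifiers are hypothetical and are not taken from the paper.

```python
import math


def propagate(edges, labels, alpha=0.8, tol=1e-9, max_iter=1000):
    """Propagate signed seed labels over an undirected, unweighted graph.

    edges  : iterable of (node_a, node_b) pairs
    labels : dict mapping seed nodes to +1.0 (positive) or -1.0 (negative)
    alpha  : weight on the neighbours' scores; 1 - alpha is the restart weight
    """
    # Build adjacency sets, ignoring self-loops.
    neighbours = {}
    for a, b in edges:
        if a == b:
            continue
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    degree = {n: len(nb) for n, nb in neighbours.items()}
    # Initial vector Y: seed labels, zero for unlabelled nodes.
    y = {n: float(labels.get(n, 0.0)) for n in neighbours}
    f = dict(y)

    for _ in range(max_iter):
        new_f = {}
        for n in neighbours:
            # Symmetric normalisation: each edge is weighted by 1/sqrt(d_n * d_m).
            spread = sum(
                f[m] / math.sqrt(degree[n] * degree[m]) for m in neighbours[n]
            )
            new_f[n] = alpha * spread + (1.0 - alpha) * y[n]
        delta = max(abs(new_f[n] - f[n]) for n in neighbours)
        f = new_f
        if delta < tol:
            break
    return f


# Hypothetical toy network: P1 and P2 are positive seeds, N1 is a negative
# seed, C1 and C2 are unlabelled candidates.
toy_edges = [("P1", "P2"), ("P1", "C1"), ("P2", "C1"), ("C1", "C2"), ("C2", "N1")]
toy_labels = {"P1": 1.0, "P2": 1.0, "N1": -1.0}

scores = propagate(toy_edges, toy_labels)
for gene, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{gene}\t{score:+.3f}")
```

In this toy network the candidate attached to both positive seeds (C1) ends up with a higher score than the candidate attached to the negative seed (C2), which is the behaviour a ranking of this kind relies on. A real analysis would use a weighted protein-protein interaction network and curated seed lists rather than this hand-built graph.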
Notes
- 1. https://gene.sfari.org/database/human-gene/, accessed 1 May 2018.
- 2. http://amp.pharm.mssm.edu/Enrichr, accessed 1 June 2018.
- 3. http://denovo-db.gs.washington.edu/denovo-db/, accessed 1 February 2019.
Acknowledgments
The authors acknowledge support from the UID/MULTI/04046/2019 centre grant from FCT, Portugal (to BioISI). A.M. is the recipient of a fellowship from the BioSys PhD programme (Ref SFRH/BD52485/2014) from FCT (Portugal). This work used the EGI infrastructure with the support of NCG-INGRID-PT (Portugal) and BIFI (Spain).
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Martiniano, H.F.M.C., Asif, M., Vicente, A.M., Correia, L. (2020). Network Propagation-Based Semi-supervised Identification of Genes Associated with Autism Spectrum Disorder. In: Raposo, M., Ribeiro, P., Sério, S., Staiano, A., Ciaramella, A. (eds) Computational Intelligence Methods for Bioinformatics and Biostatistics. CIBB 2018. Lecture Notes in Computer Science, vol 11925. Springer, Cham. https://doi.org/10.1007/978-3-030-34585-3_21
DOI: https://doi.org/10.1007/978-3-030-34585-3_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-34584-6
Online ISBN: 978-3-030-34585-3
eBook Packages: Computer Science, Computer Science (R0)