DOI: 10.1145/2649387.2649442
short-paper

Deep autoencoder neural networks for gene ontology annotation predictions

Published: 20 September 2014

Abstract

The annotation of genomic information is a major challenge in biology and bioinformatics. Existing databases of known gene functions are incomplete and prone to errors, and the biomolecular experiments needed to improve these databases are slow and costly. While computational methods are not a substitute for experimental verification, they can help in two ways: algorithms can aid in the curation of gene annotations by automatically suggesting inaccuracies, and they can predict previously unidentified gene functions, accelerating the rate of gene function discovery. In this work, we develop an algorithm that achieves both goals using deep autoencoder neural networks. With experiments on gene annotation data from the Gene Ontology project, we show that deep autoencoder networks achieve better performance than other standard machine learning methods, including the popular truncated singular value decomposition.
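As an illustration of the truncated singular value decomposition baseline named above, the following is a minimal sketch, not the authors' implementation. It assumes annotations are stored as a binary gene-by-term matrix; the toy matrix, the rank `k`, the threshold `tau`, and the function names are hypothetical choices made for this example only. The low-rank reconstruction is read as a score per gene-term pair: high scores on unannotated pairs suggest candidate annotations, and low scores on annotated pairs flag entries worth re-checking.

```python
import numpy as np

def truncated_svd_scores(A, k):
    """Return the rank-k reconstruction of annotation matrix A (genes x terms)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Keep only the k largest singular values and their vectors.
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

def suggest(A, scores, tau):
    """Compare thresholded scores with A and return the two kinds of disagreement."""
    predicted = scores >= tau
    # Pairs scored high but not annotated: possible missing annotations.
    candidates = np.argwhere(predicted & (A == 0))
    # Pairs annotated but scored low: possible inaccuracies to review.
    suspects = np.argwhere(~predicted & (A == 1))
    return candidates, suspects

# Toy annotation matrix (hypothetical data): 5 genes x 4 terms.
A = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
], dtype=float)

scores = truncated_svd_scores(A, k=2)
candidates, suspects = suggest(A, scores, tau=0.5)
print("candidate (gene, term) pairs:", candidates.tolist())
print("suspect (gene, term) pairs:", suspects.tolist())
```

A deep autoencoder can be used in the same way, with the network's reconstruction of each gene's annotation vector taking the place of the rank-k reconstruction.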

Information & Contributors

Information

Published In

BCB '14: Proceedings of the 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics
September 2014
851 pages
ISBN: 9781450328944
DOI: 10.1145/2649387
General Chairs: Pierre Baldi, Wei Wang
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. autoencoders
  2. biomolecular annotations
  3. gene ontology
  4. matrix-completion
  5. neural networks
  6. principal component analysis
  7. truncated singular value decomposition

Qualifiers

  • Short-paper

Conference

BCB '14: ACM-BCB '14
September 20-23, 2014
Newport Beach, California

Acceptance Rates

Overall Acceptance Rate 254 of 885 submissions, 29%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (last 12 months): 89
  • Downloads (last 6 weeks): 7
Reflects downloads up to 31 Dec 2024

Citations

Cited By

  • (2024) Structure-primed embedding on the transcription factor manifold enables transparent model architectures for gene regulatory network and latent activity inference. Genome Biology 25:1. DOI: 10.1186/s13059-023-03134-1. Online publication date: 18-Jan-2024
  • (2024) DAE-CFR: detecting microRNA-disease associations using deep autoencoder and combined feature representation. BMC Bioinformatics 25:1. DOI: 10.1186/s12859-024-05757-y. Online publication date: 29-Mar-2024
  • (2024) Optimal approximations for the free boundary problems of the space-time fractional Black-Scholes equations using a combined physics-informed neural network. Scientific Reports 14:1. DOI: 10.1038/s41598-024-77073-7. Online publication date: 25-Oct-2024
  • (2023) Dynamic Depth Learning in Stacked AutoEncoders. Applied Sciences 13:19 (10994). DOI: 10.3390/app131910994. Online publication date: 5-Oct-2023
  • (2023) DeepGenePrior: A deep learning model for prioritizing genes affected by copy number variants. PLOS Computational Biology 19:7 (e1011249). DOI: 10.1371/journal.pcbi.1011249. Online publication date: 24-Jul-2023
  • (2023) An Integrated Feature Extraction Based on Principal Components and Deep Auto Encoder with Extra Tree for Intrusion Detection Systems. Journal of Information & Knowledge Management 23:01. DOI: 10.1142/S0219649223500661. Online publication date: 17-Nov-2023
  • (2023) Arrhythmia Detection from Electrocardiogram Signal Data Based on Wavelet Transform and Deep Reinforcement Learning. 2023 IEEE International Conference on Medical Artificial Intelligence (MedAI), pp. 325-333. DOI: 10.1109/MedAI59581.2023.00051. Online publication date: 18-Nov-2023
  • (2023) Transformer-Based Gene Scoring Model for Extracting Representative Characteristic of Central Dogma Process to Prioritize Pathogenic Genes Applying Breast Cancer Multi-omics Data. 2023 IEEE International Conference on Big Data and Smart Computing (BigComp), pp. 149-154. DOI: 10.1109/BigComp57234.2023.00033. Online publication date: Feb-2023
  • (2023) Structural damage assessment through a new generalized autoencoder with features in the quefrency domain. Mechanical Systems and Signal Processing 184 (109713). DOI: 10.1016/j.ymssp.2022.109713. Online publication date: Feb-2023
  • (2023) Physics-based and data-driven modeling for biomanufacturing 4.0. Manufacturing Letters. DOI: 10.1016/j.mfglet.2023.04.003. Online publication date: May-2023
