Structured Output Prediction of Novel Enzyme Function with Reaction Kernels

  • Conference paper
Biomedical Engineering Systems and Technologies (BIOSTEC 2010)

Abstract

Enzyme function prediction is an important problem in post-genomic bioinformatics, needed for reconstructing the metabolic networks of organisms. Currently there are two general methods for solving the problem: annotation transfer from a similar annotated protein, and machine learning approaches that treat the problem as classification against a fixed taxonomy, such as Gene Ontology or the EC hierarchy. These methods are suitable when the function of the new protein has already been characterized and is included in the taxonomy. However, for a new function that has not been described before, these approaches offer little assistance to the human expert. The goal of this paper is to bring forward structured output learning approaches for the case where the exact function of the enzyme to be annotated may not be contained in the training set. Our approach hinges on a fine-grained representation of enzyme function via so-called reaction kernels, which allow interpolation and extrapolation in the output (reaction) space. A kernel-based structured output prediction model is used to predict enzymatic reactions from sequence motifs. We bring forward several choices for constructing reaction kernels and experiment with them in the remote homology case, where the functions in the test set have not been seen in the training phase.
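
As a rough illustration of the idea, and not the paper's actual model, the sketch below scores candidate reactions for a query protein by combining an input kernel on sequence motifs with an output kernel on reactions. Everything in it is an assumption made for the example: the motif names, the bond-change features, the use of plain dot-product kernels, and the uniform weights (a trained model would learn them). It only shows how a reaction absent from the training set can still receive a high score, because the output kernel compares it feature by feature with the training reactions.

```python
from collections import Counter

def motif_kernel(a, b):
    """Input kernel (assumed): number of sequence motifs two proteins share."""
    return len(set(a) & set(b))

def reaction_kernel(r, s):
    """Output kernel (assumed): dot product of reaction feature counts."""
    return sum(count * s.get(feat, 0) for feat, count in r.items())

def score(x, candidate, train, weights=None):
    """Sum over training pairs of weight * input similarity * output similarity."""
    weights = weights or [1.0] * len(train)  # uniform weights: no learning here
    return sum(w * motif_kernel(x_i, x) * reaction_kernel(y_i, candidate)
               for w, (x_i, y_i) in zip(weights, train))

def predict(x, candidates, train, weights=None):
    """Return the name of the highest-scoring candidate reaction."""
    return max(candidates, key=lambda name: score(x, candidates[name], train, weights))

# Hypothetical training pairs: (motifs of the protein, features of its reaction).
train = [
    ({"m1", "m2"}, Counter({"C-O break": 1, "O-H form": 1})),
    ({"m2", "m3"}, Counter({"C-N break": 1, "N-H form": 1})),
    ({"m4"}, Counter({"P-O form": 1})),
]

# Candidate reactions, none of which appears in the training set.
candidates = {
    "r_new": Counter({"C-O break": 1, "N-H form": 1}),
    "r_other": Counter({"P-O form": 1, "C-N break": 1}),
}

query = {"m1", "m2"}
best = predict(query, candidates, train)
print(best, {name: score(query, y, train) for name, y in candidates.items()})
```

Here the query shares motifs with the first two training proteins, so the unseen candidate that mixes features of their two reactions ranks first. Restricting the candidate set to unseen reactions mirrors the remote homology setting described in the abstract.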

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Astikainen, K., Holm, L., Pitkänen, E., Szedmak, S., Rousu, J. (2011). Structured Output Prediction of Novel Enzyme Function with Reaction Kernels. In: Fred, A., Filipe, J., Gamboa, H. (eds) Biomedical Engineering Systems and Technologies. BIOSTEC 2010. Communications in Computer and Information Science, vol 127. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-18472-7_29

  • DOI: https://doi.org/10.1007/978-3-642-18472-7_29

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-18471-0

  • Online ISBN: 978-3-642-18472-7

  • eBook Packages: Computer Science, Computer Science (R0)
