DOI: 10.1145/3107411.3107412
research-article

Bayesian Collective Markov Random Fields for Subcellular Localization Prediction of Human Proteins

Published: 20 August 2017

Abstract

Advances in biotechnology have made a multitude of heterogeneous proteomic, interactomic, genomic, and functional annotation data accessible. One challenge in computational biology is to integrate these data to enable automated prediction of the subcellular localizations (SCLs) of human proteins. For proteins that have multiple biological roles, correct in silico assignment to several SCLs can be treated as an imbalanced multi-label classification problem. In this study, we developed a Bayesian Collective Markov Random Fields (BCMRFs) model for multi-SCL prediction of human proteins. Given a set of proteins of unknown localization and their corresponding protein-protein interaction (PPI) network, the SCLs of each protein can be inferred from the SCLs of its interacting partners. To do so, we integrate PPIs, the adjacency of SCLs, and protein features, and perform transductive learning on the re-balanced dataset. Our experimental results show that the spatial adjacency of the SCLs improves multi-SCL prediction, especially for SCLs with few annotated instances. Our approach outperforms the state-of-the-art PPI-based and feature-based multi-SCL prediction methods for human proteins.
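
The idea of inferring a protein's SCLs from the SCLs of its interacting partners can be illustrated with a much simpler baseline than the BCMRFs model itself. The sketch below is not the authors' method: it is plain neighbor voting over a PPI adjacency list, with no Bayesian parameter estimation, no SCL adjacency, no protein features, and no re-balancing. The function name `neighbor_vote`, the `threshold` parameter, the protein identifiers, and the toy annotations are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def neighbor_vote(ppi, known, threshold=0.5):
    """Assign label sets to unannotated proteins by voting among annotated partners.

    ppi       -- dict mapping each protein to a list of its interaction partners
    known     -- dict mapping annotated proteins to their set of SCL labels
    threshold -- minimum fraction of annotated partners that must carry a label
                 (an assumed hyperparameter, not taken from the paper)
    """
    predictions = {}
    for protein, partners in ppi.items():
        if protein in known:
            continue  # already annotated, nothing to predict
        annotated = [p for p in partners if p in known]
        if not annotated:
            predictions[protein] = set()  # no evidence available
            continue
        # Count how many annotated partners carry each label.
        counts = Counter(label for p in annotated for label in known[p])
        predictions[protein] = {
            label
            for label, n in counts.items()
            if n / len(annotated) >= threshold
        }
    return predictions


# Toy, made-up network and annotations.
ppi = {
    "P1": ["P2", "P3", "P4"],
    "P2": ["P1"],
    "P3": ["P1"],
    "P4": ["P1"],
}
known = {
    "P2": {"nucleus", "cytoplasm"},
    "P3": {"nucleus"},
    "P4": {"cytoplasm", "membrane"},
}

result = neighbor_vote(ppi, known)
print(result)
```

In this toy case, "nucleus" and "cytoplasm" are each carried by two of the three annotated partners of P1 and pass the 0.5 threshold, while "membrane" appears only once and is dropped. A rare label is easily outvoted under such a scheme, which is the kind of imbalance effect the abstract says the re-balancing and SCL adjacency are meant to address.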


Cited By

  • (2019) Tissue-Specific Subcellular Localization Prediction Using Multi-Label Markov Random Fields. IEEE/ACM Transactions on Computational Biology and Bioinformatics 16:5 (1471-1482). DOI: 10.1109/TCBB.2019.2897683. Online publication date: 1-Sep-2019

Published In

ACM-BCB '17: Proceedings of the 8th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics
August 2017
800 pages
ISBN: 9781450347228
DOI: 10.1145/3107411

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. human protein subcellular localization
  2. imbalanced multi-label classification
  3. Markov random field
  4. transductive learning

Qualifiers

  • Research-article

Funding Sources

  • International DFG Research Training Group GRK 1906/1

Conference

BCB '17

Acceptance Rates

ACM-BCB '17 paper acceptance rate: 42 of 132 submissions (32%)
Overall acceptance rate: 254 of 885 submissions (29%)

Bibliometrics

Article Metrics

  • Downloads (last 12 months): 5
  • Downloads (last 6 weeks): 2
Reflects downloads up to 14 Jan 2025
