
Predicting Protein-Protein Interactions from Protein Domains Using a Set Cover Approach

Published: 01 January 2007

Abstract

One goal of contemporary proteome research is the elucidation of cellular protein interactions. Based on currently available protein-protein interaction and domain data, we introduce a novel method, Maximum Specificity Set Cover (MSSC), for the prediction of protein-protein interactions. In our approach, we map the relationship between interactions of proteins and their corresponding domain architectures to a generalized weighted set cover problem. The application of a greedy algorithm provides sets of domain interactions that explain the presence of protein interactions to the largest degree of specificity. Utilizing domain and protein interaction data of S. cerevisiae, MSSC enables prediction of previously unknown protein interactions, links that are well supported by a high tendency of coexpression and functional homogeneity of the corresponding proteins. Focusing on concrete examples, we show that MSSC reliably predicts protein interactions in well-studied molecular systems, such as the 26S proteasome and RNA polymerase II of S. cerevisiae. We also show that the quality of the predictions is comparable to that of Maximum Likelihood Estimation, while MSSC is faster. This new algorithm and all data sets used are accessible through a Web portal at http://ppi.cse.nd.edu.
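To make the set cover framing in the abstract concrete, the following is a minimal sketch of the textbook greedy heuristic for weighted set cover applied to toy data. It is not the MSSC implementation: the abstract does not give the weighting scheme, so the cost function below (one plus the number of non-interacting protein pairs that also carry the domain pair) is an assumption chosen only to illustrate a preference for specific domain pairs. The protein names, domain names, and all function names are hypothetical.

```python
from itertools import combinations, product


def pair(a, b):
    """Store an unordered pair as a sorted tuple."""
    return tuple(sorted((a, b)))


def carries(domain_pair, p, q, domains):
    """True if proteins p and q contain the two domains, in either order."""
    d, e = domain_pair
    return (d in domains[p] and e in domains[q]) or (e in domains[p] and d in domains[q])


def build_candidates(domains, observed):
    """Map each domain pair to the observed protein interactions it could explain."""
    covers = {}
    for p, q in observed:
        for d, e in product(domains[p], domains[q]):
            covers.setdefault(pair(d, e), set()).add(pair(p, q))
    return covers


def illustrative_cost(domain_pair, domains, observed):
    """ASSUMED weight: 1 + number of non-interacting protein pairs carrying the pair."""
    extra = sum(
        1
        for p, q in combinations(sorted(domains), 2)
        if pair(p, q) not in observed and carries(domain_pair, p, q, domains)
    )
    return 1 + extra


def greedy_weighted_set_cover(universe, covers, cost):
    """Repeatedly pick the set with the lowest cost per newly covered element."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best, best_ratio = None, None
        for key in sorted(covers):  # sorted for deterministic tie-breaking
            gain = len(covers[key] & uncovered)
            if gain == 0:
                continue
            ratio = cost[key] / gain
            if best_ratio is None or ratio < best_ratio:
                best, best_ratio = key, ratio
        if best is None:  # nothing left can be covered
            break
        chosen.append(best)
        uncovered -= covers[best]
    return chosen, uncovered


def predict(chosen, domains, observed):
    """Unobserved protein pairs that carry at least one chosen domain pair."""
    return {
        pair(p, q)
        for p, q in combinations(sorted(domains), 2)
        if pair(p, q) not in observed
        and any(carries(dp, p, q, domains) for dp in chosen)
    }


# Hypothetical toy data: protein -> set of domains, plus observed interactions.
domains = {"P1": {"A"}, "P2": {"B"}, "P3": {"A", "C"}, "P4": {"B"}, "P5": {"C"}}
observed = {pair("P1", "P2"), pair("P3", "P4"), pair("P3", "P5")}

covers = build_candidates(domains, observed)
cost = {dp: illustrative_cost(dp, domains, observed) for dp in covers}
chosen, uncovered = greedy_weighted_set_cover(observed, covers, cost)
predicted = predict(chosen, domains, observed)

print(chosen)             # [('C', 'C'), ('A', 'B')]
print(sorted(predicted))  # [('P1', 'P4'), ('P2', 'P3')]
```

In this toy run the pair (C, C) is selected first because it explains an observed interaction without implying any unobserved one, and (A, B) then covers the two remaining interactions. The unobserved protein pairs that carry a selected domain pair are reported as candidate predictions.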



Published In

IEEE/ACM Transactions on Computational Biology and Bioinformatics, Volume 4, Issue 1
January 2007
160 pages

Publisher

IEEE Computer Society Press
Washington, DC, United States

Author Tags

1. Computations on discrete structures
2. bioinformatics (genome or protein) databases
3. biology
4. genetics
5. graph algorithms
