Abstract
With the growth of the PDB and the simultaneous slowing of the discovery of new protein folds, we may now be able to answer the question of how discrete protein fold space is. Studies by Skolnick et al. (PNAS, 106, 15690, 2009) have concluded that it is in fact continuous. In the present work we extend our initial observation (PNAS, 106(51), E137, 2009) that this conclusion depends upon the resolution at which structures are considered, which makes it important to determine what resolution is most useful. We use graph-theoretical approaches to investigate the connectedness of the protein structure universe, and show that the modularity of protein domain architecture is of fundamental importance for future improvements in structure matching, with consequences for our understanding of protein domain evolution and modification. We also show that state-of-the-art structure superimposition algorithms are unable to distinguish between conformational and topological variation. This work is important not only for our understanding of the discreteness of protein fold space, but also for the more critical question of what precisely should be spatially aligned in structure superimposition. We further investigate the dependence on the similarity metric, and conclude that fold usage in homology-reduced datasets is very similar to usage across the whole PDB and should not be ignored in large-scale studies of protein structure similarity.
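The resolution dependence described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the scores in `TOY_SCORES` are invented values for five made-up domains, the score range of 0 to 1 is an assumption, and the helper names `connected_components` and `components_at_threshold` are hypothetical. The sketch only shows how a similarity graph can look continuous (one connected component) at a permissive cutoff and discrete (many components) at a strict one.

```python
def connected_components(n_nodes, edges):
    """Count connected components of an undirected graph using union-find."""
    parent = list(range(n_nodes))

    def find(i):
        # Follow parent links to the root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    return len({find(i) for i in range(n_nodes)})


def components_at_threshold(scores, n_nodes, threshold):
    """Keep only pairs scoring at or above the threshold, then count components."""
    edges = [pair for pair, score in scores.items() if score >= threshold]
    return connected_components(n_nodes, edges)


# Hypothetical pairwise structural similarity scores (assumed range 0..1).
# Pairs that are not listed are treated as having no detectable similarity.
TOY_SCORES = {
    (0, 1): 0.82,
    (1, 2): 0.51,
    (2, 3): 0.38,
    (3, 4): 0.77,
    (0, 4): 0.12,
}

if __name__ == "__main__":
    for cutoff in (0.3, 0.5, 0.8, 0.9):
        n = components_at_threshold(TOY_SCORES, 5, cutoff)
        print(f"cutoff {cutoff:.1f}: {n} connected component(s)")
```

In this toy data the five domains form a single connected set at a cutoff of 0.3 and become fully disconnected at 0.9, so whether the space appears continuous or discrete is determined by the chosen cutoff.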
References
Cuff, A.L., et al.: Nucleic Acids Res. 37, D310–D314 (2009)
Murzin, A.G., Brenner, S.E., Hubbard, T., Chothia, C.: J. Mol. Biol. 247, 536–540 (1995)
Finn, R.D., et al.: Nucleic Acids Res. 36, D281–D288 (2008)
Cuff, A.L., et al.: Nucleic Acids Res. 39, D420–D426 (2011)
Zhang, Y., Hubner, I.A., Arakaki, A.K., Shakhnovich, E., Skolnick, J.: Proc. Natl. Acad. Sci. U.S.A. 103, 2605–2610 (2006)
Grabowski, M., Joachimiak, A., Otwinowski, Z., Minor, W.: Curr. Opin. Struct. Biol. 17, 347–353 (2007)
Skolnick, J., Arakaki, A.K., Lee, S.Y., Brylinski, M.: Proc. Natl. Acad. Sci. U.S.A. 106, 15690–15695 (2009)
Zhang, Y., Skolnick, J.: Nucleic Acids Res. 33, 2302–2309 (2005)
Berman, H.M., et al.: Nucleic Acids Res. 28, 235–242 (2000)
Zimmermann, M., Towfic, F., Jernigan, R.L., Kloczkowski, A.: Proc. Natl. Acad. Sci. U.S.A. 106, E137 (2009)
Watts, D.J., Strogatz, S.H.: Nature 393, 440–442 (1998)
Newman, M.E., Girvan, M.: Phys. Rev. E 69, 026113 (2004)
Van Dongen, S.: Technical Report INS-R0010. National Research Institute for Mathematics and Computer Science in the Netherlands (2000)
Van Dongen, S.: Ph.D. Thesis, Univ Utrecht, The Netherlands (2000)
Gibrat, J.F., Madej, T., Bryant, S.H.: Curr. Opin. Struct. Biol. 6, 377–385 (1996)
Altschul, S.F., et al.: Nucleic Acids Res. 25, 3389–3402 (1997)
Zhang, Y.: BMC Bioinf. 9, 40 (2008)
de Leeuw, M., Reuveni, S., Klafter, J., Granek, R.: PLoS One 4, e7296 (2009)
Reuveni, S., Granek, R., Klafter, J.: Proc. Natl. Acad. Sci. U.S.A. 107, 13696–13700 (2010)
Lee, J., et al.: Science 322, 438–442 (2008)
Guntas, G., Purbeck, C., Kuhlman, B.: Proc. Natl. Acad. Sci. U.S.A. 107, 19296–19301 (2010)
Zhou, Y., Vitkup, D., Karplus, M.: J. Mol. Biol. 285, 1371–1375 (1999)
Holm, L., Sander, C.: Nucleic Acids Res. 25, 231–234 (1997)
Holm, L., Sander, C.: Nucleic Acids Res. 26, 316–319 (1998)
Yoo, P.D., Sikder, A.R., Taheri, J., Zhou, B.B., Zomaya, A.Y.: IEEE Trans. Nanobiosci. 7, 172–181 (2008)
Pandit, S.B., Skolnick, J.: BMC Bioinf. 9, 531 (2008)
Ye, Y., Godzik, A.: Bioinformatics 19(Suppl 2), ii246–ii255 (2003)
Horimoto, K., Toh, H.: Bioinformatics 17, 1143–1151 (2001)
Satuluri, V., Parthasarathy, S., Ucar, D.: Proceedings of the First ACM International Conference on Bioinformatics and Computational Biology, BCB 2010, pp. 247–256 (2010). https://dl.acm.org/citation.cfm?doid=1854776.1854812
Viksna, J., Gilbert, D.: Bioinformatics 23, 832–841 (2007)
Birzele, F., Csaba, G., Zimmer, R.: Nucleic Acids Res. 36, 550–558 (2008)
Fong, J.H., Geer, L.Y., Panchenko, A.R., Bryant, S.H.: J. Mol. Biol. 366, 307–315 (2007)
Meier, S., et al.: Curr. Biol. 17, 173–178 (2007)
Gilbert, D., Westhead, D., Nagano, N., Thornton, J.: Bioinformatics 15, 317–326 (1999)
Torrance, G.M., Gilbert, D.R., Michalopoulos, I., Westhead, D.W.: Bioinformatics 21, 2537–2538 (2005)
Wang, G., Dunbrack Jr., R.L.: Bioinformatics 19, 1589–1591 (2003)
Acknowledgements
AK and RLJ acknowledge support from the National Science Foundation (DBI 1661391) and from National Institutes of Health (R01GM127701 and R01GM127701-01S1).
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Zimmermann, M.T., Towfic, F., Jernigan, R.L., Kloczkowski, A. (2019). Characteristics of Protein Fold Space Exhibits Close Dependence on Domain Usage. In: Rojas, I., Valenzuela, O., Rojas, F., Ortuño, F. (eds) Bioinformatics and Biomedical Engineering. IWBBIO 2019. Lecture Notes in Computer Science, vol 11465. Springer, Cham. https://doi.org/10.1007/978-3-030-17938-0_32
DOI: https://doi.org/10.1007/978-3-030-17938-0_32
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-17937-3
Online ISBN: 978-3-030-17938-0
eBook Packages: Computer Science, Computer Science (R0)