Back-Translation for Discovering Distant Protein Homologies

  • Conference paper
Algorithms in Bioinformatics (WABI 2009)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 5724)

Abstract

Frameshift mutations in protein-coding DNA sequences drastically change the resulting protein sequence, which prevents classic protein alignment methods from revealing the proteins’ common origin. Moreover, when the divergence also involves a large number of substitutions, homology detection becomes difficult even at the DNA level. To cope with this situation, we propose a novel method for inferring distant homology relations between two proteins that accounts for the frameshift and point mutations that may have affected the coding sequences. We design a dynamic programming alignment algorithm over memory-efficient graph representations of the complete set of putative DNA sequences of each protein. The goal is to determine the two putative DNA sequences that have the best-scoring alignment under a powerful scoring system designed to reflect the most probable evolutionary process. This allows us to uncover evolutionary information that is not captured by traditional alignment methods, as confirmed by biologically significant examples.
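
The idea described in the abstract can be illustrated with a toy sketch. The Python code below is not the authors' implementation: it assumes the standard genetic code, a plain match/mismatch/gap score in place of the paper's evolutionary scoring system, and an uncompressed graph in which each node is a (nucleotide position, codon) pair. All function names (`translate`, `nodes`, `preds`, `best_score`) and the scoring parameters are hypothetical. It runs a Smith-Waterman-style local alignment over the two back-translation graphs, so the best score is taken over every pair of putative coding sequences.

```python
# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

# Back-translation table: amino acid -> list of codons encoding it.
BACK = {}
for codon, aa in CODON_TO_AA.items():
    BACK.setdefault(aa, []).append(codon)


def translate(dna):
    """Translate complete codons of dna, starting at its first base."""
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def nodes(protein):
    """Graph nodes (nucleotide position, codon), sorted by position."""
    return [(p, c) for p in range(3 * len(protein)) for c in BACK[protein[p // 3]]]


def preds(protein, pos, codon):
    """Predecessors of node (pos, codon) in the back-translation graph."""
    if pos == 0:
        return []
    if pos % 3:  # inside a codon: the codon choice is already fixed
        return [(pos - 1, codon)]
    # first base of a codon: any codon of the previous amino acid may precede it
    return [(pos - 1, c) for c in BACK[protein[pos // 3 - 1]]]


def best_score(a, b, match=2, mismatch=-3, gap=-4):
    """Best local alignment score over all pairs of putative DNA sequences.

    Scoring values are illustrative assumptions, not the paper's scheme.
    """
    S, best = {}, 0
    for u in nodes(a):
        pu = preds(a, *u)
        for v in nodes(b):
            pv = preds(b, *v)
            same = u[1][u[0] % 3] == v[1][v[0] % 3]
            diag = max((S[x, y] for x in pu for y in pv), default=0)
            up = max((S[x, v] for x in pu), default=0)
            left = max((S[u, y] for y in pv), default=0)
            S[u, v] = max(0, diag + (match if same else mismatch),
                          up + gap, left + gap)
            best = max(best, S[u, v])
    return best


if __name__ == "__main__":
    dna = "ATGGCTAAACCTGGATTTGAC"      # made-up coding sequence
    prot_a = translate(dna)            # reading frame 0
    prot_b = translate(dna[1:])        # frame shifted by one nucleotide
    print(prot_a, prot_b, best_score(prot_a, prot_b))
```

In this made-up example the two proteins (`MAKPGFD` and `WLNLDL`) look unrelated at the amino-acid level, yet the graph alignment reaches the maximum possible score of 36 (18 matched nucleotides at 2 points each), because both graphs contain paths spelling the shared DNA. The memory-efficient graph representation and the evolutionary scoring system mentioned in the abstract are not reproduced here.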

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Gîrdea, M., Noé, L., Kucherov, G. (2009). Back-Translation for Discovering Distant Protein Homologies. In: Salzberg, S.L., Warnow, T. (eds) Algorithms in Bioinformatics. WABI 2009. Lecture Notes in Computer Science, vol 5724. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04241-6_10

  • DOI: https://doi.org/10.1007/978-3-642-04241-6_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-04240-9

  • Online ISBN: 978-3-642-04241-6

  • eBook Packages: Computer Science, Computer Science (R0)
