Generating Peptide Candidates from Amino-Acid Sequence Databases for Protein Identification via Mass Spectrometry

Nathan Edwards & Ross Lippert

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2452)

Included in the following conference series:

International Workshop on Algorithms in Bioinformatics

Abstract

Protein identification via mass spectrometry forms the foundation of high-throughput proteomics. Tandem mass spectrometry, when applied to a complex mixture of peptides, selects and fragments each peptide to reveal its amino-acid sequence structure. The successful analysis of such an experiment typically relies on amino-acid sequence databases to provide a set of biologically relevant peptides to examine. A key sub-problem, then, for amino-acid sequence database search engines that analyze tandem mass spectra is to efficiently generate all the peptide candidates from a sequence database with mass equal to one of a large set of observed peptide masses. We demonstrate that to solve the problem efficiently, we must deal with substring redundancy in the amino-acid sequence database and focus our attention on looking up the observed peptide masses quickly. We show that it is possible, with some preprocessing and memory overhead, to solve the peptide candidate generation problem in time asymptotically proportional to the size of the sequence database and the number of peptide candidates output.
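The candidate generation problem stated in the abstract can be illustrated with a small baseline. The sketch below is not the authors' algorithm. It is a naive enumeration, assuming nominal integer residue masses, a hypothetical function name `peptide_candidates`, and a symmetric mass `tolerance`. It collapses repeated substrings into one dictionary entry and finds matching observed masses by binary search over a sorted list, which mirrors the two concerns the abstract names (substring redundancy and fast mass lookup) without achieving the stated asymptotic bound.

```python
from bisect import bisect_left

# Nominal (integer) residue masses in daltons: an illustrative simplification,
# not the monoisotopic or average masses a real search engine would use.
RESIDUE_MASS = {
    "G": 57, "A": 71, "S": 87, "P": 97, "V": 99, "T": 101, "C": 103,
    "L": 113, "I": 113, "N": 114, "D": 115, "K": 128, "Q": 128,
    "E": 129, "M": 131, "H": 137, "F": 147, "R": 156, "Y": 163, "W": 186,
}
WATER = 18  # nominal mass of H2O, added once per free peptide


def peptide_candidates(sequences, observed_masses, tolerance=0):
    """Return {peptide: mass} for each distinct substring of the database
    whose mass lies within `tolerance` of some observed mass."""
    targets = sorted(observed_masses)
    if not targets:
        return {}
    max_mass = targets[-1] + tolerance
    found = {}
    for seq in sequences:
        for start in range(len(seq)):
            mass = WATER
            for end in range(start, len(seq)):
                mass += RESIDUE_MASS[seq[end]]
                if mass > max_mass:
                    break  # extending the substring only increases its mass
                # Binary search for the smallest target >= mass - tolerance.
                i = bisect_left(targets, mass - tolerance)
                if i < len(targets) and targets[i] <= mass + tolerance:
                    # Dictionary keys deduplicate repeated substrings.
                    found[seq[start:end + 1]] = mass
    return found


if __name__ == "__main__":
    database = ["GASP", "ASPG"]
    print(peptide_candidates(database, [176, 330]))
```

In this toy database the peptide `AS` occurs in both sequences but is reported once. Residue letters outside the table (for example `X`) would raise a `KeyError` in this sketch, so a real database would need explicit handling for ambiguous symbols.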

Author information

Authors and Affiliations

Celera Genomics, 45 West Gude Drive, Rockville, MD
Nathan Edwards & Ross Lippert

Editor information

Editors and Affiliations

IMIM-UPF-CRG, Dr. Aiguader 80, 08003, Barcelona, Spain
Roderic Guigó
Department of Computer Science, University of California, 95616, Davis, CA, USA
Dan Gusfield

About this paper

Cite this paper

Edwards, N., Lippert, R. (2002). Generating Peptide Candidates from Amino-Acid Sequence Databases for Protein Identification via Mass Spectrometry. In: Guigó, R., Gusfield, D. (eds) Algorithms in Bioinformatics. WABI 2002. Lecture Notes in Computer Science, vol 2452. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45784-4_6

DOI: https://doi.org/10.1007/3-540-45784-4_6
Published: 10 October 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44211-0
Online ISBN: 978-3-540-45784-8
eBook Packages: Springer Book Archive
