A Hybrid Scoring Function for Protein Multiple Alignment

Emily Rocke

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2452)

Included in the following conference series:

International Workshop on Algorithms in Bioinformatics

Abstract

Previous algorithms for motif discovery and protein alignment have used a variety of scoring functions, each specialized to find certain types of similarity in preference to others. Here we present a novel scoring function that combines the relative entropy score with a sensitivity to amino acid similarities, producing a score that is highly sensitive to the weakly conserved patterns typically seen in proteins. We investigate the performance of the hybrid score compared to existing scoring functions. We conclude that the hybrid is more sensitive than previous protein scoring functions, both in the initial detection of a weakly conserved region of similarity and, given such a region, in the detection of its weakly conserved instances.
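
For orientation, the relative entropy component mentioned in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the chapter's hybrid score: the abstract does not say how amino acid similarity is incorporated, so that part is deliberately left out. The function names, the uniform background distribution, and the example columns are all hypothetical choices made for this sketch.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_relative_entropy(column, background=None):
    """Relative entropy (in bits) of one ungapped alignment column.

    Computes sum_a p_a * log2(p_a / b_a), where p_a is the observed
    frequency of residue a in the column and b_a its background frequency.
    """
    if background is None:
        # Assumption for this sketch: uniform background over 20 residues.
        background = {a: 1.0 / len(AMINO_ACIDS) for a in AMINO_ACIDS}
    counts = Counter(column)
    n = len(column)
    score = 0.0
    for residue, count in counts.items():
        p = count / n
        score += p * math.log2(p / background[residue])
    return score


def alignment_relative_entropy(rows, background=None):
    """Sum of per-column relative entropies for equal-length, ungapped rows."""
    return sum(
        column_relative_entropy("".join(col), background) for col in zip(*rows)
    )


# Hypothetical columns: I/L/V are chemically similar, I/D/K are not.
similar = column_relative_entropy("ILVI")
dissimilar = column_relative_entropy("IDKI")
conserved = column_relative_entropy("IIII")
```

Under a uniform background, the columns "ILVI" and "IDKI" receive the same relative entropy, because the score depends only on residue counts and not on which residues co-occur. A purely count-based score therefore cannot reward a column of similar residues over a column of dissimilar ones, which is the kind of gap a similarity-aware score is meant to address.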

Author information

Authors and Affiliations

University of Washington, USA
Emily Rocke

Editor information

Editors and Affiliations

IMIM-UPF-CRG, Dr. Aiguader 80, 08003, Barcelona, Spain
Roderic Guigó
Department of Computer Science, University of California, 95616, Davis, CA, USA
Dan Gusfield

Cite this paper

Rocke, E. (2002). A Hybrid Scoring Function for Protein Multiple Alignment. In: Guigó, R., Gusfield, D. (eds) Algorithms in Bioinformatics. WABI 2002. Lecture Notes in Computer Science, vol 2452. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45784-4_19

DOI: https://doi.org/10.1007/3-540-45784-4_19
Published: 10 October 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44211-0
Online ISBN: 978-3-540-45784-8
eBook Packages: Springer Book Archive
