On the use of structure and sequence-based features for protein classification and retrieval

Published: 19 December 2007

Abstract

The need to retrieve or classify proteins using structure or sequence-based similarity underlies many biomedical applications. In drug discovery, researchers search for proteins that share specific chemical properties as sources of new treatments. In folding simulations, similar intermediate structures might indicate a common folding pathway. Here we present two normalized, stand-alone representations of proteins that enable fast and efficient object retrieval based on sequence or structure. To create our sequence-based representation, we take the profiles returned by the PSI-BLAST alignment algorithm and create a normalized summary using a discrete wavelet transform. For our structural representation, we transform each 3D structure into a normalized 2D distance matrix and apply a 2D wavelet decomposition to generate our descriptor. We also create a hybrid representation by concatenating the two descriptors. We evaluate the generality of our models by using them as indices for database retrieval experiments as well as feature vectors for classification. We find that our methods provide excellent performance when compared with the state of the art for each task. Our results show that the sequence-based representation is generally superior to the structure-based representation and that, in the classification context, the hybrid strategy affords a significant improvement over sequence or structure alone.
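The pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not state the wavelet family, the normalized sizes, the number of decomposition levels, or which coefficients are kept, so everything below is an assumption made for illustration: a Haar wavelet, nearest-neighbour resampling to a fixed size, two decomposition levels, and retention of the approximation band only. The function names (`structure_descriptor`, `sequence_descriptor`, `hybrid_descriptor`) are hypothetical, and the profile is assumed to be an n x 20 position-specific scoring matrix such as PSI-BLAST can produce.

```python
import numpy as np


def haar_step(x, axis):
    """One level of the orthonormal Haar transform along `axis` (even length)."""
    x = np.moveaxis(x, axis, 0)
    approx = (x[0::2] + x[1::2]) / np.sqrt(2.0)
    detail = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    return np.moveaxis(approx, 0, axis), np.moveaxis(detail, 0, axis)


def resample(a, size, axis):
    """Nearest-neighbour resampling of `a` to `size` entries along `axis`."""
    idx = np.round(np.linspace(0, a.shape[axis] - 1, size)).astype(int)
    return np.take(a, idx, axis=axis)


def structure_descriptor(coords, size=32, levels=2):
    """Fixed-length descriptor from an (n, 3) array of residue coordinates.

    Assumed recipe: pairwise distance matrix -> resample to size x size ->
    `levels` rounds of 2D Haar decomposition, keeping the approximation band.
    """
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))            # n x n distance matrix
    dist = resample(resample(dist, size, 0), size, 1)   # normalize the size
    for _ in range(levels):
        dist, _detail = haar_step(dist, 0)              # transform rows
        dist, _detail = haar_step(dist, 1)              # transform columns
    return dist.ravel()


def sequence_descriptor(profile, length=64, levels=2):
    """Fixed-length descriptor from an (n, 20) sequence profile.

    Assumed recipe: resample to `length` positions, then apply a 1D Haar
    transform along the sequence axis, keeping the approximation band.
    """
    prof = resample(np.asarray(profile, dtype=float), length, 0)
    for _ in range(levels):
        prof, _detail = haar_step(prof, 0)
    return prof.ravel()


def hybrid_descriptor(profile, coords):
    """Concatenate the sequence and structure descriptors."""
    return np.concatenate(
        [sequence_descriptor(profile), structure_descriptor(coords)]
    )
```

Because the structural descriptor is built from pairwise distances, it does not change when the whole structure is translated or rotated, and every protein maps to a vector of the same length regardless of its size. With the default parameters assumed here, the structural descriptor has 64 entries and the sequence descriptor has 320, so the hybrid vector has 384.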

Cited By

  • (2019) High-throughput and scalable protein function identification with Hadoop and Map-only pattern of the MapReduce processing model. Knowledge and Information Systems 60:1 (145-178). DOI: 10.1007/s10115-018-1245-3. Online publication date: 2-Aug-2019
  • (2018) Multiresolution-based bilinear recurrent neural network. Knowledge and Information Systems 19:2 (235-248). DOI: 10.5555/3227223.3227407. Online publication date: 29-Dec-2018
  • (2014) Unified framework for representing and ranking. Pattern Recognition 47:6 (2293-2300). DOI: 10.1016/j.patcog.2013.12.003. Online publication date: 1-Jun-2014

Published In

Knowledge and Information Systems, Volume 14, Issue 1
December 2007
137 pages

Publisher

Springer-Verlag

Berlin, Heidelberg

Author Tags

  1. Bioinformatics
  2. Protein indexing
  3. Protein retrieval
  4. Sequence and structure-based protein representations

Qualifiers

  • Article
