[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1109/SC.2005.18acmconferencesArticle/Chapter ViewAbstractPublication PagesscConference Proceedingsconference-collections
Article

ClawHMMER: A Streaming HMMer-Search Implementatio

Published: 12 November 2005 Publication History

Abstract

The proliferation of biological sequence data has motivated the need for an extremely fast probabilistic sequence search. One method for performing this search involves evaluating the Viterbi probability of a hidden Markov model (HMM) of a desired sequence family for each sequence in a protein database. However, one of the difficulties with current implementations is the time required to search large databases. Many current and upcoming architectures offering large amounts of compute power are designed with data-parallel execution and streaming in mind. We present a streaming algorithm for evaluating an HMM's Viterbi probability and refine it for the specific HMM used in biological sequence search. We implement our streaming algorithm in the Brook language, allowing us to execute the algorithm on graphics processors. We demonstrate that this streaming algorithm on graphics processors can outperform available CPU implementations. We also demonstrate this implementation running on a 16 node graphics cluster.

References

[1]
ALTSCHUL, S., GISH, W., MILLER, W., MYERS, E., AND LIPMAN, D. 1990. Basic local alignment search tool. Journal of Molecular Biology 215, 3 (October), 403-410.
[2]
ATI, 2004. Radeon X800 product site. http://www.ati.com/products/radeonx800.
[3]
BHAYA, D., DUFRESNE, A., VAULOT, D., AND GROSSMAN, A. 2002. Analysis of the hli gene family in marine and freshwater cyanobacteria. FEMS Microbiology Letters 215, 209-219.
[4]
BORKAR, S. Y., DUBEY, P., KAHN, K. C., KUCK, D. J., MULDER, H., PAWLOWSKI, S. S., AND RATTNER, J. R. 2005. Platform 2015: Intel processor and platform evolution for the next decade. Technology@Intel Magazine 3, 3 (April).
[5]
BUCK, I., FATAHALIAN, K., AND HANRAHAN, P. 2004. Gpubench: Evaluating gpu performance for numerical and scientific applications. In Poster Session at GP2 Workshop on General Purpose Computing on Graphics Processors. http://gpubench.sourceforge.net.
[6]
BUCK, I., FOLEY, T., HORN, D., SUGERMAN, J., HANRAHAN, P., HOUSTON, M., AND FATAHALIAN, K. 2004. Brook for GPUs: Stream Computing on Graphics Hardware. In Proceedings of the ACM SIGGRAPH 2004.
[7]
BUCK, I. 2005. Taking the plunge into GPU computing. In GPU Gems 2: Programming Techniques for High Performance Graphics and General Purpose Computation, M. Pharr, Ed. Addison Wesley, 880.
[8]
CHUKKAPALLI, G., GUDA, C., AND SUBRAMANIAM, S. 2004 SledgeHMMER: a web server for batch searching the pfam database. Nucleic Acids Research 32 (July), W542-544.
[9]
CLARKE, N., AND BERG, J. M. 1998. Zinc fingers in caenorhabditis elegans: Finding families and probing pathways. Science 282 (December), 2018-2022.
[10]
COFER, H., AND SGI., 2002. HMMER on Silicon Graphics. http://sgi.com/industries/sciences/chembio/resources/hmmer.
[11]
DALLY, W. J., HANRAHAN, P., EREZ, M., KNIGHT, T. J., LABONTÉ, F., AHN, J.-H., JAYASENA, N., KAPASI, U. J., DAS, A., GUMMARAJU, J., AND BUCK, I. 2003 Merrimac: Supercomputing with Streams. In Proceedings of SC2003, ACM Press.
[12]
DHONG, S. H., TAKAHASHI, O., WHITE, M., ASANO, T., NAKAZATO, T., SILBERMAN, J., KAWASUMI, A., AND YOSHIHARA, H. 2005. A 4.8 GHz fully pipelined embedded SRAM in the streaming processor of a CELL processor. In Proceedings of IEEE International Solid-state Circuits Conference, 486-487, 612.
[13]
EDDY, S., 2003. HMMER: Profile HMMs for protein sequence analysis. http://hmmer.wustl.edu.
[14]
EDDY, S. 2003. HMMER user's guide. Howard Hughes Medical Institute and Dept. of Genetics, Washington University School of Medicine. (October).
[15]
EUROPEAN BIOINFORMATICS INSTITUTE, SWISS INSTITUTE OF BIOINFORMATICS, AND GEORGETOWN UNIVERSITY, 2005. Universal protein resource. http://www.uniprot.org.
[16]
FATAHALIAN, K., SUGERMAN, J., AND HANRAHAN, P. 2004. Understanding the efficiency of GPU algorithms for matrix-matrix multiplication. In Proceedings of Graphics Hard-ware , Eurographics Association.
[17]
FAWCETT, P., EICHENBERGER, P., LOSICK, R., AND YOUNGMAN, P. 2000. The transcriptional profile of early to middle sporulation in Bacillus subtilis. Proceedings of the National Academy of Sciences USA 97, 14 (July), 8063- 8068.
[18]
FLACHS, B., ASANO, S., DHONG, S. H., HOFSTEE, P., GERVAIS, G., KIM, R., LE, T., LIU, P., LEENSTRA, J., LIBERTY, J., MICHAEL, B., OH, H., MUELLER, S. M., TAKAHASHI, O., HATAKEYAMA, A., WATANABE, Y., AND YANO, N. 2005. A streaming processor unit for a CELL processor. In Proceedings of IEEE International Solid-state Circuits Conference, 134-135.
[19]
FORNEY, G. D. 1973. The Viterbi algorithm. Proc. IEEE 61 (Mar.), 268-78.
[20]
KAPASI, U., DALLY, W. J., RIXNER, S., OWENS, J. D., AND KHAILANY, B. 2002. The Imagine Stream Processor. Proceedings of International Conference on Computer Design (September).
[21]
KROGH, A., BROWN, M., MIAN, S., SJOLANDER, K., AND HAUSSLER, D. 1994. Hidden markov models in computational biology: Applications to protein modeling. Journal of Molecular Biology 235, 1501-1531.
[22]
LINDAHL, E., 2005. Altivec HMMer, version 2. http://lindahl.sbc.su.se/software/altivec/altivec-hmmer,-version- 2.html.
[23]
LINDHOLM, E., KILGARD, M. J., AND MORETON, H. 2001. A user-programmable vertex engine. In Proceedings of SIGGRAPH 2001, ACM Press/Addison-Wesley Publishing Co., 149-158.
[24]
NARUKAWA, K., AND KADOWAKI, T. 2004. Transmembrane regions prediction for G-protein-coupled receptors by hidden markov model. In Proceedings of the 15th International Conference on Genome Informatics, Universal Academy Press.
[25]
2005. National Center for Biotechnology Information. ftp://ncbi.nlm.nih.gov.
[26]
NVIDIA, 2005. GeForce 7800: Product overview. http://nvidia.com/page/geforce_7800.html.
[27]
RABINER, L. R. 1989. A tutorial on hidden markov models and selected applications in speech recognition. Proceedings of the IEEE 77, 2 (Februrary).
[28]
SÁNCHEZ-PULIDO, L., ROJAS, A., VAN WELY, K., MARTINEZ-A, C., AND VALENCIA, A. 2004. Spoc: A widely distributed domain associated with cancer, apoptosis and transcription. BMC Bioinformatics 5, 1, 91.
[29]
STAUB, E., MENNERICH, D., AND ROSENTHAL, A. 2001. The Spin/Ssty repeat: a new motif identified in proteins involved in vertebrate development from gamete to embryo. Genome Biology 3, 1, research0003.1-research0003.6.
[30]
VITERBI, A. J. 1967. Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE Trans. on Information Theory 13, 2, 260-269.

Cited By

View all
  • (2017)Accelerating Viterbi algorithm on graphics processing unitsComputing10.1007/s00607-017-0557-699:11(1105-1123)Online publication date: 1-Nov-2017
  • (2013)Multi-parallel prefiltering on the convey HC-1 for supporting homology detectionProceedings of the 20th European MPI Users' Group Meeting10.1145/2488551.2488587(169-174)Online publication date: 15-Sep-2013
  • (2012)A protein sequence analysis hardware accelerator based on divergencesInternational Journal of Reconfigurable Computing10.1155/2012/2013782012(4-4)Online publication date: 1-Jan-2012
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
SC '05: Proceedings of the 2005 ACM/IEEE conference on Supercomputing
November 2005
829 pages
ISBN:1595930612

Sponsors

Publisher

IEEE Computer Society

United States

Publication History

Published: 12 November 2005

Check for updates

Author Tags

  1. Bio Science
  2. Brook
  3. Data Parallel Computing
  4. GPU Computing
  5. Programmable Graphics Hardware
  6. Stream Computing

Qualifiers

  • Article

Conference

SC '05
Sponsor:

Acceptance Rates

SC '05 Paper Acceptance Rate 62 of 260 submissions, 24%;
Overall Acceptance Rate 1,516 of 6,373 submissions, 24%

Upcoming Conference

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)1
  • Downloads (Last 6 weeks)0
Reflects downloads up to 15 Jan 2025

Other Metrics

Citations

Cited By

View all
  • (2017)Accelerating Viterbi algorithm on graphics processing unitsComputing10.1007/s00607-017-0557-699:11(1105-1123)Online publication date: 1-Nov-2017
  • (2013)Multi-parallel prefiltering on the convey HC-1 for supporting homology detectionProceedings of the 20th European MPI Users' Group Meeting10.1145/2488551.2488587(169-174)Online publication date: 15-Sep-2013
  • (2012)A protein sequence analysis hardware accelerator based on divergencesInternational Journal of Reconfigurable Computing10.1155/2012/2013782012(4-4)Online publication date: 1-Jan-2012
  • (2012)Dynamically managed data for CPU-GPU architecturesProceedings of the Tenth International Symposium on Code Generation and Optimization10.1145/2259016.2259038(165-174)Online publication date: 31-Mar-2012
  • (2012)Exploring the limits of GPGPU scheduling in control flow bound applicationsACM Transactions on Architecture and Code Optimization10.1145/2086696.20867088:4(1-22)Online publication date: 26-Jan-2012
  • (2011)Autotuned parallel I/O for highly scalable biosequence analysisProceedings of the 2011 TeraGrid Conference: Extreme Digital Discovery10.1145/2016741.2016772(1-8)Online publication date: 18-Jul-2011
  • (2011)Automatic CPU-GPU communication management and optimizationProceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation10.1145/1993498.1993516(142-151)Online publication date: 4-Jun-2011
  • (2011)Automatic CPU-GPU communication management and optimizationACM SIGPLAN Notices10.1145/1993316.199351646:6(142-151)Online publication date: 4-Jun-2011
  • (2010)A HMMER hardware accelerator using divergencesProceedings of the Conference on Design, Automation and Test in Europe10.5555/1870926.1871023(405-410)Online publication date: 8-Mar-2010
  • (2010)Accelerating HMMER on GPUs by implementing hybrid data and task parallelismProceedings of the First ACM International Conference on Bioinformatics and Computational Biology10.1145/1854776.1854844(418-421)Online publication date: 2-Aug-2010
  • Show More Cited By

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media