research-article

A Framework for Scalable Genome Assembly on Clusters, Clouds, and Grids

Published: 01 December 2012

Abstract

Bioinformatics researchers need efficient means to process large collections of genomic sequence data. One application of interest, genome assembly, has great potential for parallelization; however, most previous attempts at parallelization require uncommon high-end hardware. This paper introduces the Scalable Assembler at Notre Dame (SAND) framework that can achieve significant speedup using large numbers of commodity machines harnessed from clusters, clouds, and grids. SAND interfaces with the Celera open-source assembly toolkit, replacing two independent sequential modules with scalable parallel alternatives: the candidate selector exploits distributed memory capacity, and the sequence aligner exploits distributed computing capacity. For large problems, these modules provide robust task and data management while also achieving speedup with high efficiency. We show results for several data sets ranging from 738 thousand to over 320 million alignments using resources ranging from a small cluster to more than a thousand nodes spanning three institutions.
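The two-stage split described in the abstract (select candidate read pairs first, then align each candidate independently) can be illustrated with a minimal sketch. This is not SAND's implementation: the k-mer filter, the exact suffix-prefix overlap score, the function names, and the toy reads are all assumptions chosen for illustration, and a local thread pool stands in for the distributed workers.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations

K = 4  # k-mer length; an illustrative value, not taken from the paper


def candidate_pairs(reads, k=K):
    """Return sorted pairs of read ids that share at least one k-mer."""
    index = defaultdict(set)  # k-mer -> ids of reads containing it
    for rid, seq in reads.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(rid)
    pairs = set()
    for ids in index.values():
        pairs.update(combinations(sorted(ids), 2))
    return sorted(pairs)


def overlap_length(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for n in range(min(len(a), len(b)), 0, -1):
        if a[-n:] == b[:n]:
            return n
    return 0


def align_all(reads, pairs, workers=4):
    """Score every candidate pair as an independent task."""
    def task(pair):
        x, y = pair
        # Try both orientations and keep the longer overlap.
        score = max(overlap_length(reads[x], reads[y]),
                    overlap_length(reads[y], reads[x]))
        return pair, score

    # Threads are a local stand-in for remote workers (assumption).
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(task, pairs))


# Hypothetical toy reads; r1 and r2 share the k-mer "CGGA".
reads = {"r1": "ACGTACGGA", "r2": "CGGATTTC", "r3": "GGGGGGGG"}
pairs = candidate_pairs(reads)
scores = align_all(reads, pairs)
```

Because each pair is scored without reference to any other pair, the alignment stage is the kind of work that can be farmed out to many machines, while the k-mer index in the first stage is the memory-heavy part.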

Published In

IEEE Transactions on Parallel and Distributed Systems, Volume 23, Issue 12
December 2012
191 pages

Publisher

IEEE Press

Author Tags

  1. Bioinformatics
  2. Biomedical informatics
  3. Cloud computing
  4. Distributed processing
  5. Distributed systems
  6. Genomics
  7. Random access memory
  8. bioinformatics
  9. genome assembly

Qualifiers

  • Research-article

Cited By

  • (2024) Extending parallel programming patterns with adaptability features. Cluster Computing 27:9 (12547-12568). DOI: 10.1007/s10586-024-04622-0. Online publication date: 1-Dec-2024
  • (2019) A programming-level approach for elasticizing parallel scientific applications. Journal of Systems and Software 110:C (239-252). DOI: 10.1016/j.jss.2015.08.051. Online publication date: 3-Jan-2019
  • (2018) Scaling up genome annotation using MAKER and work queue. International Journal of Bioinformatics Research and Applications 10:4/5 (447-460). DOI: 10.1504/IJBRA.2014.062994. Online publication date: 21-Dec-2018
  • (2017) Persona. Proceedings of the 2017 USENIX Conference on Usenix Annual Technical Conference (153-165). DOI: 10.5555/3154690.3154705. Online publication date: 12-Jul-2017
  • (2013) Storm surge simulation and load balancing in Azure cloud. Proceedings of the High Performance Computing Symposium (1-9). DOI: 10.5555/2499968.2499982. Online publication date: 7-Apr-2013
