research-article

Anatomy of a hash-based long read sequence mapping algorithm for next generation DNA sequencing

Published: 01 January 2011

Abstract

Motivation: Recently, a number of programs have been proposed for mapping short reads to a reference genome. Many of them are heavily optimized for short-read mapping and hence are very efficient for shorter queries, but this makes them inefficient or inapplicable for reads longer than 200 bp. However, many sequencers are already generating longer reads, and more are expected to follow. For long-read sequence mapping, the options are limited; BLAT, SSAHA2, FANGS and BWA-SW are among the popular ones. However, resequencing and personalized medicine need much faster software to map these long sequencing reads to a reference genome in order to identify SNPs or rare transcripts.
Results: We present AGILE (AliGnIng Long rEads), a hash-table-based, high-throughput sequence mapping algorithm for longer 454 reads. It uses diagonal multiple seed-match criteria, customized q-gram filtering and a dynamic incremental search approach, among other heuristics, to optimize every step of the mapping process. In our experiments, we observe that AGILE is more accurate than BLAT, and comparable to BWA-SW and SSAHA2. For practical error rates (< 5%) and read lengths (200–1000 bp), AGILE is significantly faster than BLAT, SSAHA2 and BWA-SW. Even for the other cases, AGILE is comparable to BWA-SW and several times faster than BLAT and SSAHA2.
Availability: http://www.ece.northwestern.edu/~smi539/agile.html.
Supplementary information: Supplementary data are available at Bioinformatics online.
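
The abstract names a hash-table index, q-gram filtering and a diagonal seed-match criterion without describing them further. As a rough illustration of how these ideas are commonly combined in hash-based mappers, a minimal Python sketch follows. It is not AGILE's implementation: the function names, the seed length `k` and the use of the standard q-gram lemma as the filtering threshold are all assumptions made for this example.

```python
from collections import defaultdict


def build_index(reference, k):
    """Hash table from each k-mer of the reference to its start positions."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def qgram_threshold(read_len, k, max_errors):
    """q-gram lemma: a read with at most max_errors edits still shares
    at least this many k-mers with the region it came from."""
    return read_len - k + 1 - k * max_errors


def candidate_diagonals(read, index, k, min_seeds):
    """Count seed hits per diagonal (ref_pos - read_pos) and keep only
    diagonals supported by at least min_seeds matching k-mers."""
    votes = defaultdict(int)
    for read_pos in range(len(read) - k + 1):
        # .get avoids inserting empty entries into the defaultdict index
        for ref_pos in index.get(read[read_pos:read_pos + k], ()):
            votes[ref_pos - read_pos] += 1
    return {diag: n for diag, n in votes.items() if n >= min_seeds}
```

In this sketch, only the diagonals that pass the threshold would be handed to a slower, exact alignment step; requiring several seeds on one diagonal, rather than a single hit, is what keeps the candidate list short for long reads.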

Cited By

  • (2014) Accelerating the next generation long read mapping with the FPGA-based system. IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB) 11:5 (840-852). DOI: 10.1109/TCBB.2014.2326876. Online publication date: 1-Sep-2014
  • (2014) Approximate k-Mer matching using fuzzy hash maps. IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB) 11:1 (258-264). DOI: 10.1109/TCBB.2014.2309609. Online publication date: 1-Jan-2014
  • (2013) Acceleration of the long read mapping on a PC-FPGA architecture (abstract only). Proceedings of the ACM/SIGDA international symposium on Field programmable gate arrays (271-271). DOI: 10.1145/2435264.2435329. Online publication date: 11-Feb-2013

Published In

Bioinformatics, Volume 27, Issue 2
January 2011
144 pages

Publisher

Oxford University Press, Inc.

United States
