Research article. DOI: 10.1145/2351316.2351328

Accelerating minor allele frequency computation with graphics processors

Published: 12 August 2012

Abstract

The computation of minor allele frequency (MAF) is at the core of a Genome-Wide Association Study (GWAS). Because the computation is intensive and requires high precision, MAF analysis has so far been limited to hundreds of individuals. To enable the computation for thousands of individuals, we have developed GAMA, a high-performance MAF computation program with GPU acceleration. Specifically, we design a parallel reduction algorithm that matches the GPU's data-parallel architecture. To implement the new algorithm efficiently on the GPU, we make effective use of the fast, on-chip local memory shared within each GPU multiprocessor. To avoid user-level thread synchronization, we exploit the GPU's warp-based thread scheduling. Furthermore, we address floating-point underflow through a logarithm transformation. As a result, GAMA enables MAF computation for up to a thousand individuals for the first time. On a server equipped with an NVIDIA Tesla C2070 GPU and two Intel Xeon E5520 2.27 GHz CPUs, GAMA outperforms a state-of-the-art single-threaded MAF computation tool and our optimized 16-thread parallel CPU implementation by around 47 and 3.5 times, respectively.
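The two numerical ideas named in the abstract, a logarithm transformation to avoid underflow and a pairwise (tree-shaped) reduction, can be illustrated with a small CPU-side sketch. This is not GAMA's GPU kernel: the function names `log_add` and `log_tree_reduce` are hypothetical, the code is sequential Python, and it only assumes that the quantities being summed are tiny probabilities stored as logarithms. Each pass of the outer loop corresponds to one step that a data-parallel device could execute across many pairs at once.

```python
import math


def log_add(a: float, b: float) -> float:
    """Return log(exp(a) + exp(b)) without leaving log space."""
    if a == -math.inf:
        return b
    if b == -math.inf:
        return a
    hi, lo = (a, b) if a >= b else (b, a)
    # Factor out the larger term so the exponent is never positive.
    return hi + math.log1p(math.exp(lo - hi))


def log_tree_reduce(log_values):
    """Sum values given as logarithms using a pairwise tree reduction.

    Illustrative only: on a GPU, every iteration of the inner loop
    would be handled by a separate thread within the same step.
    """
    vals = list(log_values)
    if not vals:
        return -math.inf  # log(0): the empty sum
    n = len(vals)
    stride = 1
    while stride < n:
        # Combine element i with its partner `stride` positions away.
        for i in range(0, n - stride, 2 * stride):
            vals[i] = log_add(vals[i], vals[i + stride])
        stride *= 2
    return vals[0]


if __name__ == "__main__":
    # Four terms of exp(-1000) each: every term underflows to 0.0 as a
    # plain double, but their sum is representable in log space.
    tiny = [-1000.0] * 4
    print(sum(math.exp(v) for v in tiny))
    print(log_tree_reduce(tiny))
```

In the usage at the bottom, the direct sum is computed from terms that have already underflowed, while the log-space reduction keeps the result as roughly -1000 + log(4). A reduction over n values needs about log2(n) such steps, which is what makes the pattern a natural fit for data-parallel hardware.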

Cited By

  • (2022) Estimation of site frequency spectra from low-coverage sequencing data using stochastic EM reduces overfitting, runtime, and memory usage. Genetics 222:4. DOI: 10.1093/genetics/iyac148. Online publication date: 29-Sep-2022
  • (2019) Accelerating Large-Scale Genome-Wide Association Studies With Graphics Processors. Biotechnology, pp. 428-461. DOI: 10.4018/978-1-5225-8903-7.ch017. Online publication date: 2019
  • (2014) Accelerating Large-Scale Genome-Wide Association Studies with Graphics Processors. Big Data Management, Technologies, and Applications, pp. 349-380. DOI: 10.4018/978-1-4666-4699-5.ch014. Online publication date: 2014

Published In

BigMine '12: Proceedings of the 1st International Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications
August 2012
134 pages
ISBN: 9781450315470
DOI: 10.1145/2351316

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

KDD '12

Acceptance Rates

Overall Acceptance Rate 13 of 23 submissions, 57%
