DOI: 10.1145/2649387.2649408
Research article

Amb-EM: a SNP-based prediction of HLA alleles using ambiguous HLA data

Published: 20 September 2014

Abstract

The Human Leukocyte Antigen (HLA) genes are among the most studied genes in the human genome, owing to their importance in bone marrow and solid organ transplantation and their strong associations with many autoimmune, infectious, and inflammatory diseases. As such, they can be a highly valuable asset to clinicians and researchers for elucidating the biological mechanisms that may drive those diseases. The extraordinary genetic polymorphism of this region makes it very challenging to type. Several approaches have therefore been proposed for predicting HLA genes from widely available genome-wide single nucleotide polymorphism (SNP) data sets, in an attempt to reduce cost and make use of existing data. These methods use SNPs and high-resolution training HLA data to build models that predict HLA genes in new samples. However, most existing HLA data sets are not available in high resolution (exact allele assignments) but contain allelic ambiguities (inexact allele assignments), because existing typing methodologies cannot always distinguish between several possible alleles at a given gene and so report an ambiguous allele assignment. Current approaches for predicting HLA genes from SNP data do not accommodate learning from ambiguous HLA data and therefore miss the potential for an increased sample size and, consequently, improved prediction performance. In this paper, we propose Amb-EM, a novel algorithm for SNP-based prediction of HLA genes that builds its model directly from ambiguous HLA alleles and predicts high-resolution alleles. Additionally, we measure the impact that uncertainty in the training data has on prediction accuracy, and evaluate it on a real-world data set. Our results show that prediction from ambiguous HLA data outperforms the alternative approach, which first imputes the ambiguous data into high-resolution HLA alleles and then uses them to build the model.
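
The abstract does not spell out the Amb-EM procedure, so the following is only a minimal sketch of the general idea of learning from ambiguous labels with expectation-maximization, not the authors' algorithm. It assumes a deliberately simplified setting: one allele per sample, binary SNPs, and a naive-Bayes model with independent SNPs. The function names (`likelihood`, `em_ambiguous`, `predict`), the smoothing constant `alpha`, and the toy data are all hypothetical. Each training sample carries a set of candidate alleles; the E-step spreads its weight over that set only, and the M-step re-estimates allele frequencies and per-allele SNP probabilities from those soft counts.

```python
def likelihood(snps, theta_a):
    """P(snps | allele) assuming independent Bernoulli SNPs (toy assumption)."""
    p = 1.0
    for x, t in zip(snps, theta_a):
        p *= t if x == 1 else 1.0 - t
    return p


def em_ambiguous(samples, alleles, n_iter=50, alpha=0.5):
    """Fit allele priors and SNP probabilities from ambiguously labelled data.

    samples: list of (snps, candidates), where snps is a tuple of 0/1 values
             and candidates is the set of alleles consistent with the typing.
    alpha:   additive smoothing, keeps every probability strictly in (0, 1).
    """
    n_snps = len(samples[0][0])
    prior = {a: 1.0 / len(alleles) for a in alleles}
    theta = {a: [0.5] * n_snps for a in alleles}

    for _ in range(n_iter):
        # E-step: posterior over alleles, restricted to each candidate set.
        resp = []
        for snps, cands in samples:
            w = {a: prior[a] * likelihood(snps, theta[a]) for a in cands}
            z = sum(w.values())
            resp.append({a: v / z for a, v in w.items()})

        # M-step: re-estimate parameters from the soft (fractional) counts.
        for a in alleles:
            mass = sum(r.get(a, 0.0) for r in resp)
            prior[a] = (mass + alpha) / (len(samples) + alpha * len(alleles))
            for j in range(n_snps):
                ones = sum(r.get(a, 0.0) * s[0][j]
                           for r, s in zip(resp, samples))
                theta[a][j] = (ones + alpha) / (mass + 2 * alpha)
    return prior, theta


def predict(snps, prior, theta):
    """Return the single most probable high-resolution allele for new SNPs."""
    return max(prior, key=lambda a: prior[a] * likelihood(snps, theta[a]))


# Toy data (invented for illustration): two samples cannot be resolved
# between A*02:01 and A*02:05 by the typing method.
ALLELES = ["A*01:01", "A*02:01", "A*02:05"]
SAMPLES = [
    ((1, 0), {"A*01:01"}),
    ((1, 0), {"A*01:01"}),
    ((0, 1), {"A*02:01"}),
    ((0, 0), {"A*02:05"}),
    ((0, 1), {"A*02:01", "A*02:05"}),  # ambiguous typing
    ((0, 0), {"A*02:01", "A*02:05"}),  # ambiguous typing
]

prior, theta = em_ambiguous(SAMPLES, ALLELES)
print(predict((0, 1), prior, theta))
```

In this sketch the ambiguous samples still contribute to the model, weighted by how well each candidate allele explains their SNPs, rather than being discarded or forced to a single imputed allele before training.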



Published In

BCB '14: Proceedings of the 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics
September 2014
851 pages
ISBN:9781450328944
DOI:10.1145/2649387
General Chairs: Pierre Baldi, Wei Wang
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. HLA prediction
  2. SNPs
  3. ambiguous genotypes
  4. expectation-maximization
  5. uncertain data

Conference

BCB '14: ACM-BCB '14
September 20-23, 2014
Newport Beach, California

Acceptance Rates

Overall Acceptance Rate 254 of 885 submissions, 29%
