Analysis of Regions of Homozygosity: Revisited Through New Bioinformatic Approaches
Figure 1. Flowchart representing the automation of the creation of multigene panels based on ROHs (DB: database; DF: dataframe).
Figure 2. Flowchart to obtain the reference BED file.
Figure 3. The flowchart of the multigene panel lists: white, grey, and black.
Figure 4. Flowchart of the ROH and HPO multigene panel automation.
Figure 5. Overview of the results regarding processes of generating the multigene panel application in a case study, the first Portuguese ROH characterization, and the clustering model.
Figure 6. Pedigree depicting two affected sisters, daughters of a consanguineous couple.
Figure 7. Example of an input for the personalized multigene panels based on HPO term and ROHs.
Figure 8. IGV visualization of the reads mapped to the CSTB gene in both sisters (II:1 and II:2).
Figure 9. BAM visualization depicting the region of the dodecamer repeat expansion in a control sample (I), and in both sisters (II:1 and II:2). No reads are aligned in this region in either patient, suggesting that a possible expansion is biallelic (present in both CSTB alleles).
Figure 10. Histogram depicting the distribution of ROH length above 0.5 Mb in a Portuguese cohort of 3941 samples.
Figure 11. Geographical distribution per municipality of FROH > 0.5 Mb in Portugal Mainland, Autonomous Region of Açores, and Autonomous Region of Madeira.
Figure 12. Geographical distribution per municipality of FROH > 1.5 Mb in Portugal Mainland, Autonomous Region of Açores, and Autonomous Region of Madeira.
Figure 13. Geographical distribution per municipality of FROH > 5 Mb in Portugal Mainland, Autonomous Region of Açores, and Autonomous Region of Madeira.
Figure 14. Map of Portugal representing the consanguinity between 1980 and 1986 (/100,000) adapted from [89] (upper left) and the Portugal Mainland maps for the FROH calculated for ROHs of size above 0.5 Mb (upper right), 1.5 Mb (lower left), and 5 Mb (lower right).
Figure 15. Low-dimensional MDS representations of each “tier” dataset, where Tier 0 is training and validation results (A) and testing results (B); Tier 1 is training and validation results (C) and testing results (D); Tier 2 is training and validation results (E) and testing results (F). Data points are colored according to their consanguinity labels: white “unknown” points do not possess a ground truth label; green “NCON” points represent non-consanguineous samples; red “CON” points represent consanguineous samples; and purple “CON_ST” points represent stringent consanguineous samples. The red dashed circles represent the elliptic envelope’s outlier decision boundary (i.e., points falling outside of the envelope are predicted to be consanguineous, either stringent or non-stringent).
Abstract
1. Introduction
2. Materials and Methods
2.1. Creation of Personalized Multigene Panels Based on ROHs
- Clean up the ROHMMCLI BED file to contain only chromosome and start and end positions.
- Merge the HM and the cleaned ROHMMCLI BED files using bedtools merge with the -d option set to 1,000,000 bp, the maximum distance between ROHs to be merged.
- Use bedtools intersect to find overlaps between the merged BED file and the coding sequence coordinate BED file, producing another BED file with the list of gene coordinates found within ROHs.
- Create a text file with a list of gene Entrez IDs present in the identified ROHs.
- Retrieve the CNV results for the sample under analysis and keep only CNVs that span more than 500,000 bp and are classified as ‘Heterozygous Deletion’, producing a BED file with the CNV genomic coordinates.
- Keep only non-empty files, i.e., files that contain at least one CNV.
- Use the bedtools jaccard tool on the merged ROH and CNV BED files to calculate the Jaccard index for each CNV that intersects an ROH.
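The interval logic behind these steps can be sketched in plain Python. This is a minimal illustration, not the authors' shell script: the function names, the toy coordinates, and the gene IDs are hypothetical, and the merge rule (join intervals whose gap is at most `max_gap` bp) only approximates what bedtools merge does with its -d option.

```python
from collections import defaultdict


def merge_intervals(intervals, max_gap=0):
    """Merge (chrom, start, end) intervals whose gap is <= max_gap bp."""
    by_chrom = defaultdict(list)
    for chrom, start, end in intervals:
        by_chrom[chrom].append((start, end))
    merged = []
    for chrom in sorted(by_chrom):
        cur_start, cur_end = None, None
        for start, end in sorted(by_chrom[chrom]):
            if cur_start is None:
                cur_start, cur_end = start, end
            elif start - cur_end <= max_gap:
                cur_end = max(cur_end, end)  # extend the current block
            else:
                merged.append((chrom, cur_start, cur_end))
                cur_start, cur_end = start, end
        merged.append((chrom, cur_start, cur_end))
    return merged


def genes_in_rohs(rohs, genes):
    """Return sorted IDs of genes (chrom, start, end, gene_id) overlapping any ROH."""
    hits = set()
    for chrom, start, end, gene_id in genes:
        for r_chrom, r_start, r_end in rohs:
            if chrom == r_chrom and start < r_end and r_start < end:
                hits.add(gene_id)
    return sorted(hits)


def jaccard(a, b):
    """Jaccard index in base pairs: intersection length / union length."""
    a, b = merge_intervals(a), merge_intervals(b)
    inter = 0
    for chrom_a, start_a, end_a in a:
        for chrom_b, start_b, end_b in b:
            if chrom_a == chrom_b:
                inter += max(0, min(end_a, end_b) - max(start_a, start_b))
    union = sum(e - s for _, s, e in a) + sum(e - s for _, s, e in b) - inter
    return inter / union if union else 0.0


# Toy inputs (hypothetical coordinates and gene IDs).
hm_rohs = [("chr1", 1_000_000, 2_000_000)]
rohmm_rohs = [("chr1", 2_500_000, 3_000_000), ("chr2", 100, 200)]
merged = merge_intervals(hm_rohs + rohmm_rohs, max_gap=1_000_000)
genes = [("chr1", 2_200_000, 2_210_000, "1111"),
         ("chr1", 5_000_000, 5_010_000, "2222")]
print(merged)
print(genes_in_rohs(merged, genes))
```

In this toy run the two chr1 ROHs are 500,000 bp apart, so they are joined into one region, and gene "1111", which sits in the gap between them, ends up in the panel. That is the practical effect of allowing a merge distance.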
2.2. Creation of Personalized Multigene Panels Based on HPO Terms
2.3. Creation of Personalized Multigene Panels Based on ROH and HPO Terms
2.4. Django Web Application Development
2.5. Establishing the First Portuguese ROH Characterization on a Genomic Scale
- One containing ROHs > 0.5 Mb;
- One containing ROHs > 1.5 Mb;
- One containing ROHs > 5 Mb.
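As a rough illustration of how the three datasets translate into FROH values, the sketch below uses the usual definition of FROH: the summed length of ROHs above a size threshold divided by the length of the genome analyzed. The function names and the toy denominator are assumptions; the outline above does not state which genome length the study used.

```python
# Size thresholds taken from the three datasets listed above.
THRESHOLDS_BP = {"0.5 Mb": 500_000, "1.5 Mb": 1_500_000, "5 Mb": 5_000_000}


def froh(roh_lengths_bp, min_length_bp, genome_length_bp):
    """Fraction of the analyzed genome covered by ROHs longer than min_length_bp."""
    kept = [length for length in roh_lengths_bp if length > min_length_bp]
    return sum(kept) / genome_length_bp


def froh_by_threshold(roh_lengths_bp, genome_length_bp):
    """FROH for each of the three ROH size thresholds."""
    return {
        label: froh(roh_lengths_bp, threshold, genome_length_bp)
        for label, threshold in THRESHOLDS_BP.items()
    }


# Toy sample: three ROHs and a made-up genome length of 100 Mb.
sample_rohs = [600_000, 2_000_000, 6_000_000]
result = froh_by_threshold(sample_rohs, genome_length_bp=100_000_000)
print(result)
```

Because each threshold keeps a subset of the ROHs kept by the previous one, FROH can only stay the same or decrease as the threshold rises.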
2.6. Consanguinity Classification Approach
2.6.1. Feature Extraction
- Count_x: the number of ROHs in chromosome x.
- Sum_x: the sum of ROH sizes in chromosome x.
- Min_x: the minimum ROH size in chromosome x.
- Max_x: the maximum ROH size in chromosome x.
- Mean_x: the mean ROH size in chromosome x.
- STD_x: the standard deviation of ROH size in chromosome x.
- Tier 0: includes “Count_x” and “Sum_x” features only;
- Tier 1: includes “Count_x”, “Sum_x”, “Min_x”, and “Max_x” features only;
- Tier 2: includes “Count_x”, “Sum_x”, “Min_x”, “Max_x”, “Mean_x”, and “STD_x” features.
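The per-chromosome features and the three tiers can be sketched as follows. This is an illustrative reading rather than the study's code: the function names are hypothetical, Mean_x is taken here as the mean ROH size, the standard deviation is the population form, and chromosomes without any ROH are given zeros. All three choices are assumptions.

```python
from statistics import mean, pstdev

# Feature names included in each tier, as listed above.
TIERS = {
    0: ("Count", "Sum"),
    1: ("Count", "Sum", "Min", "Max"),
    2: ("Count", "Sum", "Min", "Max", "Mean", "STD"),
}


def chromosome_features(sizes):
    """Summary statistics of the ROH sizes (bp) found on one chromosome."""
    if not sizes:  # assumption: no ROH on this chromosome -> all zeros
        return {"Count": 0, "Sum": 0, "Min": 0, "Max": 0, "Mean": 0.0, "STD": 0.0}
    return {
        "Count": len(sizes),
        "Sum": sum(sizes),
        "Min": min(sizes),
        "Max": max(sizes),
        "Mean": mean(sizes),
        "STD": pstdev(sizes),  # assumption: population standard deviation
    }


def sample_features(rohs, chromosomes, tier):
    """Build the feature dict of one sample from (chrom, start, end) ROHs."""
    sizes = {chrom: [] for chrom in chromosomes}
    for chrom, start, end in rohs:
        if chrom in sizes:
            sizes[chrom].append(end - start)
    features = {}
    for chrom in chromosomes:
        stats = chromosome_features(sizes[chrom])
        for name in TIERS[tier]:
            features[f"{name}_{chrom}"] = stats[name]
    return features


# Toy sample with ROHs on chromosomes 1 and 2 only.
rohs = [("1", 0, 1_000_000), ("1", 2_000_000, 5_000_000), ("2", 0, 500_000)]
tier0 = sample_features(rohs, ["1", "2", "3"], tier=0)
tier2 = sample_features(rohs, ["1", "2", "3"], tier=2)
print(tier0)
```

Each tier is a superset of the previous one, so a Tier 2 vector has three times as many columns per chromosome as a Tier 0 vector.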
2.6.2. Outlier Detection
3. Results
3.1. Personalized Multigene Panels
- The creation of 15 multigene panels based on a single HPO term: HP:0001627 (abnormal heart morphology); HP:0001047 (atopic dermatitis); HP:0005584 (renal cell carcinoma); HP:0001789 (hydrops fetalis); HP:0011842 (abnormal skeletal morphology); HP:0000846 (adrenal insufficiency); HP:0003155 (elevated circulating alkaline phosphatase concentration); HP:0000548 (cone/cone–rod dystrophy); HP:0011510 (drusen); HP:0000365 (hearing impairment); HP:0000925 (abnormality of the vertebral column); HP:0001949 (neoplasm of the gastrointestinal tract); HP:0007373 (motor neuron atrophy); HP:0006530 (abnormal pulmonary interstitial morphology); HP:0012211 (abnormal renal physiology); HP:0001733 (pancreatitis); HP:0000556 (retinal dystrophy);
- The creation of three multigene panels based on multiple HPO terms: HP:0000077 (abnormality of the kidney), HP:0100243 (leiomyosarcoma), and HP:0100522 (thymoma); HP:0100574 (biliary tract neoplasm) and HP:0003003 (colon cancer); and HP:0003198 (myopathy) and HP:0003473 (fatigable weakness);
- The creation of five personalized multigene panels, each based on a single HPO term for which a panel had previously been prepared and curated manually: HP:0000126 (hydronephrosis); HP:0001250 (seizure); HP:0010566 (hamartoma); HP:0012091 (abnormality of pancreas physiology); and HP:0012114 (endometrial carcinoma). The automated results were compared with the manually curated panels.
3.1.1. Output Obtained for Each Multigene Panel
3.1.2. Application of New Bioinformatic Resources in a Clinical Case
3.2. First Portuguese ROH Characterization on a Genomic Scale
3.2.1. Distribution of ROHs per Length in Portugal
3.2.2. Maps of Portugal and Respective Data for FROH > 0.5, 1.5 and 5 Mb
3.2.3. Comparison with Other Studies
3.3. Consanguinity Classification Results
4. Discussion
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Oliveira, J.; Pereira, R.; Santos, R.; Sousa, M. Evaluating runs of homozygosity in exome sequencing data: Utility in disease inheritance model selection and variant filtering. Commun. Comput. Inf. Sci. 2018, 881, 268–288.
- Peripolli, E.; Munari, D.P.; Silva, M.V.G.B.; Lima, A.L.F.; Irgang, R.; Baldi, F. Runs of homozygosity: Current knowledge and applications in livestock. Anim. Genet. 2017, 48, 255–271.
- Magi, A.; Tattini, L.; Palombo, F.; Benelli, M.; Gialluisi, A.; Giusti, B.; Abbate, R.; Seri, M.; Gensini, G.F.R.; Romeo, G.; et al. H3M2: Detection of runs of homozygosity from whole-exome sequencing data. Bioinformatics 2014, 30, 2852–2859.
- Oniya, O.; Neves, K.; Ahmed, B.; Konje, J.C. A review of the reproductive consequences of consanguinity. Eur. J. Obstet. Gynecol. Reprod. Biol. 2019, 232, 87–96.
- Marchi, N.; Mennecier, P.; Georges, M.; Lafosse, S.; Hegay, T.; Dorzhu, C.; Chichlo, B.; Ségurel, L.; Heyer, E. Close inbreeding and low genetic diversity in Inner Asian human populations despite geographical exogamy. Sci. Rep. 2018, 8, 9397.
- Yengo, L.; Wray, N.R.; Visscher, P.M. Extreme inbreeding in a European ancestry sample from the contemporary UK population. Nat. Commun. 2019, 10, 3719.
- Slatkin, M. A Population-Genetic Test of Founder Effects and Implications for Ashkenazi Jewish Diseases. Am. J. Hum. Genet. 2004, 75, 282–293.
- Dong, J.-T. Chromosomal deletions and tumor suppressor genes in prostate cancer. Cancer Metastasis Rev. 2001, 20, 173–193.
- Nalls, M.A.; Simon-Sanchez, J.; Gibbs, J.R.; Paisan-Ruiz, C.; Bras, J.T.; Tanaka, T.; Matarin, M.; Scholz, S.; Weitz, C.; Harris, T.B.; et al. Measures of autozygosity in decline: Globalization, urbanization, and its implications for medical genetics. PLoS Genet. 2009, 5, e1000415.
- Ceballos, F.C.; Hazelhurst, S.; Ramsay, M. Runs of homozygosity in sub-Saharan African populations provide insights into complex demographic histories. Hum. Genet. 2019, 138, 1123–1142.
- Lemes, R.B.; Nunes, K.; Carnavalli, J.E.P.; Kimura, L.; Mingroni-Netto, R.C.; Meyer, D.; Otto, P.A. Inbreeding estimates in human populations: Applying new approaches to an admixed Brazilian isolate. PLoS ONE 2018, 13, e0196360.
- Ben Halim, N.; Nagara, M.; Regnault, B.; Hsouna, S.; Lasram, K.; Kefi, R.; Azaiez, H.; Khemira, L.; Saidane, R.; Ammar, S.; et al. Estimation of Recent and Ancient Inbreeding in a Small Endogamous Tunisian Community Through Genomic Runs of Homozygosity. Ann. Hum. Genet. 2015, 79, 402–417.
- Kang, J.T.L.; Goldberg, A.; Edge, M.D.; Behar, D.M.; Rosenberg, N.A. Consanguinity Rates Predict Long Runs of Homozygosity in Jewish Populations. Hum. Hered. 2017, 82, 87–102.
- Pemberton, T.J.; Absher, D.; Feldman, M.W.; Myers, R.M.; Rosenberg, N.A.; Li, J.Z. Genomic patterns of homozygosity in worldwide human populations. Am. J. Hum. Genet. 2012, 91, 275–292.
- Kirin, M.; McQuillan, R.; Franklin, C.S.; Campbell, H.; McKeigue, P.M. Genomic Runs of Homozygosity Record Population History and Consanguinity. PLoS ONE 2010, 5, e13996.
- Hunter-Zinck, H.; Musharoff, S.; Salit, J.; Al-Ali, K.A.; Chouchane, L.; Gohar, A.; Matthews, R.; Butler, M.W.; Fuller, J.; Hackett, N.R.; et al. Population genetic structure of the people of Qatar. Am. J. Hum. Genet. 2010, 87, 17–25.
- Mezzavilla, M.; Cocca, M.; Maisano Delser, P.; Badii, R.; Abbaszadeh, F.; Hadi, K.A.; Giorgia, G.; Gasparini, P. Ancestry-related distribution of Runs of homozygosity and functional variants in Qatari population. BMC Genom. Data 2022, 23, 73.
- Scott, E.M.; Halees, A.; Itan, Y.; Spencer, E.G.; He, Y.; Azab, M.A.; Gabriel, S.B.; Belkadi, A.; Boisson, B.; Abel, L.; et al. Characterization of Greater Middle Eastern genetic variation for enhanced disease gene discovery. Nat. Genet. 2016, 48, 1071.
- Yang, X.; Al-Bustan, S.; Feng, Q.; Guo, W.; Ma, Z.; Marafie, M.; Jacob, S.; Al-Mulla, F.; Xu, S. The influence of admixture and consanguinity on population genetic diversity in Middle East. J. Hum. Genet. 2014, 59, 615–622.
- Ceballos, F.C.; Gürün, K.; Altınışık, N.E.; Gemici, H.C.; Karamurat, C.; Koptekin, D.; Vural, K.B.; Mapelli, I.; Sağlıcan, E.; Sürer, E.; et al. Human inbreeding has decreased in time through the Holocene. Curr. Biol. 2021, 31, 3925–3934.e8.
- Kars, M.E.; Başak, A.N.; Onat, O.E.; Bilguvar, K.; Choi, J.; Itan, Y.; Çağlar, C.; Palvadeau, R.; Casanova, J.-L.; Cooper, D.N.; et al. The genetic structure of the Turkish population reveals high levels of variation and admixture. Proc. Natl. Acad. Sci. USA 2021, 118, e2026076118.
- Binzer, S.; Imrell, K.; Binzer, M.; Kyvik, K.O.; Hillert, J.; Stenager, E. High inbreeding in the Faroe Islands does not appear to constitute a risk factor for multiple sclerosis. Mult. Scler. 2015, 21, 996–1002.
- Karafet, T.M.; Bulayeva, K.B.; Bulayev, O.A.; Gurgenova, F.; Omarova, J.; Yepiskoposyan, L.; Savina, O.V.; Veeramah, K.R.; Hammer, M.F. Extensive genome-wide autozygosity in the population isolates of Daghestan. Eur. J. Hum. Genet. 2015, 23, 1405–1412.
- McLaughlin, R.L.; Kenna, K.P.; Vajda, A.; Heverin, M.; Byrne, S.; Donaghy, C.G.; Cronin, S.; Bradley, D.G.; Hardiman, O. Homozygosity mapping in an Irish ALS case-control cohort describes local demographic phenomena and points towards potential recessive risk loci. Genomics 2015, 105, 237–241.
- Alabdullatif, M.A.; Al Dhaibani, M.A.; Khassawneh, M.Y.; El-Hattab, A.W. Chromosomal microarray in a highly consanguineous population: Diagnostic yield, utility of regions of homozygosity, and novel mutations. Clin. Genet. 2017, 91, 616–622.
- Wang, J.C.; Ross, L.; Mahon, L.W.; Owen, R.; Hemmat, M.; Wang, B.T.; El Naggar, M.; Kopita, K.A.; Randolph, L.M.; Chase, J.M.; et al. Regions of homozygosity identified by oligonucleotide SNP arrays: Evaluating the incidence and clinical utility. Eur. J. Hum. Genet. 2015, 23, 663–671.
- Prasad, A.; Sdano, M.A.; Vanzo, R.J.; Mowery-Rushton, P.A.; Serrano, M.A.; Hensel, C.H.; Wassman, E.R. Clinical utility of exome sequencing in individuals with large homozygous regions detected by chromosomal microarray analysis. BMC Med. Genet. 2018, 19, 46.
- Hengel, H.; Buchert, R.; Sturm, M.; Haack, T.B.; Schelling, Y.; Mahajnah, M.; Sharkia, R.; Azem, A.; Balousha, G.; Ghanem, Z.; et al. First-line exome sequencing in Palestinian and Israeli Arabs with neurological disorders is efficient and facilitates disease gene discovery. Eur. J. Hum. Genet. 2020, 28, 1034–1043.
- Palombo, F.; Graziano, C.; Al Wardy, N.; Nouri, N.; Marconi, C.; Magini, P.; Severi, G.; La Morgia, C.; Cantalupo, G.; Cordelli, D.M.; et al. Autozygosity-driven genetic diagnosis in consanguineous families from Italy and the Greater Middle East. Hum. Genet. 2020, 139, 1429–1441.
- Knopp, C.; Rudnik-Schöneborn, S.; Eggermann, T.; Bergmann, C.; Begemann, M.; Schoner, K.; Zerres, K.; Brüchle, N.O. Syndromic ciliopathies: From single gene to multi gene analysis by SNP arrays and next generation sequencing. Mol. Cell. Probes 2015, 29, 299–307.
- de Farias, A.A.; Nunes, K.; Lemes, R.B.; Moura, R.; Fernandes, G.R.; Melo, U.S.; Zatz, M.; Kok, F.; Santos, S. Origin and age of the causative mutations in KLC2, IMPA1, MED25 and WNT7A unravelled through Brazilian admixed populations. Sci. Rep. 2018, 8, 16552.
- Wakil, S.M.; Ramzan, K.; Abuthuraya, R.; Hagos, S.; Al-Dossari, H.; Al-Omar, R.; Murad, H.; Chedrawi, A.; Al-Hassnan, Z.N.; Finsterer, J.; et al. Infantile-onset ascending hereditary spastic paraplegia with bulbar involvement due to the novel ALS2 mutation c.2761C>T. Gene 2014, 536, 217–220.
- Lobo-Prada, T.; Sticht, H.; Bogantes-Ledezma, S.; Ekici, A.; Uebe, S.; Reis, A.; Leal, A. A homozygous mutation in GPT2 associated with nonsyndromic intellectual disability in a consanguineous family from Costa Rica. JIMD Rep. 2017, 36, 59–66.
- Guo, T.; Tan, Z.P.; Chen, H.M.; Zheng, D.Y.; Liu, L.; Huang, X.G.; Chen, P.; Luo, H.; Yang, Y.F. An effective combination of whole-exome sequencing and runs of homozygosity for the diagnosis of primary ciliary dyskinesia in consanguineous families. Sci. Rep. 2017, 7, 7905.
- Costa, P.; Zanus, C.; Faletra, F.; Ventura, G.; di Marzio, G.M.; Cervesi, C.; Carrozzi, M. Epileptic encephalopathy with microcephaly in a patient with asparagine synthetase deficiency: A video-EEG report. Epileptic Disord. 2019, 21, 466–470.
- Khan, R.; Shabbir, R.M.K.; Raza, I.; Abdullah, U.; Naeem, M.A.; Ahmed, A.; Malik, S.; Hu, Z.; Xia, K. A founder RDH5 splice site mutation leads to retinitis punctata albescens in two inbred Pakistani kindreds. Ophthalmic Genet. 2020, 41, 7–12.
- Yu, W.; You, X.; Wang, D.; Dong, K.; Su, J.; Li, C.; Liu, J.; Zhang, Q.; You, F.; Wang, X.; et al. Microarray analysis unmasked two siblings with pure hereditary spastic paraplegia shared a run of homozygosity region on chromosome 3q28-q29. J. Neurol. Sci. 2015, 359, 351–355.
- Calderón, R.; Hernández, C.L.; García-Varela, G.; Masciarelli, D.; Cuesta, P. Inbreeding in Southeastern Spain: The Impact of Geography and Demography on Marital Mobility and Marital Distance Patterns (1900–1969). Hum. Nat. 2018, 29, 45–64.
- Pippucci, T.; Magi, A.; Gialluisi, A.; Romeo, G. Detection of runs of homozygosity from whole exome sequencing data: State of the art and perspectives for clinical, population and epidemiological studies. Hum. Hered. 2014, 77, 63–72.
- Lander, E.S.; Botstein, D. Homozygosity Mapping: A Way to Map Human Recessive Traits with the DNA of Inbred Children. Science 1987, 236, 1567–1570.
- Hu, T.; Chitnis, N.; Monos, D.; Dinh, A. Next-generation sequencing technologies: An overview. Hum. Immunol. 2021, 82, 801–811.
- Pereira, R.; Oliveira, J.; Sousa, M. Bioinformatics and computational tools for next-generation sequencing analysis in clinical genetics. J. Clin. Med. 2020, 9, 132.
- Thompson, J.F.; Milos, P.M. The properties and applications of single-molecule DNA sequencing. Genome Biol. 2011, 12, 217.
- Rhoads, A.; Au, K.F. PacBio Sequencing and Its Applications. Genom. Proteom. Bioinform. 2015, 13, 278–289.
- Zhang, L.; Chen, F.X.; Zeng, Z.; Xu, M.; Sun, F.; Yang, L.; Bi, X.; Lin, Y.; Gao, Y.J.; Hao, H.X.; et al. Advances in Metagenomics and Its Application in Environmental Microorganisms. Front. Microbiol. 2021, 12, 766364.
- Qin, D. Next-generation sequencing and its clinical application. Cancer Biol. Med. 2019, 16, 4–10.
- Barbitoff, Y.A.; Polev, D.E.; Glotov, A.S.; Serebryakova, E.A.; Shcherbakova, I.V.; Kiselev, A.M.; Kostareva, A.A.; Glotov, O.S.; Predeus, A.V. Systematic dissection of biases in whole-exome and whole-genome sequencing reveals major determinants of coding sequence coverage. Sci. Rep. 2020, 10, 2057.
- Choi, M.; Scholl, U.I.; Ji, W.; Liu, T.; Tikhonova, I.R.; Zumbo, P.; Nayir, A.; Bakkaloğlu, A.; Ozen, S.; Sanjad, S.; et al. Genetic diagnosis by whole exome capture and massively parallel DNA sequencing. Proc. Natl. Acad. Sci. USA 2009, 106, 19096–19101.
- Bartha, Á.; Győrffy, B. Comprehensive outline of whole exome sequencing data analysis tools available in clinical oncology. Cancers 2019, 11, 1725.
- Warman Chardon, J.; Beaulieu, C.; Hartley, T.; Boycott, K.M.; Dyment, D.A. Axons to Exons: The Molecular Diagnosis of Rare Neurological Diseases by Next-Generation Sequencing. Curr. Neurol. Neurosci. Rep. 2015, 15, 64.
- Gargano, M.A.; Matentzoglu, N.; Coleman, B.; Addo-Lartey, E.B.; Anagnostopoulos, A.V.; Anderton, J.; Avillach, P.; Bagley, A.M.; Bakštein, E.; Balhoff, J.P.; et al. The Human Phenotype Ontology in 2024: Phenotypes around the world. Nucleic Acids Res. 2024, 52, D1333–D1346.
- Bullich, G.; Matalonga, L.; Pujadas, M.; Papakonstantinou, A.; Piscia, D.; Tonda, R.; Artuch, R.; Gallano, P.; Garrabou, G.; González, J.R.; et al. Systematic Collaborative Reanalysis of Genomic Data Improves Diagnostic Yield in Neurologic Rare Diseases. J. Mol. Diagn. 2022, 24, 529–542.
- Matalonga, L.; Laurie, S.; Papakonstantinou, A.; Piscia, D.; Mereu, E.; Bullich, G.; Thompson, R.; Horvath, R.; Pérez-Jurado, L.; Riess, O.; et al. Improved Diagnosis of Rare Disease Patients through Systematic Detection of Runs of Homozygosity. J. Mol. Diagn. 2020, 22, 1205–1215.
- Becker, J.; Semler, O.; Gilissen, C.; Li, Y.; Bolz, H.J.; Giunta, C.; Bergmann, C.; Rohrbach, M.; Koerber, F.; Zimmermann, K.; et al. Exome sequencing identifies truncating mutations in human SERPINF1 in autosomal-recessive osteogenesis imperfecta. Am. J. Hum. Genet. 2011, 88, 362–371.
- Mezzavilla, M.; Vozzi, D.; Badii, R.; Khalifa Alkowari, M.; Abdulhadi, K.; Girotto, G.; Gasparini, P. Increased rate of deleterious variants in long runs of homozygosity of an inbred population from Qatar. Hum. Hered. 2015, 79, 14–19.
- Yang, T.L.; Guo, Y.; Zhang, L.S.; Tian, Q.; Yan, H.; Papasian, C.J.; Recker, R.R.; Deng, H.W. Runs of homozygosity identify a recessive locus 12q21.31 for human adult height. J. Clin. Endocrinol. Metab. 2010, 95, 3777–3782.
- Wang, L.S.; Hranilovic, D.; Wang, K.; Lindquist, I.E.; Yurcaba, L.; Petkovic, Z.B.; Gidaya, N.; Jernej, B.; Hakonarson, H.; Bucan, M. Population-based study of genetic variation in individuals with autism spectrum disorders from Croatia. BMC Med. Genet. 2010, 11, 134.
- Gross, A.; Tönjes, A.; Kovacs, P.; Veeramah, K.R.; Ahnert, P.; Roshyara, N.R.; Gieger, C.; Rueckert, I.M.; Loeffler, M.; Stoneking, M.; et al. Population-genetic comparison of the Sorbian isolate population in Germany with the German KORA population using genome-wide SNP arrays. BMC Genet. 2011, 12, 67.
- Ghani, M.; Sato, C.; Lee, J.H.; Reitz, C.; Moreno, D.; Mayeux, R.; George-Hyslop, P.S.; Rogaeva, E. Evidence of recessive Alzheimer disease loci in a Caribbean Hispanic data set: Genome-wide survey of runs of homozygosity. JAMA Neurol. 2013, 70, 1261–1267.
- Yang, T.L.; Guo, Y.; Zhang, J.G.; Xu, C.; Tian, Q.; Deng, H.W. Genome-wide Survey of Runs of Homozygosity Identifies Recessive Loci for Bone Mineral Density in Caucasian and Chinese Populations. J. Bone Miner. Res. 2015, 30, 2119–2126.
- Ghani, M.; Reitz, C.; Cheng, R.; Vardarajan, B.N.; Jun, G.; Sato, C.; Naj, A.; Rajbhandary, R.; Wang, L.S.; Valladares, O.; et al. Association of Long Runs of Homozygosity with Alzheimer Disease Among African American Individuals. JAMA Neurol. 2015, 72, 1313–1323.
- Bandrés-Ciga, S.; Price, T.R.; Barrero, F.J.; Escamilla-Sevilla, F.; Pelegrina, J.; Arepalli, S.; Hernández, D.; Gutiérrez, B.; Cervilla, J.; Rivera, M.; et al. Genome-wide assessment of Parkinson’s disease in a Southern Spanish population. Neurobiol. Aging 2016, 45, 213.e3–213.e9.
- Barbieri, C.; Barquera, R.; Arias, L.; Sandoval, J.R.; Acosta, O.; Zurita, C.; Aguilar-Campos, A.; Tito-Álvarez, A.M.; Serrano-Osuna, R.; Gray, R.D.; et al. The Current Genomic Landscape of Western South America: Andes, Amazonia, and Pacific Coast. Mol. Biol. Evol. 2019, 36, 2698–2713.
- Font-Porterias, N.; Caro-Consuegra, R.; Lucas-Sánchez, M.; Lopez, M.; Giménez, A.; Carballo-Mesa, A.; Bosch, E.; Calafell, F.; Quintana-Murci, L.; Comas, D. The Counteracting Effects of Demography on Functional Genomic Variation: The Roma Paradigm. Mol. Biol. Evol. 2021, 38, 2804–2817.
- Da Cruz, P.R.S.; Ananina, G.; Secolin, R.; Gil-Da-Silva-Lopes, V.L.; Lima, C.S.P.; de França, P.H.C.; Donatti, A.; Lourenço, G.J.; de Araujo, T.K.; Simioni, M.; et al. Demographic history differences between Hispanics and Brazilians imprint haplotype features. G3 2022, 12, jkac111.
- Ruan, X.; Kocher, J.P.A.; Pommier, Y.; Liu, H.; Reinhold, W.C. Mass homozygotes accumulation in the NCI-60 cancer cell lines as compared to HapMap Trios, and relation to fragile site location. PLoS ONE 2012, 7, e31628.
- Santoni, F.A.; Makrythanasis, P.; Antonarakis, S.E. CATCHing putative causative variants in consanguineous families. BMC Bioinform. 2015, 16, 310.
- Sonehara, K.; Okada, Y. Obelisc: An identical-by-descent mapping tool based on SNP streak. Bioinformatics 2020, 36, 5567–5570.
- Garone, C.; Pippucci, T.; Cordelli, D.M.; Zuntini, R.; Castegnaro, G.; Marconi, C.; Graziano, C.; Marchiani, V.; Verrotti, A.; Seri, M.; et al. FA2H-related disorders: A novel c.270+3A>T splice-site mutation leads to a complex neurodegenerative phenotype. Dev. Med. Child Neurol. 2011, 53, 958–961.
- Seelow, D.; Schuelke, M. HomozygosityMapper2012-bridging the gap between homozygosity mapping and deep sequencing. Nucleic Acids Res. 2012, 40, W516–W520.
- Seelow, D.; Schuelke, M.; Hildebrandt, F.; Nürnberg, P. HomozygosityMapper: An interactive approach to homozygosity mapping. Nucleic Acids Res. 2009, 37 (Suppl. S2), W593–W599.
- Kancheva, D.; Atkinson, D.; De Rijk, P.; Zimon, M.; Chamova, T.; Mitev, V.; Yaramis, A.; Maria Fabrizi, G.; Topaloglu, H.; Tournev, I.; et al. Novel mutations in genes causing hereditary spastic paraplegia and Charcot-Marie-Tooth neuropathy identified by an optimized protocol for homozygosity mapping based on whole-exome sequencing. Genet. Med. 2016, 18, 600–607.
- Szpiech, Z.A.; Blant, A.; Pemberton, T.J. GARLIC: Genomic Autozygosity Regions Likelihood-based Inference and Classification. Bioinformatics 2017, 33, 2059–2062.
- Görmez, Z.; Bakir-Gungor, B.; Saǧiroǧlu, M.Ş. HomSI: A homozygous stretch identifier from next-generation sequencing data. Bioinformatics 2014, 30, 445–447.
- Quinodoz, M.; Peter, V.G.; Bedoni, N.; Bertrand, B.R.; Cisarova, K.; Salmaninejad, A.; Sepahi, N.; Rodrigues, R.; Piran, M.; Mojarrad, M.; et al. AutoMap is a high performance homozygosity mapping tool using next-generation sequencing data. Nat. Commun. 2021, 12, 518.
- Yoon, B.-J. Hidden Markov Models and their Applications in Biological Sequence Analysis. Curr. Genom. 2009, 10, 402–415.
- Narasimhan, V.; Danecek, P.; Scally, A.; Xue, Y.; Tyler-Smith, C.; Durbin, R. BCFtools/RoH: A hidden Markov model approach for detecting autozygosity from next-generation sequencing data. Bioinformatics 2016, 32, 1749–1751.
- Zhuang, Z.; Gusev, A.; Cho, J.; Pe’er, I. Detecting Identity by Descent and Homozygosity Mapping in Whole-Exome Sequencing Data. PLoS ONE 2012, 7, e47618.
- Browning, S.R.; Browning, B.L. High-Resolution Detection of Identity by Descent in Unrelated Individuals. Am. J. Hum. Genet. 2010, 86, 526–539.
- Çelik, G.; Tuncalı, T. ROHMM: A flexible hidden Markov model framework to detect runs of homozygosity from genotyping data. Hum. Mutat. 2022, 43, 158–168.
- Vigeland, M.D.; Gjøtterud, K.S.; Selmer, K.K. FILTUS: A desktop GUI for fast and efficient detection of disease-causing variants, including a novel autozygosity detector. Bioinformatics 2016, 32, 1592–1594.
- hapROH · PyPI. (n.d.). Retrieved 27 March 2023. Available online: https://pypi.org/project/hapROH/ (accessed on 6 June 2023).
- Ringbauer, H.; Novembre, J.; Steinrücken, M. Parental relatedness through time revealed by runs of homozygosity in ancient DNA. Nat. Commun. 2021, 12, 5425.
- Kruskal, J.B.; Hill, M. Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis. Psychometrika 1964, 29, 1–27.
- Rousseeuw, P.J.; Van Driessen, K. A fast algorithm for the minimum covariance determinant estimator. Technometrics 1999, 41, 212–223.
- Lalioti, M.D.; Mirotsou, M.; Buresi, C.; Peitsch, M.C.; Rossier, C.; Ouazzani, R.; Baldy-Moulinier, M.; Bottani, A.; Malafosse, A.; Antonarakis, S.E. Identification of mutations in cystatin B, the gene responsible for the Unverricht-Lundborg type of progressive myoclonus epilepsy (EPM1). Am. J. Hum. Genet. 1997, 60, 342.
- McQuillan, R.; Leutenegger, A.L.; Abdel-Rahman, R.; Franklin, C.S.; Pericic, M.; Barac-Lauc, L.; Smolej-Narancic, N.; Janicijevic, B.; Polasek, O.; Tenesa, A.; et al. Runs of Homozygosity in European Populations. Am. J. Hum. Genet. 2008, 83, 359.
- Moreno-Grau, S.; Fernández, M.V.; de Rojas, I.; Garcia-González, P.; Hernández, I.; Farias, F.; Budde, J.P.; Quintela, I.; Madrid, L.; González-Pérez, A.; et al. Long runs of homozygosity are associated with Alzheimer’s disease. Transl. Psychiatry 2021, 11, 142.
- Santos, H.G.; Dias, J.A.; Pimenta, Z.P. Incidência de Casamentos Consanguíneos na População Portuguesa, 1980–1986. In Saúde em Números; 1988; Volume 3, pp. 41–48.
- Ceballos, F.C.; Joshi, P.K.; Clark, D.W.; Ramsay, M.; Wilson, J.F. Runs of homozygosity: Windows into population history and trait architecture. Nat. Rev. Genet. 2018, 19, 220–234.
- Martin, A.R.; Williams, E.; Foulger, R.E.; Leigh, S.; Daugherty, L.C.; Niblock, O.; Leong, I.U.S.; Smith, K.R.; Gerasimenko, O.; Haraldsdottir, E.; et al. PanelApp crowdsources expert knowledge to establish consensus diagnostic gene panels. Nat. Genet. 2019, 51, 1560–1565.
II:1 | II:2 |
---|---|
DHDDS | SIK1 |
HMGCL | CSTB |
MERC | SLC32A1 |
SDHA | |
SIK1 | |
CSTB | |
PIGV | |
SLC25A19 | |
SLC32A1 | |
TERT | |
TSEN54 | |
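One simple way to read the table above is to look for genes that appear in the panels of both affected sisters. The sketch below does this with a set intersection; the gene lists are copied from the table, while the function name is hypothetical and the step itself is an illustration rather than a documented part of the authors' pipeline.

```python
# Gene lists copied from the table above (columns II:1 and II:2).
panel_ii_1 = ["DHDDS", "HMGCL", "MERC", "SDHA", "SIK1", "CSTB",
              "PIGV", "SLC25A19", "SLC32A1", "TERT", "TSEN54"]
panel_ii_2 = ["SIK1", "CSTB", "SLC32A1"]


def shared_genes(*panels):
    """Return the sorted genes present in every panel."""
    common = set(panels[0])
    for panel in panels[1:]:
        common &= set(panel)
    return sorted(common)


print(shared_genes(panel_ii_1, panel_ii_2))  # ['CSTB', 'SIK1', 'SLC32A1']
```

All three genes listed for II:2 are also listed for II:1, so the shared set is simply the II:2 column. It includes CSTB, the gene examined in Figures 8 and 9.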
FROH > 0.5 Mb Intervals | Number of Samples |
---|---|
(0.000, 0.004] | 3086 |
(0.004, 0.006] | 205 |
(0.006, 0.010] | 153 |
(0.010, 0.018] | 105 |
(0.018, 0.034] | 123 |
(0.034, 0.088] | 88 |
FROH > 1.5 Mb Intervals | Number of Samples |
---|---|
(0.000, 0.003] | 1418 |
(0.003, 0.005] | 192 |
(0.005, 0.009] | 160 |
(0.009, 0.017] | 103 |
(0.017, 0.033] | 126 |
(0.033, 0.085] | 77 |
FROH > 5 Mb Intervals | Number of Samples |
---|---|
(0.000, 0.002] | 36 |
(0.002, 0.004] | 314 |
(0.004, 0.008] | 144 |
(0.008, 0.016] | 110 |
(0.016, 0.032] | 90 |
(0.032, 0.074] | 43 |
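The interval tables above use half-open bins of the form (left, right]. A minimal sketch of how per-sample FROH values could be counted into such bins is given below. The function names and toy values are hypothetical, and how the bin edges themselves were chosen is not stated in the source.

```python
from bisect import bisect_left


def bin_counts(values, edges):
    """Count values into half-open bins (edges[i], edges[i + 1]]."""
    counts = [0] * (len(edges) - 1)
    for value in values:
        # bisect_left finds the first edge >= value, so a value equal to
        # a right edge is assigned to the bin that ends at that edge.
        i = bisect_left(edges, value) - 1
        if 0 <= i < len(counts):
            counts[i] += 1
    return counts


def interval_labels(edges):
    """Format bin labels in the same style as the tables."""
    return [f"({lo:.3f}, {hi:.3f}]" for lo, hi in zip(edges, edges[1:])]


# Toy FROH values; the edges reuse the first three bins of the 0.5 Mb table.
edges = [0.0, 0.004, 0.006, 0.010]
values = [0.0, 0.001, 0.004, 0.005, 0.010, 0.02]
for label, count in zip(interval_labels(edges), bin_counts(values, edges)):
    print(label, count)
```

With bins open on the left, a sample whose FROH is exactly 0 falls outside the first interval, and values above the last edge are likewise left uncounted.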
ROH Size Threshold | Mean FROH | Mean FROH of Means per Municipality | FROH Comparative Values [87] |
---|---|---|---|
FROH > 0.5 Mb | 0.0042 | 0.0057 | 0.0315 |
FROH > 1.5 Mb | 0.0033 | 0.0049 | 0.0021 |
FROH > 5 Mb | 0.0020 | 0.0039 | 0.0001 |
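The first two numeric columns of the table above are different averages of the same per-sample values. The sketch below contrasts them on toy data, under the assumption that "Mean FROH" averages over all samples while "Mean FROH of Means per Municipality" first averages within each municipality and then averages those municipality means with equal weight. The function names and numbers are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def overall_mean(samples):
    """Mean FROH over all (municipality, froh) samples."""
    return mean(value for _, value in samples)


def mean_of_municipality_means(samples):
    """Average the per-municipality means, weighting each municipality equally."""
    by_municipality = defaultdict(list)
    for municipality, value in samples:
        by_municipality[municipality].append(value)
    return mean(mean(values) for values in by_municipality.values())


# Toy data: a large municipality with low FROH and a small one with high FROH.
samples = [("A", 0.002)] * 4 + [("B", 0.012)]
print(overall_mean(samples))
print(mean_of_municipality_means(samples))
```

In the toy data the single sample from municipality B counts for one fifth of the overall mean but for half of the mean of means, so the second figure is higher. A pattern of this kind is one plausible reason the two columns differ.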
District | FROH > 0.5 Mb | FROH > 1.5 Mb | FROH > 5.0 Mb | Number of Consanguineous Marriages (10,000) [89] |
---|---|---|---|---|
Açores | 0.0046 | 0.0035 | 0.0023 | 78.7 |
Aveiro | 0.0052 | 0.0041 | 0.0024 | 22.1 |
Beja | 0.0048 | 0.0039 | 0.0022 | 22.8 |
Braga | 0.0034 | 0.0025 | 0.0013 | 19.2 |
Bragança | 0.0102 | 0.0090 | 0.0060 | 52.7 |
Castelo Branco | 0.0066 | 0.0054 | 0.0034 | 19.9 |
Coimbra | 0.0063 | 0.0053 | 0.0036 | 38.2 |
Évora | 0.0039 | 0.0028 | 0.0016 | 34.5 |
Faro | 0.0029 | 0.0019 | 0.0010 | 27.2 |
Guarda | 0.0048 | 0.0038 | 0.0024 | 35.3 |
Leiria | 0.0058 | 0.0047 | 0.0030 | 35.1 |
Lisboa | 0.0039 | 0.0030 | 0.0017 | 20.2 |
Madeira | 0.0077 | 0.0068 | 0.0041 | 133.6 |
Portalegre | 0.0106 | 0.0092 | 0.0070 | 24.8 |
Porto | 0.0026 | 0.0018 | 0.0010 | 14.4 |
Santarém | 0.0056 | 0.0045 | 0.0032 | 27.6 |
Setúbal | 0.0038 | 0.0029 | 0.0019 | 30.1 |
Viana do Castelo | 0.0030 | 0.0021 | 0.0010 | 17.8 |
Vila Real | 0.0070 | 0.0060 | 0.0037 | 38.3 |
Viseu | 0.0105 | 0.0091 | 0.0059 | 38.7 |
Dataset Tier (Feature Set) | Best Contamination Hyperparameter | Validation F1-Score | Test F1-Score |
---|---|---|---|
Tier 0 (Count_x, Sum_x) | 0.0786 | 0.9310 | 0.9412 |
Tier 1 (Count_x, Sum_x, Min_x, Max_x) | 0.1190 | 0.9655 | 0.9434 |
Tier 2 (Count_x, Sum_x, Min_x, Max_x, Mean_x, STD_x) | 0.1061 | 0.9474 | 0.9615 |
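The contamination hyperparameter in the table above sets the fraction of samples that the outlier detector treats as outliers, and the Figure 15 caption states that points outside the envelope are predicted to be consanguineous. The sketch below is a simplified stand-in, not the study's model: it ranks 2D points by Mahalanobis distance computed from the ordinary sample mean and covariance, whereas an elliptic envelope relies on a robust covariance estimate. It also flags points in the same data it was fitted on. Function names and toy data are hypothetical.

```python
def mahalanobis_sq(points):
    """Squared Mahalanobis distance of each 2D point from the sample mean."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    det = sxx * syy - sxy ** 2
    if det == 0:
        raise ValueError("covariance matrix is singular")
    # d^2 = v^T S^-1 v with the closed-form inverse of the 2x2 covariance S.
    return [
        (syy * (x - mx) ** 2 - 2 * sxy * (x - mx) * (y - my) + sxx * (y - my) ** 2) / det
        for x, y in points
    ]


def flag_outliers(points, contamination):
    """Return 1 for the most distant `contamination` fraction of points, else 0."""
    distances = mahalanobis_sq(points)
    k = int(round(contamination * len(points)))
    ranked = sorted(range(len(points)), key=lambda i: distances[i], reverse=True)
    flagged = set(ranked[:k])
    return [1 if i in flagged else 0 for i in range(len(points))]


def f1_score(y_true, y_pred):
    """F1-score with label 1 (consanguineous) as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data: nine clustered points (label 0) and one distant point (label 1).
points = [(x, y) for x in (-1, 0, 1) for y in (-1, 0, 1)] + [(10, 10)]
labels = [0] * 9 + [1]
predictions = flag_outliers(points, contamination=0.1)
print(f1_score(labels, predictions))
```

Tuning then amounts to repeating the flagging step for several contamination values and keeping the one with the best validation F1-score, which is how the "Best Contamination Hyperparameter" column can be read.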
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Valente, S.; Ribeiro, M.; Schnur, J.; Alves, F.; Moniz, N.; Seelow, D.; Freixo, J.P.; Silva, P.F.; Oliveira, J. Analysis of Regions of Homozygosity: Revisited Through New Bioinformatic Approaches. BioMedInformatics 2024, 4, 2374-2399. https://doi.org/10.3390/biomedinformatics4040128