WO2014121180A1 - Genetic variants in interstitial lung disease subjects - Google Patents
- Publication number
- WO2014121180A1 (PCT/US2014/014395)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- genetic variant
- lung disease
- subject
- interstitial lung
- nucleic acid
Classifications
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6883—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/156—Polymorphic or mutational markers
Definitions
- Idiopathic pulmonary fibrosis (IPF) is a low-prevalence, devastating disease of unknown etiology characterized by an interstitial fibrotic process and high mortality.
- The course of the disease is heterogeneous, with a median survival of 2-5 years from diagnosis.
- Lung transplantation remains the only successful treatment option, while immunosuppression regimens have recently been shown to be harmful. Therefore, identifying genetic variants associated with susceptibility to IPF, and alleles involved in the heterogeneity of disease course and mortality, remains a major challenge.
- SNP single nucleotide polymorphism
- compositions and methods for identifying genetic variants in interstitial lung disease subjects are also provided. Also provided are compositions and methods of determining whether a human subject has, or is at risk of developing, an interstitial lung disease. In certain embodiments, the methods include detecting whether the genome of the subject comprises a genetic variant of at least one of TOLLIP, SPPL2C, and MDGA2, the presence of the genetic variant indicating that the subject has or is at risk of developing the interstitial lung disease. In certain embodiments, more than one genetic variant of TOLLIP and/or SPPL2C and/or MDGA2 is detected.
- in addition to detecting genetic variants of TOLLIP and/or SPPL2C and/or MDGA2, the method includes detecting whether the genome of the subject includes other genetic variants diagnostic or predictive of risk for interstitial lung disease, e.g., a genetic variant of MUC5B, such as rs35705950.
- Fig. 2A is a flowchart showing the approach used in a three-stage association study
- Fig. 2B is a flowchart of mortality analyses by regression.
- Fig. 3 is a QQ plot of the genome-wide association study (GWAS) of idiopathic pulmonary fibrosis (IPF).
- GWAS genome-wide association study
- IPF idiopathic pulmonary fibrosis
- Fig. 4 includes regional association plots showing the IPF-associated regions in Ch11p15.5 (Fig. 4A) and Ch17q21.31 (Fig. 4B).
- Fig. 5 shows survival probability over time for people with or without H2 and with or without an SPPL2C variant.
- Fig. 6A is a KM plot for TOLLIP*/MUC5B risk alleles.
- Fig. 6B is a KM plot by Risk Index for WPGS using all 3 genes (TOLLIP, SPPL2C & MUC5B) and categorizing into 4 groups.
- Fig. 7A-7C is a list of top associated loci with susceptibility to IPF.
- Fig. 8 is a table listing the sample sources and sizes used in a three stage study.
- Fig. 9 shows the characteristics of IPF patients used in stage 1 discovery GWAS study.
- Fig. 10 lists the characteristics of IPF patients by stage and availability.
- Fig. 11A-11C is a list of 44 SNPs and their association p-values with susceptibility to IPF from stage 1, stage 2, and overall.
- Fig. 12 shows characteristics of IPF case series for mortality analysis.
- Fig. 13 is a table showing association signals with susceptibility to IPF across stages of six SNPs followed up in Stage 3.
- Fig. 14 is a table listing SNP effects on mortality.
- Fig. 15 provides summaries of univariate Cox analysis for mortality.
- Fig. 16 provides summaries of univariate and multivariate Cox analysis for mortality
- Fig. 17 provides summaries of Kaplan-Meier survival analysis.
- Fig. 18 lists predictors of survival in IPF patients identified using a univariate Cox model.
- Fig. 19A-Fig. 19B lists predictors of survival in IPF patients identified using a multivariate analysis of covariance.
- Fig. 20 lists 30 regions identified, showing the value of aggregation and of using information in addition to protein-coding SNPs; the six p-values representing the highest-ranking SNPs in each region are in bold.
- the results obtained identified three genetic loci and replicated the association of four novel SNPs (rs111521887, rs5743894, rs5743890, and rs17690703) in two novel loci (ch11p15.5/TOLLIP and ch17q21.31/SPPL2C), and the MUC5B promoter SNP (rs35705950) with IPF susceptibility in European-Americans through a three-stage case-control study.
- the findings reported herein provide, inter alia, for novel compositions and methods for identifying genetic variants in interstitial lung disease subjects and/or determining whether an individual has, or is at risk for developing, interstitial lung disease and/or compositions and methods for predicting prognosis, e.g., survival time or mortality, of an individual with an interstitial lung disease, for example, a fibrotic interstitial lung disease, such as IPF, or familial interstitial pneumonia.
- nucleic acid refers to deoxyribonucleotides or ribonucleotides and polymers thereof in either single- or double-stranded form, and complements thereof.
- Nucleic acid or oligonucleotide or polynucleotide or grammatical equivalents used herein means at least two nucleotides covalently linked together. Oligonucleotides are typically from about 5, 6, 7, 8, 9, 10, 12, 15, 25, 30, 40, 50 or more nucleotides in length, up to about 100 nucleotides in length.
- Nucleic acids and polynucleotides are polymers of any length, including longer lengths, e.g., 200, 300, 500, 1000, 2000, 3000, 5000, 7000, 10,000, etc.
- the term "nucleotide" typically refers to a single unit of a polynucleotide, i.e., a monomer.
- Nucleotides can be ribonucleotides, deoxyribonucleotides, or modified versions thereof.
- a “genetic variant” refers to a mutation, single nucleotide polymorphism (SNP), deletion variant, missense variant, insertion variant, inversion, or copy number variant.
- probe refers to one or more nucleic acid fragments whose specific hybridization to a sample can be detected.
- a probe or primer can be of any length depending on the particular technique it will be used for.
- PCR primers are generally between 10 and 40 nucleotides in length, while nucleic acid probes for, e.g., a Southern blot, can be more than a hundred nucleotides in length.
- the probes or primers can be unlabeled or labeled as described below so that their binding to a target sequence can be detected (e.g., with a FRET donor or acceptor label).
- the probe or primer can be designed based on one or more particular (preselected) portions of a chromosome, e.g., one or more clones, an isolated whole chromosome or chromosome fragment, or a collection of polymerase chain reaction (PCR) amplification products.
- PCR polymerase chain reaction
- the length and complexity of the nucleic acid fixed onto the target element is not critical to the invention. One of skill can adjust these factors to provide optimum hybridization and signal production for a given hybridization and detection procedures, and to provide the required resolution among different genes or genomic locations.
- Probes and primers can also be immobilized on a solid surface (e.g., nitrocellulose, glass, quartz, fused silica slides), as in an array.
- Techniques for producing high density arrays can also be used for this purpose (see, e.g., Fodor (1991) Science 767-773; Johnston (1998) Curr. Biol. 8: R171-R174; Schummer (1997) Biotechniques 23: 1087-1092; Kern (1997) Biotechniques 23: 120-124; U.S. Patent No. 5,143,854).
- probes and primers can be modified from the target sequence to a certain degree to produce probes that are “substantially identical” or “substantially complementary to” a target sequence, but retain the ability to specifically bind to (i.e., hybridize specifically to) the same targets from which they were derived.
- a probe or primer is "capable of detecting" a genetic variant if it is complementary to a region that covers or is adjacent to the genetic variant.
- primers can be designed on either side of the SNP, and primer extension used to determine the identity of the nucleotide at the position of the SNP.
- FRET-labeled primers are used (at least one labeled with a FRET donor and at least one labeled with a FRET acceptor) so that FRET signal will be detected only upon hybridization of both primers.
- a probe is used in conditions such that it hybridizes only to a genetic variant, or only to a dominant sequence.
- the probe can be designed to hybridize to a junction point of a genetic inversion, but not to a sequence that does not include the inversion.
- the term “capable of hybridizing to” refers to a polynucleotide sequence that forms non-covalent, Watson-Crick bonds with a complementary sequence.
- percent complementarity need not be 100% for hybridization to occur, depending on the length of the polynucleotides, length of the complementary region, and stringency of the conditions.
- a polynucleotide e.g., primer or probe
- a polynucleotide can be capable of hybridizing (binding) to a polynucleotide having 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% complementarity over the stretch of the complementary region.
- Stringency can be increased by reducing the length of the complementary region, reducing the G-C content of the complementary region, increasing temperature and/or detergent levels, varying salt levels and pH, etc. as known in the art.
- a polynucleotide is capable of hybridizing to a complementary sequence in standard PCR annealing conditions. In the context of detecting genetic variants, the tolerated percent complementarity or number of mismatches will vary depending on the technique used for detection (see below).
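As an illustration of percent complementarity, the following minimal Python sketch counts Watson-Crick matches between a probe and an equal-length target region read in antiparallel orientation. The function names and example sequences are hypothetical and are not taken from the specification; the sketch ignores gaps, non-canonical pairing, and the effect of stringency conditions.

```python
# Watson-Crick pairing partners for DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def percent_complementarity(probe: str, target: str) -> float:
    """Percent of probe positions that pair with the target region.

    Both sequences are written 5'->3' and must have the same length.
    """
    if len(probe) != len(target):
        raise ValueError("probe and target region must have equal length")
    perfect_probe = reverse_complement(target)
    matches = sum(p == q for p, q in zip(probe.upper(), perfect_probe))
    return 100.0 * matches / len(probe)


# Hypothetical 10-nt target region and two candidate probes.
target = "ATGCCGTAAC"
perfect = reverse_complement(target)   # fully complementary probe
one_mismatch = "A" + perfect[1:]       # first base altered
```

With these inputs, a probe carrying one mismatch in ten positions has 90% complementarity, which falls within the range listed above.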
- amplification product refers to a polynucleotide that results from an amplification reaction, e.g., PCR and variations thereof, rtPCR, strand displacement reaction (SDR), ligase chain reaction (LCR), transcription mediated amplification (TMA), or Qbeta replication.
- a thermally stable polymerase e.g., Taq, can be used to avoid repeated addition of polymerase throughout amplification procedures that involve cyclic or extreme temperatures (e.g., PCR and its variants).
- label refers to a composition detectable by spectroscopic, photochemical, biochemical, immunochemical, chemical, or other physical means.
- useful labels include fluorescent dyes, luminescent agents, radioisotopes (e.g., ³²P, ³H), electron-dense reagents, enzymes, biotin, digoxigenin, or haptens and proteins or other entities which can be made detectable, e.g., by affinity. Any method known in the art for conjugating a nucleic acid or other biomolecule to a label may be employed, e.g., using methods described in Hermanson, Bioconjugate Techniques 1996, Academic Press, Inc., San Diego.
- tag can be used synonymously with the term "label," but generally refers to an affinity-based moiety, e.g., a "His tag" for purification, or a "streptavidin tag" that interacts with biotin.
- a "labeled" molecule e.g., nucleic acid, protein, or antibody
- FRET Förster resonance energy transfer
- FRET donor donor chromophore
- FRET acceptor acceptor chromophore
- a "FRET signal" is thus the signal that is generated by the emission of light from the acceptor.
- R0 is about 50-60 Å for some commonly used dye pairs (e.g., Cy3-Cy5).
- FRET signal varies as the inverse 6th power of the donor-acceptor distance. If the donor-acceptor pair is positioned around R0, a small change in distance ranging from 1 Å to 50 Å can be measured with the greatest signal to noise. With current technology, 1 ms or faster parallel imaging of many single FRET pairs is achievable.
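The distance dependence can be made concrete with the standard Förster relation, E = 1 / (1 + (r/R0)^6), where r is the donor-acceptor distance and R0 is the distance at which transfer efficiency is 50%. The short Python sketch below evaluates it; the function name is hypothetical, and the R0 value of 55 Å is only an illustrative choice within the 50-60 Å range mentioned above.

```python
def fret_efficiency(distance: float, r0: float) -> float:
    """Foerster transfer efficiency for a donor-acceptor distance.

    distance and r0 must be in the same units (e.g., angstroms).
    """
    if distance < 0 or r0 <= 0:
        raise ValueError("distance must be >= 0 and r0 must be > 0")
    return 1.0 / (1.0 + (distance / r0) ** 6)


R0 = 55.0  # illustrative value, in angstroms (assumption)

at_r0 = fret_efficiency(55.0, R0)       # exactly half-maximal at r = R0
at_half_r0 = fret_efficiency(27.5, R0)  # close pair, efficiency near 1
at_twice_r0 = fret_efficiency(110.0, R0)  # distant pair, efficiency near 0
```

Because of the sixth-power term, the efficiency changes most steeply for distances near R0, which is why positioning the pair around R0 gives the most sensitive distance readout.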
- FRET pair refers to a FRET donor and FRET acceptor pair that are capable of FRET detection.
- the terms "fluorophore," "dye," "fluorescent molecule," "fluorescent dye," "FRET dye" and like terms are used synonymously herein unless otherwise indicated.
- "Subject," "patient," "individual" and like terms are used interchangeably and refer to, except where indicated, humans and non-human animals. The term does not necessarily indicate that the subject has been diagnosed with a particular disease, but typically refers to an individual under medical supervision.
- a patient can be an individual that is seeking diagnosis, treatment, monitoring, adjustment or modification of an existing therapeutic regimen, etc.
- sample refers to a biological sample obtained from a subject. Samples include material that is processed prior to carrying out testing, e.g., genomic DNA separated or purified from other cellular and non-cellular debris.
- the sample includes genomic DNA from the subject, e.g., cheek swab, blood sample, mucosal sample, buccal swab, skin sample, hair, etc.
- a "control" sample or value refers to a sample that serves as a reference, usually a known reference, for comparison to a test sample.
- a test sample can be taken from a test condition, e.g., a sample from an individual of unknown disease status, and compared to samples from individuals with known conditions, e.g., healthy or lacking a given genetic variation (negative control), or having pulmonary disease or a given genetic variation (positive control).
- a control can also represent an average value gathered from a number of tests or results.
- controls can be designed for assessment of any number of parameters. For example, a control can be devised to compare signal strength in given conditions, e.g., in the presence of a test probe, or primer.
- One of skill in the art will understand which controls are valuable in a given situation and will be able to analyze data based on comparisons to control values. Controls are also valuable for determining the significance of data. For example, if values for a given parameter are widely variant in controls, variation in test samples will not be considered as significant.
- compositions and methods for determining whether a human subject has or is at risk of developing an interstitial lung disease and/or prognosing interstitial lung disease may be used in conjunction with any other diagnostic or prognostic criterion or method, including, but not limited to, currently known criteria or methods.
- the method for determining whether a human subject has or is at risk of developing an interstitial lung disease includes detecting whether the genome of the subject comprises a genetic variant of at least one of TOLLIP, SPPL2C, and MDGA2, the presence of the genetic variant indicating that the subject has or is at risk of developing the interstitial lung disease.
- more than one genetic variant of TOLLIP and/or SPPL2C and/or MDGA2 is detected.
- in addition to detecting genetic variants of TOLLIP and/or SPPL2C and/or MDGA2, the method includes detecting whether the genome of the subject includes other genetic variants diagnostic or predictive of risk for interstitial lung disease, e.g., a genetic variant of MUC5B, such as rs35705950.
- the method for determining whether a human subject has or is at risk of developing an interstitial lung disease includes detecting the presence or absence of one or more SNPs selected from rs111521887, rs5743894, rs5743890, rs17690703, and rs7144383.
- the presence or absence of each SNP may be detected alone or in combination with each other, i.e., the methods of the invention may include detection of one, two, three, four, or five of rs111521887, rs5743894, rs5743890, rs17690703, and rs7144383 in any possible combination.
- the method includes detecting the presence or absence of from one to five of rs111521887, rs5743894, rs5743890, rs17690703, and rs7144383 in any combination and the presence or absence of any other SNP associated with an interstitial lung disease or its prognosis, including, without limitation, the MUC5B SNP rs35705950.
- the method for determining whether a human subject has or is at risk of developing an interstitial lung disease includes detecting the presence of rs111521887 (e.g., G or other non-dominant allele). In some embodiments, the method for determining whether a human subject has or is at risk of developing an interstitial lung disease includes detecting the presence of rs5743894 (e.g., G or other non-dominant allele). In some embodiments, the method for determining whether a human subject has or is at risk of developing an interstitial lung disease includes detecting the presence of rs5743890 (e.g., G or other non-dominant allele).
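To make "any possible combination" of the five SNPs concrete, the following small Python sketch enumerates every non-empty subset of the five identifiers named above. The helper name `all_panels` is hypothetical and the enumeration is offered only as an illustration.

```python
from itertools import combinations

# The five SNP identifiers named in the text.
SNPS = ["rs111521887", "rs5743894", "rs5743890", "rs17690703", "rs7144383"]


def all_panels(snps):
    """Return every non-empty combination of the given SNPs as tuples."""
    panels = []
    for size in range(1, len(snps) + 1):
        panels.extend(combinations(snps, size))
    return panels


panels = all_panels(SNPS)
# Five SNPs give 2**5 - 1 = 31 possible non-empty panels.
```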
- the method for determining whether a human subject has or is at risk of developing an interstitial lung disease includes detecting the presence of rs17690703 (e.g., T or other non-dominant allele). In some embodiments, the method for determining whether a human subject has or is at risk of developing an interstitial lung disease includes detecting the presence of rs7144383 (e.g., G or other non-dominant allele).
- the method for determining whether a human subject has or is at risk of developing an interstitial lung disease includes detecting one or more genetic variants listed in Fig. 7.
- the one or more genetic variants may be detected alone or in any possible combination of from two to 52 of the listed genetic variants. If the method includes detecting rs35705950, then the method includes detecting at least one additional genetic variant from the remaining 51 genetic variants listed in Fig. 7.
- the method includes prognosing an interstitial lung disease in a human subject.
- the method comprises detecting whether the genome of the subject comprises a genetic variant of TOLLIP and/or SPPL2C prognostic of increased or decreased survival.
- the methods include detecting whether the genome of the subject comprises a genetic variant of MUC5B and whether the genome comprises a genetic variant of TOLLIP and/or SPPL2C prognostic of increased or decreased survival.
- the method includes detecting whether the genome comprises rs17690703 and/or rs5743890, each of which is predictive of decreased survival.
- the method detects whether the genome comprises rs35705950, which is predictive of increased survival, and rs17690703 and/or rs5743890. In some embodiments, the method comprises detecting rs17690703 (e.g., T or other non-dominant allele), and prognosing reduced survival time for the subject. In some embodiments, the method comprises detecting rs5743890 (e.g., G or other non-dominant allele), and prognosing reduced survival time for the subject.
- the method for prognosing the interstitial lung disease in a human subject includes detecting one or more genetic variants listed in Fig. 7.
- the one or more genetic variants may be detected alone or in any possible combination of from two to 52 of the listed genetic variants. If the method includes detecting rs35705950, then the method includes detecting at least one additional genetic variant from the remaining 51 genetic variants listed in Fig. 7.
- the present invention provides methods for detecting the presence or absence of at least one genetic variant in a human subject. In certain embodiments, the method includes detecting the presence or absence of at least one genetic variant of at least one of TOLLIP, SPPL2C, and MDGA2 in a sample from the subject.
- more than one genetic variant of TOLLIP and/or SPPL2C and/or MDGA2 is detected.
- the method, in addition to detecting genetic variants of TOLLIP and/or SPPL2C and/or MDGA2, includes detecting a genetic variant of MUC5B, such as rs35705950.
- the method for detecting the presence or absence of at least one genetic variant in a human subject includes detecting the presence or absence of at least one genetic variant of the genetic variants listed in Fig. 7.
- the one or more genetic variants may be detected alone or in any possible combination of from two to 52 of the genetic variants listed in Fig. 7. If the method includes detecting rs35705950, then the method includes detecting at least one additional genetic variant from the remaining 51 genetic variants listed in Fig. 7.
- the at least one genetic variant includes one or more of a single nucleotide polymorphism selected from the group consisting of rs111521887, rs5743894, rs5743890, rs17690703, and rs7144383 in any possible combination.
- the method for detecting the presence or absence of at least one genetic variant in a human subject includes detecting the presence or absence of heterozygosity in at least one genetic variant of the genetic variants listed in Fig. 7.
- the method for detecting the presence or absence of at least one genetic variant in a human subject includes detecting the presence or absence of homozygosity in at least one genetic variant of the genetic variants listed in Fig. 7.
- the heterozygosity or homozygosity of the one or more genetic variants may be detected alone or in any possible combination of from two to 52 of the genetic variants listed in Fig. 7, wherein the genetic variant may be the same or different in the individual chromosomes present in the diploid human subject.
- if the method includes detecting heterozygosity or homozygosity of rs35705950, then the method includes detecting heterozygosity or homozygosity of at least one additional genetic variant from the remaining 51 genetic variants listed in Fig. 7.
- the heterozygosity or homozygosity of at least one genetic variant includes the heterozygosity or homozygosity of one or more of a single nucleotide polymorphism selected from the group consisting of rs111521887, rs5743894, rs5743890, rs17690703, and rs7144383 in any possible combination.
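The distinction between heterozygosity and homozygosity at a single SNP can be sketched as a count of variant alleles across the two chromosome copies of a diploid subject. In the Python sketch below, the function name is hypothetical; the T variant allele for rs17690703 follows the example given in the text, while the "C" used for the other allele is only a placeholder assumption.

```python
def zygosity(allele_1: str, allele_2: str, variant_allele: str) -> str:
    """Classify a diploid genotype with respect to one variant allele."""
    copies = (allele_1 == variant_allele) + (allele_2 == variant_allele)
    if copies == 2:
        return "homozygous"
    if copies == 1:
        return "heterozygous"
    return "absent"


# rs17690703: T is the variant allele per the text; "C" is a placeholder.
calls = {
    "two copies": zygosity("T", "T", "T"),
    "one copy": zygosity("C", "T", "T"),
    "no copies": zygosity("C", "C", "T"),
}
```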
- a method for testing for interstitial lung disease in a human subject involves detecting the level of TOLLIP gene expression in a sample from the subject, a low level of TOLLIP gene expression relative to a control being indicative of interstitial lung disease.
- the level of gene expression may be detected by measuring, directly or indirectly, TOLLIP mRNA or by measuring Tollip protein by any suitable method, several of which are known in the art.
- the control may include, for example, a sample from a human that does not have interstitial lung disease or a value or set of values, for example, a normal range, derived from several humans that do not have interstitial lung disease.
- a low level of TOLLIP gene expression relative to a control (standard control) indicative of interstitial lung disease is a level that is less than about 50% of the control.
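The "less than about 50% of the control" criterion can be expressed as a simple ratio test. The following Python sketch is illustrative only: the function name, the units, and the exact 0.5 cutoff are assumptions, and "about" in the text implies the threshold is not a sharp boundary.

```python
def is_low_tollip_expression(sample_level: float,
                             control_level: float,
                             cutoff: float = 0.5) -> bool:
    """Return True if the sample level is below cutoff x control level.

    Both levels must be measured on the same scale (e.g., normalized
    mRNA or protein abundance); the scale itself is not specified here.
    """
    if control_level <= 0:
        raise ValueError("control level must be positive")
    return (sample_level / control_level) < cutoff


# Hypothetical normalized expression values.
low = is_low_tollip_expression(sample_level=40.0, control_level=100.0)
not_low = is_low_tollip_expression(sample_level=60.0, control_level=100.0)
```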
- the present invention includes a method of treating a human subject having an interstitial lung disease comprising detecting the level of TOLLIP expression in a sample from the subject, and if the subject has a low level of TOLLIP expression relative to a control (standard control), administering to the subject an amount of a Tollip agonist, Tollip or a genetic construct expressing TOLLIP effective to treat the interstitial lung disease.
- An amount effective to treat the interstitial lung disease is an amount effective to delay onset, reduce frequency and/or severity of one or more symptoms, ameliorate one or more symptoms, and/or improve comfort and/or some function of the subject, e.g., respiratory function, relative to an untreated second subject or pool of subjects, or relative to the same subject prior to treatment or after cessation of treatment.
- the methods of the invention are not limited to any particular way of detecting the presence or absence of a genetic variant (e.g. SNP) and can employ any suitable method to detect the presence or absence of a variant(s), of which numerous detection methods are known in the art.
- DASH Dynamic allele-specific hybridization
- DASH genotyping takes advantage of the differences in the melting temperature in DNA that results from the instability of mismatched base pairs.
- the process can be vastly automated and encompasses a few simple principles.
- the target genomic segment is amplified and separated from non-target sequence, e.g., through use of a biotinylated primer and chromatography.
- a probe that is specific for the particular allele is added to the amplification product.
- the probe can be designed to hybridize specifically to a variant sequence or to the dominant allelic sequence.
- the probe can be either labeled with or added in the presence of a molecule that fluoresces when bound to double-stranded DNA.
- the signal intensity is then measured as temperature is increased until the Tm can be determined.
- a non-matching sequence (either genetic variant or dominant allelic sequence, depending on probe design), will result in a lower than expected Tm.
- DASH genotyping relies on a quantifiable change in Tm, and is thus capable of measuring many types of mutations, not just SNPs.
- Other benefits of DASH include its ability to work with label free probes and its simple design and performance conditions.
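One way to picture the Tm readout in DASH is to take the melting temperature as the point where the fluorescence signal falls most steeply as temperature rises. The Python sketch below applies that idea to two made-up melting curves; the steepest-drop estimate, the function name, and all data values are assumptions for illustration and are not taken from the specification.

```python
def estimate_tm(temps, signal):
    """Estimate Tm as the midpoint of the interval with the steepest signal drop."""
    if len(temps) != len(signal) or len(temps) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    best_mid, best_drop = None, 0.0
    for i in range(len(temps) - 1):
        # Negative slope of the signal over this temperature interval.
        drop = -(signal[i + 1] - signal[i]) / (temps[i + 1] - temps[i])
        if best_mid is None or drop > best_drop:
            best_mid, best_drop = (temps[i] + temps[i + 1]) / 2, drop
    return best_mid


# Hypothetical melting curves (temperature in degrees C, arbitrary signal units).
temps = [50, 55, 60, 65, 70, 75]
matched_signal = [100, 98, 95, 60, 10, 5]      # probe matches the allele
mismatched_signal = [100, 90, 40, 8, 5, 4]     # probe has a mismatch

tm_matched = estimate_tm(temps, matched_signal)
tm_mismatched = estimate_tm(temps, mismatched_signal)
```

In this toy data the mismatched duplex melts at a lower temperature than the matched one, which is the difference DASH relies on.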
- Molecular beacons can also be used to detect a genetic variant.
- This method makes use of a specifically engineered single-stranded oligonucleotide probe.
- the oligonucleotide is designed such that there are complementary regions at each end and a probe sequence located in between. This design allows the probe to take on a hairpin, or stem-loop, structure in its natural, isolated state. Attached to one end of the probe is a fluorophore and to the other end a fluorescence quencher. Because of the stem-loop structure of the probe, the fluorophore is in close proximity to the quencher, thus preventing the molecule from emitting any fluorescence.
- the molecule is also engineered such that only the probe sequence is complementary to the targeted genomic DNA sequence.
- if the probe sequence of the molecular beacon encounters its target genomic DNA sequence during the assay, it will anneal and hybridize. Because of the length of the probe sequence, the hairpin segment of the probe will be denatured in favor of forming a longer, more stable probe-target hybrid. This conformational change permits the fluorophore and quencher to be free of their tight proximity due to the hairpin association, allowing the molecule to fluoresce.
- otherwise, the molecular beacon will preferentially stay in its natural hairpin state and no fluorescence will be observed, as the fluorophore remains quenched.
- the unique design of these molecular beacons allows for a simple diagnostic assay to identify SNPs at a given location. If a molecular beacon is designed to match a wild-type allele and another to match a mutant of the allele, the two can be used to identify the genotype of an individual. If only the first probe's fluorophore wavelength is detected during the assay then the individual is homozygous for the wild type.
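The two-beacon readout can be sketched as a small decision function. Only the first case (wild-type signal alone) is stated in the text; the remaining cases below follow the same logic and are included as assumptions. The function name and return labels are hypothetical.

```python
def call_genotype(wild_type_signal: bool, mutant_signal: bool) -> str:
    """Call a genotype from the presence of each beacon's fluorescence."""
    if wild_type_signal and mutant_signal:
        return "heterozygous"            # assumption: both beacons opened
    if wild_type_signal:
        return "homozygous wild type"    # case stated in the text
    if mutant_signal:
        return "homozygous mutant"       # assumption: mirror of the stated case
    return "no call"                     # assumption: neither beacon opened


wt_only = call_genotype(True, False)
both = call_genotype(True, True)
mut_only = call_genotype(False, True)
```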
- a microarray can also be used to detect genetic variants. Hundreds of thousands of probes can be arrayed on a small chip, allowing for many genetic variants or SNPs to be interrogated simultaneously. Because SNP alleles only differ in one nucleotide and because it is difficult to achieve optimal hybridization conditions for all probes on the array, the target DNA has the potential to hybridize to mismatched probes. This can be addressed by using several redundant probes to interrogate each SNP. Probes can be designed to have the SNP site in several different locations as well as containing mismatches to the SNP allele. By comparing the differential amount of hybridization of the target DNA to each of these redundant probes, it is possible to determine specific homozygous and heterozygous alleles.
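A rough sketch of how redundant probes might be combined: sum the hybridization intensities of all probes for each allele and call the genotype from the fraction attributable to one allele. The function name, the 0.8 cutoff, and the intensity values below are hypothetical; real array software uses more elaborate statistical models.

```python
def call_from_redundant_probes(a_intensities, b_intensities, hom_cutoff=0.8):
    """Call AA, AB, or BB from summed intensities of redundant probes."""
    total_a = sum(a_intensities)
    total_b = sum(b_intensities)
    total = total_a + total_b
    if total <= 0:
        return "no call"
    fraction_a = total_a / total
    if fraction_a >= hom_cutoff:
        return "AA"
    if fraction_a <= 1 - hom_cutoff:
        return "BB"
    return "AB"


# Hypothetical intensities from three redundant probes per allele.
sample_1 = call_from_redundant_probes([900, 950, 1000], [50, 40, 60])
sample_2 = call_from_redundant_probes([500, 480, 520], [510, 490, 500])
sample_3 = call_from_redundant_probes([30, 20, 50], [900, 1000, 1000])
```

Pooling several probes per allele reduces the influence of any single probe that hybridizes poorly or cross-hybridizes.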
- Restriction fragment length polymorphism can be used to detect genetic variants and SNPs.
- RFLP makes use of the many different restriction endonucleases and their high affinity to unique and specific restriction sites. By performing a digestion on a genomic sample and determining fragment lengths through a gel assay it is possible to ascertain whether or not the enzymes cut the expected restriction sites. A failure to cut the genomic sample results in an identifiably larger than expected fragment implying that there is a mutation at the point of the restriction site which is rendering it protected from nuclease activity.
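The RFLP readout can be illustrated with a toy in-silico digest on one strand. The sketch below uses the EcoRI recognition sequence GAATTC (cut after the first G); the function name and both example sequences are hypothetical, and the model ignores the complementary strand and partial digestion.

```python
def digest(sequence: str, site: str, cut_offset: int):
    """Return fragment lengths after cutting at every occurrence of site.

    cut_offset is the number of bases into the site where the cut falls.
    """
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(cut - start)
        start = cut
        pos = sequence.find(site, pos + 1)
    fragments.append(len(sequence) - start)
    return fragments


ECORI_SITE = "GAATTC"  # EcoRI cuts between G and AATTC

reference = "AAAAGAATTCTTTTTT"  # hypothetical sequence with an intact site
variant = "AAAAGAGTTCTTTTTT"    # hypothetical SNP destroys the site

reference_fragments = digest(reference, ECORI_SITE, cut_offset=1)
variant_fragments = digest(variant, ECORI_SITE, cut_offset=1)
```

The reference yields two fragments, while the variant yields a single larger fragment because the enzyme no longer recognizes the altered site.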
- PCR- and amplification-based methods can be used to detect genetic variants.
- tetra-primer PCR employs two pairs of primers to amplify two alleles in one PCR reaction.
- the primers are designed such that the two primer pairs overlap at a SNP location but each matches perfectly to only one of the possible alleles.
- the two primer pairs can be designed such that their PCR products are of a significantly different length allowing for easily distinguishable bands by gel electrophoresis, or such that they are differently labeled.
- Primer extension can also be used to detect genetic variants.
- Primer extension first involves the hybridization of a probe to the bases immediately upstream of the SNP nucleotide followed by a 'mini-sequencing' reaction, in which DNA polymerase extends the hybridized primer by adding a base that is complementary to the SNP nucleotide. The incorporated base that is detected determines the presence or absence of the SNP allele. Because primer extension is based on the highly accurate DNA polymerase enzyme, the method is generally very reliable. Primer extension is able to genotype most SNPs under very similar reaction conditions making it also highly flexible. The primer extension method is used in a number of assay formats, and can be detected using e.g., fluorescent labels or mass spectrometry.
- Primer extension can involve incorporation of either fluorescently labeled ddNTP or fluorescently labeled deoxynucleotides (dNTP).
- with ddNTPs, probes hybridize to the target DNA immediately upstream of the SNP nucleotide, and a single ddNTP complementary to the SNP allele is added to the 3' end of the probe (the missing 3'-hydroxyl in the dideoxynucleotide prevents further nucleotides from being added).
- Each ddNTP is labeled with a different fluorescent signal allowing for the detection of all four alleles in the same reaction.
- allele-specific probes have 3' bases which are complementary to each of the SNP alleles being interrogated.
- if the target DNA contains an allele complementary to the 3' base of the probe, the target DNA will completely hybridize to the probe, allowing DNA polymerase to extend from the 3' end of the probe. This is detected by the incorporation of the fluorescently labeled dNTPs onto the end of the probe. If the target DNA does not contain an allele complementary to the probe's 3' base, the target DNA will produce a mismatch at the 3' end of the probe and DNA polymerase will not be able to extend from the 3' end of the probe.
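The two primer extension variants described above reduce to simple base-pairing rules, sketched below in Python. In the ddNTP format the incorporated terminator is the complement of the template base at the SNP; in the allele-specific format extension proceeds only when the probe's 3' base pairs with the template base. Function names are hypothetical, and polymerase fidelity and reaction conditions are not modeled.

```python
# Watson-Crick pairing partners for DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def incorporated_ddntp(template_snp_base: str) -> str:
    """Base of the ddNTP added opposite the template SNP nucleotide."""
    return COMPLEMENT[template_snp_base.upper()]


def can_extend(probe_3prime_base: str, template_snp_base: str) -> bool:
    """True if the allele-specific probe's 3' base pairs with the template."""
    return COMPLEMENT[template_snp_base.upper()] == probe_3prime_base.upper()


# Template carries G at the SNP position (hypothetical example).
terminator = incorporated_ddntp("G")
matched_probe_extends = can_extend("C", "G")
mismatched_probe_extends = can_extend("T", "G")
```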
- the iPLEX® SNP genotyping method takes a slightly different approach, and relies on detection by mass spectrometer. Extension probes are designed in such a way that many different SNP assays can be amplified and analyzed in a PCR cocktail.
- the extension reaction uses ddNTPs as above, but the detection of the SNP allele is dependent on the actual mass of the extension product and not on a fluorescent molecule. This method is for low to medium high throughput, and is not intended for whole genome scanning.
- Primer extension methods are, however, amenable to high throughput analysis. Primer extension probes can be arrayed on slides allowing for many SNPs to be genotyped at once. Broadly referred to as arrayed primer extension (APEX), this technology has several benefits over methods based on differential hybridization of probes. Comparatively, APEX methods have greater discriminating power than methods using differential hybridization, as it is often impossible to obtain the optimal hybridization conditions for the thousands of probes on DNA microarrays (usually this is addressed by having highly redundant probes).
- Oligonucleotide ligation assays can also be used to detect genetic variants.
- DNA ligase catalyzes the ligation of the 3' end of a DNA fragment to the 5' end of a directly adjacent DNA fragment. This mechanism can be used to interrogate a SNP by hybridizing two probes directly over the SNP polymorphic site, whereby ligation can occur if the probes are identical to the target DNA.
- two probes can be designed: an allele-specific probe which hybridizes to the target DNA so that its 3' base is situated directly over the SNP nucleotide and a second probe that hybridizes the template upstream (downstream in the complementary strand) of the SNP polymorphic site providing a 5' end for the ligation reaction. If the allele-specific probe matches the target DNA, it will fully hybridize to the target DNA and ligation can occur. Ligation does not generally occur in the presence of a mismatched 3' base. Ligated or unligated products can be detected by gel electrophoresis, MALDI-TOF mass spectrometry or by capillary electrophoresis.
- the 5'-nuclease activity of Taq DNA polymerase can be used for detecting genetic variants.
- the assay is performed concurrently with a PCR reaction and the results can be read in real-time.
- the assay requires forward and reverse PCR primers that will amplify a region that includes the SNP polymorphic site. Allele discrimination is achieved using FRET, and one or two allele-specific probes that hybridize to the SNP polymorphic site.
- the probes have a fluorophore linked to their 5' end and a quencher molecule linked to their 3' end. While the probe is intact, the quencher will remain in close proximity to the fluorophore, eliminating the fluorophore's signal.
- if the allele-specific probe is perfectly complementary to the SNP allele, it will bind to the target DNA strand and then be degraded by the 5'-nuclease activity of the Taq polymerase as it extends the DNA from the PCR primers. The degradation of the probe results in the separation of the fluorophore from the quencher molecule, generating a detectable signal. If the allele-specific probe is not perfectly complementary, it will have a lower melting temperature and not bind as efficiently. This prevents the nuclease from acting on the probe.
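- As a minimal sketch of how the end-point signals of such an assay could be turned into a genotype call, assume two allele-specific probes carrying different fluorophores and a single background cutoff. The function name `call_genotype` and the `threshold` value are hypothetical and do not come from the description above.

```python
def call_genotype(signal_allele1, signal_allele2, threshold=1.0):
    """Call a genotype from the fluorescence of two allele-specific probes.

    A probe only yields signal when it was cleaved, i.e. when its allele is
    present in the sample. `threshold` is an assumed background cutoff.
    """
    allele1_present = signal_allele1 > threshold
    allele2_present = signal_allele2 > threshold
    if allele1_present and allele2_present:
        return "heterozygous"
    if allele1_present:
        return "homozygous allele 1"
    if allele2_present:
        return "homozygous allele 2"
    return "no call"


print(call_genotype(5.2, 0.3))
print(call_genotype(4.8, 4.1))
```

In practice, calls are usually made by clustering many samples rather than by a fixed cutoff; the fixed threshold here only keeps the illustration short.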
- Fluorescence resonance energy transfer (FRET) detection can be used for detection in primer extension and ligation reactions where the two labels are brought into close proximity to each other. It can also be used in the 5'-nuclease reaction, the molecular beacon reaction, and the invasive cleavage reactions where the neighboring donor/acceptor pair is separated by cleavage or disruption of the stem-loop structure that holds them together. FRET occurs when two conditions are met. First, the emission spectrum of the fluorescent donor dye must overlap with the excitation spectrum of the acceptor dye. Second, the two dyes must be in close proximity to each other because energy transfer drops off quickly with distance. The proximity requirement is what makes FRET a good detection method for a number of allelic discrimination mechanisms.
- a variety of dyes can be used for FRET, and are known in the art. The most common ones are fluorescein, cyanine dyes (Cy3 to Cy7), rhodamine dyes (e.g. rhodamine 6G), the Alexa series of dyes (Alexa 405 to Alexa 730), and the Atto series of dyes (Atto-Tec GmbH). Some of these dyes have been used in FRET networks (with multiple donors and acceptors). Optics for imaging all of these require detection from UV to near IR (e.g. Alexa 405 to Cy7). The Alexa series of dyes from Invitrogen cover the whole spectral range. They are very bright and photostable.
- Example dye pairs for FRET labeling include Alexa-405/Alexa-488, Alexa-488/Alexa-546, Alexa-532/Alexa-594, Alexa-594/Alexa-680, Alexa-594/Alexa-700, Alexa-700/Alexa-790, Cy3/Cy5, Cy3.5/Cy5.5, and Rhodamine-Green/Rhodamine-Red.
- Fluorescent metal nanoparticles such as silver and gold nanoclusters can also be used (Richards et al. (2008) J Am Chem Soc 130:5038-39; Vosch et al.
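- The statement that energy transfer drops off quickly with distance can be illustrated with the standard Förster relation, E = 1 / (1 + (r/R0)^6), which is not given in this description and is used here only as a textbook assumption. The function name `fret_efficiency` and the example distances are hypothetical.

```python
def fret_efficiency(distance_nm, forster_radius_nm):
    """Transfer efficiency for a donor/acceptor pair under the Forster model.

    forster_radius_nm (R0) is the distance at which efficiency is 50%;
    it depends on the dye pair and is an input assumption here.
    """
    if distance_nm < 0 or forster_radius_nm <= 0:
        raise ValueError("distance must be >= 0 and Forster radius must be > 0")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


# Efficiency falls steeply once the dyes are further apart than R0.
for r in (2.5, 5.0, 10.0):
    print(r, round(fret_efficiency(r, 5.0), 4))
```

The sixth-power dependence is why separating the donor/acceptor pair by cleavage, or bringing two labeled probes together by adjacent hybridization, produces a clear change in signal.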
- the present invention provides a kit for predicting, diagnosing, or prognosing interstitial lung disease in a human subject, the kit including (e.g. consisting essentially of) at least one probe or primer for detecting the presence or absence of at least one genetic variation.
- the at least one genetic variation includes a genetic variant of at least one of TOLLIP, SPPL2C, and MDGA2.
- the kit includes at least one primer or probe for detecting more than one genetic variant of TOLLIP and/or SPPL2C and/or MDGA2.
- the kit includes at least one probe or primer for detecting additional genetic variants diagnostic or predictive of risk for interstitial lung disease, e.g., a genetic variant of MUC5B, such as rs35705950.
- the kit includes a probe or primer for detecting one or more SNPs selected from rs111521887, rs5743894, rs5743890, rs17690703, and rs7144383.
- the kit may include probes or primers for detecting rs111521887, rs5743894, rs5743890, rs17690703, and rs7144383 alone or in any combination.
- the kit may include additional primers or probes for detecting the presence or absence of rs35705950 and rs111521887, rs5743894, rs5743890, rs17690703, or rs7144383 in any combination.
- the kit includes at least one probe or primer for detecting one or more of the genetic variants listed in Fig. 7.
- the kit may include probes or primers for detecting the one or more genetic variants listed in Fig. 7 alone or in any possible combination of from two to 52 of the listed genetic variants. If the kit includes a probe or primer for detecting rs35705950, the kit also includes a probe or primer for detecting at least one additional genetic variant from the remaining 51 genetic variants listed in Fig. 7.
- a kit for predicting, diagnosing, or prognosing interstitial lung disease in a human subject "consisting essentially of" certain types of probes or primers is intended to capture kits that include probes or primers that are suitable primarily for detecting genetic variants associated with interstitial lung disease in humans, although the kits may also include additional probes or primers used as controls, for example, probes or primers for detecting housekeeping genes such as β-actin, tubulin, or glyceraldehyde-3-phosphate dehydrogenase.
- the use of the transitional phrase "consisting essentially of" is intended to exclude arrays containing thousands of probes, the vast majority of which are unrelated to interstitial lung disease.
- the kits may include buffers, enzymes, labels, and the like, for example, for use in isolating DNA or mRNA, generating cDNA, or for amplifying and/or detecting and/or sequencing specific SNPs.
- the kit includes (or consists essentially of) a nucleic acid primer capable of hybridizing to a genetic variant in the TOLLIP gene (e.g., a TOLLIP nucleic acid), SPPL2C gene (e.g., a SPPL2C nucleic acid), or MDGA2 gene (e.g., MDGA2 nucleic acid).
- the genetic variant has been extracted from a human subject with an interstitial lung disease, or suspected of having an interstitial lung disease.
- the genetic variant is an amplification product of DNA extracted from a human subject with an interstitial lung disease, or suspected of having an interstitial lung disease.
- the interstitial lung disease is a pulmonary fibrotic condition.
- the kit includes a first nucleic acid probe (e.g., a labeled probe) capable of hybridizing to an amplification product of a genetic variant in the TOLLIP gene (e.g., a TOLLIP nucleic acid), SPPL2C gene (e.g., a SPPL2C nucleic acid), or MDGA2 gene (e.g., MDGA2 nucleic acid).
- the kit includes a second nucleic acid probe capable of hybridizing to an amplification product of a genetic variant in the TOLLIP gene (e.g., a TOLLIP nucleic acid), SPPL2C gene (e.g., a SPPL2C nucleic acid), or MDGA2 gene (e.g., MDGA2 nucleic acid).
- the second nucleic acid probe is capable of hybridizing to a different sequence than the first probe.
- only one of the nucleic acid probes hybridizes to the variant nucleotide(s) (e.g., in the case of a SNP), while the other nucleic acid probe hybridizes to a nearby sequence.
- the second probe is labeled, e.g., with a different label than the first probe.
- the first nucleic acid probe is labeled with a first label
- the second nucleic acid probe is labeled with a second label, wherein the first and second label form a FRET pair (are capable of fluorescence resonance energy transfer) when hybridized to the genetic variant TOLLIP gene (e.g., a TOLLIP nucleic acid), SPPL2C gene (e.g., a SPPL2C nucleic acid), or MDGA2 gene (e.g., MDGA2 nucleic acid), or amplification product thereof.
- the kit includes (or consists essentially of) primers or at least one probe capable of detecting a genetic variant, e.g., as described above, depending on the detection method selected.
- the kit includes primers or at least one probe capable of detecting a genetic variant in a region selected from the group consisting of 11p15.5, 14q21.3, and 17q21.31.
- the kit includes primers or at least one probe capable of detecting at least one genetic variant in 11p15.5 (e.g., rs111521887, rs5743894, rs5743890, and rs35705950).
- the kit includes primers or probes capable of detecting more than one (e.g., 2, 3, 4, 5, 5-10, 10-20, or more) genetic variant in 11p15.5 and 14q21.3 (e.g., rs7144383). In some embodiments, the kit includes primers or probes capable of detecting more than one (e.g., 2, 3, 4, 5, 5-10, 10-20, or more) genetic variant in 11p15.5 and 17q21.31 (e.g., rs17690703, a genetic inversion, or copy number variation). In some embodiments, the kit includes primers or probes capable of detecting more than one (e.g., 2, 3, 4, 5, or more) genetic variant in 14q21.3 and 17q21.31. In some embodiments, the kit includes primers or probes capable of detecting more than one (e.g., 2, 3, 4, 5, 5-10, 10-20, or more) genetic variant in 11p15.5, 14q21.3, and 17q21.31.
- the primers and/or probes are labeled, e.g., with fluorescent labels or FRET labels. In some embodiments, the primers and/or probes are unlabeled. In some embodiments, the kit includes primers and/or probes that detect both a variant allelic sequence and the dominant allelic sequence at a selected genetic variant site, e.g., with different labels, or designed to generate amplification or primer extension products with different masses.
- the kit further includes at least one control sample, e.g., sample(s) with dominant allele(s) at the selected genetic variation site(s), or sample(s) with variant allele(s) at the selected genetic variation site(s).
- the kit includes a polymerase.
- nucleic acid complexes e.g., formed in in vitro assays to indicate the presence of a genetic variant sequence.
- a nucleic acid complex can also be formed to detect the presence of a dominant allelic sequence, depending on the design of the probe or primer, e.g., in assays to distinguish homozygous and heterozygous subjects.
- the complex comprises a first nucleic acid hybridized to a genetic variant nucleic acid, wherein the genetic variant nucleic acid is a genetic variant in a region selected from 11p15.5, 14q21.3, and 17q21.31.
- the genetic variant nucleic acid is an amplification product.
- the genetic variant nucleic acid is on genomic DNA, e.g., from a subject that has or is suspected of having an interstitial lung disease.
- the first nucleic acid is an amplification product or a primer extension product.
- the first nucleic acid is labeled.
- the nucleic acid complex further comprises a second nucleic acid hybridized to the genetic variant nucleic acid.
- the second nucleic acid is labeled, e.g., with a FRET or other fluorescent label.
- the first and second nucleic acids form a FRET pair when hybridized to a genetic variant sequence.
- the genetic variant is in the TOLLIP gene (e.g., rs111521887, rs5743894, rs5743890). In some embodiments, the genetic variant is in the MDGA2 gene (e.g., rs7144383). In some embodiments, the genetic variant is in the SPPL2C gene (e.g., rs17690703, a genetic inversion, or copy number variation).
- an in vitro complex comprising a first nucleic acid probe (e.g., a labeled probe) hybridized to a genetic variant nucleic acid, wherein said genetic variant nucleic acid comprises a genetic variant TOLLIP, SPPL2C or MDGA2 gene sequence, wherein said genetic variant nucleic acid is extracted from a human subject with an interstitial lung disease or suspected of having an interstitial lung disease, or is an amplification product thereof.
- the complex further comprises a second nucleic acid probe (e.g., labeled with a different label) hybridized to said genetic variant nucleic acid.
- first nucleic acid probe comprises a first label and said second nucleic acid probe comprises a second label, wherein said first and second label are capable of fluorescence resonance energy transfer.
- the complex further comprises an enzyme, such as a DNA polymerase (e.g., a standard DNA polymerase or a thermally stable polymerase such as Taq) or a ligase.
- MUC5B and TOLLIP genes reside on the same genetic locus. Based on the analysis performed, the association of TOLLIP genetic variants was found to be independent from association with the previously reported MUC5B promoter SNP, rs35705950, on IPF susceptibility. Notably, the minor allele of the TOLLIP SNP, rs5743890_G, was discovered to be a "protective" allele, as it lowered susceptibility to IPF compared with controls. However, mortality analysis demonstrated that individuals who developed IPF despite having the protective rs5743890_G allele had increased mortality in two independent case series and in a meta-analysis. The MUC5B/TOLLIP region on chromosome 11p15.5 exemplifies the association patterns, disease susceptibility and outcomes.
- the Toll interacting protein (Tollip), encoded by the TOLLIP gene, is known to be a critical regulator of Toll-like receptor (TLR)-mediated innate immune responses and the transforming growth factor-β (TGF-β) signaling pathway.
- Tollip activates Myd88-dependent NF-κB to modulate TLR signaling and membrane trafficking; interacts with Smad7 to modulate intracellular trafficking and negatively regulates the TGF-β signaling pathway by degrading ubiquitinated TGF-β type 1 receptor; and interacts with caveolin-1 interacting protein in monocytes, regulating signaling in antigen-presenting cells to induce antigen-specific proliferation of T cells, B cells, or both.
- TOLLIP polymorphisms are involved in regulation of TLR2 and TLR4 and are associated with susceptibility to tuberculosis, atopic dermatitis, and sepsis, and TOLLIP is differentially hypomethylated in IPF lungs. Lastly, failure to upregulate TOLLIP expression in inflammatory bowel disease may lead to chronic inflammation.
- Chromosome 17q21 region has been associated with Parkinson's, multiple sclerosis, Alzheimer's, androgenic alopecia, and interestingly, with the response to inhaled corticosteroids in asthma and COPD.
- the minor allele rs17690703_T in the 17q21.31 region was associated with decreased susceptibility for IPF development and also conferred increased mortality in InterMune, UChicago, and in the meta-analysis.
- H2 a known inversion, referred to as H2
- H2 in a large region of conserved LD on the chromosome, which is positively selected in Europeans.
- CNVs copy number variants
- MDGA2 a novel region, resides on 14q21.23 and showed association with
- MDGA2 is a paralog for ICAM, which has been recently demonstrated as a potential biomarker of IPF disease activity. The instant findings indicate the importance of this gene in IPF.
- IPF is a heterogeneous disease and, by definition, is a diagnosis of exclusion. As such, misdiagnoses are possible, which might lead to a reduction in power.
- all subjects met currently accepted criteria for diagnosis as outlined by ATS/ERS/JRS/ALAT, with many having been vetted with core pathology and radiology, as in InterMune and ACE-IPF, as well as through participation in a variety of studies.
- a three-stage association study was conducted, including a discovery GWAS for susceptibility to IPF in Stage 1, with the findings replicated in two independent case-control association studies (Stage 2 and Stage 3, respectively). Association with mortality was evaluated in three case series. A flowchart illustrating the strategic approach used is shown in Fig. 2.
- Stage 1 samples consisting of African-Americans (AA) and European-Americans (EA) were collected for the discovery phase of the genome-wide association study (GWAS), while Stages 2 and 3 consisting of only EA samples were collected for two independent replication studies (replication 1 and 2, respectively). All eligible subjects were at least 35 years of age and reported having symptoms of idiopathic interstitial pneumonia for at least 3 months.
- a high-resolution computed tomographic scan was required to show definite or probable idiopathic interstitial pneumonia in accordance with predefined criteria, 14 and a surgical lung biopsy confirming UIP was obtained in 37.3% of subjects in the discovery GWAS stage. Subjects with clinically significant exposure to known fibrogenic agents or another cause of interstitial lung disease were excluded.
- Stage 1 genotyping was conducted using the Genome-Wide Human SNP 6.0 array (Affymetrix, Santa Clara, CA). Stages 2 and 3 genotyping was conducted using the iPLEX Gold™ Platform (Sequenom, San Diego, CA). Genotype imputation was performed with IMPUTE2 using European ancestry panel data from the 1000 Genomes Project as a reference. Association testing was performed using SNPTEST software (v2.3). 7 Fifty-two SNPs selected in 19 loci showing an association with IPF (p < 10⁻⁴) in Stage 1 were carried forward to Stage 2. As the selected SNPs with the lowest p-value in Stage 1 were all a result of imputation, their association was validated by genotyping using the iPLEX Gold™ Platform. Six SNPs in 3 loci achieving an overall p < 5×10⁻⁸ (i.e. Stage 1 and 2 combined) were carried forward to Stage 3.
- Genotypes were recalled plate-by-plate in the study, including those downloaded from dbGaP using "crlmm" package, a new implementation of the Corrected Robust Linear Model with Maximum Likelihood Classification (CRLMM) algorithm, available through the Oligo package at Bioconductor. 18, 19
- Samples were excluded from the analysis if they failed any of several quality metrics: low call rate (below 97% or 93% for production plate with > 35 samples or with < 35 samples, respectively), incompatibility between reported gender and genetically determined gender, or incompatibility between reported race and genetically determined race. Samples were also checked for unexpected familial relationships using pairwise IBD estimation in PLINK. 20 The total number of European-American IPF case and control samples passing all initial QC tests was 575 and 1,427 (1,340 of the available 1,442 cases from dbGaP and 87 of the 103 cases from University of Pittsburgh), respectively.
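- The sample-level exclusion rules above can be sketched as a single predicate. This is an illustrative reading of the stated criteria, not the pipeline actually used: the function and argument names are hypothetical, plates of exactly 35 samples are assumed to fall under the lower threshold, and the familial-relationship check (pairwise IBD in PLINK) is not reproduced.

```python
def passes_sample_qc(call_rate, plate_size, reported_sex, genetic_sex,
                     reported_race, genetic_race):
    """Return True if a sample survives the call-rate and concordance checks.

    call_rate is a fraction in [0, 1]; plate_size is the number of samples
    on the production plate the sample was genotyped on.
    """
    # 97% minimum for plates with more than 35 samples, 93% otherwise
    # (treatment of exactly 35 samples is an assumption).
    min_call_rate = 0.97 if plate_size > 35 else 0.93
    return (call_rate >= min_call_rate
            and reported_sex == genetic_sex
            and reported_race == genetic_race)


print(passes_sample_qc(0.95, 40, "F", "F", "EA", "EA"))  # fails call rate
print(passes_sample_qc(0.95, 20, "F", "F", "EA", "EA"))  # passes on a small plate
```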
- Genome-wide SNP imputation was performed for the cleaned dataset to identify additional SNPs possibly showing associations.
- SHAPEIT 23 software was used to estimate phased haplotypes from the directly observed genotype data. Haplotypes derived from a European ancestry panel, consisting of samples from CEU, FIN, GBR, IBS and TSI from the 1000 Genomes Project (February 2012 release), were used as a reference. Imputation was conducted using IMPUTE2. The inflation factor (λ) between cases and controls across all SNPs was 1.06.
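- The description reports the inflation factor but not how it was computed. A common definition, assumed here, is the median of the observed 1-degree-of-freedom chi-square association statistics divided by the median of the chi-square distribution with 1 degree of freedom (about 0.4549). The function name and the example statistics are hypothetical.

```python
import statistics

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def inflation_factor(chi2_statistics):
    """Genomic inflation factor from per-SNP 1-df chi-square statistics."""
    if not chi2_statistics:
        raise ValueError("at least one test statistic is required")
    return statistics.median(chi2_statistics) / CHI2_1DF_MEDIAN


# Made-up statistics whose median is 6% above the null expectation.
example = [0.05, 0.21, 0.4549364 * 1.06, 1.3, 7.9]
print(round(inflation_factor(example), 2))
```

Values close to 1 suggest little systematic inflation of test statistics, for example from population stratification.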
- SNPTEST software (v2.3) 24 was used to calculate p-values based on a one degree-of-freedom score test for a logistic regression which assumes that the allele effect on the genotype for each SNP is additive.
- the score test implemented in SNPTEST allows for genotypic uncertainty via missing data likelihood, therefore it is applicable to both imputed genotypic data (i.e. in Stage 1) and to directly genotyped data (i.e. all stages).
- P-values were calculated for each stage separately, for Stages 1 and 2 combined, and finally for a joint analysis with all stages combined as one sample.
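- To make the one degree-of-freedom additive score test concrete, the following is a minimal sketch for a logistic model with an intercept only under the null, using allele counts (or expected dosages) coded 0 to 2. It is not SNPTEST's implementation: SNPTEST handles genotype uncertainty through a missing-data likelihood, whereas this sketch simply plugs in the genotype values it is given. The function name and example data are hypothetical.

```python
import math


def additive_score_test(genotypes, phenotypes):
    """1-df score test for an additive allele effect in logistic regression.

    genotypes: allele counts or dosages in [0, 2]
    phenotypes: 1 for case, 0 for control
    Returns (chi-square statistic, p-value).
    """
    n = len(genotypes)
    if n == 0 or n != len(phenotypes):
        raise ValueError("genotypes and phenotypes must be non-empty and equal length")
    mean_y = sum(phenotypes) / n
    mean_g = sum(genotypes) / n
    # Score for the genotype coefficient, evaluated at the null (beta = 0).
    score = sum(g * (y - mean_y) for g, y in zip(genotypes, phenotypes))
    # Null variance of the score after accounting for the intercept.
    variance = mean_y * (1 - mean_y) * sum((g - mean_g) ** 2 for g in genotypes)
    if variance == 0:
        raise ValueError("no variation in genotype or phenotype")
    statistic = score ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


stat, p = additive_score_test([2, 1, 1, 0], [1, 1, 0, 0])
print(stat, round(p, 4))
```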
- Model parameters were estimated with a random subset of 200 individuals before imputation on the entire dataset. Regions were deemed for follow-up in Stage 2 if they had a SNP with an association p < 10⁻⁴ in Stage 1. A minimum of 2 SNPs was selected from each region for Stage 2 genotyping.
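- The region-selection rule can be sketched as a simple filter over Stage 1 results. The tuple layout, function name, and SNP identifiers below are hypothetical; only the 10⁻⁴ threshold comes from the description.

```python
P_THRESHOLD = 1e-4  # Stage 1 follow-up threshold stated in the description


def regions_for_follow_up(snp_results, threshold=P_THRESHOLD):
    """Pick regions with at least one SNP below the threshold.

    snp_results: iterable of (region, snp_id, p_value) tuples.
    Returns {region: (best_snp_id, best_p_value)} for qualifying regions.
    """
    best = {}
    for region, snp_id, p_value in snp_results:
        if p_value >= threshold:
            continue
        if region not in best or p_value < best[region][1]:
            best[region] = (snp_id, p_value)
    return best


results = [
    ("regionA", "snp1", 3e-6),
    ("regionA", "snp2", 8e-5),
    ("regionB", "snp3", 2e-3),
]
print(regions_for_follow_up(results))
```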
- the linkage disequilibrium (LD) of those two SNPs was low (r² < 0.2), where one of them was the variant with direct genotyping data showing the lowest p-value, and the other was the variant with imputed data showing the lowest p-value. Based on these criteria, a total of 40 SNPs for 19 loci were selected (2 SNPs per locus except for the chr11/TOLLIP, chr17/SPPL2C, and chr7/MAD1L1 regions with 3 SNPs; for c rtlS H region with only 1 SNP).
- tSNPs tagging SNPs
- haplotype i 2 included in TagIT 3.03 software.
- CEU, FIN, GBR, IBS and TSI European individuals
- Linkage disequilibrium (LD) between SNPs in the MUC5B/TOLLIP region was measured using pairwise r² measures. 8
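- For reference, pairwise r² between two biallelic SNPs can be computed from phased haplotypes as D² / (pA(1 − pA) pB(1 − pB)), with D = pAB − pA·pB. The sketch below assumes haplotypes are already phased and coded 0/1; the function name and toy data are hypothetical, and the software actually used for this step is not specified in the description.

```python
def ld_r2(haplotypes):
    """Pairwise r^2 between two biallelic SNPs.

    haplotypes: list of (allele_at_snp_a, allele_at_snp_b) tuples, each 0 or 1.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("no haplotypes supplied")
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        raise ValueError("r^2 is undefined when a SNP is monomorphic")
    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium
    return d * d / denominator


print(ld_r2([(1, 1), (0, 0), (1, 1), (0, 0)]))  # alleles always co-occur
print(ld_r2([(1, 1), (1, 0), (0, 1), (0, 0)]))  # alleles independent
```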
- the mode of inheritance for these SNPs was determined by comparing the odds ratios of the heterozygous and at-risk homozygous genotypes.
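- A minimal sketch of the comparison described above: genotype-specific odds ratios are computed against the reference homozygote and then compared. The dictionary keys, function name, and counts are hypothetical. The rules of thumb in the comments are a common convention and are assumptions here, not criteria stated in the description.

```python
def genotype_odds_ratios(cases, controls):
    """Odds ratios for heterozygous and at-risk homozygous genotypes.

    cases / controls: dicts of counts keyed by 'ref' (no risk allele),
    'het' (one copy) and 'hom' (two copies).
    Returns (or_het, or_hom), each relative to the 'ref' genotype.
    """
    def odds_ratio(genotype):
        return (cases[genotype] * controls["ref"]) / (controls[genotype] * cases["ref"])

    return odds_ratio("het"), odds_ratio("hom")


# Illustrative counts only.
or_het, or_hom = genotype_odds_ratios(
    {"ref": 100, "het": 100, "hom": 50},
    {"ref": 200, "het": 100, "hom": 25},
)
# Rough reading: or_het close to or_hom suggests a dominant effect,
# or_het close to 1 suggests a recessive effect, and or_hom close to
# or_het squared suggests an additive (per-allele) effect.
print(or_het, or_hom)
```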
- a regression-based conditional analysis of the interaction between MUC5B and TOLLIP SNPs on IPF susceptibility was implemented in the R statistical package.
- Demographic and clinical characteristics of IPF patients and controls in each stage are shown in Fig. 9 and Fig. 10.
- cases in the discovery stage had a wide range of disease severity and age.
- the Stage 2 patients were a blend of cases with milder (InterMune) and more severe disease undergoing lung transplantation (LTOG), yielding a very similar group to Stage 1 based on the overall physiologic severity as assessed by forced vital capacity (FVC) and diffusing capacity for carbon monoxide (DLCO) (Fig. 10).
- the Stage 3 patients were more severe, derived from the LTOG and ACE-IPF study. However, all IPF cases met diagnostic criteria 16 and were all of similar age and gender. Characteristics of cases with follow-up data for survival analysis are shown in Fig. 12.
- a total of 19 genomic loci with an association were identified from Stage 1 discovery GWAS. Fifty-two SNPs were compiled from the combination of genotyped, imputed, and tSNPs.
- Fig. 7 summarizes annotations for these loci, allele frequency in reference populations (CEU, EUR), IPF cases, controls, as well as their association p-values with susceptibility to IPF.
- Directly genotyped SNPs in Stage 2 nominally replicated many of the associations with IPF susceptibility detected in Stage 1 GWAS.
- Five imputed SNPs and the previously identified MUC5B promoter SNP reached genome-wide significance levels (p-value < 4.2 × 10⁻⁸) in a joint analysis of Stage 1 and 2. These six SNPs were re-genotyped in Stage 1 samples and the association confirmed.
- highlighted loci of chr11p15.5 containing SNPs of TOLLIP (rs111521887, rs5743894, rs5743890) and MUC5B (rs35705950); chr17q21.31 of SPPL2C (rs17690703); and chr14q21.3 of MDGA2 (rs7144383).
- Diamonds and circles represent individual SNPs of the GWA screen using genotyped and imputed data, respectively. Colored diamonds indicate SNP data obtained by the analysis of 542 IPF cases and 542 controls. Additional tSNPs selected for better coverage are included. Associations were assessed assuming recessive and additive modes of inheritance for the MUC5B/TOLLIP locus and the SPPL2C locus, respectively. Levels of linkage disequilibria (r²) with the best-associated SNP (red diamonds) are color-coded. Blue lines indicate recombination fractions as estimated from the European panel sample.
- the r² values of MUC5B promoter SNP, rs35705950, and TOLLIP SNPs were 0.07, 0.16, and 0.01, respectively.
- These low levels of LD indicate that the signals of association for TOLLIP SNPs are independent from MUC5B (Fig. 4A).
- the mode of effect for the MUC5B SNP (dominant) was different than that for the TOLLIP SNPs (additive or recessive), providing additional evidence that these are independent signals.
- EA European ancestry
- H2 status was based on the presence of all 3 SNPs that tag H2 (rs916793, rs2902662, rs17651213). This method allowed H2 assignment to all but 3 patients in this cohort. The addition of a proxy SNP (rs199448) allowed H2 status to be determined for the 3 remaining patients. These data suggest that presence of an H2 haplotype increases susceptibility to IPF.
- the cohort of 120 EA individuals was then stratified based on H2 (absent vs. present) and SPPL2C (wild-type (WT) vs variant (Var)) status. Inclusion of SPPL2C in the stratification is necessary given the strong correlation between the two variants and potential confounding by SPPL2C.
- the vast majority of patients belonged to either the H2(-)/SPPL2C-WT or H2(+)/SPPL2C-Var group, making it difficult to draw a conclusion about the two smaller groups. When comparing one group to another, the statistical significance was lost.
- a barrier inherent in the large amount of data generated by next generation sequencing of genetic regions involves methods to evaluate uncommon or rare variants.
- regions with common variants have a greater number of uncommon or rare variants as well.
- One approach using the fundamentals of a logistic regression involves an L1-regularized regression to accommodate a large number of variants.
- the Lasso method is a shrinkage and selection method for linear regression. It minimizes the usual sum of squared errors, with a bound on the sum of the absolute values of the coefficients. It has connections to soft-thresholding of wavelet coefficients, forward stagewise regression, and boosting methods.
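- The link between the Lasso and soft-thresholding can be shown with a small coordinate-descent sketch for linear regression. It solves the penalized (Lagrangian) form, minimizing (1/2n)·Σ(y − Xβ)² + penalty·Σ|β|, which corresponds to the bounded form described above for a suitable bound. No intercept is fitted, so centered data are assumed, and the logistic variant is not covered. All names and data are hypothetical.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; return 0.0 if it does not exceed it."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(x, y, penalty, n_iter=100):
    """Fit Lasso coefficients by cyclic coordinate descent.

    x: list of rows (each a list of predictor values), y: list of responses.
    """
    n, p = len(y), len(x[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # covariance of predictor j with the partial residual
            z = 0.0    # mean squared value of predictor j
            for i in range(n):
                others = sum(x[i][k] * beta[k] for k in range(p) if k != j)
                rho += x[i][j] * (y[i] - others)
                z += x[i][j] ** 2
            rho /= n
            z /= n
            beta[j] = soft_threshold(rho, penalty) / z if z > 0 else 0.0
    return beta


# Two orthogonal predictors: a strong effect (2.0) and a weak one (0.1).
x = [[1, 1], [1, -1], [-1, 1], [-1, -1]]
y = [2.1, 1.9, -1.9, -2.1]
print(lasso_coordinate_descent(x, y, penalty=0.5))
```

With this toy data the weak effect is set exactly to zero while the strong effect is shrunk, which is the selection-plus-shrinkage behavior that makes the method attractive when many uncommon or rare variants are tested together.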
- a method of determining whether a human subject has or is at risk of developing an interstitial lung disease comprising detecting whether the genome of the subject comprises a genetic variant of at least one of TOLLIP, SPPL2C, and MDGA2 and determining whether the subject has or is at risk of developing an interstitial lung disease, the presence of the genetic variant indicating that the subject has or is at risk of developing the interstitial lung disease.
- interstitial lung disease is a fibrotic interstitial lung disease.
- interstitial lung disease is idiopathic pulmonary fibrosis or familial interstitial pneumonia.
- a method of prognosing an interstitial lung disease in a human subject comprising detecting whether the genome of the subject comprises a genetic variant of TOLLIP or SPPL2C and determining a prognosis for the subject, the presence of the genetic variant gene being prognostic of increased or decreased survival.
- interstitial lung disease is a fibrotic interstitial lung disease.
- interstitial lung disease is idiopathic pulmonary fibrosis or familial interstitial pneumonia.
- a method of detecting the presence or absence of at least one genetic variant in a human subject comprising: detecting the presence or absence of at least one genetic variant of at least one of TOLLIP, SPPL2C, and MDGA2 in a sample from the subject.
- the at least one genetic variant includes one or more of a single nucleotide polymorphism selected from the group consisting of rs111521887, rs5743894, rs5743890, rs17690703, and rs7144383.
- interstitial lung disease is a fibrotic interstitial lung disease or familial interstitial pneumonia.
- a method of detecting the presence or absence of at least two genetic variants in a human subject having or suspected of being at risk for developing an interstitial lung disease comprising: detecting the presence or absence of at least two of the genetic variants listed in Fig. 7 in a sample from the subject.
- a method of testing for interstitial lung disease in a human subject comprising: detecting a level of TOLLIP gene expression in a sample from the subject, a low level of TOLLIP gene expression relative to a control being indicative of interstitial lung disease.
- a method of treating a human subject having an interstitial lung disease comprising: detecting a level of TOLLIP expression according to any one of embodiments 42-44; and if the subject has a low level of TOLLIP expression relative to a control, administering to the subject an amount of a Tollip agonist, Tollip or a genetic construct expressing TOLLIP effective to treat the interstitial lung disease.
- kits for predicting, diagnosing, or prognosing interstitial lung disease in a human subject consisting essentially of: at least one probe or primer for detecting the presence or absence of at least one genetic variation in at least one of TOLLIP, SPPL2C, and MDGA2.
- kits of embodiment 46 wherein the at least one probe or primer includes probes or primers for detecting at least one genetic variation in TOLLIP.
- kits of embodiment 46 or 47, wherein the at least one probe or primer includes probes or primers for detecting at least one genetic variation in SPPL2C.
- kit of any one of embodiments 46-50, wherein the genetic variation includes at least one of rs111521887, rs5743894, rs5743890, rs17690703, rs7144383, and rs35705950.
- kits for predicting, diagnosing, or prognosing interstitial lung disease in a human subject comprising: at least one probe or primer for detecting the presence or absence of at least two genetic variations selected from the genetic variations listed in Fig. 7.
- kit of embodiment 53 wherein the kit comprises probes and/or primers for detecting the presence or absence of from two to 52 of the genetic variations listed in Fig. 7.
- kit of embodiment 54 wherein the kit comprises probes and/or primers for detecting the presence or absence of from two to 44 of the genetic variations listed in Fig. 11.
- a method of determining whether a human subject has or is at risk of developing an interstitial lung disease comprising detecting whether the genome of the subject comprises at least two genetic variants selected from the group of variants listed in Fig. 7 and determining whether the subject has or is at risk of developing an interstitial lung disease, the presence of the genetic variant indicating that the subject has or is at risk of developing the interstitial lung disease.
- interstitial lung disease is a fibrotic interstitial lung disease.
- interstitial lung disease is idiopathic pulmonary fibrosis or familial interstitial pneumonia.
- a method of prognosing an interstitial lung disease in a human subject comprising detecting whether the genome of the subject comprises at least two of the genetic variants listed in Fig. 7 and determining a prognosis for the subject, the presence of the genetic variant gene being prognostic of increased or decreased survival.
- interstitial lung disease is a fibrotic interstitial lung disease.
- interstitial lung disease is idiopathic pulmonary fibrosis or familial interstitial pneumonia.
- a method of prognosing an interstitial lung disease in a human subject comprising detecting whether the genome of the subject comprises an inversion in the 17q21.31 chromosomal region and determining a prognosis for the subject, the presence of the inversion being prognostic of increased or decreased survival.
- a kit comprising a nucleic acid primer capable of hybridizing to a genetic variant TOLLIP nucleic acid, SPPL2C nucleic acid, or MDGA2 nucleic acid.
- kit of claim 65 wherein said genetic variant has been extracted from a human subject with an interstitial lung disease or is an amplification product of a nucleic acid extracted from a human subject with an interstitial lung disease.
- kits of claim 65 or 66 wherein said interstitial lung disease is a pulmonary fibrotic condition.
- kit of one of claims 65-67 further comprising a first labeled nucleic acid probe capable of hybridizing to an amplification product of said genetic variant TOLLIP nucleic acid, SPPL2C nucleic acid, or MDGA2 nucleic acid.
- kit of claim 68 further comprising a second labeled nucleic acid probe capable of hybridizing to an amplification product of said genetic variant TOLLIP nucleic acid, SPPL2C nucleic acid, or MDGA2 nucleic acid.
- kit of claim 69 wherein said first labeled nucleic acid probe comprises a first label and said additional labeled nucleic acid probe comprises a second label, wherein said first and second label are capable of fluorescence resonance energy transfer when hybridized to said genetic variant TOLLIP nucleic acid, SPPL2C nucleic acid, or MDGA2 nucleic acid.
- An in vitro complex comprising a first nucleic acid probe hybridized to a genetic variant nucleic acid, said genetic variant nucleic acid comprising a genetic variant TOLLIP, SPPL2C or MDGA2 gene sequence, wherein said genetic variant nucleic acid is extracted from a human subject with an interstitial lung disease or is an amplification product of a nucleic acid extracted from a human subject with an interstitial lung disease.
- An in vitro complex comprising a thermally stable polymerase bound to a genetic variant nucleic acid, said genetic variant nucleic acid comprising a genetic variant TOLLIP, SPPL2C or MDGA2 gene sequence, wherein said genetic variant nucleic acid is extracted from a human subject with an interstitial lung disease or is an amplification product of a nucleic acid extracted from a human subject with an interstitial lung disease.
Abstract
Disclosed are methods and kits for diagnosing or predicting risk for developing interstitial pulmonary fibrosis or predicting survival of individuals with interstitial pulmonary fibrosis.
Description
GENETIC VARIANTS IN INTERSTITIAL LUNG DISEASE SUBJECTS
STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT
Not applicable.
CROSS-REFERENCE TO RELATED APPLICATIONS
This PCT application claims the benefit of US Provisional Application No. 61/759,820, filed February 1, 2013, which is incorporated by reference herein.
INTRODUCTION
Idiopathic Pulmonary Fibrosis (IPF) is a low prevalence, devastating disease of unknown etiology characterized by an interstitial fibrotic process and high mortality. The course of disease is heterogeneous with a 2-5 year median survival from diagnosis. To date, lung transplantation remains the only successful treatment option, while immunosuppression regimens were recently demonstrated as harmful. Therefore, identifying genetic variants associated with susceptibility to IPF and alleles involved in the heterogeneity of disease course and mortality remains a major challenge.
A common single nucleotide polymorphism (SNP) of MUC5B is present in 34-38% of non-familial IPF cases, suggesting that a genetic underpinning contributes to disease. A prior genome-wide association study (GWAS) examining approximately 250,000 SNPs in 159 IPF cases demonstrated the association of an intronic common variant in the telomerase reverse transcriptase (TERT) gene with susceptibility to IPF.¹ Mutations in TERT or telomerase RNA component (TERC) genes result in telomere shortening and are associated with both familial and non-familial IPF. Rare heterozygous variants in surfactant protein A2 (SFTPA2) and surfactant protein C (SFTPC) genes have also been implicated in familial IPF. These findings suggest that the etiology of IPF may integrate multiple genetic loci.
There is a need in the art to identify genetic variants in interstitial lung disease subjects. Provided here are methods and compositions addressing these and other needs in the art.
SUMMARY OF THE INVENTION
In certain embodiments, compositions and methods are provided for identifying genetic variants in interstitial lung disease subjects. Also provided are compositions and methods of determining whether a human subject has, or is at risk of developing, an interstitial lung disease. In certain embodiments, the methods include detecting whether the genome of the subject comprises a genetic variant of at least one of TOLLIP, SPPL2C, and MDGA2, the presence of the genetic variant indicating that the subject has or is at risk of developing the interstitial lung disease. In certain embodiments, more than one genetic variant of TOLLIP and/or SPPL2C and/or MDGA2 is detected. In certain embodiments, in addition to detecting genetic variants of TOLLIP and/or SPPL2C and/or MDGA2, the method includes detecting whether the genome of the subject includes other genetic variants diagnostic or predictive of risk for interstitial lung disease, e.g., a genetic variant of MUC5B, such as rs35705950.
BRIEF DESCRIPTION OF THE DRAWINGS
Fig. 1 shows probability of survival over time, revealing the association with marker SNPs in the ch11p15.5 and ch17q21.31 regions in the InterMune, UChicago and UPittsburgh case series. Brown=homozygote minor; green=heterozygote; blue=homozygote major for each single nucleotide polymorphism.
Fig. 2A is a flowchart showing the approach used in a three-stage association study; Fig. 2B is a flowchart of mortality analyses by regression.
Fig. 3 is a QQ plot of the genome-wide association study (GWAS) of idiopathic pulmonary fibrosis (IPF).
Fig. 4 includes regional association plots showing the IPF-associated regions in Ch11p15.5 (Fig. 4A) and Ch17q21.31 (Fig. 4B).
Fig. 5 shows survival probability over time for people with or without H2 and with or without an SPPL2C variant.
Fig. 6A is a KM plot for TOLLIP*/MUC5B risk alleles; Fig. 6B is a KM plot by Risk Index for WPGS using all 3 genes (TOLLIP, SPPL2C & MUC5B) and categorizing into 4 groups.
Fig. 7A-7C is a list of top associated loci with susceptibility to IPF.
Fig. 8 is a table listing the sample sources and sizes used in a three stage study.
Fig. 9 shows the characteristics of IPF patients used in stage 1 discovery GWAS study.
Fig. 10 lists the characteristics of IPF patients by stage and availability.
Fig. 11A-11C is a list of 44 SNPs and their association p-values with susceptibility to IPF from stage 1, stage 2, and overall.
Fig. 12 shows characteristics of IPF case series for mortality analysis.
Fig. 13 is a table showing association signals with susceptibility to IPF across stages of six SNPs followed up in Stage 3.
Fig. 14 is a table listing SNP effects on mortality.
Fig. 15 provides summaries of univariate Cox analysis for mortality.
Fig. 16 provides summaries of univariate and multivariate Cox analysis for mortality.
Fig. 17 provides summaries of Kaplan-Meier survival analysis.
Fig. 18 lists predictors of survival in IPF patients identified using a univariate Cox model.
Fig. 19A-Fig. 19B list predictors of survival in IPF patients identified using a multivariate analysis of covariance.
Fig. 20 lists 30 regions identified, showing the value of aggregation and of using information in addition to protein coding SNPs, with the six p values representing the highest-ranking SNPs in each region shown in bold.
DETAILED DESCRIPTION
As described in detail below, an independent genome wide association study (GWAS) was used to identify novel polymorphisms associated with IPF susceptibility and/or mortality. The association of two novel genetic loci and the replication of a third locus in a 3-stage association study are reported herein. These loci are also associated with mortality in case series with follow-up data.
Specifically, the results obtained identified three genetic loci and replicated the association of four novel SNPs (rs111521887, rs5743894, rs5743890, and rs17690703) in two novel loci (ch11p15.5/TOLLIP and ch17q21.31/SPPL2C), and the MUC5B promoter SNP (rs35705950) with IPF susceptibility in European-Americans through a three-stage case-control study. Another novel SNP (rs7144383) on a third genetic locus not previously known to be associated with IPF, ch14q21.23/MDGA2, was discovered to show association with IPF susceptibility, although it did not replicate in Stage 3, possibly owing to the Stage 3 sample size.
The findings reported herein provide, inter alia, for novel compositions and methods for identifying genetic variants in interstitial lung disease subjects and/or determining whether an individual has, or is at risk for developing, interstitial lung disease and/or compositions and methods for predicting prognosis, e.g., survival time or mortality, of an individual with an interstitial lung disease, for example, a fibrotic interstitial lung disease, such as IPF, or familial interstitial pneumonia. Further, the identification of genetic loci and SNPs associated with interstitial lung disease contributes to the understanding of IPF pathogenesis and provides potential targets for novel treatment paradigms.
Definitions
Unless defined otherwise, technical and scientific terms used herein have the same meaning as commonly understood by a person of ordinary skill in the art. See, e.g., Lackie, DICTIONARY OF CELL AND MOLECULAR BIOLOGY, Elsevier (4th ed. 2007); Sambrook et al., MOLECULAR CLONING, A LABORATORY MANUAL, Cold Spring Harbor Press (Cold Spring Harbor, NY 1989). The term "a" or "an" is intended to mean "one or more." The term "comprise" and variations thereof such as "comprises" and "comprising," when preceding the recitation of a step or an element, are intended to mean that the addition of further steps or elements is optional and not excluded. The following definitions are provided to facilitate understanding of certain terms used frequently herein and are not meant to limit the scope of the present disclosure.
The term "nucleic acid" refers to deoxyribonucleotides or ribonucleotides and polymers thereof in either single- or double-stranded form, and complements thereof. "Nucleic acid" or "oligonucleotide" or "polynucleotide" or grammatical equivalents used herein means at least two nucleotides covalently linked together. Oligonucleotides are typically from about 5, 6, 7, 8, 9, 10, 12, 15, 25, 30, 40, 50 or more nucleotides in length, up to about 100 nucleotides in length. Nucleic acids and polynucleotides are polymers of any length, including longer lengths, e.g., 200, 300, 500, 1000, 2000, 3000, 5000, 7000, 10,000, etc. The term "nucleotide" typically refers to a single unit of a polynucleotide, i.e., a monomer. Nucleotides can be ribonucleotides, deoxyribonucleotides, or modified versions thereof.
As used herein, a "genetic variant" refers to a mutation, single nucleotide polymorphism (SNP), deletion variant, missense variant, insertion variant, inversion, or copy number variant.
The terms "probe" or "primer" refer to one or more nucleic acid fragments whose specific hybridization to a sample can be detected. A probe or primer can be of any length depending on the particular technique it will be used for. For example, PCR primers are generally between 10 and 40 nucleotides in length, while nucleic acid probes for, e.g., a Southern blot, can be more than a hundred nucleotides in length. The probe or primer can be unlabeled or labeled as described below so that its binding to a target sequence can be detected (e.g., with a FRET donor or acceptor label). The probe or primer can be designed based on one or more particular (preselected) portions of a chromosome, e.g., one or more clones, an isolated whole chromosome or chromosome fragment, or a collection of polymerase chain reaction (PCR) amplification products. The length and complexity of the nucleic acid fixed onto the target element is not critical to the invention. One of skill can adjust these factors to provide optimum hybridization and signal production for given hybridization and detection procedures, and to provide the required resolution among different genes or genomic locations.
Probes and primers can also be immobilized on a solid surface (e.g., nitrocellulose, glass, quartz, fused silica slides), as in an array. Techniques for producing high density arrays can also be used for this purpose (see, e.g., Fodor (1991) Science 767-773; Johnston (1998) Curr. Biol. 8: R171-R174; Schummer (1997) Biotechniques 23: 1087-1092; Kern (1997) Biotechniques 23: 120-124; U.S. Patent No. 5,143,854). One of skill will recognize that the precise sequence of particular probes and primers can be modified from the target sequence to a certain degree to produce probes that are "substantially identical" or "substantially complementary to" a target sequence, but retain the ability to specifically bind to (i.e., hybridize specifically to) the same targets from which they were derived.
A probe or primer is "capable of detecting" a genetic variant if it is complementary to a region that covers or is adjacent to the genetic variant. For example, to detect a SNP, primers can be designed on either side of the SNP, and primer extension used to determine the identity of the nucleotide at the position of the SNP. In some embodiments, FRET-labeled primers are used (at least one labeled with a FRET donor and at least one labeled with a FRET acceptor) so that FRET signal will be detected only upon hybridization of both primers. In some embodiments, a probe is used in conditions such that it hybridizes only to a genetic variant, or only to a dominant sequence. For example, the probe can be designed to hybridize to a junction point of a genetic inversion, but not to a sequence that does not include the inversion.
Again, in the context of nucleic acids, the term "capable of hybridizing to" refers to a polynucleotide sequence that forms non-covalent, Watson-Crick bonds with a complementary sequence. One of skill will understand that the percent complementarity need not be 100% for hybridization to occur, depending on the length of the polynucleotides, length of the complementary region, and stringency of the conditions. For example, a polynucleotide (e.g., primer or probe) can be capable of hybridizing (binding) to a polynucleotide having 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% complementarity over the stretch of the complementary region. Stringency can be increased by reducing the length of the complementary region, reducing the G-C content of the complementary region, increasing temperature and/or detergent levels, varying salt levels and pH, etc. as known in the art. In some embodiments, a polynucleotide is capable of hybridizing to a complementary sequence in standard PCR annealing conditions. In the context of detecting genetic variants, the tolerated percent complementarity or number of mismatches will vary depending on the technique used for detection (see below).
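As an illustration of the percent complementarity discussed above, the following Python sketch counts Watson-Crick pairs between a probe and a target of equal length. It is a minimal example under stated assumptions: the function names and the 20-nucleotide target sequence are hypothetical and are not taken from the disclosure, both sequences are written 5' to 3', and no gaps or non-canonical pairs are considered.

```python
# Watson-Crick pairing rules for DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def percent_complementarity(probe: str, target: str) -> float:
    """Percent of probe positions that pair with the target strand.

    Assumes equal-length sequences, both written 5' to 3', annealing
    antiparallel without gaps (an assumption made for this sketch).
    """
    if len(probe) != len(target):
        raise ValueError("probe and target must have the same length")
    expected = reverse_complement(target)
    matches = sum(p == e for p, e in zip(probe.upper(), expected))
    return 100.0 * matches / len(probe)


# Hypothetical 20-nucleotide target, used only for illustration.
target = "ACGTTGCAAGGCTTACGGAT"
perfect_probe = reverse_complement(target)
# Introduce a single mismatch at the first probe position.
mismatch_probe = ("G" if perfect_probe[0] != "G" else "A") + perfect_probe[1:]

print(percent_complementarity(perfect_probe, target))
print(percent_complementarity(mismatch_probe, target))
```

A single mismatch in a 20-nucleotide probe corresponds to 95% complementarity, which falls within the range listed above.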
In the context of nucleic acids, the term "amplification product" refers to a polynucleotide that results from an amplification reaction, e.g., PCR and variations thereof, rtPCR, strand displacement reaction (SDR), ligase chain reaction (LCR), transcription mediated amplification (TMA), or Qbeta replication. A thermally stable polymerase, e.g., Taq, can be used to avoid repeated addition of polymerase throughout amplification procedures that involve cyclic or extreme temperatures (e.g., PCR and its variants).
The terms "label," "detectable moiety," "detectable agent," and like terms refer to a composition detectable by spectroscopic, photochemical, biochemical, immunochemical, chemical, or other physical means. For example, useful labels include fluorescent dyes, luminescent agents, radioisotopes (e.g., 32P, 3H), electron-dense reagents, enzymes, biotin, digoxigenin, or haptens and proteins or other entities which can be made detectable, e.g., by affinity. Any method known in the art for conjugating a nucleic acid or other biomolecule to a label may be employed, e.g., using methods described in Hermanson, Bioconjugate Techniques 1996, Academic Press, Inc., San Diego. The term "tag" can be used synonymously with the term "label," but generally refers to an affinity-based moiety, e.g., a "His tag" for purification, or a "streptavidin tag" that interacts with biotin.
A "labeled" molecule (e.g., nucleic acid, protein, or antibody) is one that is bound, either covalently, through a linker or a chemical bond, or noncovalently, through ionic, van der Waals, electrostatic, or hydrogen bonds to a label such that the presence of the molecule may be detected by detecting the presence of the label bound to the molecule.
Förster resonance energy transfer (abbreviated FRET), also known as fluorescence resonance energy transfer, is a mechanism describing energy transfer between two chromophores. A donor chromophore (FRET donor), initially in its electronic excited state, can transfer energy to an acceptor chromophore (FRET acceptor), which is typically less than 10 nm away, through nonradiative dipole-dipole coupling. The energy transferred to the FRET acceptor is detected as an emission of light (energy) when the FRET donor and acceptor are in proximity. A "FRET signal" is thus the signal that is generated by the emission of light from the acceptor. The efficiency of Förster resonance energy transfer between a donor and an acceptor dye separated by a distance of $R$ is given by $E = 1/[1 + (R/R_0)^6]$, with $R_0$ being the Förster radius of the donor-acceptor pair, at which $E = 1/2$. $R_0$ is about 50-60 Å for some commonly used dye pairs (e.g., Cy3-Cy5). FRET signal depends on the sixth power of the distance. If the donor-acceptor pair is positioned around $R_0$, a small change in distance ranging from 1 Å to 50 Å can be measured with the greatest signal to noise. With current technology, 1 ms or faster parallel imaging of many single FRET pairs is achievable.
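The efficiency relation above can be evaluated directly. The following Python sketch is illustrative only: the function name is hypothetical, and the Förster radius of 55 Å is an assumed value chosen from within the 50-60 Å range mentioned above, not a measured constant for any particular dye pair.

```python
def fret_efficiency(distance: float, forster_radius: float) -> float:
    """Transfer efficiency E = 1 / (1 + (R / R0)**6).

    distance and forster_radius must be in the same units (e.g., angstroms).
    """
    if distance < 0 or forster_radius <= 0:
        raise ValueError("distance must be >= 0 and forster_radius > 0")
    return 1.0 / (1.0 + (distance / forster_radius) ** 6)


# Assumed Forster radius for illustration, within the 50-60 angstrom range.
R0 = 55.0
for r in (27.5, 45.0, 55.0, 65.0, 110.0):
    print(f"R = {r:6.1f} A  ->  E = {fret_efficiency(r, R0):.3f}")
```

Because of the sixth-power dependence, the efficiency changes most steeply for distances near the Förster radius, where E equals one half, and is nearly saturated or nearly zero far from it.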
A "FRET pair" refers to a FRET donor and FRET acceptor pair that are capable of FRET detection.
The terms "fluorophore," "dye," "fluorescent molecule," "fluorescent dye," "FRET dye" and like terms are used synonymously herein unless otherwise indicated.
"Subject," "patient," "individual" and like terms are used interchangeably and refer to, except where indicated, humans and non-human animals. The term does not necessarily indicate that the subject has been diagnosed with a particular disease, but typically refers to an individual under medical supervision. A patient can be an individual that is seeking diagnosis, treatment, monitoring, adjustment or modification of an existing therapeutic regimen, etc.
As used herein, a "sample" refers to a biological sample obtained from a subject. Samples include material that is processed prior to carrying out testing, e.g., genomic DNA separated or purified from other cellular and non-cellular debris. In the context of the present disclosure, the sample includes genomic DNA from the subject, e.g., cheek swab, blood sample, mucosal sample, buccal swab, skin sample, hair, etc.
A "control" sample or value refers to a sample that serves as a reference, usually a known reference, for comparison to a test sample. For example, a test sample can be taken from a test condition, e.g., a sample from an individual of unknown disease status, and compared to samples from individuals with known conditions, e.g., healthy, or lacking a given genetic variation (negative control), or pulmonary disease or having a given genetic variation (positive control). A control can also represent an average value gathered from a number of tests or results. One of skill in the art will recognize that controls can be designed for assessment of any number of parameters. For example, a control can be devised to compare signal strength in given conditions, e.g., in the presence of a test probe, or primer. One of skill in the art will understand which controls are valuable in a given situation and be able to analyze data based on comparisons to control values. Controls are also valuable for determining the significance of data. For example, if values for a given parameter are widely variant in controls, variation in test samples will not be considered as significant.
Diagnosis, prognosis, and treatment of interstitial lung disease
Provided herein are compositions and methods for determining whether a human subject has or is at risk of developing an interstitial lung disease and/or prognosing interstitial lung disease. In certain embodiments, the methods of the invention may be used in conjunction with any other diagnostic or prognostic criterion or method, including, but not limited to, currently known criteria or methods.
In certain embodiments, the method for determining whether a human subject has or is at risk of developing an interstitial lung disease includes detecting whether the genome of the subject comprises a genetic variant of at least one of TOLLIP, SPPL2C, and MDGA2, the presence of the genetic variant indicating that the subject has or is at risk of developing the interstitial lung disease. In certain embodiments, more than one genetic variant of TOLLIP and/or SPPL2C and/or MDGA2 is detected. In certain embodiments, in addition to detecting genetic variants of TOLLIP and/or SPPL2C and/or MDGA2, the method includes detecting whether the genome of the subject includes other genetic variants diagnostic or predictive of risk for interstitial lung disease, e.g., a genetic variant of MUC5B, such as rs35705950. In some embodiments, the method for determining whether a human subject has or is at risk of developing an interstitial lung disease includes detecting the presence or absence of one or more SNPs selected from rs111521887, rs5743894, rs5743890, rs17690703, and rs7144383. The presence or absence of each SNP may be detected alone or in combination with each other, i.e., the methods of the invention may include detection of one, two, three, four, or five of rs111521887, rs5743894, rs5743890, rs17690703, and rs7144383 in any possible combination. In certain embodiments, the method includes detecting the presence or absence of from one to five of rs111521887, rs5743894, rs5743890, rs17690703, and rs7144383 in any combination and the presence or absence of any other SNP associated with an interstitial lung disease or its prognosis, including, without limitation, the MUC5B SNP rs35705950.
In some embodiments, the method for determining whether a human subject has or is at risk of developing an interstitial lung disease includes detecting the presence of rs111521887 (e.g., G or other non-dominant allele). In some embodiments, the method for determining whether a human subject has or is at risk of developing an interstitial lung disease includes detecting the presence of rs5743894 (e.g., G or other non-dominant allele). In some embodiments, the method for determining whether a human subject has or is at risk of developing an interstitial lung disease includes detecting the presence of rs5743890 (e.g., G or other non-dominant allele). In some embodiments, the method for determining whether a human subject has or is at risk of developing an interstitial lung disease includes detecting the presence of rs17690703 (e.g., T or other non-dominant allele).
In some embodiments, the method for determining whether a human subject has or is at risk of developing an interstitial lung disease includes detecting the presence of rs7144383 (e.g., G or other non-dominant allele).
In certain embodiments, the method for determining whether a human subject has or is at risk of developing an interstitial lung disease includes detecting one or more genetic variants listed in Fig. 7. The one or more genetic variants may be detected alone or in any possible combination of from two to 52 of the listed genetic variants. If the method includes detecting rs35705950, then the method includes detecting at least one additional genetic variant from the remaining 51 genetic variants listed in Fig. 7.
In certain embodiments, the method includes prognosing an interstitial lung disease in a human subject. In certain embodiments, the method comprises detecting whether the genome of the subject comprises a genetic variant of TOLLIP and/or SPPL2C prognostic of increased or decreased survival. In certain embodiments, the methods include detecting whether the genome of the subject comprises a genetic variant of MUC5B and whether the genome comprises a genetic variant of TOLLIP and/or SPPL2C prognostic of increased or decreased survival. In certain embodiments, the method includes detecting whether the genome comprises rs17690703 and/or rs5743890, each of which is predictive of decreased survival. In certain embodiments, the method detects whether the genome comprises rs35705950, which is predictive of increased survival, and rs17690703 and/or rs5743890. In some embodiments, the method comprises detecting rs17690703 (e.g., T or other non-dominant allele), and prognosing reduced survival time for the subject. In some embodiments, the method comprises detecting rs5743890 (e.g., G or other non-dominant allele), and prognosing reduced survival time for the subject.
In certain embodiments, the method for prognosing the interstitial lung disease in a human subject includes detecting one or more genetic variants listed in Fig. 7. The one or more genetic variants may be detected alone or in any possible combination of from two to 52 of the listed genetic variants. If the method includes detecting rs35705950, then the method includes detecting at least one additional genetic variant from the remaining 51 genetic variants listed in Fig. 7.
The present invention provides methods for detecting the presence or absence of at least one genetic variant in a human subject. In certain embodiments, the method includes detecting the presence or absence of at least one genetic variant of at least one of TOLLIP, SPPL2C, and MDGA2 in a sample from the subject. In certain embodiments, more than one genetic variant of TOLLIP and/or SPPL2C and/or MDGA2 is detected. In certain embodiments, in addition to detecting genetic variants of TOLLIP and/or SPPL2C and/or MDGA2, the method includes detecting a genetic variant of MUC5B, such as rs35705950.
In certain embodiments, the method for detecting the presence or absence of at least one genetic variant in a human subject includes detecting the presence or absence of at least one genetic variant of the genetic variants listed in Fig. 7. The one or more genetic variants may be detected alone or in any possible combination of from two to 52 of the genetic variants listed in Fig. 7. If the method includes detecting rs35705950, then the method includes detecting at least one additional genetic variant from the remaining 51 genetic variants listed in Fig. 7. In certain embodiments, the at least one genetic variant includes one or more of a single nucleotide polymorphism selected from the group consisting of rs111521887, rs5743894, rs5743890, rs17690703, and rs7144383 in any possible combination.
In other embodiments, the method for detecting the presence or absence of at least one genetic variant in a human subject includes detecting the presence or absence of heterozygosity in at least one genetic variant of the genetic variants listed in Fig. 7. Alternatively, the method for detecting the presence or absence of at least one genetic variant in a human subject includes detecting the presence or absence of homozygosity in at least one genetic variant of the genetic variants listed in Fig. 7. The heterozygosity or homozygosity of the one or more genetic variants may be detected alone or in any possible combination of from two to 52 of the genetic variants listed in Fig. 7, wherein the genetic variant may be the same or different in the individual chromosomes present in the diploid human subject. If the method includes detecting heterozygosity or homozygosity of rs35705950, then the method includes detecting heterozygosity or homozygosity of at least one additional genetic variant from the remaining 51 genetic variants listed in Fig. 7. In certain embodiments, the heterozygosity or homozygosity of at least one genetic variant includes the heterozygosity or homozygosity of one or more of a single nucleotide polymorphism selected from the group consisting of rs111521887, rs5743894, rs5743890, rs17690703, and rs7144383 in any possible combination.
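The distinction between heterozygosity and homozygosity described above can be expressed as a simple count of variant alleles across the two chromosome copies. The following Python sketch is illustrative only; the function name, the return labels, and the example genotypes are assumptions made for this example, with T used as the variant allele of rs17690703 as mentioned elsewhere herein.

```python
from typing import Sequence


def zygosity(genotype: Sequence[str], variant_allele: str) -> str:
    """Classify a diploid genotype at one SNP by the number of variant alleles."""
    if len(genotype) != 2:
        raise ValueError("a diploid genotype has exactly two alleles")
    count = sum(allele.upper() == variant_allele.upper() for allele in genotype)
    return {
        0: "variant absent",
        1: "heterozygous",
        2: "homozygous",
    }[count]


# Hypothetical genotypes at rs17690703, with T as the variant allele.
for genotype in (("C", "C"), ("C", "T"), ("T", "T")):
    print(genotype, "->", zygosity(genotype, "T"))
```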
Also provided is a method for testing for interstitial lung disease in a human subject that involves detecting the level of TOLLIP gene expression in a sample from the subject, a low level of TOLLIP gene expression relative to a control being indicative of interstitial lung disease. The level of gene expression may be detected by measuring, directly or indirectly, TOLLIP mRNA or by measuring Tollip protein by any suitable method, several of which are known in the art. The control may include, for example, a sample from a human that does not have interstitial lung disease or a value or set of values, for example, a normal range, derived from several humans that do not have interstitial lung disease. A low level of TOLLIP gene expression relative to a control (standard control) indicative of interstitial lung disease is a level that is less than about 50% of the control.
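The comparison to a control described above can be sketched as a ratio test. The following Python example is a minimal sketch under stated assumptions: the function name and the numeric expression values are hypothetical, the units are arbitrary but must be the same for sample and control, and the default threshold of 0.5 reflects the "less than about 50% of the control" criterion stated above.

```python
def is_low_expression(sample_level: float, control_level: float,
                      threshold: float = 0.5) -> bool:
    """Return True if the sample level is below threshold * control level."""
    if control_level <= 0:
        raise ValueError("control_level must be positive")
    if sample_level < 0:
        raise ValueError("sample_level must be non-negative")
    return sample_level / control_level < threshold


# Hypothetical normalized TOLLIP mRNA levels (arbitrary units).
control = 10.0
for sample in (3.0, 5.0, 8.0):
    print(sample, "->", is_low_expression(sample, control))
```

The control value may be a single reference sample or a summary, such as a mean, of several samples from humans without interstitial lung disease.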
In certain embodiments, the present invention includes a method of treating a human subject having an interstitial lung disease comprising detecting the level of TOLLIP expression in a sample from the subject, and if the subject has a low level of TOLLIP expression relative to a control (standard control), administering to the subject an amount of a Tollip agonist, Tollip or a genetic construct expressing TOLLIP effective to treat the interstitial lung disease. An amount effective to treat the interstitial lung disease is an amount effective to delay onset, reduce frequency and/or severity of one or more symptoms, ameliorate one or more symptoms, and/or improve comfort and/or some function of the subject, e.g., respiratory function, relative to an untreated second subject or pool of subjects, or relative to the same subject prior to treatment or after cessation of treatment.
Methods of detecting a genetic variant
The methods of the invention are not limited to any particular way of detecting the presence or absence of a genetic variant (e.g. SNP) and can employ any suitable method to detect the presence or absence of a variant(s), of which numerous detection methods are known in the art.
Dynamic allele-specific hybridization (DASH) can be used to detect a genetic variant. DASH genotyping takes advantage of the differences in the melting temperature in DNA that result from the instability of mismatched base pairs. The process can be largely automated and encompasses a few simple principles.
Typically, the target genomic segment is amplified and separated from non-target sequence, e.g., through use of a biotinylated primer and chromatography. A probe that is specific for the particular allele is added to the amplification product. The probe can be designed to hybridize specifically to a variant sequence or to the dominant allelic sequence. The probe can be either labeled with or added in the presence of a molecule that fluoresces when bound to double-stranded DNA. The signal intensity is then measured as temperature is increased until the Tm can be determined. A non-matching sequence (either genetic variant or dominant allelic sequence, depending on probe design), will result in a lower than expected Tm.
DASH genotyping relies on a quantifiable change in Tm, and is thus capable of measuring many types of mutations, not just SNPs. Other benefits of DASH include its ability to work with label free probes and its simple design and performance conditions.
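The Tm comparison underlying DASH can be sketched as follows. This Python example is illustrative and rests on several assumptions that are not part of the disclosure: Tm is approximated as the midpoint of the steepest signal drop between consecutive readings, the 2 degree tolerance is arbitrary, and the melting-curve values are invented for demonstration.

```python
from typing import Sequence


def estimate_tm(temps: Sequence[float], signal: Sequence[float]) -> float:
    """Approximate Tm as the midpoint of the steepest fluorescence drop."""
    if len(temps) != len(signal) or len(temps) < 2:
        raise ValueError("need matching temperature and signal series")
    best_index, best_slope = None, 0.0
    for i in range(len(temps) - 1):
        slope = (signal[i + 1] - signal[i]) / (temps[i + 1] - temps[i])
        if slope < best_slope:  # more negative means a steeper drop
            best_index, best_slope = i, slope
    if best_index is None:
        raise ValueError("signal never decreases; no melting transition found")
    return (temps[best_index] + temps[best_index + 1]) / 2.0


def probe_matches(tm: float, expected_tm: float, tolerance: float = 2.0) -> bool:
    """A mismatched duplex melts at a lower than expected Tm."""
    return tm >= expected_tm - tolerance


# Invented melting curves (temperature in degrees C, arbitrary signal units).
temps = [50, 55, 60, 65, 70, 75]
matched_curve = [100, 98, 95, 80, 20, 5]
mismatched_curve = [100, 90, 30, 10, 5, 4]
expected_tm = 67.5  # assumed Tm for a perfectly matched probe

for name, curve in (("matched", matched_curve), ("mismatched", mismatched_curve)):
    tm = estimate_tm(temps, curve)
    print(name, tm, probe_matches(tm, expected_tm))
```

In practice a finer temperature step and curve smoothing would likely be used; the sketch only shows how a lower than expected Tm flags a non-matching sequence.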
Molecular beacons can also be used to detect a genetic variant. This method makes use of a specifically engineered single-stranded oligonucleotide probe. The oligonucleotide is designed such that there are complementary regions at each end and a probe sequence located in between. This design allows the probe to take on a hairpin, or stem-loop, structure in its natural, isolated state. Attached to one end of the probe is a fluorophore and to the other end a fluorescence quencher. Because of the stem-loop structure of the probe, the fluorophore is in close proximity to the quencher, thus preventing the molecule from emitting any fluorescence. The molecule is also engineered such that only the probe sequence is complementary to the targeted genomic DNA sequence.
If the probe sequence of the molecular beacon encounters its target genomic DNA sequence during the assay, it will anneal and hybridize. Because of the length of the probe sequence, the hairpin segment of the probe will be denatured in favor of forming a longer, more stable probe-target hybrid. This conformational change permits the fluorophore and quencher to be free of their tight proximity due to the hairpin association, allowing the molecule to fluoresce.
If, on the other hand, the probe sequence encounters a target sequence with as little as one non-complementary nucleotide, the molecular beacon will preferentially stay in its natural hairpin state and no fluorescence will be observed, as the fluorophore remains quenched. The unique design of these molecular beacons allows for a simple diagnostic assay to identify SNPs at a given location. If a molecular beacon is designed to match a wild-type allele and another to match a mutant of the allele, the two can be used to identify the genotype of an individual. If only the first probe's fluorophore wavelength is detected during the assay, then the individual is homozygous for the wild type. If only the second probe's wavelength is detected, then the individual is homozygous for the mutant allele. Finally, if both wavelengths are detected, then both molecular beacons must be hybridizing to their complements and thus the individual must contain both alleles and be heterozygous.
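The two-beacon decision rule can be written out as a short Python sketch. The function name, the single shared detection threshold, and the "no call" outcome are assumptions added for illustration; the text itself only describes the three genotype outcomes.

```python
def call_genotype(wild_type_signal, mutant_signal, threshold):
    """Call a genotype from the fluorescence of two allele-specific beacons.

    A beacon counts as detected when its signal exceeds the threshold
    (a hypothetical, assay-specific value).
    """
    wild_type_detected = wild_type_signal > threshold
    mutant_detected = mutant_signal > threshold
    if wild_type_detected and mutant_detected:
        return "heterozygous"
    if wild_type_detected:
        return "homozygous wild type"
    if mutant_detected:
        return "homozygous mutant"
    # Neither beacon opened: the assay gives no information.
    return "no call"
```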
A microarray can also be used to detect genetic variants. Hundreds of thousands of probes can be arrayed on a small chip, allowing for many genetic variants or SNPs to be interrogated simultaneously. Because SNP alleles only differ in one nucleotide and because it is difficult to achieve optimal hybridization conditions for all probes on the array, the target DNA has the potential to hybridize to mismatched probes. This can be addressed by using several redundant probes to interrogate each SNP. Probes can be designed to have the SNP site in several different locations as well as containing mismatches to the SNP allele. By comparing the differential amount of hybridization of the target DNA to each of these redundant probes, it is possible to determine specific homozygous and heterozygous alleles.
Restriction fragment length polymorphism (RFLP) can be used to detect genetic variants and SNPs. RFLP makes use of the many different restriction endonucleases and their high affinity to unique and specific restriction sites. By performing a digestion on a genomic sample and determining fragment lengths through a gel assay it is possible to ascertain whether or not the enzymes cut the expected restriction sites. A failure to cut the genomic sample results in an identifiably larger than expected fragment implying that there is a mutation at the point of the restriction site which is rendering it protected from nuclease activity.
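As an illustration of the fragment-length logic, the Python sketch below performs an in-silico digestion. The recognition site GAATTC, cut after the first base (the EcoRI pattern), and the toy sequences are assumptions chosen for the example and are not taken from the present study.

```python
def digest(sequence, site, cut_offset):
    """Return fragment lengths after cutting at every occurrence of site.

    cut_offset is the number of bases into the site at which the cut falls.
    """
    fragments = []
    start = 0
    position = sequence.find(site)
    while position != -1:
        cut = position + cut_offset
        fragments.append(cut - start)
        start = cut
        position = sequence.find(site, position + 1)
    # Whatever remains after the last cut is the final fragment.
    fragments.append(len(sequence) - start)
    return fragments


intact = "AAAA" + "GAATTC" + "TTTT"    # site present: two fragments
mutated = "AAAA" + "GAGTTC" + "TTTT"   # site destroyed by one substitution
```

Here `digest(intact, "GAATTC", 1)` yields two fragments, whereas the mutated sequence stays as one larger fragment, which is the pattern a gel would reveal.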
PCR- and amplification-based methods can be used to detect genetic variants. For example, tetra-primer PCR employs two pairs of primers to amplify two alleles in one PCR reaction. The primers are designed such that the two primer pairs overlap at a SNP location but each matches perfectly to only one of the possible alleles. As a result, if a given allele is present in the PCR reaction, the primer pair specific to that allele will produce product, but the primer pair specific to the alternative allele will not. The two primer pairs can be designed such that their PCR products are of a significantly different length, allowing for easily distinguishable bands by gel electrophoresis, or such that they are differently labeled.
Primer extension can also be used to detect genetic variants. Primer extension first involves the hybridization of a probe to the bases immediately upstream of the SNP nucleotide followed by a 'mini-sequencing' reaction, in which DNA polymerase extends the hybridized primer by adding a base that is complementary to the SNP nucleotide. The incorporated base that is detected determines the presence or absence of the SNP allele. Because primer extension is based on the highly accurate DNA polymerase enzyme, the method is generally very reliable. Primer extension is able to genotype most SNPs under very similar reaction conditions making it also highly flexible. The primer extension method is used in a number of assay formats, and can be detected using e.g., fluorescent labels or mass spectrometry.
Primer extension can involve incorporation of either fluorescently labeled dideoxynucleotides (ddNTP) or fluorescently labeled deoxynucleotides (dNTP). With ddNTPs, probes hybridize to the target DNA immediately upstream of the SNP nucleotide, and a single ddNTP complementary to the SNP allele is added to the 3' end of the probe (the missing 3'-hydroxyl in the dideoxynucleotide prevents further nucleotides from being added). Each ddNTP is labeled with a different fluorescent signal, allowing for the detection of all four alleles in the same reaction. With dNTPs, allele-specific probes have 3' bases which are complementary to each of the SNP alleles being interrogated. If the target DNA contains an allele complementary to the 3' base of the probe, the target DNA will completely hybridize to the probe, allowing DNA polymerase to extend from the 3' end of the probe. This is detected by the incorporation of the fluorescently labeled dNTPs onto the end of the probe. If the target DNA does not contain an allele complementary to the probe's 3' base, the target DNA will produce a mismatch at the 3' end of the probe and DNA polymerase will not be able to extend from the 3' end of the probe.
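The ddNTP readout can be illustrated with a minimal Python sketch. It assumes that each detected label has already been decoded to the base of the incorporated ddNTP; the function name and the return convention are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def genotype_from_incorporated_bases(detected_bases):
    """Infer the template-strand genotype from the ddNTP bases that were added.

    Each incorporated ddNTP is complementary to the SNP nucleotide on the
    template, so the template allele is the complement of the detected base.
    Returns None when zero or more than two bases are detected.
    """
    alleles = sorted(COMPLEMENT[base] for base in set(detected_bases))
    if len(alleles) == 1:
        return alleles[0] * 2          # one signal: homozygous
    if len(alleles) == 2:
        return "".join(alleles)        # two signals: heterozygous
    return None
```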
The iPLEX® SNP genotyping method takes a slightly different approach, and relies on detection by mass spectrometer. Extension probes are designed in such a way that many different SNP assays can be amplified and analyzed in a PCR cocktail. The extension reaction uses ddNTPs as above, but the detection of the SNP allele is dependent on the actual mass of the extension product and not on a fluorescent molecule. This method is for low to medium-high throughput, and is not intended for whole genome scanning.
Primer extension methods are, however, amenable to high throughput analysis. Primer extension probes can be arrayed on slides allowing for many SNPs to be genotyped at once. Broadly referred to as arrayed primer extension (APEX), this technology has several benefits over methods based on differential hybridization of probes. Comparatively, APEX methods have greater discriminating power than methods using differential hybridization, as it is often impossible to obtain the optimal hybridization conditions for the thousands of probes on DNA microarrays (usually this is addressed by having highly redundant probes).
Oligonucleotide ligation assays can also be used to detect genetic variants. DNA ligase catalyzes the ligation of the 3' end of a DNA fragment to the 5' end of a directly adjacent DNA fragment. This mechanism can be used to interrogate a SNP by hybridizing two probes directly over the SNP polymorphic site, whereby ligation can occur if the probes are perfectly complementary to the target DNA. For example, two probes can be designed: an allele-specific probe which hybridizes to the target DNA so that its 3' base is situated directly over the SNP nucleotide, and a second probe that hybridizes to the template upstream (downstream in the complementary strand) of the SNP polymorphic site, providing a 5' end for the ligation reaction. If the allele-specific probe matches the target DNA, it will fully hybridize to the target DNA and ligation can occur. Ligation does not generally occur in the presence of a mismatched 3' base. Ligated or unligated products can be detected by gel electrophoresis, MALDI-TOF mass spectrometry or by capillary electrophoresis.
The 5'-nuclease activity of Taq DNA polymerase can be used for detecting genetic variants. The assay is performed concurrently with a PCR reaction and the results can be read in real-time. The assay requires forward and reverse PCR primers that will amplify a region that includes the SNP polymorphic site. Allele discrimination is achieved using FRET, and one or two allele-specific probes that hybridize to the SNP polymorphic site. The probes have a fluorophore linked to their 5' end and a quencher molecule linked to their 3' end. While the probe is intact, the quencher will remain in close proximity to the fluorophore, eliminating the fluorophore's signal. During the PCR amplification step, if the allele-specific probe is perfectly complementary to the SNP allele, it will bind to the target DNA strand and then get degraded by the 5'-nuclease activity of the Taq polymerase as it extends the DNA from the PCR primers. The degradation of the probe results in the separation of the fluorophore from the quencher molecule, generating a detectable signal. If the allele-specific probe is not perfectly complementary, it will have a lower melting temperature and not bind as efficiently. This prevents the nuclease from acting on the probe.
Fluorescence resonance energy transfer (FRET) detection can be used for detection in primer extension and ligation reactions where the two labels are brought into close proximity to each other. It can also be used in the 5'-nuclease reaction, the molecular beacon reaction, and the invasive cleavage reactions where the neighboring donor/acceptor pair is separated by cleavage or disruption of the stem-loop structure that holds them together. FRET occurs when two conditions are met. First, the emission spectrum of the fluorescent donor dye must overlap with the excitation spectrum of the acceptor dye. Second, the two dyes must be in close proximity to each other because energy transfer drops off quickly with distance. The proximity requirement is what makes FRET a good detection method for a number of allelic discrimination mechanisms.
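The steep distance dependence can be made concrete with the standard Förster relation, under which transfer efficiency falls with the sixth power of the donor-acceptor distance. The relation is not stated in the text above and is supplied here for illustration; the 5 nm Förster radius in the usage note is a hypothetical value, since the actual radius depends on the dye pair.

```python
def fret_efficiency(distance_nm, forster_radius_nm):
    """Transfer efficiency E = 1 / (1 + (r / R0)**6) for a donor/acceptor pair.

    forster_radius_nm (R0) is the distance at which efficiency is 50%.
    """
    if distance_nm < 0 or forster_radius_nm <= 0:
        raise ValueError("distance must be >= 0 and Forster radius > 0")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)
```

With an assumed radius of 5 nm, doubling the separation from 5 nm to 10 nm lowers the efficiency from 1/2 to 1/65, which is why cleavage or opening of a stem-loop gives a clear change in signal.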
A variety of dyes can be used for FRET, and are known in the art. The most common ones are fluorescein, cyanine dyes (Cy3 to Cy7), rhodamine dyes (e.g. rhodamine 6G), the Alexa series of dyes (Alexa 405 to Alexa 730), and the Atto series of dyes (Atto-Tec GmbH). Some of these dyes have been used in FRET networks (with multiple donors and acceptors). Optics for imaging all of these require detection from UV to near IR (e.g. Alexa 405 to Cy7). The Alexa series of dyes from Invitrogen cover the whole spectral range. They are very bright and photostable.
Example dye pairs for FRET labeling include Alexa-405/Alexa-488, Alexa-488/Alexa-546, Alexa-532/Alexa-594, Alexa-594/Alexa-680, Alexa-594/Alexa-700, Alexa-700/Alexa-790, Cy3/Cy5, Cy3.5/Cy5.5, and Rhodamine-Green/Rhodamine-Red. Fluorescent metal nanoparticles such as silver and gold nanoclusters can also be used (Richards et al. (2008) J Am Chem Soc 130:5038-39; Vosch et al. (2007) Proc Natl Acad Sci USA 104:12616-21; Petty and Dickson (2003) J Am Chem Soc 125:7780-81). Available filters, dichroics, multichroic mirrors and lasers can affect the choice of dye.
Kits
In certain embodiments, the present invention provides a kit for predicting, diagnosing, or prognosing interstitial lung disease in a human subject, the kit including (e.g. consisting essentially of) at least one probe or primer for detecting the presence or absence of at least one genetic variation. In certain embodiments, the at least one genetic variation includes a genetic variant of at least one of TOLLIP, SPPL2C, and MDGA2. In certain embodiments, the kit includes at least one primer or probe for detecting more than one genetic variant of TOLLIP and/or SPPL2C and/or MDGA2. In certain embodiments, the kit includes at least one probe or primer for detecting additional genetic variants diagnostic or predictive of risk for interstitial lung disease, e.g., a genetic variant of MUC5B, such as rs35705950. In some embodiments, the kit includes a probe or primer for detecting one or more SNPs selected from rs111521887, rs5743894, rs5743890, rs17690703, and rs7144383. The kit may include probes or primers for detecting rs111521887, rs5743894, rs5743890, rs17690703, and rs7144383 alone or in any combination. In certain embodiments, the kit may include additional primers or probes for detecting the presence or absence of rs35705950 and rs111521887, rs5743894, rs5743890, rs17690703, or rs7144383 in any combination. In certain embodiments, the kit includes at least one probe or primer for detecting one or more of the genetic variants listed in Fig. 7. The kit may include probes or primers for detecting the one or more genetic variants listed in Fig. 7 alone or in any possible combination of from two to 52 of the listed genetic variants. If the kit includes a probe or primer for detecting rs35705950, the kit also includes a probe or primer for detecting at least one additional genetic variant from the remaining 51 genetic variants listed in Fig. 7.
Claims directed to kits for predicting, diagnosing, or prognosing interstitial lung disease in a human subject "consisting essentially of" certain types of probes or primers are intended to capture kits that include probes or primers that are suitable primarily for detecting genetic variants associated with interstitial lung disease in humans, although the kits may also include additional probes or primers used as controls, for example, probes or primers for detecting housekeeping genes such as β-actin, tubulin, or glyceraldehyde-3-phosphate dehydrogenase. In this context, the use of the transitional phrase "consisting essentially of" is intended to exclude arrays containing thousands of probes, the vast majority of which are unrelated to interstitial lung disease. In certain embodiments, the kits may include buffers, enzymes, labels, and the like, for example, for use in isolating DNA or mRNA, generating cDNA, or for amplifying and/or detecting and/or sequencing specific SNPs.
In some embodiments, the kit includes (or consists essentially of) a nucleic acid primer capable of hybridizing to a genetic variant in the TOLLIP gene (e.g., a TOLLIP nucleic acid), SPPL2C gene (e.g., a SPPL2C nucleic acid), or MDGA2 gene (e.g., MDGA2 nucleic acid). In some embodiments, the genetic variant has been extracted from a human subject with an interstitial lung disease, or suspected of having an interstitial lung disease. In some embodiments, the genetic variant is an amplification product of DNA extracted from a human subject with an interstitial lung disease, or suspected of having an interstitial lung disease. In some embodiments, the interstitial lung disease is a pulmonary fibrotic condition.
In some embodiments, the kit includes a first nucleic acid probe (e.g., a labeled probe) capable of hybridizing to an amplification product of a genetic variant in the TOLLIP gene (e.g., a TOLLIP nucleic acid), SPPL2C gene (e.g., a SPPL2C nucleic acid), or MDGA2 gene (e.g., MDGA2 nucleic acid). In some embodiments, the kit includes a second nucleic acid probe capable of hybridizing to an amplification product of a genetic variant in the TOLLIP gene (e.g., a TOLLIP nucleic acid), SPPL2C gene (e.g., a SPPL2C nucleic acid), or MDGA2 gene (e.g., MDGA2 nucleic acid). In some embodiments, the second nucleic acid probe is capable of hybridizing to a different sequence than the first probe. In some embodiments, only one of the nucleic acid probes hybridizes to the variant nucleotide(s) (e.g., in the case of a SNP), while the other nucleic acid probe hybridizes to a nearby sequence. In some embodiments, the second probe is labeled, e.g., with a different label than the first probe. In some embodiments, the first nucleic acid probe is labeled with a first label, and the second nucleic acid probe is labeled with a second label, wherein the first and second label form a FRET pair (are capable of fluorescence resonance energy transfer) when hybridized to the genetic variant TOLLIP gene (e.g., a TOLLIP nucleic acid), SPPL2C gene (e.g., a SPPL2C nucleic acid), or MDGA2 gene (e.g., MDGA2 nucleic acid), or amplification product thereof.
In some embodiments, the kit includes (or consists essentially of) primers or at least one probe capable of detecting a genetic variant, e.g., as described above, depending on the detection method selected. In some embodiments, the kit includes primers or at least one probe capable of detecting a genetic variant in a region selected from the group consisting of 11p15.5, 14q21.3, and 17q21.31. In some embodiments, the kit includes primers or at least one probe capable of detecting at least one genetic variant in 11p15.5 (e.g., rs111521887, rs5743894, rs5743890, and rs35705950). In some embodiments, the kit includes primers or probes capable of detecting more than one (e.g., 2, 3, 4, 5, 5-10, 10-20, or more) genetic variant in 11p15.5 and 14q21.3 (e.g., rs7144383). In some embodiments, the kit includes primers or probes capable of detecting more than one (e.g., 2, 3, 4, 5, 5-10, 10-20, or more) genetic variant in 11p15.5 and 17q21.31 (e.g., rs17690703, a genetic inversion, or copy number variation). In some embodiments, the kit includes primers or probes capable of detecting more than one (e.g., 2, 3, 4, 5, or more) genetic variant in 14q21.3 and 17q21.31. In some embodiments, the kit includes primers or probes capable of detecting more than one (e.g., 2, 3, 4, 5, 5-10, 10-20, or more) genetic variant in 11p15.5, 14q21.3, and 17q21.31.
In some embodiments, the primers and/or probes are labeled, e.g., with fluorescent labels or FRET labels. In some embodiments, the primers and/or probes are unlabeled. In some embodiments, the kit includes primers and/or probes that detect both a variant allelic sequence and the dominant allelic sequence at a selected genetic variant site, e.g., with different labels, or designed to generate amplification or primer extension products with different masses.
In some embodiments, the kit further includes at least one control sample, e.g., sample(s) with dominant allele(s) at the selected genetic variation site(s), or sample(s) with variant allele(s) at the selected genetic variation site(s). In some embodiments, the kit includes a polymerase.
In vitro complexes
Provided herein are nucleic acid complexes, e.g., formed in in vitro assays to indicate the presence of a genetic variant sequence. One of skill will understand that a nucleic acid complex can also be formed to detect the presence of a dominant allelic sequence, depending on the design of the probe or primer, e.g., in assays to distinguish homozygous and heterozygous subjects.
In some embodiments, the complex comprises a first nucleic acid hybridized to a genetic variant nucleic acid, wherein the genetic variant nucleic acid is a genetic variant in a region selected from 11p15.5, 14q21.3, and 17q21.31. In some embodiments, the genetic variant nucleic acid is an amplification product. In some embodiments, the genetic variant nucleic acid is on genomic DNA, e.g., from a subject that has or is suspected of having an interstitial lung disease. In some embodiments, the first nucleic acid is an amplification product or a primer extension product. In some embodiments, the first nucleic acid is labeled. In some embodiments, the nucleic acid complex further comprises a second nucleic acid hybridized to the genetic variant nucleic acid. In some embodiments, the second nucleic acid is labeled, e.g., with a FRET or other fluorescent label. In some embodiments, the first and second nucleic acids form a FRET pair when hybridized to a genetic variant sequence.
In some embodiments, the genetic variant is in the TOLLIP gene (e.g., rs111521887, rs5743894, rs5743890). In some embodiments, the genetic variant is in the MDGA2 gene (e.g., rs7144383). In some embodiments, the genetic variant is in the SPPL2C gene (e.g., rs17690703, a genetic inversion, or copy number variation).
Further provided is an in vitro complex comprising a first nucleic acid probe (e.g., a labeled probe) hybridized to a genetic variant nucleic acid, wherein said genetic variant nucleic acid comprises a genetic variant TOLLIP, SPPL2C or MDGA2 gene sequence, wherein said genetic variant nucleic acid is extracted from a human subject with an interstitial lung disease or suspected of having an interstitial lung disease, or is an amplification product thereof. In some embodiments, the complex further comprises a second nucleic acid probe (e.g., labeled with a different label) hybridized to said genetic variant nucleic acid. In some embodiments, said first nucleic acid probe comprises a first label and said second nucleic acid probe comprises a second label, wherein said first and second label are capable of fluorescence resonance energy transfer.
In some embodiments, the complex further comprises an enzyme, such as a DNA polymerase (e.g., standard DNA polymerase or thermally stable polymerase such as Taq) or ligase.
Genetic variants associated with interstitial lung disease
The MUC5B and TOLLIP genes reside on the same genetic locus. Based on the analysis performed, the association of TOLLIP genetic variants with IPF susceptibility was found to be independent of the association with the previously reported MUC5B promoter SNP, rs35705950. Notably, the minor allele of the TOLLIP SNP, rs5743890_G, was discovered to be a "protective" allele, as it lowered susceptibility to IPF compared with controls. However, mortality analysis demonstrated that individuals who developed IPF despite having the protective rs5743890_G allele had increased mortality in two independent case series and in a meta-analysis. The MUC5B/TOLLIP region on chromosome 11p15.5 exemplifies the association patterns, disease susceptibility and outcomes.
The Toll interacting protein (Tollip), encoded by the TOLLIP gene, is known to be a critical regulator of Toll-like receptor (TLR)-mediated innate immune responses and the transforming growth factor-β (TGF-β1) signaling pathway. Tollip activates Myd88-dependent NF-κB to modulate TLR signaling and membrane trafficking; interacts with Smad7 to modulate intracellular trafficking and negatively regulate the TGF-β signaling pathway by degrading ubiquitinated TGF-β type 1 receptor; and interacts with caveolin-1 interacting protein in monocytes, regulating signaling in antigen-presenting cells to induce antigen-specific proliferation of T cells, B cells, or both. TOLLIP polymorphisms are involved in regulation of TLR2 and TLR4 and are associated with susceptibility to tuberculosis, atopic dermatitis, and sepsis, and TOLLIP is differentially hypomethylated in IPF lungs. Lastly, failure to upregulate TOLLIP expression in inflammatory bowel disease may lead to chronic inflammation.
The chromosome 17q21 region has been associated with Parkinson's, multiple sclerosis, Alzheimer's, androgenic alopecia, and, interestingly, with the response to inhaled corticosteroids in asthma and COPD. In the present study, it was discovered that the minor allele rs17690703_T in the 17q21.31 region was associated with decreased susceptibility for IPF development and also conferred increased mortality in InterMune, UChicago, and in the meta-analysis. The unique aspects of this region include a known inversion, referred to as H2, in a large region of conserved LD on the chromosome, which is positively selected in Europeans. There also appear to be a high number of copy number variants (CNVs) within this region, and it has been associated with a microdeletion syndrome. A critical span of 440 kb that partially or entirely involves five genes (CRHR1, IMP5 (SPPL2C), MAPT, STH, and KIAA1267) resides in the 17q21.31 region. A large number of variants in the region with significance in Stage 1 were discovered, with a focus on the top SNPs.
MDGA2, a novel region, resides on 14q21.3 and showed association with IPF susceptibility. MDGA2 is a paralog for ICAM, which has been recently demonstrated as a potential biomarker of IPF disease activity. The instant findings indicate the importance of this gene in IPF.
IPF is a heterogeneous disease and, by definition, is a diagnosis of exclusion. As such, misdiagnoses are possible, which might lead to a reduction in power. However, all subjects met currently accepted criteria for diagnosis as outlined by ATS/ERS/JRS/ALAT, with many having been vetted with core pathology and radiology, as in InterMune and ACE-IPF, as well as through participation in a variety of studies.
This discovery GWAS revealed novel genetic loci associated with IPF susceptibility. Furthermore, susceptibility alleles within these loci were discovered to be associated with mortality. Identification of common genetic variants in association with IPF provides insight into the manifestations of this complex disease process and may lead to earlier detection, more predictable prognosis, and personalized therapeutic strategies.
EXAMPLES
A three-stage association study was conducted, including a discovery GWAS for susceptibility to IPF in Stage 1, with the findings replicated in two independent case-control association studies (Stage 2 and Stage 3, respectively). Association with mortality was evaluated in three case series. A flowchart illustrating the strategic approach used is shown in Fig. 2.
IPF cases and controls of each stage
Three stages of IPF cases were collected and characterized by the conventional criteria.12-14 Stage 1 samples consisting of African-Americans (AA) and European-Americans (EA) were collected for the discovery phase of the genome-wide association study (GWAS), while Stages 2 and 3 consisting of only EA samples were collected for two independent replication studies (replication 1 and 2, respectively). All eligible subjects were at least 35 years of age and reported having symptoms of idiopathic interstitial pneumonia for at least 3 months. A high-resolution computed tomographic scan was required to show definite or probable idiopathic interstitial pneumonia in accordance with predefined criteria,14 and a surgical lung biopsy confirming UIP was obtained in 37.3% of subjects in the discovery GWAS stage. Subjects with clinically significant exposure to known fibrogenic agents or another cause of interstitial lung disease were excluded.
Stage 1 discovery GWAS IPF samples (n=633) were identified and clinically characterized at the University of Chicago (UChicago), University of Pittsburgh (UPittsburgh), via the Lung Tissue Research Consortium (LTRC), and from the Correlating Outcomes with biomedical Markers to Estimate Time-progression in IPF (COMET) study. Stage 2 samples (n=544) comprised additional independent IPF patients from UChicago, InterMune,3 Lung Transplant Outcomes Group (LTOG) cohort4 and LTRC. Stage 3 IPF cases (n=324) consisted of additional independent IPF patients from LTOG and Anticoagulant Effectiveness in Idiopathic Pulmonary Fibrosis Study (ACE-IPF).5 Fig. 8, Fig. 9, and Fig. 10 feature each study population.
All eligible subjects were > 35 years old and reported symptoms of idiopathic interstitial pneumonia for at least 3 months. A high-resolution computed tomographic scan was required for diagnosis of definite or probable idiopathic interstitial pneumonia in accordance with predefined criteria.6 A surgical lung biopsy was obtained in 37.3% of affected subjects in the discovery GWAS stage. Subjects with clinically significant exposure to known fibrogenic agents and those with other known cause of interstitial lung disease were excluded.
For Stage 1, data of unaffected European American (EA) subjects from dbGaP (n=1,442) were compiled with healthy subjects recruited from the University of Pittsburgh (n=103), to increase the available pool of subjects (n=1,545). A subset of controls matched one-on-one to cases by means of genome-wide genetic ancestry estimates was selected for downstream analysis.
EA controls for Stages 2 and 3 (n=687 and n=702, respectively) were collected from 2005 to 2012 as part of the Translational Research in the Department of Medicine Study (TRIDOM) at the University of Chicago. Institutional review boards at each institution approved this study and informed consent was obtained from all subjects. Summarized strategic methodology of the study and detailed clinical and demographic characteristics of all study stages are shown in Fig. 2 and Fig. 10, respectively.
Genotyping, imputation, and statistical analysis
Discovery Stage 1 genotyping was conducted using the Genome-Wide Human SNP 6.0 array (Affymetrix, Santa Clara, CA). Stages 2 and 3 genotyping was conducted using the iPLEX Gold™ Platform (Sequenom, San Diego, CA). Genotype imputation was performed with IMPUTE2 using European ancestry panel data from the 1000 Genomes Project as a reference. Association testing was performed using SNPTEST software (v2.3).7 Fifty-two SNPs selected in 19 loci showing an association with IPF (p<10⁻⁴) in Stage 1 were carried forward to Stage 2. As the selected SNPs with the lowest p-value in Stage 1 were all a result of imputation, their association was validated by genotyping using the iPLEX Gold™ Platform. Six SNPs in 3 loci achieving an overall p<5×10⁻⁸ (i.e. Stage 1 and 2 combined) were carried forward to Stage 3.
DNA quantity was checked using PicoGreen fluorometry. Samples were dispensed at 50 ng/μl in 96-well plates and hybridized to arrays following manufacturer's protocols. Samples for which fewer than 86% of the quality control (FQC) SNPs produced genotypes were rerun. Genotypes were recalled plate-by-plate in the study, including those downloaded from dbGaP, using the "crlmm" package, a new implementation of the Corrected Robust Linear Model with Maximum Likelihood Classification (CRLMM) algorithm, available through the Oligo package at Bioconductor.18, 19
Samples were excluded from the analysis if they failed any of several quality metrics: low call rate (below 97% or 93% for production plates with >35 samples or with <35 samples, respectively), incompatibility between reported gender and genetically determined gender, or incompatibility between reported race and genetically determined race. Samples were also checked for unexpected familial relationships using pairwise IBD estimation in PLINK.20 The total number of European-American IPF case and control samples passing all initial QC tests was 575 and 1,427 (1,340 of the available 1,442 subjects from dbGaP and 87 of the 103 subjects from University of Pittsburgh), respectively.
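The sample-level exclusion rules can be summarized in a short Python sketch. The function and parameter names are hypothetical, and because the text does not say how a plate of exactly 35 samples was handled, the sketch assumes it falls under the 93% threshold.

```python
def sample_passes_qc(call_rate, plate_size,
                     reported_sex, genetic_sex,
                     reported_ancestry, genetic_ancestry):
    """Apply the call-rate, sex-concordance and ancestry-concordance checks.

    call_rate is a fraction between 0 and 1. Plates with more than 35
    samples use the 97% threshold; smaller plates use 93% (plates of
    exactly 35 are treated as small, which is an assumption).
    """
    minimum_call_rate = 0.97 if plate_size > 35 else 0.93
    if call_rate < minimum_call_rate:
        return False
    if reported_sex != genetic_sex:
        return False
    return reported_ancestry == genetic_ancestry
```

The relatedness check by pairwise IBD estimation is a separate step and is not modeled here.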
To reduce the false-positive rate, inflated by spuriously small p-values, while having little impact on the p-values associated with true positive loci for heterogeneous human populations, controls were matched to cases on a one-on-one basis for race and ethnicity based on genetic ancestry.21 SMARTPCA software was used to select control individuals from a larger set of available controls with the first four principal components (PCAs) obtained from a subset of variants showing limited linkage disequilibrium (n=267,000). To do so, the distance between every case individual and control individual was defined as the Euclidean distance between the individuals in a space based on the first four principal components, where each axis was also multiplied by its corresponding eigenvalue. After pairwise matching of 575 cases and 1,427 controls, accounting for the first four PCAs, 542 cases and 542 genetically matched controls were retained for downstream analysis.
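A minimal Python sketch of eigenvalue-weighted matching follows. The distance definition comes from the paragraph above; the greedy closest-pair-first pairing and the `max_distance` cutoff are assumptions made for illustration, since the text does not specify how pairs were assigned or why some cases went unmatched.

```python
import math


def weighted_distance(a, b, eigenvalues):
    """Euclidean distance after scaling each principal-component axis by its eigenvalue."""
    return math.sqrt(sum((ev * (x - y)) ** 2
                         for x, y, ev in zip(a, b, eigenvalues)))


def match_controls(cases, controls, eigenvalues, max_distance):
    """Greedy one-to-one matching of cases to controls (hypothetical scheme).

    cases and controls map sample IDs to principal-component coordinates.
    Candidate pairs farther apart than max_distance are never matched.
    """
    candidates = []
    for case_id, case_pcs in cases.items():
        for control_id, control_pcs in controls.items():
            d = weighted_distance(case_pcs, control_pcs, eigenvalues)
            if d <= max_distance:
                candidates.append((d, case_id, control_id))
    candidates.sort()  # closest pairs are assigned first
    matched, used_controls = {}, set()
    for _, case_id, control_id in candidates:
        if case_id not in matched and control_id not in used_controls:
            matched[case_id] = control_id
            used_controls.add(control_id)
    return matched
```

Cases with no control inside the cutoff are simply left out of the returned mapping, mirroring the reduction from 575 cases to 542 matched pairs.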
Two tiers of filtering of control genotyping quality were performed using call rate (<95%) and Hardy-Weinberg Equilibrium (HWE) p-value (<10⁻³). An additional 1,367 variants were further removed for inconsistent allele differences with IMPUTE2 1000 Genomes Project panel data. Prior to imputation, SNPs with minor allele frequency (MAF) <5% were removed (a total of 349,801 were filtered based on QC and MAF), leaving a final number of 555,432 variants for further analysis and imputation.
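The variant-level filters can be sketched in Python as follows. The thresholds are those given above; the use of a one-degree-of-freedom chi-square test for HWE is an assumption (the text does not say whether a chi-square or an exact test was used), and the function names are hypothetical.

```python
import math


def minor_allele_frequency(n_aa, n_ab, n_bb):
    """Frequency of the rarer allele from the three genotype counts."""
    total = n_aa + n_ab + n_bb
    freq_a = (2 * n_aa + n_ab) / (2 * total)
    return min(freq_a, 1 - freq_a)


def hwe_p_value(n_aa, n_ab, n_bb):
    """Chi-square (1 df) test of Hardy-Weinberg proportions."""
    total = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * total)
    q = 1 - p
    expected = (total * p * p, 2 * total * p * q, total * q * q)
    if min(expected) == 0:
        return 1.0  # monomorphic site: nothing to test
    chi2 = sum((o - e) ** 2 / e
               for o, e in zip((n_aa, n_ab, n_bb), expected))
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


def keep_variant(n_aa, n_ab, n_bb, call_rate):
    """Keep a SNP only if it passes call rate, HWE and MAF thresholds."""
    return (call_rate >= 0.95
            and hwe_p_value(n_aa, n_ab, n_bb) >= 1e-3
            and minor_allele_frequency(n_aa, n_ab, n_bb) >= 0.05)
```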
Genome-wide SNP imputation was performed for the cleaned dataset to identify additional SNPs possibly showing associations. SHAPEIT23 software was used to estimate phased haplotypes from the directly observed genotype data. Haplotypes derived from a European ancestry panel, consisting of samples from CEU, FIN, GBR, IBS and TSI from the 1000 Genomes Project (February 2012 release), were used as a reference. Imputation was conducted using IMPUTE2. The inflation factor (λ) between cases and controls across all SNPs was 1.06. SNPTEST software (v2.3)24 was used to calculate p-values based on a one degree-of-freedom score test for a logistic regression which assumes that the allele effect on the genotype for each SNP is additive. The score test implemented in SNPTEST allows for genotypic uncertainty via missing data likelihood; therefore it is applicable to both imputed genotypic data (i.e. in Stage 1) and to directly genotyped data (i.e. all stages). P-values were calculated for each stage separately, for Stages 1 and 2 combined, and finally for a joint analysis with all stages combined as one sample. Model parameters were estimated with a random subset of 200 individuals before imputation on the entire dataset.
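For orientation, the inflation factor is conventionally computed as the median of the observed one-degree-of-freedom association statistics divided by the median expected under the null. The Python sketch below follows that convention; whether the reported value of 1.06 was obtained in exactly this way is an assumption, and the function name is hypothetical.

```python
import statistics

# Median of a chi-square distribution with one degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def inflation_factor(chi_square_statistics):
    """Genomic inflation factor: observed median statistic over the null median."""
    if not chi_square_statistics:
        raise ValueError("at least one test statistic is required")
    return statistics.median(chi_square_statistics) / CHI2_1DF_MEDIAN
```

A value close to 1 indicates that the bulk of test statistics behave as expected under the null, which is the purpose of the ancestry matching described earlier.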
Regions were selected for follow-up in Stage 2 if they had a SNP with an association p<10⁻⁴ in Stage 1. A minimum of 2 SNPs was selected from each region for Stage 2 genotyping. Where possible, the linkage disequilibrium (LD) between those two SNPs was low (r²<0.2), where one of them was the variant with direct genotyping data showing the lowest p-value, and the other was the variant with imputed data showing the lowest p-value. Based on these criteria, a total of 40 SNPs for 19 loci were selected (2 SNPs per locus except for the chr11/TOLLIP, chr17/SPPL2C, and chr7/MAD1L1 regions with 3 SNPs; for c rtlS H region with only 1 SNP).
In order to provide better coverage of genetic variants for the previously reported region on chromosome 11p15.5, containing TOLLIP and MUC5B, an additional set of tagging SNPs (tSNPs) was selected using the multiple-marker selection algorithm, haplotype r², included in TagIT 3.03 software.25 A set of 23 chr11/TOLLIP tSNPs was selected under previously described criteria26 from 380 European individuals (CEU, FIN, GBR, IBS and TSI) in the 1000 Genomes Project Consortium27. The common polymorphism of MUC5B (rs35705950) associated with familial and sporadic IPF cases was used as a positive control for genotyping quality and association. A total of 64 SNPs were compiled and submitted to Assay Design Suite (https://www.mysequenom.com/ToolsMassArray online design) for primer and probe design. Twelve of these SNPs failed during assay design and were discarded from the analysis. A list of the remaining 52 SNPs from the 19 regions is shown in Fig. 7 along with their association p-values and MAFs.
A subset of 6 SNPs (rs111521887, rs17690703, rs35705950, rs5743890, rs5743894, rs7144383) from 3 loci showing a statistically significant association (p-value<5x10⁻⁸) in the joint analysis of Stages 1 and 2, with the same direction of effect in the two stages, was compiled for Stage 3 replication (Fig. 11).
As the SNPs with the best p-value in the Stage 1 discovery GWAS were all a result of imputation, 541 of the 633 cases previously genotyped by the SNP array were compiled and genotyping was performed using the iPLEX Gold™ platform to validate the findings. Approximately 10% of the samples were genotyped by TaqMan™ allelic discrimination assays (Applied Biosystems) to monitor genotyping quality. Genotyping was blind to case-control status. Samples with discordant genotypes were discarded.
Linkage disequilibrium assessment
Linkage disequilibrium (LD) between SNPs in the MUC5B/TOLLIP region was measured using pairwise r² measures.8 The mode of inheritance for these SNPs (dominant, recessive) was determined by comparing the odds ratios of the heterozygous and at-risk homozygous genotypes. A regression-based conditional analysis of the interaction between MUC5B and TOLLIP SNPs on IPF susceptibility was implemented in the R statistical package.9
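As an illustration of the pairwise r² measure, the sketch below computes it from phased two-SNP haplotypes using the standard definition r² = D² / (pA(1-pA)pB(1-pB)), where D = pAB - pA·pB. The input encoding (0/1 alleles per haplotype) is an assumption made for the example; the cited method may estimate haplotype frequencies from unphased genotypes instead.

```python
def ld_r2(haplotypes):
    """Pairwise r-squared between two biallelic SNPs.

    `haplotypes` is a list of (a, b) tuples, one per chromosome, where a and b
    are 0/1 allele codes at the first and second SNP respectively.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r2 is undefined when either SNP is monomorphic")
    return d * d / denom

if __name__ == "__main__":
    print(ld_r2([(1, 1), (0, 0), (1, 1), (0, 0)]))  # alleles always co-occur
    print(ld_r2([(1, 1), (1, 0), (0, 1), (0, 0)]))  # alleles are independent
```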
TOLLIP Gene expression in IPF lung tissues
Gene expression profiling data of IPF lungs were obtained from the Lung Genomics Research Consortium (LGRC) website. A total of 67 IPF individuals had both genotype data for the SNPs associated with susceptibility and gene expression profiling data. The TOLLIP gene expression levels in these 67 samples were stratified into two groups according to presence or absence of the minor allele. Two-group comparison was performed using an unequal variance t-test.
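The unequal variance (Welch) t-test used for the two-group comparison can be sketched as follows. The example computes only the test statistic and the Welch-Satterthwaite degrees of freedom; obtaining a p-value additionally requires the t distribution, which is left to a statistics package. The function name and the expression values are hypothetical.

```python
import math
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    na, nb = len(group_a), len(group_b)
    # Squared standard errors of each group mean (sample variance / n).
    va = variance(group_a) / na
    vb = variance(group_b) / nb
    t = (mean(group_a) - mean(group_b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df

if __name__ == "__main__":
    # Hypothetical expression values for minor-allele carriers vs non-carriers.
    carriers = [1, 2, 3, 4, 5]
    non_carriers = [2, 4, 6, 8, 10]
    print(welch_t(carriers, non_carriers))
```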
Mortality analysis for individual loci
Three case series in Stages 1 and 2, with average follow-up between 22 and 70 months (Fig. 12), were subjected to Cox regression analyses for mortality using the SPSS package (SPSS Inc., Chicago, IL) on the three IPF susceptibility loci that showed an overall p<10⁻⁸ in Stages 1 and 2. Time "at risk" was defined as the interval between the date of enrollment in a given study and the date of the last follow-up, lung transplant, or death. Lung transplant patients (2%, 7%, and 25% in InterMune, UChicago and UPittsburgh, respectively) were censored at time of transplant from the analysis, as potential confounders of survival. Univariate and multivariate analyses, considering relevant demographic and clinical parameters in the models, were conducted as appropriate. A single aggregate result for each locus was obtained by means of a meta-analysis applying both fixed and random effect models10 as appropriate to account for the different available follow-up data among the case series studied.
Average follow-up data of 22 to 70 months was available for a subset of samples in 3 case series included in Stages 1 and 2 (Fig. 12). These case series were used for mortality analyses of the previously identified MUC5B promoter SNP and 5 novel SNPs within susceptibility loci that showed an overall association p<10⁻⁸ in Stages 1 and 2, assuming that the genotypic effects were additive. Logistic regressions were used initially to explore SNP effects comparing alive vs. dead patients. A more appropriate analysis of survival was then performed on the 5 novel SNPs only, utilizing time "at risk".
All transplanted cases were censored from these analyses in order to avoid the confounding factor associated with IPF mortality. Univariate and multivariate analyses, using models considering relevant demographic and clinical parameters (such as age, gender, tobacco history, forced vital capacity (FVC) percent predicted, diffusing capacity of carbon monoxide (DLCO) percent predicted, and recruitment center) were conducted. The heterogeneity of the Kaplan-Meier mortality curves as a function of genotypes for each SNP was assessed by the log-rank test. Hazard ratio (HR) estimates were obtained using Cox proportional hazard analyses. Schoenfeld residuals were used to assess the assumption of proportional hazards.
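To make the role of censoring concrete, the following is a minimal sketch of the Kaplan-Meier product-limit estimator, in which a transplanted or lost-to-follow-up patient leaves the risk set without counting as a death. The input format (parallel lists of times and event flags) and the example data are assumptions for illustration only.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimates.

    `times` are follow-up durations; `events` are 1 for death and 0 for a
    censored observation (e.g. transplant or end of follow-up).
    Returns a list of (time, survival probability) at each death time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving  # deaths and censored subjects both leave the risk set
    return curve

if __name__ == "__main__":
    # Hypothetical follow-up in months; 0 marks a censored patient.
    print(kaplan_meier([5, 10, 10, 20, 30], [1, 1, 0, 1, 0]))
```

Curves estimated this way for each genotype group are what the log-rank test compares.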
A single aggregate result for each locus was obtained with METASOFT by means of a meta-analysis. For that, both fixed and random effect models were applied, the latter corresponding to an optimized model to detect associations under heterogeneity, which was applied if heterogeneity between study samples was evident, as indicated by the significance of the Cochran's Q statistic.
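The fixed-effect part of such a meta-analysis, together with Cochran's Q, can be sketched with inverse-variance weighting. This is a generic textbook sketch, not METASOFT's implementation, and it does not cover the optimized random-effects model mentioned above. Inputs are per-study log hazard ratios and their standard errors (hypothetical names). For three studies Q has 2 degrees of freedom, for which the chi-square tail probability is simply exp(-Q/2); with Q=9.54 this gives approximately 0.0085, the heterogeneity p-value reported in this document for rs17690703_T.

```python
import math

def fixed_effect_meta(betas, ses):
    """Inverse-variance fixed-effect estimate, its standard error, and Cochran's Q."""
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    q = sum(w * (b - beta) ** 2 for w, b in zip(weights, betas))
    return beta, se, q

def q_pvalue_three_studies(q):
    """Chi-square tail probability with 2 df (valid only for exactly three studies)."""
    return math.exp(-q / 2.0)

if __name__ == "__main__":
    # Hypothetical log hazard ratios and standard errors for three case series.
    beta, se, q = fixed_effect_meta([0.2, 0.5, 0.9], [0.1, 0.2, 0.3])
    print(beta, se, q, q_pvalue_three_studies(q))
```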
Sample characteristics
Demographic and clinical characteristics of IPF patients and controls in each stage are shown in Fig. 9 and Fig. 10. As in other studies, cases in the discovery stage had a wide range of disease severity and age. The Stage 2 patients were a blend of cases with milder disease (InterMune) and more severe disease undergoing lung transplantation (LTOG), yielding a group very similar to Stage 1 in overall physiologic severity as assessed by forced vital capacity (FVC) and diffusing capacity for carbon monoxide (DLCO) (Fig. 10). The Stage 3 patients had more severe disease and were derived from the LTOG and ACE-IPF studies. However, all IPF cases met diagnostic criteria16 and were of similar age and gender. Characteristics of cases with follow-up data for survival analysis are shown in Fig. 12.
Genome-wide association study, replication, and regional association
After completion of sample quality control and genotype filtering, 542 of the 633 cases and 542 genetically matched controls selected from the available 1,545 pooled controls were retained for Stage 1. A total of 555,432 high quality genotyped variants were used for imputation, which resulted in 10,601,812 best imputed common variants with minor allele frequency (MAF) > 5%. The GWAS was then conducted using the genotyped and imputed SNPs. The inflation was modest, with a test statistic of λ=1.06, indicating insignificant confounding of the results by population stratification (Fig. 3).
A total of 19 genomic loci with an association (p-value<10⁻⁴) were identified from the Stage 1 discovery GWAS. Fifty-two SNPs were compiled from the combination of genotyped, imputed, and tSNPs. Fig. 7 summarizes annotations for these loci, allele frequency in reference populations (CEU, EUR), IPF cases, controls, as well as their association p-values with susceptibility to IPF.
Directly genotyped SNPs in Stage 2 nominally replicated many of the associations with IPF susceptibility detected in the Stage 1 GWAS. Five imputed SNPs and the previously identified MUC5B promoter SNP reached genome-wide significance levels (p-value<4.2 x 10⁻⁸) in a joint analysis of Stages 1 and 2. These six SNPs were re-genotyped in Stage 1 samples and the association confirmed. Fig. 1 highlights the loci chr11p15.5, containing SNPs of TOLLIP (rs111521887, rs5743894, rs5743890) and MUC5B (rs35705950); chr17q21.31 of SPPL2C (rs17690703); and chr14q21.3 of MDGA2 (rs7144383).
In Stage 3, the association of four of the SNPs in two novel loci (ch11p15.5/TOLLIP and ch17q21.31/SPPL2C) was replicated, as well as the association of the MUC5B promoter SNP, previously reported in an independent study,11 with IPF susceptibility. Each of them had an overall combined p<10⁻⁹, showing effects in the same direction across all single stages (i.e. allele rs35705950_T confers risk for IPF, while alleles rs5743890_G and rs17690703_T protect from IPF) (Fig. 13). Regional associations of the genotyped and imputed SNPs at the ch11p15.5 and ch17q21.31 loci are shown in Fig. 4: (A) the ch11p15.5/MUC5B/TOLLIP locus and (B) the ch17q21.31/SPPL2C locus, as defined by the positions of SNPs showing linkage disequilibrium with the lead SNP rs5743894 (A; p=2.2 x 10⁻⁶) and SNP rs17690703 (B; p=4.9 x 10⁻⁶), respectively. Disease associations as indicated by -log10 p-values are plotted against chromosomal positions. Diamonds and circles represent individual SNPs of the GWA screen using genotyped and imputed data, respectively. Colored diamonds indicate SNP data obtained by the analysis of 542 IPF cases and 542 controls. Additional tSNPs selected for better coverage are included. Associations were assessed assuming recessive and additive modes of inheritance for the MUC5B/TOLLIP locus and the SPPL2C locus, respectively. Levels of linkage disequilibrium (r²) with the best-associated SNP (red diamonds) are color-coded. Blue lines indicate recombination fractions as estimated from the European panel sample. Horizontal arrows mark structural human genes as annotated by Human Genome Build 37.3/hg19 of the UCSC (Genome Bioinformatics Group, University of California, Santa Cruz). Symbols, position and direction of each gene within the loci are shown at the bottom of the plot.
In the ch11p15.5 locus, the r² values between the MUC5B promoter SNP, rs35705950, and the TOLLIP SNPs (rs111521887, rs5743894, and rs5743890) were 0.07, 0.16, and 0.01, respectively. These low levels of LD indicate that the signals of association for the TOLLIP SNPs are independent from MUC5B (Fig. 4A). Moreover, the mode of effect for the MUC5B SNP (dominant) was different from that for the TOLLIP SNPs (additive or recessive), providing additional evidence that these are independent signals. Lastly, in a conditional regression-based analysis, genotypes were combined according to the mode of inheritance and it was found that, while the MUC5B/rs35705950 SNP showed the strongest signal (p=2x10⁻¹⁶), the TOLLIP/rs111521887/rs5743894/rs5743890 SNPs remained associated with IPF (p=0.05).
Relationship between presence of susceptibility alleles by genotype and survival in IPF case series
Enrollment criteria in the InterMune study skewed patients towards better pulmonary function as assessed by FVC (71.56±12.68 percent predicted), and less heterogeneity of disease severity as assessed by a lesser standard deviation on lung function than in the UChicago study (65.17±18.29) or the UPittsburgh study (65.27±19.72). Also, InterMune had a shorter average follow-up period (22 months) in survivors, than UChicago (40 months) or UPittsburgh (70 months) (Fig. 12). Since the follow-up time varied widely in each IPF case series, it was decided to evaluate the novel susceptibility alleles in association with mortality, both separately and jointly through a meta-analysis.
Three SNPs were associated with mortality in an initial logistic regression analysis of the overall case series (Fig. 14). Univariate Cox regression analysis in InterMune, UChicago, UPittsburgh as well as in the meta-analysis further demonstrated that the novel risk alleles for susceptibility in the 11p15.5/TOLLIP and 17q21.31/SPPL2C loci were associated with protection from IPF mortality (Fig. 1 and Fig. 15). Briefly, allele rs5743890_G was associated with increased mortality in UChicago (p=0.008) and in UPittsburgh (p=0.025). Similarly, allele rs17690703_T was associated with increased mortality in InterMune (p=0.044) and in UChicago (p=0.030). Meta-analysis of the 3 case series sustained associations with mortality (p=0.034 for rs17690703_T, and p=0.0009 for rs5743890_G) (Fig. 15). Notably, the meta-analysis of rs17690703_T with increased mortality suggested significant study heterogeneity among the three case series (Cochran's Q-value=9.54, p=0.0085). Multivariate analyses adjusting for recorded covariates (i.e. age, gender, tobacco history, FVC, DLCO, at each recruitment center) that maintained p-value<0.1 in regression models did not appreciably change these findings (Fig. 16). Results of additional analyses pertaining to survival are presented in Fig. 17, Fig. 18, and Fig. 19.
The SPPL2C variant rs17690703 failed to meet significance only after adjustment for disease severity (p=0.06). This is highly suggestive that the region may have a relationship to survival. Intronic variants are unlikely to be causal. In fact, this variant might actually be a tag for an altogether different gene within the H2 inversion. Because H2 is rare among individuals of African (6%) and Asian (1%) ancestry, and the IPF cohort was overwhelmingly composed of individuals of European ancestry (EA), where H2 occurs in approximately 20%, further evaluation of the role of H2 focused on an EA group. H2-specific SNPs tag the inversion, and are strongly correlated with, but incompletely linked to, the SPPL2C variant (rs17690703) (r² = 0.76). Three SNPs that tag H2 (rs916793, rs2902662, rs17651213) were included on the Affymetrix 6.0 GeneChip® (Affymetrix, Santa Clara, CA) used in the GWAS. Several proxy SNPs in complete linkage disequilibrium (LD) (r² = 1) with SNPs that tag H2 were also identified. Included in this analysis were 120 EA individuals from the University of Chicago cohort for whom mortality and genotype data were available. Of this group, 28.3% (n=34) carried an H2 haplotype, a 40% increase over the general population estimate. Of these 34 patients, 30 (88%) were heterozygous and 4 (12%) homozygous for the inversion. Assignment of H2 status was based on the presence of all 3 SNPs that tag H2 (rs916793, rs2902662, rs17651213). This method allowed H2 assignment to all but 3 patients in this cohort. The addition of a proxy SNP (rs199448) allowed H2 status to be determined for the 3 remaining patients. These data suggest that presence of an H2 haplotype increases susceptibility to IPF.
To perform the survival analysis, the cohort of 120 EA individuals was then stratified based on H2 (absent vs. present) and SPPL2C (wild-type (WT) vs variant (Var)) status. Inclusion of SPPL2C in the stratification is necessary given the strong correlation between the two variants and potential confounding by SPPL2C. Survival analysis for each group showed a statistically significant difference (p=0.04) in mortality risk between the 4 groups (Fig. 5). The vast majority of patients belonged to either the H2(-)/SPPL2C-WT or the H2(+)/SPPL2C-Var group, making it difficult to draw a conclusion about the two smaller groups. When comparing one group to another, the statistical significance was lost. Nevertheless, these data suggest that H2 and SPPL2C contribute to mortality risk independently.
The presence of a common variant associated with IPF, its location within an inversion, the independence of chromosome 17 from chromosome 11 and the possibility of variants related to survival in IPF indicate that additional sequencing may facilitate identification of causal or regulatory variants within the region.
A barrier inherent in the large amount of data generated by next generation sequencing of genetic regions involves methods to evaluate uncommon or rare variants. However, the importance of regions and the likelihood of additional uncommon and rare variants can be discovered by using aggregating or collapsing methods within regions. Indeed, regions with common variants have a greater number of uncommon or rare variants as well. One approach using the fundamentals of a logistic regression involves an L1-regularized regression to accommodate a large number of variants. The Lasso method is a shrinkage and selection method for linear regression. It minimizes the usual sum of squared errors, with a bound on the sum of the absolute values of the coefficients. It has connections to soft-thresholding of wavelet coefficients, forward stagewise regression, and boosting methods. It is recognized that the power to detect true rare variants is limited by the number of cases (n=542) analyzed on the array. A preliminary analysis based on aggregating the common and uncommon variants by region, ranking them by p values at 10⁻³ or smaller to yield expected rates of variants beyond that of the populace in general, was therefore conducted. The 542 cases versus controls were examined using a multi-variant genetic association test of functional features of the genome by LASSO. This represents 37,000 functional features, which include but are not limited to protein coding regions as well as known lincRNA and miRNA, etc. The genome was analyzed in 5 MB increments or "clusters" where, 4.6 represents chromosome 4 and the 6 would then be cluster 35-30MB along that chromosome. Using 5x10⁻³ as a cutoff for significance, there were over 100 regions of interest. In Table X are the top 6 regions. Not surprisingly, MUC5B and TOLLIP in the same cluster were ranked second. Chromosome 17 actually demonstrated multiple clusters of interest. The preliminary data demonstrate several additional loci not previously identified, while also emphasizing the importance of the 17q21.31 and 11p15.5 regions, as well as other regions. This analysis demonstrates the ability to handle complex datasets of uncommon and rare variants to generate novel discoveries.
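The shrinkage-and-selection behaviour of the Lasso described above can be illustrated with a small coordinate-descent sketch built on the soft-thresholding operator. The sketch solves the penalized (Lagrangian) form for linear regression, minimizing (1/2n)·Σ(y - Xβ)² + λ·Σ|β|, which is equivalent to the bounded form for a suitable bound. It is an illustration only, not the multi-variant test used in the analysis, and the function names and toy data are hypothetical.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Lasso for linear regression (no intercept) by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # mean squared value of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            rho /= n
            z /= n
            beta[j] = soft_threshold(rho, lam) / z if z > 0 else 0.0
    return beta

if __name__ == "__main__":
    # Toy data: the first feature has a strong effect, the second a negligible one.
    X = [[1, 0], [0, 1], [-1, 0], [0, -1]]
    y = [2.0, 0.1, -2.0, -0.1]
    print(lasso_coordinate_descent(X, y, lam=0.1))
```

In the toy data the weak second coefficient is set exactly to zero while the strong first coefficient is only shrunk, which is the selection property that makes the method usable with very large numbers of variants.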
To address the issue that variants within a region could exert an effect in opposite directions, a unique dataset with survival cohorts was used that allows performance of linear regression analysis of each individual variant within the region to assess the direction of effect and assignment of an additive or subtractive model for multiple variants within a region. In fact, TOLLIP SNPs demonstrated this phenomenon, in that rs111521887 (G for T) and rs5743894 (G for T) were associated with increased susceptibility to IPF, while rs5743890 (G for T) was associated with decreased susceptibility, or a protective effect against developing IPF. However, all three SNPs seem to exert an effect reducing gene expression levels of TOLLIP in lung tissue, again arguing for a better understanding of causal or regulatory SNPs.
To further the integration of multiple genetic markers with clinical parameters, the two published SNPs in 11p15.5 were examined to determine if there was an interaction. The intersection of these two independent SNPs in TOLLIP and MUC5B demonstrated only a weak interaction, with an r² of 0.009 by linear regression. The relationship with survival therefore appears to be additive in preliminary data (Fig. 6A). Initial results indicated that the association with mortality for SPPL2C moved to only trend levels (p=0.06) after adjusting for severity of illness, with a modest hazard ratio of 1.3. However, taking into account information regarding the H1/H2 status and its influence, it is more than plausible that other SNPs in the region will carry a greater and significant hazard ratio. Therefore SPPL2C was incorporated into a risk index using a multidimensional approach to collapse categories down to 4 groups (Fig. 6B). An analysis using a weighted sum of risk index alleles across the SNPs, where the Weighted Personalized Gene Risk Score (WPGRS) is obtained by multiplying the logHR by the number of risk alleles by genotype across 3 SNPs, gives 17 categories. The unadjusted Cox regression model gave HR=6.51 (2.91-14.55), p=5.02x10⁻⁶ and the adjusted Cox regression gave HR=6.60 (2.71-16.09), p=3.23x10⁻⁵ (adjusted for age, sex, FVC, DLCO, study center), demonstrating the power of this approach. The identification of causal SNPs in the TOLLIP or SPPL2C region is expected to increase the HR.
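The weighted score described above can be read as a sum, over SNPs, of log(HR) multiplied by the number of risk alleles carried (0, 1, or 2). A minimal sketch under that reading follows. The SNP labels and hazard ratios in the example are placeholders chosen for illustration, not values from this document, and how alleles are oriented as "risk" alleles is left to the caller.

```python
import math

def weighted_risk_score(risk_allele_counts, hazard_ratios):
    """Weighted sum of risk alleles: sum over SNPs of log(HR) * allele count.

    `risk_allele_counts` maps SNP label -> number of risk alleles (0, 1, or 2).
    `hazard_ratios` maps SNP label -> per-allele hazard ratio for that SNP.
    """
    score = 0.0
    for snp, count in risk_allele_counts.items():
        if count not in (0, 1, 2):
            raise ValueError(f"invalid risk allele count for {snp}: {count}")
        score += math.log(hazard_ratios[snp]) * count
    return score

if __name__ == "__main__":
    # Placeholder per-allele hazard ratios for three hypothetical SNPs.
    hrs = {"snpA": 2.0, "snpB": 1.5, "snpC": 1.2}
    patient = {"snpA": 2, "snpB": 1, "snpC": 0}
    print(weighted_risk_score(patient, hrs))
```

Because each of the three SNPs contributes one of three weighted values, distinct genotype combinations can share or differ in score, and the resulting scores can be grouped into categories before being entered into a Cox model.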
Novel genetic loci associated with IPF susceptibility.
In the joint analysis, two loci (ch11p15.5 and ch17q21.31) showed clear evidence of replication with effects in the same direction as in the Stage 1 discovery GWAS and genome-wide significance levels of p<10⁻⁸. Association of the genotyped and imputed SNPs at the ch11p15.5 and ch17q21.31 loci is shown in Figures S2A and S2B, respectively. SNP rs35705950 on locus chr11p15.5/MUC5B has been firmly implicated in association with IPF.17 Notably, three novel SNPs were revealed on the same locus, located in the intronic regions of the TOLLIP gene, which were associated with IPF (rs111521887_G, odds ratio (OR)=1.48, 95%CI=1.32-1.66, p=2.2x10⁻¹²; rs5743894_G, OR=1.49, 95%CI=1.33-1.68, p=1.35x10⁻¹²; rs5743890_G, OR=0.61, 95%CI=0.52-0.71, p=3.43x10⁻¹¹) (Fig. 13). Subsequent logistic regression analyses conditioned on the marker SNPs in ch11p15.5 revealed that these TOLLIP SNPs were not in LD with the MUC5B SNP, rs35705950. The r² values were 0.07, 0.16, and 0.01 between rs35705950 and rs111521887, rs5743894, and rs5743890, respectively. These data indicated that the signals of association for these three SNPs were not correlated with rs35705950. Additionally, the mode of inheritance for the MUC5B SNP (dominant) is different from that for the TOLLIP SNPs (additive or recessive), adding to the evidence for independent signals. Lastly, genotypes were combined according to the mode of inheritance to identify the underlying genetic mode and perform a joint conditional analysis of rs35705950 and rs111521887: the MUC5B SNP shows the strongest signal (p<2x10⁻¹⁶), but the TOLLIP SNP remains associated (p=0.05).
The second novel locus, which is located on chromosome 17q21.31, was indicated by imputation and supported by physical genotyping of SNP rs17690703_T (OR=0.70, 95%CI=0.62-0.79, p=5.70x10⁻⁹) (Fig. 13).
For the third novel locus on chromosome 14q21.3, replication of SNP rs7144383 was achieved in an independent case-control association study after imputation of the 1000 Genomes Project data, demonstrating an OR=1.57, 95%CI=1.18-2.08, p=3.50x10⁻⁸ in the joint analysis. In a joint analysis of Stage 3 data along with data from the two previous stages, this SNP maintained a suggestive association (OR=1.44, 95%CI=1.23-1.69, p=3.7x10⁻⁶) (Fig. 13).
The following embodiments are included:
1. A method of determining whether a human subject has or is at risk of developing an interstitial lung disease, the method comprising detecting whether the genome of the subject comprises a genetic variant of at least one of TOLLIP, SPPL2C, and MDGA2 and determining whether the subject has or is at risk of developing an interstitial lung disease, the presence of the genetic variant indicating that the subject has or is at risk of developing the interstitial lung disease.
2. The method of embodiment 1, wherein the method comprises detecting whether the genome of the subject comprises a genetic variant of TOLLIP.
3. The method of embodiment 1 or embodiment 2, wherein the method comprises detecting whether the genome of the subject comprises a genetic variant of SPPL2C.
4. The method of any one of embodiments 1-3, wherein the method comprises detecting whether the genome of the subject comprises a genetic variant of MDGA2.
5. The method of any one of embodiments 1-4, further comprising detecting whether the genome of the subject comprises a genetic variant of MUC5B.
6. The method of any one of embodiments 1-5, wherein the method comprises detecting whether the genome of the subject comprises one or more genetic variants having a single nucleotide polymorphism selected from the group consisting of rs111521887, rs5743894, rs5743890, rs17690703, and rs7144383.
7. The method of embodiment 6, wherein the method comprises detecting whether the genome of the subject comprises the genetic variant having the single nucleotide polymorphism rs111521887.
8. The method of embodiment 6 or 7, wherein the method comprises detecting whether the genome of the subject comprises the genetic variant having the single nucleotide polymorphism rs5743894.
9. The method of any one of embodiments 6-8, wherein the method comprises detecting whether the genome of the subject comprises the genetic variant having the single nucleotide polymorphism rs5743890.
10. The method of any one of embodiments 6-9, wherein the method comprises detecting whether the genome of the subject comprises the genetic variant having the single nucleotide polymorphism rs17690703.
11. The method of any one of embodiments 6-10, wherein the method comprises detecting whether the genome of the subject comprises the genetic variant having the single nucleotide polymorphism rs7144383.
12. The method of any one of embodiments 6-11, further comprising detecting whether the genome of the subject comprises a genetic variant having a single nucleotide polymorphism rs35705950.
13. The method of any one of embodiments 1-12, wherein the interstitial lung disease is a fibrotic interstitial lung disease.
14. The method of embodiment 13, wherein the interstitial lung disease is idiopathic pulmonary fibrosis or familial interstitial pneumonia.
15. A method of prognosing an interstitial lung disease in a human subject, the method comprising detecting whether the genome of the subject comprises a genetic variant of TOLLIP or SPPL2C and determining a prognosis for the subject, the presence of the genetic variant being prognostic of increased or decreased survival.
16. The method of embodiment 15, wherein the method comprises detecting whether the genome of the subject comprises a genetic variant of TOLLIP.
17. The method of embodiment 15 or 16, wherein the method comprises detecting whether the genome of the subject comprises a genetic variant of SPPL2C.
18. The method of any one of embodiments 15-17, further comprising detecting whether the genome of the subject comprises a genetic variant of MUC5B.
19. The method of any one of embodiments 15-18, wherein the genetic variant has at least one single nucleotide polymorphism selected from the group consisting of rs17690703 and rs5743890, and wherein the single nucleotide polymorphism is predictive of decreased survival.
20. The method of any one of embodiments 15-19, wherein the genome of the subject comprises the single nucleotide polymorphism rs35705950, and wherein the single nucleotide polymorphism is predictive of increased survival.
21. The method of any one of embodiments 15-20, wherein the interstitial lung disease is a fibrotic interstitial lung disease.
22. The method of embodiment 21, wherein the interstitial lung disease is idiopathic pulmonary fibrosis or familial interstitial pneumonia.
23. A method of detecting the presence or absence of at least one genetic variant in a human subject, the method comprising: detecting the presence or absence of at least one genetic variant of at least one of TOLLIP, SPPL2C, and MDGA2 in a sample from the subject.
24. The method of embodiment 23, wherein the at least one genetic variant includes a genetic variant of TOLLIP.
25. The method of embodiment 23 or embodiment 24, wherein the at least one genetic variant includes a genetic variant of SPPL2C.
26. The method of any one of embodiments 23-25, wherein the at least one genetic variant includes a genetic variant of MDGA2.
27. The method of any one of embodiments 23-26, further comprising testing the sample for a genetic variant of MUC5B.
28. The method of any of embodiments 23-27, wherein the at least one genetic variant includes at least one of the genetic variants listed in Fig. 7.
29. The method of embodiment 28, wherein the at least one genetic variant includes one or more of a single nucleotide polymorphism selected from the group consisting of rs111521887, rs5743894, rs5743890, rs17690703, and rs7144383.
30. The method of embodiment 29, wherein the at least one genetic variant includes rs111521887.
31. The method of embodiment 29 or 30, wherein the at least one genetic variant includes rs5743894.
32. The method of any one of embodiments 29-31, wherein the at least one genetic variant includes rs5743890.
33. The method of any one of embodiments 29-32, wherein the at least one genetic variant includes rs17690703.
34. The method of any one of embodiments 29-33, wherein the at least one genetic variant includes rs7144383.
35. The method of any one of embodiments 29-34, further comprising testing the sample for the genetic variant rs35705950.
36. The method of any one of embodiments 22-35, wherein the subject has or is suspected of having or is at risk for developing an interstitial lung disease.
37. The method of embodiment 36, wherein the interstitial lung disease is a fibrotic interstitial lung disease or familial interstitial pneumonia.
38. The method of embodiment 37, wherein the interstitial lung disease is idiopathic pulmonary fibrosis.
39. A method of detecting the presence or absence of at least two genetic variants in a human subject having or suspected of being at risk for developing an interstitial lung disease, the method comprising: detecting the presence or absence of at least two of the genetic variants listed in Fig. 7 in a sample from the subject.
40. The method of embodiment 39, wherein the at least two genetic variants includes from two to 52 of the genetic variants listed in Fig. 7.
41. The method of embodiment 40, wherein the at least two genetic variants includes from two to 44 of the genetic variants listed in Fig. 11.
42. A method of testing for interstitial lung disease in a human subject, the method comprising: detecting a level of TOLLIP gene expression in a sample from the subject, a low level of TOLLIP gene expression relative to a control being indicative of interstitial lung disease.
43. The method of embodiment 42, wherein the level of gene expression is detected by measuring directly or indirectly TOLLIP mRNA.
44. The method of embodiment 42, wherein the level of gene expression is detected by measuring Tollip protein.
45. A method of treating a human subject having an interstitial lung disease, the method comprising: detecting a level of TOLLIP expression according to any one of embodiments 42-44; and if the subject has a low level of TOLLIP expression relative to a control, administering to the subject an amount of a Tollip agonist, Tollip, or a genetic construct expressing TOLLIP effective to treat the interstitial lung disease.
46. A kit for predicting, diagnosing, or prognosing interstitial lung disease in a human subject, the kit consisting essentially of: at least one probe or primer for detecting the presence or absence of at least one genetic variation in at least one of TOLLIP, SPPL2C, and MDGA2.
47. The kit of embodiment 46, wherein the at least one probe or primer includes probes or primers for detecting at least one genetic variation in TOLLIP.
48. The kit of embodiment 46 or 47, wherein the at least one probe or primer includes probes or primers for detecting at least one genetic variation in SPPL2C.
49. The kit of any one of embodiments 46-48, wherein the at least one probe or primer includes probes or primers for detecting at least one genetic variation in MDGA2.
50. The kit of any one of embodiments 46-49, further comprising at least one probe or primer for detecting at least one genetic variation in MUC5B.
51. The kit of any one of embodiments 46-50, wherein the genetic variation includes at least one of rs111521887, rs5743894, rs5743890, rs17690703, rs7144383, and rs35705950.
52. The kit of any one of embodiments 46-51, wherein the at least one probe or primer includes at least one probe or primer for detecting one or more of the genetic variations listed in Fig. 7.
53. A kit for predicting, diagnosing, or prognosing interstitial lung disease in a human subject, the kit comprising: at least one probe or primer for detecting the presence or absence of at least two genetic variations selected from the genetic variations listed in Fig. 7.
54. The kit of embodiment 53, wherein the kit comprises probes and/or primers for detecting the presence or absence of from two to 52 of the genetic variations listed in Fig. 7.
55. The kit of embodiment 54, wherein the kit comprises probes and/or primers for detecting the presence or absence of from two to 44 of the genetic variations listed in Fig. 11.
56. A method of determining whether a human subject has or is at risk of developing an interstitial lung disease, the method comprising detecting whether the genome of the subject comprises at least two genetic variants selected from the group of variants listed in Fig. 7 and determining whether the subject has or is at risk of developing an interstitial lung disease, the presence of the genetic variant indicating that the subject has or is at risk of developing the interstitial lung disease.
57. The method of embodiment 56, wherein the at least two genetic variants includes from two to 52 of the genetic variants listed in Fig. 7.
58. The method of embodiment 57, wherein the at least one genetic variant includes from two to 44 of the genetic variants listed in Fig. 11.
59. The method of any one of embodiments 56-58, wherein the interstitial lung disease is a fibrotic interstitial lung disease.
60. The method of embodiment 59, wherein the interstitial lung disease is idiopathic pulmonary fibrosis or familial interstitial pneumonia.
61. A method of prognosing an interstitial lung disease in a human subject, the method comprising detecting whether the genome of the subject comprises at least two of the genetic variants listed in Fig. 7 and determining a prognosis for the subject, the presence of the genetic variant gene being prognostic of increased or decreased survival.
62. The method of embodiment 61, wherein the interstitial lung disease is a fibrotic interstitial lung disease.
63. The method of embodiment 62, wherein the interstitial lung disease is idiopathic pulmonary fibrosis or familial interstitial pneumonia.
64. A method of prognosing an interstitial lung disease in a human subject, the method comprising detecting whether the genome of the subject comprises an inversion in the 17q21.31 chromosomal region and determining a prognosis for the subject, the presence of the inversion being prognostic of increased or decreased survival.
65. A kit comprising a nucleic acid primer capable of hybridizing to a genetic variant TOLLIP nucleic acid, SPPL2C nucleic acid, or MDGA2 nucleic acid.
66. The kit of claim 65, wherein said genetic variant has been extracted from a human subject with an interstitial lung disease or is an amplification product of a nucleic acid extracted from a human subject with an interstitial lung disease.
67. The kit of claim 65 or 66, wherein said interstitial lung disease is a pulmonary fibrotic condition.
68. The kit of one of claims 65-67, further comprising a first labeled nucleic acid probe capable of hybridizing to an amplification product of said genetic variant TOLLIP nucleic acid, SPPL2C nucleic acid, or MDGA2 nucleic acid.
69. The kit of claim 68, further comprising a second labeled nucleic acid probe capable of hybridizing to an amplification product of said genetic variant TOLLIP nucleic acid, SPPL2C nucleic acid, or MDGA2 nucleic acid.
70. The kit of claim 69, wherein said first labeled nucleic acid probe comprises a first label and said additional labeled nucleic acid probe comprises a second label, wherein said first and second label are capable of fluorescence resonance energy transfer when hybridized to said genetic variant TOLLIP nucleic acid, SPPL2C nucleic acid, or MDGA2 nucleic acid.
71. An in vitro complex comprising a first nucleic acid probe hybridized to a genetic variant nucleic acid, said genetic variant nucleic acid comprising a genetic variant TOLLIP, SPPL2C or MDGA2 gene sequence, wherein said genetic variant nucleic acid is extracted from a human subject with an interstitial lung disease or is an amplification product of a nucleic acid extracted from a human subject with an interstitial lung disease.
72. The in vitro complex of claim 71, wherein said complex further comprises a second labeled nucleic acid probe hybridized to said genetic variant nucleic acid.
73. The in vitro complex of claim 72, wherein said first labeled nucleic acid probe comprises a first label and said second labeled nucleic acid probe comprises a second label, wherein said first and second label are capable of fluorescence resonance energy transfer.
74. An in vitro complex comprising a thermally stable polymerase bound to a genetic variant nucleic acid, said genetic variant nucleic acid comprising a genetic variant TOLLIP, SPPL2C or MDGA2 gene sequence, wherein said genetic variant nucleic acid is extracted from a human subject with an interstitial lung disease or is an amplification product of a nucleic acid extracted from a human subject with an interstitial lung disease.
75. The in vitro complex of claim 74, wherein the complex further comprises a nucleic acid primer hybridized to said genetic variant nucleic acid.
It is understood that the examples and embodiments described herein are for illustrative purposes only and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application and scope of the appended claims. All patents, patent applications, internet sources, and other published reference materials cited in this specification are incorporated herein by reference in their entireties. Any discrepancy between any reference material cited herein or any prior art in general and an explicit teaching of this specification is intended to be resolved in favor of the teaching in this specification. This includes any discrepancy between an art-understood definition of a word or phrase and a definition explicitly provided in this specification of the same word or phrase.
Each of the following publications is incorporated by reference in its entirety:
1. Mushiroda T, Wattanapokayakit S, Takahashi A, et al. A genome-wide association study identifies an association of a common variant in TERT with susceptibility to idiopathic pulmonary fibrosis. Journal of medical genetics 2008;45:654-6.
3. Raghu G, Brown KK, Bradford WZ, et al. A placebo-controlled trial of interferon gamma-1b in patients with idiopathic pulmonary fibrosis. The New England journal of medicine 2004;350:125-33.
4. Lederer DJ, Kawut SM, Wickersham N, et al. Obesity and primary graft dysfunction after lung transplantation: the Lung Transplant Outcomes Group Obesity Study. American journal of respiratory and critical care medicine 2011;184:1055-61.
5. Noth I, Anstrom KJ, Calvert SB, et al. A placebo-controlled randomized trial of warfarin in idiopathic pulmonary fibrosis. American journal of respiratory and critical care medicine 2012;186:88-95.
6. Raghu G, Collard HR, Egan JJ, et al. An official ATS/ERS/JRS/ALAT statement: idiopathic pulmonary fibrosis: evidence-based guidelines for diagnosis and management. American journal of respiratory and critical care medicine 2011;183:788-824.
7. Marchini J, Howie B. Genotype imputation for genome-wide association studies. Nature reviews 2010;11:499-511.
8. Gabriel SB, Schaffner SF, Nguyen H, et al. The structure of haplotype blocks in the human genome. Science (New York, NY) 2002;296:2225-9.
9. R Development Core Team. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. 2012.
10. Han B, Eskin E. Random-effects model aimed at discovering associations in meta-analysis of genome-wide association studies. American journal of human genetics 2011;88:586-98.
11. Seibold MA, Wise AL, Speer MC, et al. A common MUC5B promoter polymorphism and pulmonary fibrosis. The New England journal of medicine 2011;364:1503-12.
12. American Thoracic Society. Idiopathic pulmonary fibrosis: diagnosis and treatment. International consensus statement. American Thoracic Society (ATS), and the European Respiratory Society (ERS). American journal of respiratory and critical care medicine 2000;161(2 Pt 1):646-64.
13. American Thoracic Society/European Respiratory Society International Multidisciplinary Consensus Classification of the Idiopathic Interstitial Pneumonias. This joint statement of the American Thoracic Society (ATS), and the European Respiratory Society (ERS) was adopted by the ATS board of directors, June 2001 and by the ERS Executive Committee, June 2001. American journal of respiratory and critical care medicine 2002;165(2):277-304.
14. Raghu G, Collard HR, Egan JJ, et al. An official ATS/ERS/JRS/ALAT statement: idiopathic pulmonary fibrosis: evidence-based guidelines for diagnosis and management. American journal of respiratory and critical care medicine 2011;183(6):788-824.
15. Raghu G, Brown KK, Bradford WZ, et al. A placebo-controlled trial of interferon gamma-1b in patients with idiopathic pulmonary fibrosis. The New England journal of medicine 2004;350(2):125-33.
16. Lederer DJ, Kawut SM, Wickersham N, et al. Obesity and primary graft dysfunction after lung transplantation: the Lung Transplant Outcomes Group Obesity Study. American journal of respiratory and critical care medicine 2011;184(9):1055-61.
17. Noth I, Anstrom KJ, Calvert SB, et al. A placebo-controlled randomized trial of warfarin in idiopathic pulmonary fibrosis. American journal of respiratory and critical care medicine 2012;186(1):88-95.
18. Carvalho B, Bengtsson H, Speed TP, Irizarry RA. Exploration, normalization, and genotype calls of high-density oligonucleotide SNP array data. Biostatistics (Oxford, England) 2007;8(2):485-99.
19. Carvalho BS, Louis TA, Irizarry RA. Quantifying uncertainty in genotype calls. Bioinformatics (Oxford, England) 2010;26(2):242-9.
20. Purcell S, Neale B, Todd-Brown K, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. American journal of human genetics 2007;81(3):559-75.
21. Luca D, Ringquist S, Klei L, et al. On the use of general control samples for genome-wide association studies: genetic matching highlights causal variants. American journal of human genetics 2008;82(2):453-63.
22. Patterson N, Price AL, Reich D. Population structure and eigenanalysis. PLoS genetics 2006;2(12):e190.
23. Delaneau O, Marchini J, Zagury JF. A linear complexity phasing method for thousands of genomes. Nature methods 2012;9(2):179-81.
24. Marchini J, Howie B. Genotype imputation for genome-wide association studies. Nature reviews 2010;11(7):499-511.
25. Weale ME, Depondt C, Macdonald SJ, et al. Selection and evaluation of tagging SNPs in the neuronal-sodium-channel gene SCN1A: implications for linkage-disequilibrium gene mapping. American journal of human genetics 2003;73(3):551-65.
26. Flores C, Ma SF, Maresso K, Ober C, Garcia JG. A variant of the myosin light chain kinase gene is associated with severe asthma in African Americans. Genetic epidemiology 2007;31(4):296-305.
27. A map of human genome variation from population-scale sequencing. Nature 2010;467(7319):1061-73.
28. Seibold MA, Wise AL, Speer MC, et al. A common MUC5B promoter polymorphism and pulmonary fibrosis. The New England journal of medicine 2011;364(16):1503-12.
Claims
1. A method of determining whether a human subject has or is at risk of developing an interstitial lung disease, the method comprising detecting whether the genome of the subject comprises a genetic variant of TOLLIP, SPPL2C or MDGA2 and determining whether the subject has or is at risk of developing an interstitial lung disease, the presence of the genetic variant indicating that the subject has or is at risk of developing the interstitial lung disease.
2. The method of claim 1, wherein the method comprises detecting whether the genome of the subject comprises a genetic variant of TOLLIP.
3. The method of claim 1 or claim 2, wherein the method comprises detecting whether the genome of the subject comprises a genetic variant of SPPL2C.
4. The method of claim 3, further comprising determining if the subject carries an H2 inversion in 17q21.31.
5. The method of claim 4, wherein the determining comprises determining if the subject comprises one or more single nucleotide polymorphisms selected from the group consisting of rs916793, rs2902662, rs17651213, and rs199448.
6. The method of any one of claims 1-5, wherein the method comprises detecting whether the genome of the subject comprises a genetic variant of MDGA2.
7. The method of any one of claims 1-6, further comprising detecting whether the genome of the subject comprises a genetic variant of MUC5B.
8. The method of any one of claims 1-7, wherein the method comprises detecting whether the genome of the subject comprises one or more genetic variants comprising a single nucleotide polymorphism selected from the group consisting of rs111521887, rs5743894, rs5743890, rs17690703, and rs7144383.
9. The method of claim 8, wherein the method comprises detecting whether the genome of the subject comprises the genetic variant comprising the single nucleotide polymorphism rs111521887.
10. The method of claim 8 or 9, wherein the method comprises detecting whether the genome of the subject comprises the genetic variant comprising the single nucleotide polymorphism rs5743894.
11. The method of any one of claims 8-10, wherein the method comprises detecting whether the genome of the subject comprises the genetic variant comprising the single nucleotide polymorphism rs5743890.
12. The method of any one of claims 8-11, wherein the method comprises detecting whether the genome of the subject comprises the genetic variant comprising the single nucleotide polymorphism rs17690703.
13. The method of any one of claims 8-12, wherein the method comprises detecting whether the genome of the subject comprises the genetic variant comprising the single nucleotide polymorphism rs7144383.
14. The method of any one of claims 8-13, further comprising detecting whether the genome of the subject comprises a genetic variant comprising a single nucleotide polymorphism rs35705950.
15. The method of any one of claims 1-14, wherein the interstitial lung disease is a fibrotic interstitial lung disease.
16. The method of claim 15, wherein the interstitial lung disease is idiopathic pulmonary fibrosis or familial interstitial pneumonia.
17. A method of prognosing an interstitial lung disease in a human subject, the method comprising detecting whether the genome of the subject comprises a genetic variant of TOLLIP or SPPL2C and determining a prognosis for the subject, the presence of the genetic variant gene being prognostic of increased or decreased survival.
18. The method of claim 17, wherein the method comprises detecting whether the genome of the subject comprises a genetic variant of TOLLIP.
19. The method of claim 17 or 18, wherein the method comprises detecting whether the genome of the subject comprises a genetic variant of SPPL2C.
20. The method of claim 19, further comprising determining if the subject carries an H2 inversion in 17q21.31.
21. The method of claim 20, wherein the determining comprises determining if the subject comprises one or more single nucleotide polymorphisms selected from the group consisting of rs916793, rs2902662, rs17651213, and rs199448.
22. The method of any one of claims 17-21, further comprising detecting whether the genome of the subject comprises a genetic variant of MUC5B.
23. The method of any one of claims 17-22, wherein the genetic variant comprises a single nucleotide polymorphism selected from the group consisting of rs17690703 and rs5743890, and wherein the single nucleotide polymorphism is predictive of decreased survival.
24. The method of any one of claims 17-23, wherein the genome of the subject comprises the single nucleotide polymorphism rs35705950, and wherein the single nucleotide polymorphism is predictive of increased survival.
25. The method of any one of claims 17-24, wherein the interstitial lung disease is a fibrotic interstitial lung disease.
26. The method of claim 25, wherein the interstitial lung disease is idiopathic pulmonary fibrosis or familial interstitial pneumonia.
27. A method of detecting the presence or absence of a genetic variant in a human subject, the method comprising:
detecting the presence or absence of a genetic variant of TOLLIP, SPPL2C, or MDGA2 in a sample from the subject.
28. The method of claim 27, wherein the genetic variant is a genetic variant of TOLLIP.
29. The method of claim 27, wherein the genetic variant is a genetic variant of SPPL2C.
30. The method of claim 29, further comprising determining if the subject carries an H2 inversion in 17q21.31.
31. The method of claim 30, wherein the determining comprises determining if the subject comprises one or more single nucleotide polymorphisms selected from the group consisting of rs916793, rs2902662, rs17651213, and rs199448.
32. The method of claim 27, wherein the genetic variant is a genetic variant of MDGA2.
33. The method of any one of claims 27-32, further comprising detecting the presence or absence of a genetic variant of MUC5B in said sample.
34. The method of any of claims 27-33, wherein the genetic variant comprises a single nucleotide polymorphism listed in Fig. 7.
35. The method of claim 34, wherein the genetic variant comprises a single nucleotide polymorphism selected from the group consisting of rs111521887, rs5743894, rs5743890, rs17690703, and rs7144383.
36. The method of claim 35, wherein the genetic variant comprises rs111521887.
37. The method of claim 35 or 36, wherein the genetic variant comprises rs5743894.
38. The method of any one of claims 29-31, wherein the genetic variant comprises rs5743890.
33. The method of any one of claims 29-32, wherein the genetic variant comprises rs17690703.
34. The method of any one of claims 29-33, wherein the genetic variant comprises rs7144383.
35. The method of any one of claims 29-34, further comprising detecting the presence or absence of a genetic variant of MUC5B comprising rs35705950.
36. The method of any one of claims 22-35, wherein the subject has, is suspected of having, or is at risk for developing an interstitial lung disease.
37. The method of claim 36, wherein the interstitial lung disease is a fibrotic interstitial lung disease.
38. The method of claim 37, wherein the interstitial lung disease is idiopathic pulmonary fibrosis or familial interstitial pneumonia.
39. A method of detecting the presence or absence of at least two genetic variants in a human subject having, suspected of having, or at risk for developing an interstitial lung disease, the method comprising:
detecting the presence or absence of at least two of the genetic variants listed in Fig. 7 in a sample from the subject.
40. The method of claim 39, wherein the at least two genetic variants includes from two to 52 of the genetic variants listed in Fig. 7.
41. The method of claim 40, wherein the at least two genetic variants includes from two to 44 of the genetic variants listed in Fig. 11.
42. A method of testing for interstitial lung disease in a human subject, the method comprising:
detecting a level of TOLLIP gene expression in a sample from the subject, a low level of TOLLIP gene expression relative to a control being indicative of interstitial lung disease.
43. The method of claim 42, wherein the level of gene expression is detected by measuring directly or indirectly TOLLIP mRNA.
44. The method of claim 42, wherein the level of gene expression is detected by measuring Tollip protein.
45. A method of treating a human subject having an interstitial lung disease, the method comprising:
detecting a low level of TOLLIP expression relative to a control; and administering to the subject an amount of a Tollip agonist, Tollip or a genetic construct expressing TOLLIP effective to treat the interstitial lung disease.
46. A kit for predicting, diagnosing, or prognosing interstitial lung disease in a human subject, the kit comprising:
a probe or primer capable of detecting the presence or absence of a genetic variant of TOLLIP, SPPL2C, or MDGA2.
47. The kit of claim 46, wherein the probe or primer is capable of detecting a genetic variant of TOLLIP.
48. The kit of claim 46 or 47, wherein the at least one probe or primer is capable of detecting a genetic variant of SPPL2C.
49. The kit of claim 48, further comprising at least one probe or primer that is capable of detecting an H2 inversion in 17q21.31.
50. The kit of claim 49, wherein the at least one probe or primer detects one or more single nucleotide polymorphisms selected from the group consisting of rs916793, rs2902662, rs17651213, and rs199448.
51. The kit of any one of claims 46-50, wherein the at least one probe or primer is capable of detecting a genetic variant of MDGA2.
52. The kit of any one of claims 46-51, further comprising an additional probe or primer capable of detecting at least one genetic variant of MUC5B.
53. The kit of any one of claims 46-52, wherein the genetic variant comprises rs111521887, rs5743894, rs5743890, rs17690703, rs7144383, or rs35705950.
54. The kit of any one of claims 46-53, wherein the genetic variant comprises a single nucleotide polymorphism set forth in Fig. 7.
55. A kit for predicting, diagnosing, or prognosing interstitial lung disease in a human subject, the kit comprising:
at least one probe or primer for detecting the presence or absence of at least two single nucleotide polymorphisms set forth in Fig. 7.
56. The kit of claim 55, wherein the kit comprises probes and/or primers for detecting the presence or absence of from two to 52 of the single nucleotide polymorphisms set forth in Fig. 7.
57. The kit of claim 56, wherein the kit comprises probes and/or primers for detecting the presence or absence of from two to 44 of the single nucleotide polymorphisms set forth in Fig. 11.
58. A method of determining whether a human subject has or is at risk of developing an interstitial lung disease, the method comprising detecting whether the genome of the subject comprises at least two single nucleotide polymorphisms set forth in Fig. 7 and determining whether the subject has or is at risk of developing an interstitial lung disease, the presence of the genetic variant indicating that the subject has or is at risk of developing the interstitial lung disease.
59. The method of claim 58, wherein the at least two single nucleotide polymorphisms includes from two to 52 of the single nucleotide polymorphisms set forth in Fig. 7.
60. The method of claim 59, wherein the at least two single nucleotide polymorphisms includes from two to 44 of the single nucleotide polymorphisms set forth in Fig. 11.
61. The method of any one of claims 58-60, wherein the interstitial lung disease is a fibrotic interstitial lung disease.
62. The method of claim 60, wherein the interstitial lung disease is idiopathic pulmonary fibrosis or familial interstitial pneumonia.
63. A method of prognosing an interstitial lung disease in a human subject, the method comprising detecting whether the genome of the subject comprises at least two single nucleotide polymorphisms set forth in Fig. 7 and determining a prognosis for the subject, the presence of the genetic variant gene being prognostic of increased or decreased survival.
64. The method of claim 63, wherein the interstitial lung disease is a fibrotic interstitial lung disease.
65. The method of claim 64, wherein the interstitial lung disease is idiopathic pulmonary fibrosis or familial interstitial pneumonia.
66. A method of prognosing an interstitial lung disease in a human subject, the method comprising detecting whether the genome of the subject comprises an inversion in the 17q21.31 chromosomal region and determining a prognosis for the subject, the presence of the inversion being prognostic of increased or decreased survival.
67. A kit comprising a nucleic acid primer capable of hybridizing to a genetic variant TOLLIP nucleic acid, SPPL2C nucleic acid, or MDGA2 nucleic acid.
68. The kit of claim 67, wherein said genetic variant has been extracted from a human subject with an interstitial lung disease or is an amplification product of a nucleic acid extracted from a human subject with an interstitial lung disease.
69. The kit of claim 67 or 68, wherein said interstitial lung disease is a pulmonary fibrotic condition.
70. The kit of one of claims 67-69, further comprising a first labeled nucleic acid probe capable of hybridizing to an amplification product of said genetic variant TOLLIP nucleic acid, SPPL2C nucleic acid, or MDGA2 nucleic acid.
71. The kit of claim 70, further comprising a second labeled nucleic acid probe capable of hybridizing to an amplification product of said genetic variant TOLLIP nucleic acid, SPPL2C nucleic acid, or MDGA2 nucleic acid.
72. The kit of claim 71, wherein said first labeled nucleic acid probe comprises a first label and said additional labeled nucleic acid probe comprises a second label, wherein said first and second label are capable of fluorescence resonance energy transfer when hybridized to said genetic variant TOLLIP nucleic acid, SPPL2C nucleic acid, or MDGA2 nucleic acid.
73. An in vitro complex comprising a first nucleic acid probe hybridized to a genetic variant nucleic acid, said genetic variant nucleic acid comprising a genetic variant TOLLIP, SPPL2C or MDGA2 gene sequence, wherein said genetic variant nucleic acid is extracted from a human subject with an interstitial lung disease or is an amplification product of a nucleic acid extracted from a human subject with an interstitial lung disease.
74. The in vitro complex of claim 73, wherein said complex further comprises a second labeled nucleic acid probe hybridized to said genetic variant nucleic acid.
75. The in vitro complex of claim 74, wherein said first labeled nucleic acid probe comprises a first label and said second labeled nucleic acid probe comprises a second label, wherein said first and second label are capable of fluorescence resonance energy transfer.
76. An in vitro complex comprising a thermally stable polymerase bound to a genetic variant nucleic acid, said genetic variant nucleic acid comprising a genetic variant TOLLIP, SPPL2C or MDGA2 gene sequence, wherein said genetic variant nucleic acid is extracted from a human subject with an interstitial lung disease or is an amplification product of a nucleic acid extracted from a human subject with an interstitial lung disease.
77. The in vitro complex of claim 76, wherein the complex further comprises a nucleic acid primer hybridized to said genetic variant nucleic acid.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201361759820P | 2013-02-01 | 2013-02-01 | |
US61/759,820 | 2013-02-01 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2014121180A1 (en) | 2014-08-07 |
Family
ID=51263020
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2014/014395 WO2014121180A1 (en) | 2013-02-01 | 2014-02-03 | Genetic variants in interstitial lung disease subjects |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2014121180A1 (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110217315A1 (en) * | 2010-01-26 | 2011-09-08 | National Jewish Health | Methods and compositions for risk prediction, diagnosis, prognosis, and treatment of pulmonary disorders |
US20110311512A1 (en) * | 2008-11-14 | 2011-12-22 | Hakon Hakonarson | Genetic Variants Underlying Human Cognition and Methods of Use Thereof as Diagnostic and Therapeutic Targets |
Non-Patent Citations (2)
Title |
---|
MARTIN ET AL.: "Regulated intramembrane proteolysis of Bri2 (Itm2b) by ADAM10 and SPPL2a/SPPL2b", THE JOURNAL OF BIOLOGICAL CHEMISTRY, vol. 283, no. 3, 18 January 2008 (2008-01-18), pages 1644 - 1652 * |
ZHU ET AL.: "Tollip, an intracellular trafficking protein, is a novel modulator of the transforming growth factor-beta signaling pathway", JOURNAL OF BIOLOGICAL CHEMISTRY, vol. 287, no. 47, 16 November 2012 (2012-11-16), pages 39653 - 39663 * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2016172150A1 (en) * | 2015-04-22 | 2016-10-27 | The University Of Chicago | Methods for treating idiopathic pulmonary fibrosis |
US10543185B2 (en) | 2015-04-22 | 2020-01-28 | The University Of Chicago | Method for treating idiopathic pulmonary fibrosis |
WO2017121769A1 (en) * | 2016-01-12 | 2017-07-20 | bioMérieux | In vitro method for predicting a risk of developing pneumonia in a subject |
Legal Events
Code | Title | Description |
---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 14746788; Country of ref document: EP; Kind code of ref document: A1 |
NENP | Non-entry into the national phase | Ref country code: DE |
122 | Ep: pct application non-entry in european phase | Ref document number: 14746788; Country of ref document: EP; Kind code of ref document: A1 |