WO2024192294A1 - Methods and systems for generating sequencing libraries - Google Patents
Methods and systems for generating sequencing libraries
- Publication number
- WO2024192294A1 (application PCT/US2024/020012)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- nucleic acid
- acid molecules
- filler
- fold
- cases
- Prior art date
Links
- 238000000034 method Methods 0.000 title claims abstract description 185
- 238000012163 sequencing technique Methods 0.000 title abstract description 130
- 150000007523 nucleic acids Chemical class 0.000 claims description 616
- 102000039446 nucleic acids Human genes 0.000 claims description 597
- 108020004707 nucleic acids Proteins 0.000 claims description 597
- 108020004414 DNA Proteins 0.000 claims description 224
- 102000053602 DNA Human genes 0.000 claims description 206
- 239000000945 filler Substances 0.000 claims description 123
- 230000011987 methylation Effects 0.000 claims description 101
- 238000007069 methylation reaction Methods 0.000 claims description 101
- 239000012634 fragment Substances 0.000 claims description 72
- 108090000623 proteins and genes Proteins 0.000 claims description 69
- 108091034117 Oligonucleotide Proteins 0.000 claims description 62
- 102000004169 proteins and genes Human genes 0.000 claims description 57
- 239000011230 binding agent Substances 0.000 claims description 56
- 239000011324 bead Substances 0.000 claims description 51
- 239000007787 solid Substances 0.000 claims description 50
- JLCPHMBAVCMARE-UHFFFAOYSA-N [3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-hydroxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methyl [5-(6-aminopurin-9-yl)-2-(hydroxymethyl)oxolan-3-yl] hydrogen phosphate Polymers 
Cc1cn(C2CC(OP(O)(=O)OCC3OC(CC3OP(O)(=O)OCC3OC(CC3O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c3nc(N)[nH]c4=O)C(COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3CO)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cc(C)c(=O)[nH]c3=O)n3cc(C)c(=O)[nH]c3=O)n3ccc(N)nc3=O)n3cc(C)c(=O)[nH]c3=O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)O2)c(=O)[nH]c1=O JLCPHMBAVCMARE-UHFFFAOYSA-N 0.000 claims description 49
- 239000000758 substrate Substances 0.000 claims description 47
- 238000006243 chemical reaction Methods 0.000 claims description 32
- 230000005291 magnetic effect Effects 0.000 claims description 30
- 238000003199 nucleic acid amplification method Methods 0.000 claims description 26
- 230000003321 amplification Effects 0.000 claims description 25
- 239000003153 chemical reaction reagent Substances 0.000 claims description 15
- 108010090804 Streptavidin Proteins 0.000 claims description 13
- 238000009396 hybridization Methods 0.000 claims description 9
- 238000012408 PCR amplification Methods 0.000 claims description 6
- 230000008859 change Effects 0.000 claims description 6
- LSNNMFCWUKXFEE-UHFFFAOYSA-M Bisulfite Chemical compound OS([O-])=O LSNNMFCWUKXFEE-UHFFFAOYSA-M 0.000 claims description 5
- 230000002255 enzymatic effect Effects 0.000 claims description 5
- 239000003795 chemical substances by application Substances 0.000 claims description 3
- 230000008878 coupling Effects 0.000 claims description 3
- 238000010168 coupling process Methods 0.000 claims description 3
- 238000005859 coupling reaction Methods 0.000 claims description 3
- 230000005641 tunneling Effects 0.000 claims description 3
- 206010028980 Neoplasm Diseases 0.000 abstract description 120
- 238000001514 detection method Methods 0.000 abstract description 10
- 239000000523 sample Substances 0.000 description 187
- 201000011510 cancer Diseases 0.000 description 77
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 52
- 210000001519 tissue Anatomy 0.000 description 47
- 239000012472 biological sample Substances 0.000 description 45
- 201000010099 disease Diseases 0.000 description 44
- 230000000153 supplemental effect Effects 0.000 description 43
- 238000001114 immunoprecipitation Methods 0.000 description 41
- 125000003729 nucleotide group Chemical group 0.000 description 40
- 230000035772 mutation Effects 0.000 description 35
- 230000027455 binding Effects 0.000 description 33
- 239000000203 mixture Substances 0.000 description 33
- 238000007672 fourth generation sequencing Methods 0.000 description 32
- 238000002360 preparation method Methods 0.000 description 31
- 238000004458 analytical method Methods 0.000 description 26
- 210000004027 cell Anatomy 0.000 description 26
- 230000035945 sensitivity Effects 0.000 description 26
- 238000012545 processing Methods 0.000 description 24
- 238000004422 calculation algorithm Methods 0.000 description 21
- 239000002773 nucleotide Substances 0.000 description 21
- 230000015654 memory Effects 0.000 description 20
- 230000008569 process Effects 0.000 description 19
- 238000003860 storage Methods 0.000 description 18
- 238000003752 polymerase chain reaction Methods 0.000 description 17
- 229920002477 rna polymer Polymers 0.000 description 17
- 238000000137 annealing Methods 0.000 description 16
- YBJHBAHKTGYVGT-ZKWXMUAHSA-N (+)-Biotin Chemical compound N1C(=O)N[C@@H]2[C@H](CCCCC(=O)O)SC[C@@H]21 YBJHBAHKTGYVGT-ZKWXMUAHSA-N 0.000 description 12
- 230000004048 modification Effects 0.000 description 11
- 238000012986 modification Methods 0.000 description 11
- 108091029523 CpG island Proteins 0.000 description 10
- 210000004369 blood Anatomy 0.000 description 10
- 239000008280 blood Substances 0.000 description 10
- 238000004891 communication Methods 0.000 description 9
- 238000003556 assay Methods 0.000 description 8
- 238000010586 diagram Methods 0.000 description 8
- 208000035475 disorder Diseases 0.000 description 8
- 238000009826 distribution Methods 0.000 description 8
- 238000007481 next generation sequencing Methods 0.000 description 8
- 238000012706 support-vector machine Methods 0.000 description 8
- 108091093088 Amplicon Proteins 0.000 description 7
- 208000007660 Residual Neoplasm Diseases 0.000 description 7
- 108020004682 Single-Stranded DNA Proteins 0.000 description 7
- 238000013461 design Methods 0.000 description 7
- 239000012528 membrane Substances 0.000 description 7
- 238000012544 monitoring process Methods 0.000 description 7
- 230000002441 reversible effect Effects 0.000 description 7
- 229960002685 biotin Drugs 0.000 description 6
- 235000020958 biotin Nutrition 0.000 description 6
- 239000011616 biotin Substances 0.000 description 6
- 238000007477 logistic regression Methods 0.000 description 6
- 208000020816 lung neoplasm Diseases 0.000 description 6
- 210000002381 plasma Anatomy 0.000 description 6
- LRSASMSXMSNRBT-UHFFFAOYSA-N 5-methylcytosine Chemical compound CC1=CNC(=O)N=C1N LRSASMSXMSNRBT-UHFFFAOYSA-N 0.000 description 5
- 108091028043 Nucleic acid sequence Proteins 0.000 description 5
- 230000000692 anti-sense effect Effects 0.000 description 5
- 238000013459 approach Methods 0.000 description 5
- 238000012350 deep sequencing Methods 0.000 description 5
- 238000005516 engineering process Methods 0.000 description 5
- 239000012530 fluid Substances 0.000 description 5
- 238000005259 measurement Methods 0.000 description 5
- 230000003287 optical effect Effects 0.000 description 5
- 102000040430 polynucleotide Human genes 0.000 description 5
- 108091033319 polynucleotide Proteins 0.000 description 5
- 239000002157 polynucleotide Substances 0.000 description 5
- 238000003753 real-time PCR Methods 0.000 description 5
- 238000005070 sampling Methods 0.000 description 5
- 206010006187 Breast cancer Diseases 0.000 description 4
- 208000026310 Breast neoplasm Diseases 0.000 description 4
- 108090000790 Enzymes Proteins 0.000 description 4
- 102000004190 Enzymes Human genes 0.000 description 4
- 108091029865 Exogenous DNA Proteins 0.000 description 4
- 206010058467 Lung neoplasm malignant Diseases 0.000 description 4
- 239000000090 biomarker Substances 0.000 description 4
- 238000001369 bisulfite sequencing Methods 0.000 description 4
- 210000001124 body fluid Anatomy 0.000 description 4
- 238000002512 chemotherapy Methods 0.000 description 4
- OPTASPLRGRRNAP-UHFFFAOYSA-N cytosine Chemical compound NC=1C=CNC(=O)N=1 OPTASPLRGRRNAP-UHFFFAOYSA-N 0.000 description 4
- 238000007847 digital PCR Methods 0.000 description 4
- 230000008826 genomic mutation Effects 0.000 description 4
- 201000005202 lung cancer Diseases 0.000 description 4
- 210000003819 peripheral blood mononuclear cell Anatomy 0.000 description 4
- 230000026731 phosphorylation Effects 0.000 description 4
- 238000006366 phosphorylation reaction Methods 0.000 description 4
- 230000000865 phosphorylative effect Effects 0.000 description 4
- 238000007637 random forest analysis Methods 0.000 description 4
- 230000004044 response Effects 0.000 description 4
- 239000004055 small Interfering RNA Substances 0.000 description 4
- 238000001356 surgical procedure Methods 0.000 description 4
- RYVNIFSIEDRLSJ-UHFFFAOYSA-N 5-(hydroxymethyl)cytosine Chemical class NC=1NC(=O)N=CC=1CO RYVNIFSIEDRLSJ-UHFFFAOYSA-N 0.000 description 3
- 108700028369 Alleles Proteins 0.000 description 3
- 108091026890 Coding region Proteins 0.000 description 3
- 206010009944 Colon cancer Diseases 0.000 description 3
- 206010018338 Glioma Diseases 0.000 description 3
- 101000581507 Homo sapiens Methyl-CpG-binding domain protein 1 Proteins 0.000 description 3
- 101000615488 Homo sapiens Methyl-CpG-binding domain protein 2 Proteins 0.000 description 3
- 102100027383 Methyl-CpG-binding domain protein 1 Human genes 0.000 description 3
- 102100021299 Methyl-CpG-binding domain protein 2 Human genes 0.000 description 3
- 230000005540 biological transmission Effects 0.000 description 3
- 210000001175 cerebrospinal fluid Anatomy 0.000 description 3
- 238000007405 data analysis Methods 0.000 description 3
- 238000013500 data storage Methods 0.000 description 3
- 238000011161 development Methods 0.000 description 3
- 230000000694 effects Effects 0.000 description 3
- 238000010828 elution Methods 0.000 description 3
- 238000011534 incubation Methods 0.000 description 3
- 230000010354 integration Effects 0.000 description 3
- 239000000463 material Substances 0.000 description 3
- 238000004393 prognosis Methods 0.000 description 3
- 238000001959 radiotherapy Methods 0.000 description 3
- 210000003296 saliva Anatomy 0.000 description 3
- 239000007790 solid phase Substances 0.000 description 3
- 241000894007 species Species 0.000 description 3
- 208000031261 Acute myeloid leukaemia Diseases 0.000 description 2
- 206010003571 Astrocytoma Diseases 0.000 description 2
- 206010060971 Astrocytoma malignant Diseases 0.000 description 2
- 208000003174 Brain Neoplasms Diseases 0.000 description 2
- 206010007953 Central nervous system lymphoma Diseases 0.000 description 2
- 230000007067 DNA methylation Effects 0.000 description 2
- 238000001712 DNA sequencing Methods 0.000 description 2
- 206010014967 Ependymoma Diseases 0.000 description 2
- 206010071602 Genetic polymorphism Diseases 0.000 description 2
- 101000615495 Homo sapiens Methyl-CpG-binding domain protein 3 Proteins 0.000 description 2
- 101000615492 Homo sapiens Methyl-CpG-binding domain protein 4 Proteins 0.000 description 2
- 206010025557 Malignant fibrous histiocytoma of bone Diseases 0.000 description 2
- 208000000172 Medulloblastoma Diseases 0.000 description 2
- 102000006890 Methyl-CpG-Binding Protein 2 Human genes 0.000 description 2
- 108010072388 Methyl-CpG-Binding Protein 2 Proteins 0.000 description 2
- 102100021291 Methyl-CpG-binding domain protein 3 Human genes 0.000 description 2
- 102100021290 Methyl-CpG-binding domain protein 4 Human genes 0.000 description 2
- 208000003445 Mouth Neoplasms Diseases 0.000 description 2
- 108091092724 Noncoding DNA Proteins 0.000 description 2
- 206010061902 Pancreatic neoplasm Diseases 0.000 description 2
- 206010060862 Prostate cancer Diseases 0.000 description 2
- 208000000236 Prostatic Neoplasms Diseases 0.000 description 2
- 238000003559 RNA-seq method Methods 0.000 description 2
- 206010039491 Sarcoma Diseases 0.000 description 2
- 108091081021 Sense strand Proteins 0.000 description 2
- 108091027967 Small hairpin RNA Proteins 0.000 description 2
- 108020004459 Small interfering RNA Proteins 0.000 description 2
- 208000005718 Stomach Neoplasms Diseases 0.000 description 2
- 208000033559 Waldenström macroglobulinemia Diseases 0.000 description 2
- 230000004075 alteration Effects 0.000 description 2
- 239000000427 antigen Substances 0.000 description 2
- 108091007433 antigens Proteins 0.000 description 2
- 102000036639 antigens Human genes 0.000 description 2
- 238000013528 artificial neural network Methods 0.000 description 2
- 230000008901 benefit Effects 0.000 description 2
- 238000001574 biopsy Methods 0.000 description 2
- 230000015556 catabolic process Effects 0.000 description 2
- 201000007335 cerebellar astrocytoma Diseases 0.000 description 2
- 210000000349 chromosome Anatomy 0.000 description 2
- 208000029742 colonic neoplasm Diseases 0.000 description 2
- 239000000356 contaminant Substances 0.000 description 2
- 238000006731 degradation reaction Methods 0.000 description 2
- 238000012217 deletion Methods 0.000 description 2
- 230000037430 deletion Effects 0.000 description 2
- 239000005547 deoxyribonucleotide Substances 0.000 description 2
- 125000002637 deoxyribonucleotide group Chemical group 0.000 description 2
- 238000013401 experimental design Methods 0.000 description 2
- 210000003722 extracellular fluid Anatomy 0.000 description 2
- 230000037433 frameshift Effects 0.000 description 2
- 108020001507 fusion proteins Proteins 0.000 description 2
- 102000037865 fusion proteins Human genes 0.000 description 2
- 206010017758 gastric cancer Diseases 0.000 description 2
- 230000014509 gene expression Effects 0.000 description 2
- 230000002068 genetic effect Effects 0.000 description 2
- 230000036541 health Effects 0.000 description 2
- 238000009169 immunotherapy Methods 0.000 description 2
- 238000003780 insertion Methods 0.000 description 2
- 230000037431 insertion Effects 0.000 description 2
- 210000000265 leukocyte Anatomy 0.000 description 2
- 208000012987 lip and oral cavity carcinoma Diseases 0.000 description 2
- 208000015486 malignant pancreatic neoplasm Diseases 0.000 description 2
- 108020004999 messenger RNA Proteins 0.000 description 2
- 125000002496 methyl group Chemical group [H]C([H])([H])* 0.000 description 2
- 108091070501 miRNA Proteins 0.000 description 2
- 239000002679 microRNA Substances 0.000 description 2
- 244000005700 microbiome Species 0.000 description 2
- 230000003278 mimic effect Effects 0.000 description 2
- 208000018795 nasal cavity and paranasal sinus carcinoma Diseases 0.000 description 2
- 208000002154 non-small cell lung carcinoma Diseases 0.000 description 2
- 201000002528 pancreatic cancer Diseases 0.000 description 2
- 208000008443 pancreatic carcinoma Diseases 0.000 description 2
- 201000002530 pancreatic endocrine carcinoma Diseases 0.000 description 2
- 230000005298 paramagnetic effect Effects 0.000 description 2
- 230000002093 peripheral effect Effects 0.000 description 2
- 239000013641 positive control Substances 0.000 description 2
- 201000011461 pre-eclampsia Diseases 0.000 description 2
- 208000016800 primary central nervous system lymphoma Diseases 0.000 description 2
- 238000000746 purification Methods 0.000 description 2
- 238000003908 quality control method Methods 0.000 description 2
- 238000011002 quantification Methods 0.000 description 2
- 230000003252 repetitive effect Effects 0.000 description 2
- 238000007480 sanger sequencing Methods 0.000 description 2
- 239000004065 semiconductor Substances 0.000 description 2
- 210000002966 serum Anatomy 0.000 description 2
- 206010041823 squamous cell carcinoma Diseases 0.000 description 2
- 201000011549 stomach cancer Diseases 0.000 description 2
- 239000000126 substance Substances 0.000 description 2
- 208000008732 thymoma Diseases 0.000 description 2
- 208000029729 tumor suppressor gene on chromosome 11 Diseases 0.000 description 2
- 208000018417 undifferentiated high grade pleomorphic sarcoma of bone Diseases 0.000 description 2
- 210000002700 urine Anatomy 0.000 description 2
- 238000012070 whole genome sequencing analysis Methods 0.000 description 2
- HWPZZUQOWRWFDB-UHFFFAOYSA-N 1-methylcytosine Chemical compound CN1C=CC(N)=NC1=O HWPZZUQOWRWFDB-UHFFFAOYSA-N 0.000 description 1
- COHVJBUINVIGOI-UHFFFAOYSA-N 4-amino-4-methyl-1,3-dihydropyrimidin-2-one Chemical compound CC1(N)NC(=O)NC=C1 COHVJBUINVIGOI-UHFFFAOYSA-N 0.000 description 1
- CKOMXBHMKXXTNW-UHFFFAOYSA-N 6-methyladenine Chemical compound CNC1=NC=NC2=C1N=CN2 CKOMXBHMKXXTNW-UHFFFAOYSA-N 0.000 description 1
- 208000030507 AIDS Diseases 0.000 description 1
- 208000002008 AIDS-Related Lymphoma Diseases 0.000 description 1
- 208000024893 Acute lymphoblastic leukemia Diseases 0.000 description 1
- 208000014697 Acute lymphocytic leukaemia Diseases 0.000 description 1
- 206010061424 Anal cancer Diseases 0.000 description 1
- 208000007860 Anus Neoplasms Diseases 0.000 description 1
- 206010073360 Appendix cancer Diseases 0.000 description 1
- 206010003445 Ascites Diseases 0.000 description 1
- 208000010839 B-cell chronic lymphocytic leukemia Diseases 0.000 description 1
- 208000032791 BCR-ABL1 positive chronic myelogenous leukemia Diseases 0.000 description 1
- 206010004146 Basal cell carcinoma Diseases 0.000 description 1
- 206010004593 Bile duct cancer Diseases 0.000 description 1
- 206010005003 Bladder cancer Diseases 0.000 description 1
- 206010005949 Bone cancer Diseases 0.000 description 1
- 208000018084 Bone neoplasm Diseases 0.000 description 1
- 208000011691 Burkitt lymphomas Diseases 0.000 description 1
- 108010022366 Carcinoembryonic Antigen Proteins 0.000 description 1
- 102100025475 Carcinoembryonic antigen-related cell adhesion molecule 5 Human genes 0.000 description 1
- 206010007279 Carcinoid tumour of the gastrointestinal tract Diseases 0.000 description 1
- 108090000994 Catalytic RNA Proteins 0.000 description 1
- 102000053642 Catalytic RNA Human genes 0.000 description 1
- 206010057248 Cell death Diseases 0.000 description 1
- 206010008342 Cervix carcinoma Diseases 0.000 description 1
- 208000010833 Chronic myeloid leukaemia Diseases 0.000 description 1
- 208000001333 Colorectal Neoplasms Diseases 0.000 description 1
- RYGMFSIKBFXOCR-UHFFFAOYSA-N Copper Chemical compound [Cu] RYGMFSIKBFXOCR-UHFFFAOYSA-N 0.000 description 1
- 208000008743 Desmoplastic Small Round Cell Tumor Diseases 0.000 description 1
- 206010064581 Desmoplastic small round cell tumour Diseases 0.000 description 1
- 206010061818 Disease progression Diseases 0.000 description 1
- 206010014733 Endometrial cancer Diseases 0.000 description 1
- 206010014759 Endometrial neoplasm Diseases 0.000 description 1
- 241000305071 Enterobacterales Species 0.000 description 1
- 208000000461 Esophageal Neoplasms Diseases 0.000 description 1
- 208000006168 Ewing Sarcoma Diseases 0.000 description 1
- 108700024394 Exon Proteins 0.000 description 1
- 108060002716 Exonuclease Proteins 0.000 description 1
- 208000022072 Gallbladder Neoplasms Diseases 0.000 description 1
- 206010051066 Gastrointestinal stromal tumour Diseases 0.000 description 1
- 208000032612 Glial tumor Diseases 0.000 description 1
- 208000017604 Hodgkin disease Diseases 0.000 description 1
- 208000021519 Hodgkin lymphoma Diseases 0.000 description 1
- 208000010747 Hodgkins lymphoma Diseases 0.000 description 1
- 241000282412 Homo Species 0.000 description 1
- 101001012157 Homo sapiens Receptor tyrosine-protein kinase erbB-2 Proteins 0.000 description 1
- 101000932478 Homo sapiens Receptor-type tyrosine-protein kinase FLT3 Proteins 0.000 description 1
- 101100214367 Homo sapiens ZNF215 gene Proteins 0.000 description 1
- 108090000144 Human Proteins Proteins 0.000 description 1
- 102000003839 Human Proteins Human genes 0.000 description 1
- 206010021042 Hypopharyngeal cancer Diseases 0.000 description 1
- 206010056305 Hypopharyngeal neoplasm Diseases 0.000 description 1
- 101150020771 IDH gene Proteins 0.000 description 1
- 206010061252 Intraocular melanoma Diseases 0.000 description 1
- 108091092195 Intron Proteins 0.000 description 1
- 208000007766 Kaposi sarcoma Diseases 0.000 description 1
- 208000008839 Kidney Neoplasms Diseases 0.000 description 1
- 206010023825 Laryngeal cancer Diseases 0.000 description 1
- 206010061523 Lip and/or oral cavity cancer Diseases 0.000 description 1
- 239000000232 Lipid Bilayer Substances 0.000 description 1
- 208000031422 Lymphocytic Chronic B-Cell Leukemia Diseases 0.000 description 1
- 206010025312 Lymphoma AIDS related Diseases 0.000 description 1
- 206010025323 Lymphomas Diseases 0.000 description 1
- 108700043128 MBD2 Proteins 0.000 description 1
- 101150083522 MECP2 gene Proteins 0.000 description 1
- 208000030070 Malignant epithelial tumor of ovary Diseases 0.000 description 1
- 206010073059 Malignant neoplasm of unknown primary site Diseases 0.000 description 1
- 208000032271 Malignant tumor of penis Diseases 0.000 description 1
- 238000007476 Maximum Likelihood Methods 0.000 description 1
- 206010027406 Mesothelioma Diseases 0.000 description 1
- 241001465754 Metazoa Species 0.000 description 1
- 102100039124 Methyl-CpG-binding protein 2 Human genes 0.000 description 1
- 108060004795 Methyltransferase Proteins 0.000 description 1
- 101150042248 Mgmt gene Proteins 0.000 description 1
- 108091092878 Microsatellite Proteins 0.000 description 1
- 229920006068 Minlon® Polymers 0.000 description 1
- 201000003793 Myelodysplastic syndrome Diseases 0.000 description 1
- 208000033761 Myelogenous Chronic BCR-ABL Positive Leukemia Diseases 0.000 description 1
- 208000033776 Myeloid Acute Leukemia Diseases 0.000 description 1
- 201000007224 Myeloproliferative neoplasm Diseases 0.000 description 1
- PJKKQFAEFWCNAQ-UHFFFAOYSA-N N(4)-methylcytosine Chemical class CNC=1C=CNC(=O)N=1 PJKKQFAEFWCNAQ-UHFFFAOYSA-N 0.000 description 1
- 208000002454 Nasopharyngeal Carcinoma Diseases 0.000 description 1
- 206010061306 Nasopharyngeal cancer Diseases 0.000 description 1
- 208000034176 Neoplasms, Germ Cell and Embryonal Diseases 0.000 description 1
- 206010029260 Neuroblastoma Diseases 0.000 description 1
- 208000015914 Non-Hodgkin lymphomas Diseases 0.000 description 1
- 102000007999 Nuclear Proteins Human genes 0.000 description 1
- 108010089610 Nuclear Proteins Proteins 0.000 description 1
- 108020004711 Nucleic Acid Probes Proteins 0.000 description 1
- 206010030155 Oesophageal carcinoma Diseases 0.000 description 1
- 108700020796 Oncogene Proteins 0.000 description 1
- 102000043276 Oncogene Human genes 0.000 description 1
- 108700026244 Open Reading Frames Proteins 0.000 description 1
- 206010031096 Oropharyngeal cancer Diseases 0.000 description 1
- 206010057444 Oropharyngeal neoplasm Diseases 0.000 description 1
- 208000007571 Ovarian Epithelial Carcinoma Diseases 0.000 description 1
- 206010033128 Ovarian cancer Diseases 0.000 description 1
- 206010061328 Ovarian epithelial cancer Diseases 0.000 description 1
- 206010061535 Ovarian neoplasm Diseases 0.000 description 1
- 229910019142 PO4 Inorganic materials 0.000 description 1
- 208000000821 Parathyroid Neoplasms Diseases 0.000 description 1
- 208000002471 Penile Neoplasms Diseases 0.000 description 1
- 206010034299 Penile cancer Diseases 0.000 description 1
- 208000009565 Pharyngeal Neoplasms Diseases 0.000 description 1
- 206010034811 Pharyngeal cancer Diseases 0.000 description 1
- 206010035052 Pineal germinoma Diseases 0.000 description 1
- 208000007913 Pituitary Neoplasms Diseases 0.000 description 1
- 201000005746 Pituitary adenoma Diseases 0.000 description 1
- 206010061538 Pituitary tumour benign Diseases 0.000 description 1
- 201000008199 Pleuropulmonary blastoma Diseases 0.000 description 1
- 208000006664 Precursor Cell Lymphoblastic Leukemia-Lymphoma Diseases 0.000 description 1
- 206010036790 Productive cough Diseases 0.000 description 1
- 108091028733 RNTP Proteins 0.000 description 1
- 238000011529 RT qPCR Methods 0.000 description 1
- 102100030086 Receptor tyrosine-protein kinase erbB-2 Human genes 0.000 description 1
- 102100020718 Receptor-type tyrosine-protein kinase FLT3 Human genes 0.000 description 1
- 208000015634 Rectal Neoplasms Diseases 0.000 description 1
- 206010038389 Renal cancer Diseases 0.000 description 1
- 208000006265 Renal cell carcinoma Diseases 0.000 description 1
- 201000000582 Retinoblastoma Diseases 0.000 description 1
- 108091028664 Ribonucleotide Proteins 0.000 description 1
- -1 SPRI beads Chemical class 0.000 description 1
- 208000004337 Salivary Gland Neoplasms Diseases 0.000 description 1
- 206010061934 Salivary gland cancer Diseases 0.000 description 1
- 241000702208 Shigella phage SfX Species 0.000 description 1
- 208000000453 Skin Neoplasms Diseases 0.000 description 1
- 206010041067 Small cell lung cancer Diseases 0.000 description 1
- 208000021712 Soft tissue sarcoma Diseases 0.000 description 1
- 208000000102 Squamous Cell Carcinoma of Head and Neck Diseases 0.000 description 1
- LSNNMFCWUKXFEE-UHFFFAOYSA-N Sulfurous acid Chemical group OS(O)=O LSNNMFCWUKXFEE-UHFFFAOYSA-N 0.000 description 1
- 208000002847 Surgical Wound Diseases 0.000 description 1
- 208000031673 T-Cell Cutaneous Lymphoma Diseases 0.000 description 1
- 206010042971 T-cell lymphoma Diseases 0.000 description 1
- 208000027585 T-cell non-Hodgkin lymphoma Diseases 0.000 description 1
- 206010043515 Throat cancer Diseases 0.000 description 1
- 201000009365 Thymic carcinoma Diseases 0.000 description 1
- 208000024770 Thyroid neoplasm Diseases 0.000 description 1
- 101710183280 Topoisomerase Proteins 0.000 description 1
- 230000010632 Transcription Factor Activity Effects 0.000 description 1
- 108020004566 Transfer RNA Proteins 0.000 description 1
- 206010046431 Urethral cancer Diseases 0.000 description 1
- 206010046458 Urethral neoplasms Diseases 0.000 description 1
- 208000007097 Urinary Bladder Neoplasms Diseases 0.000 description 1
- 208000006105 Uterine Cervical Neoplasms Diseases 0.000 description 1
- 201000005969 Uveal melanoma Diseases 0.000 description 1
- 206010047741 Vulval cancer Diseases 0.000 description 1
- 208000004354 Vulvar Neoplasms Diseases 0.000 description 1
- 208000016025 Waldenstroem macroglobulinemia Diseases 0.000 description 1
- 208000008383 Wilms tumor Diseases 0.000 description 1
- 208000027418 Wounds and injury Diseases 0.000 description 1
- 108091060592 XDNA Proteins 0.000 description 1
- 102100039974 Zinc finger protein 215 Human genes 0.000 description 1
- 208000009956 adenocarcinoma Diseases 0.000 description 1
- 208000020990 adrenal cortex carcinoma Diseases 0.000 description 1
- 208000007128 adrenocortical carcinoma Diseases 0.000 description 1
- 238000010171 animal model Methods 0.000 description 1
- 239000003242 anti bacterial agent Substances 0.000 description 1
- 229940088710 antibiotic agent Drugs 0.000 description 1
- 201000011165 anus cancer Diseases 0.000 description 1
- 230000001640 apoptogenic effect Effects 0.000 description 1
- 230000006907 apoptotic process Effects 0.000 description 1
- 208000021780 appendiceal neoplasm Diseases 0.000 description 1
- 238000007845 assembly PCR Methods 0.000 description 1
- 238000007846 asymmetric PCR Methods 0.000 description 1
- 230000037147 athletic performance Effects 0.000 description 1
- 208000026900 bile duct neoplasm Diseases 0.000 description 1
- 230000015572 biosynthetic process Effects 0.000 description 1
- 210000001772 blood platelet Anatomy 0.000 description 1
- 238000009534 blood test Methods 0.000 description 1
- 210000001185 bone marrow Anatomy 0.000 description 1
- 201000008873 bone osteosarcoma Diseases 0.000 description 1
- 201000002143 bronchus adenoma Diseases 0.000 description 1
- 230000001680 brushing effect Effects 0.000 description 1
- 239000000872 buffer Substances 0.000 description 1
- 230000030833 cell death Effects 0.000 description 1
- 208000030239 cerebral astrocytoma Diseases 0.000 description 1
- 201000010881 cervical cancer Diseases 0.000 description 1
- 238000012512 characterization method Methods 0.000 description 1
- 210000000038 chest Anatomy 0.000 description 1
- 208000011654 childhood malignant neoplasm Diseases 0.000 description 1
- 208000006990 cholangiocarcinoma Diseases 0.000 description 1
- 208000032852 chronic lymphocytic leukemia Diseases 0.000 description 1
- 230000003931 cognitive performance Effects 0.000 description 1
- 230000000295 complement effect Effects 0.000 description 1
- 239000002299 complementary DNA Substances 0.000 description 1
- 239000012141 concentrate Substances 0.000 description 1
- 230000021615 conjugation Effects 0.000 description 1
- 201000007241 cutaneous T cell lymphoma Diseases 0.000 description 1
- 229940104302 cytosine Drugs 0.000 description 1
- 230000006378 damage Effects 0.000 description 1
- 238000003066 decision tree Methods 0.000 description 1
- 230000017858 demethylation Effects 0.000 description 1
- 238000010520 demethylation reaction Methods 0.000 description 1
- 230000000779 depleting effect Effects 0.000 description 1
- 238000003745 diagnosis Methods 0.000 description 1
- 235000001916 dieting Nutrition 0.000 description 1
- 230000037228 dieting effect Effects 0.000 description 1
- 230000005750 disease progression Effects 0.000 description 1
- 238000006073 displacement reaction Methods 0.000 description 1
- 229940079593 drug Drugs 0.000 description 1
- 239000003814 drug Substances 0.000 description 1
- 239000000839 emulsion Substances 0.000 description 1
- 210000002889 endothelial cell Anatomy 0.000 description 1
- 239000003623 enhancer Substances 0.000 description 1
- 230000006862 enzymatic digestion Effects 0.000 description 1
- 201000004101 esophageal cancer Diseases 0.000 description 1
- 230000029142 excretion Effects 0.000 description 1
- 102000013165 exonuclease Human genes 0.000 description 1
- 239000000835 fiber Substances 0.000 description 1
- 230000005669 field effect Effects 0.000 description 1
- 238000011049 filling Methods 0.000 description 1
- 238000013467 fragmentation Methods 0.000 description 1
- 238000006062 fragmentation reaction Methods 0.000 description 1
- 201000010175 gallbladder cancer Diseases 0.000 description 1
- 201000011243 gastrointestinal stromal tumor Diseases 0.000 description 1
- 210000003731 gingival crevicular fluid Anatomy 0.000 description 1
- 201000009277 hairy cell leukemia Diseases 0.000 description 1
- 201000010536 head and neck cancer Diseases 0.000 description 1
- 208000014829 head and neck neoplasm Diseases 0.000 description 1
- 201000000459 head and neck squamous cell carcinoma Diseases 0.000 description 1
- 201000010235 heart cancer Diseases 0.000 description 1
- 208000024348 heart neoplasm Diseases 0.000 description 1
- 238000003505 heat denaturation Methods 0.000 description 1
- 208000029824 high grade glioma Diseases 0.000 description 1
- 238000012165 high-throughput sequencing Methods 0.000 description 1
- 238000007849 hot-start PCR Methods 0.000 description 1
- 238000007031 hydroxymethylation reaction Methods 0.000 description 1
- 201000006866 hypopharynx cancer Diseases 0.000 description 1
- 230000002267 hypothalamic effect Effects 0.000 description 1
- 230000002621 immunoprecipitating effect Effects 0.000 description 1
- 238000000338 in vitro Methods 0.000 description 1
- 238000010348 incorporation Methods 0.000 description 1
- 208000014674 injury Diseases 0.000 description 1
- 150000002500 ions Chemical class 0.000 description 1
- 210000004153 islets of langerhan Anatomy 0.000 description 1
- 238000011901 isothermal amplification Methods 0.000 description 1
- 201000010982 kidney cancer Diseases 0.000 description 1
- 206010023841 laryngeal neoplasm Diseases 0.000 description 1
- 208000032839 leukemia Diseases 0.000 description 1
- 238000007834 ligase chain reaction Methods 0.000 description 1
- 206010024627 liposarcoma Diseases 0.000 description 1
- 239000007788 liquid Substances 0.000 description 1
- 210000004185 liver Anatomy 0.000 description 1
- 201000007270 liver cancer Diseases 0.000 description 1
- 208000014018 liver neoplasm Diseases 0.000 description 1
- 238000011068 loading method Methods 0.000 description 1
- 208000037841 lung tumor Diseases 0.000 description 1
- 230000001926 lymphatic effect Effects 0.000 description 1
- 238000007403 mPCR Methods 0.000 description 1
- 201000000564 macroglobulinemia Diseases 0.000 description 1
- 208000030883 malignant astrocytoma Diseases 0.000 description 1
- 201000011614 malignant glioma Diseases 0.000 description 1
- 208000026045 malignant tumor of parathyroid gland Diseases 0.000 description 1
- 238000007726 management method Methods 0.000 description 1
- 238000004519 manufacturing process Methods 0.000 description 1
- 238000013507 mapping Methods 0.000 description 1
- 239000003550 marker Substances 0.000 description 1
- 238000005399 mechanical ventilation Methods 0.000 description 1
- 201000001441 melanoma Diseases 0.000 description 1
- 210000000716 merkel cell Anatomy 0.000 description 1
- 229910044991 metal oxide Inorganic materials 0.000 description 1
- 150000004706 metal oxides Chemical class 0.000 description 1
- 208000037970 metastatic squamous neck cancer Diseases 0.000 description 1
- 238000002493 microarray Methods 0.000 description 1
- 206010051747 multiple endocrine neoplasia Diseases 0.000 description 1
- 230000000869 mutational effect Effects 0.000 description 1
- 201000005962 mycosis fungoides Diseases 0.000 description 1
- 208000025113 myeloid leukemia Diseases 0.000 description 1
- 201000011216 nasopharynx carcinoma Diseases 0.000 description 1
- 230000001338 necrotic effect Effects 0.000 description 1
- 238000013188 needle biopsy Methods 0.000 description 1
- 239000013642 negative control Substances 0.000 description 1
- 230000009826 neoplastic cell growth Effects 0.000 description 1
- 238000007857 nested PCR Methods 0.000 description 1
- 230000000683 nonmetastatic effect Effects 0.000 description 1
- 230000009871 nonspecific binding Effects 0.000 description 1
- 239000002853 nucleic acid probe Substances 0.000 description 1
- 201000002575 ocular melanoma Diseases 0.000 description 1
- 201000006958 oropharynx cancer Diseases 0.000 description 1
- 201000008968 osteosarcoma Diseases 0.000 description 1
- 208000021284 ovarian germ cell tumor Diseases 0.000 description 1
- 238000009595 pap smear Methods 0.000 description 1
- 208000028591 pheochromocytoma Diseases 0.000 description 1
- 239000010452 phosphate Substances 0.000 description 1
- NBIIXXVUZAFLBC-UHFFFAOYSA-K phosphate Chemical compound [O-]P([O-])([O-])=O NBIIXXVUZAFLBC-UHFFFAOYSA-K 0.000 description 1
- 201000007315 pineal gland astrocytoma Diseases 0.000 description 1
- 201000004838 pineal region germinoma Diseases 0.000 description 1
- 208000021310 pituitary gland adenoma Diseases 0.000 description 1
- 239000013612 plasmid Substances 0.000 description 1
- 210000004180 plasmocyte Anatomy 0.000 description 1
- 238000005498 polishing Methods 0.000 description 1
- 238000006116 polymerization reaction Methods 0.000 description 1
- 229920001184 polypeptide Polymers 0.000 description 1
- 239000011148 porous material Substances 0.000 description 1
- 230000001376 precipitating effect Effects 0.000 description 1
- 238000004321 preservation Methods 0.000 description 1
- 208000025638 primary cutaneous T-cell non-Hodgkin lymphoma Diseases 0.000 description 1
- 108090000765 processed proteins & peptides Proteins 0.000 description 1
- 102000004196 processed proteins & peptides Human genes 0.000 description 1
- 238000012207 quantitative assay Methods 0.000 description 1
- 230000008707 rearrangement Effects 0.000 description 1
- 206010038038 rectal cancer Diseases 0.000 description 1
- 201000001275 rectum cancer Diseases 0.000 description 1
- 230000000306 recurrent effect Effects 0.000 description 1
- 230000001105 regulatory effect Effects 0.000 description 1
- 208000030859 renal pelvis/ureter urothelial carcinoma Diseases 0.000 description 1
- 238000010839 reverse transcription Methods 0.000 description 1
- 201000009410 rhabdomyosarcoma Diseases 0.000 description 1
- 239000002336 ribonucleotide Substances 0.000 description 1
- 125000002652 ribonucleotide group Chemical group 0.000 description 1
- 108020004418 ribosomal RNA Proteins 0.000 description 1
- 108091092562 ribozyme Proteins 0.000 description 1
- 238000005096 rolling process Methods 0.000 description 1
- 229920006395 saturated elastomer Polymers 0.000 description 1
- 238000007790 scraping Methods 0.000 description 1
- 210000000582 semen Anatomy 0.000 description 1
- 201000000849 skin cancer Diseases 0.000 description 1
- 201000008261 skin carcinoma Diseases 0.000 description 1
- 239000010454 slate Substances 0.000 description 1
- 208000000649 small cell carcinoma Diseases 0.000 description 1
- 208000000587 small cell lung carcinoma Diseases 0.000 description 1
- 201000002314 small intestine cancer Diseases 0.000 description 1
- 238000000527 sonication Methods 0.000 description 1
- 238000011895 specific detection Methods 0.000 description 1
- 210000003802 sputum Anatomy 0.000 description 1
- 208000024794 sputum Diseases 0.000 description 1
- 238000006467 substitution reaction Methods 0.000 description 1
- 201000008205 supratentorial primitive neuroectodermal tumor Diseases 0.000 description 1
- 230000004083 survival effect Effects 0.000 description 1
- 210000004243 sweat Anatomy 0.000 description 1
- 210000001179 synovial fluid Anatomy 0.000 description 1
- 238000002560 therapeutic procedure Methods 0.000 description 1
- 238000005382 thermal cycling Methods 0.000 description 1
- 201000002510 thyroid cancer Diseases 0.000 description 1
- 238000004448 titration Methods 0.000 description 1
- 208000029387 trophoblastic neoplasm Diseases 0.000 description 1
- 210000004881 tumor cell Anatomy 0.000 description 1
- 201000005112 urinary bladder cancer Diseases 0.000 description 1
- 208000037965 uterine sarcoma Diseases 0.000 description 1
- 206010046885 vaginal cancer Diseases 0.000 description 1
- 208000013139 vaginal neoplasm Diseases 0.000 description 1
- 239000013598 vector Substances 0.000 description 1
- 230000003612 virological effect Effects 0.000 description 1
- 210000000239 visual pathway Anatomy 0.000 description 1
- 230000004400 visual pathway Effects 0.000 description 1
- 201000005102 vulva cancer Diseases 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6806—Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6869—Methods for sequencing
Definitions
- Circulating tumor DNA has increasingly demonstrated potential as a non-invasive, tumor-specific biomarker for routine clinical use.
- ctDNA is derived from tumor cells predominantly undergoing cell-death and released into circulation of various bodily fluids including blood.
- the majority of blood-derived cell-free DNA originates from healthy (e.g., non-cancerous) tissues.
- the fraction of ctDNA observed may range from ≤0.1% to 90% of total cell-free DNA at diagnosis depending on several factors including primary site of the tumor and disease burden.
- ctDNA has been providing non-invasive access to the tumor’s molecular landscape and disease burden. Methods for detecting ctDNA with increased sensitivity are needed, especially in subjects with lower abundance of ctDNA.
- the present disclosure provides a method for detection of hypermethylated cell-free nucleic acid molecules in a sample.
- it integrates the use of the Oxford Nanopore sequencing technology (ONT), allowing for the sequencing of the enriched methylated single-stranded DNA generated after immunoprecipitation without any PCR amplification.
- This disclosure allows for the measurement of methylation status at a single base pair resolution, as well as direct count of methylated cell-free nucleic acid molecules.
- This disclosure allows for the mapping of the whole methylome without needing to sequence the whole genome or targeted sequencing, in a PCR-free and bisulfite or enzymatic conversion-free manner.
- the present disclosure presents methods for providing a plurality of nucleic acid molecules derived from a nucleic acid sample; subjecting said plurality of nucleic acid molecules to enrichment to yield (i) a first plurality of nucleic acid molecules having a methylation level at or above a threshold methylation level and (ii) a second plurality of nucleic acid molecules having a methylation level below said threshold methylation level, wherein said enrichment does not use bisulfite conversion or enzymatic conversion of said plurality of nucleic acid molecules; and directing a nucleic acid molecule derived from said first plurality of nucleic acid molecules or second plurality of nucleic acid molecules to a nanopore of a nanopore sensing platform, which nanopore of said nanopore sensing platform is used to identify a methylation status of said nucleic acid molecule.
- the present disclosure presents a method, comprising: (a) providing a plurality of nucleic acid molecules derived from a nucleic acid sample; (b) subjecting said plurality of nucleic acid molecules to enrichment to yield (i) a first plurality of nucleic acid molecules having a methylation level at or above a threshold methylation level and (ii) a second plurality of nucleic acid molecules having a methylation level below said threshold methylation level, wherein said enrichment does not use bisulfite conversion or enzymatic conversion of said plurality of nucleic acid molecules; and (c) directing a nucleic acid molecule derived from said first plurality of nucleic acid molecules or second plurality of nucleic acid molecules to a nanopore of a nanopore sensing platform, wherein the nanopore of said nanopore sensing platform is used to identify a methylation status of said nucleic acid molecule.
- said plurality of nucleic acid molecules is a deoxyribonucleic acid (DNA) molecule.
- said nucleic acid sample is a cell-free DNA (cfDNA) sample.
- said enrichment comprises using a plurality of filler nucleic acid molecules to enrich for said first plurality of nucleic acid molecules.
- (b) further comprises incubating said plurality of nucleic acid molecules and said plurality of filler nucleic acid molecules under conditions sufficient to enrich for a methylated region of said plurality of nucleic acid molecules to yield said first plurality of nucleic acid molecules.
- said plurality of filler nucleic acid molecules is a plurality of exogenous filler DNAs, wherein said exogenous filler DNAs are not derived from said nucleic acid sample.
- said plurality of filler nucleic acid molecules is one or more samples derived from a pool of more than one nucleic acid samples.
- said first plurality of nucleic acid molecules is hypermethylated.
- said second plurality of nucleic acid molecules is hypomethylated.
- said plurality of filler nucleic acid molecules is added in at least about a 5-fold excess relative to total nucleic acid in said nucleic acid sample. In some cases, said plurality of filler nucleic acid molecules increases a fold enrichment ratio.
- said fold enrichment ratio is at least 500. In some cases, said fold enrichment ratio is at least 1,000. In some cases, said plurality of filler nucleic acid molecules does not align to a human genome. In some cases, said plurality of filler nucleic acid molecules is XDNA. In some cases, said plurality of filler nucleic acid molecules comprises a fragment length of about 50 base pairs (bp) to about 800 bp. In some cases, said plurality of filler nucleic acid molecules is biotinylated. In some cases, said plurality of filler nucleic acid molecules comprises nonbiotinylated filler nucleic acid molecules.
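The fold-excess arithmetic for the filler can be sketched as follows. This is a minimal illustration only: the function name `filler_amount_ng`, the use of nanograms, and the choice of a mass basis are assumptions, since the text states the excess relative to total nucleic acid without fixing a unit.

```python
def filler_amount_ng(sample_ng: float, fold_excess: float = 5.0) -> float:
    """Return the filler amount (ng) for a given fold excess over the sample.

    Assumption: the excess is computed on a mass basis in nanograms.
    The 5-fold default mirrors the "at least about a 5-fold excess" wording.
    """
    if sample_ng < 0 or fold_excess < 0:
        raise ValueError("amounts must be non-negative")
    return sample_ng * fold_excess


# Hypothetical example: a 10 ng sample at the 5-fold minimum
print(filler_amount_ng(10.0))
```

Because the text gives 5-fold only as a lower bound, a larger `fold_excess` can be passed in where a protocol calls for more filler.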
- said incubating comprises contacting said nucleic acid sample and said plurality of filler nucleic acid molecules with a binder selective for methylated regions of nucleic acid molecules.
- said binder is selected from the group consisting of an anti-5-methylcytosine antibody or a derivative thereof, an anti-5-carboxylcytosine antibody or a derivative thereof, an anti-5-formylcytosine antibody or a derivative thereof, an anti-5-hydroxymethylcytosine antibody or a derivative thereof, an anti-3-methylcytosine antibody or a derivative thereof, and any combinations thereof.
- said binder is said anti-5-methylcytosine antibody or a derivative thereof.
- said first plurality of nucleic acid molecules and said second plurality of nucleic acid molecules are ligated with double stranded adaptor oligonucleotides. In some cases, said first plurality of nucleic acid molecules and said second plurality of nucleic acid molecules are annealed with single-stranded oligonucleotides. In some cases, said plurality of filler nucleic acid molecules is not removed. In some cases, said plurality of filler nucleic acid molecules is not biotinylated filler nucleic acid molecules. In some cases, prior to (c), the method further comprises removing a substantial amount of said plurality of filler nucleic acid molecules.
- said plurality of filler nucleic acid molecules comprise biotinylated filler nucleic acid molecules.
- the method further comprises contacting said biotinylated filler nucleic acid molecules with a plurality of streptavidin beads.
- removing comprises using hybridization capture.
- prior to (c), the method further comprises using hybridization capture to purify said first plurality of nucleic acid molecules and said second plurality of nucleic acid molecules.
- (c) comprises directing said nucleic acid molecule through said nanopore.
- (c) further comprises using said nanopore sensing platform to measure a current or change thereof as said nucleic acid molecule interacts with said nanopore.
- said current is a faradic current. In some cases, said current is tunneling current. In some cases, (c) further comprises using said nanopore sensing platform to measure a charge, conductivity, resistance, or impedance, or change thereof, as said nucleic acid molecule interacts with said nanopore. In some cases, in (c) said nucleic acid molecule is derived from said first plurality of nucleic acid molecules. In some cases, in (c) said nucleic acid molecule is derived from said second plurality of nucleic acid molecules. In some cases, subsequent to (c), the method further comprises removing an unmethylated nucleic acid molecule.
- said incubating comprises contacting said nucleic acid sample and said plurality of filler nucleic acid molecules with a methylated nucleic acid capture reagent.
- said methylated nucleic acid capture reagent comprises a binder and a solid substrate.
- said methylated nucleic acid capture reagent is generated by coupling said binder to said solid substrate by incubating said binder with said solid substrate.
- said solid substrate is a bead.
- said bead is a protein A bead.
- said solid substrate is a magnetic solid substrate.
- said binder is selected from the group consisting of an anti-5-methylcytosine antibody or a derivative thereof, an anti-5-carboxylcytosine antibody or a derivative thereof, an anti-5-formylcytosine antibody or a derivative thereof, an anti-5-hydroxymethylcytosine antibody or a derivative thereof, an anti-3-methylcytosine antibody or a derivative thereof, and any combinations thereof.
- said first plurality of nucleic acid molecules and said second plurality of nucleic acid molecules are pooled together.
- said plurality of nucleic acid molecules is derived from a pool of more than one nucleic acid samples.
- said nucleic acid molecule derived from said first plurality of nucleic acid molecules or second plurality of nucleic acid molecules undergoes amplification before (c). In some cases, said nucleic acid molecule derived from said first plurality of nucleic acid molecules or second plurality of nucleic acid molecules does not undergo amplification before (c). In some cases, said amplification is a PCR amplification.
- FIG. 3A shows a diagram illustrating an exemplary process for integrating a motor protein into DNA.
- FIG. 3B diagrams an exemplary experimental design for integrating a motor protein.
- FIG. 4 shows a schematic of a computer system, in accordance with embodiments of the present disclosure.
- FIG. 5 shows three designs illustrating exemplary processes for integrating nanopore sequencing with cell-free Methylated DNA Immunoprecipitation (cfMeDIP).
- FIG. 6 shows a diagram illustrating an example process for integrating nanopore sequencing with cfMeDIP.
- FIG. 7 shows a diagram illustrating an example process for integrating nanopore sequencing with cfMeDIP, which comprises a separate step of pre-binding magnetic beads and a 5-mC binder to create a magnetic bead-5-mC binder complex.
- FIG. 8 shows the read distribution of raw reads obtained from a process of integrating nanopore sequencing with cfMeDIP.
- FIG. 9 shows the read count of each index obtained from a process of integrating nanopore sequencing with cfMeDIP.
- FIG. 10 shows the read distribution by fragment size obtained from a process of integrating nanopore sequencing with cfMeDIP.
- FIG. 11 shows a diagram illustrating an example process for integrating nanopore sequencing with cfMeDIP compared to integrating with Illumina sequencing.
- the present disclosure provides methods and systems for the processing and analysis of nucleic acids present in biological samples, which can be useful in determining a risk or likelihood of a subject having cancer or a tumor with high sensitivity and/or high specificity.
- Methods and systems provided herein can comprise the creation of hypermethylated cell-free nucleic acid molecules which can be processed to differentiate between, for example, cancerous and non-cancerous tissue in circulating free DNA (cfDNA).
- the use and analysis of hypermethylated nucleic acids can allow for highly sensitive and highly specific detection and/or characterization of circulating tumor DNA (ctDNA) in a fluid sample (e.g., a blood sample) obtained from a subject.
- the use and analysis of hypermethylated nucleic acids can allow for increased sensitivity, specificity, and/or efficiency in the determination of a subject’s risk of having or developing a tumor or cancer.
- the analysis of hypermethylated nucleic acids can comprise nanopore sequencing.
- subject generally refers to any member of the animal kingdom. Thus, the methods described herein are applicable to both human and veterinary disease and animal models. Preferred subjects are “patients,” i.e., living humans that are being investigated to determine whether treatment or medical care is needed for a disease or condition; or that are receiving medical care for a disease or condition (e.g., cancer).
- genome generally refers to genomic information from a subject, which may be, for example, at least a portion or an entirety of a subject’s hereditary information.
- a genome can be encoded either in DNA or in RNA.
- a genome can comprise coding regions (e.g., that code for proteins) as well as non-coding regions.
- a genome can include the sequence of all chromosomes together in an organism.
- the human genome ordinarily has a total of 46 chromosomes. The sequence of all of these together may constitute a human genome.
- nucleic acid used herein generally refers to a polynucleotide comprising two or more nucleotides, i.e., a polymeric form of nucleotides of any length, either deoxyribonucleotides (dNTPs) or ribonucleotides (rNTPs), or analogs thereof.
- Non-limiting examples of nucleic acids include deoxyribonucleic acid (DNA), ribonucleic acid (RNA), coding or non-coding regions of a gene or gene fragment, loci (locus) defined from linkage analysis, exons, introns, messenger RNA (mRNA), transfer RNA, ribosomal RNA, short interfering RNA (siRNA), short-hairpin RNA (shRNA), micro-RNA (miRNA), ribozymes, cDNA, recombinant nucleic acids, branched nucleic acids, plasmids, vectors, isolated DNA of any sequence, isolated RNA of any sequence, nucleic acid probes, and primers.
- a nucleic acid may comprise one or more modified nucleotides, such as methylated nucleotides and nucleotide analogs. If present, modifications to the nucleotide structure may be made before or after assembly of the nucleic acid.
- the sequence of nucleotides of a nucleic acid may be interrupted by non-nucleotide components.
- a nucleic acid may be further modified after polymerization, such as by conjugation or binding with a reporter agent.
- a “variant” nucleic acid is a polynucleotide having a nucleotide sequence identical to that of its original nucleic acid except for at least one modified nucleotide, for example, one that is deleted, inserted, or replaced. The variant may have a nucleotide sequence with at least about 80%, 90%, 95%, or 99% identity to the nucleotide sequence of the original nucleic acid.
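The percent-identity figures in the variant definition can be illustrated with a short calculation. This is a sketch under stated assumptions: the two sequences are taken to be already aligned to equal length, a `-` character marks a gap, and identity is counted over all alignment columns. The text does not specify an alignment method or identity convention, and the function name `percent_identity` is hypothetical.

```python
def percent_identity(original: str, variant: str) -> float:
    """Percent identity of two pre-aligned, equal-length sequences.

    Assumptions: '-' marks an alignment gap (a deleted or inserted base),
    and gap columns count as mismatches.
    """
    if len(original) != len(variant):
        raise ValueError("sequences must be aligned to the same length")
    if not original:
        raise ValueError("sequences must be non-empty")
    pairs = zip(original.upper(), variant.upper())
    matches = sum(a == b and a != "-" for a, b in pairs)
    return 100.0 * matches / len(original)


# Hypothetical example: one substitution in ten aligned positions
print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
```

Other identity conventions (for example, excluding gap columns from the denominator) would give slightly different values for sequences with insertions or deletions.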
- Cell-free methylated DNA is DNA that can be one or more nucleic acid molecules circulating freely in the blood stream. In some cases, cell-free methylated DNA can be methylated at various regions of the DNA. Samples, for example, plasma samples, may be taken to analyze cell-free methylated DNA. Studies reveal that much of the circulating nucleic acid in blood arises from necrotic or apoptotic cells, and greatly elevated levels of nucleic acids from apoptosis are observed in diseases such as cancer.
- because circulating DNA bears hallmark signs of the disease, including mutations in oncogenes, microsatellite alterations, and, for certain cancers, viral genomic sequences, DNA or RNA in plasma has become increasingly studied as a potential biomarker for disease.
- a quantitative assay for low levels of circulating tumor DNA in total circulating DNA may serve as a better marker for detecting the relapse of colorectal cancer compared with carcinoembryonic antigen, the standard biomarker used clinically.
- Cell-free DNA e.g., circulating cfDNA
- library preparation generally includes one or more of end-repair, A-tailing, adaptor ligation, or any other preparation performed on the cell-free DNA to permit subsequent sequencing of DNA.
- supplemental processed DNA may be noncoding DNA or it may consist of amplicons.
- nanopore generally refers to a pore, channel or passage formed or otherwise provided in a membrane.
- a membrane can be an organic membrane, such as a lipid bilayer, or a synthetic membrane, such as a membrane formed of a polymeric material.
- the membrane can be a polymeric material.
- the nanopore can be disposed adjacent or in proximity to a sensing circuit or an electrode coupled to a sensing circuit, such as, for example, a complementary metal-oxide semiconductor (CMOS) or field effect transistor (FET) circuit.
- a nanopore has a characteristic width or diameter on the order of 0.1 nanometers (nm) to about 1000 nm.
- Some nanopores are proteins.
- the fragment length metric is fragment length.
- the subject cell-free methylated DNA is limited to fragments having a length of ≤170 bp, ≤165 bp, ≤160 bp, ≤155 bp, ≤150 bp, ≤145 bp, ≤140 bp, ≤135 bp, ≤130 bp, ≤125 bp, ≤120 bp, ≤115 bp, ≤110 bp, ≤105 bp, or ≤100 bp.
- the subject cell-free methylated DNA is limited to fragments having a length of about 100 to about 150 bp, 110 to 140 bp, or 120 to 130 bp.
- the fragment length metric is the fragment length distribution of the subject cell-free methylated DNA.
- the subject cell-free methylated DNA is limited to fragments within the bottom 50th, 45th, 40th, 35th, 30th, 25th, 20th, 15th, or 10th percentile based on length.
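The two fragment length metrics described above (an absolute length cutoff and a percentile of the length distribution) can be sketched as simple filters. This is an illustrative sketch, not the disclosed implementation: the function names are hypothetical, fragment lengths are assumed to be available as integers in base pairs, and the nearest-rank percentile definition is an assumption, since the text does not say how percentiles are computed.

```python
import math
from typing import List


def filter_by_max_length(lengths: List[int], max_bp: int = 150) -> List[int]:
    """Keep fragments no longer than max_bp (absolute length cutoff)."""
    return [n for n in lengths if n <= max_bp]


def filter_by_percentile(lengths: List[int], percentile: float = 50.0) -> List[int]:
    """Keep fragments at or below the given length percentile.

    Assumption: nearest-rank percentile over the observed lengths.
    """
    if not 0 < percentile <= 100:
        raise ValueError("percentile must be in (0, 100]")
    if not lengths:
        return []
    ordered = sorted(lengths)
    rank = max(1, math.ceil(percentile / 100 * len(ordered)))
    cutoff = ordered[rank - 1]
    return [n for n in lengths if n <= cutoff]


# Hypothetical fragment lengths in bp
fragments = [90, 120, 145, 167, 180, 320]
print(filter_by_max_length(fragments, 150))
print(filter_by_percentile(fragments, 50))
```

The absolute cutoff is independent of the sample, whereas the percentile cutoff adapts to each sample's own length distribution.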
- Cell-free DNA may be processed and enriched for methylated cfDNA using any one of the methods disclosed herein. As shown in FIG. 1, cfDNA may undergo end-repair, A-tailing, adaptor ligation, or other preparation to produce a library of DNAs for downstream sequencing.
- the library may be combined with supplemental processed DNA (e.g., filler DNA) and/or spiked-in DNAs to produce a sample mixture, before heat denaturation and immunoprecipitation to enrich for methylated cfDNA.
- Immunoprecipitation can comprise combining the sample mixture with any one of the binders disclosed herein and a solid substrate (e.g., a plurality of magnetic beads).
- immunoprecipitation can comprise combining the sample mixture with a methylated nucleic acid capture reagent, wherein the methylated nucleic acid capture reagent comprises an incubated mixture of a binder and a solid substrate.
- the methylated cfDNA may undergo amplification before being subjected to sequencing (e.g., Illumina sequencing reaction).
- the methylated cfDNA may not undergo amplification before being subjected to sequencing (e.g., nanopore sequencing reaction).
- cfDNA Cell-free DNA
- cancer development can be associated with focal gain of 5-methylcytosines (5mC), for instance, at cytosine-phosphate-guanine (CpG) islands and CpG island shores. Cancer development can also be associated with global (e.g., genome-wide) cytosine demethylation (e.g., global loss of 5mC).
- ctDNA can be distinguished from cfDNA molecules derived from healthy tissue (e.g., non-tumor and/or noncancer tissue) by the methylation level (e.g., the percentage of nucleotide residues that are methylated) of the nucleic acid molecules.
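The methylation level described here (the percentage of nucleotide residues that are methylated) and its comparison with a threshold methylation level can be sketched as below. The function names and the example threshold are hypothetical; the text does not give a specific threshold value or say how methylated residues are counted.

```python
def methylation_level(num_methylated: int, num_residues: int) -> float:
    """Percentage of nucleotide residues in a molecule that are methylated."""
    if num_residues <= 0:
        raise ValueError("num_residues must be positive")
    if not 0 <= num_methylated <= num_residues:
        raise ValueError("num_methylated must be between 0 and num_residues")
    return 100.0 * num_methylated / num_residues


def at_or_above_threshold(level: float, threshold: float) -> bool:
    """True when a molecule's level is at or above the threshold."""
    return level >= threshold


# Hypothetical example: 6 methylated residues in a 150-nt fragment,
# compared against an illustrative 3% threshold (not from the text)
level = methylation_level(6, 150)
print(level, at_or_above_threshold(level, 3.0))
```

Using "at or above" for one group and "below" for the other keeps the two groups mutually exclusive, so every molecule falls into exactly one of them.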
- nucleic acid molecules of or derived from tumor tissue and/or cancer tissue can be hypomethylated (e.g., can comprise a lower level of methylation, for instance, wherein there are fewer methylated nucleotide residues and/or a lower percentage of methylated nucleotide residues) compared to nucleic acid molecules of or derived from healthy tissue (e.g., nucleic acid molecules of or derived from healthy tissue that consist of or comprise nucleotide sequences corresponding to the same region(s) of the genome of the subject).
- tumor-derived nucleic acid molecules e.g., ctDNA molecules
- nucleic acid molecules of or derived from tumor tissue and/or cancer tissue can be hypermethylated (e.g., can comprise a higher level of methylation, for instance, wherein there are a greater number of methylated nucleotide residues and/or a higher percentage of methylated nucleotide residues) compared to nucleic acid molecules of or derived from healthy tissue (e.g., nucleic acid molecules of or derived from healthy tissue that consist of or comprise nucleotide sequences corresponding to the same region(s) of the genome of the subject).
- a tumor-derived fraction of a plurality of cell-free DNA molecules can be distinguished from cfDNA molecules derived from healthy tissue by one or more biophysical properties (e.g., the length of the cfDNA molecules or the presence of stereotypical 5’ and 3’ end sequence motifs) and/or one or more fragmentomics patterns.
- ctDNA molecules can have shorter nucleic acid lengths than cfDNA molecules derived from healthy tissues.
- ctDNA molecules may comprise stereotypical 5’ and 3’ end motifs.
- ctDNA present in a plurality of nucleic acid molecules (e.g., cfDNA) in or derived from a biological sample, for instance, because they are present in the sample in lower quantities relative to cfDNA derived from healthy tissue (e.g., which may require using a greater amount of potentially scarce biological sample and/or which may require significantly higher sequencing depth, if it is possible at all).
- a plurality of nucleic acid molecules e.g., a plurality of cell-free nucleic acid molecules, or amplicons thereof, comprising a biological sample
- depletion/removing may be performed by using a binder specific for methylated DNA molecules to pull them down.
- the pull-down is typically collected and the flow-through containing the unmethylated/hypomethylated DNA molecules is discarded.
- the current disclosure provides for the first time methods and systems to collect such flow-through containing unmethylated/hypomethylated DNA molecules and to generate a sequencing library using the unmethylated/hypomethylated DNA molecules or derivatives thereof.
- a depleted sequencing library of methods, systems, compositions, and kits disclosed herein may consist of or can be comprised of such a remainder population of nucleic acid molecules.
- it may be sufficient to deplete a plurality of nucleic acids (e.g., cfDNA molecules or amplicons thereof derived from a biological sample) of nucleic acid molecules methylated in one or more specific regions of the genomic sequence of the nucleic acid molecules (e.g., CpG islands, CpG island shores, or repetitive sequences of the genome, such as long interspersed nuclear elements (LINEs), short interspersed nuclear elements (SINEs), or LTRs (long terminal repeats)) to achieve increased sensitivity and/or increased specificity in assays for determining the presence or absence or the sequence identity of ctDNA molecules in the plurality.
- LINEs long interspersed nuclear elements
- SINEs short interspersed nuclear elements
- LTRs long terminal repeats
- a plurality of nucleic acids may be subjected to genome-wide depletion of nucleic acid molecules methylated in one or more specific regions of the genomic sequence of the nucleic acid molecules (e.g., CpG islands, CpG island shores, or repetitive sequences of the genome, such as long interspersed nuclear elements (LINEs), short interspersed nuclear elements (SINEs), or LTRs (long terminal repeats)) to achieve increased sensitivity and/or increased specificity in assays for determining the presence or absence or the sequence identity of ctDNA molecules in the plurality.
- a remainder population (e.g., a plurality of nucleic acid fragments useful in the creation of a depleted library) can be deprived of CpG genomic islands.
- a remainder population (e.g., a plurality of nucleic acid fragments useful in the creation of a depleted library) can comprise one or more of: long interspersed nuclear elements (LINEs), short interspersed nuclear elements (SINEs), or long terminal repeat (LTR) elements.
- LINEs long interspersed nuclear elements
- SINEs short interspersed nuclear elements
- LTR long terminal repeat
- Enrichment of all or a portion of the population of methylated DNA molecules (e.g., molecules having increased nucleotide methylation levels throughout or in a subset of the regions of the genome represented by the plurality of nucleic acid molecules of a biological sample) from a plurality of nucleic acid molecules (e.g., a plurality of cell-free nucleic acid molecules, or amplicons thereof, comprising a biological sample) may yield a population of the plurality of nucleic acids of the biological sample that may be useful for determining a presence and/or sequence identity of ctDNA molecules in the biological sample.
- enrichment may be performed by using a binder specific for methylated DNA molecules to pull them down.
- the pull-down is collected and the flow-through containing the unmethylated/hypomethylated DNA molecules can be discarded or alternatively be collected (e.g., used to generate a depletion library as described in this disclosure).
- the enriched fraction may be then subjected to sequencing.
- Depletion or enrichment of all or a portion of the methylated nucleic acid molecules of a plurality of nucleic acid molecules of a biological sample may comprise contacting the methylated nucleic acid molecules with a binder (e.g., an affinity molecule, such as an antibody or a protein, specific to methylated nucleotide residues).
- a binder e.g., an affinity molecule, such as an antibody or a protein, specific to methylated nucleotide residues.
- creation of a sequencing library can comprise contacting a plurality of nucleic acid molecules (e.g., cfDNA molecules) or amplicons thereof with a binder selective for a methylated region of nucleic acid molecules (e.g., a methylcytosine binder (MBD), such as an MBD-Fc fusion protein).
- MBD methylcytosine binder
- a binder may be specific to one or more methylated nucleotide species (e.g., 5-methylcytosine (5mC)).
- methylated nucleotide species e.g., 5-methylcytosine (5mC)
- Cell-free Methylated DNA Immunoprecipitation sequencing (cfMeDIP-seq), a genome-wide molecular profiling technique, can enrich for methylated cfDNA fragments through use of a binder, such as an anti-5-methylcytosine (anti-5mC) antibody or methyl-CpG-binding domain (MBD) protein (e.g., MBD-Fc fusion proteins).
- anti-5mC anti-5-methylcytosine
- MBD methyl-CpG-binding domain
- cfMeDIP-seq can comprise a portion of methods and systems for depleting a cfDNA sample of methylated DNA fragments, leaving behind hypomethylated or unmethylated cfDNA fragments, such as ctDNA.
- hypomethylated or unmethylated cell-free DNA within a clinical sample may be useful in determining the presence of a tumor or cancer in a subject.
- depletion of a plurality of nucleic acid molecules may comprise removing one or more nucleic acid molecules having a methylation level above a threshold methylation level (e.g., wherein the one or more removed nucleic acid molecules are hypermethylated, for instance, relative to one or more nucleic acid molecules not removed during depletion).
- enrichment of a plurality of nucleic acid molecules may comprise removing one or more nucleic acid molecules having a methylation level below a threshold methylation level (e.g., wherein the one or more removed nucleic acid molecules are hypomethylated, for instance, relative to one or more nucleic acid molecules not enriched, or non-methylated).
- a methylation level of particular nucleic acid fragments may be considered to reach the threshold methylation level when a binder with sufficient specificity for methylated cytosines is able to bind to the particular nucleic acid fragments, either with or without using filler nucleic acids as described herein.
- a methylation level of particular nucleic acid fragments e.g., DNA fragments
- depletion of a plurality of nucleic acid molecules results in (e.g., provides) a remainder population of the plurality of nucleic acid molecules, wherein the remainder of the plurality of nucleic acid molecules comprises (or, in some cases, consists of) nucleic acid molecules having a methylation level below the threshold methylation level (e.g., wherein the remainder population is hypomethylated/unmethylated relative to one or more nucleic acid molecules removed from the plurality of nucleic acid molecules during depletion).
- a methylation level may be calculated as a percentage of hypermethylated nucleic acid fragments compared to all the nucleic acid fragments contained in a sample.
- a threshold methylation level can be from 0.1% to 1%, 1% to 5%, 5% to 10%, 10% to 15%, 15% to 20%, 20% to 25%, 25% to 30%, 30% to 35%, 35% to 40%, 40% to 45%, 45% to 50%, 50% to 55%, 55% to 60%, 65% to 70%, 70% to 75%, 75% to 80%, 80% to 85%, 85% to 90%, 95% to 100%, at least 1%, at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least
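In some cases, the percentage-based methylation level described above can be computed as in the following minimal sketch. It assumes that each fragment has already been classified as hypermethylated or not by some upstream step that is not shown; the function names are hypothetical.

```python
def methylation_level_percent(hypermethylated_fragments, total_fragments):
    """Percentage of hypermethylated fragments among all fragments in a sample."""
    if total_fragments <= 0:
        raise ValueError("total_fragments must be positive")
    if not 0 <= hypermethylated_fragments <= total_fragments:
        raise ValueError("hypermethylated_fragments must lie in [0, total_fragments]")
    return 100.0 * hypermethylated_fragments / total_fragments


def reaches_threshold(level_percent, threshold_percent):
    """True when the methylation level reaches the chosen threshold (assumed inclusive)."""
    return level_percent >= threshold_percent
```

For example, 25 hypermethylated fragments out of 200 total fragments give a methylation level of 12.5%, which reaches a 10% threshold but not a 15% threshold.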
- a first plurality of nucleic acid molecules (e.g., comprising nucleic acid molecules, such as cfDNA, from a biological sample of a subject) may be combined (e.g., mixed) with a second plurality of nucleic acid molecules (e.g., wherein the second plurality of nucleic acid molecules is not from the subject from whom the biological sample was taken), for instance, as shown in FIG. 1.
- the second plurality of nucleic acid molecules comprises supplemental processed DNA (e.g., comprising λ DNA).
- each of the second plurality of nucleic acid molecules does not align to a human genome.
- all or a portion of the plurality of purified nucleic acid molecules may be amplified (e.g., via polymerase chain reaction), for instance, prior to or as part of a process of determining or identifying a sequence of all or a portion of the depleted nucleic acid molecule population.
- a population of amplified nucleic acid molecules or a derivative thereof e.g., comprising amplicons of all or a portion of the plurality of purified nucleic acid molecules
- sequencing e.g., for the determination and/or identification of a sequence of the nucleic acid molecules.
- a portion of the plurality of purified nucleic acid molecules may not be amplified but instead subjected to additional molecular modifications and the sequencing of the resultant material may be achieved using a sequencer (e.g., a nanopore sequencer), as described herein.
- a sequence of a plurality of nucleic acid molecules of a biological sample (or a derivative thereof) may be identified or determined using an array or polymerase chain reaction.
- the presence of a tumor-derived nucleic acid molecule may be determined by calculating a sum of reads per kilobase per million (RPKM) for a region of the genome (e.g., all or a portion of the genome, such as just CpG islands or just CpG island shores).
- RPKM reads per kilobase per million
- the presence of a tumor-derived nucleic acid molecule may be indicated when a depleted sequencing library (e.g., comprising a remainder population of nucleic acids) is observed to have a low sum of RPKMs, e.g., lower than 70,000, lower than 60,000, lower than 50,000, lower than 40,000, or lower than 30,000 across one or more regions of interest (e.g., CpG islands or CpG island shores).
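In some cases, the RPKM sum described above can be computed as in the following minimal sketch. It uses the conventional RPKM definition (reads in a region, scaled per kilobase of region length and per million mapped reads); the function names and the default cutoff of 50,000 (one of the example values listed above) are illustrative assumptions.

```python
def rpkm(read_count, region_length_bp, total_mapped_reads):
    """Reads per kilobase of region per million mapped reads."""
    if region_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("region length and total mapped reads must be positive")
    return read_count * 1e9 / (region_length_bp * total_mapped_reads)


def sum_rpkm(regions, total_mapped_reads):
    """Sum RPKM over regions given as (read_count, region_length_bp) pairs."""
    return sum(rpkm(count, length, total_mapped_reads) for count, length in regions)


def has_low_rpkm_sum(rpkm_sum, cutoff=50_000):
    """True when the summed RPKM falls below the chosen cutoff."""
    return rpkm_sum < cutoff
```

Here `regions` could hold, for instance, the read counts and lengths of the CpG islands or CpG island shores of interest in a depleted sequencing library.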
- a depleted sequencing library e.g., comprising a remainder population of nucleic acids
- supplemental processed DNA e.g., filler DNA, filler nucleic acids
- a first plurality of nucleic acids e.g., a plurality of nucleic acids from a biological sample, which may comprise cfDNA from healthy tissue and/or cfDNA from tumor tissue, such as ctDNA
- a biological sample which may comprise cfDNA from healthy tissue and/or cfDNA from tumor tissue, such as ctDNA
- addition of supplemental processed DNA (e.g., a second plurality of nucleic acid molecules) to a first plurality of nucleic acid molecules can increase the specificity and/or sensitivity of a method, system, or kit described herein, for instance, with respect to the detection and/or identification of a nucleic acid sequence of the first plurality of nucleic acid molecules.
- addition of supplemental processed DNA (e.g., a second plurality of nucleic acid molecules) to a first plurality of nucleic acid molecules may increase the rate of depletion of a methylated region of a nucleic acid sequence, e.g., during the practice of some embodiments of methods and systems described herein.
- supplemental processed DNA e.g., a second plurality of nucleic acid molecules
- a first plurality of nucleic acid molecules e.g., comprising cfDNA of a biological sample
- supplemental processed DNA e.g., the second plurality of nucleic acid molecules
- a desired total mass for use in a method or system described herein can be from 20 ng to 30 ng, from 30 ng to 40 ng, from 40 ng to 50 ng, from 50 ng to 60 ng, from 60 ng to 70 ng, from 70 ng to 80 ng, from 80 ng to 90 ng, from 90 ng to 100 ng, from 100 ng to 110 ng, from 110 ng to 120 ng, from 120 ng to 130 ng, from 130 ng to 140 ng, from 140 ng to 150 ng, from 150 ng to 160 ng, from 160 ng to 170 ng, from 170 ng to 180 ng, from 180 ng to 190 ng, from 190 ng to 200 ng, greater than 200 ng, or less than 20 ng.
- an amount of supplemental processed DNA from 1 ng to 5 ng, from 5 ng to 10 ng, from 10 ng to 20 ng, from 20 ng to 30 ng, from 30 ng to 40 ng, from 40 ng to 50 ng, from 50 ng to 60 ng, from 60 ng to 70 ng, from 70 ng to 80 ng, from 80 ng to 90 ng, from 90 ng to 100 ng, from 100 ng to 110 ng, from 110 ng to 120 ng, from 120 ng to 130 ng, from 130 ng to 140 ng, from 140 ng to 150 ng, from 150 ng to 160 ng, from 160 ng to 170 ng, from 170 ng to 180 ng, from 180 ng to 190 ng, from 190 ng to 200 ng, greater than 200 ng, less than 20 ng, less than 10 ng, or less than 5 ng can be added to a first plurality of nucleic acid molecules (e.g., to bring the total mixture of
- the present disclosure comprises methods and systems for filling in the sample with an amount of supplemental processed DNA (e.g., filler DNA) to generate a mixture sample, wherein the mixture sample comprises at least about 50 ng, 55 ng, 60 ng, 65 ng, 70 ng, 75 ng, 80 ng, 85 ng, 90 ng, 95 ng, 100 ng, 120 ng, 140 ng, 160 ng, 180 ng, 200 ng, or any amount in between the numbers of the total amount of the nucleic acid mixture.
- supplemental processed DNA e.g., filler DNA
- the supplemental processed DNA comprises at least about 5%, 10%, 15%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 100% methylated supplemental processed DNA with remainder being unmethylated supplemental processed DNA, and in some cases between 5% and 50%, between 10%-40%, or between 15%-30% methylated supplemental processed DNA.
- the mixture sample comprises an amount of supplemental processed DNA from 20 ng to 100 ng, in some cases 30 ng to 100 ng, in some cases 50 ng to 100 ng.
- the cell-free DNA from the sample and the first amount of supplemental processed DNA together comprise at least 50 ng of total DNA, in some cases at least 100 ng of total DNA.
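In some cases, the amount of supplemental processed DNA needed to reach a desired total mass follows from simple arithmetic, as in the following minimal sketch. The function names, the 100 ng default target, and the 20% methylated fraction are illustrative assumptions drawn from the ranges listed above, not prescribed values.

```python
def filler_mass_ng(sample_mass_ng, target_total_ng=100.0):
    """Mass of filler DNA (ng) needed to bring the mixture up to the target total mass."""
    if sample_mass_ng < 0 or target_total_ng <= 0:
        raise ValueError("masses must be non-negative and the target must be positive")
    # No filler is needed when the sample already meets or exceeds the target.
    return max(0.0, target_total_ng - sample_mass_ng)


def split_filler(filler_ng, methylated_fraction=0.2):
    """Split a filler mass into (methylated, unmethylated) portions in ng."""
    if not 0.0 <= methylated_fraction <= 1.0:
        raise ValueError("methylated_fraction must lie between 0 and 1")
    methylated = filler_ng * methylated_fraction
    return methylated, filler_ng - methylated
```

For example, a sample containing 10 ng of cfDNA and a 100 ng target would call for 90 ng of filler, of which 18 ng would be methylated and 72 ng unmethylated at a 20% methylated fraction.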
- supplemental processed DNA may be produced by fragmentation (e.g., via sonication).
- the supplemental processed DNA may be 50 bp to 800 bp long, in some cases 100 bp to 600 bp long, and in some cases 200 bp to 600 bp long.
- the supplemental processed DNA is double stranded.
- the supplemental processed DNA may be double stranded DNA.
- the supplemental processed DNA may be junk DNA.
- the supplemental processed DNA may also be endogenous or exogenous filler DNA.
- the supplemental processed DNA may be a plurality of exogenous filler DNAs. In some cases, the exogenous filler DNAs are not derived from a nucleic acid sample.
- the supplemental processed DNA may be non-human DNA, and in some cases, λ DNA.
- λ DNA generally refers to Enterobacteria phage λ DNA.
- the supplemental processed DNA has substantially no alignment to human DNA.
- the supplemental processed DNA may be non-exogenous DNA (e.g., non-exogenous filler DNA).
- the supplemental processed DNA is one or more samples derived from a pool of more than one nucleic acid samples.
- the supplemental filler DNA may be modified to integrate biotin into the strands.
- the presence of biotin may allow for the removal of the filler from the reaction post-immunoprecipitation.
- streptavidin e.g., a streptavidin bead
- a streptavidin bead may be added to a sample comprising biotinylated nucleic acids, and the streptavidin may bind to the biotinylated nucleic acids.
- the streptavidin bead may then be used to remove the biotinylated nucleic acids.
- the removal of the biotinylated filler can limit or inhibit the filler’s interference with downstream sequencing.
- the filler may be removed after the pull down using other approaches (e.g., hybrid capture of the filler, hybrid capture of the human cfDNA molecules by capturing adaptors, etc.).
- the supplemental filler DNA is non-biotinylated.
- Non-biotinylated filler may remain in the sample throughout library preparation and may not be subjected to a specific filler removal process.
- Non-biotinylated filler may be sequenced along with the sample DNA.
- Non-biotinylated filler may fail to be sequenced (or may generate a low number of sequence reads) based at least on library preparation processes that differentially affect filler nucleic acids and sample nucleic acids. For example, a non-biotinylated filler may fail to ligate to an adaptor or fail to be processed with a motor protein.
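Where non-biotinylated filler is sequenced along with the sample DNA, one possible way to set filler reads aside is to use the fact that the filler has substantially no alignment to human DNA. The following minimal sketch assumes, as an illustration only, that reads were aligned to a combined reference in which the filler genome is present under a contig named `lambda`; that contig name, the input layout, and the function name are hypothetical.

```python
def partition_reads(alignments, filler_contigs=frozenset({"lambda"})):
    """Split read IDs into sample, filler, and unaligned groups.

    alignments: iterable of (read_id, reference_name) pairs, where
    reference_name is None for reads that did not align.
    """
    sample, filler, unaligned = [], [], []
    for read_id, reference_name in alignments:
        if reference_name is None:
            unaligned.append(read_id)
        elif reference_name in filler_contigs:
            filler.append(read_id)
        else:
            sample.append(read_id)
    return sample, filler, unaligned
```

Only the reads in the first group would then be carried forward into downstream analysis of the sample.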
- the supplemental filler DNA increases a fold enrichment ratio of enriching one or more methylated regions by at least about 1 fold, at least about 2 fold, at least about 3 fold, at least about 4 fold, at least about 5 fold, at least about 6 fold, at least about 7 fold, at least about 8 fold, at least about 9 fold, at least about 10 fold, at least about 15 fold, at least about 20 fold, at least about 25 fold, at least about 30 fold, at least about 35 fold, at least about 40 fold, at least about 45 fold, at least about 50 fold, at least about 55 fold, at least about 60 fold, at least about 65 fold, at least about 70 fold, at least about 75 fold, at least about 80 fold, at least about 85 fold, at least about 90 fold, at least about 95 fold, at least about 100 fold, at least about 150 fold, at least about 200 fold, at least about 300 fold, at least about 400 fold, at least about 500 fold, at least about 600 fold, at least about 700 fold, at least about 800 fold, at least about 900 fold,
- the supplemental filler DNA increases a fold enrichment ratio by at most about 1 fold, at most about 2 fold, at most about 3 fold, at most about 4 fold, at most about 5 fold, at most about 6 fold, at most about 7 fold, at most about 8 fold, at most about 9 fold, at most about 10 fold, at most about 15 fold, at most about 20 fold, at most about 25 fold, at most about 30 fold, at most about 35 fold, at most about 40 fold, at most about 45 fold, at most about 50 fold, at most about 55 fold, at most about 60 fold, at most about 65 fold, at most about 70 fold, at most about 75 fold, at most about 80 fold, at most about 85 fold, at most about 90 fold, at most about 95 fold, at most about 100 fold, at most about 150 fold, at most about 200 fold, at most about 300 fold, at most about 400 fold, at most about 500 fold, at most about 600 fold, at most about 700 fold, at most about 800 fold, at most about 900 fold, or at most 1000 fold.
- the supplemental filler DNA increases a fold enrichment ratio by about 1 fold, about 2 fold, about 3 fold, about 4 fold, about 5 fold, about 6 fold, about 7 fold, about 8 fold, about 9 fold, about 10 fold, about 15 fold, about 20 fold, about 25 fold, about 30 fold, about 35 fold, about 40 fold, about 45 fold, about 50 fold, about 55 fold, about 60 fold, about 65 fold, about 70 fold, about 75 fold, about 80 fold, about 85 fold, about 90 fold, about 95 fold, about 100 fold, about 150 fold, about 200 fold, about 300 fold, about 400 fold, about 500 fold, about 600 fold, about 700 fold, about 800 fold, about 900 fold, or 1000 fold.
- the supplemental filler DNA (e.g., exogenous filler DNA) is not added to a first plurality of nucleic acids (e.g., a plurality of nucleic acids from a biological sample, which may comprise cfDNA from healthy tissue and/or cfDNA from tumor tissue, such as ctDNA).
- the supplemental filler DNA (e.g., exogenous filler DNA) is not added when a plurality of nucleic acids is pooled together from one or more biological samples, as the remaining samples may serve as the supplemental filler DNA.
- a sample can be any biological sample isolated from a subject.
- a sample may comprise, without limitation, bodily fluid, whole blood, platelets, serum, plasma, stool, white blood cells or leukocytes, endothelial cells, tissue biopsies, synovial fluid, lymphatic fluid, ascites fluid, interstitial or extracellular fluid, the fluid in spaces between cells, including gingival crevicular fluid, bone marrow, cerebrospinal fluid, saliva, mucous, sputum, semen, sweat, urine, fluid from nasal brushings, fluid from a pap smear, or any other bodily fluids.
- a bodily fluid may include saliva, blood, or serum.
- a sample may also be a tumor sample, which may be obtained from a subject by various approaches, including, but not limited to, venipuncture, excretion, ejaculation, massage, biopsy, needle aspirate, lavage, scraping, surgical incision, or intervention or other approaches.
- a sample may be a cell-free sample (e.g., substantially free of cells). DNA samples may be denatured, for example, using sufficient heat.
- the sample may be taken from a subject with a disease or disorder.
- the sample may be taken from a subject suspected of having a disease or a disorder.
- the sample may be obtained before and/or after treatment of a subject with a disease or disorder. Samples may be obtained from a subject during a treatment or a treatment regime.
- the disease or disorder may be a cancer.
- cancer types suitable for detection with the methods according to the disclosure include acute lymphoblastic leukemia, acute myeloid leukemia, adrenocortical carcinoma, AIDS-related cancers, AIDS-related lymphoma, anal cancer, appendix cancer, astrocytomas, basal cell carcinoma, bile duct cancer, bladder cancer, bone cancers, brain tumors, such as cerebellar astrocytoma, cerebral astrocytoma/malignant glioma, ependymoma, medulloblastoma, supratentorial primitive neuroectodermal tumors, visual pathway and hypothalamic glioma, breast cancer, bronchial adenomas, Burkitt lymphoma, carcinoma of unknown primary origin, central nervous system lymphoma, cerebellar astrocytoma, cervical cancer, childhood cancers, chronic lympho
- the cancer is head and neck squamous cell carcinoma.
- the sample may be taken from a healthy individual. In some cases, samples may be taken longitudinally from the same individual. In some cases, samples acquired longitudinally may be analyzed with the goal of monitoring individual health and early detection of health issues. In some embodiments, the sample may be collected at a home setting or at a point-of-care setting and subsequently transported by a mail delivery, courier delivery, or other transport method prior to analysis. For example, a home user may collect a blood spot sample through a finger prick, which blood spot sample may be dried and subsequently transported by mail delivery prior to analysis. In some cases, samples acquired longitudinally may be used to monitor response to stimuli expected to impact health, athletic performance, or cognitive performance. Non-limiting examples include response to medication, dieting, or an exercise regimen.
- the present disclosure provides a system, method, or kit that includes or uses one or more biological samples.
- the one or more samples used herein may comprise any substance containing or presumed to contain nucleic acids.
- a sample may include a biological sample obtained from a subject.
- a biological sample is a liquid sample.
- the sample comprises less than about 100 ng, 90 ng, 80 ng, 75 ng, 70 ng, 60 ng, 50 ng, 40 ng, 30 ng, 20 ng, 10 ng, 5 ng, 1 ng or any amount in between the numbers of cell-free nucleic acid molecules.
- the sample comprises less than about 1 pg, less than about 5 pg, less than about 10 pg, less than about 20 pg, less than about 30 pg, less than about 40 pg, less than about 50 pg, less than about 100 pg, less than about 200 pg, less than about 500 pg, less than about 1 ng, less than about 5 ng, less than about 10 ng, less than about 20 ng, less than about 30 ng, less than about 40 ng, less than about 50 ng, less than about 100 ng, less than about 200 ng, less than about 500 ng, less than about 1000 ng, or any amount in between the numbers of cell-free nucleic acid molecules.
- creation or provision of a plurality of nucleic acid molecules from a biological sample can comprise performing one or more of end-repair, A-tailing, and adaptor ligation on the plurality of nucleic acid molecules (e.g., after purification from the biological sample).
- a plurality of nucleic acid molecules may be pooled together from one or more biological samples.
- a plurality of nucleic acid molecules may be pooled together from one or more biological samples after performing one or more of end-repair, A-tailing, and adaptor ligation on the plurality of nucleic acid molecules (e.g., after purification from the biological sample).
- the sample may be processed to generate datasets indicative of a disease or disorder of the subject. For example, a presence, absence, or quantitative assessment of cell-free nucleic acid molecules (e.g., ctDNA molecules) of the sample at a panel of cancer-associated genomic loci or microbiome-associated loci may be indicative of a cancer of the subject.
- Processing the sample obtained from the subject may comprise (i) subjecting the sample to conditions that are sufficient to isolate, enrich, or extract a plurality of cell-free nucleic acid molecules, and (ii) assaying the plurality of cell-free nucleic acid molecules to generate the dataset (e.g., nucleic acid sequences).
- a plurality of cell-free nucleic acid molecules is extracted from the sample and subjected to sequencing to generate a plurality of sequencing reads.
- a plurality of nucleic acid molecules may be hypermethylated and enriched by using a binder, e.g., as described herein, to form a hypermethylated sequencing library which can be used as a novel background as opposed to a whole-genome background for use in analysis of cfDNA.
- DNA may be hypermethylated before use of a binder to create a sequencing library with a novel background.
- the novel background sequencing library may comprise a set of background genomic regions that are enriched by the binder.
- the DNA may be double-stranded. Alternatively, or in addition to, the DNA may be single-stranded.
- the nucleic acids may be subjected to target specific enrichment.
- nucleic acids may be pulled down via a pulldown probe to enrich the sequencing library for a given sequence.
- nucleic acids may be amplified with primers to enrich the sequencing library for a given sequence.
- the pulldown probes or primers may be specific to a specific gene, non-coding region, or other sequence.
- the nucleic acids may be enriched for one or more genes.
- the one or more genes may be genes that are associated with cancer.
- the one or more genes may be genes that are previously determined as relevant for a cancer type.
- the nucleic acids may be enriched for genes or sequences that have a previously identified methylation state, or previously identified number of methylated nucleotides.
- the target specific enrichment may be performed during the preparation of the sequencing library.
- nucleic acids may be enriched for or depleted of methylated nucleic acids (as described in this disclosure) and then the nucleic acids may be subjected to target specific enrichment.
- the present disclosure provides methods and technologies for determining the sequence of nucleotide bases in one or more polynucleotides.
- the polynucleotides may be, for example, nucleic acid molecules such as deoxyribonucleic acid (DNA) or ribonucleic acid (RNA), including variants or derivatives thereof (e.g., single stranded DNA). Sequencing may be performed by next-generation sequencing. Alternatively, or in addition to, sequencing may be performed using a nanopore.
- any sequencing methods that provide fragment length such as paired-end sequencing may be utilized.
- sequencing may be performed using nucleic acid amplification, polymerase chain reaction (PCR) (e.g., digital PCR, quantitative PCR, or real time PCR), or isothermal amplification.
- PCR polymerase chain reaction
- Such systems may provide a plurality of raw genetic data corresponding to the genetic information of a subject (e.g., human), as generated by the systems from a sample provided by the subject.
- sequencing reads (also “reads” herein).
- a read may include a string of nucleic acid bases corresponding to a sequence of a nucleic acid molecule that has been sequenced.
- systems and methods provided herein may be used with proteomic information.
- the sequencing reads are obtained via a next-generation sequencing method or a next-next-generation sequencing method.
- the sequencing methods comprise cfMeDIP sequencing, e.g., comprising processes or systems as described by Shen et al., (“Sensitive tumor detection and classification using plasma cell-free DNA methylomes,” (2018) Nature), which is incorporated herein in its entirety.
- sequencing can be performed using methyl-CpG-binding domain sequencing (MBD-seq).
- MBD-seq can comprise capture (e.g., via a binder, such as an antibody specific to a species of methylated nucleotide) of double-stranded, methylated DNA fragments for sequencing of methylation-enriched DNA fragment libraries.
- the sequencing methods comprise Cancer Personalized Profiling by deep Sequencing (CAPP-Seq), which is a next-generation sequencing based method used to quantify circulating DNA in cancer (ctDNA). This method may be generalized for any cancer type that is documented to have recurrent mutations and may detect one molecule of mutant DNA in 10,000 molecules of healthy DNA.
- the sequencing comprises bisulfite sequencing. In some embodiments, the sequencing does not comprise bisulfite sequencing.
- the sequencing reads may be obtained via nanopore sequencing or a nanopore-based sequencing method (e.g., the use of a nanopore sensing platform).
- a nucleic acid molecule e.g., a single stranded nucleic acid molecule
- the nucleic acid molecule may comprise a sense strand coupled to an anti-sense strand through a nucleic acid segment ligated on an end portion of each of the sense strand and anti-sense strand.
- the electrode may be adapted to detect an electric current or changes in an electric current upon the nucleic acid molecule passing through or in proximity to the nanopore.
- Electric current may be faradic current or tunneling current.
- electric current measurements or changes in electric current measurements may be obtained while passing the nucleic acid molecule through or in proximity to the nanopore.
- the nanopore and/or nanopore sensing platform may measure a charge, conductivity, resistance, impedance, or change thereof as a nucleic acid molecule interacts with a nanopore.
- a sequence of the single stranded nucleic acid molecule may be determined from the electric current measurements.
- the methylation level of the single stranded nucleic acid molecule may be determined from the electric current measurements.
- sequencing of enriched methylated single-stranded DNA generated after immunoprecipitation may occur without amplification (e.g., PCR amplification).
- Nanopore-based sequencing without amplification may allow for single base-pair resolution in each enriched fragment as well as a direct count of methylated DNA fragments in a sample without the need of barcodes (e.g., unique molecular identifiers) and/or without the need to sequence the same fragment multiple times due to amplification.
- nanopore-based cfMeDIP-sequencing may allow the determination of methylation status (e.g., hydroxymethylation) that may not be detectable after amplification.
- sequencing of enriched methylated single-stranded DNA generated after immunoprecipitation may occur after amplification.
- Amplification prior to sequencing of enriched methylated single-stranded DNA may generate more sequence reads.
- Amplification may allow for more stable nucleic acids to be generated (e.g., double stranded nucleic acid compared to partially single stranded or single stranded nucleic acids), which may in turn allow the nucleic acid to be stored for longer periods of time without degradation.
- a sample or portion thereof may be subjected to library preparation before sequencing.
- the samples are ligated to nucleic acid adaptors and digested using enzymes.
- sequencing comprises modification of a nucleic acid molecule or fragment thereof, for example, by ligating a barcode, a unique molecular identifier (UMI), or another tag to the nucleic acid molecule or fragment thereof.
- a barcode may allow for a nucleic acid molecule to be identified as originating from a particular sample.
- a barcode is a unique barcode (e.g., a UMI).
- a barcode is non-unique, and barcode sequences may be used in connection with endogenous sequence information such as the start and stop sequences of a target nucleic acid (e.g., the target nucleic acid is flanked by the barcode and the barcode sequences, in connection with the sequences at the beginning and end of the target nucleic acid, creates a uniquely tagged molecule).
- a barcode, UMI, or tag may be a known sequence used to associate a polynucleotide or fragment thereof with an input or target nucleic acid molecule or fragment thereof.
- a barcode, UMI, or tag may comprise natural nucleotides or non-natural (e.g., modified) nucleotides (e.g., as described herein).
- a barcode sequence may be contained within an adaptor sequence such that the barcode sequence may be contained within a sequencing read.
- a barcode sequence may be at least 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or more nucleotides in length. In some cases, a barcode sequence may be of sufficient length and may be sufficiently different from another barcode sequence to allow the identification of a sample based on a barcode sequence with which it is associated.
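One common way to judge whether barcodes are "sufficiently different" is the minimum pairwise Hamming distance of the barcode set. The sketch below is illustrative only: the disclosure does not name a distance metric, and the barcode sequences and function names are hypothetical.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("barcodes must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(barcodes: list[str]) -> int:
    """Smallest Hamming distance between any two barcodes in the set."""
    return min(hamming(a, b) for a, b in combinations(barcodes, 2))


# Hypothetical 8-nt sample barcodes (illustrative, not from the disclosure).
BARCODES = ["ACGTACGT", "TGCATGCA", "GATCGATC", "CTAGCTAG"]
MIN_DISTANCE = min_pairwise_distance(BARCODES)
```

A larger minimum distance means more sequencing errors can be tolerated before one sample barcode is mistaken for another.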
- a barcode sequence, or a combination of barcode sequences may be used to tag and subsequently identify an “original” nucleic acid molecule or fragment thereof (e.g., a nucleic acid molecule or fragment thereof present in a sample from a subject).
- a barcode sequence, or a combination of barcode sequences is used in conjunction with endogenous sequence information to identify an original nucleic acid molecule or fragment thereof.
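The combination of a non-unique barcode with endogenous sequence information can be sketched as a grouping key. The example below assumes reads have already been aligned, so that the start and end coordinates of the endogenous sequence are known; the `Read` type, its field names, and the example coordinates are hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class Read(NamedTuple):
    barcode: str  # non-unique barcode read from the adaptor
    chrom: str    # reference sequence the read aligned to
    start: int    # alignment start of the endogenous sequence
    end: int      # alignment end of the endogenous sequence


def group_by_original_molecule(reads):
    """Group reads that share a barcode and the same endogenous start/end."""
    groups = defaultdict(list)
    for read in reads:
        key = (read.barcode, read.chrom, read.start, read.end)
        groups[key].append(read)
    return dict(groups)


reads = [
    Read("ACGT", "chr1", 100, 267),
    Read("ACGT", "chr1", 100, 267),  # same barcode and coordinates: same molecule
    Read("ACGT", "chr1", 150, 310),  # same barcode, different fragment
    Read("TTAG", "chr1", 100, 267),  # same fragment coordinates, different barcode
]
molecules = group_by_original_molecule(reads)
```

Reads that fall under the same key are treated as copies of one original molecule; reads that differ in either the barcode or the coordinates are counted separately.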
- a barcode sequence, or a combination of barcode sequences, may be used with endogenous sequences adjacent to a barcode, UMI, or tag (e.g., the beginning and end of the endogenous sequences) and/or with the length of the endogenous sequence.
- the prepared libraries may be combined with filler nucleic acids (e.g., filler X DNAs) to minimize the effect of low abundance ctDNA in the prepared libraries and generate mixed samples.
- the amount of ctDNA can be low and may not be easily and accurately measured and quantified.
- the mixed samples may be brought to at least about 50 ng, 80 ng, 100 ng, 120 ng, 150 ng, or 200 ng and subjected to further enrichment.
- Nucleic acid amplification may involve one or more reagents such as one or more primers, probes, polymerases, buffers, enzymes, and deoxyribonucleotides. Nucleic acid amplification may be isothermal or may comprise thermal cycling.
- Processing a nucleic acid molecule or fragment thereof may comprise the addition of a bead that may bind to nucleic acids.
- the binding of nucleic acids may allow for nucleic acids to be specifically bound and allow for enzymes or contaminants to be removed from the nucleic acid sample.
- the nucleic acids may be eluted from the beads and then may be collected. This may be performed by removing or collecting the beads from a nucleic acid sample.
- the beads may be magnetic beads, or paramagnetic beads, and may be subjected to a magnetic field to remove or collect the beads.
- the beads may be subjected to multiple iterations of a removal or collection process to reduce or minimize carryover of beads into later processing reactions.
- a binder may be used to deplete or enrich for a population of nucleic acid molecules (e.g., a plurality of nucleic acid molecules derived from a biological sample).
- a binder can be used to deplete or enrich a plurality of nucleic acid molecules for one or more nucleic acid molecules having a methylation level at or above a threshold methylation level (e.g., by binding to one or more methylated nucleotides of the one or more nucleic acid molecules).
- a binder may be used to enrich a population of nucleic acid molecules (e.g., a plurality of nucleic acids derived from a biological sample).
- the binder may be a molecule that binds specifically to methylated nucleic acids or methylated nucleotides.
- a binder can be specific to one or more methylated nucleotide species (e.g., 5-methylcytosine (5mC), 5-hydroxymethylcytosine (5hmC), 4-methylcytosine (4mC), or 6-methyladenine (6mA)).
- a binder can be selected from the group consisting of an anti-5-methylcytosine antibody or a derivative thereof, an anti-5-carboxylcytosine antibody or a derivative thereof, an anti-5-formylcytosine antibody or a derivative thereof, an anti-5-hydroxymethylcytosine antibody or a derivative thereof, an anti-3-methylcytosine antibody or a derivative thereof, and any combinations thereof.
- the binder can be an anti-5-methylcytosine antibody or a derivative thereof.
- the binder is a protein comprising a methyl-CpG-binding domain. One such protein is MBD2 protein.
- The MBD of MeCP2, MBD1, MBD2, MBD4 and BAZ2 mediates binding to DNA, and in the cases of MeCP2, MBD1 and MBD2, preferentially to methylated CpG.
- Human proteins MECP2, MBD1, MBD2, MBD3, and MBD4 comprise a family of nuclear proteins related by the presence in each of a methyl-CpG-binding domain (MBD). Each of these proteins, with the exception of MBD3, is capable of binding specifically to methylated DNA.
- the binder may comprise a biotin, and may allow the binder to couple to, or bind to, a streptavidin.
- the binder is an antibody and capturing cell-free methylated DNA comprises immunoprecipitating the cell-free methylated DNA using the antibody.
- immunoprecipitation generally refers to a technique of precipitating an antigen (such as polypeptides and nucleotides) out of solution using an antibody that specifically binds to that particular antigen. This process may be used to isolate and concentrate a particular protein or DNA from a sample and may require that the antibody be coupled to a solid substrate at some point in the procedure.
- the solid substrate includes for example beads, such as magnetic beads. Other types of beads and solid substrates may be used.
- the solid substrate may be able to bind or couple to a molecule that binds specifically to methylated nucleic acids.
- the solid substrate may comprise streptavidin and may be able to bind to a biotinylated antibody.
- the solid substrate may comprise a protein A and may be able to bind to a Fc domain of an antibody.
- the binders may be coupled to a solid substrate (e.g., a magnetic bead) and this complex (e.g., a methylated nucleic acid capture reagent) may be added to sample.
- the complex may initially be generated via a prior incubation without the sample present.
- an anti-5-mC antibody may be incubated with a protein A bead, allowing the antibody to bind to, or couple to, the protein A bead via an Fc domain of the antibody.
- the complex may be added to the sample to bind to nucleic acids.
- the binder and the solid substrate may also be added at a same time (or substantially the same time) to a sample or may be added sequentially.
- a magnetic protein A bead and antibody may be added at the same time to a sample and allow binding of the antibody to nucleic acids and allow binding of the antibody to magnetic protein A bead.
- an antibody may be initially added to a sample to allow binding of the nucleic acids to the antibody, followed by addition of a bead to bind to the antibody which bound to the nucleic acids.
- generating the complex via prior incubation, simultaneous addition, or sequential addition may each offer advantages. For example, prior incubation may allow for a stable complex of the binder and solid substrate to be formed without potential steric interference from nucleic acids.
- a 5-mC antibody (e.g., wherein the 5-mC antibody specifically binds to 5-methylcytosine) may be used as a binder.
- for the immunoprecipitation procedure, in some embodiments at least 0.05 µg of the antibody is added to the sample, while in some embodiments at least 0.16 µg of the antibody is added to the sample.
- 0.05 µg to 0.80 µg, 0.16 µg to 0.80 µg, 0.40 µg to 0.80 µg, 0.16 µg to 0.40 µg, 0.10 µg to 0.80 µg, 0.20 µg to 0.60 µg, 0.30 µg to 0.50 µg, or 0.40 µg to 0.50 µg of the antibody can be used.
- the method described herein further comprises the operation of adding a second amount of control DNA to the sample.
- the present disclosure provides methods and systems of processing a cell-free nucleic acid sample from a subject for detecting a methylation event.
- the present workflow comprises (a) providing a plurality of nucleic acid molecules (e.g., cell free DNA (cfDNA)) derived from a nucleic acid sample; (b) phosphorylating the plurality of nucleic acid molecules at the 5’ end to generate a plurality of phosphorylated nucleic acid molecules; (c) adding a plurality of filler nucleic acid molecules (e.g., biotinylated filler nucleic acid molecules) to the phosphorylated nucleic acid molecules to generate a mixture sample; (d) subjecting the mixture sample to immunoprecipitation (e.g., cell-free Methylated DNA Immunoprecipitation (cfMeDIP)) to yield an enriched sample comprising a plurality of methylated nucleic acid molecules; (e) performing post-immunoprecipitation clean-up; and (f) preparing the enriched sample for nanopore sequencing.
- the present workflow further comprises i) removing the plurality of filler nucleic acid molecules (e.g., biotinylated filler nucleic acid molecules); ii) annealing or ligating oligonucleotides comprising indexing sequences, custom adaptor sequences, or reverse complement oligonucleotides to the plurality of methylated nucleic acid molecules; iii) ligating or annealing the oligonucleotides to a motor protein; and iv) subjecting to a nanopore sensing platform for nanopore sequencing.
- the oligonucleotides are single-stranded oligonucleotides, wherein the single-stranded oligonucleotides anneal to the plurality of methylated nucleic acid molecules to form double-stranded nucleic acid molecules at the ends of the plurality of methylated nucleic acid molecules, and wherein the double-stranded nucleic acid molecules ligate to the motor protein.
- the oligonucleotides are double-stranded adaptor oligonucleotides, wherein the double-stranded adaptor oligonucleotides ligate to the motor protein.
- one or more samples can be pooled after ii) annealing or ligating oligonucleotides.
- the present workflow comprises (a) providing a plurality of nucleic acid molecules (e.g., cfDNA) derived from a nucleic acid sample; (b) undergoing a library preparation (e.g., end-repair, A-tail, adaptor ligation) with one or more custom adaptors to generate a library; (c) adding a plurality of filler nucleic acid molecules (e.g., biotinylated filler nucleic acid molecules) to the library; (d) subjecting the library to immunoprecipitation (e.g., cfMeDIP) to yield an enriched sample comprising a plurality of methylated nucleic acid molecules; (e) performing post-immunoprecipitation clean-up; and (f) preparing the enriched sample for nanopore sequencing.
- the present workflow further comprises i) removing the plurality of filler nucleic acid molecules (e.g., biotinylated filler nucleic acid molecules); ii) annealing or ligating oligonucleotides comprising indexing sequences, custom adaptor sequences, or reverse complement oligonucleotides to the plurality of methylated nucleic acid molecules; iii) ligating or annealing the oligonucleotides to a motor protein; and iv) subjecting to a nanopore sensing platform for nanopore sequencing.
- the oligonucleotides are single-stranded oligonucleotides, wherein the single-stranded oligonucleotides anneal to the plurality of methylated nucleic acid molecules to form double-stranded nucleic acid molecules at the ends of the plurality of methylated nucleic acid molecules, and wherein the double-stranded nucleic acid molecules ligate to the motor protein.
- the oligonucleotides are double-stranded adaptor oligonucleotides, wherein the double-stranded adaptor oligonucleotides ligate to the motor protein.
- one or more samples can be pooled after ii) annealing or ligating oligonucleotides.
- the present workflow comprises (a) providing a plurality of nucleic acid molecules (e.g., cfDNA) derived from a nucleic acid sample; (b) undergoing a library preparation (e.g., end-repair, A-tail, adaptor ligation) with one or more custom adaptors to generate a library; (c) phosphorylating the plurality of nucleic acid molecules from the library preparation at the 5’ end to generate a plurality of phosphorylated nucleic acid molecules; (d) adding a plurality of filler nucleic acid molecules (e.g., non-biotinylated filler nucleic acid molecules) to generate a sample mixture; (e) subjecting the sample mixture to immunoprecipitation (e.g., cfMeDIP) to yield an enriched sample comprising a plurality of methylated nucleic acid molecules; and (f) preparing the enriched sample for nanopore sequencing.
- in the present workflow, (f) preparing the enriched sample for nanopore sequencing further comprises i) heat denaturing the enriched sample; ii) annealing or ligating oligonucleotides comprising indexing sequences, custom adaptor sequences, or reverse complement oligonucleotides to the plurality of methylated nucleic acid molecules; iii) ligating or annealing the oligonucleotides to a motor protein; and iv) subjecting to a nanopore sensing platform for nanopore sequencing.
- the oligonucleotides are single-stranded oligonucleotides, wherein the single-stranded oligonucleotides anneal to the plurality of methylated nucleic acid molecules to form double-stranded nucleic acid molecules at the ends of the plurality of methylated nucleic acid molecules, and wherein the double-stranded nucleic acid molecules ligate to the motor protein.
- the oligonucleotides are double-stranded adaptor oligonucleotides, wherein the double stranded adaptor oligonucleotides ligate to the motor protein.
- one or more samples can be pooled after ii) annealing or ligating oligonucleotides.
- the present workflow comprises (a) providing a plurality of nucleic acid molecules (e.g., cfDNA, cfDNA with spike-in DNA) derived from a nucleic acid sample; (b) undergoing a library preparation (e.g., end-repair, A-tail, adaptor ligation) with one or more custom adaptors to generate a library; (c) phosphorylating the plurality of nucleic acid molecules from the library preparation at the 5’ end to generate a plurality of phosphorylated nucleic acid molecules; (d) performing a post-library preparation clean-up with a solid substrate (e.g., magnetic solid substrate); (e) adding a plurality of filler nucleic acid molecules (e.g., non-biotinylated filler nucleic acid molecules) to generate a sample mixture; (f) heat denaturing and snap chilling the sample mixture; (g) subjecting the sample mixture to immunoprecipitation to yield
- the present workflow, (d) performing a post-library preparation clean-up comprises an additional capture of the solid substrate (e.g., magnetic solid substrate) by placing the prepped library against a device (e.g., magnetic rack) to capture any remaining solid substrates.
- in the present workflow, (g) subjecting the sample mixture to immunoprecipitation further comprises i) incubating a binder as disclosed herein with a solid substrate (e.g., magnetic solid substrate) to generate a methylated nucleic acid capture reagent; ii) adding the methylated nucleic acid capture reagent to the sample mixture; and iii) isolating the methylated nucleic acid capture reagent to yield the enriched sample comprising the plurality of methylated nucleic acid molecules.
- preparing the enriched sample for nanopore sequencing further comprises i) heat denaturing the enriched sample; ii) annealing or ligating oligonucleotides comprising indexing sequences, custom adaptor sequences, or reverse complement oligonucleotides to the plurality of methylated nucleic acid molecules; iii) ligating or annealing the oligonucleotides to a motor protein; and iv) subjecting to a nanopore sensing platform for nanopore sequencing.
- the oligonucleotides are single-stranded oligonucleotides, wherein the single-stranded oligonucleotides anneal to the plurality of methylated nucleic acid molecules to form double-stranded nucleic acid molecules at the ends of the plurality of methylated nucleic acid molecules, and wherein the double-stranded nucleic acid molecules ligate to the motor protein.
- the oligonucleotides are double-stranded adaptor oligonucleotides, wherein the double stranded adaptor oligonucleotides ligate to the motor protein.
- one or more samples can be pooled after ii) annealing or ligating oligonucleotides.
- a method comprising: (a) providing a plurality of nucleic acid molecules (e.g., cfDNA) derived from a nucleic acid sample; (b) subjecting the plurality of nucleic acid molecules (e.g., cfDNA) to enrichment to yield (i) a first plurality of nucleic acid molecules having a methylation level at or above a threshold methylation level and (ii) a second plurality of nucleic acid molecules having a methylation level below the threshold methylation level, wherein the enrichment does not use bisulfite conversion or enzymatic conversion of the plurality of nucleic acid molecules; and (c) directing a nucleic acid molecule derived from the first plurality of nucleic acid molecules or second plurality of nucleic acid molecules to a nanopore of a nanopore sensing platform, wherein the nanopore of the nanopore sensing platform is used to identify a methylation status of the nucleic
- the plurality of nucleic acid molecules are phosphorylated at the 5’ end. In some cases, prior to enrichment, the plurality of nucleic acid molecules are not phosphorylated at the 5’ end. In some cases, prior to enrichment, the plurality of nucleic acid molecules undergo library preparation with custom adaptors. In some cases, the plurality of nucleic acid molecules are phosphorylated at the 5’ end after undergoing library preparation with custom adaptors, and prior to enrichment. In some cases, the enrichment comprises immunoprecipitation (e.g., cell-free Methylated DNA Immunoprecipitation (cfMeDIP)).
- immunoprecipitation comprises post-immunoprecipitation clean-up.
- the enrichment comprises using a plurality of filler nucleic acid molecules to enrich for the first plurality of nucleic acid molecules.
- (b) further comprises incubating the plurality of nucleic acid molecules and the plurality of filler nucleic acid molecules under conditions sufficient to enrich for a methylated region of the plurality of nucleic acid molecules to yield the first plurality of nucleic acid molecules.
- the plurality of filler nucleic acid molecules is biotinylated. In some cases, the plurality of filler nucleic acid molecules is non-biotinylated. In some cases, prior to (c), the method further comprises removing a substantial amount of the plurality of filler nucleic acid molecules. In some cases, the plurality of filler nucleic acid molecules (e.g., biotinylated filler nucleic acid molecules) is contacted with a plurality of streptavidin beads for removal. In some cases, the plurality of filler nucleic acid molecules (e.g., biotinylated filler nucleic acid molecules) is removed using hybridization capture.
- the method does not comprise removal of the plurality of filler nucleic acid molecules.
- the first plurality of nucleic acid molecules is hypermethylated.
- the second plurality of nucleic acid molecules is hypomethylated.
- the enrichment comprises contacting said nucleic acid sample and said plurality of filler nucleic acid molecules with a methylated nucleic acid capture reagent.
- the methylated nucleic acid capture reagent is generated by incubating a binder as disclosed herein with a solid substrate.
- the solid substrate is a bead.
- the solid substrate is a magnetic solid substrate.
- the solid substrate is a paramagnetic solid substrate.
- the first plurality of nucleic acid molecules and/or the second plurality of nucleic acid molecules are ligated with oligonucleotides complementary to the custom adaptors, prior to (c).
- the first plurality of nucleic acid molecules and/or the second plurality of nucleic acid molecules are denatured and chilled.
- the first plurality of nucleic acid molecules and the second plurality of nucleic acid molecules can be pooled together, prior to (c).
- the first plurality of nucleic acid molecules and second plurality of nucleic acid molecules are further ligated with double stranded adaptor oligonucleotides, prior to (c).
- the double-stranded adaptor oligonucleotides comprise indexing sequences or custom adaptor sequences. In some cases, the double-stranded adaptor oligonucleotides anneal to a motor protein disclosed herein. In some cases, the first plurality of nucleic acid molecules and second plurality of nucleic acid molecules are further annealed to single-stranded oligonucleotides to form double-stranded nucleic acid molecules at the ends of the first plurality of methylated nucleic acid molecules and the second plurality of methylated nucleic acid molecules, wherein the double-stranded nucleic acid molecules ligate to the motor protein.
- the motor protein associates with a nanopore of the nanopore sensing platform.
- the method further comprises using hybridization capture to purify the first plurality of nucleic acid molecules and the second plurality of nucleic acid molecules.
- (c) comprises directing the nucleic acid molecule through the nanopore.
- (c) further comprises using the nanopore sensing platform to measure a current or change thereof as the nucleic acid molecule interacts with the nanopore.
- a method and system of processing a nucleic acid sample from a subject comprising: (a) generating a nucleic acid sample mixture comprising a plurality of methylated nucleic acids from the subject and an amount of supplemental processed DNA (e.g., filler DNA), and (b) incubating the nucleic acid sample mixture with (i) a binder (e.g., a methylation binding molecule) and (ii) a solid substrate, and (c) capturing the methylated nucleic acid to enrich the nucleic acid sample mixture for the plurality of methylated nucleic acids (e.g., methylated single-stranded DNA).
- an amount of supplemental processed DNA (e.g., filler DNA) is not required in (a).
- an amount of supplemental processed DNA (e.g., filler DNA) comprises at least one methylated DNA molecule.
- the methylation binding molecule is an antibody (e.g., an anti-5-methylcytosine (anti-5mC) antibody, methyl-CpG-binding domain (MBD) protein).
- the methylation binding molecule comprises biotin.
- the methylation binding molecule binds to a methylated cytosine.
- the solid substrate is a bead.
- the solid substrate is a magnetic solid substrate.
- the solid substrate comprises protein A.
- the solid substrate comprises streptavidin.
- the method further comprises subsequent to (a) and prior to (b), denaturing nucleic acids in the nucleic acid sample mixture.
- the method further comprises, prior to (a), obtaining the nucleic acid samples from the sample and performing one or more library preparation reactions on the nucleic acids.
- the prepped library is incubated with a plurality of DNA capture beads (e.g., Solid-phase reversible immobilization (SPRI) beads), then removed.
- the methylated nucleic acid undergoes a sequencing reaction.
- the sequencing reaction is a nanopore sequencing reaction. In some cases, the sequencing reaction does not comprise bisulfite sequencing.
- a method and system of processing a nucleic acid sample from a subject comprising: (a) generating a nucleic acid sample mixture comprising a plurality of methylated nucleic acids from the subject and an amount of supplemental processed DNA (e.g., filler DNA), and (b) incubating (i) binder (e.g., a methylation binding molecule) with (ii) a solid substrate to form a methylated nucleic acid capture reagent; and (c) capturing the methylated nucleic acid by adding the methylated nucleic acid capture reagent to the nucleic acid sample mixture to enrich the nucleic acid mixture for the plurality of methylated nucleic acids.
- an amount of supplemental processed DNA (e.g., filler DNA) is not required in (a). In some cases, an amount of supplemental processed DNA (e.g., filler DNA) comprises at least one methylated DNA molecule.
- the methylation binding molecule is an antibody (e.g., an anti-5-methylcytosine (anti-5mC) antibody, methyl-CpG-binding domain (MBD) protein). In some cases, the methylation binding molecule comprises biotin. In some cases, the methylation binding molecule binds to a methylated cytosine.
- the solid substrate is a bead. In some cases, the solid substrate is a magnetic solid substrate. In some cases, the solid substrate comprises protein A.
- the solid substrate comprises streptavidin.
- the method further comprises prior to (b), denaturing nucleic acids in the nucleic acid sample mixture.
- the method further comprises, prior to (a), obtaining the nucleic acid samples from the sample and performing one or more library preparation reactions on the nucleic acids.
- the prepped library is incubated with a plurality of magnetic beads that interact with nucleic acids (e.g., SPRI beads), and eluted from magnetic beads that interact with nucleic acid.
- the sample is subjected to magnetic capture to remove the magnetic beads that interact with nucleic acids. In some embodiments, the sample is subjected to an additional magnetic capture to remove residual magnetic beads that interact with nucleic acids.
- the methylated nucleic acid undergoes a sequencing reaction. In some cases, the sequencing reaction is a nanopore sequencing reaction. In some cases, the sequencing reaction does not comprise bisulfite sequencing.
- quality control analysis can be performed on the captured methylated nucleic acid.
- quality control analysis can comprise measuring methylation binding specificity.
- the methylation binding specificity is measured by calculating the event of methylated fragments or reads detected.
- the methods and systems of processing nucleic acid sample from a subject described herein result in enrichment of methylated single-stranded DNA with at least about 80%, at least about 81%, at least about 82%, at least about 83%, at least about 84%, at least about 85%, at least about 86%, at least about 87%, at least about 88%, at least about 89%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, at least about 99.5%, at least about 99.6%, at least about 99.7%, at least about 99.8%, or at least about 99.9% methylation specificity.
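As one possible reading of the specificity figures above, methylation specificity can be expressed as the percentage of captured fragments or reads that are methylated. The sketch below assumes the methylation state of each counted read is already known (for example, from spike-in controls of known state); the function name and the example counts are hypothetical, and the disclosure does not give this formula explicitly.

```python
def methylation_specificity(methylated_reads: int, unmethylated_reads: int) -> float:
    """Percentage of captured reads that are methylated (assumed definition)."""
    total = methylated_reads + unmethylated_reads
    if total == 0:
        raise ValueError("no reads were counted")
    return 100.0 * methylated_reads / total


# Hypothetical counts: 990 methylated and 10 unmethylated reads after capture.
specificity_pct = methylation_specificity(990, 10)
```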
- the methods and systems of processing a nucleic acid sample from a subject described herein result in enrichment of methylated nucleic acids by at least about 1 fold, at least about 2 fold, at least about 3 fold, at least about 4 fold, at least about 5 fold, at least about 6 fold, at least about 7 fold, at least about 8 fold, at least about 9 fold, at least about 10 fold, at least about 15 fold, at least about 20 fold, at least about 25 fold, at least about 30 fold, at least about 35 fold, at least about 40 fold, at least about 45 fold, at least about 50 fold, at least about 55 fold, at least about 60 fold, at least about 65 fold, at least about 70 fold, at least about 75 fold, at least about 80 fold, at least about 85 fold, at least about 90 fold, at least about 95 fold, at least about 100 fold, at least about 150 fold, at least about 200 fold, at least about 300 fold, at least about 400 fold, at least about 500 fold, at least about 600 fold, at least about 700 fold, at least about 800 fold, at least about 900
- the methods and systems of processing a nucleic acid sample from a subject described herein result in enrichment of methylated nucleic acids by at most about 1 fold, at most about 2 fold, at most about 3 fold, at most about 4 fold, at most about 5 fold, at most about 6 fold, at most about 7 fold, at most about 8 fold, at most about 9 fold, at most about 10 fold, at most about 15 fold, at most about 20 fold, at most about 25 fold, at most about 30 fold, at most about 35 fold, at most about 40 fold, at most about 45 fold, at most about 50 fold, at most about 55 fold, at most about 60 fold, at most about 65 fold, at most about 70 fold, at most about 75 fold, at most about 80 fold, at most about 85 fold, at most about 90 fold, at most about 95 fold, at most about 100 fold, at most about 150 fold, at most about 200 fold, at most about 300 fold, at most about 400 fold, at most about 500 fold, at most about 600 fold, at most about 700 fold, at most about 800 fold, at most about 900
- the methods and systems of processing a nucleic acid sample from a subject described herein result in enrichment of methylated nucleic acids by about 1 fold, about 2 fold, about 3 fold, about 4 fold, about 5 fold, about 6 fold, about 7 fold, about 8 fold, about 9 fold, about 10 fold, about 15 fold, about 20 fold, about 25 fold, about 30 fold, about 35 fold, about 40 fold, about 45 fold, about 50 fold, about 55 fold, about 60 fold, about 65 fold, about 70 fold, about 75 fold, about 80 fold, about 85 fold, about 90 fold, about 95 fold, about 100 fold, about 150 fold, about 200 fold, about 300 fold, about 400 fold, about 500 fold, about 600 fold, about 700 fold, about 800 fold, about 900 fold, or about 1000 fold.
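A fold-enrichment value of the kind listed above could be computed as the ratio of the methylated fraction after enrichment to the methylated fraction in the input. This is an assumed definition offered as a minimal sketch; the disclosure does not state how fold enrichment is calculated, and the function name and example counts are hypothetical.

```python
def fold_enrichment(
    methylated_after: int,
    total_after: int,
    methylated_before: int,
    total_before: int,
) -> float:
    """Ratio of methylated fraction after enrichment to that before (assumed)."""
    if total_after <= 0 or total_before <= 0:
        raise ValueError("total counts must be positive")
    if methylated_before <= 0:
        raise ValueError("input must contain at least one methylated molecule")
    fraction_after = methylated_after / total_after
    fraction_before = methylated_before / total_before
    return fraction_after / fraction_before


# Hypothetical counts: 2% methylated in the input, 80% methylated after capture.
fold = fold_enrichment(800, 1000, 20, 1000)
```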
- a methylation profile can comprise analysis (e.g., comprising sequencing) of a plurality of nucleic acids (e.g., a plurality of nucleic acid molecules of a depleted sequencing library, as described herein).
- a methylation profile can comprise detection of methylated nucleotides and/or quantification of methylated nucleotide counts.
- a methylation profile can comprise determination of a methylated signal, e.g., in a population of nucleic acids of a depleted sequencing library, as described herein.
- a methylation profile is compared to a genome-wide background profile.
- a methylation profile is compared to a novel background profile created using hypermethylated cfDNA.
- the samples disclosed herein can be subjected to library preparation and next generation deep sequencing, for example to a depth of 1 million (M) to 60 M single reads, 10 M to 60 M single reads, 10 M to 100 M single reads, 40 M to 60 M single reads, 40 M to 100 M single reads, 60 M to 100 M single reads, 60 M to 200 M single reads, 1 M to 10 M single reads, 1 M to 40 M single reads, 1 M single reads to 100 M single reads, 1 M single reads to 200 M single reads, at least 1 M single reads, at least 10 M single reads, at least 40 M single reads, at least 60 M single reads, at least 100 M single reads, or at least 200 M single reads.
- sequencing can be performed at low sequencing depth (e.g., 10 M single reads, 20 M single reads, 30 M single reads, 40 M single reads, from 1 M single reads to 10 M single reads, from 10 M single reads to 20 M single reads, from 20 M single reads to 30 M single reads, from 30 M single reads to 40 M single reads, at most 10 M single reads, at most 20 M single reads, at most 30 M single reads, or at most 40 M single reads).
- a sample disclosed herein can be subjected to sequencing at a depth of 0.1X to 100X, 0.1X to 60X, 0.1X to 40X, 0.1X to 30X, 0.1X to 20X, 0.1X to 10X, 0.1X to 5.0X, at least 0.1X, at least 0.5X, at least 1.0X, at least 2.0X, at least 3.0X, at least 4.0X, at least 5.0X, at least 10.0X, at least 20.0X, at least 30.0X, at least 40.0X, at least 50.0X, at least 60.0X, at least 100X, at least 200X, at most 0.1X, at most 0.5X, at most 1.0X, at most 2.0X, at most 3.0X, at most 4.0X, at most 5.0X, at most 10.0X, at most 20.0X, at most 30.0X, at most 40.0X, at most 50.0X, at most 60.0X, at most 100X, or at most 200X.
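- as a non-limiting illustration, a read count can be related to an approximate mean depth (the "X" values above) as sketched below. The sketch assumes the common approximation depth = reads x read length / genome size; the read length and genome size used in the example are assumptions and are not taken from the disclosure.

```python
def mean_depth(num_reads, read_length_bp, genome_size_bp):
    """Approximate mean sequencing depth (X) for single reads.

    Uses depth = num_reads * read_length_bp / genome_size_bp and ignores
    unmapped reads, duplicates, and uneven coverage.
    """
    if genome_size_bp <= 0:
        raise ValueError("genome size must be positive")
    return num_reads * read_length_bp / genome_size_bp


if __name__ == "__main__":
    # Assumed values: 40 M single reads, 100 bp reads, ~3.1 Gbp genome.
    print(f"{mean_depth(40_000_000, 100, 3_100_000_000):.2f}X")
```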
- a plurality of sequencing reads is generated and analyzed. In some embodiments, deep sequencing may be configured to maximize identifying genomic mutations associated with the disease/condition.
- the relative measure of ctDNA abundance is calculated from the mean mutant allele fractions (MAFs).
- the mean MAF of mutations identified in a subject and comprised in his/her mutation profile ranges from at least about 0.01% to at least about 10%.
- the MAF of a ctDNA fraction of a sample can be at least about 0.01%, 0.02%, 0.03%, 0.04%, 0.05%, 0.06%, 0.07%, 0.08%, 0.09%, 0.1%, 0.15%, 0.2%, 0.5%, 1%, 1.5%, 2%, 2.5%, 3%, 3.5%, 4%, 4.5%, 5%, 5.5%, 6%, 6.5%, 7%, 7.5%, 8%, 8.5%, 9%, 9.5%, 10%, or any percentage in between.
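- as a non-limiting illustration, a mean mutant allele fraction could be computed from per-variant read counts as sketched below. The sketch assumes that each variant's MAF is the number of reads supporting the mutant allele divided by the total reads at that position, and that the relative ctDNA measure is the unweighted mean across variants; the function names and the example counts are hypothetical.

```python
def mutant_allele_fraction(alt_reads, total_reads):
    """MAF of one variant: mutant-supporting reads / all reads at the locus."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def mean_maf(variants):
    """Unweighted mean MAF over an iterable of (alt_reads, total_reads) pairs."""
    fractions = [mutant_allele_fraction(alt, total) for alt, total in variants]
    if not fractions:
        raise ValueError("at least one variant is required")
    return sum(fractions) / len(fractions)


if __name__ == "__main__":
    # Hypothetical counts for two variants in one mutation profile.
    profile = [(1, 1000), (3, 1000)]
    print(f"mean MAF = {mean_maf(profile) * 100:.2f}%")
```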
- a mutation profile of a subject can be generated from sequencing results.
- the mutation profile comprises genetic polymorphisms, such as a missense variant, a nonsense variant, a deletion variant, an insertion variant, a duplication variant, an inversion variant, a frameshift variant, or a repeat expansion variant.
- the mutation profile may comprise a mutation variant derived from a fraction of cell-free nucleic acid molecules of a specific size range. The present disclosure provides methods, systems, and kits for producing a mutation profile of a subject that has a disease/condition or is suspected of having such disease/condition, wherein the methylation profile may be used to determine whether the subject has the disease/condition or is at risk of having the disease/condition.
- Producing a genomic mutation profile can comprise subjecting a plurality of nucleic acid molecules to library preparation and next generation deep sequencing (e.g., MeDIP-seq).
- a plurality of sequencing reads can be generated and analyzed, and, in some cases, deep sequencing may be configured to maximize identifying genomic mutations associated with the disease/condition.
- a panel of canonical cancer driver genes may be included in a selector for sequencing results analysis.
- including genes without documented driver effects in a particular cancer type in the analysis of sequencing data may increase the sensitivity of ctDNA detection.
- the relative measure of ctDNA abundance is calculated from the mean mutant allele fractions (MAFs).
- the mean MAF of mutations identified in a subject and comprised in his/her mutation profile ranges from at least about 0.01% to at least about 10%.
- the generated mutation profile of a subject does not include mutation variants derived from cell-free nucleic acid molecules derived from a biological sample.
- the mutation profile comprises genetic polymorphisms, such as a missense variant, a nonsense variant, a deletion variant, an insertion variant, a duplication variant, an inversion variant, a frameshift variant, or a repeat expansion variant.
- the mutation profile may comprise a mutation variant derived from a fraction of cell-free nucleic acid molecules of a specific size range.
- the length of ctDNA fragments is shorter than cell-free nucleic acid molecules derived from a healthy subject. In some embodiments, the length of ctDNA comprising at least one mutation is shorter than the length of cell free nucleic acid molecule containing a corresponding reference allele.
- the sequencing does not utilize bisulfite sequencing, because bisulfite treatment causes degradation of ctDNA fragments and prevents the preservation of the length distribution of ctDNAs.
- the fragment length of a plurality of nucleic acids of the present disclosure can be from 1 to about 800 basepairs (bp), from about 50 bp to about 800 bp, from about 100 bp to about 200 bp, from about 120 bp to about 150 bp, from about 60 to about 500 bp, from about 80 to about 300 bp, from 90 to about 250 bp, from 80 to 170 bp, or from about 100 to about 150 bp.
- the fragment length of a plurality of nucleic acids of the present disclosure can be at least 800 basepairs (bp), at least 700 basepairs, at least 600 basepairs, at least 500 basepairs, at least 400 basepairs, at least 300 basepairs, at least 200 basepairs, at least 150 basepairs, at least 100 basepairs, or at least 50 basepairs.
- the fragment length of a plurality of nucleic acids of the present disclosure can be at most 800 basepairs (bp), at most 700 basepairs, at most 600 basepairs, at most 500 basepairs, at most 400 basepairs, at most 300 basepairs, at most 200 basepairs, at most 150 basepairs, at most 100 basepairs, or at most 50 basepairs.
- the present disclosure provides an enrichment of the cell free nucleic acid samples based on selecting cell free molecules of a certain size.
- the multimodal analysis comprises utilizing the mutation profile described herein and the fragment length profile by selectively including a plurality of nucleic acid molecules in the mutation profile based on their fragment length. In some embodiments, the multimodal analysis comprises utilizing the methylation profile described herein and the fragment length profile by selectively including a plurality of nucleic acid molecules in the methylation profile based on their fragment length. In some embodiments, the multimodal analysis comprises utilizing the mutation profile, methylation profile, and the fragment length profile together by selectively including a plurality of nucleic acid molecules in the mutation profile based on their fragment length and by selectively including a plurality of nucleic acid molecules in the methylation profile based on their fragment length respectively.
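- as a non-limiting illustration, selectively including nucleic acid molecules in a profile based on fragment length could be sketched as below. The sketch assumes each fragment is represented as a (fragment_id, length_bp) pair and uses one of the size ranges recited above (about 100 bp to about 150 bp) as the default window; the function name and the data layout are hypothetical.

```python
def select_by_length(fragments, min_bp=100, max_bp=150):
    """Keep fragments whose length lies in the inclusive range [min_bp, max_bp].

    `fragments` is an iterable of (fragment_id, length_bp) pairs.
    """
    if min_bp > max_bp:
        raise ValueError("min_bp must not exceed max_bp")
    return [(fid, length) for fid, length in fragments if min_bp <= length <= max_bp]


if __name__ == "__main__":
    # Hypothetical fragments; only those inside the window feed the profile.
    fragments = [("f1", 95), ("f2", 120), ("f3", 150), ("f4", 167)]
    print(select_by_length(fragments))
```

- the retained fragments could then be passed to the mutation profile, the methylation profile, or both, depending on which multimodal combination is used.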
- the present disclosure provides methods and systems for determining whether a subject has or is at risk of having a disease, wherein the methods and systems comprise subjecting a plurality of nucleic acid molecules derived from a cell-free nucleic acid sample obtained from said subject to sequencing to generate at least one profile of (i) a methylation profile, (ii) a mutation profile, and (iii) a fragment length profile; and processing said at least one profile to determine whether said subject has or is at risk of said disease at a sensitivity of at least 80% or at a specificity of at least about 90%, wherein said cell-free nucleic acid sample comprises less than 30 ng/ml of said plurality of nucleic acid molecules.
- the sensitivity is at least about 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9%, or any percentage in between the numbers.
- the specificity is at least about 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.1%, 99.2%, 99.3%, 99.4%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9%, or any percentage in between the numbers.
- the methods and systems can comprise subjecting a plurality of nucleic acid molecules derived from a cell-free nucleic acid sample obtained from said subject to sequencing to generate at least two profiles of (i) a methylation profile, (ii) a mutation profile, and (iii) a fragment length profile.
- the methods provide a sensitivity of at least about 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9%, or any percentage in between the numbers.
- the sensitivity when using two profiles is increased by at least about 0.5%, 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, or percentage in between any of the numbers compared to the sensitivity when using one profile.
- the sensitivity when using three profiles is increased by at least about 0.5%, 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, or percentage in between any of the numbers compared to the sensitivity when using two profiles.
- the methods can provide a specificity of at least about 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.1%, 99.2%, 99.3%, 99.4%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9%, or any percentage in between the numbers.
- the specificity when using two profiles is increased by at least about 0.5%, 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, or percentage in between any of the numbers compared to the specificity when using one profile.
- the specificity when using three profiles is increased by at least about 0.5%, 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, or percentage in between any of the numbers compared to the specificity when using two profiles.
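- as a non-limiting illustration, the sensitivity and specificity figures recited above follow the standard definitions sketched below, where sensitivity is the fraction of subjects with the disease that are called positive and specificity is the fraction of subjects without the disease that are called negative. The counts in the example are hypothetical and are not results from the disclosure.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of diseased subjects correctly called positive: TP / (TP + FN)."""
    denominator = true_positives + false_negatives
    if denominator == 0:
        raise ValueError("no diseased subjects in the evaluation set")
    return true_positives / denominator


def specificity(true_negatives, false_positives):
    """Fraction of non-diseased subjects correctly called negative: TN / (TN + FP)."""
    denominator = true_negatives + false_positives
    if denominator == 0:
        raise ValueError("no non-diseased subjects in the evaluation set")
    return true_negatives / denominator


if __name__ == "__main__":
    # Hypothetical evaluation set: 100 diseased and 100 non-diseased subjects.
    print(f"sensitivity = {sensitivity(90, 10):.1%}")
    print(f"specificity = {specificity(95, 5):.1%}")
```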
- the present disclosure provides methods and systems for processing a cell-free nucleic acid sample of a subject to determine whether said subject has or is at risk of having a disease
- the methods and systems comprise providing said cell-free nucleic acid sample comprising a plurality of nucleic acid molecules; subjecting said plurality of nucleic acid molecules or derivatives thereof to sequencing to generate a plurality of sequencing reads; computer processing said plurality of sequencing reads to identify, for said plurality of nucleic acid molecules, (i) a methylation profile, (ii) a mutation profile, and (iii) a fragment length profile; and using at least said methylation profile, said mutation profile and said fragment length profile to determine whether said subject has or is at risk of having said disease.
- the methods provide a sensitivity of at least about 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9%, or any percentage in between the numbers.
- the methods provide a specificity of at least about 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.1%, 99.2%, 99.3%, 99.4%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9%, or any percentage in between the numbers.
- the present disclosure provides methods and systems for determining a tissue origin of a tumor, comprising identifying a nucleotide sequence specific for a particular cancer (e.g., breast cancer, colon cancer, prostate cancer, HNSCC, or lung cancer) from which a fraction of cell-free nucleic acid molecules is derived.
- the fraction of the cell-free nucleic acid molecules is derived from ctDNA.
- the methods provide a sensitivity of at least about 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9%, or any percentage in between the numbers.
- the methods provide a specificity of at least about 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.1%, 99.2%, 99.3%, 99.4%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9%, or any percentage in between the numbers.
- the present disclosure describes methods and systems for providing a prognosis to a subject after receiving a treatment for a disease/condition.
- the treatment comprises a surgical removal of a tumor, a chemotherapy designed for a specific type of cancer, a radiotherapy, or an immunotherapy (e.g., TCR, CAR, etc.).
- the methods or systems comprise subjecting a plurality of nucleic acid molecules derived from a cell-free nucleic acid sample obtained from said subject to sequencing to generate at least one profile of (i) a methylation profile, (ii) a mutation profile, and (iii) a fragment length profile; and monitoring or detecting minimal residual disease (MRD) based at least on the at least one profile.
- once a subject is accurately diagnosed and receives a treatment for the cancer, such as surgical removal, chemotherapy, radiotherapy, etc., it can be important to monitor the effectiveness of the treatment and predict the patient’s survival rate. Further, it can be important to detect minimal residual disease of cancer cells.
- the method further comprises the operation of adding a second amount of control DNA to the sample for confirming the immunoprecipitation reaction.
- control may comprise both positive and negative control, or at least a positive control.
- the method further comprises the operation of adding a second amount of control DNA to the sample for confirming the capture of cell-free methylated DNA.
- identifying the presence of DNA from cancer cells further includes identifying the cancer cell tissue of origin.
- tumor tissue sampling may be challenging or carry significant risks, in which case diagnosing and/or subtyping the cancer without the need for tumor tissue sampling may be desired.
- lung tumor tissue sampling may require invasive procedures such as mediastinoscopy, thoracotomy, or percutaneous needle biopsy; these procedures may result in a need for hospitalization, chest tube, mechanical ventilation, antibiotics, or other medical interventions.
- Some individuals may not undergo the invasive procedures needed for tumor tissue sampling either because of medical comorbidities or due to preference.
- the actual procedure for tumor tissue procurement may depend on the suspected cancer subtype.
- cancer subtype may evolve over time within the same individual; serial assessment with invasive tumor tissue sampling procedures is often impractical and not well tolerated by patients.
- non-invasive cancer subtyping via blood test may have many advantageous applications in the practice of clinical oncology.
- identifying the cancer cell tissue of origin further includes identifying a cancer subtype.
- the cancer subtype differentiates the cancer based on stage (e.g., early stage lung cancer treated with surgery vs late stage lung cancer treated with chemotherapy), histology (e.g., small cell carcinoma vs adenocarcinoma vs squamous cell carcinoma in lung cancer), gene expression pattern or transcription factor activity (e.g., ER status in breast cancer), copy number aberrations (e.g., HER2 status in breast cancer), specific rearrangements (e.g., FLT3 in AML), specific gene point mutational status (e.g., IDH gene point mutations), and DNA methylation patterns (e.g., MGMT gene promoter methylation in brain cancer).
- comparisons can be carried out genome-wide.
- the comparisons can be restricted from genome-wide to specific regulatory regions, such as, but not limited to, long interspersed nuclear elements (LINEs), short interspersed nuclear elements (SINEs), long terminal repeats (LTRs), FANTOM5 enhancers, CpG Islands, CpG shores, CpG Shelves, or any combination of the foregoing.
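- as a non-limiting illustration, restricting a comparison from genome-wide to specific regions (e.g., CpG islands) could be sketched as an interval-overlap filter, as below. The sketch assumes half-open coordinates (start inclusive, end exclusive), fragments given as (chromosome, start, end) tuples, and regions given as a mapping from chromosome to a list of (start, end) pairs; all names and coordinates are hypothetical.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if half-open intervals [a_start, a_end) and [b_start, b_end) intersect."""
    return a_start < b_end and b_start < a_end


def restrict_to_regions(fragments, regions):
    """Keep fragments that overlap at least one region on the same chromosome.

    fragments: iterable of (chrom, start, end)
    regions:   dict mapping chrom -> list of (start, end)
    """
    kept = []
    for chrom, start, end in fragments:
        for r_start, r_end in regions.get(chrom, []):
            if overlaps(start, end, r_start, r_end):
                kept.append((chrom, start, end))
                break  # one overlapping region is enough to keep the fragment
    return kept


if __name__ == "__main__":
    # Hypothetical region set and fragments.
    cpg_islands = {"chr1": [(1000, 2000)], "chr2": [(500, 800)]}
    fragments = [("chr1", 900, 1050), ("chr1", 2000, 2150), ("chr2", 600, 750), ("chr3", 10, 160)]
    print(restrict_to_regions(fragments, cpg_islands))
```

- a linear scan over regions is used here for clarity; a sorted index or interval tree would be a more typical choice for genome-scale region sets.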
- the methods herein are for use in the detection of the cancer.
- the methods herein are for use in monitoring therapy of the cancer.
- the methods and systems disclosed herein may comprise algorithms or uses thereof.
- the one or more algorithms may be used to classify one or more samples from one or more subjects.
- the one or more algorithms may be applied to data from one or more samples.
- the data may comprise biomarker expression data.
- the methods or systems comprise subjecting a plurality of nucleic acid molecules derived from a cell-free nucleic acid sample obtained from said subject to sequencing to generate at least one profile of (i) a methylation profile, (ii) a mutation profile, and (iii) a fragment length profile; and monitoring or detecting minimal residual disease (MRD) based on at least one profile.
- the methods disclosed herein may comprise assigning a classification to one or more samples from one or more subjects.
- Assigning the classification to the sample may comprise applying an algorithm to the methylation profile, mutation profile, and fragment length profile.
- at least one profile is inputted to a data analysis system comprising a trained algorithm for classifying the sample as obtained from a subject which has a disease or minor injuries.
- a data analysis system may be a trained algorithm.
- the algorithm may comprise a linear classifier.
- the linear classifier comprises one or more of linear discriminant analysis, Fisher’s linear discriminant, Naive Bayes classifier, Logistic regression, Perceptron, Support vector machine, or a combination thereof.
- the linear classifier may be a support vector machine (SVM) algorithm.
- the algorithm may comprise a two-way classifier.
- the two-way classifier may comprise one or more decision tree, random forest, Bayesian network, support vector machine, neural network, or logistic regression algorithms.
- the algorithm may comprise one or more linear discriminant analysis (LDA), Basic perceptron, Elastic Net, logistic regression, (Kernel) Support Vector Machines (SVM), Diagonal Linear Discriminant Analysis (DLDA), Golub Classifier, Parzen-based, (kernel) Fisher Discriminant Classifier, k-nearest neighbor, Iterative RELIEF, Classification Tree, Maximum Likelihood Classifier, Random Forest, Nearest Centroid, Prediction Analysis of Microarrays (PAM), k-medians clustering, Fuzzy C-Means Clustering, Gaussian mixture models, graded response (GR), Gradient Boosting Method (GBM), Elastic-net logistic regression, logistic regression, or a combination thereof.
- the algorithm may comprise a Diagonal Linear Discriminant Analysis (DLDA) algorithm.
- the algorithm may comprise a Nearest Centroid algorithm.
- the algorithm may comprise a Random Forest algorithm.
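- as a non-limiting illustration, one of the recited options, a Nearest Centroid algorithm, could be applied to a combined profile as sketched below. The sketch assumes each sample has been summarized as a numeric feature vector (here a hypothetical methylation score, mean MAF, and short-fragment fraction); the feature choice, the function names, and the toy training values are illustrative assumptions and are not data from the disclosure.

```python
from math import dist


def fit_centroids(samples, labels):
    """Compute the per-class mean feature vector (centroid) from training data."""
    sums, counts = {}, {}
    for features, label in zip(samples, labels):
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def predict(centroids, features):
    """Assign the class whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


if __name__ == "__main__":
    # Toy feature vectors: [methylation score, mean MAF, short-fragment fraction].
    train_x = [[0.9, 0.05, 0.6], [0.8, 0.03, 0.5], [0.1, 0.0, 0.2], [0.2, 0.0, 0.1]]
    train_y = ["disease", "disease", "control", "control"]
    centroids = fit_centroids(train_x, train_y)
    print(predict(centroids, [0.85, 0.04, 0.55]))
```

- in practice the features would typically be scaled to comparable ranges before computing distances, since unscaled Euclidean distance is dominated by the feature with the largest numeric range.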
- the present disclosure provides methods and systems for determining whether a subject has or is at risk of having a disease, wherein the methods and systems comprise subjecting a plurality of nucleic acid molecules derived from a cell-free nucleic acid sample obtained from said subject to sequencing to generate at least one profile of (i) a methylation profile, (ii) a mutation profile, and (iii) a fragment length profile; and processing said at least one profile to determine whether said subject has or is at risk of said disease at a sensitivity of at least 80% or at a specificity of at least about 90%, wherein said cell-free nucleic acid sample comprises less than 30 ng/ml of said plurality of nucleic acid molecules.
- the sensitivity is at least about 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9%, or any percentage in between the numbers.
- the specificity is at least about 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.1%, 99.2%, 99.3%, 99.4%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9%, or any percentage in between the numbers.
- the methods and systems can comprise subjecting a plurality of nucleic acid molecules derived from a cell-free nucleic acid sample obtained from said subject to sequencing to generate at least two profiles of (i) a methylation profile, (ii) a mutation profile, and (iii) a fragment length profile.
- the methods provide a sensitivity of at least about 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9%, or any percentage in between the numbers.
- the sensitivity when using two profiles is increased by at least about 0.5%, 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, or percentage in between any of the numbers compared to the sensitivity when using one profile.
- the sensitivity when using three profiles is increased by at least about 0.5%, 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, or percentage in between any of the numbers compared to the sensitivity when using two profiles.
- the methods can provide a specificity of at least about 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.1%, 99.2%, 99.3%, 99.4%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9%, or any percentage in between the numbers.
- the specificity when using two profiles is increased by at least about 0.5%, 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, or percentage in between any of the numbers compared to the specificity when using one profile.
- the present disclosure provides methods and systems for processing a cell-free nucleic acid sample of a subject to determine whether said subject has or is at risk of having a disease
- the methods and systems comprise providing said cell-free nucleic acid sample comprising a plurality of nucleic acid molecules; subjecting said plurality of nucleic acid molecules or derivatives thereof to sequencing to generate a plurality of sequencing reads; computer processing said plurality of sequencing reads to identify, for said plurality of nucleic acid molecules, (i) a methylation profile, (ii) a mutation profile, and (iii) a fragment length profile; and using at least said methylation profile, said mutation profile and said fragment length profile to determine whether said subject has or is at risk of having said disease.
- the methods provide a sensitivity of at least about 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9%, or any percentage in between the numbers.
- the methods can provide a specificity of at least about 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.1%, 99.2%, 99.3%, 99.4%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9%, or any percentage in between the numbers.
- the present disclosure describes methods and systems for providing a prognosis to a subject after receiving a treatment for a disease/condition.
- the treatment comprises a surgical removal of a tumor, a chemotherapy designed for a specific type of cancer, a radiotherapy, or an immunotherapy (e.g., TCR, CAR, etc.).
- the methods or systems comprise subjecting a plurality of nucleic acid molecules derived from a cell-free nucleic acid sample obtained from said subject to sequencing to generate at least one profile of (i) a methylation profile, (ii) a mutation profile, and (iii) a fragment length profile; and monitoring or detecting minimal residual disease (MRD) based on the at least one profile.
- FIG. 4 shows a computer system 401 that is programmed or otherwise configured to generate a sequencing library containing nucleic acid molecules that are depleted of hypermethylated regions of the nucleic acid molecules (e.g., ctDNA).
- the computer system 401 can regulate various aspects of the present disclosure.
- the computer system 401 can be an electronic device of a user or a computer system that is remotely located with respect to the electronic device.
- the electronic device can be a mobile electronic device.
- the computer system 401 includes a central processing unit (CPU, also “processor” and “computer processor” herein) 405, which can be a single core or multi core processor, or a plurality of processors for parallel processing.
- the computer system 401 also includes memory or memory location 410 (e.g., random-access memory, read-only memory, flash memory), electronic storage unit 415 (e.g., hard disk), communication interface 420 (e.g., network adapter) for communicating with one or more other systems, and peripheral devices 425, such as cache, other memory, data storage and/or electronic display adapters.
- the memory 410, storage unit 415, interface 420 and peripheral devices 425 are in communication with the CPU 405 through a communication bus (solid lines), such as a motherboard.
- the storage unit 415 can be a data storage unit (or data repository) for storing data.
- the computer system 401 can be operatively coupled to a computer network (“network”) 430 with the aid of the communication interface 420.
- the network 430 can be the Internet, an internet and/or extranet, or an intranet and/or extranet that is in communication with the Internet.
- the network 430 in some cases is a telecommunication and/or data network.
- the network 430 can include one or more computer servers, which can enable distributed computing, such as cloud computing.
- the network 430, in some cases with the aid of the computer system 401, can implement a peer-to-peer network, which may enable devices coupled to the computer system 401 to behave as a client or a server.
- the CPU 405 can execute a sequence of machine-readable instructions, which can be embodied in a program or software.
- the instructions may be stored in a memory location, such as the memory 410.
- the instructions can be directed to the CPU 405, which can subsequently program or otherwise configure the CPU 405 to implement methods of the present disclosure. Examples of operations performed by the CPU 405 can include fetch, decode, execute, and writeback.
- the CPU 405 can be part of a circuit, such as an integrated circuit. One or more other components of the system 401 can be included in the circuit. In some cases, the circuit is an application specific integrated circuit (ASIC).
- the storage unit 415 can store files, such as drivers, libraries, and saved programs.
- the storage unit 415 can store user data, e.g., user preferences and user programs.
- the computer system 401 in some cases can include one or more additional data storage units that are external to the computer system 401, such as located on a remote server that is in communication with the computer system 401 through an intranet or the Internet.
- the computer system 401 can communicate with one or more remote computer systems through the network 430.
- the computer system 401 can communicate with a remote computer system of a user.
- remote computer systems include personal computers (e.g., portable PC), slate or tablet PC’s (e.g., Apple® iPad, Samsung® Galaxy Tab), telephones, Smart phones (e.g., Apple® iPhone, Android-enabled device, Blackberry®), or personal digital assistants.
- the user can access the computer system 401 via the network 430.
- Methods as described herein can be implemented by way of machine (e.g., computer processor) executable code stored on an electronic storage location of the computer system 401, such as, for example, on the memory 410 or electronic storage unit 415.
- the machine executable or machine readable code can be provided in the form of software. During use, the code can be executed by the processor 405. In some cases, the code can be retrieved from the storage unit 415 and stored on the memory 410 for ready access by the processor 405. In some situations, the electronic storage unit 415 can be precluded, and machine-executable instructions are stored on memory 410.
- the code can be pre-compiled and configured for use with a machine having a processor adapted to execute the code or can be compiled during runtime.
- the code can be supplied in a programming language that can be selected to enable the code to execute in a pre-compiled or as-compiled fashion.
- aspects of the systems and methods provided herein can be embodied in programming.
- Various aspects of the technology may be thought of as “products” or “articles of manufacture” typically in the form of machine (or processor) executable code and/or associated data that is carried on or embodied in a type of machine readable medium.
- Machine-executable code can be stored on an electronic storage unit, such as memory (e.g., read-only memory, random-access memory, flash memory) or a hard disk.
- “Storage” type media can include any or all of the tangible memory of the computers, processors or the like, or associated modules thereof, such as various semiconductor memories, tape drives, disk drives and the like, which may provide non-transitory storage at any time for the software programming. All or portions of the software may at times be communicated through the Internet or various other telecommunication networks. Such communications, for example, may enable loading of the software from one computer or processor into another, for example, from a management server or host computer into the computer platform of an application server.
- another type of media that may bear the software elements includes optical, electrical, and electromagnetic waves, such as used across physical interfaces between local devices, through wired and optical landline networks and over various air-links.
- a machine readable medium such as computer-executable code
- a tangible storage medium such as computer-executable code
- Non-volatile storage media include, for example, optical or magnetic disks, such as any of the storage devices in any computer(s) or the like, such as may be used to implement the databases, etc. shown in the drawings.
- Volatile storage media include dynamic memory, such as main memory of such a computer platform.
- Tangible transmission media include coaxial cables; copper wire and fiber optics, including the wires that comprise a bus within a computer system.
- Carrier-wave transmission media may take the form of electric or electromagnetic signals, or acoustic or light waves such as those generated during radio frequency (RF) and infrared (IR) data communications.
- Common forms of computer-readable media therefore include, for example: a floppy disk, a flexible disk, hard disk, magnetic tape, any other magnetic medium, a CD-ROM, DVD or DVD-ROM, any other optical medium, punch cards, paper tape, any other physical storage medium with patterns of holes, a RAM, a ROM, a PROM and EPROM, a FLASH-EPROM, any other memory chip or cartridge, a carrier wave transporting data or instructions, cables or links transporting such a carrier wave, or any other medium from which a computer may read programming code and/or data.
- Many of these forms of computer readable media may be involved in carrying one or more sequences of one or more instructions to a processor for execution.
- the computer system 401 can include or be in communication with an electronic display 1135 that comprises a user interface (UI) 440.
- Examples of UI’s include, without limitation, a graphical user interface (GUI) and web-based user interface.
- Methods and systems of the present disclosure can be implemented by way of one or more algorithms. An algorithm can be implemented by way of software upon execution by the central processing unit 405.
- Kits for identifying or monitoring a disease or disorder (e.g., cancer) of a subject may comprise probes for identifying a quantitative measure (e.g., indicative of a presence, absence, or relative amount) of sequences at each of a panel of cancer-associated genomic loci in a sample of the subject.
- A quantitative measure (e.g., indicative of a presence, absence, or relative amount) of sequences at each of a panel of cancer-associated genomic loci in the sample may be indicative of the disease or disorder (e.g., cancer) of the subject.
- The probes may be selective for the sequences at the panel of cancer-associated genomic loci in the sample.
- A kit may comprise instructions for using the probes to process the sample to generate datasets indicative of a quantitative measure (e.g., indicative of a presence, absence, or relative amount) of sequences at each of the panel of cancer-associated genomic loci in a sample of the subject.
- The probes in the kit may be selective for the sequences at the panel of cancer-associated genomic loci in the sample.
- The probes in the kit may be configured to selectively enrich nucleic acid (e.g., RNA or DNA) molecules corresponding to the panel of cancer-associated genomic loci.
- The probes in the kit may be nucleic acid primers.
- The probes in the kit may have sequence complementarity with one or more nucleic acid sequences from the panel of cancer-associated genomic loci or genomic regions.
- The panel of cancer-associated genomic loci, microbiome-associated genomic loci, or genomic regions may comprise at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 16, at least 17, at least 18, at least 19, at least 20, or more distinct cancer-associated genomic loci or genomic regions.
- The instructions in the kit may comprise instructions to assay the sample using the probes that are selective for the sequences at the panel of cancer-associated genomic loci in the cell-free biological sample.
- These probes may be nucleic acid molecules (e.g., RNA or DNA) having sequence complementarity with nucleic acid sequences (e.g., RNA or DNA) from one or more loci of the panel of cancer-associated genomic loci.
- These nucleic acid molecules may be primers or enrichment sequences.
- The instructions to assay the cell-free biological sample may comprise instructions to perform array hybridization, polymerase chain reaction (PCR), or nucleic acid sequencing (e.g., DNA sequencing or RNA sequencing) to process the sample to generate datasets indicative of a quantitative measure (e.g., indicative of a presence, absence, or relative amount) of sequences at each of the panel of cancer-associated genomic loci in the sample.
- A quantitative measure (e.g., indicative of a presence, absence, or relative amount) of sequences at each of a panel of cancer-associated genomic loci in the sample may be indicative of a disease or disorder (e.g., cancer).
- The instructions in the kit may comprise instructions to measure and interpret assay readouts, which may be quantified at one or more of the panel of cancer-associated genomic loci to generate the datasets indicative of a quantitative measure (e.g., indicative of a presence, absence, or relative amount) of sequences at each of the panel of cancer-associated genomic loci in the sample.
- Quantification of array hybridization or polymerase chain reaction (PCR) corresponding to the panel of cancer-associated genomic loci may generate the datasets indicative of a quantitative measure (e.g., indicative of a presence, absence, or relative amount) of sequences at each of the panel of cancer-associated genomic loci in the sample.
- Assay readouts may comprise quantitative PCR (qPCR) values, digital PCR (dPCR) values, digital droplet PCR (ddPCR) values, fluorescence values, etc., or normalized values thereof.
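As a rough illustration of how normalized assay readouts might be turned into a per-locus quantitative measure, the sketch below divides each raw readout by a reference readout and flags a locus as present when the normalized value reaches a cutoff. The locus names, the reference value, and the cutoff are hypothetical and are not taken from the disclosure.

```python
def quantify_loci(readouts, reference, presence_cutoff=0.1):
    """Return {locus: (normalized_value, present)} for raw assay readouts.

    readouts: dict mapping a locus name to a raw readout (e.g., a fluorescence value).
    reference: raw readout of a control used for normalization (must be positive).
    presence_cutoff: hypothetical threshold applied to the normalized value.
    """
    if reference <= 0:
        raise ValueError("reference readout must be positive")
    results = {}
    for locus, raw in readouts.items():
        normalized = raw / reference  # relative amount versus the control
        results[locus] = (normalized, normalized >= presence_cutoff)
    return results


# Hypothetical readouts for three loci, normalized to a control readout of 100.
example = quantify_loci(
    {"locus_A": 50.0, "locus_B": 2.0, "locus_C": 0.0},
    reference=100.0,
)
```

Here the normalized value plays the role of a relative amount, and the boolean plays the role of a presence or absence call; a real assay would define both according to its own calibration.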
- This example shows the use of nanopore sequencing without amplification to determine methylation status at single-base-pair resolution of an enriched nucleotide fragment.
- Low-input (cf)DNA is phosphorylated at the 5’ end to allow for custom primer ligation downstream.
- Phosphorylated (cf)DNA undergoes immunoprecipitation (IP) with the addition of biotinylated filler.
- Phosphorylated (cf)DNA undergoes immunoprecipitation with non-biotinylated filler, and an alternative filler depletion method is used after the immunoprecipitation.
- The filler is removed using streptavidin beads or alternative pull-down approaches.
- A custom double-stranded adaptor oligonucleotide, which includes a sample indexing sequence, is annealed to the phosphorylated 5’ end and leaves a 3’ overhang in the antisense strand.
- An Oxford Nanopore Technologies (ONT) motor protein is annealed to the adaptor.
- The enriched methylated DNA that is ligated to a sample index is multiplexed into a single pool prior to annealing of the ONT motor protein.
- The library is loaded and sequenced on a nanopore sequencer. The nanopore sequencer returns sequencing reads that describe the methylation status of the cfDNA.
- This example shows the use of nanopore sequencing without amplification to determine methylation status at single-base-pair resolution of an enriched nucleotide fragment.
- FIG. 5 design 2
- A custom oligonucleotide consisting of a unique sample barcode and a sequencing adaptor oligonucleotide with a 5’ phosphate is ligated to (cf)DNA.
- Phosphorylated (cf)DNA undergoes immunoprecipitation (IP) with the addition of biotinylated filler.
- Phosphorylated (cf)DNA undergoes immunoprecipitation with non-biotinylated filler, and an alternative filler depletion method is used after the immunoprecipitation.
- The filler is removed using streptavidin beads or alternative pull-down approaches.
- An oligonucleotide complementary to the custom adaptor sequence previously ligated to the (cf)DNA is annealed to the phosphorylated 5’ end and leaves a 3’ overhang in the antisense strand to create a double-stranded binding complex to which a motor protein can bind.
- An ONT motor protein is annealed to the adaptor.
- The enriched methylated DNA that is ligated to a sample index is multiplexed into a single pool prior to annealing of the ONT motor protein.
- The library is loaded and sequenced on a nanopore sequencer. The nanopore sequencer returns sequencing reads that describe the methylation status of the cfDNA.
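Because indexed samples are multiplexed into a single pool before sequencing, the reads have to be assigned back to their samples afterwards. The following is a minimal sketch of that idea, assuming the sample barcode sits at the very start of each read and is matched exactly. The barcode sequences and sample names are hypothetical; dedicated demultiplexing tools for nanopore data tolerate sequencing errors and do not rely on exact prefix matching.

```python
from collections import defaultdict


def demultiplex(reads, barcodes):
    """Group reads by sample using an exact barcode match at the read start.

    reads: iterable of read sequences (strings).
    barcodes: dict mapping a barcode sequence to a sample name (hypothetical values).
    Reads that match no barcode are collected under "unassigned".
    """
    by_sample = defaultdict(list)
    for read in reads:
        for barcode, sample in barcodes.items():
            if read.startswith(barcode):
                # Strip the barcode so that only the insert sequence is kept.
                by_sample[sample].append(read[len(barcode):])
                break
        else:
            # No barcode matched this read.
            by_sample["unassigned"].append(read)
    return dict(by_sample)


barcodes = {"ACGT": "sample_1", "TGCA": "sample_2"}
pooled_reads = ["ACGTGGGCCC", "TGCAAATTT", "ACGTTTTT", "GGGGGGGG"]
groups = demultiplex(pooled_reads, barcodes)
```

Counting the reads in each group gives per-sample read counts of the kind reported after demultiplexing.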
- This example shows the use of a cfMeDIP workflow coupled to nanopore sequencing to determine methylation status of an enriched nucleotide fragment.
- Nucleosomal cell-free DNA (ncfDNA) may be used to mimic cell-free DNA and its characteristics (e.g., as short nucleic acid fragments). This ncfDNA can be generated by obtaining nucleosomal DNA from viable cells through enzymatic digestion. As shown in FIG.
- A custom oligonucleotide comprising a unique sample barcode and a sequencing adaptor oligonucleotide was ligated to each ncfDNA sample for library preparation.
- The library preparations with the ncfDNA samples were subjected to phosphorylation at the 5’ end.
- the elution post library preparation clean-up with beads (e.g., solid phase reversible immobilization (SPRI) beads)
- non-biotinylated filler DNAs normal DNAs
- cfMeDIP Cell-free Methylated DNA Immunoprecipitation
- The samples were mixed with an oligonucleotide complementary to the custom adaptor sequence previously ligated to the cfDNA and subjected to heat denaturing to allow annealing to the phosphorylated 5’ end, leaving a 3’ overhang in the antisense strand to create a double-stranded binding complex to which a motor protein can bind.
- The samples were pooled together and cleaned up prior to having an ONT motor protein annealed to the adaptor. Then, the libraries were loaded and sequenced on a nanopore sequencer using a MinION flow cell.
- The sequencing results were analyzed with Dorado software. The results were also mapped to the hg38 human genome, the spike-in DNAs, and the filler DNAs. As shown in FIG. 8, 1,236,679 reads were mapped, of which 1,227,024 were mapped to the hg38 human genome. The read counts were also obtained after demultiplexing based on the i5 index, which represents barcodes that discriminate different samples. As shown in FIG. 9, the sample with 100 ng of ncfDNA (barcode 11) had the highest number of reads obtained. Further, analysis of the fragment size of the reads showed that reads equal to or less than 750 base pairs in length can be obtained. As shown in FIG. 10, 1,198,605 reads were generated that were equal to or less than 750 base pairs, while 28,149 reads were generated that were more than 750 base pairs.
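The read counts reported for FIG. 8 and FIG. 10 can be expressed as proportions with simple arithmetic, sketched below. The counts are those stated above; the choice to use the sum of the two size bins as the denominator for the size fraction is an assumption made for this illustration. With these counts, about 99.2% of mapped reads map to hg38 and about 97.7% of size-binned reads are 750 base pairs or shorter.

```python
def fraction(part, whole):
    """Return part / whole, rejecting a non-positive denominator."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return part / whole


# Counts reported in the example above.
total_mapped = 1_236_679   # all mapped reads (FIG. 8)
mapped_hg38 = 1_227_024    # reads mapped to the hg38 human genome (FIG. 8)
short_reads = 1_198_605    # reads of 750 base pairs or fewer (FIG. 10)
long_reads = 28_149        # reads of more than 750 base pairs (FIG. 10)

hg38_fraction = fraction(mapped_hg38, total_mapped)
# Assumption: the denominator is the sum of the two reported size bins.
short_fraction = fraction(short_reads, short_reads + long_reads)

print(f"Mapped to hg38: {hg38_fraction:.1%}")
print(f"Reads <= 750 bp: {short_fraction:.1%}")
```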
- Example 4: Nanopore on cfDNA (whole genome) versus enriched cfDNA (cfMeDIP-seq). This example shows the comparison between performing whole genome sequencing of a cfDNA sample on a nanopore flow cell and performing the nanopore workflow described in Example 3.
- ncfDNA contrived from a cancer cell line is spiked in at different concentrations into a background cfDNA sample obtained from pooled normal donor plasma.
- Two different libraries are generated: 1) whole genome libraries and 2) enriched libraries.
- The whole genome libraries are prepared using standard whole genome sequencing preparation and sequenced on a nanopore flow cell.
- The enriched libraries are prepared by subjecting the DNA to cfMeDIP coupled with the nanopore workflow sequencing preparation described in Example 3, prior to sequencing on a nanopore flow cell.
- Methylation calls are compared between the two methods to observe the difference in enrichment of methylated fragments in the enriched library compared to the whole genome library. More methylation calls are expected using the enriched libraries in comparison to the whole genome sample.
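One simple way to summarize such a comparison is a fold enrichment: the fraction of methylated calls in the enriched library divided by the fraction in the whole genome library. The sketch below illustrates this under that assumption; the call counts are hypothetical placeholders, not results from the disclosure.

```python
def methylated_fraction(methylated_calls, total_calls):
    """Return the fraction of calls that are methylated."""
    if total_calls <= 0:
        raise ValueError("total_calls must be positive")
    return methylated_calls / total_calls


def fold_enrichment(enriched, whole_genome):
    """Return the ratio of methylated fractions, enriched over whole genome.

    Each argument is a (methylated_calls, total_calls) tuple.
    """
    wg_fraction = methylated_fraction(*whole_genome)
    if wg_fraction == 0:
        raise ValueError("whole genome library has no methylated calls")
    return methylated_fraction(*enriched) / wg_fraction


# Hypothetical counts: 800 of 1,000 calls methylated in the enriched library,
# 200 of 1,000 calls methylated in the whole genome library.
ratio = fold_enrichment(enriched=(800, 1000), whole_genome=(200, 1000))
```

A ratio above 1 would be consistent with the expectation that the enriched library yields more methylation calls than the whole genome library.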
- This example shows the capabilities of using nanopore sequencing in comparison to Illumina sequencing (e.g., MiSeq) for the cfMeDIP-seq libraries.
- cfMeDIP-seq 10 ng of DNA inputs, including spike-in DNAs.
- The DNA inputs are obtained from ncfDNA contrived from a cancer cell line, spiked in at different concentrations into a background cfDNA sample obtained from pooled normal donor plasma.
- The samples are split into two different workflows. One set of replicates is subjected to the Illumina sequencing platform, while the other set of replicates is subjected to the nanopore sequencing platform, as described in Example 3. The numbers of unique molecules from the Illumina sequencing platform and the nanopore sequencing platform are compared.
- The spike-in counts are leveraged to show the methylation specificity difference between the two platforms.
- The concordance of the whole methylome to each platform is also analyzed. Further, the GC content distribution and the insert size distribution are compared between the outputs of the two platforms. An algorithm is run on the data generated from each platform to generate a score (e.g., methylation calls) for comparison between the two platforms.
- The single-nucleotide methylation resolution obtained via the nanopore sequencing platform is expected to allow more leniency in the methylation specificity of the reaction while removing bias associated with PCR amplification, as compared to the Illumina sequencing platform.
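The GC content and insert size comparisons mentioned above can be sketched with a few summary statistics per platform. The example below is illustrative only: it uses read length as a stand-in for insert size, which is an assumption, and the read sequences are hypothetical.

```python
from statistics import mean, median


def gc_content(sequence):
    """Return the fraction of G and C bases in a sequence (0.0 if empty)."""
    if not sequence:
        return 0.0
    upper = sequence.upper()
    return (upper.count("G") + upper.count("C")) / len(upper)


def summarize_reads(reads):
    """Return the mean GC content and median read length for a list of reads."""
    if not reads:
        raise ValueError("at least one read is required")
    return {
        "mean_gc": mean([gc_content(read) for read in reads]),
        "median_length": median([len(read) for read in reads]),
    }


# Hypothetical reads standing in for the output of each platform.
platform_a = summarize_reads(["GGCCGG", "ATAT", "GCATGCAT"])
platform_b = summarize_reads(["GGCC", "AATT"])
```

Comparing the two summaries side by side gives a first impression of platform differences; a fuller analysis would compare the whole distributions rather than single summary values.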
- This example shows the ability of the nanopore workflow, as described in Examples 1-3, to discern the methylation status of each fragment sequenced in pooled samples.
- DNA inputs are obtained from contrived cell line titrations into a pooled background of cfDNAs with additional spike-in DNAs.
- The samples of cfDNAs are next indexed by adding an indexing sequence or custom adaptors (e.g., barcode, UMI, tag) that ligate to the cfDNAs.
- Individually indexed samples can be aggregated into a single pool and subjected to immunoprecipitation, removing the need for adding exogenous filler nucleic acids (e.g., exogenous filler DNAs) since the remaining samples in the pool can act as filler nucleic acids by increasing the amount of nucleic acids in the overall sample.
- Ten samples, each with 10 ng of cfDNA, were pooled together, and 20 samples, each with 5 ng of cfDNA, were pooled together, resulting in two pools with 100 ng of total DNA in each.
- The number of samples that are used in the pooled sample may vary depending on the concentration of cfDNAs in each sample and the final total cfDNA amount in the pooled sample.
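The relationship between per-sample input, number of samples, and total pooled DNA is simple arithmetic, sketched below. The function names are hypothetical, and the sketch assumes every sample in a pool contributes the same amount.

```python
import math


def pool_total_ng(n_samples, ng_per_sample):
    """Return the total DNA (ng) in a pool of equally loaded samples."""
    return n_samples * ng_per_sample


def samples_needed(target_total_ng, ng_per_sample):
    """Return the smallest number of samples that reaches the target total."""
    if ng_per_sample <= 0:
        raise ValueError("ng_per_sample must be positive")
    return math.ceil(target_total_ng / ng_per_sample)


# The two pools described above both reach 100 ng in total.
pool_a = pool_total_ng(10, 10)   # 10 samples at 10 ng each
pool_b = pool_total_ng(20, 5)    # 20 samples at 5 ng each

# Working backwards from a 100 ng target.
needed_at_10_ng = samples_needed(100, 10)
needed_at_5_ng = samples_needed(100, 5)
```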
- The pooled samples are subjected to an immunoprecipitation reaction to enrich for methylated regions, followed by nanopore sequencing.
- The spike-in counts are leveraged to show the methylation specificity difference between pooled nanopore-based cfMeDIP-sequencing and a comparable platform or method (e.g., using cfDNAs that are not pooled).
- The concordance of the whole methylome to each platform is also analyzed. Further, the GC content distribution and the insert size distribution are compared between the outputs of using pooled cfDNAs and a comparable platform or method (e.g., using cfDNAs that are not pooled). The algorithm is also run on the sequencing data obtained to generate a score (e.g., methylation calls) for comparison.
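As an illustration of how spike-in counts might be used to compare methylation specificity, the sketch below computes the share of recovered spike-in reads that come from methylated spike-ins. This particular definition is an assumption made for the example and is not stated in the disclosure; the read counts are hypothetical.

```python
def methylated_spike_in_fraction(methylated_reads, unmethylated_reads):
    """Return the share of spike-in reads that come from methylated spike-ins.

    A value near 1.0 suggests that few unmethylated spike-ins were pulled down.
    """
    total = methylated_reads + unmethylated_reads
    if total <= 0:
        raise ValueError("no spike-in reads were counted")
    return methylated_reads / total


# Hypothetical spike-in read counts for a pooled and a non-pooled workflow.
pooled = methylated_spike_in_fraction(methylated_reads=990, unmethylated_reads=10)
not_pooled = methylated_spike_in_fraction(methylated_reads=950, unmethylated_reads=50)
difference = pooled - not_pooled
```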
Landscapes
- Chemical & Material Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Organic Chemistry (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Zoology (AREA)
- Wood Science & Technology (AREA)
- Health & Medical Sciences (AREA)
- Engineering & Computer Science (AREA)
- Analytical Chemistry (AREA)
- Microbiology (AREA)
- Immunology (AREA)
- Molecular Biology (AREA)
- Biotechnology (AREA)
- Biophysics (AREA)
- Physics & Mathematics (AREA)
- Biochemistry (AREA)
- Bioinformatics & Cheminformatics (AREA)
- General Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Genetics & Genomics (AREA)
- Chemical Kinetics & Catalysis (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
Abstract
The present invention relates to methods and systems for the targeted detection of circulating tumor DNA (ctDNA) molecules. In some cases, a molecular sequencing library depleted in methylated DNA can be generated and used to detect ctDNA in a cell-free DNA sample reliably, at a lower sequencing depth and at a lower cost than existing methods.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US202363490322P | 2023-03-15 | 2023-03-15 | |
US63/490,322 | 2023-03-15 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2024192294A1 (fr) | 2024-09-19 |
Family
ID=92756041
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2024/020012 WO2024192294A1 (fr) | 2023-03-15 | 2024-03-14 | Methods and systems for generating sequencing libraries |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2024192294A1 (fr) |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2021253138A1 (fr) * | 2020-06-19 | 2021-12-23 | University Health Network | Multimodal analysis of circulating tumor nucleic acid molecules |
- 2024-03-14 WO PCT/US2024/020012 patent/WO2024192294A1/fr unknown
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2021253138A1 (fr) * | 2020-06-19 | 2021-12-23 | University Health Network | Multimodal analysis of circulating tumor nucleic acid molecules |
Non-Patent Citations (1)
Title |
---|
SIMPSON, J. T. ET AL.: "Detecting DNA cytosine methylation using nanopore sequencing", NATURE METHODS, vol. 14, no. 4, 2017, pages 407 - 410, XP055660941, DOI: 10.1038/nmeth.4184 * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20220411877A1 (en) | Locked nucleic acids for capturing fusion genes | |
JP2023139162A (ja) | Cancer detection and classification using methylome analysis | |
JP2019520791A (ja) | Noninvasive diagnosis by sequencing 5-hydroxymethylated cell-free DNA | |
KR20210023804A (ko) | Tissue-specific methylation marker | |
AU2021291586B2 (en) | Multimodal analysis of circulating tumor nucleic acid molecules | |
EP3688195A1 (fr) | Biomarkers for colorectal cancer detection | |
JP2024056984A (ja) | Methods, compositions and systems for calibrating epigenetic partitioning assays | |
US20230203590A1 (en) | Methods and means for diagnosing lung cancer | |
CN112210601B (zh) | Colorectal cancer screening kit based on stool samples | |
US20220084632A1 (en) | Clinical classfiers and genomic classifiers and uses thereof | |
WO2023226939A1 (fr) | Methylation biomarker for detecting lymph node metastasis in colorectal cancer and use thereof | |
WO2022262831A1 (fr) | Substance and method for tumor assessment | |
WO2024192294A1 (fr) | Methods and systems for generating sequencing libraries | |
CN112210602B (zh) | Colorectal cancer screening method based on stool samples | |
WO2024216205A1 (fr) | Methods and systems for processing cell-free nucleic acid | |
CA3151627A1 (fr) | Use of simultaneous marker detection for assessing diffuse glioma and responsiveness to treatment | |
WO2023230289A1 (fr) | Methods and systems for processing cell-free nucleic acids | |
EP4444916A1 (fr) | Methods and systems for generating sequencing libraries | |
CN118679267A (zh) | Methods and systems for generating sequencing libraries | |
Dong et al. | The Progress of the Specific and Rapid Genetic Detection Methods for Ovarian Cancer Diagnosis and Treatment | |
WO2023048713A1 (fr) | Compositions and methods for targeted NGS sequencing of cfRNA and cfTNA | |
Michel et al. | Non-invasive multi-cancer detection using DNA hypomethylation of LINE-1 retrotransposons | |
EP4409039A1 (fr) | Personalized liquid biopsies for cancer using primers from a primer bank | |
WO2022120076A1 (fr) | Clinical classifiers and genomic classifiers and uses thereof | |
CN115772566A (zh) | Methylation biomarker for assisting detection of somatic ERBB2 gene mutations in lung cancer and use thereof |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 24771769 Country of ref document: EP Kind code of ref document: A1 |