WO2020181202A1 - A:T to T:A base editing through adenine deamination and oxidation - Google Patents
A:T to T:A base editing through adenine deamination and oxidation
- Publication number
- WO2020181202A1 PCT/US2020/021429
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- cas9
- seq
- oxidase
- fusion protein
- napdnabp
- Prior art date
Links
- 229960000643 adenine Drugs 0.000 title claims abstract description 42
- GFFGJBXGBJISGV-UHFFFAOYSA-N Adenine Chemical compound NC1=NC=NC2=C1N=CN2 GFFGJBXGBJISGV-UHFFFAOYSA-N 0.000 title claims abstract description 38
- 229930024421 Adenine Natural products 0.000 title claims abstract description 38
- 238000007254 oxidation reaction Methods 0.000 title description 18
- 238000006481 deamination reaction Methods 0.000 title description 14
- 230000009615 deamination Effects 0.000 title description 13
- 230000003647 oxidation Effects 0.000 title description 10
- 101710169336 5'-deoxyadenosine deaminase Proteins 0.000 claims abstract description 430
- 230000035772 mutation Effects 0.000 claims abstract description 424
- 150000007523 nucleic acids Chemical class 0.000 claims abstract description 145
- 102000004316 Oxidoreductases Human genes 0.000 claims abstract description 143
- 108090000854 Oxidoreductases Proteins 0.000 claims abstract description 143
- 102000039446 nucleic acids Human genes 0.000 claims abstract description 133
- 108020004707 nucleic acids Proteins 0.000 claims abstract description 133
- 102000055025 Adenosine deaminases Human genes 0.000 claims abstract description 132
- 238000000034 method Methods 0.000 claims abstract description 110
- RWQNBRDOKXIBIV-UHFFFAOYSA-N thymine Chemical compound CC1=CNC(=O)NC1=O RWQNBRDOKXIBIV-UHFFFAOYSA-N 0.000 claims abstract description 48
- 239000013598 vector Substances 0.000 claims abstract description 40
- 229940113082 thymine Drugs 0.000 claims abstract description 29
- 102000052510 DNA-Binding Proteins Human genes 0.000 claims abstract description 20
- 101710096438 DNA-binding protein Proteins 0.000 claims abstract description 18
- 239000008194 pharmaceutical composition Substances 0.000 claims abstract description 12
- 108091033409 CRISPR Proteins 0.000 claims description 420
- 102000037865 fusion proteins Human genes 0.000 claims description 179
- 108020001507 fusion proteins Proteins 0.000 claims description 179
- 125000003275 alpha amino acid group Chemical group 0.000 claims description 152
- 108020004414 DNA Proteins 0.000 claims description 95
- 102000053602 DNA Human genes 0.000 claims description 94
- 108020005004 Guide RNA Proteins 0.000 claims description 80
- 102220366762 c.439G>T Human genes 0.000 claims description 71
- 125000003178 carboxy group Chemical group [H]OC(*)=O 0.000 claims description 66
- 102220089709 rs869320709 Human genes 0.000 claims description 55
- 101710163270 Nuclease Proteins 0.000 claims description 54
- OIRDTQYFTABQOQ-KQYNXXCUSA-N adenosine Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@@H]1O[C@H](CO)[C@@H](O)[C@H]1O OIRDTQYFTABQOQ-KQYNXXCUSA-N 0.000 claims description 50
- 108010008532 Deoxyribonuclease I Proteins 0.000 claims description 48
- 102000007260 Deoxyribonuclease I Human genes 0.000 claims description 48
- 241000282414 Homo sapiens Species 0.000 claims description 46
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 claims description 42
- 102220484559 C-type lectin domain family 4 member A_H36L_mutation Human genes 0.000 claims description 33
- 102220182843 rs182603751 Human genes 0.000 claims description 33
- 102200018639 rs122458142 Human genes 0.000 claims description 32
- 102220082375 rs863224226 Human genes 0.000 claims description 32
- 102200012576 rs111033648 Human genes 0.000 claims description 31
- 229930010555 Inosine Natural products 0.000 claims description 29
- 229960003786 inosine Drugs 0.000 claims description 29
- 239000003112 inhibitor Substances 0.000 claims description 26
- 230000014509 gene expression Effects 0.000 claims description 25
- 201000010099 disease Diseases 0.000 claims description 24
- 239000002126 C01EB10 - Adenosine Substances 0.000 claims description 22
- 229960005305 adenosine Drugs 0.000 claims description 22
- 125000000539 amino acid group Chemical group 0.000 claims description 22
- 230000000295 complement effect Effects 0.000 claims description 22
- 102000040430 polynucleotide Human genes 0.000 claims description 22
- 108091033319 polynucleotide Proteins 0.000 claims description 22
- 239000002157 polynucleotide Substances 0.000 claims description 22
- 102220584721 Coordinator of PRMT5 and differentiation stimulator_P48A_mutation Human genes 0.000 claims description 21
- 241000588724 Escherichia coli Species 0.000 claims description 19
- UGQMRVRMYYASKQ-KQYNXXCUSA-N Inosine Chemical group O[C@@H]1[C@H](O)[C@@H](CO)O[C@H]1N1C2=NC=NC(O)=C2N=C1 UGQMRVRMYYASKQ-KQYNXXCUSA-N 0.000 claims description 19
- 238000006467 substitution reaction Methods 0.000 claims description 19
- 102000008682 Argonaute Proteins Human genes 0.000 claims description 18
- 108010088141 Argonaute Proteins Proteins 0.000 claims description 18
- 208000035475 disorder Diseases 0.000 claims description 18
- NEACNHFWOGSYJL-UUOKFMHZSA-N 9-[(2R,3R,4S,5R)-3,4-dihydroxy-5-(hydroxymethyl)oxolan-2-yl]purine-6,8-dione Chemical compound O=C1N([C@H]2[C@H](O)[C@H](O)[C@@H](CO)O2)C2=NC=NC(C2=N1)=O NEACNHFWOGSYJL-UUOKFMHZSA-N 0.000 claims description 16
- 108091028043 Nucleic acid sequence Proteins 0.000 claims description 16
- 230000033590 base-excision repair Effects 0.000 claims description 14
- 102220273513 rs373435521 Human genes 0.000 claims description 13
- 102220138225 rs759718991 Human genes 0.000 claims description 8
- 230000008685 targeting Effects 0.000 claims description 7
- FWMNVWWHGCHHJJ-SKKKGAJSSA-N 4-amino-1-[(2r)-6-amino-2-[[(2r)-2-[[(2r)-2-[[(2r)-2-amino-3-phenylpropanoyl]amino]-3-phenylpropanoyl]amino]-4-methylpentanoyl]amino]hexanoyl]piperidine-4-carboxylic acid Chemical compound C([C@H](C(=O)N[C@H](CC(C)C)C(=O)N[C@H](CCCCN)C(=O)N1CCC(N)(CC1)C(O)=O)NC(=O)[C@H](N)CC=1C=CC=CC=1)C1=CC=CC=C1 FWMNVWWHGCHHJJ-SKKKGAJSSA-N 0.000 claims description 6
- 102000005773 Xanthine dehydrogenase Human genes 0.000 claims description 6
- 108010091383 Xanthine dehydrogenase Proteins 0.000 claims description 6
- 108010093894 Xanthine oxidase Proteins 0.000 claims description 6
- 101150069031 CSN2 gene Proteins 0.000 claims description 5
- 101100378854 Mus musculus Alkbh1 gene Proteins 0.000 claims description 5
- 101150055601 cops2 gene Proteins 0.000 claims description 5
- UBKVUFQGVWHZIR-UHFFFAOYSA-N 8-oxoguanine Chemical group O=C1NC(N)=NC2=NC(=O)N=C21 UBKVUFQGVWHZIR-UHFFFAOYSA-N 0.000 claims description 4
- 102220471150 CUGBP Elav-like family member 6_R152P_mutation Human genes 0.000 claims description 4
- 238000010367 cloning Methods 0.000 claims description 4
- 238000000338 in vitro Methods 0.000 claims description 4
- 238000001727 in vivo Methods 0.000 claims description 4
- 206010053138 Congenital aplastic anaemia Diseases 0.000 claims description 3
- 208000001625 Ectodermal dysplasia-skin fragility syndrome Diseases 0.000 claims description 3
- 201000004939 Fanconi anemia Diseases 0.000 claims description 3
- 206010029748 Noonan syndrome Diseases 0.000 claims description 3
- 206010011005 corneal dystrophy Diseases 0.000 claims description 3
- 208000022592 epidermolysis bullosa simplex due to plakophilin deficiency Diseases 0.000 claims description 3
- 201000003775 lattice corneal dystrophy Diseases 0.000 claims description 3
- 239000000546 pharmaceutical excipient Substances 0.000 claims description 3
- 208000007056 sickle cell anemia Diseases 0.000 claims description 3
- 238000010442 DNA editing Methods 0.000 claims 2
- 101100214576 Mus musculus Mpg gene Proteins 0.000 claims 1
- 241000282577 Pan troglodytes Species 0.000 claims 1
- 239000003814 drug Substances 0.000 claims 1
- 102220335283 rs574731221 Human genes 0.000 claims 1
- 238000000926 separation method Methods 0.000 claims 1
- 238000009434 installation Methods 0.000 abstract description 4
- 230000007812 deficiency Effects 0.000 abstract 1
- 102100036664 Adenosine deaminase Human genes 0.000 description 353
- 108090000623 proteins and genes Proteins 0.000 description 229
- 235000001014 amino acid Nutrition 0.000 description 222
- 229940024606 amino acid Drugs 0.000 description 214
- 150000001413 amino acids Chemical class 0.000 description 200
- 102000004169 proteins and genes Human genes 0.000 description 155
- 235000018102 proteins Nutrition 0.000 description 152
- 210000004027 cell Anatomy 0.000 description 124
- 230000000694 effects Effects 0.000 description 63
- 108700040115 Adenosine deaminases Proteins 0.000 description 55
- 239000002773 nucleotide Substances 0.000 description 53
- 125000003729 nucleotide group Chemical group 0.000 description 51
- 108010077850 Nuclear Localization Signals Proteins 0.000 description 43
- 101000910035 Streptococcus pyogenes serotype M1 CRISPR-associated endonuclease Cas9/Csn1 Proteins 0.000 description 42
- 108090000765 processed proteins & peptides Proteins 0.000 description 41
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 36
- 230000004048 modification Effects 0.000 description 36
- 238000012986 modification Methods 0.000 description 36
- 241000193996 Streptococcus pyogenes Species 0.000 description 29
- 239000013603 viral vector Substances 0.000 description 29
- 239000012634 fragment Substances 0.000 description 28
- 102000004190 Enzymes Human genes 0.000 description 24
- 108090000790 Enzymes Proteins 0.000 description 24
- 229940088598 enzyme Drugs 0.000 description 24
- 239000001678 brown HT Substances 0.000 description 23
- 108091079001 CRISPR RNA Proteins 0.000 description 22
- 102000004196 processed proteins & peptides Human genes 0.000 description 22
- 230000003612 virological effect Effects 0.000 description 22
- 230000006870 function Effects 0.000 description 21
- 229920001184 polypeptide Polymers 0.000 description 21
- 238000010362 genome editing Methods 0.000 description 19
- 230000000670 limiting effect Effects 0.000 description 19
- 230000027455 binding Effects 0.000 description 18
- 238000006243 chemical reaction Methods 0.000 description 18
- 239000013612 plasmid Substances 0.000 description 18
- -1 rRNA Proteins 0.000 description 17
- 108091028113 Trans-activating crRNA Proteins 0.000 description 16
- 238000003776 cleavage reaction Methods 0.000 description 16
- 230000008569 process Effects 0.000 description 16
- 230000007017 scission Effects 0.000 description 16
- 210000004899 c-terminal region Anatomy 0.000 description 15
- 239000000047 product Substances 0.000 description 15
- 238000012546 transfer Methods 0.000 description 15
- 239000002245 particle Substances 0.000 description 14
- 230000001580 bacterial effect Effects 0.000 description 13
- 239000013256 coordination polymer Substances 0.000 description 13
- 230000010076 replication Effects 0.000 description 13
- 101710119400 Geranylfarnesyl diphosphate synthase Proteins 0.000 description 12
- 238000012217 deletion Methods 0.000 description 12
- 230000037430 deletion Effects 0.000 description 12
- 108020004705 Codon Proteins 0.000 description 11
- 230000007018 DNA scission Effects 0.000 description 11
- 230000008859 change Effects 0.000 description 11
- 238000012937 correction Methods 0.000 description 11
- 239000012636 effector Substances 0.000 description 11
- 241000894006 Bacteria Species 0.000 description 10
- 102220576552 HLA class I histocompatibility antigen, A alpha chain_W23R_mutation Human genes 0.000 description 10
- 229930182817 methionine Natural products 0.000 description 10
- 230000033607 mismatch repair Effects 0.000 description 10
- 229920000642 polymer Polymers 0.000 description 10
- 125000002924 primary amino group Chemical group [H]N([H])* 0.000 description 10
- 102000004533 Endonucleases Human genes 0.000 description 9
- 108010042407 Endonucleases Proteins 0.000 description 9
- 102100033220 Xanthine oxidase Human genes 0.000 description 9
- 229960003767 alanine Drugs 0.000 description 9
- 230000015572 biosynthetic process Effects 0.000 description 9
- 239000003795 chemical substances by application Substances 0.000 description 9
- 238000005755 formation reaction Methods 0.000 description 9
- 208000015181 infectious disease Diseases 0.000 description 9
- 238000003780 insertion Methods 0.000 description 9
- 230000037431 insertion Effects 0.000 description 9
- 239000000178 monomer Substances 0.000 description 9
- 230000008439 repair process Effects 0.000 description 9
- 102000040650 (ribonucleotides)n+m Human genes 0.000 description 8
- QNAYBMKLOCPYGJ-REOHCLBHSA-N L-alanine Chemical compound C[C@H](N)C(O)=O QNAYBMKLOCPYGJ-REOHCLBHSA-N 0.000 description 8
- 102220517488 Phosphate-regulating neutral endopeptidase PHEX_R26Q_mutation Human genes 0.000 description 8
- 235000004279 alanine Nutrition 0.000 description 8
- 210000004900 c-terminal fragment Anatomy 0.000 description 8
- 230000003197 catalytic effect Effects 0.000 description 8
- 239000013078 crystal Substances 0.000 description 8
- 239000000539 dimer Substances 0.000 description 8
- 230000004927 fusion Effects 0.000 description 8
- 230000001404 mediated effect Effects 0.000 description 8
- 239000000126 substance Substances 0.000 description 8
- 208000024891 symptom Diseases 0.000 description 8
- 230000033616 DNA repair Effects 0.000 description 7
- 102100026406 G/T mismatch-specific thymine DNA glycosylase Human genes 0.000 description 7
- 108010035344 Thymine DNA Glycosylase Proteins 0.000 description 7
- 230000001419 dependent effect Effects 0.000 description 7
- 230000002068 genetic effect Effects 0.000 description 7
- 239000000710 homodimer Substances 0.000 description 7
- 210000005260 human cell Anatomy 0.000 description 7
- 238000009396 hybridization Methods 0.000 description 7
- 235000018977 lysine Nutrition 0.000 description 7
- 239000000203 mixture Substances 0.000 description 7
- 108091026890 Coding region Proteins 0.000 description 6
- 108010043121 Green Fluorescent Proteins Proteins 0.000 description 6
- 102000004144 Green Fluorescent Proteins Human genes 0.000 description 6
- FFEARJCKVFRZRR-BYPYZUCNSA-N L-methionine Chemical compound CSCC[C@H](N)C(O)=O FFEARJCKVFRZRR-BYPYZUCNSA-N 0.000 description 6
- KDXKERNSBIXSRK-UHFFFAOYSA-N Lysine Natural products NCCCCC(N)C(O)=O KDXKERNSBIXSRK-UHFFFAOYSA-N 0.000 description 6
- 239000004472 Lysine Substances 0.000 description 6
- 208000009869 Neu-Laxova syndrome Diseases 0.000 description 6
- 102000007056 Recombinant Fusion Proteins Human genes 0.000 description 6
- 108010008281 Recombinant Fusion Proteins Proteins 0.000 description 6
- 102000006382 Ribonucleases Human genes 0.000 description 6
- 108010083644 Ribonucleases Proteins 0.000 description 6
- 239000008186 active pharmaceutical agent Substances 0.000 description 6
- 235000009697 arginine Nutrition 0.000 description 6
- 238000005516 engineering process Methods 0.000 description 6
- 239000005090 green fluorescent protein Substances 0.000 description 6
- HNDVDQJCIGZPNO-UHFFFAOYSA-N histidine Natural products OC(=O)C(N)CC1=CN=CN1 HNDVDQJCIGZPNO-UHFFFAOYSA-N 0.000 description 6
- 235000014304 histidine Nutrition 0.000 description 6
- 230000001965 increasing effect Effects 0.000 description 6
- 125000001360 methionine group Chemical group N[C@@H](CCSC)C(=O)* 0.000 description 6
- 231100000219 mutagenic Toxicity 0.000 description 6
- 230000003505 mutagenic effect Effects 0.000 description 6
- 238000012545 processing Methods 0.000 description 6
- 238000000746 purification Methods 0.000 description 6
- 241000894007 species Species 0.000 description 6
- 210000001519 tissue Anatomy 0.000 description 6
- 241001515965 unidentified phage Species 0.000 description 6
- 101710191958 Amino-acid acetyltransferase Proteins 0.000 description 5
- 239000004475 Arginine Substances 0.000 description 5
- DCXYFEDJOCDNAF-UHFFFAOYSA-N Asparagine Natural products OC(=O)C(N)CC(N)=O DCXYFEDJOCDNAF-UHFFFAOYSA-N 0.000 description 5
- 108091032955 Bacterial small RNA Proteins 0.000 description 5
- 238000010356 CRISPR-Cas9 genome editing Methods 0.000 description 5
- 230000004568 DNA-binding Effects 0.000 description 5
- 102000005720 Glutathione transferase Human genes 0.000 description 5
- 108010070675 Glutathione transferase Proteins 0.000 description 5
- 102000029812 HNH nuclease Human genes 0.000 description 5
- 108060003760 HNH nuclease Proteins 0.000 description 5
- 208000026350 Inborn Genetic disease Diseases 0.000 description 5
- DCXYFEDJOCDNAF-REOHCLBHSA-N L-asparagine Chemical compound OC(=O)[C@@H](N)CC(N)=O DCXYFEDJOCDNAF-REOHCLBHSA-N 0.000 description 5
- ROHFNLRQFUQHCH-YFKPBYRVSA-N L-leucine Chemical compound CC(C)C[C@H](N)C(O)=O ROHFNLRQFUQHCH-YFKPBYRVSA-N 0.000 description 5
- 241000169176 Natronobacterium gregoryi Species 0.000 description 5
- 241000194020 Streptococcus thermophilus Species 0.000 description 5
- AYFVYJQAPQTCCC-UHFFFAOYSA-N Threonine Natural products CC(O)C(N)C(O)=O AYFVYJQAPQTCCC-UHFFFAOYSA-N 0.000 description 5
- 239000004473 Threonine Substances 0.000 description 5
- 108020004566 Transfer RNA Proteins 0.000 description 5
- 241000700605 Viruses Species 0.000 description 5
- 206010002026 amyotrophic lateral sclerosis Diseases 0.000 description 5
- ODKSFYDXXFIFQN-UHFFFAOYSA-N arginine Natural products OC(=O)C(N)CCCNC(N)=N ODKSFYDXXFIFQN-UHFFFAOYSA-N 0.000 description 5
- 235000009582 asparagine Nutrition 0.000 description 5
- 229960001230 asparagine Drugs 0.000 description 5
- 230000008901 benefit Effects 0.000 description 5
- 230000021615 conjugation Effects 0.000 description 5
- 239000013604 expression vector Substances 0.000 description 5
- 208000016361 genetic disease Diseases 0.000 description 5
- 230000002458 infectious effect Effects 0.000 description 5
- 150000002632 lipids Chemical class 0.000 description 5
- 230000004777 loss-of-function mutation Effects 0.000 description 5
- 230000030648 nucleus localization Effects 0.000 description 5
- FFNMBRCFFADNAO-UHFFFAOYSA-N pirenzepine hydrochloride Chemical compound [H+].[H+].[Cl-].[Cl-].C1CN(C)CCN1CC(=O)N1C2=NC=CC=C2NC(=O)C2=CC=CC=C21 FFNMBRCFFADNAO-UHFFFAOYSA-N 0.000 description 5
- 108020001580 protein domains Proteins 0.000 description 5
- UHDGCWIWMRVCDJ-UHFFFAOYSA-N 1-beta-D-Xylofuranosyl-NH-Cytosine Natural products O=C1N=C(N)C=CN1C1C(O)C(O)C(CO)O1 UHDGCWIWMRVCDJ-UHFFFAOYSA-N 0.000 description 4
- 101710172824 CRISPR-associated endonuclease Cas9 Proteins 0.000 description 4
- 241000010804 Caulobacter vibrioides Species 0.000 description 4
- 108010035563 Chloramphenicol O-acetyltransferase Proteins 0.000 description 4
- UHDGCWIWMRVCDJ-PSQAKQOGSA-N Cytidine Natural products O=C1N=C(N)C=CN1[C@@H]1[C@@H](O)[C@@H](O)[C@H](CO)O1 UHDGCWIWMRVCDJ-PSQAKQOGSA-N 0.000 description 4
- 102100026846 Cytidine deaminase Human genes 0.000 description 4
- 108010031325 Cytidine deaminase Proteins 0.000 description 4
- 108010028143 Dioxygenases Proteins 0.000 description 4
- 102000016680 Dioxygenases Human genes 0.000 description 4
- 241001524679 Escherichia virus M13 Species 0.000 description 4
- 108091092584 GDNA Proteins 0.000 description 4
- DHMQDGOQFOQNFH-UHFFFAOYSA-N Glycine Chemical compound NCC(O)=O DHMQDGOQFOQNFH-UHFFFAOYSA-N 0.000 description 4
- NYHBQMYGNKIUIF-UUOKFMHZSA-N Guanosine Chemical compound C1=NC=2C(=O)NC(N)=NC=2N1[C@@H]1O[C@H](CO)[C@@H](O)[C@H]1O NYHBQMYGNKIUIF-UUOKFMHZSA-N 0.000 description 4
- 101710154606 Hemagglutinin Proteins 0.000 description 4
- 108010001336 Horseradish Peroxidase Proteins 0.000 description 4
- CKLJMWTZIZZHCS-REOHCLBHSA-N L-aspartic acid Chemical compound OC(=O)[C@@H](N)CC(O)=O CKLJMWTZIZZHCS-REOHCLBHSA-N 0.000 description 4
- AGPKZVBTJJNPAG-WHFBIAKZSA-N L-isoleucine Chemical compound CC[C@H](C)[C@H](N)C(O)=O AGPKZVBTJJNPAG-WHFBIAKZSA-N 0.000 description 4
- KZSNJWFQEVHDMF-BYPYZUCNSA-N L-valine Chemical compound CC(C)[C@H](N)C(O)=O KZSNJWFQEVHDMF-BYPYZUCNSA-N 0.000 description 4
- ROHFNLRQFUQHCH-UHFFFAOYSA-N Leucine Natural products CC(C)CC(N)C(O)=O ROHFNLRQFUQHCH-UHFFFAOYSA-N 0.000 description 4
- 101710175625 Maltose/maltodextrin-binding periplasmic protein Proteins 0.000 description 4
- 101710093908 Outer capsid protein VP4 Proteins 0.000 description 4
- 101710135467 Outer capsid protein sigma-1 Proteins 0.000 description 4
- 101710176177 Protein A56 Proteins 0.000 description 4
- MTCFGRXMJLQNBG-UHFFFAOYSA-N Serine Natural products OCC(N)C(O)=O MTCFGRXMJLQNBG-UHFFFAOYSA-N 0.000 description 4
- 241000863432 Shewanella putrefaciens Species 0.000 description 4
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 4
- IQFYYKKMVGJFEH-XLPZGREQSA-N Thymidine Chemical compound O=C1NC(=O)C(C)=CN1[C@@H]1O[C@H](CO)[C@@H](O)C1 IQFYYKKMVGJFEH-XLPZGREQSA-N 0.000 description 4
- DRTQHJPVMGBUCF-XVFCMESISA-N Uridine Chemical compound O[C@@H]1[C@H](O)[C@@H](CO)O[C@H]1N1C(=O)NC(=O)C=C1 DRTQHJPVMGBUCF-XVFCMESISA-N 0.000 description 4
- KZSNJWFQEVHDMF-UHFFFAOYSA-N Valine Natural products CC(C)C(N)C(O)=O KZSNJWFQEVHDMF-UHFFFAOYSA-N 0.000 description 4
- QTBSBXVTEAMEQO-UHFFFAOYSA-N acetic acid Substances CC(O)=O QTBSBXVTEAMEQO-UHFFFAOYSA-N 0.000 description 4
- 238000013459 approach Methods 0.000 description 4
- 238000003556 assay Methods 0.000 description 4
- UCMIRNVEIXFBKS-UHFFFAOYSA-N beta-alanine Chemical compound NCCC(O)=O UCMIRNVEIXFBKS-UHFFFAOYSA-N 0.000 description 4
- 108091005948 blue fluorescent proteins Proteins 0.000 description 4
- 238000012512 characterization method Methods 0.000 description 4
- 238000005520 cutting process Methods 0.000 description 4
- 108010082025 cyan fluorescent protein Proteins 0.000 description 4
- UHDGCWIWMRVCDJ-ZAKLUEHWSA-N cytidine Chemical compound O=C1N=C(N)C=CN1[C@H]1[C@H](O)[C@@H](O)[C@H](CO)O1 UHDGCWIWMRVCDJ-ZAKLUEHWSA-N 0.000 description 4
- 239000000185 hemagglutinin Substances 0.000 description 4
- 230000003993 interaction Effects 0.000 description 4
- 229960000310 isoleucine Drugs 0.000 description 4
- AGPKZVBTJJNPAG-UHFFFAOYSA-N isoleucine Natural products CCC(C)C(N)C(O)=O AGPKZVBTJJNPAG-UHFFFAOYSA-N 0.000 description 4
- 230000003902 lesion Effects 0.000 description 4
- 230000004807 localization Effects 0.000 description 4
- 230000035800 maturation Effects 0.000 description 4
- 108020004999 messenger RNA Proteins 0.000 description 4
- 201000011540 mitochondrial DNA depletion syndrome 4a Diseases 0.000 description 4
- 238000010369 molecular cloning Methods 0.000 description 4
- 239000002777 nucleoside Substances 0.000 description 4
- 210000004940 nucleus Anatomy 0.000 description 4
- 230000001105 regulatory effect Effects 0.000 description 4
- 102220062649 rs786204195 Human genes 0.000 description 4
- 102220097735 rs876659105 Human genes 0.000 description 4
- 238000002741 site-directed mutagenesis Methods 0.000 description 4
- 125000006850 spacer group Chemical group 0.000 description 4
- 238000013518 transcription Methods 0.000 description 4
- 230000035897 transcription Effects 0.000 description 4
- 238000010361 transduction Methods 0.000 description 4
- 230000026683 transduction Effects 0.000 description 4
- 238000001890 transfection Methods 0.000 description 4
- 239000004474 valine Substances 0.000 description 4
- 108091005957 yellow fluorescent proteins Proteins 0.000 description 4
- ZDTFMPXQUSBYRL-UUOKFMHZSA-N 2-Aminoadenosine Chemical compound C12=NC(N)=NC(N)=C2N=CN1[C@@H]1O[C@H](CO)[C@@H](O)[C@H]1O ZDTFMPXQUSBYRL-UUOKFMHZSA-N 0.000 description 3
- 102100034112 Alkyldihydroxyacetonephosphate synthase, peroxisomal Human genes 0.000 description 3
- 241000282472 Canis lupus familiaris Species 0.000 description 3
- 241000282326 Felis catus Species 0.000 description 3
- 241000606768 Haemophilus influenzae Species 0.000 description 3
- 101000799143 Homo sapiens Alkyldihydroxyacetonephosphate synthase, peroxisomal Proteins 0.000 description 3
- 108010015268 Integration Host Factors Proteins 0.000 description 3
- WHUUTDBJXJRKMK-VKHMYHEASA-N L-glutamic acid Chemical compound OC(=O)[C@@H](N)CCC(O)=O WHUUTDBJXJRKMK-VKHMYHEASA-N 0.000 description 3
- ZDXPYRJPNDTMRX-VKHMYHEASA-N L-glutamine Chemical compound OC(=O)[C@@H](N)CCC(N)=O ZDXPYRJPNDTMRX-VKHMYHEASA-N 0.000 description 3
- COLNVLDHVKWLRT-QMMMGPOBSA-N L-phenylalanine Chemical compound OC(=O)[C@@H](N)CC1=CC=CC=C1 COLNVLDHVKWLRT-QMMMGPOBSA-N 0.000 description 3
- OUYCCCASQSFEME-QMMMGPOBSA-N L-tyrosine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-QMMMGPOBSA-N 0.000 description 3
- 108060001084 Luciferase Proteins 0.000 description 3
- 239000005089 Luciferase Substances 0.000 description 3
- 108020004485 Nonsense Codon Proteins 0.000 description 3
- 108091007494 Nucleic acid- binding domains Proteins 0.000 description 3
- 108700026244 Open Reading Frames Proteins 0.000 description 3
- 239000002202 Polyethylene glycol Substances 0.000 description 3
- 102000018120 Recombinases Human genes 0.000 description 3
- 108010091086 Recombinases Proteins 0.000 description 3
- 108700008625 Reporter Genes Proteins 0.000 description 3
- 102000003661 Ribonuclease III Human genes 0.000 description 3
- 108010057163 Ribonuclease III Proteins 0.000 description 3
- 240000004808 Saccharomyces cerevisiae Species 0.000 description 3
- 235000014680 Saccharomyces cerevisiae Nutrition 0.000 description 3
- 108020004682 Single-Stranded DNA Proteins 0.000 description 3
- 102220582735 Solute carrier family 2, facilitated glucose transporter member 1_R51H_mutation Human genes 0.000 description 3
- 241000191967 Staphylococcus aureus Species 0.000 description 3
- 241001600132 Streptomyces cyanogenus Species 0.000 description 3
- 108010072685 Uracil-DNA Glycosidase Proteins 0.000 description 3
- 102000006943 Uracil-DNA Glycosidase Human genes 0.000 description 3
- 230000004913 activation Effects 0.000 description 3
- 230000003044 adaptive effect Effects 0.000 description 3
- 230000004075 alteration Effects 0.000 description 3
- 150000001408 amides Chemical group 0.000 description 3
- 230000008970 bacterial immunity Effects 0.000 description 3
- 210000003855 cell nucleus Anatomy 0.000 description 3
- 108091092356 cellular DNA Proteins 0.000 description 3
- 230000001413 cellular effect Effects 0.000 description 3
- 210000000349 chromosome Anatomy 0.000 description 3
- 238000011161 development Methods 0.000 description 3
- 230000018109 developmental process Effects 0.000 description 3
- 230000005782 double-strand break Effects 0.000 description 3
- 210000003527 eukaryotic cell Anatomy 0.000 description 3
- ZDXPYRJPNDTMRX-UHFFFAOYSA-N glutamine Natural products OC(=O)C(N)CCC(N)=O ZDXPYRJPNDTMRX-UHFFFAOYSA-N 0.000 description 3
- 230000003301 hydrolyzing effect Effects 0.000 description 3
- 230000001939 inductive effect Effects 0.000 description 3
- 238000002955 isolation Methods 0.000 description 3
- 210000004962 mammalian cell Anatomy 0.000 description 3
- 230000000813 microbial effect Effects 0.000 description 3
- 238000005457 optimization Methods 0.000 description 3
- 230000001590 oxidative effect Effects 0.000 description 3
- 230000001717 pathogenic effect Effects 0.000 description 3
- 230000037361 pathway Effects 0.000 description 3
- COLNVLDHVKWLRT-UHFFFAOYSA-N phenylalanine Natural products OC(=O)C(N)CC1=CC=CC=C1 COLNVLDHVKWLRT-UHFFFAOYSA-N 0.000 description 3
- 229920001223 polyethylene glycol Polymers 0.000 description 3
- 102000054765 polymorphisms of proteins Human genes 0.000 description 3
- 230000004952 protein activity Effects 0.000 description 3
- 102200005752 rs370823171 Human genes 0.000 description 3
- 238000002864 sequence alignment Methods 0.000 description 3
- 238000012360 testing method Methods 0.000 description 3
- 150000003573 thiols Chemical class 0.000 description 3
- 230000007704 transition Effects 0.000 description 3
- OUYCCCASQSFEME-UHFFFAOYSA-N tyrosine Natural products OC(=O)C(N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-UHFFFAOYSA-N 0.000 description 3
- MTCFGRXMJLQNBG-REOHCLBHSA-N (2S)-2-Amino-3-hydroxypropanoic acid Chemical compound OC[C@H](N)C(O)=O MTCFGRXMJLQNBG-REOHCLBHSA-N 0.000 description 2
- ZAYHVCMSTBRABG-JXOAFFINSA-N 5-methylcytidine Chemical compound O=C1N=C(N)C(C)=CN1[C@H]1[C@H](O)[C@H](O)[C@@H](CO)O1 ZAYHVCMSTBRABG-JXOAFFINSA-N 0.000 description 2
- LRSASMSXMSNRBT-UHFFFAOYSA-N 5-methylcytosine Chemical compound CC1=CNC(=O)N=C1N LRSASMSXMSNRBT-UHFFFAOYSA-N 0.000 description 2
- SLXKOJJOQWFEFD-UHFFFAOYSA-N 6-aminohexanoic acid Chemical compound NCCCCCC(O)=O SLXKOJJOQWFEFD-UHFFFAOYSA-N 0.000 description 2
- KDCGOANMDULRCW-UHFFFAOYSA-N 7H-purine Chemical compound N1=CNC2=NC=NC2=C1 KDCGOANMDULRCW-UHFFFAOYSA-N 0.000 description 2
- 102000012758 APOBEC-1 Deaminase Human genes 0.000 description 2
- 108010079649 APOBEC-1 Deaminase Proteins 0.000 description 2
- 102100024090 Aldo-keto reductase family 1 member C3 Human genes 0.000 description 2
- 102100030461 Alpha-ketoglutarate-dependent dioxygenase FTO Human genes 0.000 description 2
- 101100123845 Aphanizomenon flos-aquae (strain 2012/KM1/D3) hepT gene Proteins 0.000 description 2
- 241000203069 Archaea Species 0.000 description 2
- 244000063299 Bacillus subtilis Species 0.000 description 2
- 235000014469 Bacillus subtilis Nutrition 0.000 description 2
- 102100026189 Beta-galactosidase Human genes 0.000 description 2
- 240000007532 Butia capitata Species 0.000 description 2
- 125000001433 C-terminal amino-acid group Chemical group 0.000 description 2
- 108010040467 CRISPR-Associated Proteins Proteins 0.000 description 2
- KRKNYBCHXYNGOX-UHFFFAOYSA-K Citrate Chemical compound [O-]C(=O)CC(O)(CC([O-])=O)C([O-])=O KRKNYBCHXYNGOX-UHFFFAOYSA-K 0.000 description 2
- MIKUYHXYGGJMLM-GIMIYPNGSA-N Crotonoside Natural products C1=NC2=C(N)NC(=O)N=C2N1[C@H]1O[C@@H](CO)[C@H](O)[C@@H]1O MIKUYHXYGGJMLM-GIMIYPNGSA-N 0.000 description 2
- 102220503606 Cyclin-dependent kinase inhibitor 2A_P48L_mutation Human genes 0.000 description 2
- RGSFGYAAUTVSQA-UHFFFAOYSA-N Cyclopentane Chemical compound C1CCCC1 RGSFGYAAUTVSQA-UHFFFAOYSA-N 0.000 description 2
- NYHBQMYGNKIUIF-UHFFFAOYSA-N D-guanosine Natural products C1=2NC(N)=NC(=O)C=2N=CN1C1OC(CO)C(O)C1O NYHBQMYGNKIUIF-UHFFFAOYSA-N 0.000 description 2
- 102000011724 DNA Repair Enzymes Human genes 0.000 description 2
- 108700020911 DNA-Binding Proteins Proteins 0.000 description 2
- 241000252212 Danio rerio Species 0.000 description 2
- BWGNESOTFCXPMA-UHFFFAOYSA-N Dihydrogen disulfide Chemical compound SS BWGNESOTFCXPMA-UHFFFAOYSA-N 0.000 description 2
- 241000588697 Enterobacter cloacae Species 0.000 description 2
- 101710191360 Eosinophil cationic protein Proteins 0.000 description 2
- 102100039556 Galectin-4 Human genes 0.000 description 2
- 241001494297 Geobacter sulfurreducens Species 0.000 description 2
- 108010060309 Glucuronidase Proteins 0.000 description 2
- 102000053187 Glucuronidase Human genes 0.000 description 2
- WHUUTDBJXJRKMK-UHFFFAOYSA-N Glutamic acid Natural products OC(=O)C(N)CCC(O)=O WHUUTDBJXJRKMK-UHFFFAOYSA-N 0.000 description 2
- 239000004471 Glycine Substances 0.000 description 2
- 101001023784 Heteractis crispa GFP-like non-fluorescent chromoprotein Proteins 0.000 description 2
- 241000238631 Hexapoda Species 0.000 description 2
- 101001062620 Homo sapiens Alpha-ketoglutarate-dependent dioxygenase FTO Proteins 0.000 description 2
- 101000653360 Homo sapiens Methylcytosine dioxygenase TET1 Proteins 0.000 description 2
- 101000653374 Homo sapiens Methylcytosine dioxygenase TET2 Proteins 0.000 description 2
- 101000653369 Homo sapiens Methylcytosine dioxygenase TET3 Proteins 0.000 description 2
- ODKSFYDXXFIFQN-BYPYZUCNSA-P L-argininium(2+) Chemical compound NC(=[NH2+])NCCC[C@H]([NH3+])C(O)=O ODKSFYDXXFIFQN-BYPYZUCNSA-P 0.000 description 2
- HNDVDQJCIGZPNO-YFKPBYRVSA-N L-histidine Chemical compound OC(=O)[C@@H](N)CC1=CN=CN1 HNDVDQJCIGZPNO-YFKPBYRVSA-N 0.000 description 2
- KDXKERNSBIXSRK-YFKPBYRVSA-N L-lysine Chemical compound NCCCC[C@H](N)C(O)=O KDXKERNSBIXSRK-YFKPBYRVSA-N 0.000 description 2
- AYFVYJQAPQTCCC-GBXIJSLDSA-N L-threonine Chemical compound C[C@@H](O)[C@H](N)C(O)=O AYFVYJQAPQTCCC-GBXIJSLDSA-N 0.000 description 2
- QIVBCDIJIAJPQS-VIFPVBQESA-N L-tryptophane Chemical compound C1=CC=C2C(C[C@H](N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-VIFPVBQESA-N 0.000 description 2
- 102000003960 Ligases Human genes 0.000 description 2
- 108090000364 Ligases Proteins 0.000 description 2
- 241000124008 Mammalia Species 0.000 description 2
- 241001357706 Marinitoga piezophila Species 0.000 description 2
- 241001465754 Metazoa Species 0.000 description 2
- 102100030803 Methylcytosine dioxygenase TET2 Human genes 0.000 description 2
- 102100030812 Methylcytosine dioxygenase TET3 Human genes 0.000 description 2
- 241000699666 Mus <mouse, genus> Species 0.000 description 2
- 241000947184 Myxococcus hansupus Species 0.000 description 2
- 125000001429 N-terminal alpha-amino-acid group Chemical group 0.000 description 2
- 108091061960 Naked DNA Proteins 0.000 description 2
- 108010066154 Nuclear Export Signals Proteins 0.000 description 2
- 241000283973 Oryctolagus cuniculus Species 0.000 description 2
- 102100030655 Platelet-activating factor acetylhydrolase IB subunit beta Human genes 0.000 description 2
- 239000004698 Polyethylene Substances 0.000 description 2
- 108010065942 Prostaglandin-F synthase Proteins 0.000 description 2
- 102000055027 Protein Methyltransferases Human genes 0.000 description 2
- 108700040121 Protein Methyltransferases Proteins 0.000 description 2
- 244000305267 Quercus macrolepis Species 0.000 description 2
- 235000016976 Quercus macrolepis Nutrition 0.000 description 2
- 102000044126 RNA-Binding Proteins Human genes 0.000 description 2
- 102100036007 Ribonuclease 3 Human genes 0.000 description 2
- 101710192197 Ribonuclease 3 Proteins 0.000 description 2
- 241000283984 Rodentia Species 0.000 description 2
- 241000293871 Salmonella enterica subsp. enterica serovar Typhi Species 0.000 description 2
- 241000293869 Salmonella enterica subsp. enterica serovar Typhimurium Species 0.000 description 2
- 241000700584 Simplexvirus Species 0.000 description 2
- DBMJMQXJHONAFJ-UHFFFAOYSA-M Sodium laurylsulphate Chemical compound [Na+].CCCCCCCCCCCCOS([O-])(=O)=O DBMJMQXJHONAFJ-UHFFFAOYSA-M 0.000 description 2
- 241000972623 Streptomyces albulus Species 0.000 description 2
- 241000855330 Streptomyces himastatinicus Species 0.000 description 2
- 241000187398 Streptomyces lividans Species 0.000 description 2
- 102100036407 Thioredoxin Human genes 0.000 description 2
- QIVBCDIJIAJPQS-UHFFFAOYSA-N Tryptophan Natural products C1=CC=C2C(CC(N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-UHFFFAOYSA-N 0.000 description 2
- ISAKRJDGNUQOIC-UHFFFAOYSA-N Uracil Chemical compound O=C1C=CNC(=O)N1 ISAKRJDGNUQOIC-UHFFFAOYSA-N 0.000 description 2
- 108010073929 Vascular Endothelial Growth Factor A Proteins 0.000 description 2
- 102100039037 Vascular endothelial growth factor A Human genes 0.000 description 2
- 108010067390 Viral Proteins Proteins 0.000 description 2
- 230000002159 abnormal effect Effects 0.000 description 2
- 239000002253 acid Substances 0.000 description 2
- 230000003213 activating effect Effects 0.000 description 2
- 238000007792 addition Methods 0.000 description 2
- PYMYPHUHKUWMLA-UHFFFAOYSA-N arabinose Natural products OCC(O)C(O)C(O)C=O PYMYPHUHKUWMLA-UHFFFAOYSA-N 0.000 description 2
- 125000000637 arginyl group Chemical class N[C@@H](CCCNC(N)=N)C(=O)* 0.000 description 2
- 229940009098 aspartate Drugs 0.000 description 2
- 235000003704 aspartic acid Nutrition 0.000 description 2
- 125000004429 atom Chemical group 0.000 description 2
- 125000000751 azo group Chemical group [*]N=N[*] 0.000 description 2
- SRBFZHDQGSBBOR-UHFFFAOYSA-N beta-D-Pyranose-Lyxose Natural products OC1COC(O)C(O)C1O SRBFZHDQGSBBOR-UHFFFAOYSA-N 0.000 description 2
- 108010005774 beta-Galactosidase Proteins 0.000 description 2
- DRTQHJPVMGBUCF-PSQAKQOGSA-N beta-L-uridine Natural products O[C@H]1[C@@H](O)[C@H](CO)O[C@@H]1N1C(=O)NC(=O)C=C1 DRTQHJPVMGBUCF-PSQAKQOGSA-N 0.000 description 2
- OQFSQFPPLPISGP-UHFFFAOYSA-N beta-carboxyaspartic acid Natural products OC(=O)C(N)C(C(O)=O)C(O)=O OQFSQFPPLPISGP-UHFFFAOYSA-N 0.000 description 2
- 230000008512 biological response Effects 0.000 description 2
- 108700023293 biotin carboxyl carrier Proteins 0.000 description 2
- 125000003636 chemical group Chemical group 0.000 description 2
- 238000004590 computer program Methods 0.000 description 2
- 230000003247 decreasing effect Effects 0.000 description 2
- 238000013461 design Methods 0.000 description 2
- 238000001514 detection method Methods 0.000 description 2
- 230000034431 double-strand break repair via homologous recombination Effects 0.000 description 2
- 239000012039 electrophile Substances 0.000 description 2
- 230000002255 enzymatic effect Effects 0.000 description 2
- 239000003733 fiber-reinforced composite Substances 0.000 description 2
- 108010021843 fluorescent protein 583 Proteins 0.000 description 2
- BTCSSZJGUNDROE-UHFFFAOYSA-N gamma-aminobutyric acid Chemical compound NCCCC(O)=O BTCSSZJGUNDROE-UHFFFAOYSA-N 0.000 description 2
- 238000002873 global sequence alignment Methods 0.000 description 2
- 108091022928 glucosylglycerol-phosphate synthase Proteins 0.000 description 2
- 235000013922 glutamic acid Nutrition 0.000 description 2
- 239000004220 glutamic acid Substances 0.000 description 2
- 229940029575 guanosine Drugs 0.000 description 2
- 239000000833 heterodimer Substances 0.000 description 2
- 150000007857 hydrazones Chemical class 0.000 description 2
- 230000002209 hydrophobic effect Effects 0.000 description 2
- FDGQSTZJBFJUBT-UHFFFAOYSA-N hypoxanthine Chemical compound O=C1NC=NC2=C1NC=N2 FDGQSTZJBFJUBT-UHFFFAOYSA-N 0.000 description 2
- 230000036039 immunity Effects 0.000 description 2
- 230000008676 import Effects 0.000 description 2
- 230000002779 inactivation Effects 0.000 description 2
- 230000002401 inhibitory effect Effects 0.000 description 2
- 238000005304 joining Methods 0.000 description 2
- 208000025766 lethal multiple pterygium syndrome Diseases 0.000 description 2
- 238000004519 manufacturing process Methods 0.000 description 2
- 230000007246 mechanism Effects 0.000 description 2
- 210000004898 n-terminal fragment Anatomy 0.000 description 2
- 210000004897 n-terminal region Anatomy 0.000 description 2
- 230000006780 non-homologous end joining Effects 0.000 description 2
- 230000025308 nuclear transport Effects 0.000 description 2
- 150000003833 nucleoside derivatives Chemical class 0.000 description 2
- 125000003835 nucleoside group Chemical group 0.000 description 2
- 230000020520 nucleotide-excision repair Effects 0.000 description 2
- 230000036961 partial effect Effects 0.000 description 2
- 125000002467 phosphate group Chemical group [H]OP(=O)(O[H])O[*] 0.000 description 2
- 229920001282 polysaccharide Chemical group 0.000 description 2
- 239000005017 polysaccharide Chemical group 0.000 description 2
- 150000004804 polysaccharides Chemical group 0.000 description 2
- 210000001236 prokaryotic cell Anatomy 0.000 description 2
- 239000013636 protein dimer Substances 0.000 description 2
- 230000002829 reductive effect Effects 0.000 description 2
- 230000001177 retroviral effect Effects 0.000 description 2
- 230000002441 reversible effect Effects 0.000 description 2
- 102220340881 rs1554949196 Human genes 0.000 description 2
- 238000012163 sequencing technique Methods 0.000 description 2
- 230000003007 single stranded DNA break Effects 0.000 description 2
- 150000003384 small molecules Chemical class 0.000 description 2
- 239000000344 soap Substances 0.000 description 2
- 239000011780 sodium chloride Substances 0.000 description 2
- 235000019333 sodium laurylsulphate Nutrition 0.000 description 2
- 230000009870 specific binding Effects 0.000 description 2
- UNFWWIHTNXNPBV-WXKVUWSESA-N spectinomycin Chemical class O([C@@H]1[C@@H](NC)[C@@H](O)[C@H]([C@@H]([C@H]1O1)O)NC)[C@]2(O)[C@H]1O[C@H](C)CC2=O UNFWWIHTNXNPBV-WXKVUWSESA-N 0.000 description 2
- 235000000346 sugar Nutrition 0.000 description 2
- 150000008163 sugars Chemical class 0.000 description 2
- 238000002560 therapeutic procedure Methods 0.000 description 2
- 108060008226 thioredoxin Proteins 0.000 description 2
- 230000001052 transient effect Effects 0.000 description 2
- 230000005945 translocation Effects 0.000 description 2
- 238000011144 upstream manufacturing Methods 0.000 description 2
- DRTQHJPVMGBUCF-UHFFFAOYSA-N uracil arabinoside Natural products OC1C(O)C(CO)OC1N1C(=O)NC(=O)C=C1 DRTQHJPVMGBUCF-UHFFFAOYSA-N 0.000 description 2
- 229940045145 uridine Drugs 0.000 description 2
- 230000017613 viral reproduction Effects 0.000 description 2
- RIFDKYBNWNPCQK-IOSLPCCCSA-N (2r,3s,4r,5r)-2-(hydroxymethyl)-5-(6-imino-3-methylpurin-9-yl)oxolane-3,4-diol Chemical compound C1=2N(C)C=NC(=N)C=2N=CN1[C@@H]1O[C@H](CO)[C@@H](O)[C@H]1O RIFDKYBNWNPCQK-IOSLPCCCSA-N 0.000 description 1
- GWXOEHPRWMAKPT-IIZANFQQSA-N (2s)-1-[(2s)-2-[[(2s)-2-[[(2s)-1-[2-[[(2s)-2-amino-3-(4-hydroxyphenyl)propanoyl]amino]acetyl]pyrrolidine-2-carbonyl]amino]-4-methylpentanoyl]amino]-3-phenylpropanoyl]pyrrolidine-2-carboxylic acid Chemical compound C([C@H](N)C(=O)NCC(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N1[C@@H](CCC1)C(O)=O)C1=CC=C(O)C=C1 GWXOEHPRWMAKPT-IIZANFQQSA-N 0.000 description 1
- 108091064702 1 family Proteins 0.000 description 1
- RKSLVDIXBGWPIS-UAKXSSHOSA-N 1-[(2r,3r,4s,5r)-3,4-dihydroxy-5-(hydroxymethyl)oxolan-2-yl]-5-iodopyrimidine-2,4-dione Chemical compound O[C@@H]1[C@H](O)[C@@H](CO)O[C@H]1N1C(=O)NC(=O)C(I)=C1 RKSLVDIXBGWPIS-UAKXSSHOSA-N 0.000 description 1
- QLOCVMVCRJOTTM-TURQNECASA-N 1-[(2r,3r,4s,5r)-3,4-dihydroxy-5-(hydroxymethyl)oxolan-2-yl]-5-prop-1-ynylpyrimidine-2,4-dione Chemical compound O=C1NC(=O)C(C#CC)=CN1[C@H]1[C@H](O)[C@H](O)[C@@H](CO)O1 QLOCVMVCRJOTTM-TURQNECASA-N 0.000 description 1
- PISWNSOQFZRVJK-XLPZGREQSA-N 1-[(2r,4s,5r)-4-hydroxy-5-(hydroxymethyl)oxolan-2-yl]-5-methyl-2-sulfanylidenepyrimidin-4-one Chemical compound S=C1NC(=O)C(C)=CN1[C@@H]1O[C@H](CO)[C@@H](O)C1 PISWNSOQFZRVJK-XLPZGREQSA-N 0.000 description 1
- FRJNIHLOMXIQKH-UHFFFAOYSA-N 1-amino-15-oxo-4,7,10-trioxa-14-azaoctadecan-18-oic acid Chemical compound NCCCOCCOCCOCCCNC(=O)CCC(O)=O FRJNIHLOMXIQKH-UHFFFAOYSA-N 0.000 description 1
- YKBGVTZYEHREMT-KVQBGUIXSA-N 2'-deoxyguanosine Chemical compound C1=NC=2C(=O)NC(N)=NC=2N1[C@H]1C[C@H](O)[C@@H](CO)O1 YKBGVTZYEHREMT-KVQBGUIXSA-N 0.000 description 1
- VGONTNSXDCQUGY-RRKCRQDMSA-N 2'-deoxyinosine Chemical compound C1[C@H](O)[C@@H](CO)O[C@H]1N1C(N=CNC2=O)=C2N=C1 VGONTNSXDCQUGY-RRKCRQDMSA-N 0.000 description 1
- CKTSBUTUHBMZGZ-SHYZEUOFSA-N 2'-deoxycytidine Chemical compound O=C1N=C(N)C=CN1[C@@H]1O[C@H](CO)[C@@H](O)C1 CKTSBUTUHBMZGZ-SHYZEUOFSA-N 0.000 description 1
- JRYMOPZHXMVHTA-DAGMQNCNSA-N 2-amino-7-[(2r,3r,4s,5r)-3,4-dihydroxy-5-(hydroxymethyl)oxolan-2-yl]-1h-pyrrolo[2,3-d]pyrimidin-4-one Chemical compound C1=CC=2C(=O)NC(N)=NC=2N1[C@@H]1O[C@H](CO)[C@@H](O)[C@H]1O JRYMOPZHXMVHTA-DAGMQNCNSA-N 0.000 description 1
- RHFUOMFWUGWKKO-XVFCMESISA-N 2-thiocytidine Chemical compound S=C1N=C(N)C=CN1[C@H]1[C@H](O)[C@H](O)[C@@H](CO)O1 RHFUOMFWUGWKKO-XVFCMESISA-N 0.000 description 1
- XXSIICQLPUAUDF-TURQNECASA-N 4-amino-1-[(2r,3r,4s,5r)-3,4-dihydroxy-5-(hydroxymethyl)oxolan-2-yl]-5-prop-1-ynylpyrimidin-2-one Chemical compound O=C1N=C(N)C(C#CC)=CN1[C@H]1[C@H](O)[C@H](O)[C@@H](CO)O1 XXSIICQLPUAUDF-TURQNECASA-N 0.000 description 1
- ZAYHVCMSTBRABG-UHFFFAOYSA-N 5-Methylcytidine Natural products O=C1N=C(N)C(C)=CN1C1C(O)C(O)C(CO)O1 ZAYHVCMSTBRABG-UHFFFAOYSA-N 0.000 description 1
- AGFIRQJZCNVMCW-UAKXSSHOSA-N 5-bromouridine Chemical compound O[C@@H]1[C@H](O)[C@@H](CO)O[C@H]1N1C(=O)NC(=O)C(Br)=C1 AGFIRQJZCNVMCW-UAKXSSHOSA-N 0.000 description 1
- BLQMCTXZEMGOJM-UHFFFAOYSA-N 5-carboxycytosine Chemical compound NC=1NC(=O)N=CC=1C(O)=O BLQMCTXZEMGOJM-UHFFFAOYSA-N 0.000 description 1
- FHIDNBAQOFJWCA-UAKXSSHOSA-N 5-fluorouridine Chemical compound O[C@@H]1[C@H](O)[C@@H](CO)O[C@H]1N1C(=O)NC(=O)C(F)=C1 FHIDNBAQOFJWCA-UAKXSSHOSA-N 0.000 description 1
- FHSISDGOVSHJRW-UHFFFAOYSA-N 5-formylcytosine Chemical compound NC1=NC(=O)NC=C1C=O FHSISDGOVSHJRW-UHFFFAOYSA-N 0.000 description 1
- KDOPAZIWBAHVJB-UHFFFAOYSA-N 5h-pyrrolo[3,2-d]pyrimidine Chemical compound C1=NC=C2NC=CC2=N1 KDOPAZIWBAHVJB-UHFFFAOYSA-N 0.000 description 1
- UEHOMUNTZPIBIL-UUOKFMHZSA-N 6-amino-9-[(2r,3r,4s,5r)-3,4-dihydroxy-5-(hydroxymethyl)oxolan-2-yl]-7h-purin-8-one Chemical compound O=C1NC=2C(N)=NC=NC=2N1[C@@H]1O[C@H](CO)[C@@H](O)[C@H]1O UEHOMUNTZPIBIL-UUOKFMHZSA-N 0.000 description 1
- HCAJQHYUCKICQH-VPENINKCSA-N 8-Oxo-7,8-dihydro-2'-deoxyguanosine Chemical compound C1=2NC(N)=NC(=O)C=2NC(=O)N1[C@H]1C[C@H](O)[C@@H](CO)O1 HCAJQHYUCKICQH-VPENINKCSA-N 0.000 description 1
- HDZZVAMISRMYHH-UHFFFAOYSA-N 9beta-Ribofuranosyl-7-deazaadenin Natural products C1=CC=2C(N)=NC=NC=2N1C1OC(CO)C(O)C1O HDZZVAMISRMYHH-UHFFFAOYSA-N 0.000 description 1
- 108010013043 Acetylesterase Proteins 0.000 description 1
- 241000604451 Acidaminococcus Species 0.000 description 1
- 101710159080 Aconitate hydratase A Proteins 0.000 description 1
- 101710159078 Aconitate hydratase B Proteins 0.000 description 1
- 241000251468 Actinopterygii Species 0.000 description 1
- 102100033647 Activity-regulated cytoskeleton-associated protein Human genes 0.000 description 1
- 108010052875 Adenine deaminase Proteins 0.000 description 1
- 229920001817 Agar Polymers 0.000 description 1
- 241000193412 Alicyclobacillus acidoterrestris Species 0.000 description 1
- 108700028369 Alleles Proteins 0.000 description 1
- 108010016119 Alpha-Ketoglutarate-Dependent Dioxygenase FTO Proteins 0.000 description 1
- 102100033776 Amelotin Human genes 0.000 description 1
- 102100022749 Aminopeptidase N Human genes 0.000 description 1
- 101710099461 Aminopeptidase N Proteins 0.000 description 1
- 206010056292 Androgen-Insensitivity Syndrome Diseases 0.000 description 1
- 241000219194 Arabidopsis Species 0.000 description 1
- 101100404726 Arabidopsis thaliana NHX7 gene Proteins 0.000 description 1
- 101001125931 Arabidopsis thaliana Plastidial pyruvate kinase 2 Proteins 0.000 description 1
- 108020000946 Bacterial DNA Proteins 0.000 description 1
- 241000616876 Belliella baltica Species 0.000 description 1
- DWRXFEITVBNRMK-UHFFFAOYSA-N Beta-D-1-Arabinofuranosylthymine Natural products O=C1NC(=O)C(C)=CN1C1C(O)C(O)C(CO)O1 DWRXFEITVBNRMK-UHFFFAOYSA-N 0.000 description 1
- 241000283690 Bos taurus Species 0.000 description 1
- 238000010453 CRISPR/Cas method Methods 0.000 description 1
- 101150018129 CSF2 gene Proteins 0.000 description 1
- 101100452003 Caenorhabditis elegans ape-1 gene Proteins 0.000 description 1
- 241000589875 Campylobacter jejuni Species 0.000 description 1
- 241000283707 Capra Species 0.000 description 1
- 108091060290 Chromatid Proteins 0.000 description 1
- 206010009944 Colon cancer Diseases 0.000 description 1
- 241000918600 Corynebacterium ulcerans Species 0.000 description 1
- XDTMQSROBMDMFD-UHFFFAOYSA-N Cyclohexane Chemical compound C1CCCCC1 XDTMQSROBMDMFD-UHFFFAOYSA-N 0.000 description 1
- 108010074922 Cytochrome P-450 CYP1A2 Proteins 0.000 description 1
- 102000008144 Cytochrome P-450 CYP1A2 Human genes 0.000 description 1
- 108010081668 Cytochrome P-450 CYP3A Proteins 0.000 description 1
- 102000002004 Cytochrome P-450 Enzyme System Human genes 0.000 description 1
- 108010015742 Cytochrome P-450 Enzyme System Proteins 0.000 description 1
- 102100036194 Cytochrome P450 2A6 Human genes 0.000 description 1
- 102100039205 Cytochrome P450 3A4 Human genes 0.000 description 1
- 102000018832 Cytochromes Human genes 0.000 description 1
- 108010052832 Cytochromes Proteins 0.000 description 1
- WQZGKKKJIJFFOK-QTVWNMPRSA-N D-mannopyranose Chemical compound OC[C@H]1OC(O)[C@@H](O)[C@@H](O)[C@@H]1O WQZGKKKJIJFFOK-QTVWNMPRSA-N 0.000 description 1
- HMFHBZSHGGEWLO-SOOFDHNKSA-N D-ribofuranose Chemical class OC[C@H]1OC(O)[C@H](O)[C@@H]1O HMFHBZSHGGEWLO-SOOFDHNKSA-N 0.000 description 1
- 108010076525 DNA Repair Enzymes Proteins 0.000 description 1
- 230000005778 DNA damage Effects 0.000 description 1
- 231100000277 DNA damage Toxicity 0.000 description 1
- 230000008265 DNA repair mechanism Effects 0.000 description 1
- 108010014303 DNA-directed DNA polymerase Proteins 0.000 description 1
- 102000016928 DNA-directed DNA polymerase Human genes 0.000 description 1
- CKTSBUTUHBMZGZ-UHFFFAOYSA-N Deoxycytidine Natural products O=C1N=C(N)C=CN1C1OC(CO)C(O)C1 CKTSBUTUHBMZGZ-UHFFFAOYSA-N 0.000 description 1
- 241000702421 Dependoparvovirus Species 0.000 description 1
- 102100025800 E3 SUMO-protein ligase ZBED1 Human genes 0.000 description 1
- 108700034637 EC 3.2.-.- Proteins 0.000 description 1
- 102100038132 Endogenous retrovirus group K member 6 Pro protein Human genes 0.000 description 1
- 101710091045 Envelope protein Proteins 0.000 description 1
- 241000283073 Equus caballus Species 0.000 description 1
- 101100207407 Escherichia coli (strain K12) traQ gene Proteins 0.000 description 1
- 241000702189 Escherichia virus Mu Species 0.000 description 1
- 101150065330 Fancc gene Proteins 0.000 description 1
- 108010027673 Fanconi Anemia Complementation Group C protein Proteins 0.000 description 1
- 102100027286 Fanconi anemia group C protein Human genes 0.000 description 1
- 241000724791 Filamentous phage Species 0.000 description 1
- 241000589601 Francisella Species 0.000 description 1
- 241000589599 Francisella tularensis subsp. novicida Species 0.000 description 1
- 108090000926 GMP synthase (glutamine-hydrolyzing) Proteins 0.000 description 1
- 102100033452 GMP synthase [glutamine-hydrolyzing] Human genes 0.000 description 1
- 102100030708 GTPase KRas Human genes 0.000 description 1
- 108010001515 Galectin 4 Proteins 0.000 description 1
- 229940123611 Genome editing Drugs 0.000 description 1
- 241001468175 Geobacillus thermodenitrificans Species 0.000 description 1
- 102220547509 Glucocorticoid receptor_N72D_mutation Human genes 0.000 description 1
- 102220637361 Glutathione S-transferase A3_I49V_mutation Human genes 0.000 description 1
- 102220566626 Glutathione hydrolase 1 proenzyme_R107K_mutation Human genes 0.000 description 1
- 229940113491 Glycosylase inhibitor Drugs 0.000 description 1
- 101150013707 HBB gene Proteins 0.000 description 1
- 108050008753 HNH endonucleases Proteins 0.000 description 1
- 102000000310 HNH endonucleases Human genes 0.000 description 1
- 241000025244 Haemophilus influenzae F3031 Species 0.000 description 1
- 102100021519 Hemoglobin subunit beta Human genes 0.000 description 1
- 102000006479 Heterogeneous-Nuclear Ribonucleoproteins Human genes 0.000 description 1
- 108010019372 Heterogeneous-Nuclear Ribonucleoproteins Proteins 0.000 description 1
- 241000282412 Homo Species 0.000 description 1
- 101000779860 Homo sapiens Amelotin Proteins 0.000 description 1
- 101000855342 Homo sapiens Cytochrome P450 1A2 Proteins 0.000 description 1
- 101000875170 Homo sapiens Cytochrome P450 2A6 Proteins 0.000 description 1
- 101000786317 Homo sapiens E3 SUMO-protein ligase ZBED1 Proteins 0.000 description 1
- 101000584612 Homo sapiens GTPase KRas Proteins 0.000 description 1
- 101000608765 Homo sapiens Galectin-4 Proteins 0.000 description 1
- 101001050472 Homo sapiens Integral membrane protein 2A Proteins 0.000 description 1
- 101000946124 Homo sapiens Lipocalin-1 Proteins 0.000 description 1
- 101000615488 Homo sapiens Methyl-CpG-binding domain protein 2 Proteins 0.000 description 1
- 101001128634 Homo sapiens NADH dehydrogenase [ubiquinone] 1 beta subcomplex subunit 2, mitochondrial Proteins 0.000 description 1
- 101001125939 Homo sapiens Plakophilin-1 Proteins 0.000 description 1
- 101000605014 Homo sapiens Putative L-type amino acid transporter 1-like protein MLAS Proteins 0.000 description 1
- 101000894525 Homo sapiens Transforming growth factor-beta-induced protein ig-h3 Proteins 0.000 description 1
- UGQMRVRMYYASKQ-UHFFFAOYSA-N Hypoxanthine nucleoside Natural products OC1C(O)C(CO)OC1N1C(NC=NC2=O)=C2N=C1 UGQMRVRMYYASKQ-UHFFFAOYSA-N 0.000 description 1
- 101150105104 Kras gene Proteins 0.000 description 1
- XUJNEKJLAYXESH-REOHCLBHSA-N L-Cysteine Chemical compound SC[C@H](N)C(O)=O XUJNEKJLAYXESH-REOHCLBHSA-N 0.000 description 1
- 241001112693 Lachnospiraceae Species 0.000 description 1
- 101710128836 Large T antigen Proteins 0.000 description 1
- 241000270322 Lepidosauria Species 0.000 description 1
- 241000029603 Leptotrichia shahii Species 0.000 description 1
- 102100022275 Leucine-rich glioma-inactivated protein 1 Human genes 0.000 description 1
- 102220567251 Leucine-rich glioma-inactivated protein 1_R26C_mutation Human genes 0.000 description 1
- 101000779583 Limulus polyphemus Anti-lipopolysaccharide factor Proteins 0.000 description 1
- 102100034724 Lipocalin-1 Human genes 0.000 description 1
- 241000186805 Listeria innocua Species 0.000 description 1
- 208000001826 Marfan syndrome Diseases 0.000 description 1
- 102100021299 Methyl-CpG-binding domain protein 2 Human genes 0.000 description 1
- 102100030819 Methylcytosine dioxygenase TET1 Human genes 0.000 description 1
- 101100061204 Mus musculus Cyp2a4 gene Proteins 0.000 description 1
- 101100494762 Mus musculus Nedd9 gene Proteins 0.000 description 1
- 125000000729 N-terminal amino-acid group Chemical group 0.000 description 1
- 102100032194 NADH dehydrogenase [ubiquinone] 1 beta subcomplex subunit 2, mitochondrial Human genes 0.000 description 1
- 241000588650 Neisseria meningitidis Species 0.000 description 1
- 241000244206 Nematoda Species 0.000 description 1
- 206010028980 Neoplasm Diseases 0.000 description 1
- 241000221961 Neurospora crassa Species 0.000 description 1
- 101100385413 Neurospora crassa (strain ATCC 24698 / 74-OR23-1A / CBS 708.71 / DSM 1257 / FGSC 987) csm-3 gene Proteins 0.000 description 1
- 241000207746 Nicotiana benthamiana Species 0.000 description 1
- 102220547068 Nucleobindin-1_R52H_mutation Human genes 0.000 description 1
- 102000002488 Nucleoplasmin Human genes 0.000 description 1
- 102000011931 Nucleoproteins Human genes 0.000 description 1
- 108010061100 Nucleoproteins Proteins 0.000 description 1
- 108091034117 Oligonucleotide Proteins 0.000 description 1
- 229910019142 PO4 Inorganic materials 0.000 description 1
- 240000000968 Parkia biglobosa Species 0.000 description 1
- 241001494479 Pecora Species 0.000 description 1
- 102100035278 Pendrin Human genes 0.000 description 1
- 102100029331 Plakophilin-1 Human genes 0.000 description 1
- 239000004952 Polyamide Substances 0.000 description 1
- 102000017033 Porins Human genes 0.000 description 1
- 108010013381 Porins Proteins 0.000 description 1
- 241000605861 Prevotella Species 0.000 description 1
- 241000677647 Proba Species 0.000 description 1
- 101710188315 Protein X Proteins 0.000 description 1
- 241000577544 Psychroflexus torquis Species 0.000 description 1
- 102100038206 Putative L-type amino acid transporter 1-like protein MLAS Human genes 0.000 description 1
- 101710123256 Pyrrolysine-tRNA ligase Proteins 0.000 description 1
- 230000026279 RNA modification Effects 0.000 description 1
- 108700020471 RNA-Binding Proteins Proteins 0.000 description 1
- 101710105008 RNA-binding protein Proteins 0.000 description 1
- 241000700159 Rattus Species 0.000 description 1
- 108020004511 Recombinant DNA Proteins 0.000 description 1
- 108091081062 Repeated sequence (DNA) Proteins 0.000 description 1
- PYMYPHUHKUWMLA-LMVFSUKVSA-N Ribose Natural products OC[C@@H](O)[C@@H](O)[C@@H](O)C=O PYMYPHUHKUWMLA-LMVFSUKVSA-N 0.000 description 1
- 108700022176 SOS1 Proteins 0.000 description 1
- 101100197320 Saccharomyces cerevisiae (strain ATCC 204508 / S288c) RPL35A gene Proteins 0.000 description 1
- 101710084593 Sensory histidine kinase/phosphatase NtrB Proteins 0.000 description 1
- 102100022433 Single-stranded DNA cytosine deaminase Human genes 0.000 description 1
- 101710143275 Single-stranded DNA cytosine deaminase Proteins 0.000 description 1
- 108020004688 Small Nuclear RNA Proteins 0.000 description 1
- 102000039471 Small Nuclear RNA Human genes 0.000 description 1
- 108020004459 Small interfering RNA Proteins 0.000 description 1
- 102100032929 Son of sevenless homolog 1 Human genes 0.000 description 1
- 101150100839 Sos1 gene Proteins 0.000 description 1
- 241000203029 Spiroplasma taiwanense Species 0.000 description 1
- 241000194056 Streptococcus iniae Species 0.000 description 1
- 241000194045 Streptococcus macacae Species 0.000 description 1
- 241000285632 Streptococcus macacae NCTC 11558 Species 0.000 description 1
- 229930006000 Sucrose Natural products 0.000 description 1
- CZMRCDWAGMRECN-UGDNZRGBSA-N Sucrose Chemical compound O[C@H]1[C@H](O)[C@@H](CO)O[C@@]1(CO)O[C@@H]1[C@H](O)[C@@H](O)[C@H](O)[C@@H](CO)O1 CZMRCDWAGMRECN-UGDNZRGBSA-N 0.000 description 1
- 241001313536 Thermothelomyces thermophila Species 0.000 description 1
- RYYWUUFWQRZTIU-UHFFFAOYSA-N Thiophosphoric acid Chemical class OP(O)(S)=O RYYWUUFWQRZTIU-UHFFFAOYSA-N 0.000 description 1
- 244000111306 Torreya nucifera Species 0.000 description 1
- 235000006732 Torreya nucifera Nutrition 0.000 description 1
- 241000283907 Tragelaphus oryx Species 0.000 description 1
- 102100021398 Transforming growth factor-beta-induced protein ig-h3 Human genes 0.000 description 1
- 101800005109 Triakontatetraneuropeptide Proteins 0.000 description 1
- 241000251539 Vertebrata <Metazoa> Species 0.000 description 1
- 241000269370 Xenopus <genus> Species 0.000 description 1
- 102220481001 Zinc transporter 10_E25A_mutation Human genes 0.000 description 1
- 239000000370 acceptor Substances 0.000 description 1
- 229960000583 acetic acid Drugs 0.000 description 1
- 235000011054 acetic acid Nutrition 0.000 description 1
- 108020002494 acetyltransferase Proteins 0.000 description 1
- 102000005421 acetyltransferase Human genes 0.000 description 1
- 230000009471 action Effects 0.000 description 1
- 239000013543 active substance Substances 0.000 description 1
- 125000002015 acyclic group Chemical group 0.000 description 1
- 150000001266 acyl halides Chemical class 0.000 description 1
- 210000005006 adaptive immune system Anatomy 0.000 description 1
- 150000003838 adenosines Chemical class 0.000 description 1
- 239000008272 agar Substances 0.000 description 1
- 125000001931 aliphatic group Chemical group 0.000 description 1
- 150000001350 alkyl halides Chemical class 0.000 description 1
- HMFHBZSHGGEWLO-UHFFFAOYSA-N alpha-D-Furanose-Ribose Natural products OCC1OC(O)C(O)C1O HMFHBZSHGGEWLO-UHFFFAOYSA-N 0.000 description 1
- 238000004458 analytical method Methods 0.000 description 1
- 206010002449 angioimmunoblastic T-cell lymphoma Diseases 0.000 description 1
- 238000000137 annealing Methods 0.000 description 1
- 208000009262 apparent mineralocorticoid excess Diseases 0.000 description 1
- PYMYPHUHKUWMLA-WDCZJNDASA-N arabinose Chemical class OC[C@@H](O)[C@@H](O)[C@H](O)C=O PYMYPHUHKUWMLA-WDCZJNDASA-N 0.000 description 1
- 210000004507 artificial chromosome Anatomy 0.000 description 1
- 125000003118 aryl group Chemical group 0.000 description 1
- 150000001502 aryl halides Chemical class 0.000 description 1
- 239000013602 bacteriophage vector Substances 0.000 description 1
- 230000006399 behavior Effects 0.000 description 1
- IQFYYKKMVGJFEH-UHFFFAOYSA-N beta-L-thymidine Natural products O=C1NC(=O)C(C)=CN1C1OC(CO)C(O)C1 IQFYYKKMVGJFEH-UHFFFAOYSA-N 0.000 description 1
- 229940000635 beta-alanine Drugs 0.000 description 1
- 230000008275 binding mechanism Effects 0.000 description 1
- 238000012742 biochemical analysis Methods 0.000 description 1
- 230000003115 biocidal effect Effects 0.000 description 1
- 229960002685 biotin Drugs 0.000 description 1
- YBJHBAHKTGYVGT-ZKWXMUAHSA-N biotin Natural products N1C(=O)N[C@@H]2[C@H](CCCCC(=O)O)SC[C@@H]21 YBJHBAHKTGYVGT-ZKWXMUAHSA-N 0.000 description 1
- 235000020958 biotin Nutrition 0.000 description 1
- 239000011616 biotin Substances 0.000 description 1
- 239000006227 byproduct Substances 0.000 description 1
- WWVKQTNONPWVEL-UHFFFAOYSA-N caffeic acid phenethyl ester Natural products C1=C(O)C(O)=CC=C1C=CC(=O)OCC1=CC=CC=C1 WWVKQTNONPWVEL-UHFFFAOYSA-N 0.000 description 1
- 201000011510 cancer Diseases 0.000 description 1
- 210000000234 capsid Anatomy 0.000 description 1
- 125000002837 carbocyclic group Chemical group 0.000 description 1
- 125000000837 carbohydrate group Chemical group 0.000 description 1
- 229910052799 carbon Inorganic materials 0.000 description 1
- 239000011203 carbon fibre reinforced carbon Substances 0.000 description 1
- 230000015556 catabolic process Effects 0.000 description 1
- 125000002091 cationic group Chemical group 0.000 description 1
- 230000022131 cell cycle Effects 0.000 description 1
- 230000032823 cell division Effects 0.000 description 1
- 150000005829 chemical entities Chemical class 0.000 description 1
- 238000007385 chemical modification Methods 0.000 description 1
- 210000004756 chromatid Anatomy 0.000 description 1
- 208000016653 cleft lip/palate Diseases 0.000 description 1
- 230000004186 co-expression Effects 0.000 description 1
- 230000002860 competitive effect Effects 0.000 description 1
- 150000001875 compounds Chemical class 0.000 description 1
- 230000002153 concerted effect Effects 0.000 description 1
- 210000002808 connective tissue Anatomy 0.000 description 1
- 125000004122 cyclic group Chemical group 0.000 description 1
- XUJNEKJLAYXESH-UHFFFAOYSA-N cysteine Natural products SCC(N)C(O)=O XUJNEKJLAYXESH-UHFFFAOYSA-N 0.000 description 1
- 235000018417 cysteine Nutrition 0.000 description 1
- 108010037508 cytochrome P-450 CYP3A6 (rabbit) Proteins 0.000 description 1
- 210000000805 cytoplasm Anatomy 0.000 description 1
- 230000007711 cytoplasmic localization Effects 0.000 description 1
- 238000013135 deep learning Methods 0.000 description 1
- 230000007123 defense Effects 0.000 description 1
- 230000002950 deficient Effects 0.000 description 1
- 238000006731 degradation reaction Methods 0.000 description 1
- VGONTNSXDCQUGY-UHFFFAOYSA-N desoxyinosine Natural products C1C(O)C(CO)OC1N1C(NC=NC2=O)=C2N=C1 VGONTNSXDCQUGY-UHFFFAOYSA-N 0.000 description 1
- 238000006471 dimerization reaction Methods 0.000 description 1
- 206010013023 diphtheria Diseases 0.000 description 1
- 238000004520 electroporation Methods 0.000 description 1
- 230000002616 endonucleolytic effect Effects 0.000 description 1
- 150000002148 esters Chemical class 0.000 description 1
- 125000004030 farnesyl group Chemical group [H]C([*])([H])C([H])=C(C([H])([H])[H])C([H])([H])C([H])([H])C([H])=C(C([H])([H])[H])C([H])([H])C([H])([H])C([H])=C(C([H])([H])[H])C([H])([H])[H] 0.000 description 1
- 125000005313 fatty acid group Chemical group 0.000 description 1
- 230000035558 fertility Effects 0.000 description 1
- 102000013370 fibrillin Human genes 0.000 description 1
- 108060002895 fibrillin Proteins 0.000 description 1
- 230000037433 frameshift Effects 0.000 description 1
- 238000007306 functionalization reaction Methods 0.000 description 1
- 229960003692 gamma aminobutyric acid Drugs 0.000 description 1
- 239000007789 gas Substances 0.000 description 1
- 238000010353 genetic engineering Methods 0.000 description 1
- 229930195712 glutamate Natural products 0.000 description 1
- 229960002449 glycine Drugs 0.000 description 1
- 230000012010 growth Effects 0.000 description 1
- 102000009543 guanyl-nucleotide exchange factor activity proteins Human genes 0.000 description 1
- 108040001860 guanyl-nucleotide exchange factor activity proteins Proteins 0.000 description 1
- 229940047650 haemophilus influenzae Drugs 0.000 description 1
- 230000036541 health Effects 0.000 description 1
- DMEGYFMYUHOHGS-UHFFFAOYSA-N heptamethylene Natural products C1CCCCCC1 DMEGYFMYUHOHGS-UHFFFAOYSA-N 0.000 description 1
- 125000001072 heteroaryl group Chemical group 0.000 description 1
- 150000002402 hexoses Chemical class 0.000 description 1
- 238000002744 homologous recombination Methods 0.000 description 1
- 230000006801 homologous recombination Effects 0.000 description 1
- 102000057459 human CYP1A2 Human genes 0.000 description 1
- 102000053648 human FTO Human genes 0.000 description 1
- 102000053372 human TET1 Human genes 0.000 description 1
- 239000001257 hydrogen Substances 0.000 description 1
- 229910052739 hydrogen Inorganic materials 0.000 description 1
- 125000002887 hydroxy group Chemical group [H]O* 0.000 description 1
- 230000007124 immune defense Effects 0.000 description 1
- 230000001976 improved effect Effects 0.000 description 1
- 239000000411 inducer Substances 0.000 description 1
- 206010022000 influenza Diseases 0.000 description 1
- 230000005764 inhibitory process Effects 0.000 description 1
- 230000010354 integration Effects 0.000 description 1
- 150000002540 isothiocyanates Chemical class 0.000 description 1
- 238000001638 lipofection Methods 0.000 description 1
- 208000004731 long QT syndrome Diseases 0.000 description 1
- 125000003588 lysine group Chemical class [H]N([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])C([H])(N([H])[H])C(*)=O 0.000 description 1
- 239000003550 marker Substances 0.000 description 1
- 239000011159 matrix material Substances 0.000 description 1
- 101150023497 mcrA gene Proteins 0.000 description 1
- 230000000394 mitotic effect Effects 0.000 description 1
- 238000003032 molecular docking Methods 0.000 description 1
- 125000000896 monocarboxylic acid group Chemical group 0.000 description 1
- 108091005763 multidomain proteins Proteins 0.000 description 1
- 230000000869 mutational effect Effects 0.000 description 1
- 239000002105 nanoparticle Substances 0.000 description 1
- 230000037434 nonsense mutation Effects 0.000 description 1
- 230000030147 nuclear export Effects 0.000 description 1
- 210000004492 nuclear pore Anatomy 0.000 description 1
- 230000001293 nucleolytic effect Effects 0.000 description 1
- 239000012038 nucleophile Substances 0.000 description 1
- 108060005597 nucleoplasmin Proteins 0.000 description 1
- 101150012154 nupG gene Proteins 0.000 description 1
- 230000009437 off-target effect Effects 0.000 description 1
- 150000002894 organic compounds Chemical class 0.000 description 1
- 230000002018 overexpression Effects 0.000 description 1
- 238000004806 packaging method and process Methods 0.000 description 1
- 230000002085 persistent effect Effects 0.000 description 1
- 125000001997 phenyl group Chemical group [H]C1=C([H])C([H])=C(*)C([H])=C1[H] 0.000 description 1
- NBIIXXVUZAFLBC-UHFFFAOYSA-K phosphate Chemical compound [O-]P([O-])([O-])=O NBIIXXVUZAFLBC-UHFFFAOYSA-K 0.000 description 1
- 239000010452 phosphate Substances 0.000 description 1
- 150000004713 phosphodiesters Chemical group 0.000 description 1
- 239000006187 pill Substances 0.000 description 1
- 101150010248 pkp1 gene Proteins 0.000 description 1
- 238000005498 polishing Methods 0.000 description 1
- 229920002401 polyacrylamide Polymers 0.000 description 1
- 229920002647 polyamide Polymers 0.000 description 1
- 239000004417 polycarbonate Substances 0.000 description 1
- 229920000728 polyester Polymers 0.000 description 1
- 229920000573 polyethylene Polymers 0.000 description 1
- 229920002704 polyhistidine Polymers 0.000 description 1
- 238000002360 preparation method Methods 0.000 description 1
- 230000002265 prevention Effects 0.000 description 1
- 125000001500 prolyl group Chemical group [H]N1C([H])(C(=O)[*])C([H])([H])C([H])([H])C1([H])[H] 0.000 description 1
- 230000001915 proofreading effect Effects 0.000 description 1
- 230000012743 protein tagging Effects 0.000 description 1
- 150000003212 purines Chemical class 0.000 description 1
- NGVDGCNFYWLIFO-UHFFFAOYSA-N pyridoxal 5'-phosphate Chemical compound CC1=NC=C(COP(O)(O)=O)C(C=O)=C1O NGVDGCNFYWLIFO-UHFFFAOYSA-N 0.000 description 1
- 150000003230 pyrimidines Chemical class 0.000 description 1
- 230000007420 reactivation Effects 0.000 description 1
- 230000008707 rearrangement Effects 0.000 description 1
- 238000003259 recombinant expression Methods 0.000 description 1
- 108700015182 recombinant rCAS Proteins 0.000 description 1
- 230000013120 recombinational repair Effects 0.000 description 1
- 230000009467 reduction Effects 0.000 description 1
- 238000011160 research Methods 0.000 description 1
- 238000001945 resonance Rayleigh scattering spectroscopy Methods 0.000 description 1
- 230000000717 retained effect Effects 0.000 description 1
- 101150085492 rpsF gene Proteins 0.000 description 1
- 102220323254 rs150140303 Human genes 0.000 description 1
- 102220244853 rs1555322610 Human genes 0.000 description 1
- 102220005504 rs281860657 Human genes 0.000 description 1
- 102200147815 rs72559734 Human genes 0.000 description 1
- 102220011099 rs730881019 Human genes 0.000 description 1
- 102220311805 rs757903799 Human genes 0.000 description 1
- RHFUOMFWUGWKKO-UHFFFAOYSA-N s2C Natural products S=C1N=C(N)C=CN1C1C(O)C(O)C(CO)O1 RHFUOMFWUGWKKO-UHFFFAOYSA-N 0.000 description 1
- 230000005783 single-strand break Effects 0.000 description 1
- 238000005063 solubilization Methods 0.000 description 1
- 230000007928 solubilization Effects 0.000 description 1
- 229960000268 spectinomycin Drugs 0.000 description 1
- 208000003265 stomatitis Diseases 0.000 description 1
- 239000000758 substrate Substances 0.000 description 1
- 239000005720 sucrose Substances 0.000 description 1
- 238000004885 tandem mass spectrometry Methods 0.000 description 1
- 208000011317 telomere syndrome Diseases 0.000 description 1
- 230000001225 therapeutic effect Effects 0.000 description 1
- 229940094937 thioredoxin Drugs 0.000 description 1
- 229940104230 thymidine Drugs 0.000 description 1
- 101150008052 traA gene Proteins 0.000 description 1
- 230000005030 transcription termination Effects 0.000 description 1
- 108091006106 transcriptional activators Proteins 0.000 description 1
- 230000002103 transcriptional effect Effects 0.000 description 1
- 108091006107 transcriptional repressors Proteins 0.000 description 1
- 230000002463 transducing effect Effects 0.000 description 1
- 230000032258 transport Effects 0.000 description 1
- NMEHNETUFHBYEG-IHKSMFQHSA-N tttn Chemical compound C([C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC=1NC=NC=1)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N1[C@@H](CCC1)C(=O)NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CCSC)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](NC(=O)[C@H]1N(CCC1)C(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)[C@@H](C)O)[C@@H](C)O)C1=CC=CC=C1 NMEHNETUFHBYEG-IHKSMFQHSA-N 0.000 description 1
- HDZZVAMISRMYHH-KCGFPETGSA-N tubercidin Chemical compound C1=CC=2C(N)=NC=NC=2N1[C@@H]1O[C@H](CO)[C@@H](O)[C@H]1O HDZZVAMISRMYHH-KCGFPETGSA-N 0.000 description 1
- 108010020156 tyrosyl-glycyl-prolyl-leucyl-phenylalanyl-proline Proteins 0.000 description 1
- 241000712461 unidentified influenza virus Species 0.000 description 1
- 229940035893 uracil Drugs 0.000 description 1
- NQPDZGIKBAWPEJ-UHFFFAOYSA-N valeric acid Chemical compound CCCCC(O)=O NQPDZGIKBAWPEJ-UHFFFAOYSA-N 0.000 description 1
- 208000005925 vesicular stomatitis Diseases 0.000 description 1
- 238000002629 virtual reality therapy Methods 0.000 description 1
- 210000005253 yeast cell Anatomy 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/78—Hydrolases (3) acting on carbon to nitrogen bonds other than peptide bonds (3.5)
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61P—SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
- A61P43/00—Drugs for specific purposes, not provided for in groups A61P1/00-A61P41/00
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/0004—Oxidoreductases (1.)
- C12N9/0071—Oxidoreductases (1.) acting on paired donors with incorporation of molecular oxygen (1.14)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/16—Hydrolases (3) acting on ester bonds (3.1)
- C12N9/22—Ribonucleases [RNase]; Deoxyribonucleases [DNase]
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Y—ENZYMES
- C12Y305/00—Hydrolases acting on carbon-nitrogen bonds, other than peptide bonds (3.5)
- C12Y305/04—Hydrolases acting on carbon-nitrogen bonds, other than peptide bonds (3.5) in cyclic amidines (3.5.4)
- C12Y305/04004—Adenosine deaminase (3.5.4.4)
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K2319/00—Fusion polypeptide
- C07K2319/80—Fusion polypeptide containing a DNA binding domain, e.g. LacI or Tet-repressor
Definitions
- Targeted editing of nucleic acid sequences is a highly promising approach for the study of gene function and also has the potential to provide new therapies for genetic diseases, including those caused by point mutations.
- Point mutations represent the majority of known human genetic variants associated with disease. Developing robust methods to introduce and correct point mutations is therefore important in understanding and treating diseases with a genetic component.
- Base editing involves the conversion of a specific nucleic acid base into another at a targeted genomic locus. For certain approaches, this can be achieved without requiring double-stranded DNA breaks (DSB). Since many genetic diseases arise from point mutations, this technology has important implications in the study of human health and disease. Engineered base editors are capable of editing many targets with high efficiency, often achieving editing of 30-70% of cells following a single treatment, without selective enrichment of the cell population for editing events.
- Base editors are typically fusions of a Cas (“CRISPR-associated”) domain and a nucleobase modification domain (e.g., a natural or evolved deaminase, such as a cytidine deaminase, including APOBEC1 (“apolipoprotein B mRNA editing enzyme, catalytic polypeptide 1”), CDA (“cytidine deaminase”), and AID (“activation-induced cytidine deaminase”) domains).
- Base editors may also include proteins or domains that alter cellular DNA repair processes to increase the efficiency and/or stability of the resulting single-nucleotide change.
- C-to-T base editors use a cytidine deaminase to convert cytidine to uridine in the single-stranded DNA loop created by the Cas9 (“CRISPR-associated protein 9”) domain.
- The opposite strand is nicked by Cas9 to stimulate DNA repair mechanisms that use the edited strand as a template, while a fused uracil glycosylase inhibitor slows excision of the edited base.
- DNA repair leads to a C:G to T:A base pair conversion.
- This class of base editor is described in U.S. Patent Publication No. 2017/0121693, published May 4, 2017, which issued on January 1, 2019 as U.S. Patent No. 10,167,457, which is incorporated by reference in its entirety herein.
- A major limitation of base editing is the inability to generate transversion (purine ↔ pyrimidine) changes, which are needed to correct ~38% of known human pathogenic SNPs. See Komor, A.C. et al., Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage, Nature 533, 420-424 (2016) and Landrum, M.J. et al., ClinVar: public archive of relationships among sequence variation and human phenotype, Nucleic Acids Res. 42, D980-985 (2014), each of which is incorporated by reference. Of this ~38% of known pathogenic SNPs, about 15% arise from C:G to A:T mutations. Many C:G to A:T point mutations introduce premature stop codons (UAA, UAG, UGA), resulting in nonsense mutations in protein coding regions.
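To make the stop-codon remark concrete, the following is a minimal sketch that lists the sense codons a single C:G to A:T change can turn into a stop codon. It assumes the standard genetic code and works on the DNA coding strand (so the stop codons appear as TAA, TAG, and TGA, and the change reads as C-to-A or G-to-T). The names `CODON_TABLE`, `CHANGES`, and `codons_made_stop` are illustrative and do not come from the disclosure.

```python
# Standard genetic code, written with the conventional T, C, A, G base order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# A C:G to A:T mutation appears on the coding strand as C->A or G->T.
CHANGES = {"C": "A", "G": "T"}


def codons_made_stop():
    """Map each sense codon to (amino acid, stop codon) when one
    C->A or G->T substitution converts it into a stop codon."""
    hits = {}
    for codon, amino_acid in CODON_TABLE.items():
        if amino_acid == "*":
            continue  # already a stop codon
        for pos, base in enumerate(codon):
            if base in CHANGES:
                mutant = codon[:pos] + CHANGES[base] + codon[pos + 1:]
                if CODON_TABLE[mutant] == "*":
                    hits[codon] = (amino_acid, mutant)
    return hits


stop_hits = codons_made_stop()
for codon in sorted(stop_hits):
    amino_acid, mutant = stop_hits[codon]
    print(f"{codon} ({amino_acid}) -> {mutant} (stop)")
```

Under these assumptions, seven sense codons (encoding Ser, Tyr, Cys, Glu, and Gly) are one such substitution away from a stop codon.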
- Transversions can only be repaired by nuclease-mediated formation of a double-stranded break (DSB) followed by homology directed repair (HDR), which is typically inefficient, especially in non-mitotic cells, and leads to undesired by-products, such as indels (insertions and deletions) and translocations.
- Developing transversion base editors requires a new editing strategy, such as the manipulation of endogenous DNA repair pathways or a different nucleobase chemical transformation.
- the present disclosure describes novel transversion base editors using an innovative adenine oxidation strategy. The present disclosure greatly expands the capabilities of base editing.
- The present disclosure provides transversion base editors which add to the repertoire of base editors that have already been developed. In particular, the present disclosure provides for adenine-to-thymine or “ATBE” (or thymine-to-adenine or “TABE”) transversion base editors which satisfy the need in the art for the installation of targeted single-base transversion nucleobase changes in a target nucleotide sequence, e.g., a genome.
- the present disclosure provides for nucleic acid molecules encoding and/or expressing the adenine-to-thymine and thymine-to-adenine transversion base editors described herein, as well as expression vectors or constructs for expressing these transversion base editors, host cells comprising said nucleic acid molecules and expression vectors, and compositions for delivering and/or administering nucleic acid-based embodiments described herein.
- the disclosure provides for compositions comprising these transversion base editors.
- the present disclosure provides for methods of making the transversion base editors, as well as methods of using the transversion base editors or nucleic acid molecules encoding such transversion base editors in applications including editing a nucleic acid molecule, e.g., a genome.
- the present inventors have developed novel transversion base editors, and in particular a novel base editor that installs an A-to-T transversion in a targeted manner, through deamination and oxidation reactions. This new strategy allows for the efficient and specific transversion of A-to-T or T-to-A using the inventive base editors described herein.
- a targeted adenosine (A) in a nucleic acid of interest is first enzymatically deaminated to an inosine (that is, the adenine nucleobase is deaminated to hypoxanthine).
- Enzyme-catalyzed oxidation of the newly formed inosine is induced, resulting in formation of 8-oxo-inosine (8-oxo-I).
- Steric rotation of the 8-oxo-I around the glycosidic bond is induced, presenting the Hoogsteen edge for base pairing.
- the 8-oxo-I is paired with thymine by a polymerase.
- the cell recognizes the mismatch between 8-oxo-I and the thymine on the unmutated strand and converts the thymine to an adenine.
- the 8-oxo-I is converted to a thymine.
- a desired A-to-T transversion is thus achieved.
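The sequence of steps above can be followed with a small bookkeeping sketch that records which base sits on the edited strand and on the opposite strand after each step. It is illustrative only: it models no chemistry or repair kinetics, and the function name `a_to_t_pathway` is an assumption introduced here, not a term from the disclosure.

```python
def a_to_t_pathway(edited="A", opposite="T"):
    """Return the (edited strand, opposite strand) bases after each step
    of the deamination-oxidation route described in the text."""
    if (edited, opposite) != ("A", "T"):
        raise ValueError("the pathway starts from an A:T pair")
    states = [(edited, opposite)]
    states.append(("I", opposite))        # deamination: A -> inosine
    states.append(("8-oxo-I", opposite))  # oxidation: inosine -> 8-oxo-inosine
    states.append(("8-oxo-I", "A"))       # opposite-strand thymine is replaced by adenine
    states.append(("T", "A"))             # 8-oxo-I is converted to thymine
    return states


steps = a_to_t_pathway()
for edited_base, opposite_base in steps:
    print(f"{edited_base}:{opposite_base}")
```

The first and last entries are the starting A:T pair and the resulting T:A pair; the intermediate entries correspond to the inosine and 8-oxo-I stages.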
- Adenine oxidation is achieved by the targeted use of a fusion protein comprising a Cas9 (e.g., dCas9 or nCas9) domain, an adenosine deaminase domain, an inosine oxidase domain, and optionally linkers interconnecting these domains (see FIG. 1A).
- the nucleic acid programmable DNA binding protein may be a Cas9 domain.
- the napDNAbp may also be a CasX, a CasY, a C2c1, a C2c2, a C2c3, a GeoCas9, a CjCas9, a Cas12a (formerly known as Cpf1), a Cas12b, a Cas12g, a Cas12h, a Cas12i, a Cas13b, a Cas13c, a Cas13d, a Cas14, a Csn2, an xCas9, an SpCas9-NG, an LbCas12a, an AsCas12a, a Cas9-KKH, a circularly permuted Cas9, an Argonaute (Ago), a SmacCas9, or a Spy-macCas
- the domains of the base editor fusion protein may be interconnected with a linker.
- This linker may be any suitable amino acid linker, synthetic linker, polymer, or a covalent bond.
- exemplary linkers include any of the following amino acid sequences:
- SGGSSGGSSGSETPGTSESATPESSGGSSGGS (SEQ ID NO: 11), also known as an XTEN linker; SGGSGGSGGS (SEQ ID NO: 12); GGG; GGGS (SEQ ID NO: 28); SGGGS (SEQ ID NO: 2); SGSETPGTSESATPES (SEQ ID NO: 79); or SGGS (SEQ ID NO: 14).
- The disclosure provides a base editor fusion protein comprising: (i) a nucleic acid programmable DNA binding protein (napDNAbp), (ii) an adenosine deaminase, and (iii) an oxidase.
- the adenosine deaminase (“AD”) domain is adapted to deaminate adenosine to inosine.
- the adenosine deaminase domain is a monomer.
- the adenosine deaminase domain is a dimer comprising first and second adenosine deaminases.
- the adenosine deaminase comprises Escherichia coli tRNA adenosine deaminase (TadA), or a variant thereof.
- the oxidase is a wild-type oxidase, or a variant thereof, that oxidizes an inosine in DNA.
- the oxidase is an AlkB, or a variant thereof.
- the oxidase is a bacterial AlkB, or a variant thereof.
- the oxidase is a human AlkBH, or a variant thereof.
- the oxidase is a human AlkBH1, AlkBH2, AlkBH3, AlkBH4, AlkBH5, AlkBH7, AlkBH8, or a variant thereof.
- the oxidase comprises any one of the amino acid sequences of SEQ ID NOs: 5-7, 10, 15-20, 29, 34-39, 41, 134-139, 140-141, and 131. In various embodiments, the oxidase comprises an amino acid sequence that is at least 80%, 85%, 90%, 95%, 98%, or 99% identical to the amino acid sequence of any one of SEQ ID NOs: 5-7, 10, 15-20, 29, 34-39, 41, 134-139, 140-141, and 131. In particular embodiments, the oxidase comprises an amino acid sequence that is at least 80%, 85%, 90%, 95%, 98%, or 99% identical to the amino acid sequence of SEQ ID NO: 136.
- the base editor fusion protein further comprises an inhibitor of base excision repair (“iBER”) that covalently or non-covalently binds to a mutated nucleobase to prevent its excision during subsequent mismatch repair.
- Use of an iBER in the base editor fusion protein may increase base editing efficiency for the
- the iBER is an 8-oxo-guanine glycosylase (OGG or OGG1) inhibitor (“OGG inhibitor”), a thymine-DNA glycosylase (TDG) inhibitor, a uracil-DNA glycosylase (UDG) inhibitor, or a Methyl-CpG Binding Domain 4 (MBD4) inhibitor.
- the iBER comprises a catalytically inactive OGG that binds 8-oxo-inosine to prevent its excision during subsequent mismatch repair.
- the base editor fusion proteins described herein may comprise any of the following structures: NH2-[oxidase]-[AD]-[AD]-[napDNAbp]-COOH; NH2-[oxidase]-[napDNAbp]-[AD]-[AD]-COOH; NH2-[AD]-[AD]-[oxidase]-[napDNAbp]-COOH; NH2-[AD]-[AD]-[napDNAbp]-[oxidase]-COOH; NH2-[napDNAbp]-[oxidase]-[AD]-[AD]-COOH; or NH2-[napDNAbp]-[AD]-[AD]-[oxidase]-COOH, wherein each instance of “]-[” comprises an optional linker.
- the base editor fusion proteins described herein can comprise any of the following structures: NH2-[iBER]-[oxidase]-[AD]-[AD]-[napDNAbp]-COOH; NH2-[oxidase]-[iBER]-[AD]-[AD]-[napDNAbp]-COOH; NH2-[oxidase]-[AD]-[AD]-[iBER]-[napDNAbp]-COOH; NH2-[oxidase]-[AD]-[AD]-[napDNAbp]-[iBER]-COOH; NH2-[iBER]-[oxidase]-[napDNAbp]-[AD]-COOH
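The six architectures listed above without an iBER are the possible orderings of three blocks: the oxidase, the adenosine deaminase dimer, and the napDNAbp. A minimal sketch that enumerates them is shown below; the names `BLOCKS` and `architectures` are hypothetical, and the deaminase dimer is treated as a single block as in the listed structures.

```python
from itertools import permutations

# The adenosine deaminase dimer is kept together as one block.
BLOCKS = ("[oxidase]", "[AD]-[AD]", "[napDNAbp]")


def architectures(blocks=BLOCKS):
    """Return every N-to-C terminal ordering of the given domain blocks."""
    return ["NH2-" + "-".join(order) + "-COOH" for order in permutations(blocks)]


all_architectures = architectures()
for structure in all_architectures:
    print(structure)
```

Passing a fourth block such as `"[iBER]"` to `architectures` would enumerate 24 orderings, of which the text lists only some.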
- the disclosure provides nucleic acid molecules or constructs encoding any of the base editor fusion proteins, or domains thereof.
- the nucleic acid sequences may be codon-optimized for expression in the cells of any organism of interest. In certain embodiments, the nucleic acid sequence is codon-optimized for expression in human cells.
- the disclosure provides polynucleotides and/or vectors encoding any of the base editor fusion proteins described herein, or domains thereof.
- These nucleic acid sequences are typically engineered or modified experimentally.
- these nucleic acid sequences may be codon-optimized for expression in an organism of interest, e.g. mammalian cells.
- the nucleic acid sequences are codon-optimized for expression in human cells.
- cells containing such polynucleotides or constructs are provided.
- complexes comprising any of the fusion proteins described herein and a guide RNA bound to the napDNAbp domain of the fusion protein are provided.
- the disclosure provides a pharmaceutical composition comprising any of the fusion proteins described herein and a pharmaceutically acceptable excipient.
- the pharmaceutical composition further comprises a gRNA.
- the disclosure provides a kit comprising a nucleic acid construct that includes (i) a nucleic acid sequence encoding any of the fusion proteins described herein; (ii) a heterologous promoter that drives expression of the sequence of (i); and optionally an expression construct encoding a guide RNA backbone and the target sequence.
- methods for targeted nucleic acid editing are provided.
- the methods described herein typically comprise i) contacting a nucleic acid sequence with a complex comprising any of the fusion proteins described herein and a guide nucleic acid, wherein the double-stranded DNA comprises a target A:T (or T:A) nucleobase pair, and ii) editing the thymine (or adenine) of the A:T (or T:A) nucleobase pair.
- the methods may further comprise iii) cutting or nicking the non-edited strand of the double-stranded DNA.
- Methods of treatment using the inventive base editors are provided. The methods described herein may comprise treating a subject having or at risk of developing a disease, disorder, or condition, comprising administering to the subject a fusion protein as described herein, a polynucleotide as described herein, a vector as described herein, or a pharmaceutical composition as described herein.
- FIG. 1A is a schematic illustration showing an exemplary fusion protein of the disclosure.
- a fusion protein comprising an nCas9 domain linked to a tRNA adenosine deaminase homodimer linked to an inosine oxidase enzyme is targeted to the correct adenine nucleobase through the hybridization of an sgRNA to a complementary sequence of DNA.
- the inosine oxidase oxidizes the inosine to 8-oxo-inosine, and subsequently, the cell’s native
- FIG. 1B depicts the chemical conversion of adenosine to 8-oxo-inosine (8-oxo-I).
- the cell interprets the 8-oxo-I as a T, and converts the mismatched thymine of the non-edited strand to an adenine.
- the 8-oxo-I lesion is converted to a thymine, completing the desired A:T to T:A mutation.
- FIG. 2 depicts an exemplary assay for selection of evolved variants of human AlkBH3 dioxygenase that are highly effective at oxidizing inosine.
- Libraries of mutagenized TadA-TadA-AlkBH3-dCas9 fusion proteins, targeting guide RNAs, and a selection plasmid containing an inactivated spectinomycin resistance gene with mutations at the active site (D182V or K205T) that require A-to-T editing to correct are transformed into E. coli cells, which are plated onto agar media containing spectinomycin and sucrose.
- the term “accessory plasmid,” as used herein, refers to a plasmid comprising a gene required for the generation of infectious viral particles under the control of a conditional promoter.
- transcription from the conditional promoter of the accessory plasmid is typically activated, directly or indirectly, by a function of the gene to be evolved.
- the accessory plasmid serves the function of conveying a competitive advantage to those viral vectors in a given population of viral vectors that carry a version of the gene to be evolved able to activate the conditional promoter or able to activate the conditional promoter more strongly than other versions of the gene to be evolved.
- only viral vectors carrying an “activating” version of the gene to be evolved will be able to induce expression of the gene required to generate infectious viral particles in the host cell, and, thus, allow for packaging and propagation of the viral genome in the flow of host cells.
- Vectors carrying non-activating versions of the gene to be evolved will not induce expression of the gene required to generate infectious viral vectors, and, thus, will not be packaged into viral particles that can infect fresh host cells.
- Exemplary accessory plasmids have been described, for example in U.S. Application No. 15/567,312, published as U.S. Pub. No. 2018/0087046, filed on April 15, 2016, the entire contents of which are incorporated by reference.
- Base editing is a genome editing technology that involves the conversion of a specific nucleic acid base into another at a targeted genomic locus. In certain embodiments, this can be achieved without requiring double-stranded DNA breaks (DSB).
- CRISPR-based systems begin with the introduction of a DSB at a locus of interest. Subsequently, cellular DNA repair enzymes mend the break, commonly resulting in random insertions or deletions (indels) of bases at the site of the DSB.
- There are 12 possible base-to-base changes that may occur via individual or sequential use of transition (i.e., a purine-to-purine change or pyrimidine-to-pyrimidine change) or transversion (i.e., a purine-to-pyrimidine or pyrimidine-to-purine change) editors. These include:
- C-to-T base editor (or “CTBE”). This type of editor converts a C:G Watson-Crick nucleobase pair to a T:A Watson-Crick nucleobase pair. Because the corresponding Watson-Crick paired bases are also interchanged as a result of the conversion, this category of base editor may also be referred to as a G-to-A base editor (or “GABE”).
- A-to-G base editor (or “AGBE”). This type of editor converts an A:T Watson-Crick nucleobase pair to a G:C Watson-Crick nucleobase pair. Because the corresponding Watson-Crick paired bases are also interchanged as a result of the conversion, this category of base editor may also be referred to as a T-to-C base editor (or “TCBE”).
- C-to-G base editor (or “CGBE”). This type of editor converts a C:G Watson-Crick nucleobase pair to a G:C Watson-Crick nucleobase pair. Because the corresponding Watson-Crick paired bases are also interchanged as a result of the conversion, this category of base editor may also be referred to as a G-to-C base editor (or “GCBE”).
- A-to-C base editor (or “ACBE”). This type of editor converts an A:T Watson-Crick nucleobase pair to a C:G Watson-Crick nucleobase pair. Because the corresponding Watson-Crick paired bases are also interchanged as a result of the conversion, this category of base editor may also be referred to as a T-to-G base editor (or “TGBE”).
- G-to-T base editor (or “GTBE”). This type of editor converts a G:C Watson-Crick nucleobase pair to a T:A Watson-Crick nucleobase pair. Because the corresponding Watson-Crick paired bases are also interchanged as a result of the conversion, this category of base editor may also be referred to as a C-to-A base editor (or “CABE”).
- A-to-T base editor (or “ATBE”). This type of editor converts an A:T Watson-Crick nucleobase pair to a T:A Watson-Crick nucleobase pair. Because the corresponding Watson-Crick paired bases are also interchanged as a result of the conversion, this category of base editor may also be referred to as a T-to-A base editor (or “TABE”).
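The count of 12 base-to-base changes, and their grouping into six editor categories with two names each, can be checked with a short sketch. It pairs each change with the complementary change on the opposite strand and labels the pair as a transition or a transversion. The helper names `classify` and `editor_table` are illustrative assumptions.

```python
from itertools import permutations

PURINES = {"A", "G"}
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def classify(src, dst):
    """Transition if both bases are purines or both are pyrimidines."""
    same_class = (src in PURINES) == (dst in PURINES)
    return "transition" if same_class else "transversion"


def editor_table():
    """Group the 12 ordered base changes into editor categories, each
    holding a change and its Watson-Crick complementary change."""
    editors = {}
    for src, dst in permutations("ACGT", 2):
        partner = (COMPLEMENT[src], COMPLEMENT[dst])
        editors[frozenset([(src, dst), partner])] = classify(src, dst)
    return editors


table = editor_table()
for pair, kind in sorted(table.items(), key=lambda item: sorted(item[0])):
    names = " / ".join(f"{s}-to-{d}" for s, d in sorted(pair))
    print(f"{names}: {kind}")
```

The result is two transition categories (C-to-T / G-to-A and A-to-G / T-to-C) and four transversion categories, the last of which is the A-to-T / T-to-A editor that is the subject of the disclosure.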
- the term “base editors (BEs)”, as used herein, refers to the Cas-fusion proteins described herein.
- the fusion protein comprises a nuclease-inactive Cas9 (dCas9) fused to an oxidase which binds nucleic acid in a guide RNA-programmed manner via the formation of an R-loop, but does not cleave the nucleic acid.
- the Cas9 domain of the fusion protein may include a D10A or an H840A mutation (either of which alone renders Cas9 capable of cleaving only one strand of a nucleic acid duplex; the two together yield the nuclease-inactive dCas9), as described in PCT/US2016/058344 (filed on October 22, 2016 and published as WO 2017/070632 on April 27, 2017), which is incorporated herein by reference in its entirety.
- the DNA cleavage domain of S. pyogenes Cas9 includes two subdomains, the HNH nuclease subdomain and the RuvC1 subdomain.
- the HNH subdomain cleaves the strand complementary to the gRNA (the “targeted strand,” or the strand at which editing or deamination-oxidation occurs), whereas the RuvC1 subdomain cleaves the non-complementary strand containing the PAM sequence (the “non-targeted strand,” or the strand at which editing or deamination-oxidation does not occur).
- the RuvC1 mutant D10A generates a nick on the targeted strand.
- the HNH mutant H840A generates a nick on the non-targeted strand (see Jinek et al., Science. 337:816-821 (2012); Qi et al., Cell. 28; 152(5): 1173-83 (2013)).
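The relationship between the two point mutations and the strand that remains cleavable can be summarized as a lookup, sketched below. The dictionaries restate only what the text says (which subdomain cuts which strand, and which subdomain each mutation belongs to); the names `SUBDOMAIN_STRAND`, `INACTIVATES`, and `strands_cut` are hypothetical.

```python
# Strand cleaved by each nuclease subdomain, as stated in the text.
SUBDOMAIN_STRAND = {"HNH": "targeted", "RuvC1": "non-targeted"}

# Subdomain inactivated by each point mutation.
INACTIVATES = {"D10A": "RuvC1", "H840A": "HNH"}


def strands_cut(mutations=()):
    """Return the set of strands still cleaved by a Cas9 carrying the given mutations."""
    dead = {INACTIVATES[m] for m in mutations}
    return {strand for sub, strand in SUBDOMAIN_STRAND.items() if sub not in dead}


print(strands_cut())                   # wild type: both strands
print(strands_cut(["D10A"]))           # nickase: targeted strand only
print(strands_cut(["H840A"]))          # nickase: non-targeted strand only
print(strands_cut(["D10A", "H840A"]))  # both inactivated: no strand is cut
```

With one mutation the enzyme acts as a nickase, and with both it cleaves neither strand.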
- the fusion protein comprises a Cas9 nickase fused to a deaminase domain that is fused to an oxidase domain, which converts an adenine nucleobase to 8-oxo-inosine.
- the term “base editors” encompasses the base editors described herein as well as any base editor known or described in the art at the time of this filing or developed in the future. Reference is made to Rees & Liu, Base editing: precision chemistry on the genome and transcriptome of living cells, Nat Rev Genet. 2018;19(12):770-788; as well as U.S. Patent Publication No. 2018/0073012, published March 15, 2018, which issued as U.S. Patent No.
- Cas9 or “Cas9 nuclease” or “Cas9 domain” refers to a CRISPR associated protein 9, or variant thereof, and embraces any naturally occurring Cas9 from any organism, any Cas9 homolog, ortholog, or paralog from any organism, and any variant of a Cas9, naturally-occurring or engineered. More broadly, a Cas9 protein or domain is a type of “nucleic acid programmable DNA binding protein (napDNAbp)”.
- Cas9 is not meant to be limiting and may be referred to as a “Cas9 or variant thereof.” Exemplary Cas9 proteins are described herein and also described in the art. The present disclosure is unlimited with regard to the particular Cas9 that is employed in the base editors of the disclosure.
- proteins comprising Cas9 or fragments thereof are referred to as “Cas9 variants.”
- a Cas9 variant shares homology to Cas9, or a fragment thereof.
- Cas9 variants include functional fragments of Cas9.
- a Cas9 variant is at least about 70% identical, at least about 80% identical, at least about 90% identical, at least about 95% identical, at least about 96% identical, at least about 97% identical, at least about 98% identical, at least about 99% identical, at least about 99.5% identical, or at least about 99.9% identical to wild type Cas9.
- the Cas9 variant may have 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30,
- the Cas9 variant comprises a fragment of Cas9 (e.g., a gRNA binding domain or a DNA-cleavage domain), such that the fragment is at least about 70% identical, at least about 80% identical, at least about 90% identical, at least about 95% identical, at least about 96% identical, at least about 97% identical, at least about 98% identical, at least about 99% identical, at least about 99.5% identical, or at least about 99.9% identical to the corresponding fragment of wild type Cas9.
- the fragment is at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or at least 99.5% of the amino acid length of a corresponding wild type Cas9.
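The percent-identity thresholds used for Cas9 variants and fragments can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned to equal length (with `-` marking gaps) and divides identical columns by the alignment length; other conventions for the denominator exist, and the disclosure does not specify one here. The function names and the 10-residue toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns where both sequences carry the same residue.

    Assumes pre-aligned inputs of equal length, with '-' marking a gap.
    Gap columns count toward the denominator but never as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must be non-empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the aligned pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy example (illustrative only): one substitution in ten residues.
reference = "MDKKYSIGLD"
variant = "MDKKYSIGLA"
identity = percent_identity(reference, variant)
```

Under these assumptions the toy variant would satisfy an "at least about 90% identical" threshold but not a 95% one.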
- dCas9 refers to a nuclease-inactive Cas9 or nuclease-dead Cas9, or a functional fragment or variant thereof, and embraces any naturally occurring dCas9 from any organism, any naturally-occurring dCas9 equivalent or functional fragment thereof, any dCas9 homolog, ortholog, or paralog from any organism, and any mutant or variant of a dCas9, naturally-occurring or engineered.
- dCas9 is not meant to be particularly limiting and may be referred to as a “dCas9 or equivalent.”
- Exemplary dCas9 proteins and methods for making dCas9 proteins are further described herein and/or are described in the art and are incorporated herein by reference.
- nCas9 or “Cas9 nickase” refers to a Cas9 or a functional fragment or variant thereof, which cleaves or nicks only one of the strands of a target cut site, thereby introducing a nick in a double strand DNA molecule rather than creating a double strand break. This can be achieved by introducing appropriate mutations in a wild-type Cas9 which inactivate one of the two endonuclease activities of the Cas9.
- Any suitable mutation that inactivates one Cas9 endonuclease activity but leaves the other intact is contemplated; for example, a D10A or H840A mutation in the wild-type Cas9 amino acid sequence (e.g., SEQ ID NO: 9) may be used to form the nCas9.
- the term “continuous evolution,” as used herein, refers to an evolution procedure (e.g., PACE) in which a population of nucleic acids is subjected to multiple rounds of (a) replication, (b) mutation, and (c) selection to produce a desired evolved product, for example, a nucleic acid encoding a protein with a desired activity, wherein the multiple rounds can be performed without investigator interaction and wherein the processes under (a)-(c) can be carried out simultaneously.
- the evolution procedure is carried out in vitro, for example, using cells in culture as host cells.
- a continuous evolution process relies on a system in which a gene of interest is provided in a nucleic acid vector that undergoes a life-cycle including replication in a host cell and transfer to another host cell, wherein a critical component of the life-cycle is deactivated and
- the nucleic acid vector comprising the gene of interest is a phage, a viral vector, or naked DNA (e.g., a mobilization plasmid).
- transfer of the gene of interest from cell to cell is via infection, transfection, transduction, conjugation, or uptake of naked DNA, and efficiency of cell-to-cell transfer (e.g., transfer rate) is dependent on the activity of a product encoded by the gene of interest.
- the nucleic acid vector is a phage harboring the gene of interest and the efficiency of phage transfer (via infection) is dependent on an activity of the gene of interest in that a protein required for the generation of phage particles (e.g., pIII for M13 phage) is expressed in the host cells only in the presence of the desired activity of the gene of interest.
- the nucleic acid vector is a retroviral vector, for example, a lentiviral or vesicular stomatitis virus vector harboring the gene of interest, and the efficiency of viral transfer from cell to cell is dependent on the activity of the gene of interest in that a protein required for the generation of viral particles (e.g., an envelope protein, such as VSV-g) is expressed in the host cells only in the presence of the desired activity of the gene of interest.
- the nucleic acid vector is a DNA vector, for example, in the form of a mobilizable plasmid DNA, comprising the gene of interest, that is transferred between bacterial host cells via conjugation and the efficiency of conjugation-mediated transfer from cell to cell is dependent on an activity of the gene of interest in that a protein required for conjugation-mediated transfer (e.g., traA or traQ) is expressed in the host cells only in the presence of the desired activity of the gene of interest.
- Host cells contain F plasmid lacking one or both of those genes.
- some embodiments provide a continuous evolution system, in which a population of viral vectors comprising a gene of interest to be evolved replicates in a flow of host cells, e.g., a flow through a lagoon, wherein the viral vectors are deficient in a gene encoding a protein that is essential for the generation of infectious viral particles, and wherein that gene is comprised in the host cell under the control of a conditional promoter that can be activated by a gene product encoded by the gene of interest, or a mutated version thereof.
- the activity of the conditional promoter depends on a desired function of a gene product encoded by the gene of interest.
- Viral vectors in which the gene of interest has not acquired a mutation conferring the desired function will not activate the conditional promoter, or will only achieve minimal activation, while any mutation in the gene of interest that confers the desired function will result in activation of the conditional promoter. Since the conditional promoter controls an essential protein for the viral life cycle, activation of this promoter directly corresponds to an advantage in viral spread and replication for those vectors that have acquired an advantageous mutation.
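The selection logic described above, in which promoter activation translates into a replication advantage within a flow of host cells, can be illustrated with a deliberately simplified toy model. This is a sketch under stated assumptions, not a description of the actual PACE apparatus: the variant names, activation levels, burst size, and dilution factor are all hypothetical numbers chosen for illustration.

```python
def propagate(
    population: dict[str, float],
    activation: dict[str, float],
    max_burst: float = 4.0,
    dilution: float = 0.5,
) -> dict[str, float]:
    """Advance the toy lagoon by one generation.

    Progeny per vector scale with promoter activation (0..1), which stands in
    for how much of the essential viral protein the host cell produces.
    The whole population is then diluted by the outflow of the lagoon.
    """
    return {
        name: count * max_burst * activation[name] * dilution
        for name, count in population.items()
    }


# Hypothetical activation levels: the unmutated gene barely activates the
# conditional promoter, the improved variant activates it strongly.
activation = {"unmutated": 0.05, "improved": 0.9}
population = {"unmutated": 1000.0, "improved": 1.0}

for _ in range(10):
    population = propagate(population, activation)
```

In this toy model the unmutated vector is washed out because its per-generation growth factor is below 1, while the rare improved variant, starting from a single copy, expands.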
- CRISPR is a family of DNA sequences (i.e., CRISPR clusters) in bacteria and archaea that represent snippets of prior infections by viruses that have invaded the prokaryote.
- the snippets of DNA are used by the prokaryotic cell to detect and destroy DNA from subsequent attacks by similar viruses and effectively constitute, along with an array of CRISPR-associated proteins (including Cas9 and homologs thereof) and CRISPR-associated RNA, a prokaryotic immune defense system.
- CRISPR clusters are transcribed and processed into CRISPR RNA (crRNA).
- processing of pre-crRNA involves a trans-encoded small RNA (tracrRNA), endogenous ribonuclease 3 (rnc), and a Cas9 protein.
- the tracrRNA serves as a guide for ribonuclease 3-aided processing of pre-crRNA.
- Cas9/crRNA/tracrRNA endonucleolytically cleaves linear or circular nucleic acid target complementary to the RNA. Specifically, the target strand not complementary to crRNA is first cut endonucleolytically, then trimmed 3'-5' exonucleolytically.
- DNA-binding and cleavage typically requires protein and both RNAs.
- single guide RNAs (“sgRNA”, or simply “gRNA”) may be engineered so as to incorporate embodiments of both the crRNA and tracrRNA into a single RNA species, the guide RNA. See, e.g., Jinek M., et al., Science 337:816-821 (2012), the entire contents of which are herein incorporated by reference.
- Cas9 recognizes a short motif in the CRISPR repeat sequences (the PAM or protospacer adjacent motif) to help distinguish self versus non-self.
- CRISPR biology, as well as Cas9 nuclease sequences and structures are well known to those of skill in the art (see, e.g., “Complete genome sequence of an M1 strain of Streptococcus pyogenes.” Ferretti J.J., et al., Proc. Natl. Acad. Sci. U.S.A.
- Cas9 nucleases and sequences include Cas9 sequences from the organisms and loci disclosed in Chylinski, Rhun, and Charpentier,“The tracrRNA and Cas9 families of type II CRISPR-Cas immunity systems” (2013) RNA Biology 10:5, 726-737; the entire contents of which are incorporated herein by reference.
- deaminase or“deaminase domain” refers to a protein or enzyme that catalyzes a deamination reaction.
- the deaminase is an adenosine deaminase, which catalyzes the hydrolytic deamination of adenine or adenosine.
- the deaminase or deaminase domain is an adenosine deaminase, catalyzing the hydrolytic deamination of adenosine or deoxyadenosine to inosine or deoxyinosine, respectively.
- the adenosine deaminase catalyzes the hydrolytic deamination of adenine or adenosine in deoxyribonucleic acid (DNA).
- the adenosine deaminases (e.g., engineered adenosine deaminases, evolved adenosine deaminases)
- the adenosine deaminases may be from any organism, such as a bacterium.
- the deaminase or deaminase domain is a variant of a naturally-occurring deaminase from an organism. In some embodiments, the deaminase or deaminase domain does not occur in nature.
- the deaminase or deaminase domain is at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or at least 99.5% identical to a naturally-occurring deaminase.
- the adenosine deaminase is from a bacterium, such as E. coli, S. aureus, S. typhi, S. putrefaciens, H. influenzae, or C. crescentus.
- the adenosine deaminase is a TadA deaminase.
- the TadA deaminase is an E. coli TadA deaminase (ecTadA).
- the TadA deaminase is a truncated E. coli TadA deaminase.
- the truncated ecTadA may be missing one or more N-terminal amino acids relative to a full-length ecTadA.
- the truncated ecTadA may be missing 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 N-terminal amino acid residues relative to the full length ecTadA. In some embodiments, the truncated ecTadA may be missing 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 C-terminal amino acid residues relative to the full length ecTadA. In some embodiments, the ecTadA deaminase does not comprise an N-terminal methionine. Reference is made to U.S. Patent Publication No. 2018/0073012, published March 15, 2018 and International
- the deaminase domain provided herein is a dimer of two adenosine deaminases. In some embodiments, the deaminase provided herein is a homodimer of two TadA deaminases. In some embodiments, the deaminase provided herein is a heterodimer of a wild-type TadA deaminase and an evolved variant of a TadA deaminase. In some embodiments, the deaminase provided herein is a dimer of two adenosine deaminases that is linked covalently or non-covalently to a napDNAbp and an oxidase. In some embodiments, the deaminase domain provided herein is a monomer, e.g., a single TadA deaminase or a monomer of an evolved variant of a TadA deaminase.
- the TadA deaminase is an N-terminal truncated TadA.
- the adenosine deaminase comprises the amino acid sequence:
- the TadA deaminase is a full-length E. coli TadA deaminase.
- the adenosine deaminase comprises the amino acid sequence:
- adenosine deaminases useful in the present application would be apparent to the skilled artisan and are within the scope of this disclosure.
- the adenosine deaminase may be a homolog of an ADAT.
- ADAT homologs include, without limitation:
- Bacillus subtilis TadA
- Salmonella typhimurium (S. typhimurium) TadA
- Haemophilus influenzae F3031 (H. influenzae) TadA
- an effective amount refers to an amount of a biologically active agent that is sufficient to elicit a desired biological response.
- an effective amount of a base editor may refer to the amount of the base editor that is sufficient to edit a target site of a nucleotide sequence, e.g., a genome.
- an effective amount of a base editor provided herein, e.g., of a fusion protein comprising a nuclease-inactive Cas9 domain and a nucleobase modification domain (e.g., an oxidase domain), may refer to the amount of the fusion protein that is sufficient to induce editing of a target site specifically bound and edited by the fusion protein.
- an effective amount of a base editor provided herein may refer to the amount of the fusion protein sufficient to induce editing having the following characteristics: > 50% product purity, < 5% indels, and an editing window of 2-8 nucleotides.
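The three editing characteristics listed above (product purity above 50%, indels below 5%, and an editing window of 2-8 nucleotides) can be checked against sequencing counts with a small sketch. Several assumptions are made that the text does not spell out: product purity is taken as desired-edit reads divided by all base-edited reads, the indel rate as indel reads divided by total reads, and the editing window as the number of nucleotides spanned by the observed edited positions. The class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class EditingOutcome:
    desired_edit_reads: int      # reads carrying only the intended base change
    other_edit_reads: int        # reads carrying an unintended base change
    indel_reads: int             # reads carrying an insertion or deletion
    total_reads: int             # all reads covering the target site
    edited_positions: list[int]  # protospacer positions where edits were seen


def meets_editing_criteria(outcome: EditingOutcome) -> bool:
    """Apply the three thresholds under the assumptions stated in the lead-in."""
    edited = outcome.desired_edit_reads + outcome.other_edit_reads
    if edited == 0 or outcome.total_reads == 0 or not outcome.edited_positions:
        return False
    purity = outcome.desired_edit_reads / edited
    indel_fraction = outcome.indel_reads / outcome.total_reads
    window_width = max(outcome.edited_positions) - min(outcome.edited_positions) + 1
    return purity > 0.50 and indel_fraction < 0.05 and 2 <= window_width <= 8


# Hypothetical read counts, for illustration only.
clean = EditingOutcome(600, 100, 20, 1000, [4, 5, 7])
too_many_indels = EditingOutcome(600, 100, 80, 1000, [4, 5, 7])
clean_ok = meets_editing_criteria(clean)
indel_ok = meets_editing_criteria(too_many_indels)
```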
- an agent e.g., a fusion protein, a nuclease, an oxidase, an adenosine deaminase, a hybrid protein, a protein dimer, a complex of a protein (or protein dimer) and a polynucleotide, or a polynucleotide
- the desired biological response e.g., on the specific allele, genome, or target site to be edited, on the target cell or tissue (i.e., the cell or tissue to be edited), and on the agent being used.
- the term “evolved base editor” or “evolved base editor variant” refers to a base editor formed as a result of mutagenizing a reference or starting-point base editor.
- the term refers to embodiments in which the nucleobase modification domain is evolved or a separate domain is evolved.
- Mutagenizing a reference or starting-point base editor may comprise mutagenizing a fusion protein comprising an adenosine deaminase-oxidase domain by a continuous evolution method (e.g., PACE), wherein the evolved base editor has one or more amino acid variations introduced into its amino acid sequence relative to the amino acid sequence of the adenosine deaminase-oxidase.
- Amino acid sequence variations may include one or more mutated residues within the amino acid sequence of a reference base editor, e.g., as a result of a change in the nucleotide sequence encoding the base editor that results in a change in the codon at any particular position in the coding sequence, the deletion of one or more amino acids (e.g., a truncated protein), the insertion of one or more amino acids, or any combination of the foregoing.
- the evolved base editor may include variants in one or more components or domains of the base editor (e.g., variants introduced into a Cas9 domain, an adenosine deaminase domain, an inosine oxidase domain, an iBER domain, or variants introduced into combinations of these domains).
- fusion protein refers to a hybrid polypeptide which comprises protein domains from at least two different proteins.
- One protein may be located at the amino-terminal (N-terminal) portion of the fusion protein or at the carboxy-terminal (C-terminal) portion, thus forming an “amino-terminal fusion protein” or a “carboxy-terminal fusion protein,” respectively.
- a protein may comprise different domains, for example, a nucleic acid binding domain (e.g., the gRNA binding domain of Cas9 that directs the binding of the protein to a target site) and a nucleic acid cleavage domain or a catalytic domain of a nucleic-acid editing protein.
- any of the proteins provided herein may be produced by any method known in the art.
- the proteins provided herein may be produced via recombinant protein expression and purification, which is especially suited for fusion proteins comprising a peptide linker.
- Methods for recombinant protein expression and purification are well known, and include those described by Green and Sambrook, Molecular Cloning: A Laboratory Manual (4th ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (2012)), the entire contents of which are incorporated herein by reference.
- a suitable host cell refers to a cell that can host, replicate, and transfer a phage vector useful for a continuous evolution process as provided herein.
- a suitable host cell is a cell that can be infected by the viral vector, can replicate it, and can package it into viral particles that can infect fresh host cells.
- a cell can host a viral vector if it supports expression of genes of the viral vector, replication of the viral genome, and/or the generation of viral particles.
- One criterion to determine whether a cell is a suitable host cell for a given viral vector is to determine whether the cell can support the viral life cycle of a wild-type viral genome that the viral vector is derived from.
- a suitable host cell would be any cell that can support the wild-type M13 phage life cycle.
- Suitable host cells for viral vectors useful in continuous evolution processes are well known to those of skill in the art, and the disclosure is not limited in this respect.
- the viral vector is a phage and the host cell is a bacterial cell.
- the host cell is an E. coli cell. Suitable E.
- a fresh host cell can, however, have been infected by a viral vector unrelated to the vector to be evolved or by a vector of the same or a similar type but not carrying the gene of interest.
- the host cell is a prokaryotic cell, for example, a bacterial cell.
- the host cell is an E. coli cell.
- the host cell is a eukaryotic cell, for example, a yeast cell, an insect cell, or a mammalian cell.
- the type of host cell will, of course, depend on the viral vector employed, and suitable host cell/viral vector combinations will be readily apparent to those of skill in the art.
- the host cells are E. coli cells expressing the Fertility factor, also commonly referred to as the F factor, sex factor, or F-plasmid.
- the F-factor is a bacterial DNA sequence that allows a bacterium to produce a sex pilus necessary for conjugation and is essential for the infection of E. coli cells with certain phage, for example, with M13 phage.
- the host cells for M13-PACE are of the genotype F′proA+B+ Δ(lacIZY) zzf::Tn10(TetR)/ endA1 recA1 galE15 galK16 nupG rpsL ΔlacIZYA araD139 Δ(ara,leu)7697 mcrA Δ(mrr-hsdRMS-mcrBC) proBA::pir116 λ−.
- linker refers to a chemical group or a molecule linking two molecules or domains, e.g., nCas9 and a deaminase and/or an oxidase.
- a linker joins a dCas9 and a modification domain (e.g., an oxidase).
- the linker is positioned between, or flanked by, two groups, molecules, or other domains and connected to each one via a covalent bond, thus connecting the two.
- the linker is an amino acid or a plurality of amino acids (e.g., a peptide or protein).
- the linker is an organic molecule, group, polymer, or chemical domain.
- Chemical domains include, but are not limited to, disulfide, hydrazone, thiol and azo domains.
- the linker is 5-100 amino acids in length, for example, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 30-35, 35-40, 40-45, 45-50, 50-60, 60-70, 70-80, 80-90, 90-100, 100-150, or 150-200 amino acids in length.
- Longer or shorter linkers are also contemplated.
- mutation refers to a substitution of a residue within a sequence, e.g., a nucleic acid or amino acid sequence, with another residue; a deletion or insertion of one or more residues within a sequence; or a substitution of a residue within a sequence of a genome in a subject to be corrected. Mutations are typically described herein by identifying the original residue followed by the position of the residue within the sequence and by the identity of the newly substituted residue.
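The naming convention described above (original residue, then position, then the newly substituted residue, as in D10A or H840A) can be parsed and applied mechanically. The following is a minimal sketch that handles only single-residue substitutions written with one-letter codes and 1-based positions; the function names and the 15-residue toy sequence are hypothetical and are not taken from any SEQ ID NO of the disclosure.

```python
import re

# One-letter original residue, 1-based position, one-letter new residue.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(label: str) -> tuple[str, int, str]:
    """Split a label such as 'D10A' into ('D', 10, 'A')."""
    match = _SUBSTITUTION.match(label)
    if match is None:
        raise ValueError(f"not a simple substitution label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def apply_mutation(sequence: str, label: str) -> str:
    """Return the sequence with the substitution applied, checking the original residue."""
    original, position, new = parse_mutation(label)
    index = position - 1  # labels are 1-based, Python strings are 0-based
    if index < 0 or index >= len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[index] != original:
        raise ValueError(
            f"expected {original} at position {position}, found {sequence[index]}"
        )
    return sequence[:index] + new + sequence[index + 1:]


# Toy sequence with an aspartate (D) at position 10.
toy_sequence = "MDKKYSIGLDIGTNS"
mutated = apply_mutation(toy_sequence, "D10A")
```

Checking the original residue before substituting guards against applying a label to the wrong reference sequence or numbering scheme.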
- Mutations can include a variety of categories, such as single base polymorphisms, microduplication regions, indels, and inversions, and the term is not meant to be limiting in any way. Mutations can include “loss-of-function” mutations, which are the normal result of a mutation that reduces or abolishes a protein activity.
- loss-of-function mutations are recessive, because in a heterozygote the second chromosome copy carries an unmutated version of the gene coding for a fully functional protein whose presence compensates for the effect of the mutation.
- a loss-of-function mutation is dominant, one example being haploinsufficiency, where the organism is unable to tolerate the approximately 50% reduction in protein activity suffered by the heterozygote.
- This is the explanation for a few genetic diseases in humans, including Marfan syndrome which results from a mutation in the gene for the connective tissue protein called fibrillin.
- Mutations also embrace “gain-of-function” mutations, each of which confers an abnormal activity on a protein or cell that is otherwise not present in a normal condition.
- gain-of-function mutations are in regulatory sequences rather than in coding regions, and can therefore have a number of consequences. For example, a mutation might lead to one or more genes being expressed in the wrong tissues, these tissues gaining functions that they normally lack. Alternatively the mutation could lead to overexpression of one or more genes involved in control of the cell cycle, thus leading to uncontrolled cell division and hence to cancer. Because of their nature, gain-of-function mutations are usually dominant.
- nucleic acid molecules or polypeptides e.g., Cas9 or oxidases
- the nucleic acid molecule or the polypeptide is at least substantially free from at least one other component with which they are naturally associated in nature and/or as found in nature (e.g., an amino acid sequence not found in nature).
- nucleic acid refers to RNA as well as single and/or double-stranded DNA. Nucleic acids may be naturally occurring, for example, in the context of a genome, a transcript, an mRNA, tRNA, rRNA, siRNA, snRNA, a plasmid, cosmid, chromosome, chromatid, or other naturally occurring nucleic acid molecule.
- a nucleic acid molecule may be a non-naturally occurring molecule, e.g., a recombinant DNA or RNA, an artificial chromosome, an engineered genome, or fragment thereof, or a synthetic DNA, RNA, DNA/RNA hybrid, or including non-naturally occurring nucleotides or nucleosides.
- the terms“nucleic acid,”“DNA,”“RNA,” and/or similar terms include nucleic acid analogs, e.g., analogs having other than a phosphodiester backbone. Nucleic acids may be purified from natural sources, produced using recombinant expression systems and optionally purified, chemically synthesized, etc.
- nucleic acids may comprise nucleoside analogs such as analogs having chemically modified bases or sugars, and backbone modifications.
- a nucleic acid sequence is presented in the 5' to 3' direction unless otherwise indicated.
- a nucleic acid is or comprises natural nucleosides (e.g.
- nucleoside analogs e.g., 2-aminoadenosine, 2-thiothymidine, inosine, pyrrolo-pyrimidine, 3-methyl adenosine, 5-methylcytidine, 2-aminoadenosine, C5-bromouridine, C5-fluorouridine, C5-iodouridine, C5-propynyl-uridine, C5-propynyl-cytidine, C5-methylcytidine, 2-aminoadenosine, 7-deazaadenosine, 7-deazaguanosine, 8-oxoadenosine, 8-oxoguanosine, O(6)-methylguanine, and 2-thiocy
- nucleic acid programmable D/RNA binding protein
- nucleic acid molecules (which may broadly be referred to as a “napR/DNAbp-programming nucleic acid molecule” and include, for example, guide RNA in the case of Cas systems)
- a specific target nucleotide sequence e.g., a gene locus of a genome
- napR/DNAbp embraces CRISPR-Cas9 proteins, as well as Cas9 equivalents, homologs, orthologs, or paralogs, whether naturally occurring or non-naturally occurring (e.g., engineered or modified), and may include a Cas9 equivalent from any type of CRISPR system (e.g., type II, V, VI), including Cpf1 (a type V CRISPR-Cas system), C2c1 (a type V CRISPR-Cas system), C2c2 (a type VI CRISPR-Cas system), C2c3 (a type V CRISPR-Cas system), dCas9, GeoCas9, CjCas9, Cas12a, Cas12b, Cas12c, Cas12d, Cas12g, Cas12h, Cas12i, Cas13b, Cas13c, Cas13d, Cas14, Csn2, Argonaute (Ago), and nCas9
- the term also embraces Cas homologs and variants such as an xCas9, an SpCas9-NG, an LbCas12a, an AsCas12a, a Cas9-KKH, a circularly permuted Cas9, a SmacCas9, a Spy-macCas9. Further Cas-equivalents are described in Makarova et al., “C2c2 is a single-component
- nucleic acid programmable DNA binding proteins that may be used in connection with this disclosure are not limited to CRISPR-Cas systems.
- the disclosure embraces any such programmable protein, such as the Argonaute protein from Natronobacterium gregoryi (NgAgo) which may also be used for DNA-guided genome editing.
- napR/DNAbp is an RNA-programmable nuclease, which, when in a complex with an RNA, may be referred to as a nuclease:RNA complex.
- the bound RNA(s) is referred to as a guide RNA (gRNA).
- gRNAs can exist as a complex of two or more RNAs, or as a single RNA molecule.
- gRNAs that exist as a single RNA molecule may be referred to as single-guide RNAs (sgRNAs), though “gRNA” is used interchangeably to refer to guide RNAs that exist as either single molecules or as a complex of two or more molecules.
- gRNAs that exist as single RNA species comprise two domains: (1) a domain that shares homology to a target nucleic acid (e.g., and directs binding of a Cas9 (or equivalent) complex to the target); and (2) a domain that binds a Cas9 protein.
- domain (2) corresponds to a sequence known as a tracrRNA, and comprises a stem-loop structure.
- domain (2) is homologous to a tracrRNA as depicted in Figure 1E of Jinek et al., Science 337:816-821 (2012), the entire contents of which are incorporated herein by reference.
- gRNAs (e.g., those including domain 2) can be found in U.S. Patent No. 9,340,799, entitled “mRNA-Sensing Switchable gRNAs,” and International Patent Application No.
- a gRNA comprises two or more of domains (1) and (2), and may be referred to as an“extended gRNA.”
- an extended gRNA will, e.g., bind two or more Cas9 proteins and bind a target nucleic acid at two or more distinct regions, as described herein.
- the gRNA comprises a nucleotide sequence that complements a target site, which mediates binding of the nuclease/RNA complex to said target site, providing the sequence specificity of the nuclease:RNA complex.
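How a guide sequence and a PAM together specify a target site can be sketched in a few lines. The sketch assumes a 20-nucleotide targeting domain and an NGG PAM immediately 3' of the protospacer, which is the canonical arrangement for S. pyogenes Cas9; other napDNAbps use different PAMs and spacer lengths, and only the given strand is scanned here. The function names and the toy DNA string are hypothetical.

```python
def find_target_sites(dna: str, spacer_len: int = 20) -> list[tuple[int, str, str]]:
    """Return (start, protospacer, PAM) for every site followed by an NGG PAM.

    Scans only the strand given, written 5' to 3'.
    """
    dna = dna.upper()
    pam_len = 3
    sites = []
    for start in range(len(dna) - spacer_len - pam_len + 1):
        protospacer = dna[start:start + spacer_len]
        pam = dna[start + spacer_len:start + spacer_len + pam_len]
        if pam[1:] == "GG":  # N can be any base
            sites.append((start, protospacer, pam))
    return sites


def targeting_domain_rna(protospacer: str) -> str:
    """RNA targeting domain for a protospacer.

    The targeting domain has the same sequence as the protospacer (with U in
    place of T), so it base-pairs with the opposite, complementary DNA strand.
    """
    return protospacer.upper().replace("T", "U")


# Toy 25-nt sequence: a 20-nt protospacer, an AGG PAM, and two extra bases.
toy_dna = "ACGTACGTACGTACGTACGT" + "AGG" + "TT"
sites = find_target_sites(toy_dna)
guide = targeting_domain_rna(sites[0][1])
```

A fuller search would also scan the reverse complement and filter candidate guides for off-target matches; both are omitted to keep the sketch short.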
- the RNA-programmable nuclease is the (CRISPR-associated system) Cas9 endonuclease, for example Cas9 (Csn1) from Streptococcus pyogenes (see, e.g., “Complete genome sequence of an M1 strain of Streptococcus pyogenes” Ferretti J.J. et al., Proc. Natl. Acad. Sci. U.S.A. 98:4658-4663 (2001); “CRISPR RNA maturation by trans-encoded small RNA and host factor RNase III.” Deltcheva E.
- because the napR/DNAbp nucleases (e.g., Cas9) use RNA:DNA hybridization to target DNA cleavage sites, these proteins are able to be targeted, in principle, to any sequence specified by the guide RNA.
- Methods of using napR/DNAbp nucleases, such as Cas9, for site-specific cleavage (e.g., to modify a genome) are known in the art (see e.g., Cong, L. et al. Multiplex genome engineering using CRISPR/Cas systems. Science 339, 819-823 (2013); Mali, P. et al. RNA-guided human genome engineering via Cas9. Science 339, 823-826 (2013); Hwang, W.Y.
- napR/DNAbp-programming nucleic acid molecule or equivalently “guide sequence” refers to the one or more nucleic acid molecules which associate with and direct or otherwise program a napR/DNAbp protein to localize to a specific target nucleotide sequence (e.g., a gene locus of a genome) that is complementary to the one or more nucleic acid molecules (or a portion or region thereof) associated with the protein, thereby causing the napR/DNAbp protein to bind to the nucleotide sequence at the specific target site.
- a non-limiting example is a guide RNA of a Cas protein of a CRISPR-Cas genome editing system.
- a nuclear localization signal or sequence is an amino acid sequence that tags, designates, or otherwise marks a protein for import into the cell nucleus by nuclear transport. Typically, this signal consists of one or more short sequences of positively charged lysines or arginines exposed on the protein surface. Different nuclear localized proteins may share the same NLS. An NLS has the opposite function of a nuclear export signal (NES), which targets proteins out of the nucleus. Thus, a single nuclear localization signal can direct the entity with which it is associated to the nucleus of a cell.
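The observation that an NLS typically consists of short runs of positively charged lysines or arginines suggests a simple scan for lysine/arginine-rich windows. The sketch below is a crude heuristic for illustration, not a validated NLS predictor: the window size of 7 and the minimum of 5 basic residues are assumptions, and the function name is hypothetical. The test sequence embeds PKKKRKV, the well-known SV40 large T antigen NLS, between arbitrary flanking residues.

```python
def basic_clusters(protein: str, window: int = 7, min_basic: int = 5) -> list[tuple[int, str]]:
    """Return (start, segment) for each window holding at least `min_basic` K/R residues."""
    protein = protein.upper()
    hits = []
    for start in range(len(protein) - window + 1):
        segment = protein[start:start + window]
        basic = segment.count("K") + segment.count("R")
        if basic >= min_basic:
            hits.append((start, segment))
    return hits


# Arbitrary flanks around the SV40 large T antigen NLS (PKKKRKV).
toy_protein = "MASGS" + "PKKKRKV" + "GSGSA"
hits = basic_clusters(toy_protein)
```

Overlapping windows around the same basic stretch are reported separately; merging them into one candidate region would be a natural refinement.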
- nucleobase modification domain or“modification domain” embraces any protein, enzyme, or polypeptide (or functional fragment thereof) which is capable of modifying a DNA or RNA molecule. Nucleobase modification domains may be naturally occurring, or may be engineered.
- a nucleobase modification domain can include one or more DNA repair enzymes, for example, an enzyme or protein involved in base excision repair (BER), nucleotide excision repair (NER), homology-dependent recombinational repair (HR), non-homologous end-joining repair (NHEJ), microhomology end-joining repair (MMEJ), mismatch repair (MMR), direct reversal repair, or other known DNA repair pathway.
- a nucleobase modification domain can have one or more types of enzymatic activities, including, but not limited to, endonuclease activity, polymerase activity, ligase activity, replication activity, and proofreading activity.
- Nucleobase modification domains can also include DNA or RNA-modifying enzymes and/or mutagenic enzymes, such as DNA deaminases and oxidizing enzymes (i.e., adenosine deaminases and inosine oxidases, respectively), which covalently modify nucleobases leading in some cases to mutagenic corrections by way of normal cellular DNA repair and replication processes.
- Exemplary nucleobase modification domains include, but are not limited to an adenosine deaminase, an inosine oxidase, a nuclease, a nickase, a recombinase, a
- the nucleobase modification domain is an oxidase (e.g., AlkBH3).
- oligonucleotide and“polynucleotide” can be used interchangeably to refer to a polymer of nucleotides (e.g., a string of at least three
- phage-assisted continuous evolution (PACE) refers to continuous evolution that employs phage as viral vectors.
- the general concept of PACE technology has been described, for example, in International PCT Application, PCT/US2009/056194, filed September 8, 2009, published as WO 2010/028347 on March 11, 2010; International PCT Application, PCT/US2011/066747, filed December 22, 2011, published as WO 2012/088381 on June 28, 2012; U.S. Application, U.S. Patent No.
- PANCE phage-assisted non-continuous evolution
- SP evolving‘selection phage’
- promoter refers to a nucleic acid molecule with a sequence recognized by the cellular transcription machinery and able to initiate transcription of a downstream gene.
- a promoter can be constitutively active, meaning that the promoter is always active in a given cellular context, or conditionally active, meaning that the promoter is only active in the presence of a specific condition.
- conditional promoter may only be active in the presence of a specific protein that connects a protein associated with a regulatory element in the promoter to the basic transcriptional machinery, or only in the absence of an inhibitory molecule.
- a subclass of conditionally active promoters are inducible promoters that require the presence of a small molecule“inducer” for activity.
- inducible promoters include, but are not limited to, arabinose-inducible promoters, Tet-on promoters, and tamoxifen-inducible promoters.
- constitutive, conditional, and inducible promoters are well known to the skilled artisan, and the skilled artisan will be able to ascertain a variety of such promoters useful in carrying out the instant disclosure, which is not limited in this respect.
- the disclosure provides vectors with appropriate promoters for driving expression of the nucleic acid sequences encoding the base editor fusion proteins (or one or more individual components thereof).
- bacteriophage refers to a virus that infects bacterial cells.
- phages consist of an outer protein capsid enclosing genetic material.
- the genetic material may be ssRNA, dsRNA, ssDNA, or dsDNA, in either linear or circular form.
- Phages and phage vectors are well known to those of skill in the art and non-limiting examples of phages that are useful for carrying out the methods provided herein are λ, T2, T4, T7, T12, R17, M13, MS2, G4, P1,
- the phage utilized in the present disclosure is M13. Additional suitable phages and host cells will be apparent to those of skill in the art and the disclosure is not limited in this aspect. For an exemplary description of additional suitable phages and host cells, see Elizabeth Kutter and Alexander Sulakvelidze: Bacteriophages: Biology and Applications. CRC Press; 1st edition (December 2004), ISBN: 0849313368; Martha R. J. Clokie and Andrew M.
- the terms “protein,” “peptide,” and “polypeptide” are used interchangeably herein, and refer to a polymer of amino acid residues linked together by peptide (amide) bonds.
- the terms refer to a protein, peptide, or polypeptide of any size, structure, or function. Typically, a protein, peptide, or polypeptide will be at least three amino acids long.
- a protein, peptide, or polypeptide may refer to an individual protein or a collection of proteins.
- One or more of the amino acids in a protein, peptide, or polypeptide may be modified, for example, by the addition of a chemical entity such as a carbohydrate group, a hydroxyl group, a phosphate group, a farnesyl group, an isofarnesyl group, a fatty acid group, a linker for conjugation, functionalization, or other modification, etc.
- a protein, peptide, or polypeptide may also be a single molecule or may be a multi-molecular complex.
- a protein, peptide, or polypeptide may be just a fragment of a naturally occurring protein or peptide.
- a protein, peptide, or polypeptide may be naturally occurring, engineered, or synthetic, or any combination thereof.
- the term “fusion protein” as used herein refers to a hybrid polypeptide which comprises protein domains from at least two different proteins. One protein may be located at the amino-terminal (N-terminal) portion of the fusion protein or at the carboxy-terminal (C-terminal) portion, thus forming an “amino-terminal fusion protein” or a “carboxy-terminal fusion protein,” respectively.
- a protein may comprise different domains, for example, a nucleic acid binding domain (e.g., the gRNA binding domain of Cas9 that directs the binding of the protein to a target site) and a nucleic acid cleavage domain or a catalytic domain of a recombinase.
- a protein comprises a proteinaceous part, e.g., an amino acid sequence constituting a nucleic acid binding domain, and an organic compound, e.g., a compound that can act as a nucleic acid cleavage agent.
- a protein is in a complex with, or is in association with, a nucleic acid, e.g., RNA.
- any of the proteins provided herein may be produced by any method known in the art.
- the proteins provided herein may be produced via recombinant protein expression and purification, which is especially suited for fusion proteins comprising a peptide linker.
- Methods for recombinant protein expression and purification are well known, and include those described by Green and Sambrook, Molecular Cloning: A Laboratory Manual (4th ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (2012)), the entire contents of which are incorporated herein by reference.
- recombinant protein or nucleic acid molecule comprises an amino acid or nucleotide sequence that comprises at least one, at least two, at least three, at least four, at least five, at least six, or at least seven mutations as compared to any naturally occurring sequence.
- the term“subject,” as used herein, refers to an individual organism, for example, an individual mammal.
- the subject is a human.
- the subject is a non-human mammal.
- the subject is a non-human primate.
- the subject is a rodent.
- the subject is a sheep, a goat, a cattle, a cat, or a dog.
- the subject is a vertebrate, an amphibian, a reptile, a fish, an insect, a fly, or a nematode.
- the subject is a research animal.
- the subject is genetically engineered, e.g., a genetically engineered non-human subject.
- the subject may be of either sex and at any stage of development.
- the term “target site” refers to a sequence within a nucleic acid molecule that is edited by a base editor (e.g., a dCas9-deaminase-oxidase fusion protein provided herein).
- the target site further refers to the sequence within a nucleic acid molecule to which a complex of the base editor and gRNA binds.
- the term “vector,” as used herein, may refer to a nucleic acid that has been modified to encode a gene of interest and that is able to enter into a host cell, mutate and replicate within the host cell, and then transfer a replicated form of the vector into another host cell.
- the term “vector” as used herein may refer to a nucleic acid that has been modified to encode the base editor.
- Exemplary suitable vectors include viral vectors, such as retroviral vectors or bacteriophages and filamentous phage, and conjugative plasmids.
- viral particle refers to a viral genome, for example, a DNA or RNA genome, that is associated with a coat of a viral protein or proteins, and, in some cases, with an envelope of lipids.
- a phage particle comprises a phage genome packaged into a protein encoded by the wild type phage genome.
- the term “viral vector,” as used herein, refers to a nucleic acid comprising a viral genome that, when introduced into a suitable host cell, can be replicated and packaged into viral particles able to transfer the viral genome into another host cell.
- the term “viral vector” extends to vectors comprising truncated or partial viral genomes.
- a viral vector is provided that lacks a gene encoding a protein essential for the generation of infectious viral particles.
- in suitable host cells, for example, host cells comprising the lacking gene under the control of a conditional promoter, however, such truncated viral vectors can replicate and generate viral particles able to transfer the truncated viral genome into another host cell.
- the viral vector is an adeno-associated virus (AAV) vector.
- AAV adeno-associated virus
- treatment refers to a clinical intervention aimed to reverse, alleviate, delay the onset of, or inhibit the progress of a disease, disorder, or condition, or one or more symptoms thereof, as described herein.
- treatment may be any clinical intervention aimed to reverse, alleviate, delay the onset of, or inhibit the progress of a disease, disorder, or condition, or one or more symptoms thereof, as described herein.
- treatment may be administered after one or more symptoms have developed and/or after a disease has been diagnosed.
- treatment may be administered in the absence of symptoms, e.g., to prevent or delay onset of a symptom or inhibit onset or progression of a disease.
- treatment may be administered to a susceptible individual prior to the onset of symptoms (e.g., in light of a history of symptoms and/or in light of genetic or other susceptibility factors). Treatment may also be continued after symptoms have resolved, for example, to prevent or delay their recurrence.
- the term “variant” refers to a protein having characteristics that deviate from what occurs in nature that retains at least one functional (i.e., binding, interaction, or enzymatic) activity and/or therapeutic property thereof.
- A “variant” is at least about 70% identical, at least about 80% identical, at least about 90% identical, at least about 95% identical, at least about 96% identical, at least about 97% identical, at least about 98% identical, at least about 99% identical, at least about 99.5% identical, or at least about 99.9% identical to the wild type protein.
- a variant of Cas9 may comprise a Cas9 that has one or more changes in amino acid residues as compared to a wild type Cas9 amino acid sequence.
- a variant of a deaminase may comprise a deaminase that has one or more changes in amino acid residues as compared to a wild type deaminase amino acid sequence, e.g. following ancestral sequence reconstruction of the deaminase.
- changes include chemical modifications, substitutions of different amino acid residues, truncations, covalent additions (e.g. of a tag), and any other changes.
- This term also embraces fragments of a wild type protein.
- the level or degree of which the property is retained may be reduced relative to the wild type protein but is typically the same or similar in kind. Generally, variants are overall very similar, and in many regions, identical to the amino acid sequence of the protein described herein. A skilled artisan will appreciate how to make and use variants that maintain all, or at least some, of a functional ability or property.
- the variant proteins may comprise, or alternatively consist of, an amino acid sequence which is at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100%, identical to, for example, the amino acid sequence of a wild-type protein, or any protein provided herein.
- Further polypeptides encompassed by the invention are polypeptides encoded by polynucleotides which hybridize to the complement of a nucleic acid molecule encoding a protein such as a napDNAbp under stringent hybridization conditions (e.g. hybridization to filter bound DNA in 6x sodium chloride/sodium citrate (SSC) at about 45 degrees Celsius, followed by one or more washes in 0.2x SSC, 0.1% SDS at about 50-65 degrees Celsius).
- highly stringent conditions e.g. hybridization to filter bound DNA in 6x sodium chloride/sodium citrate (SSC) at about 45 degrees Celsius, followed by one or more washes in 0.1x SSC, 0.2% SDS at about 68 degrees Celsius
- other stringent hybridization conditions which are known to those of skill in the art (see, for example, Ausubel, F. M. et al., eds., 1989 Current Protocols in Molecular Biology, Green publishing associates, Inc., and John Wiley & Sons Inc., New York, at pp. 6.3.1-6.3.6 and 2.10.3).
- a polypeptide having an amino acid sequence at least, for example, 95% “identical” to a query amino acid sequence it is intended that the amino acid sequence of the subject polypeptide is identical to the query sequence except that the subject polypeptide sequence may include up to five amino acid alterations per each 100 amino acids of the query amino acid sequence.
- up to 5% of the amino acid residues in the subject sequence may be inserted, deleted, or substituted with another amino acid.
- These alterations of the reference sequence may occur at the amino- or carboxy-terminal positions of the reference amino acid sequence or anywhere between those terminal positions, interspersed either individually among residues in the reference sequence or in one or more contiguous groups within the reference sequence.
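The identity threshold described above can be read as a budget of allowed alterations. The following is a minimal sketch of that arithmetic; the function name `max_alterations` is hypothetical and not part of the disclosure, and exact fractions are used only to avoid floating-point rounding at thresholds such as 99.9%.

```python
import math
from fractions import Fraction


def max_alterations(query_length: int, percent_identity: float) -> int:
    """Largest number of inserted, deleted, or substituted residues that
    still satisfies "at least percent_identity % identical" for a query
    of the given length (illustrative helper, name is an assumption)."""
    if query_length < 0:
        raise ValueError("query_length must be non-negative")
    if not 0 <= percent_identity <= 100:
        raise ValueError("percent_identity must be between 0 and 100")
    # Exact arithmetic: (100 - identity) percent of the query length.
    allowed = Fraction(query_length) * (100 - Fraction(str(percent_identity))) / 100
    return math.floor(allowed)


# 95% identity over 100 residues allows up to five alterations.
print(max_alterations(100, 95))
```

For a 100-residue query at 95% identity this gives five alterations, matching the "up to five amino acid alterations per each 100 amino acids" wording.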
- any particular polypeptide is at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identical to, for instance, the amino acid sequence of a protein such as a napDNAbp, can be determined conventionally using known computer programs.
- a preferred method for determining the best overall match between a query sequence (a sequence of the present invention) and a subject sequence, also referred to as a global sequence alignment, can be determined using the FASTDB computer program based on the algorithm of Brutlag et al. (Comp. App. Biosci. 6:237-245 (1990)).
- the query and subject sequences are either both nucleotide sequences or both amino acid sequences.
- the percent identity is corrected by calculating the number of residues of the query sequence that are N- and C-terminal of the subject sequence, which are not matched/aligned with a corresponding subject residue, as a percent of the total bases of the query sequence. Whether a residue is matched/aligned is determined by results of the FASTDB sequence alignment.
- This percentage is then subtracted from the percent identity, calculated by the above FASTDB program using the specified parameters, to arrive at a final percent identity score.
- This final percent identity score is what is used for the purposes of the present invention. Only residues to the N- and C-termini of the subject sequence, which are not matched/aligned with the query sequence, are considered for the purposes of manually adjusting the percent identity score. That is, only query residue positions outside the farthest N- and C-terminal residues of the reference sequence.
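The manual correction described above can be sketched as follows. This is an illustration under stated assumptions, not the FASTDB program itself: the function names are hypothetical, the raw percent identity is taken as an input rather than computed, and the aligned subject is assumed to be given as a string in which `-` marks query positions with no subject residue.

```python
def count_unmatched_terminal(aligned_subject: str, gap: str = "-") -> int:
    """Count query positions lying outside the farthest N- and C-terminal
    residues of the subject, i.e. leading plus trailing gaps in the
    aligned subject string (assumed gapped-alignment representation)."""
    leading = len(aligned_subject) - len(aligned_subject.lstrip(gap))
    if leading == len(aligned_subject):
        return leading  # the subject row contains only gaps
    trailing = len(aligned_subject) - len(aligned_subject.rstrip(gap))
    return leading + trailing


def corrected_percent_identity(raw_percent_identity: float,
                               unmatched_terminal: int,
                               query_length: int) -> float:
    """Subtract the unmatched terminal query residues, expressed as a
    percentage of the query length, from the raw alignment score."""
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    if not 0 <= unmatched_terminal <= query_length:
        raise ValueError("unmatched_terminal must lie within the query length")
    penalty = 100.0 * unmatched_terminal / query_length
    return max(0.0, raw_percent_identity - penalty)


# A 10-residue query whose first three positions have no subject residue.
unmatched = count_unmatched_terminal("---DEFGHIJ")
print(corrected_percent_identity(100.0, unmatched, 10))
```

Internal gaps are deliberately ignored by `count_unmatched_terminal`, because the text limits the manual adjustment to positions outside the subject's terminal residues.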
- wild type is a term of the art understood by skilled persons and means the typical form of an organism, strain, gene or characteristic as it occurs in nature as distinguished from mutant or variant forms.
- the present disclosure provides adenine-to-thymine or “ATBE” (or thymine-to-adenine or “TABE”) transversion base editors which comprise a napDNAbp (e.g., a dCas9 domain) fused to a nucleobase modification domain.
- the nucleobase modification domain comprises an adenosine deaminase and an oxidase.
- the ATBE transversion base editors are capable of converting an A:T nucleobase pair to a T:A nucleobase pair in a target nucleotide sequence of interest, e.g., the genome of a cell.
- the disclosed base editors comprise a deaminase-oxidase domain, or variant thereof, that catalyzes the conversion of a target adenine to a thymine via deamination and oxidation reactions.
- the disclosed base editors also comprise TABE transversion base editors that comprise a deaminase-oxidase domain, or variant thereof, that catalyzes the conversion of a target adenine to a thymine via deamination and oxidation, wherein the base-paired thymine of the non-edited strand is subsequently converted to an adenine by the concerted action of the cell’s mismatch repair factors.
- a targeted A in a nucleic acid of interest is first enzymatically deaminated to an inosine. Enzyme-catalyzed oxidation of the newly formed inosine is induced, resulting in formation of 8-oxo-inosine (8-oxo-I). Steric rotation of the 8-oxo-I around the glycosidic bond is induced, presenting the Hoogsteen edge for base pairing. During replication or repair of the unmutated strand (which may be induced by a Cas9 nickase in some embodiments), the 8-oxo-I is paired with thymine by a
- the cell recognizes the mismatch between 8-oxo-I and the thymine on the unmutated strand and converts the thymine to an adenine.
- the 8-oxo-I is converted to a thymine.
- a desired A-to-T transversion is thus achieved.
- Adenine oxidation is achieved by the targeted use of a fusion protein comprising a Cas9 (e.g., dCas9 or nCas9) domain, an adenosine deaminase domain, an inosine oxidase domain, and optionally linkers interconnecting these domains (see FIG. 1A).
- the adenosine deaminase and oxidase domains of the disclosed base editors may comprise variants of wild-type deaminase and oxidase enzymes, respectively. These variants may comprise an amino acid sequence that is at least about 70% identical, at least about 80% identical, at least about 90% identical, at least about 95% identical, at least about 96% identical, at least about 97% identical, at least about 98% identical, at least about 99% identical, at least about 99.5% identical, or at least about 99.9% identical to the wild type enzyme.
- the adenosine deaminase and oxidase domains may comprise an amino acid sequence having 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, or more than 30 amino acids that differ relative to the amino acid sequence of the wild type enzyme. These differences may comprise amino acids that have been inserted, deleted, or substituted relative to the amino acid sequence of the wild type enzyme.
- the disclosed adenosine deaminase and oxidase domains contain stretches of about 50, about 75, about 100, about 125, about 150, about 175, about 200, about 300, about 400, about 500, or more than 500 consecutive amino acids in common with the wild type enzyme.
- the adenosine deaminase and oxidase domains comprise truncations at the N-terminus or C-terminus relative to the wild-type enzyme. In some embodiments, the adenosine deaminase and oxidase domains comprise truncations of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, or more than 30 amino acids at the N-terminus or C-terminus relative to the wild-type or base sequence.
- the oxidase is an AlkB, or a variant thereof. In certain embodiments, the oxidase is a bacterial AlkB, or a variant thereof. In other embodiments, the oxidase is a human AlkBH, or a variant thereof. In certain embodiments, the oxidase is a human AlkBH1, AlkBH2, AlkBH3, AlkBH4, AlkBH5, AlkBH7, AlkBH8, or a variant thereof.
- the oxidase is a xanthine dehydrogenase, or a variant thereof.
- the xanthine dehydrogenase is a Streptomyces cyanogenus xanthine dehydrogenase (ScXDH), or a variant thereof.
- the xanthine dehydrogenase or variant thereof is derived from C. capitata, N. crassa, M. hansupus, E. cloacae, S. noursei, S. albulus, S. himastatinicus, or S. lividans.
- the oxidase is a cytochrome P450 enzyme, or a variant thereof.
- the oxidase is a human CYP1A2, CYP2A4, or CYP3A6, or a variant thereof.
- the oxidase is a TET-oxidase, or a variant thereof.
- the oxidase is a human TET1, TET1-CD, TET2, or TET3, or a variant thereof. In other embodiments, the oxidase is a human FTO (alpha-ketoglutarate-dependent dioxygenase), or a variant thereof.
- the present disclosure provides for A:T to T:A transversion base editors which satisfy a need in the art for the installation of targeted transversions in a target nucleotide sequence, e.g., a genome.
- A:T to T:A base editors e.g., fusion proteins comprising a dCas9 domain and an adenosine deaminase-oxidase domain
- transversions, particularly A:T to T:A transversions.
- the disclosure provides compositions comprising the transversion base editors as described herein, e.g., fusion proteins comprising a dCas9 domain and an adenosine deaminase-oxidase domain.
- the present disclosure provides for nucleic acid molecules encoding and/or expressing the transversion base editors as described herein, as well as expression vectors and constructs for expressing the transversion base editors described herein, host cells comprising said nucleic acid molecules and expression vectors, and compositions for delivering and/or administering nucleic acid-based embodiments described herein.
- the present disclosure provides for methods of making the transversion base editors, as well as methods of using the transversion base editors or nucleic acid molecules encoding the transversion base editors in applications including editing a nucleic acid molecule, e.g., a genome.
- methods of engineering the transversion base editors provided herein include a phage-assisted continuous evolution (PACE) system or non-continuous system (e.g., PANCE), which may be utilized to evolve one or more components of a base editor (e.g., a Cas9 domain or an adenosine deaminase-oxidase domain).
- methods of making the base editors comprise recombinant protein expression methodologies known to one of ordinary skill in the art.
- the specification also provides methods for editing a target nucleic acid molecule, e.g., a single nucleobase within a genome, with a base editing system described herein (e.g., in the form of an evolved base editor as described herein, or a vector or construct encoding same).
- Such methods involve transducing (e.g., via transfection) cells with a plurality of complexes each comprising a fusion protein (e.g., a fusion protein comprising a Cas9 nickase (nCas9) domain and an adenosine deaminase and oxidase domain) and a gRNA molecule.
- the gRNA is bound to the napDNAbp domain (e.g., nCas9 domain) of the fusion protein.
- each gRNA comprises a guide sequence of at least 10 contiguous nucleotides (e.g., 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 contiguous nucleotides) that is complementary to a target sequence.
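The guide-sequence requirement above (at least 10 contiguous nucleotides complementary to a target sequence) can be checked with a short helper. This is a simplified sketch: the function names and example sequences are hypothetical, the whole guide is required to pair with the target strand (stricter than the text, which only requires a complementary stretch of at least 10 nucleotides), and only canonical A/C/G/U bases are accepted.

```python
# Watson-Crick partner of each RNA base on a DNA strand.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A"}


def reverse_complement_as_dna(guide_rna: str) -> str:
    """DNA sequence (5'->3') that base-pairs with the guide RNA (5'->3')."""
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(guide_rna.upper()))


def is_valid_guide(guide_rna: str, target_strand_dna: str, min_length: int = 10) -> bool:
    """True if the guide is long enough, uses only A/C/G/U, and is fully
    complementary to some stretch of the target DNA strand."""
    guide = guide_rna.upper()
    if len(guide) < min_length:
        return False
    if set(guide) - set(RNA_TO_DNA_COMPLEMENT):
        return False
    return reverse_complement_as_dna(guide) in target_strand_dna.upper()


# Made-up 12-nt guide and target strand, for illustration only.
print(is_valid_guide("GACUUCAGGCAU", "TTATGCCTGAAGTCAA"))
```

A real design workflow would also consider the PAM and mismatch tolerance, neither of which is modeled here.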
- the methods involve the transfection of nucleic acid constructs (e.g., plasmids) that each (or together) encode the components of a complex of fusion protein and gRNA molecule.
- a nucleic acid construct that encodes the fusion protein is transfected into the cell separately from the plasmid that encodes the gRNA molecule. In certain embodiments, these components are encoded on a single construct and transfected together.
- the methods disclosed herein involve the introduction into cells of a complex comprising a fusion protein and gRNA molecule that has been expressed and cloned outside of these cells.
- any fusion protein e.g., any of the fusion proteins provided herein, may be introduced into the cell in any suitable way, either stably or transiently.
- a fusion protein may be transfected into the cell.
- the cell may be transduced or transfected with a nucleic acid construct that encodes a fusion protein.
- a cell may be transduced (e.g., with a virus encoding a fusion protein), or transfected (e.g., with a plasmid encoding a fusion protein) with a nucleic acid that encodes a fusion protein, or the translated fusion protein.
- transduction may be a stable or transient transduction.
- cells expressing a fusion protein or containing a fusion protein may be transduced or transfected with one or more gRNA molecules, for example when the fusion protein comprises a Cas9 (e.g., nCas9) domain.
- a plasmid expressing a fusion protein may be introduced into cells through electroporation, transient (e.g., lipofection) and stable genome integration (e.g., piggyBac) and viral transduction or other methods known to those of skill in the art.
- the methods described above result in cutting (or nicking) one strand of the double-stranded DNA, for example, the strand that includes the thymine (T) of the target A:T nucleobase pair opposite the strand containing the target adenine (A) that is being deaminated.
- This nicking result serves to direct mismatch repair machinery to the non-edited strand, ensuring that the chemically modified nucleobase is not interpreted as a lesion by the machinery.
- This nick may be created by the use of an nCas9.
- the specification also provides methods for efficiently editing a target nucleic acid molecule, e.g., a single nucleobase of a genome, with a base editing system described herein (e.g., in the form of a base editor as described herein or a vector or construct encoding same), thereby installing a transversion edit.
- the disclosure provides therapeutic methods for treating a genetic disease and/or for altering or changing a genetic trait or condition by contacting a target nucleic acid molecule, e.g., a target nucleic acid molecule in the genome of an organism, with a base editing system (e.g., in the form of a base editor protein or a vector encoding same) and conducting base editing to treat the genetic disease and/or change the genetic trait (e.g., eye color).
- a method for editing a nucleobase pair of a double-stranded DNA sequence comprising: (i) contacting a double-stranded DNA sequence with a complex comprising a base editor and a guide nucleic acid, wherein the double-stranded DNA comprises a target A:T nucleobase pair; (ii) deaminating the adenine (A) of the A:T nucleobase pair to inosine, and (iii) oxidizing the inosine to 8-oxo-inosine (8-oxo-I).
- the 8-oxo-inosine is subsequently replaced with a thymine (T), thereby generating an A to T replacement.
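The ordered steps of the method above (deamination, oxidation, then replacement of 8-oxo-inosine with thymine) can be summarized as a small state table. This is a toy bookkeeping sketch, not a model of the enzymology: the step names `deaminate`, `oxidize`, and `replace` are hypothetical labels chosen for illustration.

```python
# Allowed (current base, step) -> resulting base, following the order in the text.
TRANSITIONS = {
    ("A", "deaminate"): "I",        # adenine to inosine
    ("I", "oxidize"): "8-oxo-I",    # inosine to 8-oxo-inosine
    ("8-oxo-I", "replace"): "T",    # 8-oxo-inosine replaced with thymine
}


def apply_steps(base: str, steps: list[str]) -> str:
    """Apply the named steps in order; reject any step that is not
    defined for the current base."""
    state = base
    for step in steps:
        key = (state, step)
        if key not in TRANSITIONS:
            raise ValueError(f"step {step!r} is not defined for base {state!r}")
        state = TRANSITIONS[key]
    return state


print(apply_steps("A", ["deaminate", "oxidize", "replace"]))  # prints T
```

The table makes the ordering explicit: oxidation is only defined on inosine, so attempting it directly on adenine raises an error.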
- the methods described above further comprise (iii) cutting (or nicking) one strand of the double-stranded DNA, for example, wherein the one strand comprises the A of the target T:A nucleobase pair, or the T of the A:T nucleobase pair.
- the present disclosure provides a complex comprising the base editor fusion proteins described herein and an RNA bound to the napDNAbp of the fusion protein, such as a guide RNA (gRNA), e.g. a single guide RNA.
- gRNA guide RNA
- the target nucleotide sequence may comprise a target sequence (e.g., a point mutation) associated with a disease, disorder, or condition, such as sickle cell anemia,
- the target sequence may comprise an A to T point mutation associated with a disease, disorder, or condition, and wherein the deamination and oxidation of the mutant A base results in mismatch repair-mediated correction to a sequence that is not associated with a disease, or disorder, or condition.
- the target sequence may instead comprise a T to A point mutation associated with a disease, or disorder, or condition, and wherein the deamination and oxidation of the mutant A base results in mismatch repair-mediated correction to a sequence that is not associated with the disease, or disorder, or condition.
- the target sequence may encode a protein, and where the point mutation is in a codon and results in a change in the amino acid encoded by the mutant codon as compared to a wild-type codon.
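How a single A-to-T change in a codon alters the encoded amino acid can be illustrated with the standard genetic code. The sketch below is illustrative only: the function name `amino_acid_change` is hypothetical, and the example codon change GAG to GTG (glutamate to valine) is the widely documented sickle cell substitution in HBB, used here as a familiar instance rather than taken from this disclosure's residue numbering.

```python
# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def amino_acid_change(codon: str, position: int, new_base: str) -> tuple[str, str]:
    """Return (original amino acid, amino acid after substituting new_base
    at the 0-based position of the DNA codon)."""
    codon = codon.upper()
    if len(codon) != 3 or position not in (0, 1, 2):
        raise ValueError("expected a 3-base codon and a position of 0, 1, or 2")
    mutant = codon[:position] + new_base.upper() + codon[position + 1:]
    return CODON_TABLE[codon], CODON_TABLE[mutant]


# A-to-T change at the middle position of GAG gives GTG.
print(amino_acid_change("GAG", 1, "T"))  # prints ('E', 'V')
```

Not every substitution changes the amino acid: `amino_acid_change("GGA", 2, "T")` returns `('G', 'G')`, a silent change.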
- the target sequence may also be at a splice site, and the point mutation results in a change in the splicing of an mRNA transcript as compared to a wild-type transcript.
- the target may be at a non-coding sequence of a gene, such as a promoter, and the point mutation results in increased or decreased expression of the gene.
- Exemplary target genes include HBB, in which an A to T point mutation at residue 334 results in a sickle cell anemia phenotype; and FANCC, in which an A to T point mutation at residue 456 results in a Fanconi anemia phenotype.
- Additional target genes include TGFBI (associated with lattice corneal dystrophy type III), PKP1 (associated with ectodermal dysplasia-skin fragility syndrome), KRAS and SOS1 (both associated with Noonan syndrome), for which the disease phenotype is frequently caused by T:A to A:T point mutations.
- application of the base editors results in the deamination-oxidation of a target site.
- the deamination-oxidation of a mutant A results in a change of the amino acid encoded by the mutant codon, which in some cases can result in the expression of a wild-type amino acid.
- the application of the base editors can also result in a change of the mRNA transcript, and even restoring the mRNA transcript to a wild-type state.
- the subject has been diagnosed with a disease, disorder, or condition, such as, but not limited to, a disease, disorder, or condition associated with a point mutation in the HBB gene, the TGFBI gene, the PKP1 gene, the KRAS gene, the SOS1 gene, or the FANCC gene.
- the specification discloses a pharmaceutical composition comprising any one of the presently disclosed base editor fusion proteins. In one aspect, the specification discloses a pharmaceutical composition comprising any one of the presently disclosed complexes of fusion proteins and gRNA. In one aspect, the specification discloses a pharmaceutical composition comprising polynucleotides encoding the fusion proteins disclosed herein and polynucleotides encoding a gRNA, or polynucleotides encoding both.
- the specification discloses a pharmaceutical composition comprising any one of the presently disclosed vectors.
- the pharmaceutical composition further comprises a pharmaceutically acceptable excipient.
- the pharmaceutical composition further comprises a lipid and/or polymer.
- the lipid and/or polymer is cationic. The preparation of such lipid particles is well known. See, e.g. U.S. Patent Nos. 4,880,635; 4,906,477;
- the present disclosure provides A-to-T (or T-to-A) transversion base editor fusion proteins comprising (i) a nucleic acid programmable DNA binding protein (napDNAbp), and (ii) a nucleobase modification domain capable of facilitating the conversion of an A:T nucleobase pair to a T:A nucleobase pair in a target nucleotide sequence, e.g., a genome.
- napDNAbp nucleic acid programmable DNA binding protein
- the nucleobase modification domain may be a fusion product of a deaminase and an inosine oxidase, which enzymatically converts the inosine product of a catalyzed deamination of an adenine nucleobase in an A:T nucleobase pair to 8-oxo-inosine, which then is subsequently processed by the cell’s DNA repair and replication machinery to a thymine, thereby converting the A:T nucleobase pair to a T:A nucleobase pair.
- the various domains of the transversion fusion proteins described herein may be obtained as a result of mutagenizing a reference or starting-point base editor (or a component or domain thereof) by an evolution or modification strategy.
- Such strategies include a directed evolution process, e.g., a continuous evolution method (e.g., PACE) or a non-continuous evolution method (e.g., PANCE or other discrete plate-based selections).
- the disclosure provides a base editor that has one or more amino acid variations introduced into its amino acid sequence relative to the amino acid sequence of the reference or starting-point base editor.
- the base editor may include variants in one or more components or domains of the base editor (e.g., variants introduced into a Cas9 domain, a deaminase domain, oxidase domain, an inhibitor of base excision repair (iBER) domain, or variants introduced into combinations of these domains).
- the nucleobase modification domain may be evolved from a reference protein that is an ssDNA modifying enzyme (e.g., ssDNA demethylases) and evolved using PACE, PANCE, or other plate-based evolution methods to obtain a DNA modifying version of the nucleobase modification domain, which can then be used in the fusion proteins described herein.
- the base editors described herein comprise a nucleic acid programmable DNA binding (napDNAbp) domain.
- the napDNAbp is associated with at least one guide nucleic acid (e.g., guide RNA), which localizes the napDNAbp to a DNA sequence that comprises a DNA strand (i.e., a target strand) that is complementary to the guide nucleic acid, or a portion thereof (e.g., the protospacer of a guide RNA).
- the guide nucleic-acid “programs” the napDNAbp domain to localize and bind to a complementary sequence of the target strand. Binding of the napDNAbp domain to a complementary sequence enables the nucleobase modification domain of the base editor to access and enzymatically deaminate a target adenine base in the target strand.
- the napDNAbp can be a CRISPR (clustered regularly interspaced short palindromic repeat)-associated nuclease.
- CRISPR is an adaptive immune system that provides protection against mobile genetic elements (viruses, transposable elements and conjugative plasmids).
- CRISPR clusters contain spacers, sequences complementary to antecedent mobile elements, and target invading nucleic acids.
- CRISPR clusters are transcribed and processed into CRISPR RNA (crRNA).
- in type II CRISPR systems, correct processing of pre-crRNA requires a trans-encoded small RNA (tracrRNA), endogenous ribonuclease 3 (rnc), and a Cas9 protein.
- the tracrRNA serves as a guide for ribonuclease 3-aided processing of pre-crRNA. Subsequently, Cas9/crRNA/tracrRNA endonucleolytically cleaves linear or circular dsDNA target complementary to the spacer. The target strand not complementary to crRNA is first cut endonucleolytically, then trimmed 3'-5' exonucleolytically. In nature, DNA-binding and cleavage typically requires protein and both RNAs. However, single guide RNAs (“sgRNA”, or simply “gRNA”) can be engineered so as to incorporate aspects of both the crRNA and tracrRNA into a single RNA species. See, e.g., Jinek et al., Science 337:816-821 (2012), the entire contents of which are hereby incorporated by reference.
- the binding mechanism of a napDNAbp-guide RNA complex includes the step of forming an R-loop whereby the napDNAbp induces the unwinding of a double-strand DNA target, thereby separating the strands in the region bound by the napDNAbp.
- the guide RNA protospacer then hybridizes to the “target strand.” This displaces a “non-target strand” that is
- the napDNAbp includes one or more nuclease activities, which cuts the DNA leaving various types of lesions (e.g., a nick in one strand of the DNA).
- the napDNAbp may comprise a nuclease activity that cuts the non-target strand at a first location, and/or cuts the target strand at a second location.
- the target DNA can be cut to form a “double-stranded break” whereby both strands are cut.
- the target DNA can be cut at only a single site, i.e., the DNA is “nicked” on one strand.
- the base editors may comprise the canonical SpCas9, or any ortholog Cas9 protein, or any variant Cas9 protein— including any naturally occurring variant, mutant, or otherwise engineered version of Cas9— that is known or which can be made or evolved through a directed evolution or otherwise mutagenic process.
- the napDNAbp has a nickase activity, i.e., cleaves only one strand of the target DNA sequence.
- the napDNAbp has an inactive nuclease, e.g., is a “dead” protein.
- Other variant Cas9 proteins that may be used are those having a smaller molecular weight than the canonical SpCas9 (e.g., for easier delivery) or having modified or rearranged primary amino acid sequence (e.g., the circular permutant forms).
- the base editors described herein may also comprise Cas9 equivalents, including Cas12a/Cpf1 and Cas12b proteins.
- the napDNAbps used herein e.g., an SpCas9 or SpCas9 variant
- the disclosure contemplates any Cas9, Cas9 variant, or Cas9 equivalent which has at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or at least 99.9% sequence identity to a reference Cas9 sequence, such as a reference SpCas9 canonical sequence (set forth in SEQ ID NO: 9), a reference SaCas9 canonical sequence (set forth in SEQ ID NO:
- the napDNAbp directs cleavage of one or both strands at the location of a target sequence, such as within the target sequence and/or within the
- the napDNAbp directs cleavage of one or both strands within about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 50, 100, 200, 500, or more base pairs from the first or last nucleotide of a target sequence.
- an aspartate-to-alanine substitution (D10A) in the RuvC I catalytic domain of Cas9 from S. pyogenes converts Cas9 from a nuclease that cleaves both strands to a nickase (cleaves a single strand).
- mutations that render Cas9 a nickase include, without limitation, H840A, N854A, and N863A in reference to the canonical SpCas9 sequence, or to equivalent amino acid positions in other Cas9 variants or Cas9 equivalents.
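Substitution labels such as D10A or H840A encode a wild-type residue, a 1-based position, and a replacement residue. The following is a minimal Python sketch of how such labels could be parsed and applied to a reference sequence; the helper names are hypothetical, and the 12-residue reference is a toy string chosen for illustration, not a real Cas9 sequence.

```python
import re

# A substitution label is <wild-type residue><1-based position><new residue>, e.g. "D10A".
SUBSTITUTION_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label: str) -> tuple[str, int, str]:
    """Split a label like 'D10A' into ('D', 10, 'A')."""
    match = SUBSTITUTION_PATTERN.match(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    wild_type, position, replacement = match.groups()
    return wild_type, int(position), replacement


def apply_substitutions(reference: str, labels: list[str]) -> str:
    """Apply each substitution, checking that the reference has the expected residue."""
    residues = list(reference)
    for label in labels:
        wild_type, position, replacement = parse_substitution(label)
        index = position - 1  # labels are 1-based, Python strings are 0-based
        if index >= len(residues) or residues[index] != wild_type:
            raise ValueError(f"{label}: reference does not have {wild_type} at {position}")
        residues[index] = replacement
    return "".join(residues)


# Toy reference (NOT a real Cas9 sequence) with aspartate (D) at position 10.
TOY_REFERENCE = "MKKYSIGLDDGT"
```

Checking the wild-type residue before substituting is a deliberate choice: it catches labels written against a different numbering scheme, which matters when positions are given "in reference to the canonical SpCas9 sequence" but applied to an ortholog.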
- Cas protein refers to a full-length Cas protein obtained from nature, a recombinant Cas protein having a sequence that differs from a naturally occurring Cas protein, or any fragment of a Cas protein that nevertheless retains all or a significant amount of the requisite basic functions needed for the disclosed methods, i.e., (i) possession of nucleic-acid programmable binding of the Cas protein to a target DNA, and (ii) ability to nick the target DNA sequence on one strand.
- the Cas proteins contemplated herein embrace CRISPR Cas9 proteins, as well as Cas9 equivalents, variants (e.g., Cas9 nickase (nCas9) or nuclease inactive Cas9 (dCas9)), homologs, orthologs, or paralogs, whether naturally occurring or non-naturally occurring (e.g., engineered or recombinant), and may include a Cas9 equivalent from any type of CRISPR system (e.g., type II, V, VI), including Cpf1 (a type V CRISPR-Cas system), C2c1 (a type V CRISPR-Cas system), C2c2 (a type VI CRISPR-Cas system) and C2c3 (a type V CRISPR-Cas system).
- “C2c2 is a single-component programmable RNA-guided RNA-targeting CRISPR effector,” Science 2016; 353(6299), the contents of which are incorporated herein by reference.
- “Cas9” or “Cas9 domain” embraces any naturally occurring Cas9 from any organism, any naturally-occurring Cas9 equivalent or functional fragment thereof, any Cas9 homolog, ortholog, or paralog from any organism, and any mutant or variant of a Cas9, naturally-occurring or engineered.
- the term Cas9 is not meant to be particularly limiting and may be referred to as a “Cas9 or equivalent.”
- Exemplary Cas9 proteins are further described herein and/or are described in the art and are incorporated herein by reference. The present disclosure is unlimited with regard to the particular napDNAbp that is employed in the base editors of the disclosure.
- Cas9 and Cas9 equivalents are provided as follows; however, these specific examples are not meant to be limiting.
- the base editors of the present disclosure may use any suitable napDNAbp, including any suitable Cas9 or Cas9 equivalent.
- the base editor constructs described herein may comprise the “canonical SpCas9” nuclease from S. pyogenes, which has been widely used as a tool for genome engineering.
- This Cas9 protein is a large, multi-domain protein containing two distinct nuclease domains. Point mutations can be introduced into Cas9 to abolish one or both nuclease activities, resulting in a nickase Cas9 (nCas9) or dead Cas9 (dCas9), respectively, that still retains its ability to bind DNA in a sgRNA-programmed manner.
- when fused to another protein or domain, Cas9 or a variant thereof (e.g., nCas9) can target that protein to virtually any DNA sequence simply by co-expression with an appropriate sgRNA.
- the canonical SpCas9 protein refers to the wild type protein from Streptococcus pyogenes having the following amino acid sequence:
- the base editors described herein may include canonical SpCas9, or any variant thereof having at least 80%, at least 85%, at least 90%, at least 95%, or at least 99% sequence identity with a wild type Cas9 sequence provided above.
- These variants may include SpCas9 variants containing one or more mutations, including any known mutation reported with the SwissProt Accession No. Q99ZW2 entry, which include:
- the base editors described herein may include any of the above SpCas9 sequences, or any variant thereof having at least 80%, at least 85%, at least 90%, at least 95%, or at least 99% sequence identity thereto.
- the Cas9 protein can be a wild type Cas9 ortholog from another bacterial species.
- the following Cas9 orthologs can be used in connection with the base editor constructs described in this disclosure.
- any variant Cas9 orthologs having at least 80%, at least 85%, at least 90%, at least 95%, or at least 99% sequence identity to any of the below orthologs may also be used with the disclosed base editors.
- the base editors described herein may include any of the above Cas9 ortholog sequences, or any variants thereof having at least 80%, at least 85%, at least 90%, at least 95%, or at least 99% sequence identity thereto.
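Thresholds such as "at least 80% sequence identity" depend on how identity is computed. As a minimal sketch, the Python below assumes the two sequences have already been globally aligned (gaps written as `-`) and defines identity as matching columns divided by total alignment columns. That definition is one of several conventions in use and is an assumption here, as are the function names; the disclosure does not specify a calculation method in this passage.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns where both sequences carry the same residue.

    Assumes the inputs are rows of a pairwise alignment: equal length,
    with '-' marking gaps. Gap columns count toward the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("empty alignment")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the aligned pair reaches at least `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

A variant differing from a 10-residue reference at one position would score 90% under this definition, so it would satisfy an "at least 85%" threshold but not an "at least 95%" one.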
- the napDNAbp may include any suitable homologs and/or orthologs or naturally occurring enzymes, such as Cas9.
- Cas9 homologs and/or orthologs have been described in various species, including, but not limited to, S. pyogenes and S. thermophilus.
- the Cas moiety is configured (e.g., mutagenized, recombinantly engineered, or otherwise obtained from nature) as a nickase, i.e., capable of cleaving only a single strand of the target double-stranded DNA. Additional suitable Cas9 nucleases and sequences will be apparent to those of skill in the art based on this disclosure, and such Cas9 nucleases and sequences include Cas9 sequences from the organisms and loci disclosed in Chylinski, Rhun, and Charpentier, “The tracrRNA and Cas9 families of type II CRISPR-Cas immunity systems” (2013) RNA Biology 10:5, 726-737; the entire contents of which are incorporated herein by reference.
- a Cas9 nuclease has an inactive (e.g., an inactivated) DNA cleavage domain, that is, the Cas9 is a nickase.
- the Cas9 protein comprises an amino acid sequence that is at least 80% identical to the amino acid sequence of a Cas9 protein as provided by any one of the variants of Table 3.
- the Cas9 protein comprises an amino acid sequence that is at least 85%, at least 90%, at least 92%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or at least 99.5% identical to the amino acid sequence of a Cas9 protein as provided by any one of the Cas9 orthologs in the above tables.
- the disclosed base editors may comprise a catalytically inactive, or“dead,” napDNAbp domain.
- exemplary catalytically inactive domains in the disclosed base editors are dead S. pyogenes Cas9 (dSpCas9) and S. pyogenes Cas9 nickase (SpCas9n).
- the base editors described herein may include a dead Cas9, e.g., dead SpCas9, which has no nuclease activity due to one or more mutations that inactivate both nuclease domains of SpCas9, namely the RuvC domain (which cleaves the non-protospacer DNA strand) and HNH domain (which cleaves the protospacer DNA strand).
- the nuclease inactivation may be due to one or more mutations that result in one or more substitutions and/or deletions in the amino acid sequence of the encoded protein, or any variants thereof having at least 80%, at least 85%, at least 90%, at least 95%, or at least 99% sequence identity thereto.
- the D10A and N580A mutations in the wild-type S. aureus Cas9 amino acid sequence may be used to form a dSaCas9.
- the napDNAbp domain of the base editors provided herein comprises a dSaCas9 that has D10A and N580A mutations relative to the wild-type SaCas9 sequence (SEQ ID NO: 154).
- dCas9 refers to a nuclease-inactive Cas9 or nuclease-dead Cas9, or a functional fragment thereof, and embraces any naturally occurring dCas9 from any organism, any naturally-occurring dCas9 equivalent or functional fragment thereof, any dCas9 homolog, ortholog, or paralog from any organism, and any mutant or variant of a dCas9, naturally-occurring or engineered.
- dCas9 is not meant to be particularly limiting and may be referred to as a “dCas9 or equivalent.” Exemplary dCas9 proteins and methods for making dCas9 proteins are further described herein and/or are described in the art and are incorporated herein by reference. In other embodiments, dCas9 corresponds to, or comprises in part or in whole, a Cas9 amino acid sequence having one or more mutations that inactivate the Cas9 nuclease activity.
- Cas9 variants having mutations other than D10A and H840A are provided which may result in the full or partial inactivation of the endogenous Cas9 nuclease activity (e.g., dCas9 or nCas9, respectively).
- Such mutations include other amino acid substitutions at D10 and H840, or other substitutions within the nuclease domains of Cas9 (e.g., substitutions in the HNH nuclease subdomain and/or the RuvC1 subdomain) with reference to a wild type sequence such as Cas9 from Streptococcus pyogenes (NCBI Reference Sequence: NC_017053.1).
- variants or homologues of Cas9 are provided which are at least about 70% identical, at least about 80% identical, at least about 90% identical, at least about 95% identical, at least about 98% identical, at least about 99% identical, at least about 99.5% identical, or at least about 99.9% identical to NCBI Reference Sequence: NC_017053.1.
- variants of dCas9 are provided having amino acid sequences which are shorter, or longer than NC_017053.1 by about 5 amino acids, by about 10 amino acids, by about 15 amino acids, by about 20 amino acids, by about 25 amino acids, by about 30 amino acids, by about 40 amino acids, by about 50 amino acids, by about 75 amino acids, by about 100 amino acids or more.
- the napDNAbp domain of any of the disclosed base editors comprises a dead S. pyogenes Cas9 (dSpCas9).
- the napDNAbp domain of any of the disclosed base editors comprises an amino acid sequence having at least 80%, at least 85%, at least 90%, at least 95%, or at least 99% sequence identity to SEQ ID NO: 168.
- the napDNAbp domain of any of the disclosed base editors comprises the amino acid sequence of SEQ ID NO: 168.
- the dead Cas9 may be based on the canonical SpCas9 sequence of Q99ZW2 and may have the following sequence, which comprises D10A and H840A substitutions (underlined and bolded), or a variant of SEQ ID NO: 168 having at least 80%, at least 85%, at least 90%, at least 95%, or at least 99% sequence identity thereto:
- the disclosed base editors may comprise a napDNAbp domain that comprises a nickase.
- the base editors described herein comprise a Cas9 nickase.
- the term “Cas9 nickase” or “nCas9” refers to a variant of Cas9 which is capable of introducing a single-strand break in a double-stranded DNA target molecule.
- the Cas9 nickase comprises only a single functioning nuclease domain.
- the wild type Cas9 (e.g., the canonical SpCas9) comprises two separate nuclease domains, namely, the RuvC domain (which cleaves the non-protospacer DNA strand) and HNH domain (which cleaves the protospacer DNA strand).
- the Cas9 nickase comprises a mutation in the RuvC domain which inactivates the RuvC nuclease activity.
- nickase mutations in the RuvC domain could include D10X, H983X, D986X, or E762X, wherein X is any amino acid other than the wild type amino acid.
- the nickase could be D10A, or H983A, or D986A, or E762A, or a combination thereof.
- the napDNAbp domain of any of the disclosed base editors comprises an S. pyogenes Cas9 nickase (SpCas9n).
- the napDNAbp domain of any of the disclosed base editors comprises an amino acid sequence having at least 80%, at least 85%, at least 90%, at least 95%, or at least 99% sequence identity to SEQ ID NO: 174 or 180.
- the napDNAbp domain of any of the disclosed base editors comprises the amino acid sequence of SEQ ID NO: 174.
- the napDNAbp domain of any of the disclosed base editors comprises the amino acid sequence of SEQ ID NO: 180.
- the napDNAbp domain of any of the disclosed base editors comprises an S. aureus Cas9 nickase (SaCas9n).
- the napDNAbp domain of any of the disclosed base editors comprises an amino acid sequence having at least 80%, at least 85%, at least 90%, at least 95%, or at least 99% sequence identity to SEQ ID NO: 178.
- the napDNAbp domain of any of the disclosed base editors comprises the amino acid sequence of SEQ ID NO: 178.
- the Cas9 nickase can have a mutation in the RuvC nuclease domain and have one of the following amino acid sequences, or a variant thereof having an amino acid sequence that has at least 80%, at least 85%, at least 90%, at least 95%, or at least 99% sequence identity thereto.
- the Cas9 nickase comprises a mutation in the HNH domain which inactivates the HNH nuclease activity.
- mutations in histidine (H) 840 or asparagine (N) 863 have been reported as loss-of-function mutations of the HNH nuclease domain that create a functional Cas9 nickase (e.g., Nishimasu et al., “Crystal structure of Cas9 in complex with guide RNA and target DNA,” Cell 156(5), 935-949, which is incorporated herein by reference).
- nickase mutations in the HNH domain could include H840X and N863X, wherein X is any amino acid other than the wild type amino acid.
- the nickase could be H840A or N863A or a combination thereof.
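The relationship between mutated domain and resulting activity described above (one domain inactivated gives a nickase, both give a dead Cas9) can be sketched in Python. This is a simplified illustration only: it assumes that any substitution at one of the listed positions (10, 762, 983, 986 for RuvC; 840, 863 for HNH, in canonical SpCas9 numbering) fully inactivates that domain, which is not guaranteed for every replacement residue. The function name and rule are hypothetical.

```python
import re

# Positions named in the text for each nuclease domain (canonical SpCas9 numbering).
RUVC_POSITIONS = {10, 762, 983, 986}
HNH_POSITIONS = {840, 863}


def classify_cas9(mutations: list[str]) -> str:
    """Return 'nuclease', 'nickase', or 'dead' under a simplified rule.

    Assumption: any substitution at a listed position inactivates that domain.
    """
    positions = set()
    for label in mutations:
        match = re.fullmatch(r"[A-Z](\d+)[A-Z]", label)
        if match is None:
            raise ValueError(f"not a substitution label: {label!r}")
        positions.add(int(match.group(1)))

    ruvc_inactive = bool(positions & RUVC_POSITIONS)
    hnh_inactive = bool(positions & HNH_POSITIONS)

    if ruvc_inactive and hnh_inactive:
        return "dead"      # neither strand is cut
    if ruvc_inactive or hnh_inactive:
        return "nickase"   # only one strand is cut
    return "nuclease"      # both domains presumed active
```

Under this rule, `["D10A"]` or `["H840A"]` alone classifies as a nickase, while `["D10A", "H840A"]` classifies as dead.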
- the Cas9 nickase can have a mutation in the HNH nuclease domain and have one of the following amino acid sequences, or a variant thereof having an amino acid sequence that has at least 80%, at least 85%, at least 90%, at least 95%, or at least 99% sequence identity thereto.
- the N-terminal methionine is removed from a Cas9 nickase, or from any Cas9 variant, ortholog, or equivalent disclosed or contemplated herein.
- methionine-minus Cas9 nickases include the following sequences, or a variant thereof having an amino acid sequence that has at least 80%, at least 85%, at least 90%, at least 95%, or at least 99% sequence identity thereto.
- the napDNAbp domains used in the base editors described herein may also include other Cas9 variants that are at least about 80% identical, at least about 90% identical, at least about 95% identical, at least about 96% identical, at least about 97% identical, at least about 98% identical, at least about 99% identical, at least about 99.5% identical, or at least about 99.9% identical to any reference Cas9 protein, including any wild type Cas9, or mutant Cas9 (e.g., a dead Cas9 or Cas9 nickase), or circular permutant Cas9, or other variant of Cas9 disclosed herein or known in the art.
- a Cas9 variant may have 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30,
- the Cas9 variant comprises a fragment of a reference Cas9 (e.g., a gRNA binding domain or a DNA-cleavage domain), such that the fragment is at least about 70% identical, at least about 80% identical, at least about 90% identical, at least about 95% identical, at least about 96% identical, at least about 97% identical, at least about 98% identical, at least about 99% identical, at least about 99.5% identical, or at least about 99.9% identical to the corresponding fragment of wild type Cas9.
- the fragment is at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or at least 99.5% of the amino acid length of a corresponding wild type Cas9 (e.g., SEQ ID NO: 9).
- the disclosure also may utilize Cas9 fragments which retain their functionality and which are fragments of any herein disclosed Cas9 protein.
- the Cas9 fragment is at least 100 amino acids in length.
- the fragment is at least 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1050, 1100, 1150, 1200, 1250, or at least 1300 amino acids in length.
- the base editors disclosed herein may comprise one of the Cas9 variants described as follows, or a Cas9 variant thereof that is at least about 70% identical, at least about 80% identical, at least about 90% identical, at least about 95% identical, at least about 96% identical, at least about 97% identical, at least about 98% identical, at least about 99% identical, at least about 99.5% identical, or at least about 99.9% identical to any reference Cas9 variants.
- the base editors described herein can include any Cas9 equivalent.
- the term“Cas9 equivalent” is a broad term that encompasses any napDNAbp protein that serves the same function as Cas9 in the present base editors despite that its amino acid primary sequence and/or its three-dimensional structure may be different and/or unrelated from an evolutionary standpoint.
- Cas9 equivalents include any Cas9 ortholog, homolog, mutant, or variant described or embraced herein that are evolutionarily related
- the Cas9 equivalents also embrace proteins that may have evolved through convergent evolution processes to have the same or similar function as Cas9, but which do not necessarily have any similarity with regard to amino acid sequence and/or three dimensional structure.
- the base editors described here embrace any Cas9 equivalent that would provide the same or similar function as Cas9 despite that the Cas9 equivalent may be based on a protein that arose through convergent evolution.
- CasX is a Cas9 equivalent that reportedly has the same function as Cas9 but which evolved through convergent evolution.
- the CasX protein described in Liu et al., “CasX enzymes comprise a distinct family of RNA-guided genome editors,” Nature, 2019, Vol. 566: 218-223, is contemplated to be used with the base editors described herein.
- Cas9 is a bacterial enzyme that evolved in a wide variety of species.
- the Cas9 equivalents contemplated herein may also be obtained from archaea, which constitute a domain and kingdom of single-celled prokaryotic microbes different from bacteria.
- Cas9 equivalents may refer to CasX or CasY, which have been described in, for example, Burstein et al., “New CRISPR-Cas systems from uncultivated microbes.” Cell Res. 2017 Feb 21. doi: 10.1038/cr.2017.21, the entire contents of which are hereby incorporated by reference.
- using genome-resolved metagenomics, a number of CRISPR-Cas systems were identified, including the first reported Cas9 in the archaeal domain of life. This divergent Cas9 protein was found in little-studied nanoarchaea as part of an active CRISPR-Cas system.
- Cas9 refers to CasX, or a variant of CasX. In some embodiments, Cas9 refers to a CasY, or a variant of CasY. It should be appreciated that other RNA-guided DNA binding proteins may be used as a nucleic acid programmable DNA binding protein (napDNAbp), and are within the scope of this disclosure. Also see Liu et al., “CasX enzymes comprise a distinct family of RNA-guided genome editors,” Nature, 2019, Vol. 566: 218-223. Any of these Cas9 equivalents are contemplated.
- the Cas9 equivalent comprises an amino acid sequence that is at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or at least 99.5% identical to a naturally-occurring CasX or CasY protein.
- the napDNAbp is a naturally-occurring CasX or CasY protein.
- the napDNAbp comprises an amino acid sequence that is at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or at least 99.5% identical to a wild-type Cas moiety or any Cas moiety provided herein.
- the nucleic acid programmable DNA binding proteins include, without limitation, Cas9 (e.g., dCas9 and nCas9), CasX, CasY, Cpf1, C2c1, C2c2, C2c3, Argonaute, Cas12a, and Cas12b.
- e.g., a nucleic acid programmable DNA-binding protein that has different PAM specificity than Cas9 is Clustered Regularly Interspaced Short Palindromic Repeats from Prevotella and Francisella 1 (Cpf1). Similar to Cas9, Cpf1 is also a class 2 CRISPR effector. It has been shown that Cpf1 mediates robust DNA interference with features distinct from Cas9. Cpf1 is a single RNA-guided endonuclease that utilizes a T-rich protospacer-adjacent motif (TTN, TTTN, or YTN).
- Cpf1 cleaves DNA via a staggered DNA double-stranded break.
- Cpf1 proteins are known in the art and have been described previously, for example Yamano et al., “Crystal structure of Cpf1 in complex with guide RNA and target DNA.” Cell (165) 2016, p. 949-962; the entire contents of which are hereby incorporated by reference. The state of the art may also now refer to Cpf1 enzymes as Cas12a.
- the Cas protein may include any CRISPR associated protein, including but not limited to Cas12a, Cas12b, Cas1, Cas1B, Cas2, Cas3, Cas4, Cas5, Cas6, Cas7, Cas8, Cas9 (sometimes referred to as Csn1 and Csx12), Cas10, Csy1, Csy2, Csy3, Cse1, Cse2, Csc1, Csc2, Csa5, Csn2.
- the napDNAbp can be any of the following proteins: a Cas9, a Cpf1, a CasX, a CasY, a C2c1, a C2c2, a C2c3, a GeoCas9, a CjCas9, a Cas12a, a Cas12b, a Cas12g, a Cas12h, a Cas12i, a Cas13b, a Cas13c, a Cas13d, a Cas14, a Csn2, an xCas9, an SpCas9-NG, a circularly permuted Cas9, or an Argonaute (Ago), a Cas9-KKH, a SmacCas9, a Spy-macCas9, an SpCas9-VRQR, an SpCas9-NRRH, an SpaCas9-
- the base editors contemplated herein can include a Cas9 protein that is of smaller molecular weight than the canonical SpCas9 sequence.
- the smaller-sized Cas9 variants may facilitate delivery to cells, e.g., by an expression vector, nanoparticle, or other means of delivery.
- the canonical SpCas9 protein is 1368 amino acids in length and has a predicted molecular weight of 158 kilodaltons.
- small-sized Cas9 variant refers to any Cas9 variant (naturally occurring, engineered, or otherwise) that is less than 1300 amino acids, or less than 1290 amino acids, or less than 1280 amino acids, or less than 1270 amino acids, or less than 1260 amino acids, or less than 1250 amino acids, or less than 1240 amino acids, or less than 1230 amino acids, or less than 1220 amino acids, or less than 1210 amino acids, or less than 1200 amino acids, or less than 1190 amino acids, or less than 1180 amino acids, or less than 1170 amino acids, or less than 1160 amino acids, or less than 1150 amino acids, or less than 1140 amino acids, or less than 1130 amino acids, or less than 1120 amino acids, or less than 1110 amino acids, or less than 1100 amino acids, or less than 1050 amino acids, or less than 1000 amino acids, or less than 950 amino acids, or less than 900 amino acids, or less than 850 amino acids, or less than 800 amino acids, or
- the base editors disclosed herein may comprise one of the small-sized Cas9 variants described as follows, or a Cas9 variant thereof that is at least about 70% identical, at least about 80% identical, at least about 90% identical, at least about 95% identical, at least about 96% identical, at least about 97% identical, at least about 98% identical, at least about 99% identical, at least about 99.5% identical, or at least about 99.9% identical to any reference small-sized Cas9 protein.
- Exemplary small-sized Cas9 variants include, but are not limited to, SaCas9 and LbCas12a.
- the base editors described herein may also comprise Cas12a/Cpf1 (dCpf1) variants that may be used as a guide nucleotide sequence-programmable DNA-binding protein domain.
- the Cas12a/Cpf1 protein has a RuvC-like endonuclease domain that is similar to the RuvC domain of Cas9 but does not have an HNH endonuclease domain, and the N-terminus of Cpf1 does not have the alpha-helical recognition lobe of Cas9.
- Additional exemplary Cas9 equivalent protein sequences can include the following:
- the napDNAbp is a nucleic acid programmable DNA binding protein that does not require a canonical (NGG) PAM sequence.
- the napDNAbp is an argonaute protein.
- NgAgo is a ssDNA-guided endonuclease. NgAgo binds 5' phosphorylated ssDNA of ~24 nucleotides (gDNA) to guide it to its target site and will make DNA double-strand breaks at the gDNA site.
- the NgAgo-gDNA system does not require a protospacer-adjacent motif (PAM).
- the disclosure provides napDNAbp domains that comprise SpCas9 variants that recognize and work best with NRRH, NRCH, and NRTH PAMs. See PCT Application No. PCT/US2019/47996, incorporated by reference herein.
- the disclosed base editors comprise a napDNAbp domain selected from SpCas9-NRRH, SpCas9-NRTH, and SpCas9-NRCH.
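PAM names such as NRRH, NRCH, and NRTH use IUPAC degenerate nucleotide codes (N = any base, R = A or G, H = A, C, or T). The following minimal Python sketch shows how a candidate PAM site could be checked against these patterns. The function names are hypothetical, and pairing each variant name with the PAM in its name is an assumption made for illustration; real PAM preference is a matter of degree rather than a strict yes/no match.

```python
# IUPAC nucleotide codes needed for the PAM patterns named above.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "ACGT",  # any base
    "R": "AG",    # purine
    "H": "ACT",   # any base except G
}

# Assumed pairing of variant name to its preferred PAM pattern.
VARIANT_PAMS = {
    "SpCas9-NRRH": "NRRH",
    "SpCas9-NRCH": "NRCH",
    "SpCas9-NRTH": "NRTH",
}


def pam_matches(pattern: str, site: str) -> bool:
    """True if each base of `site` is allowed by the IUPAC code at that position."""
    if len(pattern) != len(site):
        return False
    return all(
        base in IUPAC[code] for code, base in zip(pattern.upper(), site.upper())
    )


def compatible_variants(site: str) -> list[str]:
    """List the variants whose PAM pattern accepts the given 4-base site."""
    return [name for name, pam in VARIANT_PAMS.items() if pam_matches(pam, site)]
```

For example, the site `TGCA` satisfies NRCH (any base, purine, C, non-G) but not NRRH or NRTH, because its third base is C.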
- the disclosed base editors comprise a napDNAbp domain that has a sequence that is at least 90%, at least 95%, at least 98%, or at least 99% identical to SpCas9-NRRH.
- the disclosed base editors comprise a napDNAbp domain that comprises SpCas9-NRRH.
- the SpCas9-NRRH has an amino acid sequence as presented in SEQ ID NO: 203 (underlined residues are mutated relative to SpCas9, as set forth in SEQ ID NO: 9)
- the disclosed base editors comprise a napDNAbp domain that has a sequence that is at least 90%, at least 95%, at least 98%, or at least 99% identical to
- the disclosed base editors comprise a napDNAbp domain that comprises SpCas9-NRCH.
- the SpCas9-NRCH has an amino acid sequence as presented in SEQ ID NO: 204 (underlined residues are mutated relative to SpCas9) MDKKYSIGLDIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGET
- the disclosed base editors comprise a napDNAbp domain that has a sequence that is at least 90%, at least 95%, at least 98%, or at least 99% identical to
- the disclosed base editors comprise a napDNAbp domain that comprises SpCas9-NRTH.
- the SpCas9-NRTH has an amino acid sequence as presented in SEQ ID NO: 205 (underlined residues are mutated relative to SpCas9)
- the napDNAbp of any of the disclosed base editors comprises a Cas9 derived from a Streptococcus macacae, e.g. Streptococcus macacae NCTC 11558, or
- the napDNAbp comprises a hybrid variant of SmacCas9 that incorporates an SpCas9 domain with the SmacCas9 domain and is known as Spy-macCas9, or a variant thereof.
- the napDNAbp comprises a hybrid variant of SmacCas9 that incorporates an increased nucleolytic variant of an SpCas9 (iSpy Cas9) domain and is known as iSpy-macCas9.
- iSpyMac-Cas9 contains two mutations, R221K and N394K, that were identified by deep mutational scans of Spy Cas9 that raise modification rates of the protein on most targets. See
- the disclosed base editors comprise a napDNAbp domain that has a sequence that is at least 90%, at least 95%, at least 98%, or at least 99% identical to iSpyMac-Cas9.
- the disclosed base editors comprise a napDNAbp domain that comprises iSpyMac-Cas9.
- the iSpyMac-Cas9 has an amino acid sequence as presented in SEQ ID NO: 206 (R221K and N394K mutations are underlined):
- the napDNAbp of any of the disclosed base editors is a prokaryotic homolog of an Argonaute protein.
- Prokaryotic homologs of Argonaute proteins are known and have been described, for example, in Makarova K., et al., “Prokaryotic homologs of Argonaute proteins are predicted to function as key components of a novel system of defense against mobile genetic elements”, Biol Direct. 2009 Aug 25;4:29. doi:
- the napDNAbp is a Marinitoga piezophila Argonaute (MpAgo) protein.
- the CRISPR-associated Marinitoga piezophila Argonaute (MpAgo) protein cleaves single-stranded target sequences using 5'-phosphorylated guides.
- the 5' guides are used by all known Argonautes.
- the crystal structure of an MpAgo-RNA complex shows a guide strand binding site comprising residues that block 5' phosphate interactions. This data suggests the evolution of an Argonaute subclass with noncanonical specificity for a 5'-hydroxylated guide.
- the napDNAbp is a single effector of a microbial CRISPR-Cas system.
- Single effectors of microbial CRISPR-Cas systems include, without limitation, Cas9, Cpf1, C2c1, C2c2, and C2c3.
- microbial CRISPR-Cas systems are divided into Class 1 and Class 2 systems. Class 1 systems have multisubunit effector complexes, while Class 2 systems have a single protein effector.
- Cas9 and Cpf1 are Class 2 effectors.
- three distinct Class 2 CRISPR-Cas systems (C2c1, C2c2, and C2c3) have been described by Shmakov et al., “Discovery and Functional
- C2c1 and C2c3 contain RuvC-like endonuclease domains related to Cpf1.
- a third system, C2c2, contains an effector with two predicted HEPN RNase domains.
- C2c1 depends on both CRISPR RNA and tracrRNA for DNA cleavage.
- Bacterial C2c2 has been shown to possess a unique RNase activity for CRISPR RNA maturation distinct from its RNA-activated single-stranded RNA degradation activity. These RNase functions are different from each other and from the CRISPR RNA-processing behavior of Cpf1. See, e.g., East-Seletsky, et al., “Two distinct RNase activities of CRISPR-C2c2 enable guide-RNA processing and RNA detection”, Nature, 2016 Oct
- C2c2 is a single-component programmable RNA-guided RNA-targeting CRISPR effector”, Science, 2016 Aug 5; 353(6299), the entire contents of which are hereby incorporated by reference.
- the napDNAbp may be a C2c1, a C2c2, or a C2c3 protein. In some embodiments, the napDNAbp is a C2c1 protein. In some embodiments, the napDNAbp is a C2c2 protein. In some embodiments, the napDNAbp is a C2c3 protein.
- the napDNAbp comprises an amino acid sequence that is at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or at least 99.5% identical to a naturally-occurring C2c1, C2c2, or C2c3 protein.
- the napDNAbp is a naturally-occurring C2c1, C2c2, or C2c3 protein.
- Cas9 domains that have different PAM specificities.
- Cas9 proteins, such as Cas9 from S. pyogenes (spCas9), require a canonical NGG PAM sequence to bind a particular nucleic acid region. This may limit the ability to edit desired bases within a genome.
- the base editors provided herein may need to be placed at a precise location, for example where a target base is placed within a 4 base region (e.g., an “editing window” or a “target window”), which is approximately 15 bases upstream of the PAM. See Komor, A.C., et al,
- any of the base editors provided herein may contain a Cas9 domain that is capable of binding a nucleotide sequence that does not contain a canonical (e.g., NGG) PAM sequence.
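The positional constraint described above (a target base inside a 4 base window roughly 15 bases upstream of the PAM) can be sketched in Python. This is an illustration only: the function names, the 0-based indexing, the example sequence, and the reading of "approximately 15 bases upstream" as the distance from the first PAM base to the start of the window are all assumptions, not definitions taken from this disclosure.

```python
def editing_window(pam_start, offset=15, width=4):
    """Return the 0-based indices of an assumed editing window.

    pam_start: 0-based index of the first PAM base on the target strand.
    offset:    assumed distance (in bases) from the PAM to the window start.
    width:     window size (the text describes a 4 base region).
    """
    start = pam_start - offset
    if start < 0:
        raise ValueError("window would start before the sequence begins")
    return list(range(start, start + width))


def bases_in_window(seq, pam_start, base="A", offset=15, width=4):
    """List the positions of `base` that fall inside the assumed window."""
    return [i for i in editing_window(pam_start, offset, width)
            if i < len(seq) and seq[i].upper() == base]


# Hypothetical 20-nt protospacer followed by a TGG PAM (matches NGG).
target = "GACTTACAGTCCGTAGCTAG" + "TGG"
print(editing_window(20))            # window indices
print(bases_in_window(target, 20))   # adenines inside the window
```

Keeping `offset` and `width` as parameters reflects that the window position is only approximate in the text and may differ between editors.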
- Cas9 domains that bind to non-canonical PAM sequences have been described in the art and would be apparent to the skilled artisan.
- Cas9 domains that bind non-canonical PAM sequences have been described in Kleinstiver, B. P., et al.,“Engineered CRISPR-Cas9 nucleases with altered PAM
- a napDNAbp domain with altered PAM specificity such as a domain with at least 80%, at least 85%, at least 90%, at least 95%, or at least 99% sequence identity with wild type Francisella novicida Cpf1 (SEQ ID NO: 207) (D917, E1006, and D1255), which has the following amino acid sequence:
- An additional napDNAbp domain with altered PAM specificity such as a domain having at least 80%, at least 85%, at least 90%, at least 95%, or at least 99% sequence identity with wild type Geobacillus thermodenitrificans Cas9 (SEQ ID NO: 208), which has the following amino acid sequence:
- the nucleic acid programmable DNA binding protein (napDNAbp) is a nucleic acid programmable DNA binding protein that does not require a canonical (NGG) PAM sequence.
- the napDNAbp is an argonaute protein.
- One example of such a nucleic acid programmable DNA binding protein is an Argonaute protein from Natronobacterium gregoryi (NgAgo).
- NgAgo is a ssDNA-guided endonuclease.
- NgAgo binds 5' phosphorylated ssDNA of ~24 nucleotides (gDNA) to guide it to its target site and will make DNA double-strand breaks at the gDNA site.
- the NgAgo-gDNA system does not require a protospacer-adjacent motif (PAM).
- NgAgo nuclease inactive NgAgo
- the characterization and use of NgAgo have been described in Gao et al, Nat Biotechnol., 34(7): 768-73 (2016), PubMed PMID: 27136078; Swarts et al., Nature, 507(7491): 258-61 (2014); and Swarts et al., Nucleic Acids Res. 43(10) (2015): 5120-9, each of which is incorporated herein by reference.
- the sequence of Natronobacterium gregoryi Argonaute is provided in SEQ ID NO: 209.
- the disclosed base editors may comprise a napDNAbp domain having at least 80%, at least 85%, at least 90%, at least 95%, or at least 99% sequence identity with wild type Natronobacterium gregoryi Argonaute (SEQ ID NO: 209), which has the following amino acid sequence:
- the base editors disclosed herein may comprise a circular permutant of Cas9.
- the term “circularly permuted Cas9” (or “circular permutant” of Cas9 or “CP-Cas9”) refers to any Cas9 protein, or variant thereof, that occurs as, or has been modified or engineered to be, a circular permutant variant, which means the N-terminus and the C-terminus of a Cas9 protein (e.g., a wild type Cas9 protein) have been topologically rearranged.
- Such circularly permuted Cas9 proteins, or variants thereof, retain the ability to bind DNA when complexed with a guide RNA (gRNA).
- the instant disclosure contemplates the use of any previously known CP-Cas9 or a new CP-Cas9, so long as the resulting circularly permuted protein retains the ability to bind DNA when complexed with a guide RNA (gRNA).
- the circular permutants of Cas9 may have the following structure:
- the present disclosure contemplates the following circular permutants of canonical S. pyogenes Cas9 (1368 amino acids of UniProtKB - Q99ZW2 (CAS9_STRP1) (numbering is based on the amino acid position in SEQ ID NO: 9)):
- the circular permutant Cas9 has the following structure (based on S. pyogenes Cas9 (1368 amino acids of UniProtKB - Q99ZW2 (CAS9_STRP1)); numbering is based on the amino acid position in SEQ ID NO: 9):
- the circular permutant can be formed by linking a C-terminal fragment of a Cas9 to an N-terminal fragment of a Cas9, either directly or by using a linker, such as an amino acid linker.
- the C-terminal fragment may correspond to the C-terminal 95% or more of the amino acids of a Cas9 (e.g., amino acids about 1300-1368), or the C-terminal 90%, 85%, 80%, 75%, 70%, 65%, 60%, 55%, 50%,
- the N-terminal portion may correspond to the N-terminal 95% or more of the amino acids of a Cas9 (e.g., amino acids about 1-1300), or the N-terminal 90%, 85%, 80%, 75%, 70%, 65%, 60%, 55%, 50%, 45%, 40%, 35%, 30%, 25%, 20%, 15%, 10%, or 5% or more of a Cas9 (e.g., of SEQ ID NO: 9).
- the circular permutant can be formed by linking a C-terminal fragment of a Cas9 to an N-terminal fragment of a Cas9, either directly or by using a linker, such as an amino acid linker.
- the C-terminal fragment that is rearranged to the N-terminus includes or corresponds to the C-terminal 30% or less of the amino acids of a Cas9 (e.g., amino acids 1012-1368 of SEQ ID NO: 9).
- the C-terminal fragment that is rearranged to the N-terminus includes or corresponds to the C-terminal 30%, 29%, 28%, 27%, 26%, 25%, 24%, 23%, 22%, 21%, 20%, 19%, 18%, 17%, 16%, 15%, 14%, 13%, 12%, 11%, 10%, 9%, 8%, 7%, 6%, 5%, 4%, 3%,
- the C-terminal fragment that is rearranged to the N-terminus includes or corresponds to the C-terminal 410 residues or less of a Cas9 (e.g., the Cas9 of SEQ ID NO:
- the C-terminal portion that is rearranged to the N-terminus includes or corresponds to the C-terminal 410, 400, 390, 380, 370, 360, 350, 340, 330, 320, 310, 300, 290, 280, 270, 260, 250, 240, 230, 220, 210, 200, 190, 180, 170, 160, 150, 140,
- the C-terminal portion that is rearranged to the N-terminus includes or corresponds to the C-terminal 357, 341, 328, 120, or 69 residues of a Cas9 (e.g., the Cas9 of SEQ ID NO: 9).
- circular permutant Cas9 variants may be defined as a topological rearrangement of a Cas9 primary structure based on the following method, which is based on S. pyogenes Cas9 of SEQ ID NO: 9: (a) selecting a circular permutant (CP) site corresponding to an internal amino acid residue of the Cas9 primary structure, which dissects the original protein into two halves: an N-terminal region and a C-terminal region; (b) modifying the Cas9 protein sequence (e.g., by genetic engineering techniques) by moving the original C-terminal region (comprising the CP site amino acid) to precede the original N-terminal region, thereby forming a new N-terminus of the Cas9 protein that now begins with the CP site amino acid residue.
- the CP site can be located in any domain of the Cas9 protein, including, for example, the helical-II domain, the RuvCIII domain, or the CTD domain.
- the CP site may be located (relative to the S. pyogenes Cas9 of SEQ ID NO: 9) at original amino acid residue 181, 199, 230, 270, 310, 1010, 1016, 1023, 1029, 1041, 1247, 1249, or 1282.
- original amino acid 181, 199, 230, 270, 310, 1010, 1016, 1023, 1029, 1041, 1247, 1249, or 1282 would become the new N-terminal amino acid.
- These CP-Cas9 proteins may be referred to as Cas9-CP181, Cas9-CP199, Cas9-CP230, Cas9-CP270, Cas9-CP310, Cas9-CP1010, Cas9-CP1016, Cas9-CP1023, Cas9-CP1029, Cas9-CP1041, Cas9-CP1247, Cas9-CP1249, and Cas9-CP1282, respectively.
- This description is not meant to be limited to making CP variants from SEQ ID NO: 9, but may be implemented to make CP variants in any Cas9 sequence, either at CP sites that correspond to these positions, or at other CP sites entirely. This description is not meant to limit the specific CP sites in any way. Virtually any CP site may be used to form a CP-Cas9 variant.
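The two-step method above (pick a CP site, then move the C-terminal region in front of the N-terminal region) amounts to a string rotation with an optional linker. The Python sketch below illustrates that rearrangement under stated assumptions: the function names are hypothetical, the 1368-residue placeholder is a synthetic string rather than the Cas9 of SEQ ID NO: 9, and the "GGS" linker is an arbitrary example, not a linker taken from this disclosure.

```python
def circular_permute(seq, cp_site, linker=""):
    """Build a circular permutant of `seq`.

    cp_site is the 1-based position of the residue that becomes the new
    N-terminal amino acid. The original C-terminal region (starting at the
    CP site) is placed before the original N-terminal region, optionally
    joined by `linker`.
    """
    if not 1 < cp_site <= len(seq):
        raise ValueError("CP site must be an internal residue position")
    n_region = seq[:cp_site - 1]   # original residues 1 .. cp_site-1
    c_region = seq[cp_site - 1:]   # original residues cp_site .. end
    return c_region + linker + n_region


def cp_name(cp_site):
    """Name a permutant by its CP site, e.g. Cas9-CP1041."""
    return f"Cas9-CP{cp_site}"


# Synthetic 1368-residue placeholder (NOT a real Cas9 sequence).
alphabet = "ACDEFGHIKLMNPQRSTVWY"
placeholder = "".join(alphabet[i % 20] for i in range(1368))

permutant = circular_permute(placeholder, 1041, linker="GGS")
print(cp_name(1041), len(permutant))
print(len(placeholder[1040:]))  # size of the relocated C-terminal region
```

For a 1368-residue protein, a CP site at residue 1041 relocates 1368 - 1041 + 1 = 328 residues, which agrees with one of the C-terminal fragment lengths listed earlier.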
- Exemplary CP-Cas9 amino acid sequences are provided below in which linker sequences are indicated by underlining and optional methionine (M) residues are indicated in bold. It should be appreciated that the disclosure provides CP-Cas9 sequences that do not include a linker sequence or that include different linker sequences. It should be appreciated that CP-Cas9 sequences may be based on Cas9 sequences other than that of SEQ ID NO: 9 and any examples provided herein are not meant to be limiting. Exemplary CP-Cas9 sequences are as follows:
- Cas9 circular permutants that may be useful in the base editor constructs described herein.
- Exemplary C-terminal fragments of Cas9 based on the Cas9 of SEQ ID NO: 9, which may be rearranged to an N-terminus of Cas9, are provided below. It should be appreciated that such C-terminal fragments of Cas9 are exemplary and are not meant to be limiting.
- These exemplary CP-Cas9 fragments have the following sequences:
- the base editors of the present disclosure may also comprise Cas9 variants with modified PAM specificities.
- Some aspects of this disclosure provide Cas9 proteins that exhibit activity on a target sequence that does not comprise the canonical PAM (5'-NGG-3', where N is A, C, G, or T) at its 3'-end.
- the Cas9 protein exhibits activity on a target sequence comprising a 5'-NGG-3' PAM sequence at its 3'-end.
- the Cas9 protein exhibits activity on a target sequence comprising a 5'-NNG-3' PAM sequence at its 3'-end.
- the Cas9 protein exhibits activity on a target sequence comprising a 5'-NNA-3' PAM sequence at its 3'-end. In some embodiments, the Cas9 protein exhibits activity on a target sequence comprising a 5'-NNC-3' PAM sequence at its 3'-end. In some embodiments, the Cas9 protein exhibits activity on a target sequence comprising a 5'-NNT-3' PAM sequence at its 3'-end. In some embodiments, the Cas9 protein exhibits activity on a target sequence comprising a 5'-NGT-3' PAM sequence at its 3'-end.
- the Cas9 protein exhibits activity on a target sequence comprising a 5'-NGA-3' PAM sequence at its 3'-end. In some embodiments, the Cas9 protein exhibits activity on a target sequence comprising a 5'-NGC-3' PAM sequence at its 3'-end. In some embodiments, the Cas9 protein exhibits activity on a target sequence comprising a 5'-NAA-3' PAM sequence at its 3'-end. In some embodiments, the Cas9 protein exhibits activity on a target sequence comprising a 5'-NAC-3' PAM sequence at its 3'-end.
- the Cas9 protein exhibits activity on a target sequence comprising a 5'-NAT-3' PAM sequence at its 3'-end. In still other embodiments, the Cas9 protein exhibits activity on a target sequence comprising a 5'-NAG-3' PAM sequence at its 3'-end.
- the disclosed base editors comprise a napDNAbp domain comprising a SpCas9-NG, which has a PAM that corresponds to NGN.
- the disclosed base editors comprise a napDNAbp domain comprising a SpCas9-KKH, which has a PAM that corresponds to NNNRRT (SEQ ID NO: 222).
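PAM patterns such as NGG, NGN, and NNNRRT use degenerate nucleotide codes (N for any base, R for a purine). As a rough illustration of how such patterns can be matched against a target strand, the Python sketch below checks candidate sites against a pattern. The function names, the 20-nt protospacer length, and the restriction to the codes A, C, G, T, R, Y, and N are assumptions made for this example.

```python
# Subset of the IUPAC nucleotide codes sufficient for the PAMs named above.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purine
    "Y": "CT",    # pyrimidine
    "N": "ACGT",  # any base
}


def pam_matches(pam_pattern, site):
    """True if `site` is compatible with the degenerate `pam_pattern`."""
    if len(pam_pattern) != len(site):
        return False
    return all(base in IUPAC[code]
               for code, base in zip(pam_pattern, site.upper()))


def find_pam_sites(seq, pam_pattern, protospacer_len=20):
    """Return (protospacer_start, pam_start) pairs on the given strand.

    Only PAMs with a full-length protospacer immediately upstream (5')
    are reported; indices are 0-based.
    """
    seq = seq.upper()
    k = len(pam_pattern)
    hits = []
    for pam_start in range(protospacer_len, len(seq) - k + 1):
        if pam_matches(pam_pattern, seq[pam_start:pam_start + k]):
            hits.append((pam_start - protospacer_len, pam_start))
    return hits


print(pam_matches("NGG", "TGG"))        # canonical SpCas9 pattern
print(pam_matches("NGN", "TGA"))        # pattern stated for SpCas9-NG
print(pam_matches("NNNRRT", "ACGGAT"))  # pattern stated for SpCas9-KKH
```

A relaxed pattern such as NGN accepts every site that NGG accepts plus additional ones, which is the sense in which such variants broaden the set of targetable positions.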
- any of the amino acid mutations described herein, (e.g., A262T) from a first amino acid residue (e.g., A) to a second amino acid residue (e.g., T) may also include mutations from the first amino acid residue to an amino acid residue that is similar to (e.g., conserved) the second amino acid residue.
- mutation of an amino acid with a hydrophobic side chain may be a mutation to a second amino acid with a different hydrophobic side chain (e.g., alanine, valine, isoleucine, leucine, methionine, phenylalanine, tyrosine, or tryptophan).
- a mutation of an alanine to a threonine may also be a mutation from an alanine to an amino acid that is similar in size and chemical properties to a threonine, for example, serine.
- mutation of an amino acid with a positively charged side chain (e.g., arginine, histidine, or lysine) may be a mutation to a second amino acid with a different positively charged side chain (e.g., arginine, histidine, or lysine).
- mutation of an amino acid with a polar side chain may be a mutation to a second amino acid with a different polar side chain (e.g., serine, threonine, asparagine, or glutamine).
- Additional similar amino acid pairs include, but are not limited to, the following: phenylalanine and tyrosine; asparagine and glutamine; methionine and cysteine; aspartic acid and glutamic acid; and arginine and lysine. The skilled artisan would recognize that such conservative amino acid substitutions will likely have minor effects on protein structure and are likely to be well tolerated without compromising function.
- any of the amino acid mutations provided herein from one amino acid to a threonine may be an amino acid mutation to a serine.
- any of the amino acid mutations provided herein from one amino acid to an arginine may be an amino acid mutation to a lysine.
- any of the amino acid mutations provided herein from one amino acid to an isoleucine may be an amino acid mutation to an alanine, valine, methionine, or leucine.
- any of the amino acid mutations provided herein from one amino acid to a lysine may be an amino acid mutation to an arginine.
- any of the amino acid mutations provided herein from one amino acid to an aspartic acid may be an amino acid mutation to a glutamic acid or asparagine.
- any of the amino acid mutations provided herein from one amino acid to a valine may be an amino acid mutation to an alanine, isoleucine, methionine, or leucine.
- any of the amino acid mutations provided herein from one amino acid to a glycine may be an amino acid mutation to an alanine. It should be appreciated, however, that additional conserved amino acid residues would be recognized by the skilled artisan and any of the amino acid mutations to other conserved amino acid residues are also within the scope of this disclosure.
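The substitution rules listed above can be read as a lookup table from a stated target residue to its permitted conservative alternatives. The Python sketch below encodes only the seven rules stated in the preceding lines; the function name and the decision to skip alternatives that equal the original residue are assumptions made for this illustration.

```python
import re

# Alternatives for the stated target residue (one-letter codes), taken
# from the seven rules listed in the text.
CONSERVATIVE = {
    "T": ["S"],                  # threonine -> serine
    "R": ["K"],                  # arginine -> lysine
    "I": ["A", "V", "M", "L"],   # isoleucine -> Ala, Val, Met, Leu
    "K": ["R"],                  # lysine -> arginine
    "D": ["E", "N"],             # aspartic acid -> Glu, Asn
    "V": ["A", "I", "M", "L"],   # valine -> Ala, Ile, Met, Leu
    "G": ["A"],                  # glycine -> alanine
}

MUTATION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def conservative_variants(mutation):
    """Expand e.g. 'A262T' into the stated mutation plus its alternatives."""
    match = MUTATION_RE.match(mutation)
    if match is None:
        raise ValueError(f"not a substitution label: {mutation!r}")
    original, position, new = match.groups()
    candidates = [new] + CONSERVATIVE.get(new, [])
    # Assumption: an alternative identical to the original residue is not
    # a mutation at all, so it is dropped.
    return [f"{original}{position}{aa}" for aa in candidates if aa != original]


print(conservative_variants("A262T"))
print(conservative_variants("A106V"))
```

Residues without a listed rule (for example asparagine in D108N) are returned unchanged, since the text leaves further conserved residues to the skilled artisan.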
- the present disclosure may utilize any of the Cas9 variants disclosed in the SEQUENCES section herein.
- the Cas9 protein comprises a combination of mutations that exhibit activity on a target sequence comprising a 5'-NAA-3' PAM sequence at its 3'-end.
- the combination of mutations are present in any one of the clones listed in Table 1.
- the combination of mutations are conservative mutations of the clones listed in Table 1.
- the Cas9 protein comprises the combination of mutations of any one of the Cas9 clones listed in Table 1.
- the Cas9 protein comprises an amino acid sequence that is at least 80% identical to the amino acid sequence of a Cas9 protein as provided by any one of the variants of Table 1. In some embodiments, the Cas9 protein comprises an amino acid sequence that is at least 85%, at least 90%, at least 92%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or at least 99.5% identical to the amino acid sequence of a Cas9 protein as provided by any one of the variants of Table 1.
- the Cas9 protein exhibits an increased activity on a target sequence that does not comprise the canonical PAM (5'-NGG-3') at its 3' end as compared to Streptococcus pyogenes Cas9 as provided by SEQ ID NO: 9.
- the Cas9 protein exhibits an activity on a target sequence having a 3' end that is not directly adjacent to the canonical PAM sequence (5'-NGG-3') that is at least 5-fold increased as compared to the activity of Streptococcus pyogenes Cas9 as provided by SEQ ID NO: 9 on the same target sequence.
- the Cas9 protein exhibits an activity on a target sequence that is not directly adjacent to the canonical PAM sequence (5'-NGG-3') that is at least 10-fold, at least 50-fold, at least 100-fold, at least 500-fold, at least 1,000-fold, at least 5,000-fold, at least 10,000-fold, at least 50,000-fold, at least 100,000-fold, at least 500,000-fold, or at least 1,000,000-fold increased as compared to the activity of Streptococcus pyogenes Cas9 as provided by SEQ ID NO: 9 on the same target sequence.
- the 3' end of the target sequence is directly adjacent to an AAA, GAA, CAA, or TAA sequence.
- the Cas9 protein comprises a combination of mutations that exhibit activity on a target sequence comprising a 5'-NAC-3' PAM sequence at its 3'-end. In some embodiments, the combination of mutations are present in any one of the clones listed in Table 2. In some embodiments, the combination of mutations are conservative mutations of the clones listed in Table 2. In some embodiments, the Cas9 protein comprises the combination of mutations of any one of the Cas9 clones listed in Table 2.
- the Cas9 protein comprises an amino acid sequence that is at least 80% identical to the amino acid sequence of a Cas9 protein as provided by any one of the variants of Table 2. In some embodiments, the Cas9 protein comprises an amino acid sequence that is at least 85%, at least 90%, at least 92%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or at least 99.5% identical to the amino acid sequence of a Cas9 protein as provided by any one of the variants of Table 2.
- the Cas9 protein comprises a combination of mutations that exhibit activity on a target sequence comprising a 5'-NAT-3' PAM sequence at its 3'-end.
- the combination of mutations are present in any one of the clones listed in Table 3.
- the combination of mutations are conservative mutations of the clones listed in Table 3.
- the Cas9 protein comprises the combination of mutations of any one of the Cas9 clones listed in Table 3. Table 3: NAT PAM Clones
- the above description of various napDNAbps which can be used in connection with the presently disclosed base editors is not meant to be limiting in any way.
- the base editors may comprise the canonical SpCas9, or any ortholog Cas9 protein, or any variant Cas9 protein— including any naturally occurring variant, mutant, or otherwise engineered version of Cas9— that is known or which can be made or evolved through a directed evolutionary or otherwise mutagenic process.
- the Cas9 or Cas9 variants have a nickase activity, i.e., only cleave one strand of the target DNA sequence.
- the Cas9 or Cas9 variants have inactive nucleases, i.e., are “dead” Cas9 proteins.
- Other variant Cas9 proteins that may be used are those having a smaller molecular weight than the canonical SpCas9 (e.g., for easier delivery) or having modified or rearranged primary amino acid structure (e.g., the circular permutant formats).
- the base editors described herein may also comprise Cas9 equivalents, including Cas12a/Cpf1 and Cas12b proteins which are the result of convergent evolution.
- the napDNAbps used herein may also contain various modifications that alter/enhance their PAM specificities.
- the application contemplates any Cas9, Cas9 variant, or Cas9 equivalent which has at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or at least 99.9% sequence identity to a reference Cas9 sequence, such as a reference SpCas9 canonical sequence or a reference Cas9 equivalent (e.g., Cas12a/Cpf1).
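Thresholds such as "at least 90% sequence identity" presuppose a way of scoring identity between two sequences. The Python sketch below shows one simple convention, assuming the two sequences have already been aligned (gaps written as "-") and that identity is counted over alignment columns not made up entirely of gaps. Other conventions exist, and this disclosure does not prescribe one here; the function names and toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the columns of a pairwise alignment.

    Assumption: columns where both sequences have a gap are ignored and a
    residue opposite a gap counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * identical / compared


def meets_identity(aligned_a, aligned_b, threshold):
    """True if the pair reaches `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy, pre-aligned 10-residue sequences differing at one position.
print(percent_identity("MKRNYILGLD", "MKRNYILGLA"))
print(meets_identity("MKRNYILGLD", "MKRNYILGLA", 90))
```

In practice the alignment itself would come from a dedicated alignment tool; the sketch covers only the scoring step.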
- the Cas9 variant having expanded PAM capabilities is SpCas9 (H840A) VRQR, or SpCas9-VRQR.
- the disclosed base editors comprise a napDNAbp domain that has a sequence that is at least 90%, at least 95%, at least 98%, or at least 99% identical to SpCas9-VRQR.
- the disclosed base editors comprise a napDNAbp domain that comprises SpCas9-VRQR.
- the SpCas9-VRQR comprises the following amino acid sequence (with the V, R, Q, R substitutions relative to the SpCas9 (H840A) of SEQ ID NO: 220 shown in bold underline; the methionine residue in SpCas9 (H840) was removed for SpCas9 (H840A) VRQR):
- the Cas9 variant having expanded PAM capabilities is SpCas9 (H840A) VRER, having the following amino acid sequence (with the V, R, E, R substitutions relative to the SpCas9 (H840A) of SEQ ID NO: 221 shown in bold underline; the methionine residue in SpCas9 (H840) was removed for SpCas9 (H840A) VRER):
- any available methods may be utilized to obtain or construct a variant or mutant Cas9 protein.
- the term “mutation,” as used herein, refers to a substitution of a residue within a sequence, e.g., a nucleic acid or amino acid sequence, with another residue, or a deletion or insertion of one or more residues within a sequence. Mutations are typically described herein by identifying the original residue followed by the position of the residue within the sequence and by the identity of the newly substituted residue.
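The naming convention described above (original residue, position, new residue, as in H840A or D108N) is easy to parse mechanically. The Python sketch below parses such a label and applies it to a sequence, checking that the residue at the stated position matches the stated original. The class and function names are hypothetical, positions are assumed to be 1-based, and only substitutions are handled (insertions and deletions are outside this sketch).

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    original: str   # residue before mutation
    position: int   # 1-based position in the reference sequence
    new: str        # residue after mutation


def parse_substitution(label):
    """Parse a label such as 'H840A' into its three parts."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    return Substitution(match.group(1), int(match.group(2)), match.group(3))


def apply_substitution(seq, label):
    """Return `seq` with the substitution applied, after a sanity check."""
    sub = parse_substitution(label)
    index = sub.position - 1
    if index < 0 or index >= len(seq):
        raise ValueError(f"position {sub.position} is outside the sequence")
    if seq[index] != sub.original:
        raise ValueError(
            f"expected {sub.original} at {sub.position}, found {seq[index]}")
    return seq[:index] + sub.new + seq[index + 1:]


print(parse_substitution("H840A"))
print(apply_substitution("MADKH", "D3N"))  # toy 5-residue sequence
```

The check against the original residue catches the common mistake of applying a label to a sequence that uses a different numbering.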
- Mutations can include a variety of categories, such as single base polymorphisms, microduplication regions, indels, and inversions, and this list is not meant to be limiting in any way. Mutations can include “loss-of-function” mutations, which are the normal result of a mutation that reduces or abolishes a protein activity.
- Most loss-of-function mutations are recessive, because in a heterozygote the second chromosome copy carries an unmutated version of the gene coding for a fully functional protein whose presence compensates for the effect of the mutation. Mutations also embrace “gain-of-function” mutations, which confer an abnormal activity on a protein or cell that is otherwise not present in a normal condition. Many gain-of-function mutations are in regulatory sequences rather than in coding regions, and can therefore have a number of consequences. For example, a mutation might lead to one or more genes being expressed in the wrong tissues, these tissues gaining functions that they normally lack. Because of their nature, gain-of-function mutations are usually dominant.
- Mutations can be introduced into a reference Cas9 protein using site-directed mutagenesis.
- Older methods of site-directed mutagenesis known in the art rely on subcloning of the sequence to be mutated into a vector, such as an M13 bacteriophage vector, that allows the isolation of single-stranded DNA template.
- a mutagenic primer (i.e., a primer capable of annealing to the site to be mutated but bearing one or more mismatched nucleotides at the site to be mutated)
- Site-directed mutagenesis has also employed PCR methodologies, which have the advantage of not requiring a single-stranded template.
- methods have been developed that do not require sub-cloning.
- Several issues must be considered when PCR-based site-directed mutagenesis is performed. First, in these methods it is desirable to reduce the number of PCR cycles to prevent expansion of undesired mutations introduced by the polymerase. Second, a selection must be employed in order to reduce the number of non-mutated parental molecules persisting in the reaction. Third, an extended-length PCR method is preferred in order to allow the use of a single PCR primer set. And fourth, because of the non-template-dependent terminal extension activity of some thermostable polymerases it is often necessary to incorporate an end-polishing step into the procedure prior to blunt-end ligation of the PCR-generated mutant product.
- the disclosure provides adenine-to-thymine base editors that comprise an adenosine deaminase domain.
- any of the disclosed base editors are capable of deaminating adenosine in a nucleic acid sequence (e.g., DNA or RNA).
- any of the base editors provided herein may be base editors (e.g., adenine base editors).
- the disclosed adenosine deaminases are variants of known adenine deaminase TadA7.10, which comprises the following mutations as compared to wild-type ecTadA: W23R, H36L, P48A, R51L, L84F, A106V, D108N, H123Y, S146C, D147Y,
- the adenosine deaminases of the disclosed base editors hydrolytically deaminate a targeted adenosine in a nucleic acid of interest to an inosine, which is read as a guanosine (G) by DNA polymerase enzymes.
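The statement above (a targeted adenosine is deaminated to inosine, which a DNA polymerase reads as guanosine) can be illustrated with a toy string model. The Python sketch below is only that: the function names, the use of the letter "I" for inosine, and the example sequence are assumptions, and the model covers only the deamination step named in this sentence, not any later step of the disclosed editing process.

```python
# How a polymerase "reads" each base in this toy model; inosine (I) is
# read as guanosine, per the sentence above.
READ_AS = {"A": "A", "C": "C", "G": "G", "T": "T", "I": "G"}


def deaminate(seq, index):
    """Convert the adenine at 0-based `index` to inosine (written 'I')."""
    if seq[index] != "A":
        raise ValueError(f"base at index {index} is {seq[index]}, not A")
    return seq[:index] + "I" + seq[index + 1:]


def read_by_polymerase(seq):
    """Return the sequence as it would be read, with I interpreted as G."""
    return "".join(READ_AS[base] for base in seq)


edited = deaminate("TTACG", 2)
print(edited)                      # strand containing inosine
print(read_by_polymerase(edited))  # how that strand is read
```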
- adenosine deaminases are provided herein.
- the adenosine deaminase domain of any of the disclosed base editors comprises a single adenosine deaminase, or a monomer.
- the adenosine deaminase domain comprises 2, 3, 4 or 5 adenosine deaminases.
- the adenosine deaminase domain comprises two adenosine deaminases, or a dimer.
- the deaminase domain comprises a dimer of an engineered (or evolved) deaminase and a wild-type deaminase, such as a wild-type E. coli deaminase.
- the mutations provided herein may be applied to adenosine deaminases in other adenosine base editors, for example, those provided in International Publication No. WO 2018/027078, published August 2, 2018; International Publication No. WO 2019/079347 on April 25, 2019; International Application No
- any of the adenosine deaminases provided herein are capable of deaminating adenine, e.g., deaminating adenine in a deoxyadenosine residue of DNA.
- the adenosine deaminase may be derived from any suitable organism (e.g., E. coli).
- the adenosine deaminase is a naturally-occurring adenosine deaminase that includes one or more mutations corresponding to any of the mutations provided herein (e.g., mutations in ecTadA).
- One of skill in the art will be able to identify the corresponding residue in any homologous protein and in the respective encoding nucleic acid by methods well known in the art, e.g., by sequence alignment and determination of homologous residues.
- the adenosine deaminase is derived from a prokaryote.
- the adenosine deaminase is from a bacterium.
- the adenosine deaminase is from Escherichia coli, Staphylococcus aureus, Salmonella typhi, Shewanella putrefaciens, Haemophilus influenzae, Caulobacter crescentus, or Bacillus subtilis. In some embodiments, the adenosine deaminase is from E. coli.
- ecTadA natively operates as a homodimer, with one monomer catalyzing deamination, and the other monomer acting as a docking station for the tRNA substrate.
- the adenosine deaminase may be modified.
- Modified adenosine deaminases may be obtained by, e.g., evolving a reference version using a continuous evolution process (e.g., PACE) described herein so that the deaminase is effective on a nucleic acid target.
- the adenosine deaminases provided herein are capable of deaminating adenine.
- the adenosine deaminases provided herein are capable of deaminating adenine in a
- the deaminase provided herein is a dimer of two adenosine deaminases. In various embodiments, the deaminase provided herein is a homodimer of two TadA deaminases.
- the deaminase provided herein is a heterodimer of a wild-type TadA deaminase and an evolved variant of a TadA deaminase. In other embodiments, the deaminase provided herein is a monomer of an evolved variant of a TadA deaminase.
- the deaminase provided herein is a dimer of two adenosine deaminases that is linked covalently or non-covalently to a napDNAbp and an oxidase.
- the adenosine deaminase comprises an amino acid sequence that is at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or at least 99.5% identical to any one of the amino acid sequences set forth in any one of SEQ ID NOs: 1, 81-130, or to any of the adenosine deaminases provided herein. It should be appreciated that adenosine deaminases provided herein may include one or more mutations (e.g., any of the mutations provided herein).
- the disclosure provides any deaminase domains with a certain percent identity plus any of the mutations or combinations thereof described herein.
- the adenosine deaminase comprises an amino acid sequence that has 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29,
- the adenosine deaminase comprises an amino acid sequence that has at least 5, at least 10, at least 15, at least 20, at least 25, at least 30, at least 35, at least 40, at least 45, at least 50, at least 60, at least 70, at least 80, at least 90, at least 100, at least 110, at least 120, at least 130, at least 140, at least 150, at least 160, or at least 170 identical contiguous amino acid residues as compared to any one of the amino acid sequences set forth in SEQ ID NOs: 1, 81-130, or any of the adenosine deaminases provided herein.
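One way to read "at least N identical contiguous amino acid residues as compared to" a reference is that the two sequences share an identical stretch of at least N residues. Under that assumed reading, the check reduces to the longest common substring problem. The Python sketch below uses a standard dynamic-programming solution; the function names and toy sequences are hypothetical and are not taken from the SEQ ID NOs of this disclosure.

```python
def longest_shared_run(a, b):
    """Length of the longest stretch of identical contiguous residues.

    Classic longest-common-substring dynamic programme, keeping only the
    previous row to limit memory use.
    """
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the run that ended at the previous residue pair.
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def has_contiguous_identity(a, b, minimum):
    """True if the sequences share a run of at least `minimum` residues."""
    return longest_shared_run(a, b) >= minimum


# Toy sequences sharing the stretch "EVEFSH".
print(longest_shared_run("MSEVEFSHEYW", "AAEVEFSHQQ"))
print(has_contiguous_identity("MSEVEFSHEYW", "AAEVEFSHQQ", 5))
```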
- the adenosine deaminase comprises a D108X mutation in ecTadA SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase, where X indicates any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
- the adenosine deaminase comprises a D108G, D108N, D108V, D108A, or D108Y mutation in SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase.
- the adenosine deaminase comprises a D108N mutation in SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase. It should be appreciated, however, that additional deaminases may similarly be aligned to identify homologous amino acid residues that can be mutated as provided herein.
- the adenosine deaminase comprises an A106X mutation in ecTadA SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase, where X indicates any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
- the adenosine deaminase comprises an A106V mutation in SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase.
- the adenosine deaminase comprises an E155X mutation in SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase, where the presence of X indicates any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
- the adenosine deaminase comprises an E155D, E155G, or E155V mutation in SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase.
- the adenosine deaminase comprises an E155V mutation in SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase.
- the adenosine deaminase comprises a D147X mutation in SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase, where the presence of X indicates any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
- the adenosine deaminase comprises a D147Y mutation in SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase.
- the disclosure provides an adenosine deaminase domain that is a homodimer of TadA connected through a linker sequence and that comprises an amino acid sequence that is at least 80%, 85%, 90%, 95%, 98%, or 99% identical to the amino acid sequence of:
- the disclosure provides base editors comprising a Cas9 nickase (D10A) domain and adenosine deaminase domain that is a homodimer of TadA (i.e., TadA-TadA-Cas9 nickase) that comprises an amino acid sequence that is at least 80%, 85%, 90%, 95%, 98%, or 99% identical to the amino acid sequence of:
- S. aureus TadA (saTadA), or other adenosine deaminases (e.g., bacterial adenosine deaminases). It would be apparent to the skilled artisan how to identify amino acid residues from other adenosine deaminases that are homologous to the mutated residues in ecTadA. Thus, any of the mutations identified in ecTadA may be made in other adenosine deaminases that have homologous amino acid residues. It should also be appreciated that any of the mutations provided herein may be made individually or in any combination in ecTadA or another adenosine deaminase.
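One way to picture a "corresponding mutation in another adenosine deaminase" is to map a reference position through a pairwise alignment. The sketch below does this for a pre-computed gapped alignment. The function name and the short toy alignment are hypothetical; in practice the alignment would come from a sequence alignment tool, and the disclosure does not prescribe a particular method.

```python
from typing import Optional


def map_position(ref_aln: str, other_aln: str, ref_pos: int) -> Optional[tuple[int, str]]:
    """Map a 1-based position in the ungapped reference to the aligned
    position and residue in the other sequence.

    Returns None when the reference residue is aligned to a gap.
    """
    if len(ref_aln) != len(other_aln):
        raise ValueError("aligned sequences must have equal length")
    r = o = 0  # running ungapped positions in each sequence
    for a, b in zip(ref_aln, other_aln):
        if a != "-":
            r += 1
        if b != "-":
            o += 1
        if a != "-" and r == ref_pos:
            return (o, b) if b != "-" else None
    raise IndexError("ref_pos is beyond the end of the reference sequence")


# Hypothetical toy alignment ('-' marks a gap); not real TadA sequences.
ref_aln = "MKD-AVR"
other_aln = "M-DGAIR"

print(map_position(ref_aln, other_aln, 3))  # reference D3
print(map_position(ref_aln, other_aln, 2))  # reference K2 faces a gap
```

In this toy case, reference residue D3 corresponds to residue D2 of the other sequence, so a D3N mutation in the reference would correspond to D2N in the other sequence.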
- an adenosine deaminase may contain a D108N, an A106V, an E155V, and/or a D147Y mutation in ecTadA SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase.
- an adenosine deaminase comprises the following group of mutations (groups of mutations are separated by a ";") in ecTadA SEQ ID NO: 1, or corresponding mutations in another adenosine deaminase: D108N and A106V; D108N and E155V; D108N and D147Y; A106V and E155V; A106V and D147Y; E155V and D147Y; D108N, A106V, and E155V; D108N, A106V, and D147Y; D108N, E155V, and D147Y; A106V, E155V, and D147Y; and D108N, A106V, E155V, and D147Y.
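The groups listed above are every combination of two, three, or four of the mutations D108N, A106V, E155V, and D147Y. As a minimal sketch (variable names are illustrative), the same list can be generated with the Python standard library:

```python
from itertools import combinations

# The four single mutations named in the preceding paragraphs.
mutations = ["D108N", "A106V", "E155V", "D147Y"]

# All pairs, triples, and the single quadruple, in listing order.
groups = [
    combo
    for size in (2, 3, 4)
    for combo in combinations(mutations, size)
]

for group in groups:
    print(", ".join(group))

print(len(groups))  # 6 pairs + 4 triples + 1 quadruple
```

The count of eleven groups matches the eleven groups enumerated in the text.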
- an adenosine deaminase comprises one or more of the mutations , which identifies individual mutations and combinations of mutations made in ecTadA.
- an adenosine deaminase comprises any mutation or combination of mutations provided herein.
- the adenosine deaminase comprises an L84X mutation in ecTadA SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase, where X indicates any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
- the adenosine deaminase comprises an L84F mutation in SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase.
- the adenosine deaminase comprises an H123X mutation in ecTadA SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase, where X indicates any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
- the adenosine deaminase comprises an H123Y mutation in SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase.
- the adenosine deaminase comprises an I156X mutation in ecTadA SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase, where X indicates any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
- the adenosine deaminase comprises an I156F mutation in SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase.
- the adenosine deaminase comprises one, two, three, four, five, six, or seven mutations selected from the group consisting of L84X, A106X, D108X, H123X, D147X, E155X, and I156X in SEQ ID NO: 1, or a corresponding mutation or mutations in another adenosine deaminase, where X indicates the presence of any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
- the adenosine deaminase comprises one, two, three, four, five, six, or seven mutations selected from the group consisting of L84F, A106V, D108N, H123Y, D147Y, E155V, and I156F in SEQ ID NO: 1, or a corresponding mutation or mutations in another adenosine deaminase.
- the adenosine deaminase comprises one, two, three, four, five, or six mutations selected from the group consisting of S2A, I49F, A106V, D108N, D147Y, and E155V in SEQ ID NO: 1, or a corresponding mutation or mutations in another adenosine deaminase.
- the adenosine deaminase comprises one, two, three, four, or five, mutations selected from the group consisting of H8Y, A106T, D108N, N127S, and K160S in SEQ ID NO: 1, or a corresponding mutation or mutations in another adenosine deaminase.
- the adenosine deaminase comprises an A142X mutation in ecTadA SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase, where X indicates any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
- the adenosine deaminase comprises an A142N, A142D, or A142G mutation in SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase.
- the adenosine deaminase comprises an A142N mutation in SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase.
- the adenosine deaminase comprises an H36X mutation in ecTadA SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase, where X indicates any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
- the adenosine deaminase comprises an H36L mutation in SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase.
- the adenosine deaminase comprises an N37X mutation in ecTadA SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase, where X indicates any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
- the adenosine deaminase comprises an N37T or N37S mutation in SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase.
- the adenosine deaminase comprises an N37S mutation in SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase.
- the adenosine deaminase comprises a P48X mutation in ecTadA SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase, where X indicates any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
- the adenosine deaminase comprises a P48T, P48S, P48A, or P48L mutation in SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase.
- the adenosine deaminase comprises a P48T mutation in SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase. In some embodiments, the adenosine deaminase comprises a P48S mutation in SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase. In some embodiments, the adenosine deaminase comprises a P48A mutation in SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase.
- the adenosine deaminase comprises an R51X mutation in ecTadA SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase, where X indicates any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
- the adenosine deaminase comprises an R51H or R51L mutation in SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase.
- the adenosine deaminase comprises an R51L mutation in SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase.
- the adenosine deaminase comprises an S146X mutation in ecTadA SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase, where X indicates any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
- the adenosine deaminase comprises an S146R or S146C mutation in SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase.
- the adenosine deaminase comprises an S146C mutation in SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase.
- the adenosine deaminase comprises a K157X mutation in ecTadA SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase, where X indicates any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
- the adenosine deaminase comprises a K157N mutation in SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase.
- the adenosine deaminase comprises a W23X mutation in ecTadA SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase, where X indicates any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
- the adenosine deaminase comprises a W23R or W23L mutation in SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase.
- the adenosine deaminase comprises a W23R mutation in SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase. In some embodiments, the adenosine deaminase comprises a W23L mutation in SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase.
- the adenosine deaminase comprises an R152X mutation in ecTadA SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase, where X indicates any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
- the adenosine deaminase comprises an R152P or R152H mutation in SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase.
- the adenosine deaminase comprises an R152P mutation in SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase. In some embodiments, the adenosine deaminase comprises an R152H mutation in SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase.
- the adenosine deaminase comprises an R26X mutation in ecTadA SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase, where X indicates any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
- the adenosine deaminase comprises an R26G mutation in SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase.
- the adenosine deaminase comprises an I49X mutation in ecTadA SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase, where X indicates any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
- the adenosine deaminase comprises an I49V mutation in SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase.
- the adenosine deaminase comprises an N72X mutation in ecTadA SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase, where X indicates any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
- the adenosine deaminase comprises an N72D mutation in SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase.
- the adenosine deaminase comprises an S97X mutation in ecTadA SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase, where X indicates any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
- the adenosine deaminase comprises an S97C mutation in SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase.
- the adenosine deaminase comprises a G125X mutation in ecTadA SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase, where X indicates any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
- the adenosine deaminase comprises a G125A mutation in SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase.
- the adenosine deaminase comprises a K161X mutation in ecTadA SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase, where X indicates any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
- the adenosine deaminase comprises a K161T mutation in SEQ ID NO: 1, or a corresponding mutation in another adenosine deaminase.
- the adenosine deaminase comprises one or more of a W23X, H36X, N37X, P48X, I49X, R51X, N72X, L84X, S97X, A106X, D108X, H123X, G125X, A142X, S146X, D147X, R152X, E155X, I156X, K157X, and/or K161X mutation in SEQ ID NO: 1, or one or more corresponding mutations in another adenosine deaminase, where the presence of X indicates any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
- the adenosine deaminase comprises one or more of W23L, W23R, H36L, P48S, P48A, R51L, L84F, A106V, D108N, H123Y, A142N, S146C, D147Y, R152P, E155V, I156F, and/or K157N mutation in SEQ ID NO: 1, or one or more corresponding mutations in another adenosine deaminase.
- the adenosine deaminase comprises one or more of the mutations corresponding to SEQ ID NO: 1, or one or more corresponding mutations in another adenosine deaminase.
- the adenosine deaminase comprises or consists of one or two mutations selected from A106X and D108X in SEQ ID NO: 1, or a corresponding mutation or mutations in another adenosine deaminase, where X indicates the presence of any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
- the adenosine deaminase comprises or consists of one or two mutations selected from A106V and D108N in SEQ ID NO: 1, or a corresponding mutation or mutations in another adenosine deaminase.
- the adenosine deaminase comprises or consists of one, two, three, or four mutations selected from A106X, D108X, D147X, and E155X in SEQ ID NO: 1, or a corresponding mutation or mutations in another adenosine deaminase, where X indicates the presence of any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
- the adenosine deaminase comprises or consists of one, two, three, or four mutations selected from A106V, D108N, D147Y, and E155V in SEQ ID NO: 1, or a corresponding mutation or mutations in another adenosine deaminase. In some embodiments, the adenosine deaminase comprises or consists of an A106V, D108N, D147Y, and E155V mutation in SEQ ID NO: 1, or corresponding mutations in another adenosine deaminase.
- the adenosine deaminase comprises or consists of one, two, three, four, five, six, or seven mutations selected from L84X, A106X, D108X, H123X, D147X, E155X, and I156X in SEQ ID NO: 1, or a corresponding mutation or mutations in another adenosine deaminase, where X indicates the presence of any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
- the adenosine deaminase comprises or consists of one, two, three, four, five, six, or seven mutations selected from L84F, A106V, D108N, H123Y, D147Y, E155V, and I156F in SEQ ID NO: 1, or a corresponding mutation or mutations in another adenosine deaminase.
- the adenosine deaminase comprises or consists of an L84F, A106V, D108N, H123Y, D147Y, E155V, and I156F mutation in SEQ ID NO: 1, or corresponding mutations in another adenosine deaminase.
- the adenosine deaminase comprises or consists of one, two, three, four, five, six, seven, eight, nine, ten, or eleven mutations selected from H36X, R51X, L84X, A106X, D108X, H123X, S146X, D147X, E155X, I156X, and K157X in SEQ ID NO: 1, or a corresponding mutation or mutations in another adenosine deaminase, where X indicates the presence of any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
- the adenosine deaminase comprises or consists of one, two, three, four, five, six, seven, eight, nine, ten, or eleven mutations selected from H36L, R51L, L84F, A106V, D108N, H123Y, S146C, D147Y, E155V, I156F, and K157N in SEQ ID NO: 1, or a corresponding mutation or mutations in another adenosine deaminase.
- the adenosine deaminase comprises or consists of an H36L, R51L, L84F, A106V, D108N, H123Y, S146C, D147Y, E155V, I156F, and K157N mutation in SEQ ID NO: 1, or corresponding mutations in another adenosine deaminase.
- the adenosine deaminase comprises or consists of one, two, three, four, five, six, seven, eight, nine, ten, eleven, or twelve mutations selected from H36X, P48X, R51X, L84X, A106X, D108X, H123X, S146X, D147X, E155X, I156X, and K157X in SEQ ID NO: 1, or a corresponding mutation or mutations in another adenosine deaminase, where X indicates the presence of any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
- the adenosine deaminase comprises or consists of one, two, three, four, five, six, seven, eight, nine, ten, eleven, or twelve mutations selected from H36L, P48S, R51L, L84F, A106V, D108N, H123Y, S146C, D147Y, E155V, I156F, and K157N in SEQ ID NO: 1, or a corresponding mutation or mutations in another adenosine deaminase.
- the adenosine deaminase comprises or consists of an H36L, P48S, R51L, L84F, A106V, D108N, H123Y, S146C, D147Y, E155V, I156F, and K157N mutation in SEQ ID NO: 1, or corresponding mutations in another adenosine deaminase.
- the adenosine deaminase comprises or consists of one, two, three, four, five, six, seven, eight, nine, ten, eleven, twelve, or thirteen mutations selected from H36X, P48X, R51X, L84X, A106X, D108X, H123X, A142X, S146X, D147X, E155X, I156X, and K157X in SEQ ID NO: 1, or a corresponding mutation or mutations in another adenosine deaminase, where X indicates the presence of any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
- the adenosine deaminase comprises or consists of one, two, three, four, five, six, seven, eight, nine, ten, eleven, twelve, or thirteen mutations selected from H36L, P48S, R51L, L84F, A106V, D108N, H123Y, A142N, S146C, D147Y, E155V, I156F, and K157N in SEQ ID NO: 1, or a corresponding mutation or mutations in another adenosine deaminase.
- the adenosine deaminase comprises or consists of an H36L, P48S, R51L, L84F, A106V, D108N, H123Y, A142N, S146C, D147Y, E155V, I156F, and K157N mutation in SEQ ID NO: 1, or corresponding mutations in another adenosine deaminase.
- the adenosine deaminase comprises or consists of one, two, three, four, five, six, seven, eight, nine, ten, eleven, twelve, thirteen, or fourteen mutations selected from W23X, H36X, P48X, R51X, L84X, A106X, D108X, H123X, A142X, S146X, D147X, E155X, I156X, and K157X in SEQ ID NO: 1, or a corresponding mutation or mutations in another adenosine deaminase, where X indicates the presence of any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
- the adenosine deaminase comprises or consists of one, two, three, four, five, six, seven, eight, nine, ten, eleven, twelve, thirteen, or fourteen mutations selected from W23L, H36L, P48A, R51L, L84F, A106V, D108N, H123Y, A142N, S146C, D147Y, E155V, I156F, and K157N in SEQ ID NO: 1, or a corresponding mutation or mutations in another adenosine deaminase.
- the adenosine deaminase comprises or consists of a W23L, H36L, P48A, R51L, L84F, A106V, D108N, H123Y, A142N, S146C, D147Y, E155V, I156F, and K157N mutation in SEQ ID NO: 1, or corresponding mutations in another adenosine deaminase.
- the adenosine deaminase comprises or consists of one, two, three, four, five, six, seven, eight, nine, ten, eleven, twelve, thirteen, or fourteen mutations selected from W23X, H36X, P48X, R51X, L84X, A106X, D108X, H123X,
- the adenosine deaminase comprises or consists of one, two, three, four, five, six, seven, eight, nine, ten, eleven, twelve, thirteen, or fourteen mutations selected from W23R, H36L, P48A, R51L, L84F, A106V, D108N, H123Y, S146C, D147Y, R152P, E155V, I156F, and K157N in SEQ ID NO: 1, or a corresponding mutation or mutations in another adenosine deaminase.
- the adenosine deaminase comprises or consists of a W23R, H36L, P48A, R51L, L84F, A106V, D108N, H123Y, S146C, D147Y, R152P, E155V, I156F, and K157N mutation in SEQ ID NO: 1, or corresponding mutations in another adenosine deaminase.
- the adenosine deaminase comprises or consists of one, two, three, four, five, six, seven, eight, nine, ten, eleven, twelve, thirteen, fourteen, or fifteen mutations selected from W23X, H36X, P48X, R51X, L84X, A106X, D108X, H123X, A142X, S146X, D147X, R152X, E155X, I156X, and K157X in SEQ ID NO: 1, or a corresponding mutation or mutations in another adenosine deaminase, where X indicates the presence of any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
- the adenosine deaminase comprises or consists of one, two, three, four, five, six, seven, eight, nine, ten, eleven, twelve, thirteen, fourteen, or fifteen mutations selected from W23L, H36L, P48A, R51L, L84F, A106V, D108N, H123Y, A142N, S146C, D147Y, R152P, E155V, I156F, and K157N in SEQ ID NO: 1, or a corresponding mutation or mutations in another adenosine deaminase.
- the adenosine deaminase comprises or consists of a W23L, H36L, P48A, R51L, L84F, A106V, D108N, H123Y, A142N, S146C, D147Y, R152P, E155V, I156F, and K157N mutation in SEQ ID NO: 1, or corresponding mutations in another adenosine deaminase.
- the adenosine deaminase comprises one or more of the mutations corresponding to SEQ ID NO: 1, or one or more of the corresponding mutations in another deaminase. In some embodiments, the adenosine deaminase comprises or consists of a variant of SEQ ID NO: 1, or the corresponding variant in another adenosine deaminase.
- the adenosine deaminase may comprise one or more of the mutations provided in any of the adenosine deaminases (e.g., ecTadA adenosine deaminases).
- the adenosine deaminase comprises the combination of mutations of any of the adenosine deaminases (e.g., ecTadA adenosine deaminases).
- the adenosine deaminase may comprise the mutations W23R, H36L, P48A, R51L, L84F, A106V, D108N, H123Y, S146C, D147Y, R152P, E155V, I156F, and K157N (relative to SEQ ID NO: 1), which is shown as ABE7.1.
- the adenosine deaminase may comprise the mutations H36L, R51L, L84F, A106V, D108N, H123Y, S146C, D147Y, E155V, I156F, and K157N (relative to SEQ ID NO: 1).
- the adenosine deaminase comprises any of the following combinations of mutations relative to SEQ ID NO: 1, where each mutation of a combination is separated by a "_" and each combination of mutations is between parentheses: (A106V_D108N), (R107C_D108N),
- the adenosine deaminase comprises an amino acid sequence that is at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99%, or 99.5% identical to any one of SEQ ID NOs: 1, 81-130, or any of the adenosine deaminases provided herein.
- the adenosine deaminase comprises an amino acid sequence that has 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26,
- the adenosine deaminase comprises an amino acid sequence that has at least 5, at least 10, at least 15, at least 20, at least 25, at least 30, at least 35, at least 40, at least 45, at least 50, at least 60, at least 70, at least 80, at least 90, at least 100, at least 110, at least 120, at least 130, at least 140, at least 150, at least 160, or at least 166 identical contiguous amino acid residues as compared to any one of the amino acid sequences set forth in SEQ ID NOs: 1, 81-130, or any of the adenosine deaminases provided herein.
- the adenosine deaminase comprises the amino acid sequence of any one of SEQ ID NOs: 1, 81-130 or any of the adenosine deaminases provided herein. In some embodiments, the adenosine deaminase consists of the amino acid sequence of any one of SEQ ID NOs: 1, 81-130, or any of the adenosine deaminases provided herein.
- the ecTadA sequences provided below are from ecTadA (SEQ ID NO: 1), absent the N-terminal methionine (M).
- saTadA sequences provided below are from saTadA (SEQ ID NO: 8), absent the N-terminal methionine (M).
- amino acid numbering scheme used to identify the various amino acid mutations is derived from ecTadA (SEQ ID NO: 1) for E. coli TadA and saTadA (SEQ ID NO: 8) for S. aureus TadA.
- Amino acid mutations, relative to SEQ ID NO: 1 (ecTadA) or SEQ ID NO: 8 (saTadA), are indicated by underlining.
- ecTadA
- ecTadA H8Y, D108N, N127S, and E155D
- SEVEFSYEYWMRHALTLAKRAWDEREVPVGAVLVHNNRVIGEGWNRPIGRH DPTAHAEIMALRQGGLVMQNYRLIDATLYVTLEPCVMCAGAMIHSRIGRVVFGARN AKTGAAGSLMDVLHHPGMSHR VEITEGILADECAALLSDFFRMRRQDIKAQKKAQS STD (SEQ ID NO: 86)
- ecTadA (R26G, L84F, A106V, R107H, D108N, H123Y, A142N, A143D, D147Y, E155V, I156F)
- ecTadA E25G, R26G, L84F, A106V, R107H, D108N, H123Y, A142N, A143D, D147Y, E155V, I156F
- ecTadA E25D, R26G, L84F, A106V, R107K, D108N, H123Y, A142N, A143G, D147Y, E155V, I156F
- ecTadA E25M, R26G, L84F, A106V, R107P, D108N, H123Y, A142N, A143D, D147Y, E155V, I156F
- ecTadA (R26C, L84F, A106V, R107H, D108N, H123Y, A142N, D147Y, E155V, I156F)
- ecTadA (L84F, A106V, D108N, H123Y, S146R, D147Y, E155V, I156F) SEVEFSHEYWMRHALTLAKRAWDEREVPVGAVLVHNNRVIGEGWNRPIGRH DPTAHAEIMALRQGGLVMQNYRLIDATLYVTFEPCVMCAGAMIHSRIGRVVFGVRN AKTGAAGSLMDVLHYPGMNHRVEITEGILADECAALLRYFFRMRRQVFKAQKKAQ SSTD (SEQ ID NO: 106)
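Because the listed sequences omit the N-terminal methionine while the mutation numbering follows SEQ ID NO: 1, the residue at numbered position p appears to sit at 0-based index p - 2 of the printed sequence. That offset is an inference, not a statement from the text; the sketch below applies it to SEQ ID NO: 106 as printed above and checks that each mutation in its label is present. The helper names are hypothetical.

```python
import re

# SEQ ID NO: 106 as printed above, with the spacing removed.
# It lacks the N-terminal methionine of SEQ ID NO: 1.
SEQ_106 = (
    "SEVEFSHEYWMRHALTLAKRAWDEREVPVGAVLVHNNRVIGEGWNRPIGRH"
    "DPTAHAEIMALRQGGLVMQNYRLIDATLYVTFEPCVMCAGAMIHSRIGRVVFGVRN"
    "AKTGAAGSLMDVLHYPGMNHRVEITEGILADECAALLRYFFRMRRQVFKAQKKAQ"
    "SSTD"
)
LABEL = "L84F, A106V, D108N, H123Y, S146R, D147Y, E155V, I156F"


def residue_at(seq_without_met: str, position: int) -> str:
    """Residue at a SEQ ID NO: 1 numbered position in a Met-less sequence.

    Assumption: Met is position 1, so position p is at 0-based index p - 2.
    """
    return seq_without_met[position - 2]


def check_label(seq_without_met: str, label: str) -> dict[str, bool]:
    """For each mutation in the label, report whether the new residue is present."""
    results = {}
    for wild_type, pos, new in re.findall(r"([A-Z])(\d+)([A-Z])", label):
        results[f"{wild_type}{pos}{new}"] = residue_at(seq_without_met, int(pos)) == new
    return results


for mutation, present in check_label(SEQ_106, LABEL).items():
    print(mutation, present)
```

All eight mutations in the label are found at the expected positions under this offset, which supports reading the numbering as including the omitted methionine.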
- ecTadA H36L, L84F, A106V, D108N, H123Y, D147Y, E155V, K157N,
- ecTadA H36L, L84F, A106V, D108N, H123Y, S146C, D147Y, E155V,
- ecTadA (L84F, A106V, D108N, H123Y, S146R, D147Y, E155V, I156F) SEVEFSHEYWMRHALTLAKRAWDEREVPVGAVLVHNNRVIGEGWNRPIGRH DPTAHAEIMALRQGGLVMQNYRLIDATLYVTFEPCVMCAGAMIHSRIGRVVFGVRN AKTGAAGSLMDVLHYPGMNHRVEITEGILADECAALLRYFFRMRRQVFKAQKKAQ SSTD (SEQ ID NO: 110)
- ecTadA (N37S, R51H, L84F, A106V, D108N, H123Y, D147Y, E155V,
- ecTadA H36L, R51L, L84F, A106V, D108N, H123Y, S146C, D147Y, E155V, I156F, K157N
- ecTadA H36L, P48S, R51L, L84F, A106V, D108N, H123Y, S146C, D147Y, E155V, I156F, K157N
- ecTadA H36L, P48A, R51L, L84F, A106V, D108N, H123Y, S146C, D147Y, E155V, I156F, K157N
- ecTadA (W23L, H36L, P48A, R51L, L84F, A106V, D108N, H123Y, S146C, D147Y, R152P, E155V, I156F, K157N)
- ecTadA H36L, P48S, R51L, L84F, A106V, D108N, H123Y, A142N, S146C, D147Y, E155V, I156F, K157N
- ecTadA (W23L, H36L, P48A, R51L, L84F, A106V, D108N, H123Y, A142N, S146C, D147Y, E155V, I156F, K157N)
- ecTadA W23L, H36L, P48A, R51L, L84F, A106V, D108N, H123Y, A142N, S146C, D147Y, R152P, E155V, I156F, K157N
- ecTadA (W23R, H36L, P48A, R51L, L84F, A106V, D108N, H123Y, S146C, D147Y, R152P, E155V, I156F, K157N)
- Some aspects of the disclosure provide fusion proteins that comprise a nucleic acid programmable DNA binding protein (napDNAbp) and at least two adenosine deaminase domains.
- dimerization of adenosine deaminases may improve the ability (e.g., efficiency) of the fusion protein to modify a nucleic acid base, for example to deaminate adenine.
- any of the fusion proteins may comprise 2, 3, 4 or 5 adenosine deaminase domains.
- any of the fusion proteins provided herein comprise two adenosine deaminases.
- any of the fusion proteins provided herein contain only two adenosine deaminases.
- the adenosine deaminases are the same.
- the adenosine deaminases are any of the adenosine deaminases provided herein.
- the adenosine deaminases are different.
- the first adenosine deaminase is any of the adenosine deaminases provided herein.
- the second adenosine deaminase is any of the adenosine deaminases provided herein, but is not identical to the first adenosine deaminase.
- the fusion protein may comprise a first adenosine deaminase and a second adenosine deaminase that both comprise the amino acid sequence of SEQ ID NO: 89, which contains a A106V,
- the fusion protein may comprise a first adenosine deaminase that comprises the amino acid sequence of SEQ ID NO: 124, which contains an H36L, P48S, R51L, L84F, A106V, D108N, H123Y, S146C, D147Y, E155V, I156F, and K157N mutation from SEQ ID NO: 1, and a second adenosine deaminase domain that comprises the amino acid sequence of wild-type ecTadA (SEQ ID NO: 1).
- the fusion protein may comprise a first adenosine deaminase that comprises the amino acid sequence of SEQ ID NO: 127, which contains an H36L, P48S, R51L, L84F, A106V, D108N, H123Y, A142N, S146C, D147Y, E155V, I156F, and K157N mutation from SEQ ID NO: 1, and a second adenosine deaminase domain that comprises the amino acid sequence of wild-type ecTadA (SEQ ID NO: 1).
- the fusion protein may comprise a first adenosine deaminase that comprises the amino acid sequence of SEQ ID NO: 128, which contains a W23L, H36L, P48A, R51L, L84F, A106V, D108N, H123Y, A142N, S146C, D147Y, E155V, I156F, and K157N mutation from SEQ ID NO: 1, and a second adenosine deaminase domain that comprises the amino acid sequence of wild-type ecTadA (SEQ ID NO: 1).
- the fusion protein may comprise a first adenosine deaminase that comprises the amino acid sequence of SEQ ID NO: 129, which contains a W23L, H36L, P48A, R51L, L84F, A106V, D108N, H123Y, A142N, S146C, D147Y, R152P, E155V, I156F, and K157N mutation from SEQ ID NO: 1, and a second adenosine deaminase domain that comprises the amino acid sequence of wild-type ecTadA (SEQ ID NO: 1).
- the fusion protein may comprise a first adenosine deaminase that comprises the amino acid sequence of SEQ ID NO: 130, which contains a W23R, H36L, P48A, R51L, L84F, A106V, D108N, H123Y, S146C, D147Y, R152P, E155V, I156F, and K157N mutation from SEQ ID NO: 1, and a second adenosine deaminase domain that comprises the amino acid sequence of wild-type ecTadA (SEQ ID NO: 1).
- the fusion protein comprises two adenosine deaminases (e.g., a first adenosine deaminase and a second adenosine deaminase). In some embodiments, the first adenosine deaminase is N-terminal to the second adenosine deaminase in the fusion protein.
- the first adenosine deaminase is C-terminal to the second adenosine deaminase in the fusion protein.
- the first adenosine deaminase and the second adenosine deaminase are fused directly or via a linker.
- the linker is any of the linkers provided herein, for example, any of the linkers described in the "Linkers" section.
- the linker comprises the amino acid sequence of any one of SEQ ID NOs: 2, 11, 12, 14 28, 59, 60 or 79. In some embodiments, the linker is 32 amino acids in length.
- the linker comprises the amino acid sequence (SGGS)2-SGSETPGTSESATPES-(SGGS)2 (SEQ ID NO: 59), which may also be referred to as (SGGS)2-XTEN-(SGGS)2 (SEQ ID NO: 59).
- the linker comprises the amino acid sequence (SGGS)n-SGSETPGTSESATPES-(SGGS)n (SEQ ID NO: 60), wherein n is 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10.
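The linker formula above can be expanded into a concrete sequence. The sketch below reads it as (SGGS)n-SGSETPGTSESATPES-(SGGS)n with the same n on both sides, which is an assumption about the garbled notation; the function name is hypothetical.

```python
XTEN = "SGSETPGTSESATPES"  # central segment named in the text


def build_linker(n: int) -> str:
    """Expand (SGGS)n-XTEN-(SGGS)n for n from 0 to 10 inclusive."""
    if not 0 <= n <= 10:
        raise ValueError("n must be between 0 and 10")
    return "SGGS" * n + XTEN + "SGGS" * n


linker = build_linker(2)
print(linker)
print(len(linker))
```

For n = 2 the result has 8 + 16 + 8 = 32 residues, which agrees with the 32-amino-acid linker length stated above for the (SGGS)2-XTEN-(SGGS)2 linker.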
- the first adenosine deaminase is the same as the second adenosine deaminase.
- the first adenosine deaminase and the second adenosine deaminase are any of the adenosine deaminases described herein. In some embodiments, the first adenosine deaminase and the second adenosine deaminase are different. In some embodiments, the first adenosine deaminase is any of the adenosine deaminases provided herein. In some embodiments, the second adenosine deaminase is any of the adenosine deaminases provided herein but is not identical to the first adenosine deaminase.
- the first adenosine deaminase is an ecTadA adenosine deaminase.
- the first adenosine deaminase comprises an amino acid sequence that is at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or at least 99.5% identical to any one of the amino acid sequences set forth in any one of SEQ ID NOs: 1, 81-130, or to any of the adenosine deaminases provided herein.
- the first adenosine deaminase comprises the amino acid sequence of SEQ ID NO: 1.
- the second adenosine deaminase comprises an amino acid sequence that is at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or at least 99.5% identical to any one of the amino acid sequences set forth in any one of SEQ ID NOs: 1, 81-126, or to any of the adenosine deaminases provided herein.
- the second adenosine deaminase comprises the amino acid sequence of SEQ ID NO: 1.
- the first adenosine deaminase and the second adenosine deaminase of the fusion protein comprise the mutations in ecTadA (SEQ ID NO: 1), or corresponding mutations in another adenosine deaminase, as shown in any one of the constructs provided in Table 4 (e.g., pNMG-371, pNMG-477, pNMG-576, pNMG-586, and pNMG-616).
- the fusion protein comprises the two adenosine deaminases (e.g., a first adenosine deaminase and a second adenosine deaminase) of any one of the constructs (e.g., pNMG-371, pNMG-477, pNMG-576, pNMG-586, and pNMG-616) in Table 4.
- the general architecture of exemplary fusion proteins with a first adenosine deaminase, a second adenosine deaminase, and a napDNAbp comprises any one of the following structures, where NLS is a nuclear localization sequence (e.g., any NLS provided herein), NH2 is the N-terminus of the fusion protein, and COOH is the C-terminus of the fusion protein.
- Fusion proteins comprising a first adenosine deaminase, a second adenosine deaminase, and a napDNAbp.
- the fusion proteins provided herein do not comprise a linker.
- a linker is present between one or more of the domains or proteins (e.g., first adenosine deaminase, second adenosine deaminase, and/or napDNAbp).
- the "-" used in the general architecture above indicates the presence of an optional linker.
- the transversion base editors provided herein comprise an oxidase.
- the oxidase may be modified from its wild type form.
- Modified oxidases may be obtained by, e.g., evolving a reference oxidase or dioxygenase (e.g., an RNA modification enzyme or a 5-methylcytosine modification enzyme) using a continuous evolution process (e.g., PACE) or non-continuous evolution process (e.g.,
- the oxidase comprises an inosine oxidase (e.g., AlkB), or an evolved variant thereof (FIG. 1A).
- the deaminase-oxidase domain comprises an inosine oxidase and a TadA adenosine deaminase domain, or a variant thereof.
- the deaminase-oxidase domain comprises an inosine oxidase and a TadA adenosine deaminase homodimer, or a variant thereof (FIGs. 1A, 1B).
- MLDLFADAEPWQEPLAAGAVILRRFAFNAAEQLIRDINDVASQSPFRQMVTPGGYTMSVAMTNCGHLGWTTHRQGYLYSPIDPQTNKPWPAMPQSFHNLCQRAATAAGYPDFQPDACLINRYAPGAKLSLHQDKDEPDLRAPIVSVSLGLPAIFQFGGLKRNDPLKRLLLEHGDVVVWGGESRLFYHGIQPLKAGFHPLTIDCRYNLTFRQAGKKE
- Cytochrome P450 1A2 ("CYP1A2") (human):
- VVFNGQTTTLSNSHINSATNQASTKSHEYSKVTNSLSLFIPKSNSSKIDTNKSIAQGIIT
- VEKKPIPRIKRKNNSTTTNNSKPSSLPTLGSNTETVQPEVKSETEPHFILKSSDNTKTYS
- TET1-CD (“Catalytic domain”) (human):
- the base editors and constructs encoding the base editors disclosed herein further comprise one or more additional base editor elements, e.g., a nuclear localization signal(s), an inhibitor of base excision repair, and/or a heterologous protein domain.
- the base editors and constructs encoding the base editors disclosed herein further comprise one or more, preferably, at least two nuclear localization signals.
- the base editors comprise at least two NLSs.
- the NLSs can be the same NLSs or they can be different NLSs.
- the NLSs may be expressed as part of a fusion protein with the remaining portions of the base editors.
- one or more of the NLSs are bipartite NLSs (“bpNLS”).
- the disclosed fusion proteins comprise two bipartite NLSs. In some embodiments, the disclosed fusion proteins comprise more than two bipartite NLSs.
- the location of the NLS fusion can be at the N-terminus, the C-terminus, or within a sequence of a base editor (e.g., inserted between the encoded napDNAbp component (e.g., Cas9) and a DNA nucleobase modification domain (e.g., a deaminase-oxidase domain)).
- the NLSs may be any known NLS sequence in the art.
- the NLSs may also be any future-discovered NLSs for nuclear localization.
- the NLSs also may be any naturally-occurring NLS, or any non-naturally occurring NLS (e.g., an NLS with one or more desired mutations).
- nuclear localization sequence refers to an amino acid sequence that promotes import of a protein into the cell nucleus, for example, by nuclear transport.
- Nuclear localization sequences are known in the art and would be apparent to the skilled artisan.
- NLS sequences are described in Plank et al., International PCT application PCT/EP2000/011690, filed November 23, 2000, published as WO/2001/038547 on May 31, 2001, the contents of which are incorporated herein by reference.
- an NLS comprises the amino acid sequence PKKKRKV (SEQ ID NO: 61), MDSLLMNRRKFLYQFKNVRWAKGRRETYLC (SEQ ID NO: 62),
- NLS comprises the amino acid sequences
- a base editor may be modified with one or more nuclear localization signals (NLS), preferably at least two NLSs.
- the base editors are modified with two or more NLSs.
- the disclosure contemplates the use of any nuclear localization signal known in the art at the time of the disclosure, or any nuclear localization signal that is identified or otherwise made available in the state of the art after the time of the instant filing.
- a representative nuclear localization signal is a peptide sequence that directs the protein to the nucleus of the cell in which the sequence is expressed.
- a nuclear localization signal is predominantly basic, can be positioned almost anywhere in a protein's amino acid sequence, generally comprises a short sequence of four amino acids (Autieri & Agrawal, (1998) J. Biol. Chem.
- Nuclear localization signals often comprise proline residues.
- a variety of nuclear localization signals have been identified and have been used to effect transport of biological molecules from the cytoplasm to the nucleus of a cell. See, e.g., Tinland et al., (1992) Proc. Natl. Acad. Sci. U.S.A. 89:7442-46; Moede et al., (1999) FEBS Lett. 461:229-34, which is incorporated by reference. Translocation is currently thought to involve nuclear pore proteins.
- NLSs can be classified in three general groups: (i) a monopartite NLS exemplified by the SV40 large T antigen NLS (PKKKRKV (SEQ ID NO: 61)); (ii) a bipartite motif consisting of two basic domains separated by a variable number of spacer amino acids and exemplified by the Xenopus nucleoplasmin NLS (KRXXXXXXXXXKKKL (SEQ ID NO: 65)); and (iii) noncanonical sequences such as M9 of the hnRNP A1 protein, the influenza virus nucleoprotein NLS, and the yeast Gal4 protein NLS (Dingwall and Laskey 1991).
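The monopartite and bipartite classes above can be illustrated with a simple motif scan. The sketch below is a hedged example, not a method from the disclosure: it treats "X" in a quoted motif as "any amino acid" and looks for exact matches to the two example motifs. The helper names `motif_to_regex` and `find_nls` are hypothetical, and real NLS prediction is considerably more permissive than exact matching.

```python
import re

# Example motifs quoted in the text; "X" stands for any amino acid.
MONOPARTITE = "PKKKRKV"          # SV40 large T antigen NLS
BIPARTITE = "KRXXXXXXXXXKKKL"    # Xenopus nucleoplasmin-type motif, as quoted


def motif_to_regex(motif: str) -> re.Pattern:
    """Compile a motif into a regex, mapping each X to a single-residue wildcard."""
    return re.compile("".join("." if aa == "X" else re.escape(aa) for aa in motif))


def find_nls(protein: str) -> list[tuple[str, int]]:
    """Return (class label, 0-based start) for each exact motif match."""
    hits = []
    for label, motif in (("monopartite", MONOPARTITE), ("bipartite", BIPARTITE)):
        for match in motif_to_regex(motif).finditer(protein):
            hits.append((label, match.start()))
    return hits


# Hypothetical test sequence: three filler residues followed by the SV40 NLS.
print(find_nls("MAG" + MONOPARTITE + "GSG"))
```

Noncanonical signals (group iii) have no single consensus motif and are deliberately left out of this sketch.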
- Nuclear localization signals appear at various points in the amino acid sequences of proteins. NLSs have been identified at the N-terminus, the C-terminus and in the central region of proteins. Thus, the disclosure provides base editors that may be modified with one or more NLSs at the C-terminus, the N-terminus, as well as in an internal region of the base editor. The residues of a longer sequence that do not function as component NLS residues should be selected so as not to interfere, for example ionically or sterically, with the nuclear localization signal itself. Therefore, although there are no strict limits on the composition of an NLS-comprising sequence, in practice, such a sequence can be
- the present disclosure contemplates any suitable means by which to modify a base editor to include one or more NLSs.
- the base editors may be engineered to express a base editor protein that is translationally fused at its N-terminus or its C-terminus (or both) to one or more NLSs, i.e., to form a base editor-NLS fusion construct.
- the base editor-encoding nucleotide sequence may be genetically modified to incorporate a reading frame that encodes one or more NLSs in an internal region of the encoded base editor.
- the NLSs may include various amino acid linkers or spacer regions encoded between the base editor and the N-terminally, C-terminally, or internally-attached NLS amino acid sequence (e.g., in the central region of the protein).
- the present disclosure also provides for nucleotide constructs, vectors, and host cells for expressing fusion proteins that comprise a base editor and one or more NLSs.
- the base editors described herein may also comprise nuclear localization signals which are linked to a base editor through one or more linkers, e.g., a polymeric, amino acid, nucleic acid, polysaccharide, or chemical linker element.
- linkers within the contemplated scope of the disclosure are not intended to have any limitations and can be any suitable type of molecule (e.g., polymer, amino acid, polysaccharide, nucleic acid, lipid, or any synthetic chemical linker domain) and be joined to the base editor by any suitable strategy that effectuates forming a bond (e.g., covalent linkage, hydrogen bonding) between the base editor and the one or more NLSs.
- the base editors described herein may comprise an inhibitor of base repair.
- the term "inhibitor of base repair" or "IBR" refers to a protein that is capable of inhibiting the activity of a nucleic acid repair enzyme, for example a base excision repair enzyme.
- the IBR is an inhibitor of OGG base excision repair.
- the IBR is an inhibitor of base excision repair (“iBER”).
- Exemplary inhibitors of base excision repair include inhibitors of APE1, Endo III, Endo IV, Endo V, Endo VIII, Fpg, hOGG1, hNEIL1, T7 EndoI, T4PDG, UDG, hSMUG1, and hAAG.
- the IBR is an inhibitor of Endo V or hAAG. In some embodiments, the IBR is an iBER that may be a catalytically inactive glycosylase or catalytically inactive dioxygenase or a small molecule or peptide inhibitor of an oxidase, or variants thereof. In some embodiments, the IBR is an iBER that may be a TDG inhibitor, MBD4 inhibitor or an inhibitor of an AlkBH enzyme. In some embodiments, the IBR is an iBER that comprises a catalytically inactive TDG or catalytically inactive MBD4. An exemplary catalytically inactive TDG is an N140A mutant of SEQ ID NO: 145 (human TDG).
- glycosylases are provided below.
- the catalytically inactivated variants of any of these glycosylase domains are iBERs that may be fused to the napDNAbp, deaminase, or oxidase domains of the base editors provided in this disclosure.
- the fusion proteins described herein may comprise one or more heterologous protein domains (e.g., about or more than about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more domains in addition to the base editor components).
- a fusion protein may comprise any additional protein sequence, and optionally a linker sequence between any two domains.
- Other exemplary features that may be present are localization sequences, such as cytoplasmic localization sequences, export sequences, such as nuclear export sequences, or other localization sequences, as well as sequence tags that are useful for solubilization, purification, or detection of the fusion proteins.
- Examples of protein domains that may be fused to a base editor or component thereof include, without limitation, epitope tags, and reporter gene sequences.
- epitope tags include histidine (His) tags, V5 tags, FLAG tags, influenza hemagglutinin (HA) tags, Myc tags, VSV-G tags, and thioredoxin (Trx) tags.
- reporter genes include, but are not limited to, glutathione-S-transferase (GST), horseradish peroxidase (HRP), chloramphenicol acetyltransferase (CAT), beta-galactosidase, beta-glucuronidase, luciferase, green fluorescent protein (GFP), HcRed, DsRed, cyan fluorescent protein (CFP), yellow fluorescent protein (YFP), and autofluorescent proteins including blue fluorescent protein (BFP).
- a base editor may be fused to a gene sequence encoding a protein or a fragment of a protein that binds DNA molecules or binds other cellular molecules, including, but not limited to, maltose binding protein (MBP), S-tag, Lex A DNA binding domain (DBD) fusions, GAL4 DNA binding domain fusions, and herpes simplex virus (HSV) BP 16 protein fusions. Additional domains that may form part of a base editor are described in US Patent Publication No. 2011/0059502, published March 10, 2011 and incorporated herein by reference in its entirety.
- a reporter gene which includes, but is not limited to, glutathione-S-transferase (GST), horseradish peroxidase (HRP), chloramphenicol acetyltransferase (CAT), beta-galactosidase, beta-glucuronidase, luciferase, green fluorescent protein (GFP), HcRed, DsRed, cyan fluorescent protein (CFP), yellow fluorescent protein (YFP), and autofluorescent proteins including blue fluorescent protein (BFP), may be introduced into a cell to encode a gene product which serves as a marker by which to measure the alteration or modification of expression of the gene product.
- the gene product is luciferase.
- the expression of the gene product is decreased.
- Suitable protein tags include, but are not limited to, biotin carboxylase carrier protein (BCCP) tags, myc-tags, calmodulin-tags, FLAG-tags, hemagglutinin (HA)-tags, polyhistidine tags (also referred to as histidine tags or His-tags), maltose binding protein (MBP)-tags, nus-tags, glutathione-S-transferase (GST)-tags, green fluorescent protein (GFP)-tags, thioredoxin-tags, S-tags, Softags (e.g., Softag 1, Softag 3), strep-tags, biotin ligase tags, FlAsH tags, V5 tags, and SBP-tags. Additional suitable sequences will be apparent to those of skill in the art.
- the fusion protein comprises one or more His tags.
- linkers may be used to link any of the peptides or peptide domains or domains of the disclosure (e.g., domain A covalently linked to domain B which is covalently linked to domain C).
- the term "linker," as used herein, refers to a chemical group or a molecule linking two molecules or domains, e.g., a binding domain and a cleavage domain of a nuclease.
- a linker joins a gRNA binding domain of a napDNAbp nuclease and the catalytic domain of a recombinase.
- a linker joins a dCas9 and base editor domain (e.g., an oxidase).
- the linker is positioned between, or flanked by, two groups, molecules, or other domains and connected to each one via a covalent bond, thus connecting the two.
- the linker is an amino acid or a plurality of amino acids (e.g., a peptide or protein).
- the linker is an organic molecule, group, polymer, or chemical domain. Chemical domains include, but are not limited to, disulfide, hydrazone, thiol and azo domains.
- the linker is 5-100 amino acids in length, for example, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 30-35, 35-40, 40-45, 45-50, 50-60, 60-70, 70-80, 80-90, 90-100, 100-150, or 150-200 amino acids in length.
- the linker is a single atom, or a single angstrom, in length. Longer or shorter linkers are also contemplated.
- the linker may be as simple as a covalent bond, or it may be a polymeric linker many atoms in length.
- the linker is a polypeptide or based on amino acids. In other embodiments, the linker is not peptide-like.
- the linker is a covalent bond (e.g., a carbon-carbon bond, disulfide bond, carbon-heteroatom bond, etc.).
- the linker is a carbon-nitrogen bond of an amide linkage.
- the linker is a cyclic or acyclic, substituted or unsubstituted, branched or unbranched aliphatic or heteroaliphatic linker.
- the linker is polymeric (e.g., polyethylene, polyethylene glycol, polyamide, polyester, etc.). In certain embodiments, the linker comprises a monomer, dimer, or polymer of aminoalkanoic acid. In certain embodiments, the linker comprises an aminoalkanoic acid (e.g., glycine, ethanoic acid, alanine, beta-alanine, 3-aminopropanoic acid, 4-aminobutanoic acid, 5-pentanoic acid, etc.). In certain embodiments, the linker comprises a monomer, dimer, or polymer of aminohexanoic acid (Ahx).
- the linker is based on a carbocyclic domain (e.g., cyclopentane, cyclohexane). In other embodiments, the linker comprises a polyethylene glycol domain (PEG). In other embodiments, the linker comprises amino acids. In certain embodiments, the linker comprises a peptide.
- the linker comprises an aryl or heteroaryl domain. In certain embodiments, the linker is based on a phenyl ring.
- the linker may include functionalized domains to facilitate attachment of a nucleophile (e.g., thiol, amino) from the peptide to the linker.
- Any electrophile may be used as part of the linker. Exemplary electrophiles include, but are not limited to, activated esters, activated amides, Michael acceptors, alkyl halides, aryl halides, acyl halides, and isothiocyanates.
- the linker comprises the amino acid sequence (GGGGS)n (SEQ ID NO: 73), (G)n (SEQ ID NO: 74), (EAAAK)n (SEQ ID NO: 75), (GGS)n (SEQ ID NO: 76), (SGGS)n (SEQ ID NO: 77), (XP)n (SEQ ID NO: 78), or any combination thereof, wherein n is independently an integer between 1 and 30, and wherein X is any amino acid.
- the linker comprises the amino acid sequence (GGS)n (SEQ ID NO: 63), wherein n is 1, 3, or 7.
- the linker comprises the amino acid sequence SGSETPGTSESATPES (SEQ ID NO: 79). In some embodiments, the linker comprises the amino acid sequence SGGSSGGSSGSETPGTSESATPESSGGSSGGS (SEQ ID NO: 11). In some embodiments, the linker comprises the amino acid sequence
- the linker comprises the amino acid sequence SGGS (SEQ ID NO: 14).
- the fusion protein comprises the structure NH2-[oxidase]-[optional linker sequence]-[adenosine deaminase]-[optional linker sequence]-[adenosine deaminase]-[optional linker sequence]-[dCas9 or Cas9 nickase]-COOH; NH2-[adenosine deaminase]-[optional linker sequence]-[adenosine deaminase]-[optional linker sequence]-[oxidase]-[optional linker sequence]-[dCas9 or Cas9 nickase]-COOH; or NH2-[adenosine deaminase]-[optional linker sequence]-[adenosine deaminase]-[optional linker sequence]-[dCas9 or Cas9 nickase]-COOH
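The domain orders listed above can be written down programmatically. The following is only an illustrative sketch: `render_architecture` is a hypothetical helper that prints a domain order in bracket notation from N-terminus (written here as NH2) to C-terminus (COOH), emitting a linker slot only where a linker name is supplied. It manipulates labels, not actual protein sequences.

```python
from typing import Optional, Sequence


def render_architecture(domains: Sequence[str],
                        linkers: Sequence[Optional[str]]) -> str:
    """Render a domain order as NH2-[...]-[...]-COOH.

    linkers[i] sits between domains[i] and domains[i + 1]; None means
    the two domains are fused directly.
    """
    if len(linkers) != len(domains) - 1:
        raise ValueError("need exactly one linker slot between each pair of domains")
    parts = [f"[{domains[0]}]"]
    for linker, domain in zip(linkers, domains[1:]):
        if linker is not None:
            parts.append(f"[{linker}]")
        parts.append(f"[{domain}]")
    return "NH2-" + "-".join(parts) + "-COOH"


# First listed arrangement, with a linker (label chosen for illustration)
# only between the second deaminase and the Cas9 nickase.
print(render_architecture(
    ["oxidase", "adenosine deaminase", "adenosine deaminase", "Cas9 nickase"],
    [None, None, "linker"],
))
```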
- the target nucleotide sequence is a DNA sequence in a genome, e.g. a eukaryotic genome.
- the target nucleotide sequence is in a mammalian (e.g. a human) genome.
- the target nucleotide sequence is in a human genome.
- the target nucleotide sequence is in the genome of a rodent, such as a mouse or rat.
- the target nucleotide sequence is in the genome of a domesticated animal, such as a horse, cat, dog, or rabbit.
- Some embodiments of the disclosure are based on the recognition that any of the fusion proteins provided herein are capable of modifying a specific nucleobase without generating a significant proportion of indels.
- An "indel", as used herein, refers to the insertion or deletion of a nucleobase within a nucleic acid. Such insertions or deletions can lead to frame shift mutations within a coding region of a gene.
- any of the fusion proteins provided herein are capable of generating a greater proportion of intended modifications (e.g., point mutations) versus indels.
- the fusion proteins provided herein are capable of generating a ratio of intended point mutations to indels that is greater than 1:1.
- the fusion proteins provided herein are capable of generating a ratio of intended point mutations to indels that is at least 1.5:1, at least 2:1, at least 2.5:1, at least 3:1, at least 3.5:1, at least 4:1, at least 4.5:1, at least 5:1, at least 5.5:1, at least 6:1, at least 6.5:1, at least 7:1, at least 7.5:1, at least 8:1, at least 10:1, at least 12:1, at least 15:1, at least 20:1, at least 25:1, at least 30:1, at least 40:1, at least 50:1, at least 100:1, at least 200:1, at least 300:1, at least 400:1, at least 500:1, at least 600:1, at least 700:1, at least 800:1, at least 900:1, or at least 1000:1, or more.
- the number of intended mutations and indels may be determined using any suitable method, for example the methods used in the below Examples.
- sequencing reads are scanned for exact matches to two 10-bp sequences that flank both sides of a window in which indels might occur. If no exact matches are located, the read is excluded from analysis. If the length of this indel window exactly matches the reference sequence the read is classified as not containing an indel. If the indel window is two or more bases longer or shorter than the reference sequence, then the sequencing read is classified as an insertion or deletion, respectively.
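The read classification just described can be sketched in a few lines of Python. This is an illustrative reading of the text, not the code used in the Examples: the function name `classify_read` and its return labels are hypothetical, and because the passage does not say how a window that differs from the reference by exactly one base is handled, the sketch reports that case as "unclassified" (an assumption).

```python
from typing import Optional


def classify_read(read: str, left_flank: str, right_flank: str,
                  ref_window_len: int) -> Optional[str]:
    """Classify one sequencing read by the length of its indel window.

    Returns None when either 10-bp flank has no exact match (read excluded),
    otherwise "no_indel", "insertion", "deletion", or "unclassified".
    """
    if len(left_flank) != 10 or len(right_flank) != 10:
        raise ValueError("flanks are 10 bp in the described method")
    left_pos = read.find(left_flank)
    if left_pos == -1:
        return None
    window_start = left_pos + len(left_flank)
    right_pos = read.find(right_flank, window_start)
    if right_pos == -1:
        return None
    diff = (right_pos - window_start) - ref_window_len
    if diff == 0:
        return "no_indel"
    if diff >= 2:
        return "insertion"
    if diff <= -2:
        return "deletion"
    return "unclassified"  # +/-1 base: not specified in the text (assumption)


# Hypothetical flanks and a 6-bp reference window, for illustration only.
LEFT, RIGHT = "ACGTACGTAC", "TTGCATTGCA"
print(classify_read(LEFT + "GGATCC" + RIGHT, LEFT, RIGHT, 6))
print(classify_read(LEFT + "GGATAACC" + RIGHT, LEFT, RIGHT, 6))
```

Using the first exact match of each flank keeps the sketch short; a production pipeline would also need to decide how to treat reads in which a flank occurs more than once.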
- the fusion proteins provided herein are capable of limiting formation of indels in a region of a nucleic acid.
- the region is at a nucleotide targeted by a fusion protein or a region within 2, 3, 4, 5, 6, 7, 8, 9, or 10 nucleotides of a nucleotide targeted by a fusion protein.
- any of the fusion proteins provided herein are capable of limiting the formation of indels at a region of a nucleic acid to less than 1%, less than 1.5%, less than 2%, less than 2.5%, less than 3%, less than 3.5%, less than 4%, less than 4.5%, less than 5%, less than 6%, less than 7%, less than 8%, less than 9%, less than 10%, less than 12%, less than 15%, or less than 20%.
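The two quantities used in this passage, the ratio of intended point mutations to indels and the percentage of indels at a region, are simple arithmetic on read counts. The sketch below shows one way to compute them; the function names and the example counts are hypothetical and are not data from the disclosure.

```python
def edit_to_indel_ratio(intended_edits: int, indels: int) -> float:
    """Ratio of reads with the intended point mutation to reads with an indel."""
    if indels == 0:
        return float("inf")  # no indels observed: ratio is unbounded
    return intended_edits / indels


def indel_percent(indels: int, total_reads: int) -> float:
    """Percentage of analyzed reads that contain an indel."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * indels / total_reads


# Made-up counts: 1000 analyzed reads, 450 with the intended edit, 9 with indels.
ratio = edit_to_indel_ratio(450, 9)
percent = indel_percent(9, 1000)
print(f"intended:indel ratio = {ratio:.1f}:1, indels = {percent:.1f}%")
```

Under these made-up counts the ratio is 50:1 and the indel frequency is 0.9%, which would fall under the "at least 50:1" and "less than 1%" thresholds listed above.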
- the number of indels formed at a nucleic acid region may depend on the amount of time a nucleic acid (e.g., a nucleic acid within the genome of a cell) is exposed to a fusion protein.
- a number or proportion of indels is determined after at least 1 hour, at least 2 hours, at least 6 hours, at least 12 hours, at least 24 hours, at least 36 hours, at least 48 hours, at least 3 days, at least 4 days, at least 5 days, at least 7 days, at least 10 days, or at least 14 days of exposing a nucleic acid (e.g., a nucleic acid within the genome of a cell) to a fusion protein.
- an intended mutation such as a point mutation
- a nucleic acid e.g. a nucleic acid within a genome of a subject
- an intended mutation is a mutation that is generated by a specific fusion protein bound to a gRNA, specifically designed to generate the intended mutation.
- the intended mutation is a mutation associated with a disease, disorder, or condition.
- the intended mutation is the correction of a thymine (T) to adenine (A) point mutation associated with a disease, disorder, or condition. In some embodiments, the intended mutation is the correction of an adenine (A) to thymine (T) point mutation associated with a disease, disorder, or condition. In some embodiments, the intended mutation is the correction of a thymine (T) to adenine (A) point mutation within the coding region of a gene. In some embodiments, the intended mutation is the correction of an adenine (A) to thymine (T) point mutation within the coding region of a gene.
- the intended mutation is a point mutation that generates a stop codon, for example, a premature stop codon within the coding region of a gene. In some embodiments, the intended mutation is a mutation that eliminates a stop codon. In some embodiments, the intended mutation is a mutation that alters the splicing of a gene. In some embodiments, the intended mutation is a mutation that alters the regulatory sequence of a gene (e.g., a gene promoter or gene repressor).
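To make the stop-codon cases concrete, the sketch below applies a single A-to-T substitution to a coding-strand DNA codon and reports whether the edit creates or eliminates a stop codon under the standard genetic code. The helper names `a_to_t` and `stop_effect` are hypothetical, and the example codons were chosen for illustration rather than taken from the disclosure.

```python
# Stop codons of the standard genetic code, written as coding-strand DNA.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def a_to_t(seq: str, pos: int) -> str:
    """Return seq with the adenine at 0-based position pos replaced by thymine."""
    if seq[pos] != "A":
        raise ValueError(f"position {pos} is {seq[pos]!r}, not 'A'")
    return seq[:pos] + "T" + seq[pos + 1:]


def stop_effect(codon: str, pos: int) -> str:
    """Describe how an A-to-T edit at pos changes the codon's stop status."""
    edited = a_to_t(codon, pos)
    was_stop, is_stop = codon in STOP_CODONS, edited in STOP_CODONS
    if not was_stop and is_stop:
        return f"{codon}->{edited}: creates stop"
    if was_stop and not is_stop:
        return f"{codon}->{edited}: eliminates stop"
    return f"{codon}->{edited}: stop status unchanged"


print(stop_effect("AAA", 0))  # lysine codon edited at its first base
print(stop_effect("TAA", 1))  # stop codon edited at its second base
print(stop_effect("GAT", 1))  # aspartate codon edited at its second base
```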
- any of the fusion proteins provided herein are capable of generating a ratio of intended mutations to unintended mutations (e.g., intended point mutations:unintended point mutations) that is greater than 1:1. In some embodiments, any of the fusion proteins provided herein are capable of generating a ratio of intended mutations to unintended mutations (e.g., intended point
- Some embodiments of the disclosure are based on the recognition that the formation of indels in a region of a nucleic acid may be limited by nicking the non-edited strand opposite to the strand in which edits are introduced.
- This nick serves to direct mismatch repair machinery to the non-edited strand, ensuring that the chemically modified nucleobase is not interpreted as a lesion by the machinery.
- This nick may be created by the use of an nCas9.
- the methods provided in this disclosure comprise cutting (or nicking) the non-edited strand of the double-stranded DNA, for example, wherein the one strand comprises the A of the target T:A nucleobase pair, or the T of the T:A nucleobase pair.
Abstract
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201962814796P | 2019-03-06 | 2019-03-06 | |
US62/814,796 | 2019-03-06 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2020181202A1 (fr) | 2020-09-10 |
Family
ID=70057349
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2020/021429 WO2020181202A1 (fr) | 2019-03-06 | 2020-03-06 | A:T to T:A base editing through adenine deamination and oxidation |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2020181202A1 (fr) |
Cited By (62)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11053481B2 (en) | 2013-12-12 | 2021-07-06 | President And Fellows Of Harvard College | Fusions of Cas9 domains and nucleic acid-editing domains |
EP3922719A1 (fr) | 2020-06-12 | 2021-12-15 | Eligo Bioscience | Specific decolonization of antibiotic-resistant bacteria for prophylactic purposes |
US11214780B2 (en) | 2015-10-23 | 2022-01-04 | President And Fellows Of Harvard College | Nucleobase editors and uses thereof |
WO2022003209A1 (fr) | 2020-07-03 | 2022-01-06 | Eligo Bioscience | Method of containment of nucleic acid vectors introduced into a microbiome population |
US11224621B2 (en) | 2020-04-08 | 2022-01-18 | Eligo Bioscience | Modulation of microbiota function by gene therapy of the microbiome to prevent, treat or cure microbiome-associated diseases or disorders |
US11268082B2 (en) | 2017-03-23 | 2022-03-08 | President And Fellows Of Harvard College | Nucleobase editors comprising nucleic acid programmable DNA binding proteins |
US11299755B2 (en) | 2013-09-06 | 2022-04-12 | President And Fellows Of Harvard College | Switchable CAS9 nucleases and uses thereof |
US11306324B2 (en) | 2016-10-14 | 2022-04-19 | President And Fellows Of Harvard College | AAV delivery of nucleobase editors |
US11319532B2 (en) | 2017-08-30 | 2022-05-03 | President And Fellows Of Harvard College | High efficiency base editors comprising Gam |
WO2022096590A1 (fr) | 2020-11-04 | 2022-05-12 | Eligo Bioscience | Phage-derived particles for in situ delivery of DNA payload into a C. acnes population |
WO2022144381A1 (fr) | 2020-12-30 | 2022-07-07 | Eligo Bioscience | Microbiome modulation of a host by delivery of DNA payloads with minimized spread |
US11447770B1 (en) | 2019-03-19 | 2022-09-20 | The Broad Institute, Inc. | Methods and compositions for prime editing nucleotide sequences |
WO2022238555A1 (fr) | 2021-05-12 | 2022-11-17 | Eligo Bioscience | Production of lytic phages |
WO2022251712A1 (fr) | 2021-05-28 | 2022-12-01 | Sana Biotechnology, Inc. | Lipid particles containing a truncated baboon endogenous retrovirus (BaEV) envelope glycoprotein and related methods and uses |
US11542509B2 (en) | 2016-08-24 | 2023-01-03 | President And Fellows Of Harvard College | Incorporation of unnatural amino acids into proteins using base editing |
US11542496B2 (en) | 2017-03-10 | 2023-01-03 | President And Fellows Of Harvard College | Cytosine to guanine base editor |
US11560566B2 (en) | 2017-05-12 | 2023-01-24 | President And Fellows Of Harvard College | Aptazyme-embedded guide RNAs for use with CRISPR-Cas9 in genome editing and transcriptional activation |
US11578343B2 (en) | 2014-07-30 | 2023-02-14 | President And Fellows Of Harvard College | CAS9 proteins including ligand-dependent inteins |
WO2023019227A1 (fr) | 2021-08-11 | 2023-02-16 | Sana Biotechnology, Inc. | Genetically modified cells for allogeneic cell therapy to reduce complement-mediated inflammatory reactions |
WO2023019229A1 (fr) | 2021-08-11 | 2023-02-16 | Sana Biotechnology, Inc. | Genetically modified primary cells for allogeneic cell therapy |
WO2023019225A2 (fr) | 2021-08-11 | 2023-02-16 | Sana Biotechnology, Inc. | Genetically modified cells for allogeneic cell therapy to reduce instant blood-mediated inflammatory reactions |
WO2023019226A1 (fr) | 2021-08-11 | 2023-02-16 | Sana Biotechnology, Inc. | Genetically modified cells for allogeneic cell therapy |
US11584781B2 (en) | 2019-12-30 | 2023-02-21 | Eligo Bioscience | Chimeric receptor binding proteins resistant to proteolytic degradation |
US11617773B2 (en) | 2020-04-08 | 2023-04-04 | Eligo Bioscience | Elimination of colonic bacterial driving lethal inflammatory cardiomyopathy |
WO2023069790A1 (fr) | 2021-10-22 | 2023-04-27 | Sana Biotechnology, Inc. | Methods of engineering allogeneic T cells with a transgene in a TCR locus and associated compositions and methods |
US11661590B2 (en) | 2016-08-09 | 2023-05-30 | President And Fellows Of Harvard College | Programmable CAS9-recombinase fusion proteins and uses thereof |
WO2023115041A1 (fr) | 2021-12-17 | 2023-06-22 | Sana Biotechnology, Inc. | Modified paramyxoviridae attachment glycoproteins |
WO2023115039A2 (fr) | 2021-12-17 | 2023-06-22 | Sana Biotechnology, Inc. | Modified paramyxoviridae fusion glycoproteins |
WO2023133595A2 (fr) | 2022-01-10 | 2023-07-13 | Sana Biotechnology, Inc. | Methods of ex vivo dosing and administration of lipid particles or viral vectors and related systems and uses |
US11702651B2 (en) | 2016-08-03 | 2023-07-18 | President And Fellows Of Harvard College | Adenosine nucleobase editors and uses thereof |
WO2023150647A1 (fr) | 2022-02-02 | 2023-08-10 | Sana Biotechnology, Inc. | Methods of repeat dosing and administration of lipid particles or viral vectors and related systems and uses |
WO2023150518A1 (fr) | 2022-02-01 | 2023-08-10 | Sana Biotechnology, Inc. | CD3-targeted lentiviral vectors and uses thereof |
US11732274B2 (en) | 2017-07-28 | 2023-08-22 | President And Fellows Of Harvard College | Methods and compositions for evolving base editors using phage-assisted continuous evolution (PACE) |
WO2023158836A1 (en) | 2022-02-17 | 2023-08-24 | Sana Biotechnology, Inc. | Engineered CD47 proteins and uses thereof |
US11746352B2 (en) | 2019-12-30 | 2023-09-05 | Eligo Bioscience | Microbiome modulation of a host by delivery of DNA payloads with minimized spread |
US11795443B2 (en) | 2017-10-16 | 2023-10-24 | The Broad Institute, Inc. | Uses of adenosine base editors |
WO2023217280A1 (en) * | 2022-05-13 | 2023-11-16 | Huidagene Therapeutics Co., Ltd. | Programmable adenine base editor and uses thereof |
US11820969B2 (en) | 2016-12-23 | 2023-11-21 | President And Fellows Of Harvard College | Editing of CCR2 receptor gene to protect against HIV infection |
US11898179B2 (en) | 2017-03-09 | 2024-02-13 | President And Fellows Of Harvard College | Suppression of pain by gene editing |
US11912985B2 (en) | 2020-05-08 | 2024-02-27 | The Broad Institute, Inc. | Methods and compositions for simultaneous editing of both strands of a target double-stranded nucleotide sequence |
WO2024044655A1 (en) | 2022-08-24 | 2024-02-29 | Sana Biotechnology, Inc. | Delivery of heterologous proteins |
US11920181B2 (en) | 2013-08-09 | 2024-03-05 | President And Fellows Of Harvard College | Nuclease profiling system |
WO2024047151A1 (en) | 2022-08-31 | 2024-03-07 | Snipr Biome Aps | Novel type of CRISPR/Cas system |
WO2024064838A1 (en) | 2022-09-21 | 2024-03-28 | Sana Biotechnology, Inc. | Lipid particles comprising variant paramyxovirus attachment glycoproteins and uses thereof |
WO2024081820A1 (en) | 2022-10-13 | 2024-04-18 | Sana Biotechnology, Inc. | Viral particles targeting hematopoietic stem cells |
WO2024097313A1 (en) | 2022-11-02 | 2024-05-10 | Sana Biotechnology, Inc. | Methods of producing T cell therapy products |
WO2024119157A1 (en) | 2022-12-02 | 2024-06-06 | Sana Biotechnology, Inc. | Lipid particles with cofusogens and methods of producing and using the same |
US12006520B2 (en) | 2011-07-22 | 2024-06-11 | President And Fellows Of Harvard College | Evaluation and improvement of nuclease cleavage specificity |
WO2024151541A1 (en) | 2023-01-09 | 2024-07-18 | Sana Biotechnology, Inc. | Autoimmune mouse with type 1 diabetes |
US12098372B2 (en) | 2019-12-30 | 2024-09-24 | Eligo Bioscience | Microbiome modulation of a host by delivery of DNA payloads with minimized spread |
WO2024220574A1 (en) | 2023-04-18 | 2024-10-24 | Sana Biotechnology, Inc. | Universal G protein fusogens and adapter systems thereof and related lipid particles and uses |
WO2024220560A1 (en) | 2023-04-18 | 2024-10-24 | Sana Biotechnology, Inc. | Modified G protein fusogens and related lipid particles and related methods |
WO2024220598A2 (en) | 2023-04-18 | 2024-10-24 | Sana Biotechnology, Inc. | Lentiviral vectors with two or more genomes |
WO2024229302A1 (en) | 2023-05-03 | 2024-11-07 | Sana Biotechnology, Inc. | Methods of dosing and administration of engineered islet cells |
WO2024243236A2 (en) | 2023-05-22 | 2024-11-28 | Sana Biotechnology, Inc. | Methods of delivery of pancreatic islet cells and related methods |
WO2024243340A1 (en) | 2023-05-23 | 2024-11-28 | Sana Biotechnology, Inc. | Tandem fusogens and related lipid particles |
US12157760B2 (en) | 2018-05-23 | 2024-12-03 | The Broad Institute, Inc. | Base editors and uses thereof |
WO2025021861A1 (en) | 2023-07-24 | 2025-01-30 | Eligo Bioscience | Detection and treatment of diseases associated with C. acnes bacteria |
WO2025046062A1 (en) | 2023-08-31 | 2025-03-06 | Snipr Biome Aps | Novel type of CRISPR/Cas system |
WO2025054202A1 (en) | 2023-09-05 | 2025-03-13 | Sana Biotechnology, Inc. | Method of screening a sample containing a transgene using a unique barcode |
US12281338B2 (en) | 2018-10-29 | 2025-04-22 | The Broad Institute, Inc. | Nucleobase editors comprising GeoCas9 and uses thereof |
US12351837B2 (en) | 2020-01-23 | 2025-07-08 | The Broad Institute, Inc. | Supernegatively charged proteins and uses thereof |
Citations (57)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4186183A (en) | 1978-03-29 | 1980-01-29 | The United States Of America As Represented By The Secretary Of The Army | Liposome carriers in chemotherapy of leishmaniasis |
US4217344A (en) | 1976-06-23 | 1980-08-12 | L'oreal | Compositions containing aqueous dispersions of lipid spheres |
US4235871A (en) | 1978-02-24 | 1980-11-25 | Papahadjopoulos Demetrios P | Method of encapsulating biologically active materials in lipid vesicles |
US4261975A (en) | 1979-09-19 | 1981-04-14 | Merck & Co., Inc. | Viral liposome particle |
US4485054A (en) | 1982-10-04 | 1984-11-27 | Lipoderm Pharmaceuticals Limited | Method of encapsulating biologically active materials in multilamellar lipid vesicles (MLV) |
US4501728A (en) | 1983-01-06 | 1985-02-26 | Technology Unlimited, Inc. | Masking of liposomes from RES recognition |
EP0264166A1 (en) | 1986-04-09 | 1988-04-20 | Genzyme Corporation | Transgenic animals secreting desired proteins into milk |
US4774085A (en) | 1985-07-09 | 1988-09-27 | 501 Board of Regents, Univ. of Texas | Pharmaceutical administration systems containing a mixture of immunomodulators |
US4797368A (en) | 1985-03-15 | 1989-01-10 | The United States Of America As Represented By The Department Of Health And Human Services | Adeno-associated virus as eukaryotic expression vector |
US4837028A (en) | 1986-12-24 | 1989-06-06 | Liposome Technology, Inc. | Liposomes with enhanced circulation time |
US4873316A (en) | 1987-06-23 | 1989-10-10 | Biogen, Inc. | Isolation of exogenous recombinant proteins from the milk of transgenic mammals |
US4880635A (en) | 1984-08-08 | 1989-11-14 | The Liposome Company, Inc. | Dehydrated liposomes |
US4897355A (en) | 1985-01-07 | 1990-01-30 | Syntex (U.S.A.) Inc. | N[ω,(ω-1)-dialkyloxy]- and N-[ω,(ω-1)-dialkenyloxy]-alk-1-yl-N,N,N-tetrasubstituted ammonium lipids and uses therefor |
US4906477A (en) | 1987-02-09 | 1990-03-06 | Kabushiki Kaisha Vitamin Kenkyusyo | Antineoplastic agent-entrapping liposomes |
US4911928A (en) | 1987-03-13 | 1990-03-27 | Micro-Pak, Inc. | Paucilamellar lipid vesicles |
US4917951A (en) | 1987-07-28 | 1990-04-17 | Micro-Pak, Inc. | Lipid vesicles formed of surfactants and steroids |
US4920016A (en) | 1986-12-24 | 1990-04-24 | Linear Technology, Inc. | Liposomes with enhanced circulation time |
US4921757A (en) | 1985-04-26 | 1990-05-01 | Massachusetts Institute Of Technology | System for delayed and pulsed release of biologically active substances |
US4946787A (en) | 1985-01-07 | 1990-08-07 | Syntex (U.S.A.) Inc. | N-(ω,(ω-1)-dialkyloxy)- and N-(ω,(ω-1)-dialkenyloxy)-alk-1-yl-N,N,N-tetrasubstituted ammonium lipids and uses therefor |
US5049386A (en) | 1985-01-07 | 1991-09-17 | Syntex (U.S.A.) Inc. | N-ω,(ω-1)-dialkyloxy)- and N-(ω,(ω-1)-dialkenyloxy)Alk-1-YL-N,N,N-tetrasubstituted ammonium lipids and uses therefor |
WO1991016024A1 (en) | 1990-04-19 | 1991-10-31 | Vical, Inc. | Cationic lipids for intracellular delivery of biologically active molecules |
WO1991017424A1 (en) | 1990-05-03 | 1991-11-14 | Vical, Inc. | Intracellular delivery of biologically active substances by means of self-assembling lipid complexes |
US5173414A (en) | 1990-10-30 | 1992-12-22 | Applied Immune Sciences, Inc. | Production of recombinant adeno-associated virus vectors |
WO1993024641A2 (en) | 1992-06-02 | 1993-12-09 | The United States Of America, As Represented By The Secretary, Department Of Health & Human Services | Adeno-associated virus with inverted terminal repeat sequences as promoter |
WO2001038547A2 (en) | 1999-11-24 | 2001-05-31 | Mcs Micro Carrier Systems Gmbh | Polypeptides comprising multimers of nuclear localization signals or of protein transduction domains and their use for transferring molecules into cells |
US6453242B1 (en) | 1999-01-12 | 2002-09-17 | Sangamo Biosciences, Inc. | Selection of sites for targeting by zinc finger proteins and methods of designing zinc finger proteins to bind to preselected sites |
US6503717B2 (en) | 1999-12-06 | 2003-01-07 | Sangamo Biosciences, Inc. | Methods of using randomized libraries of zinc finger proteins for the identification of gene function |
US6534261B1 (en) | 1999-01-12 | 2003-03-18 | Sangamo Biosciences, Inc. | Regulation of endogenous gene expression in cells using zinc finger proteins |
US6599692B1 (en) | 1999-09-14 | 2003-07-29 | Sangamo Bioscience, Inc. | Functional genomics using zinc finger proteins |
US6689558B2 (en) | 2000-02-08 | 2004-02-10 | Sangamo Biosciences, Inc. | Cells for drug discovery |
US7013219B2 (en) | 1999-01-12 | 2006-03-14 | Sangamo Biosciences, Inc. | Regulation of endogenous gene expression in cells using zinc finger proteins |
US20070015238A1 (en) | 2002-06-05 | 2007-01-18 | Snyder Richard O | Production of pseudotyped recombinant AAV virions |
WO2010028347A2 (en) | 2008-09-05 | 2010-03-11 | President & Fellows Of Harvard College | Continuous directed evolution of proteins and nucleic acids |
US20110059502A1 (en) | 2009-09-07 | 2011-03-10 | Chalasani Sreekanth H | Multiple domain proteins |
WO2011053982A2 (en) | 2009-11-02 | 2011-05-05 | University Of Washington | Therapeutic nuclease compositions and methods |
WO2012088381A2 (en) | 2010-12-22 | 2012-06-28 | President And Fellows Of Harvard College | Continuous directed evolution |
US20120322861A1 (en) | 2007-02-23 | 2012-12-20 | Barry John Byrne | Compositions and Methods for Treating Diseases |
US8871445B2 (en) | 2012-12-12 | 2014-10-28 | The Broad Institute Inc. | CRISPR-Cas component systems, methods and compositions for sequence manipulation |
WO2015035136A2 (en) | 2013-09-06 | 2015-03-12 | President And Fellows Of Harvard College | Delivery system for functional nucleases |
US20150166981A1 (en) | 2013-12-12 | 2015-06-18 | President And Fellows Of Harvard College | Methods for nucleic acid editing |
WO2015134121A2 (en) | 2014-01-20 | 2015-09-11 | President And Fellows Of Harvard College | Negative selection and stringency modulation in continuous evolution systems |
US9340799B2 (en) | 2013-09-06 | 2016-05-17 | President And Fellows Of Harvard College | MRNA-sensing switchable gRNAs |
US9405700B2 (en) | 2010-11-04 | 2016-08-02 | Sonics, Inc. | Methods and apparatus for virtualization in an integrated circuit |
WO2016168631A1 (en) | 2015-04-17 | 2016-10-20 | President And Fellows Of Harvard College | Vector-based mutagenesis system |
WO2016205764A1 (en) | 2015-06-18 | 2016-12-22 | The Broad Institute Inc. | Novel CRISPR enzymes and systems |
US20170044520A1 (en) | 2015-07-22 | 2017-02-16 | President And Fellows Of Harvard College | Evolution of site-specific recombinases |
WO2017070632A2 (en) | 2015-10-23 | 2017-04-27 | President And Fellows Of Harvard College | Nucleobase editors and uses thereof |
US20170233708A1 (en) | 2014-10-22 | 2017-08-17 | President And Fellows Of Harvard College | Evolution of proteases |
WO2018027078A1 (en) | 2016-08-03 | 2018-02-08 | President And Fellows Of Harvard College | Adenosine nucleobase editors and uses thereof |
WO2018071868A1 (en) | 2016-10-14 | 2018-04-19 | President And Fellows Of Harvard College | AAV delivery of nucleobase editors |
WO2018152197A1 (en) * | 2017-02-15 | 2018-08-23 | Massachusetts Institute Of Technology | DNA writers, molecular recorders and uses thereof |
US10077453B2 (en) | 2014-07-30 | 2018-09-18 | President And Fellows Of Harvard College | CAS9 proteins including ligand-dependent inteins |
WO2018176009A1 (en) | 2017-03-23 | 2018-09-27 | President And Fellows Of Harvard College | Nucleobase editors comprising nucleic acid programmable DNA binding proteins |
WO2019023680A1 (en) | 2017-07-28 | 2019-01-31 | President And Fellows Of Harvard College | Methods and compositions for evolving base editors using phage-assisted continuous evolution (PACE) |
WO2019079347A1 (en) | 2017-10-16 | 2019-04-25 | The Broad Institute, Inc. | Uses of adenosine base editors |
WO2019226593A1 (en) | 2018-05-24 | 2019-11-28 | Aqua-Aerobic Systems, Inc. | System and method of solids conditioning in a filtration system |
WO2019241649A1 (en) | 2018-06-14 | 2019-12-19 | President And Fellows Of Harvard College | Evolution of cytidine deaminases |
- 2020-03-06: WO application PCT/US2020/021429 (WO2020181202A1), active, Application Filing
Patent Citations (80)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4217344A (en) | 1976-06-23 | 1980-08-12 | L'oreal | Compositions containing aqueous dispersions of lipid spheres |
US4235871A (en) | 1978-02-24 | 1980-11-25 | Papahadjopoulos Demetrios P | Method of encapsulating biologically active materials in lipid vesicles |
US4186183A (en) | 1978-03-29 | 1980-01-29 | The United States Of America As Represented By The Secretary Of The Army | Liposome carriers in chemotherapy of leishmaniasis |
US4261975A (en) | 1979-09-19 | 1981-04-14 | Merck & Co., Inc. | Viral liposome particle |
US4485054A (en) | 1982-10-04 | 1984-11-27 | Lipoderm Pharmaceuticals Limited | Method of encapsulating biologically active materials in multilamellar lipid vesicles (MLV) |
US4501728A (en) | 1983-01-06 | 1985-02-26 | Technology Unlimited, Inc. | Masking of liposomes from RES recognition |
US4880635B1 (en) | 1984-08-08 | 1996-07-02 | Liposome Company | Dehydrated liposomes |
US4880635A (en) | 1984-08-08 | 1989-11-14 | The Liposome Company, Inc. | Dehydrated liposomes |
US4897355A (en) | 1985-01-07 | 1990-01-30 | Syntex (U.S.A.) Inc. | N[ω,(ω-1)-dialkyloxy]- and N-[ω,(ω-1)-dialkenyloxy]-alk-1-yl-N,N,N-tetrasubstituted ammonium lipids and uses therefor |
US5049386A (en) | 1985-01-07 | 1991-09-17 | Syntex (U.S.A.) Inc. | N-ω,(ω-1)-dialkyloxy)- and N-(ω,(ω-1)-dialkenyloxy)Alk-1-YL-N,N,N-tetrasubstituted ammonium lipids and uses therefor |
US4946787A (en) | 1985-01-07 | 1990-08-07 | Syntex (U.S.A.) Inc. | N-(ω,(ω-1)-dialkyloxy)- and N-(ω,(ω-1)-dialkenyloxy)-alk-1-yl-N,N,N-tetrasubstituted ammonium lipids and uses therefor |
US4797368A (en) | 1985-03-15 | 1989-01-10 | The United States Of America As Represented By The Department Of Health And Human Services | Adeno-associated virus as eukaryotic expression vector |
US4921757A (en) | 1985-04-26 | 1990-05-01 | Massachusetts Institute Of Technology | System for delayed and pulsed release of biologically active substances |
US4774085A (en) | 1985-07-09 | 1988-09-27 | 501 Board of Regents, Univ. of Texas | Pharmaceutical administration systems containing a mixture of immunomodulators |
EP0264166A1 (en) | 1986-04-09 | 1988-04-20 | Genzyme Corporation | Transgenic animals secreting desired proteins into milk |
US4837028A (en) | 1986-12-24 | 1989-06-06 | Liposome Technology, Inc. | Liposomes with enhanced circulation time |
US4920016A (en) | 1986-12-24 | 1990-04-24 | Linear Technology, Inc. | Liposomes with enhanced circulation time |
US4906477A (en) | 1987-02-09 | 1990-03-06 | Kabushiki Kaisha Vitamin Kenkyusyo | Antineoplastic agent-entrapping liposomes |
US4911928A (en) | 1987-03-13 | 1990-03-27 | Micro-Pak, Inc. | Paucilamellar lipid vesicles |
US4873316A (en) | 1987-06-23 | 1989-10-10 | Biogen, Inc. | Isolation of exogenous recombinant proteins from the milk of transgenic mammals |
US4917951A (en) | 1987-07-28 | 1990-04-17 | Micro-Pak, Inc. | Lipid vesicles formed of surfactants and steroids |
WO1991016024A1 (en) | 1990-04-19 | 1991-10-31 | Vical, Inc. | Cationic lipids for intracellular delivery of biologically active molecules |
WO1991017424A1 (en) | 1990-05-03 | 1991-11-14 | Vical, Inc. | Intracellular delivery of biologically active substances by means of self-assembling lipid complexes |
US5173414A (en) | 1990-10-30 | 1992-12-22 | Applied Immune Sciences, Inc. | Production of recombinant adeno-associated virus vectors |
WO1993024641A2 (en) | 1992-06-02 | 1993-12-09 | The United States Of America, As Represented By The Secretary, Department Of Health & Human Services | Adeno-associated virus with inverted terminal repeat sequences as promoter |
US6607882B1 (en) | 1999-01-12 | 2003-08-19 | Sangamo Biosciences, Inc. | Regulation of endogenous gene expression in cells using zinc finger proteins |
US6453242B1 (en) | 1999-01-12 | 2002-09-17 | Sangamo Biosciences, Inc. | Selection of sites for targeting by zinc finger proteins and methods of designing zinc finger proteins to bind to preselected sites |
US6534261B1 (en) | 1999-01-12 | 2003-03-18 | Sangamo Biosciences, Inc. | Regulation of endogenous gene expression in cells using zinc finger proteins |
US20030087817A1 (en) | 1999-01-12 | 2003-05-08 | Sangamo Biosciences, Inc. | Regulation of endogenous gene expression in cells using zinc finger proteins |
US7013219B2 (en) | 1999-01-12 | 2006-03-14 | Sangamo Biosciences, Inc. | Regulation of endogenous gene expression in cells using zinc finger proteins |
US7163824B2 (en) | 1999-01-12 | 2007-01-16 | Sangamo Biosciences, Inc. | Regulation of endogenous gene expression in cells using zinc finger proteins |
US6824978B1 (en) | 1999-01-12 | 2004-11-30 | Sangamo Biosciences, Inc. | Regulation of endogenous gene expression in cells using zinc finger proteins |
US6933113B2 (en) | 1999-01-12 | 2005-08-23 | Sangamo Biosciences, Inc. | Modulation of endogenous gene expression in cells |
US6979539B2 (en) | 1999-01-12 | 2005-12-27 | Sangamo Biosciences, Inc. | Regulation of endogenous gene expression in cells using zinc finger proteins |
US6599692B1 (en) | 1999-09-14 | 2003-07-29 | Sangamo Bioscience, Inc. | Functional genomics using zinc finger proteins |
WO2001038547A2 (en) | 1999-11-24 | 2001-05-31 | Mcs Micro Carrier Systems Gmbh | Polypeptides comprising multimers of nuclear localization signals or of protein transduction domains and their use for transferring molecules into cells |
US6503717B2 (en) | 1999-12-06 | 2003-01-07 | Sangamo Biosciences, Inc. | Methods of using randomized libraries of zinc finger proteins for the identification of gene function |
US6689558B2 (en) | 2000-02-08 | 2004-02-10 | Sangamo Biosciences, Inc. | Cells for drug discovery |
US20070015238A1 (en) | 2002-06-05 | 2007-01-18 | Snyder Richard O | Production of pseudotyped recombinant AAV virions |
US20120322861A1 (en) | 2007-02-23 | 2012-12-20 | Barry John Byrne | Compositions and Methods for Treating Diseases |
WO2010028347A2 (en) | 2008-09-05 | 2010-03-11 | President & Fellows Of Harvard College | Continuous directed evolution of proteins and nucleic acids |
US9771574B2 (en) | 2008-09-05 | 2017-09-26 | President And Fellows Of Harvard College | Apparatus for continuous directed evolution of proteins and nucleic acids |
US9023594B2 (en) | 2008-09-05 | 2015-05-05 | President And Fellows Of Harvard College | Continuous directed evolution of proteins and nucleic acids |
US20110059502A1 (en) | 2009-09-07 | 2011-03-10 | Chalasani Sreekanth H | Multiple domain proteins |
WO2011053982A2 (en) | 2009-11-02 | 2011-05-05 | University Of Washington | Therapeutic nuclease compositions and methods |
US9405700B2 (en) | 2010-11-04 | 2016-08-02 | Sonics, Inc. | Methods and apparatus for virtualization in an integrated circuit |
US9394537B2 (en) | 2010-12-22 | 2016-07-19 | President And Fellows Of Harvard College | Continuous directed evolution |
WO2012088381A2 (en) | 2010-12-22 | 2012-06-28 | President And Fellows Of Harvard College | Continuous directed evolution |
US20130345064A1 (en) | 2010-12-22 | 2013-12-26 | President And Fellows Of Harvard College | Continuous directed evolution |
US8871445B2 (en) | 2012-12-12 | 2014-10-28 | The Broad Institute Inc. | CRISPR-Cas component systems, methods and compositions for sequence manipulation |
US9737604B2 (en) | 2013-09-06 | 2017-08-22 | President And Fellows Of Harvard College | Use of cationic lipids to deliver CAS9 |
US9340799B2 (en) | 2013-09-06 | 2016-05-17 | President And Fellows Of Harvard College | MRNA-sensing switchable gRNAs |
WO2015035136A2 (en) | 2013-09-06 | 2015-03-12 | President And Fellows Of Harvard College | Delivery system for functional nucleases |
US20150166981A1 (en) | 2013-12-12 | 2015-06-18 | President And Fellows Of Harvard College | Methods for nucleic acid editing |
US20150166980A1 (en) | 2013-12-12 | 2015-06-18 | President And Fellows Of Harvard College | Fusions of cas9 domains and nucleic acid-editing domains |
US9840699B2 (en) | 2013-12-12 | 2017-12-12 | President And Fellows Of Harvard College | Methods for nucleic acid editing |
US20160348096A1 (en) | 2014-01-20 | 2016-12-01 | President And Fellows Of Harvard College | Negative selection and stringency modulation in continuous evolution systems |
WO2015134121A2 (en) | 2014-01-20 | 2015-09-11 | President And Fellows Of Harvard College | Negative selection and stringency modulation in continuous evolution systems |
US10179911B2 (en) | 2014-01-20 | 2019-01-15 | President And Fellows Of Harvard College | Negative selection and stringency modulation in continuous evolution systems |
US10077453B2 (en) | 2014-07-30 | 2018-09-18 | President And Fellows Of Harvard College | CAS9 proteins including ligand-dependent inteins |
US20170233708A1 (en) | 2014-10-22 | 2017-08-17 | President And Fellows Of Harvard College | Evolution of proteases |
WO2016168631A1 (en) | 2015-04-17 | 2016-10-20 | President And Fellows Of Harvard College | Vector-based mutagenesis system |
US20180087046A1 (en) | 2015-04-17 | 2018-03-29 | President And Fellows Of Harvard College | Vector-based mutagenesis system |
WO2016205764A1 (en) | 2015-06-18 | 2016-12-22 | The Broad Institute Inc. | Novel CRISPR enzymes and systems |
US20170044520A1 (en) | 2015-07-22 | 2017-02-16 | President And Fellows Of Harvard College | Evolution of site-specific recombinases |
WO2017070632A2 (en) | 2015-10-23 | 2017-04-27 | President And Fellows Of Harvard College | Nucleobase editors and uses thereof |
WO2017070633A2 (en) | 2015-10-23 | 2017-04-27 | President And Fellows Of Harvard College | Evolved Cas9 proteins for gene editing |
US20170121693A1 (en) | 2015-10-23 | 2017-05-04 | President And Fellows Of Harvard College | Nucleobase editors and uses thereof |
US10167457B2 (en) | 2015-10-23 | 2019-01-01 | President And Fellows Of Harvard College | Nucleobase editors and uses thereof |
WO2018027078A1 (en) | 2016-08-03 | 2018-02-08 | President And Fellows Of Harvard College | Adenosine nucleobase editors and uses thereof |
US10113163B2 (en) | 2016-08-03 | 2018-10-30 | President And Fellows Of Harvard College | Adenosine nucleobase editors and uses thereof |
US20180073012A1 (en) | 2016-08-03 | 2018-03-15 | President And Fellows Of Harvard College | Adenosine nucleobase editors and uses thereof |
US20180127780A1 (en) | 2016-10-14 | 2018-05-10 | President And Fellows Of Harvard College | Aav delivery of nucleobase editors |
WO2018071868A1 (en) | 2016-10-14 | 2018-04-19 | President And Fellows Of Harvard College | AAV delivery of nucleobase editors |
WO2018152197A1 (en) * | 2017-02-15 | 2018-08-23 | Massachusetts Institute Of Technology | DNA writers, molecular recorders and uses thereof |
WO2018176009A1 (en) | 2017-03-23 | 2018-09-27 | President And Fellows Of Harvard College | Nucleobase editors comprising nucleic acid programmable DNA binding proteins |
WO2019023680A1 (en) | 2017-07-28 | 2019-01-31 | President And Fellows Of Harvard College | Methods and compositions for evolving base editors using phage-assisted continuous evolution (PACE) |
WO2019079347A1 (en) | 2017-10-16 | 2019-04-25 | The Broad Institute, Inc. | Uses of adenosine base editors |
WO2019226593A1 (en) | 2018-05-24 | 2019-11-28 | Aqua-Aerobic Systems, Inc. | System and method of solids conditioning in a filtration system |
WO2019241649A1 (en) | 2018-06-14 | 2019-12-19 | President And Fellows Of Harvard College | Evolution of cytidine deaminases |
Non-Patent Citations (135)
Title |
---|
A. R. GRUBER ET AL., CELL, vol. 106, no. 1, 2008, pages 23 - 24 |
ABUDAYYEH ET AL.: "C2c2 is a single-component programmable RNA-guided RNA-targeting CRISPR effector", SCIENCE, vol. 353, no. 6299, 5 August 2016 (2016-08-05), XP055407082, DOI: 10.1126/science.aaf5573 |
AHMAD ET AL., CANCER RES., vol. 52, 1992, pages 4817 - 4820 |
ALEXIS C. KOMOR ET AL: "Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage", NATURE, vol. 533, no. 7603, 20 April 2016 (2016-04-20), London, pages 420 - 424, XP055551781, ISSN: 0028-0836, DOI: 10.1038/nature17946 * |
AMANN ET AL., GENE, vol. 69, 1988, pages 301 - 315 |
ANDERSON, SCIENCE, vol. 256, 1992, pages 808 - 813 |
AURICCHIO ET AL., HUM. MOLEC. GENET., vol. 10, 2001, pages 3075 - 3081 |
AUTIERI; AGRAWAL, J. BIOL. CHEM., vol. 273, 1998, pages 14731 - 37 |
BADRAN, A.H.; LIU, D.R.: "In vivo continuous directed evolution", CURR. OPIN. CHEM. BIOL., vol. 24, 2015, pages 1 - 10, XP055350566, DOI: 10.1016/j.cbpa.2014.09.040 |
BAKER; CORNISH, PNAS, 2002 |
BANERJEE, A.; SANTOS, W. L.; VERDINE, G. L.: "Structure of a DNA glycosylase searching for lesions", SCIENCE, vol. 311, 2006, pages 1153 - 1157 |
BENNETT, N. J.; RAKONJAC, J.: "Unlocking of the filamentous bacteriophage virion during infection is mediated by the C domain of pIII", JOURNAL OF MOLECULAR BIOLOGY, vol. 356, no. 2, 2006, pages 266 - 73, XP024950566, DOI: 10.1016/j.jmb.2005.11.069 |
BLAESE ET AL., CANCER GENE THER., vol. 2, 1995, pages 291 - 297 |
BRINER AE ET AL.: "Guide RNA functional modules direct Cas9 activity and orthogonality", MOL CELL, vol. 56, 2014, pages 333 - 339, XP055376599, DOI: 10.1016/j.molcel.2014.09.019 |
BRODIE L. RANZAU ET AL: "Genome, Epigenome, and Transcriptome Editing via Chemical Modification of Nucleobases in Living Cells", BIOCHEMISTRY, vol. 58, no. 5, 30 November 2018 (2018-11-30), US, pages 330 - 335, XP055701780, ISSN: 0006-2960, DOI: 10.1021/acs.biochem.8b00958 * |
BRUTLAG ET AL., COMP. APP. BIOSCI., vol. 6, 1990, pages 237 - 245 |
BUCHSCHER ET AL., J. VIROL., vol. 66, 1992, pages 1635 - 1640 |
BURSTEIN ET AL.: "New CRISPR-Cas systems from uncultivated microbes", CELL RES., 21 February 2017 (2017-02-21) |
BYRNE; RUDDLE, PROC. NATL. ACAD. SCI. USA, vol. 86, 1989, pages 5473 - 5477 |
CALAME; EATON, ADV. IMMUNOL., vol. 43, 1988, pages 235 - 275 |
CAMPER; TILGHMAN, GENES DEV., vol. 3, 1989, pages 537 - 546 |
CHO SW ET AL.: "Targeted genome engineering in human cells with the Cas9 RNA-guided endonuclease", NATURE BIOTECHNOLOGY, vol. 31, 2013, pages 230 - 232 |
CHUAI, G. ET AL.: "DeepCRISPR: optimized CRISPR guide RNA design by deep learning", GENOME BIOL., vol. 19, 2018, pages 80 |
CHYLINSKI; RHUN; CHARPENTIER: "The tracrRNA and Cas9 families of type II CRISPR-Cas immunity systems", RNA BIOLOGY, vol. 10, no. 5, 2013, pages 726 - 737, XP055116068, DOI: 10.4161/rna.24321 |
COFFIN ET AL.: "Retroviruses", 1997, CSHL PRESS |
CONG, L. ET AL.: "Multiplex genome engineering using CRISPR/Cas systems", SCIENCE, vol. 339, 2013, pages 819 - 823, XP055458249, DOI: 10.1126/science.1231143 |
CRYSTAL, SCIENCE, vol. 270, 1995, pages 404 - 410 |
DELTCHEVA E.; CHYLINSKI K.; SHARMA C.M.; GONZALES K.; CHAO Y.; PIRZADA Z.A.; ECKERT M.R.; VOGEL J.; CHARPENTIER E.: "CRISPR RNA maturation by trans-encoded small RNA and host factor RNase III", NATURE, vol. 471, 2011, pages 602 - 607, XP055619637, DOI: 10.1038/nature09886 |
DICARLO, J.E. ET AL.: "Genome engineering in Saccharomyces cerevisiae using CRISPR-Cas systems", NUCLEIC ACIDS RES., 2013 |
DICKINSON, B.C.; PACKER, M.S.; BADRAN, A.H.; LIU, D.R.: "A system for the continuous directed evolution of proteases rapidly reveals drug-resistance mutations", NAT. COMMUN., vol. 5, 2014, pages 5352 |
DUAN ET AL., J. VIROL., vol. 75, 2001, pages 7662 - 7671 |
EAST-SELETSKY ET AL.: "Two distinct RNase activities of CRISPR-C2c2 enable guide-RNA processing and RNA detection", NATURE, vol. 538, no. 7624, 13 October 2016 (2016-10-13), pages 270 - 273, XP055407060, DOI: 10.1038/nature19802 |
EDLUND ET AL., SCIENCE, vol. 230, 1985, pages 912 - 916 |
ELIZABETH KUTTER; ALEXANDER SULAKVELIDZE: "Bacteriophages: Biology and Applications", December 2004, CRC PRESS |
FALNES, P. Ø.; ROGNES, T.: "DNA repair by bacterial AlkB proteins", RES. MICROBIOL., vol. 154, no. 8, 2003, pages 531 - 538 |
FORTINI, P. ET AL.: "8-Oxoguanine DNA damage: at the crossroad of alternative repair pathways", MUTAT. RES., vol. 531, no. 1-2, 2003, pages 127 - 39, XP001182325, DOI: 10.1016/j.mrfmmm.2003.07.004 |
GAO ET AL., GENE THERAPY, vol. 2, 1995, pages 710 - 722 |
GAO ET AL., NAT BIOTECHNOL., vol. 34, no. 7, July 2016 (2016-07-01), pages 768 - 73 |
GAUDELLI, N.M. ET AL.: "Programmable base editing of A:T to G:C in genomic DNA without DNA cleavage", NATURE, vol. 551, 2017, pages 464 - 471 |
GOEDDEL: "Methods In Enzymology", vol. 185, 1990, ACADEMIC PRESS, article "Gene Expression Technology" |
HALBERT ET AL., J. VIROL., vol. 74, 2000, pages 1524 - 1532 |
HERMONAT; MUZYCZKA, PNAS, vol. 81, 1984, pages 6466 - 6470 |
HUANG, T.P. ET AL.: "Circularly permuted and PAM-modified Cas9 variants broaden the targeting scope of base editors", NAT. BIOTECHNOL., vol. 37, 2019, pages 626 - 631, XP036900674, DOI: 10.1038/s41587-019-0134-y |
HUBBARD, B.P. ET AL.: "Continuous directed evolution of DNA-binding proteins to improve TALEN specificity", NAT. METHODS, vol. 12, 2015, pages 939 - 942, XP055548970, DOI: 10.1038/nmeth.3515 |
HWANG, W.Y. ET AL.: "Efficient genome editing in zebrafish using a CRISPR-Cas system", NATURE BIOTECHNOLOGY, vol. 31, 2013, pages 227 - 229, XP055086625, DOI: 10.1038/nbt.2501 |
JINEK ET AL., SCIENCE, vol. 337, 2012, pages 816 - 821 |
ITO, S. ET AL.: "Tet proteins can convert 5-methylcytosine to 5-formylcytosine and 5-carboxylcytosine", SCIENCE, vol. 333, no. 6047, 2011, pages 1300 - 1303, XP055101432, DOI: 10.1126/science.1210597 |
JAKIMO ET AL., BIORXIV, A CAS9 WITH COMPLETE PAM RECOGNITION FOR ADENINE DINUCLEOTIDES, September 2018 (2018-09-01) |
JIANG, W. ET AL.: "RNA-guided editing of bacterial genomes using CRISPR-Cas systems", NATURE BIOTECHNOLOGY, vol. 31, 2013, pages 233 - 239, XP055249123, DOI: 10.1038/nbt.2508 |
JINEK M.; CHYLINSKI K.; FONFARA I.; HAUER M.; DOUDNA J.A.; CHARPENTIER E.: "A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity", SCIENCE, vol. 337, 2012, pages 816 - 821, XP055549487, DOI: 10.1126/science.1225829 |
JINEK, M. ET AL.: "RNA-programmed genome editing in human cells", ELIFE, vol. 2, 2013, pages e00471, XP002699851, DOI: 10.7554/eLife.00471 |
KAUFMAN ET AL., EMBO J., vol. 6, 1987, pages 187 - 195 |
KAYA ET AL.: "A bacterial Argonaute with noncanonical guide RNA specificity", PROC NATL ACAD SCI USA., vol. 113, no. 15, 12 April 2016 (2016-04-12), pages 4057 - 62, XP055482683, DOI: 10.1073/pnas.1524385113 |
KESSEL; GRUSS, SCIENCE, vol. 249, 1990, pages 374 - 379 |
KLEINSTIVER, B. P. ET AL.: "Broadening the targeting range of Staphylococcus aureus CRISPR-Cas9 by modifying PAM recognition", NATURE BIOTECHNOLOGY, vol. 33, 2015, pages 1293 - 1298, XP055309933, DOI: 10.1038/nbt.3404 |
KLEINSTIVER, B. P. ET AL.: "Engineered CRISPR-Cas9 nucleases with altered PAM specificities", NATURE, vol. 523, 2015, pages 481 - 485, XP055293257, DOI: 10.1038/nature14592 |
KOMOR, A. C.; BADRAN, A. H.; LIU, D. R.: "CRISPR-Based Technologies for the Manipulation of Eukaryotic Genomes", CELL, vol. 168, 2017, pages 20 - 36, XP002781814, DOI: 10.1016/j.cell.2016.10.044 |
KOMOR, A.C. ET AL.: "Improved base excision repair inhibition and bacteriophage Mu Gam protein yields C:G-to-T:A base editors with higher efficiency and product purity", SCI ADV, vol. 3, 2017, XP055453964, DOI: 10.1126/sciadv.aao4774 |
KOMOR, A.C. ET AL.: "Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage", NATURE, vol. 533, 2016, pages 420 - 424, XP055551781, DOI: 10.1038/nature17946 |
KOTIN, HUMAN GENE THERAPY, vol. 5, 1994, pages 793 - 801 |
KREMER; PERRICAUDET, BRITISH MEDICAL BULLETIN, vol. 51, no. 1, 1995, pages 31 - 44 |
KURJAN; HERSKOWITZ, CELL, vol. 30, 1982, pages 933 - 943 |
LANDRUM, M.J. ET AL.: "ClinVar: public archive of relationships among sequence variation and human phenotype", NUCLEIC ACIDS RES., vol. 42, 2014, pages D980 - 985 |
LEONARD, G. A. ET AL.: "Conformation of guanine-8-oxoadenine base pairs in the crystal structure of d(CGCGAATT(O8A)GCG)", BIOCHEM., vol. 31, no. 36, 1992, pages 8415 - 8420 |
LI JF ET AL.: "Multiplex and homologous recombination-mediated genome editing in Arabidopsis and Nicotiana benthamiana using guide RNA and Cas9", NATURE BIOTECHNOLOGY, vol. 31, 2013, pages 688 - 691, XP055129103, DOI: 10.1038/nbt.2654 |
LIANG, P. ET AL.: "Genome-wide profiling of adenine base editor specificity by EndoV-seq", NAT. COMMUN., vol. 10, 2019, pages 67 |
LIEFKE ET AL.: "The Oxidative Demethylase ALKBH3 Marks Hyperactive Gene Promoters In Human Cancer Cells", GENOME MEDICINE, vol. 7, 2015, pages 66 |
LIU ET AL., CELL DISCOVERY, vol. 5, 2019, pages 58 |
LIU ET AL.: "C2c1-sgRNA Complex Structure Reveals RNA-Guided DNA Cleavage Mechanism", MOL. CELL, vol. 65, no. 2, 19 January 2017 (2017-01-19), pages 310 - 322, XP029890333, DOI: 10.1016/j.molcel.2016.11.040 |
LIU ET AL.: "CasX enzymes comprise a distinct family of RNA-guided genome editors", NATURE, vol. 566, 2019, pages 218 - 223 |
LUCKOW; SUMMERS, VIROLOGY, vol. 170, 1989, pages 6.3.1 - 6.3.6,2.10.3 |
MAGIN ET AL., VIROLOGY, vol. 274, 2000, pages 11 - 16 |
MAKAROVA ET AL.: "C2c2 is a single-component programmable RNA-guided RNA-targeting CRISPR effector", SCIENCE, vol. 353, no. 6299, 2016, XP055407082, DOI: 10.1126/science.aaf5573 |
MAKAROVA K. ET AL.: "Prokaryotic homologs of Argonaute proteins are predicted to function as key components of a novel system of defense against mobile genetic elements", BIOL DIRECT., vol. 4, 25 August 2009 (2009-08-25), pages 29, XP021059840, DOI: 10.1186/1745-6150-4-29 |
MALI P; ESVELT KM; CHURCH GM: "Cas9 as a versatile tool for engineering biology", NATURE METHODS, vol. 10, 2013, pages 957 - 963, XP002718606, DOI: 10.1038/nmeth.2649 |
MALI, P. ET AL.: "RNA-guided human genome engineering via Cas9", SCIENCE, vol. 339, 2013, pages 823 - 826, XP055469277, DOI: 10.1126/science.1232033 |
MARTHA R. J. CLOKIE; ANDREW M. KROPINSKI: "Bacteriophages: Methods and Protocols", vol. 2, December 2008, HUMANA PRESS, article "Isolation, Characterization, and Interactions (Methods in Molecular Biology)" |
MCSHAN W.M.; AJDIC D.J.; SAVIC D.J.; SAVIC G.; LYON K.; PRIMEAUX C.; SEZATE S.; SUVOROV A.N.; KENTON S.; LAI H.S.: "Complete genome sequence of an M1 strain of Streptococcus pyogenes", PROC. NATL. ACAD. SCI. U.S.A., vol. 98, 2001, pages 4658 - 4663 |
MILLER ET AL., J. VIROL., vol. 65, 1991, pages 2220 - 2224 |
MILLER, NATURE, vol. 357, 1992, pages 455 - 460 |
MITANI; CASKEY, TIBTECH, vol. 11, 1993, pages 167 - 175 |
MOEDE ET AL., FEBS LETT., vol. 461, 1999, pages 229 - 34 |
MOL THER., vol. 20, no. 4, April 2012 (2012-04-01), pages 699 - 708 |
MUZYCZKA, J. CLIN. INVEST., vol. 94, 1994, pages 1351 |
NAKAMURA, Y. ET AL.: "Codon usage tabulated from the international DNA sequence databases: status for the year 2000", NUCL. ACIDS RES., vol. 28, 2000, pages 292, XP002941557, DOI: 10.1093/nar/28.1.292 |
NISHIMASU ET AL.: "Crystal structure of Cas9 in complex with guide RNA and target DNA", CELL, vol. 156, no. 5, pages 935 - 949, XP028667665, DOI: 10.1016/j.cell.2014.02.001 |
NORMAN, D. P.; CHUNG, S. J.; VERDINE, G. L.: "Structural and biochemical exploration of a critical amino acid in human 8-oxo-guanine glycosylase", BIOCHEMISTRY, vol. 42, 2003, pages 1564 - 1572 |
OAKES ET AL.: "CRISPR-Cas9 Circular Permutants as Programmable Scaffolds for Genome Modification", CELL, vol. 176, 10 January 2019 (2019-01-10), pages 254 - 267 |
OAKES ET AL.: "Protein Engineering of Cas9 for enhanced function", METHODS ENZYMOL, vol. 546, 2014, pages 491 - 511, XP008176614, DOI: 10.1016/B978-0-12-801185-0.00024-6 |
OHE, T.; WATANABE, Y.: "Purification and Properties of Xanthine Dehydrogenase from Streptomyces cyanogenus", J. BIOCHEM., vol. 86, 1979, pages 45 - 53 |
PA CARR; GM CHURCH, NATURE BIOTECH., vol. 27, no. 12, 2009, pages 1151 - 62 |
PÅL Ø. FALNES ET AL: "DNA repair by bacterial AlkB proteins", RESEARCH IN MICROBIOLOGY, vol. 154, no. 8, 1 October 2003 (2003-10-01), NL, pages 531 - 538, XP055701885, ISSN: 0923-2508, DOI: 10.1016/S0923-2508(03)00150-5 * |
PINKERT ET AL., GENES DEV., vol. 1, 1987, pages 268 - 277 |
QI ET AL., CELL, vol. 152, no. 5, 2013, pages 1173 - 83 |
QUEEN; BALTIMORE, CELL, vol. 33, 1983, pages 741 - 748 |
REES HOLLY A ET AL: "Base editing: precision chemistry on the genome and transcriptome of living cells", NATURE REVIEWS GENETICS, NATURE PUBLISHING GROUP, GB, vol. 19, no. 12, 15 October 2018 (2018-10-15), pages 770 - 788, XP036637435, ISSN: 1471-0056, [retrieved on 20181015], DOI: 10.1038/S41576-018-0059-1 * |
REES; LIU: "Base editing: precision chemistry on the genome and transcriptome of living cells", NAT REV GENET., vol. 19, no. 12, 2018, pages 770 - 788, XP036637441, DOI: 10.1038/s41576-018-0068-0 |
REES; LIU: "Base editing: precision chemistry on the genome and transcriptome of living cells", NAT REV GENET., vol. 19, no. 12, December 2018 (2018-12-01), pages 770 - 788 |
REMY ET AL., BIOCONJUGATE CHEM., vol. 5, 1994, pages 647 - 654 |
SAMULSKI ET AL., J. VIROL., vol. 63, 1989, pages 3822 - 3828 |
SCHULTZ ET AL., GENE, vol. 54, 1987, pages 113 - 123 |
REES, H.A. ET AL.: "Improving the DNA specificity and applicability of base editing through protein engineering and protein delivery", NAT. COMMUN., vol. 8, 2017, pages 15790, XP055597104, DOI: 10.1038/ncomms15790 |
SEED, NATURE, vol. 329, 1987, pages 840 |
SHMAKOV ET AL.: "Discovery and Functional Characterization of Diverse Class 2 CRISPR Cas Systems", MOL. CELL, vol. 60, no. 3, 5 November 2015 (2015-11-05), pages 385 - 397, XP055482679, DOI: 10.1016/j.molcel.2015.10.008 |
SICES, H. J. ET AL.: "Rapid genetic selection of inhibitor-resistant protease mutants: clinically relevant and novel mutants of the HIV protease", AIDS RES HUM RETROVIRUSES, vol. 17, no. 13, 2001, pages 1249 - 55 |
SICES, H. J.; KRISTIE, T. M.: "A genetic screen for the isolation and characterization of site-specific proteases", PROC. NATL. ACAD. SCI. USA, vol. 95, no. 6, 1998, pages 2828 - 33, XP002306297, DOI: 10.1073/pnas.95.6.2828 |
SMITH ET AL., MOL. CELL. BIOL., vol. 3, 1983, pages 2156 - 2165 |
SOMMERFELT ET AL., VIROL., vol. 176, 1990, pages 58 - 59 |
SUZUKI T. ET AL.: "Crystal structures reveal an elusive functional domain of pyrrolysyl-tRNA synthetase", NAT CHEM BIOL., vol. 13, no. 12, 2017, pages 1261 - 1266 |
SWARTS ET AL., NATURE, vol. 507, no. 7491, 2014, pages 258 - 61 |
SWARTS ET AL., NUCLEIC ACIDS RES., vol. 43, no. 10, 2015, pages 5120 - 9 |
THURONYI, B.W. ET AL.: "Continuous evolution of base editors with expanded target compatibility and improved activity", NAT. BIOTECHNOL., 2019, pages 1070 - 1079, XP036878165, DOI: 10.1038/s41587-019-0193-0 |
TINLAND ET AL., PROC. NATL. ACAD. SCI. U.S.A., vol. 89, 1992, pages 7442 - 46 |
TRATSCHIN ET AL., MOL. CELL. BIOL., vol. 4, 1984, pages 2072 - 2081 |
TRATSCHIN ET AL., MOL. CELL. BIOL., vol. 5, 1985, pages 3251 - 3260 |
TSAI, S. Q. ET AL.: "GUIDE-seq enables genome-wide profiling of off-target cleavage by CRISPR-Cas nucleases", NATURE BIOTECHNOLOGY, vol. 33, 2015, pages 187 - 197, XP055555627, DOI: 10.1038/nbt.3117 |
VAN BRUNT, BIOTECHNOLOGY, vol. 6, no. 10, 1988, pages 1149 - 1154 |
VIDAL; LEGRAIN: "Yeast n-hybrid review", NUCLEIC ACID RES., vol. 27, 1999, pages 919 |
VIGNE, RESTORATIVE NEUROLOGY AND NEUROSCIENCE, vol. 8, 1995, pages 35 - 36 |
WANG, T.; BADRAN, A.H.; HUANG, T.P.; LIU, D.R.: "Continuous directed evolution of proteins with improved soluble expression", NAT. CHEM. BIOL., vol. 14, 2018, pages 972 - 980, XP036592855, DOI: 10.1038/s41589-018-0121-5 |
WEST ET AL., VIROLOGY, vol. 160, 1987, pages 38 - 47 |
WHARTON, R. P.; PTASHNE, M.: "A new-specificity mutant of 434 repressor that defines an amino acid-base pair contact", NATURE, vol. 326, no. 6116, 1987, pages 888 - 91 |
WHARTON, R. P.; PTASHNE, M.: "Changing the binding specificity of a repressor by redesigning an alpha-helix", NATURE, vol. 316, no. 6029, 1985, pages 601 - 5 |
WINOTO; BALTIMORE, EMBO J., vol. 8, 1989, pages 729 - 733 |
YAMANO ET AL.: "Crystal structure of Cpf1 in complex with guide RNA and target DNA", CELL, vol. 165, 2016, pages 949 - 962 |
YANG ET AL.: "PAM-dependent Target DNA Recognition and Cleavage by C2C1 CRISPR-Cas endonuclease", CELL, vol. 167, no. 7, 15 December 2016 (2016-12-15), pages 1814 - 1828, XP029850724, DOI: 10.1016/j.cell.2016.11.053 |
YU ET AL., GENE THERAPY, vol. 1, 1994, pages 13 - 26 |
ZETSCHE ET AL., CELL, vol. 163, 2015, pages 759 - 771 |
ZHANG Y. P. ET AL., GENE THER., vol. 6, 1999, pages 1438 - 47 |
ZHIZHONG GONG ET AL: "Active DNA demethylation by oxidation and repair", CELL RESEARCH - XIBAO YANJIU, vol. 21, no. 12, 23 August 2011 (2011-08-23), GB, CN, pages 1649 - 1651, XP055701331, ISSN: 1001-0602, DOI: 10.1038/cr.2011.140 * |
ZOLOTUKHIN ET AL.: "Production and purification of serotype 1, 2, and 5 recombinant adeno-associated viral vectors", METHODS, vol. 28, 2002, pages 158 - 167, XP002256404, DOI: 10.1016/S1046-2023(02)00220-7 |
ZUKER; STIEGLER, NUCLEIC ACIDS RES., vol. 9, 1981, pages 133 - 148 |
Cited By (93)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US12006520B2 (en) | 2011-07-22 | 2024-06-11 | President And Fellows Of Harvard College | Evaluation and improvement of nuclease cleavage specificity |
US11920181B2 (en) | 2013-08-09 | 2024-03-05 | President And Fellows Of Harvard College | Nuclease profiling system |
US11299755B2 (en) | 2013-09-06 | 2022-04-12 | President And Fellows Of Harvard College | Switchable CAS9 nucleases and uses thereof |
US11124782B2 (en) | 2013-12-12 | 2021-09-21 | President And Fellows Of Harvard College | Cas variants for gene editing |
US11053481B2 (en) | 2013-12-12 | 2021-07-06 | President And Fellows Of Harvard College | Fusions of Cas9 domains and nucleic acid-editing domains |
US12215365B2 (en) | 2013-12-12 | 2025-02-04 | President And Fellows Of Harvard College | Cas variants for gene editing |
US11578343B2 (en) | 2014-07-30 | 2023-02-14 | President And Fellows Of Harvard College | CAS9 proteins including ligand-dependent inteins |
US11214780B2 (en) | 2015-10-23 | 2022-01-04 | President And Fellows Of Harvard College | Nucleobase editors and uses thereof |
US12043852B2 (en) | 2015-10-23 | 2024-07-23 | President And Fellows Of Harvard College | Evolved Cas9 proteins for gene editing |
US12344869B2 (en) | 2015-10-23 | 2025-07-01 | President And Fellows Of Harvard College | Nucleobase editors and uses thereof |
US11999947B2 (en) | 2016-08-03 | 2024-06-04 | President And Fellows Of Harvard College | Adenosine nucleobase editors and uses thereof |
US11702651B2 (en) | 2016-08-03 | 2023-07-18 | President And Fellows Of Harvard College | Adenosine nucleobase editors and uses thereof |
US11661590B2 (en) | 2016-08-09 | 2023-05-30 | President And Fellows Of Harvard College | Programmable CAS9-recombinase fusion proteins and uses thereof |
US12084663B2 (en) | 2016-08-24 | 2024-09-10 | President And Fellows Of Harvard College | Incorporation of unnatural amino acids into proteins using base editing |
US11542509B2 (en) | 2016-08-24 | 2023-01-03 | President And Fellows Of Harvard College | Incorporation of unnatural amino acids into proteins using base editing |
US11306324B2 (en) | 2016-10-14 | 2022-04-19 | President And Fellows Of Harvard College | AAV delivery of nucleobase editors |
US11820969B2 (en) | 2016-12-23 | 2023-11-21 | President And Fellows Of Harvard College | Editing of CCR2 receptor gene to protect against HIV infection |
US11898179B2 (en) | 2017-03-09 | 2024-02-13 | President And Fellows Of Harvard College | Suppression of pain by gene editing |
US11542496B2 (en) | 2017-03-10 | 2023-01-03 | President And Fellows Of Harvard College | Cytosine to guanine base editor |
US11268082B2 (en) | 2017-03-23 | 2022-03-08 | President And Fellows Of Harvard College | Nucleobase editors comprising nucleic acid programmable DNA binding proteins |
US11560566B2 (en) | 2017-05-12 | 2023-01-24 | President And Fellows Of Harvard College | Aptazyme-embedded guide RNAs for use with CRISPR-Cas9 in genome editing and transcriptional activation |
US11732274B2 (en) | 2017-07-28 | 2023-08-22 | President And Fellows Of Harvard College | Methods and compositions for evolving base editors using phage-assisted continuous evolution (PACE) |
US11319532B2 (en) | 2017-08-30 | 2022-05-03 | President And Fellows Of Harvard College | High efficiency base editors comprising Gam |
US11932884B2 (en) | 2017-08-30 | 2024-03-19 | President And Fellows Of Harvard College | High efficiency base editors comprising Gam |
US11795443B2 (en) | 2017-10-16 | 2023-10-24 | The Broad Institute, Inc. | Uses of adenosine base editors |
US12157760B2 (en) | 2018-05-23 | 2024-12-03 | The Broad Institute, Inc. | Base editors and uses thereof |
US12281338B2 (en) | 2018-10-29 | 2025-04-22 | The Broad Institute, Inc. | Nucleobase editors comprising GeoCas9 and uses thereof |
US11795452B2 (en) | 2019-03-19 | 2023-10-24 | The Broad Institute, Inc. | Methods and compositions for prime editing nucleotide sequences |
US12281303B2 (en) | 2019-03-19 | 2025-04-22 | The Broad Institute, Inc. | Methods and compositions for prime editing nucleotide sequences |
US11447770B1 (en) | 2019-03-19 | 2022-09-20 | The Broad Institute, Inc. | Methods and compositions for prime editing nucleotide sequences |
US11643652B2 (en) | 2019-03-19 | 2023-05-09 | The Broad Institute, Inc. | Methods and compositions for prime editing nucleotide sequences |
US12098372B2 (en) | 2019-12-30 | 2024-09-24 | Eligo Bioscience | Microbiome modulation of a host by delivery of DNA payloads with minimized spread |
US11584781B2 (en) | 2019-12-30 | 2023-02-21 | Eligo Bioscience | Chimeric receptor binding proteins resistant to proteolytic degradation |
US11746352B2 (en) | 2019-12-30 | 2023-09-05 | Eligo Bioscience | Microbiome modulation of a host by delivery of DNA payloads with minimized spread |
US12351837B2 (en) | 2020-01-23 | 2025-07-08 | The Broad Institute, Inc. | Supernegatively charged proteins and uses thereof |
US11617773B2 (en) | 2020-04-08 | 2023-04-04 | Eligo Bioscience | Elimination of colonic bacterial driving lethal inflammatory cardiomyopathy |
US11690880B2 (en) | 2020-04-08 | 2023-07-04 | Eligo Bioscience | Modulation of microbiota function by gene therapy of the microbiome to prevent, treat or cure microbiome-associated diseases or disorders |
US11224621B2 (en) | 2020-04-08 | 2022-01-18 | Eligo Bioscience | Modulation of microbiota function by gene therapy of the microbiome to prevent, treat or cure microbiome-associated diseases or disorders |
US11376286B2 (en) | 2020-04-08 | 2022-07-05 | Eligo Bioscience | Modulation of microbiota function by gene therapy of the microbiome to prevent, treat or cure microbiome-associated diseases or disorders |
US11534467B2 (en) | 2020-04-08 | 2022-12-27 | Eligo Bioscience | Modulation of microbiota function by gene therapy of the microbiome to prevent, treat or cure microbiome-associated diseases or disorders |
US12031126B2 (en) | 2020-05-08 | 2024-07-09 | The Broad Institute, Inc. | Methods and compositions for simultaneous editing of both strands of a target double-stranded nucleotide sequence |
US11912985B2 (en) | 2020-05-08 | 2024-02-27 | The Broad Institute, Inc. | Methods and compositions for simultaneous editing of both strands of a target double-stranded nucleotide sequence |
WO2021250284A1 (en) | 2020-06-12 | 2021-12-16 | Eligo Bioscience | Specific decolonization of antibiotic resistant bacteria for prophylactic purposes |
EP3922719A1 (en) | 2020-06-12 | 2021-12-15 | Eligo Bioscience | Specific decolonization of antibiotic resistant bacteria for prophylactic purposes |
WO2022003209A1 (en) | 2020-07-03 | 2022-01-06 | Eligo Bioscience | Method of containment of nucleic acid vectors introduced in a microbiome population |
WO2022096590A1 (en) | 2020-11-04 | 2022-05-12 | Eligo Bioscience | Phage-derived particles for in situ delivery of DNA payload into C. acnes population |
US11473093B2 (en) | 2020-11-04 | 2022-10-18 | Eligo Bioscience | Cutibacterium acnes recombinant phages, method of production and uses thereof |
US11970701B2 (en) | 2020-11-04 | 2024-04-30 | Eligo Bioscience | Phage-derived particles for in situ delivery of DNA payload into C. acnes population |
WO2022096596A1 (en) | 2020-11-04 | 2022-05-12 | Eligo Bioscience | Cutibacterium acnes recombinant phages, method of production and uses thereof |
US11840695B2 (en) | 2020-11-04 | 2023-12-12 | Eligo Bioscience | Recombinant C. acnes phages comprising transgenes |
US11820989B2 (en) | 2020-11-04 | 2023-11-21 | Eligo Bioscience | Phage-derived particles for in situ delivery of DNA payload into C. acnes population |
WO2022144381A1 (en) | 2020-12-30 | 2022-07-07 | Eligo Bioscience | Microbiome modulation of a host by delivery of DNA payloads with minimized spread |
WO2022144382A1 (en) | 2020-12-30 | 2022-07-07 | Eligo Bioscience | Chimeric receptor binding proteins resistant to proteolytic degradation |
US11952595B2 (en) | 2021-05-12 | 2024-04-09 | Eligo Bioscience | Production of lytic phages |
US11697802B2 (en) | 2021-05-12 | 2023-07-11 | Eligo Bioscience | Production bacterial cells and use thereof in production methods |
WO2022238552A1 (en) | 2021-05-12 | 2022-11-17 | Eligo Bioscience | Production bacterial cells and use thereof in production methods |
US11739304B2 (en) | 2021-05-12 | 2023-08-29 | Eligo Bioscience | Production of lytic phages |
WO2022238555A1 (en) | 2021-05-12 | 2022-11-17 | Eligo Bioscience | Production of lytic phages |
US11939598B2 (en) | 2021-05-12 | 2024-03-26 | Eligo Bioscience | Production bacterial cells and use thereof in production methods |
WO2022251712A1 (en) | 2021-05-28 | 2022-12-01 | Sana Biotechnology, Inc. | Lipid particles containing a truncated baboon endogenous retrovirus (BaEV) envelope glycoprotein and related methods and uses |
WO2023019229A1 (en) | 2021-08-11 | 2023-02-16 | Sana Biotechnology, Inc. | Genetically modified primary cells for allogeneic cell therapy |
WO2023019227A1 (en) | 2021-08-11 | 2023-02-16 | Sana Biotechnology, Inc. | Genetically modified cells for allogeneic cell therapy to reduce complement-mediated inflammatory reactions |
WO2023019225A2 (en) | 2021-08-11 | 2023-02-16 | Sana Biotechnology, Inc. | Genetically modified cells for allogeneic cell therapy to reduce instant blood-mediated inflammatory reactions |
WO2023019226A1 (en) | 2021-08-11 | 2023-02-16 | Sana Biotechnology, Inc. | Genetically modified cells for allogeneic cell therapy |
WO2023069790A1 (en) | 2021-10-22 | 2023-04-27 | Sana Biotechnology, Inc. | Methods of engineering allogeneic T cells with a transgene in a TCR locus and associated compositions and methods |
WO2023115039A2 (en) | 2021-12-17 | 2023-06-22 | Sana Biotechnology, Inc. | Modified paramyxoviridae fusion glycoproteins |
WO2023115041A1 (en) | 2021-12-17 | 2023-06-22 | Sana Biotechnology, Inc. | Modified paramyxoviridae attachment glycoproteins |
WO2023133595A2 (en) | 2022-01-10 | 2023-07-13 | Sana Biotechnology, Inc. | Methods of ex vivo dosing and administration of lipid particles or viral vectors and related systems and uses |
WO2023150518A1 (en) | 2022-02-01 | 2023-08-10 | Sana Biotechnology, Inc. | CD3-targeted lentiviral vectors and uses thereof |
WO2023150647A1 (en) | 2022-02-02 | 2023-08-10 | Sana Biotechnology, Inc. | Methods of repeat dosing and administration of lipid particles or viral vectors and related systems and uses |
WO2023158836A1 (en) | 2022-02-17 | 2023-08-24 | Sana Biotechnology, Inc. | Engineered CD47 proteins and uses thereof |
WO2023217280A1 (en) * | 2022-05-13 | 2023-11-16 | Huidagene Therapeutics Co., Ltd. | Programmable adenine base editor and uses thereof |
WO2024044655A1 (en) | 2022-08-24 | 2024-02-29 | Sana Biotechnology, Inc. | Delivery of heterologous proteins |
WO2024047151A1 (en) | 2022-08-31 | 2024-03-07 | Snipr Biome Aps | Novel type of CRISPR/Cas system |
EP4574980A2 (en) | 2022-08-31 | 2025-06-25 | SNIPR Biome ApS | Novel type of CRISPR/Cas system |
WO2024064838A1 (en) | 2022-09-21 | 2024-03-28 | Sana Biotechnology, Inc. | Lipid particles comprising variant paramyxovirus attachment glycoproteins and uses thereof |
WO2024081820A1 (en) | 2022-10-13 | 2024-04-18 | Sana Biotechnology, Inc. | Viral particles targeting hematopoietic stem cells |
WO2024097314A2 (en) | 2022-11-02 | 2024-05-10 | Sana Biotechnology, Inc. | Methods and systems for determining donor cell features and formulating cell therapy products based on cell features |
WO2024097313A1 (en) | 2022-11-02 | 2024-05-10 | Sana Biotechnology, Inc. | Methods for producing T cell therapy products |
WO2024097311A2 (en) | 2022-11-02 | 2024-05-10 | Sana Biotechnology, Inc. | Hypoimmunogenic MAIT cells, methods of making and methods of using same |
WO2024097315A2 (en) | 2022-11-02 | 2024-05-10 | Sana Biotechnology, Inc. | Cell therapy products and methods for producing same |
WO2024119157A1 (en) | 2022-12-02 | 2024-06-06 | Sana Biotechnology, Inc. | Lipid particles with cofusogens and methods of producing and using the same |
WO2024151541A1 (en) | 2023-01-09 | 2024-07-18 | Sana Biotechnology, Inc. | Autoimmune mouse with type 1 diabetes |
US12359218B2 (en) | 2023-03-03 | 2025-07-15 | President And Fellows Of Harvard College | Methods and compositions for evolving base editors using phage-assisted continuous evolution (PACE) |
WO2024220598A2 (en) | 2023-04-18 | 2024-10-24 | Sana Biotechnology, Inc. | Lentiviral vectors with two or more genomes |
WO2024220560A1 (en) | 2023-04-18 | 2024-10-24 | Sana Biotechnology, Inc. | Engineered protein G fusogens and related lipid particles and methods thereof |
WO2024220574A1 (en) | 2023-04-18 | 2024-10-24 | Sana Biotechnology, Inc. | Universal protein G fusogens and adapter systems thereof and related lipid particles and uses |
WO2024229302A1 (en) | 2023-05-03 | 2024-11-07 | Sana Biotechnology, Inc. | Methods of dosing and administration of engineered islet cells |
WO2024243236A2 (en) | 2023-05-22 | 2024-11-28 | Sana Biotechnology, Inc. | Methods of administering pancreatic islet cells and related methods |
WO2024243340A1 (en) | 2023-05-23 | 2024-11-28 | Sana Biotechnology, Inc. | Tandem fusogens and related lipid particles |
WO2025021861A1 (en) | 2023-07-24 | 2025-01-30 | Eligo Bioscience | Detection and treatment of diseases associated with C. acnes bacteria |
WO2025046062A1 (en) | 2023-08-31 | 2025-03-06 | Snipr Biome Aps | Novel type of CRISPR/Cas system |
WO2025054202A1 (en) | 2023-09-05 | 2025-03-13 | Sana Biotechnology, Inc. | Method of screening a sample containing a transgene using a unique barcode |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20220170013A1 (en) | T:a to a:t base editing through adenosine methylation | |
WO2020181202A1 (en) | A:T to T:A base editing through adenine deamination and oxidation | |
US11732274B2 (en) | Methods and compositions for evolving base editors using phage-assisted continuous evolution (PACE) | |
WO2020181195A1 (en) | T:A to A:T base editing through adenine excision | |
WO2020181178A1 (en) | T:A to A:T base editing through thymine alkylation | |
US20230235309A1 (en) | Adenine base editors and uses thereof | |
US20220282275A1 (en) | G-to-t base editors and uses thereof | |
US20220307003A1 (en) | Adenine base editors with reduced off-target effects | |
WO2020181180A1 (en) | A:T to C:G base editors and uses thereof | |
WO2021030666A1 (en) | Base editing by transglycosylation | |
US12031126B2 (en) | Methods and compositions for simultaneous editing of both strands of a target double-stranded nucleotide sequence | |
US20250059244A1 (en) | Base editors and uses thereof | |
US20240076652A1 (en) | Adenosine nucleobase editors and uses thereof | |
US20220380740A1 (en) | Constructs for improved hdr-dependent genomic editing | |
US20240417715A1 (en) | Methods and compositions for prime editing rna | |
US20230086199A1 (en) | Systems and methods for evaluating cas9-independent off-target editing of nucleic acids | |
US20220204975A1 (en) | System for genome editing | |
US20240287487A1 (en) | Improved cytosine to guanine base editors | |
EP4370666A2 (en) | Context-specific adenine base editors and uses thereof | |
US12359218B2 (en) | Methods and compositions for evolving base editors using phage-assisted continuous evolution (PACE) | |
US20250101395A1 (en) | Evolved cas14a1 variants, compositions, and methods of making and using same in genome editing | |
CN118202041A (en) | Context-specific adenine base editors and uses thereof |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 20715642; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | EP: PCT application non-entry in European phase | Ref document number: 20715642; Country of ref document: EP; Kind code of ref document: A1 |