WO2024211306A1 - RF1 KO E. coli strains
- Publication number
- WO2024211306A1 (PCT/US2024/022668)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- deficient
- protein
- cell
- coli cell
- coli
- 241000588724 Escherichia coli Species 0.000 title claims abstract description 220
- 210000004027 cell Anatomy 0.000 claims abstract description 266
- 108090000623 proteins and genes Proteins 0.000 claims abstract description 261
- 230000002950 deficient Effects 0.000 claims abstract description 147
- 108020004705 Codon Proteins 0.000 claims abstract description 134
- 230000035772 mutation Effects 0.000 claims abstract description 66
- 230000000694 effects Effects 0.000 claims abstract description 51
- 230000001590 oxidative effect Effects 0.000 claims abstract description 26
- 210000000805 cytoplasm Anatomy 0.000 claims abstract description 25
- 102000004169 proteins and genes Human genes 0.000 claims description 204
- 235000018102 proteins Nutrition 0.000 claims description 196
- 150000001413 amino acids Chemical class 0.000 claims description 189
- 235000001014 amino acid Nutrition 0.000 claims description 188
- 229940024606 amino acid Drugs 0.000 claims description 186
- 238000000034 method Methods 0.000 claims description 106
- 108090000765 processed proteins & peptides Proteins 0.000 claims description 68
- 102000004196 processed proteins & peptides Human genes 0.000 claims description 67
- 229920001184 polypeptide Polymers 0.000 claims description 65
- 230000014509 gene expression Effects 0.000 claims description 64
- GAJBPZXIKZXTCG-VIFPVBQESA-N (2s)-2-amino-3-[4-(azidomethyl)phenyl]propanoic acid Chemical compound OC(=O)[C@@H](N)CC1=CC=C(CN=[N+]=[N-])C=C1 GAJBPZXIKZXTCG-VIFPVBQESA-N 0.000 claims description 57
- 102000052866 Amino Acyl-tRNA Synthetases Human genes 0.000 claims description 50
- 108700028939 Amino Acyl-tRNA Synthetases Proteins 0.000 claims description 50
- 239000013612 plasmid Substances 0.000 claims description 40
- 108010071134 CRM197 (non-toxic variant of diphtheria toxin) Proteins 0.000 claims description 36
- 101150115248 hda gene Proteins 0.000 claims description 36
- 101100075089 Epichloe uncinata lolA1 gene Proteins 0.000 claims description 35
- 101100075091 Epichloe uncinata lolA2 gene Proteins 0.000 claims description 35
- 101150102170 coaD gene Proteins 0.000 claims description 35
- 101150112623 hemA gene Proteins 0.000 claims description 35
- 101150052914 lolA gene Proteins 0.000 claims description 35
- 101150070011 lpxK gene Proteins 0.000 claims description 35
- 101150092863 mreC gene Proteins 0.000 claims description 35
- 101150102210 murF gene Proteins 0.000 claims description 35
- 102000014914 Carrier Proteins Human genes 0.000 claims description 33
- XOOUIPVCVHRTMJ-UHFFFAOYSA-L zinc stearate Chemical compound [Zn+2].CCCCCCCCCCCCCCCCCC([O-])=O.CCCCCCCCCCCCCCCCCC([O-])=O XOOUIPVCVHRTMJ-UHFFFAOYSA-L 0.000 claims description 33
- 108010078791 Carrier Proteins Proteins 0.000 claims description 32
- 239000012634 fragment Substances 0.000 claims description 29
- 108091026890 Coding region Proteins 0.000 claims description 22
- 102000004127 Cytokines Human genes 0.000 claims description 21
- 108090000695 Cytokines Proteins 0.000 claims description 21
- 230000001580 bacterial effect Effects 0.000 claims description 21
- 108091033319 polynucleotide Proteins 0.000 claims description 20
- 102000040430 polynucleotide Human genes 0.000 claims description 20
- 239000002157 polynucleotide Substances 0.000 claims description 20
- 108010008281 Recombinant Fusion Proteins Proteins 0.000 claims description 17
- 102000007056 Recombinant Fusion Proteins Human genes 0.000 claims description 17
- 230000001965 increasing effect Effects 0.000 claims description 17
- 230000001939 inductive effect Effects 0.000 claims description 17
- 229960001153 serine Drugs 0.000 claims description 17
- 210000001744 T-lymphocyte Anatomy 0.000 claims description 16
- ZXSBHXZKWRIEIA-JTQLQIEISA-N (2s)-3-(4-acetylphenyl)-2-azaniumylpropanoate Chemical compound CC(=O)C1=CC=C(C[C@H](N)C(O)=O)C=C1 ZXSBHXZKWRIEIA-JTQLQIEISA-N 0.000 claims description 15
- 108010021625 Immunoglobulin Fragments Proteins 0.000 claims description 15
- 230000002163 immunogen Effects 0.000 claims description 15
- 101100012355 Bacillus anthracis fabH1 gene Proteins 0.000 claims description 14
- 101100012357 Bacillus subtilis (strain 168) fabHA gene Proteins 0.000 claims description 14
- 101100134884 Corynebacterium glutamicum (strain ATCC 13032 / DSM 20300 / BCRC 11384 / JCM 1318 / LMG 3730 / NCIMB 10025) aceF gene Proteins 0.000 claims description 14
- 101150090997 DLAT gene Proteins 0.000 claims description 14
- 101100484521 Haloferax volcanii (strain ATCC 29605 / DSM 3757 / JCM 8879 / NBRC 14742 / NCIMB 2012 / VKM B-1768 / DS2) atpF gene Proteins 0.000 claims description 14
- 102000008394 Immunoglobulin Fragments Human genes 0.000 claims description 14
- 101100110710 Streptococcus mutans serotype c (strain ATCC 700610 / UA159) atpH gene Proteins 0.000 claims description 14
- 101150090348 atpC gene Proteins 0.000 claims description 14
- 101150099875 atpE gene Proteins 0.000 claims description 14
- 101150035981 fabH gene Proteins 0.000 claims description 14
- 101150078036 odhB gene Proteins 0.000 claims description 14
- 101150055132 sucB gene Proteins 0.000 claims description 14
- 229960004441 tyrosine Drugs 0.000 claims description 14
- NEMHIKRLROONTL-QMMMGPOBSA-N (2s)-2-azaniumyl-3-(4-azidophenyl)propanoate Chemical compound OC(=O)[C@@H](N)CC1=CC=C(N=[N+]=[N-])C=C1 NEMHIKRLROONTL-QMMMGPOBSA-N 0.000 claims description 12
- 230000003213 activating effect Effects 0.000 claims description 12
- 101100483544 Escherichia coli (strain K12) ubiF gene Proteins 0.000 claims description 11
- 101100221590 Herbaspirillum seropedicae coq7 gene Proteins 0.000 claims description 11
- BZQFBWGGLXLEPQ-REOHCLBHSA-N phosphoserine Chemical compound OC(=O)[C@@H](N)COP(O)(O)=O BZQFBWGGLXLEPQ-REOHCLBHSA-N 0.000 claims description 11
- 238000013518 transcription Methods 0.000 claims description 9
- 230000035897 transcription Effects 0.000 claims description 9
- FWMNVWWHGCHHJJ-SKKKGAJSSA-N 4-amino-1-[(2r)-6-amino-2-[[(2r)-2-[[(2r)-2-[[(2r)-2-amino-3-phenylpropanoyl]amino]-3-phenylpropanoyl]amino]-4-methylpentanoyl]amino]hexanoyl]piperidine-4-carboxylic acid Chemical compound C([C@H](C(=O)N[C@H](CC(C)C)C(=O)N[C@H](CCCCN)C(=O)N1CCC(N)(CC1)C(O)=O)NC(=O)[C@H](N)CC=1C=CC=CC=1)C1=CC=CC=C1 FWMNVWWHGCHHJJ-SKKKGAJSSA-N 0.000 claims description 8
- 241000894006 Bacteria Species 0.000 claims description 8
- 102000001706 Immunoglobulin Fab Fragments Human genes 0.000 claims description 8
- 108010054477 Immunoglobulin Fab Fragments Proteins 0.000 claims description 8
- 102000014150 Interferons Human genes 0.000 claims description 8
- 108010050904 Interferons Proteins 0.000 claims description 8
- 102000015696 Interleukins Human genes 0.000 claims description 8
- 108010063738 Interleukins Proteins 0.000 claims description 8
- 239000002773 nucleotide Substances 0.000 claims description 8
- 125000003729 nucleotide group Chemical group 0.000 claims description 8
- JSXMFBNJRFXRCX-NSHDSACASA-N (2s)-2-amino-3-(4-prop-2-ynoxyphenyl)propanoic acid Chemical compound OC(=O)[C@@H](N)CC1=CC=C(OCC#C)C=C1 JSXMFBNJRFXRCX-NSHDSACASA-N 0.000 claims description 7
- 241000606768 Haemophilus influenzae Species 0.000 claims description 7
- 238000012258 culturing Methods 0.000 claims description 7
- 229940047650 haemophilus influenzae Drugs 0.000 claims description 7
- 229940047122 interleukins Drugs 0.000 claims description 7
- 108010012236 Chemokines Proteins 0.000 claims description 6
- 102000019034 Chemokines Human genes 0.000 claims description 6
- 102100037840 Dehydrogenase/reductase SDR family member 2, mitochondrial Human genes 0.000 claims description 6
- 101710188053 Protein D Proteins 0.000 claims description 6
- 101710132893 Resolvase Proteins 0.000 claims description 6
- 108030001722 Tentoxilysin Proteins 0.000 claims description 6
- 102000009618 Transforming Growth Factors Human genes 0.000 claims description 6
- 108010009583 Transforming Growth Factors Proteins 0.000 claims description 6
- 238000012239 gene modification Methods 0.000 claims description 6
- 229940047124 interferons Drugs 0.000 claims description 6
- YYTDJPUFAVPHQA-VKHMYHEASA-N (2s)-2-amino-3-(2,3,4,5,6-pentafluorophenyl)propanoic acid Chemical compound OC(=O)[C@@H](N)CC1=C(F)C(F)=C(F)C(F)=C1F YYTDJPUFAVPHQA-VKHMYHEASA-N 0.000 claims description 5
- PEMUHKUIQHFMTH-QMMMGPOBSA-N (2s)-2-amino-3-(4-bromophenyl)propanoic acid Chemical compound OC(=O)[C@@H](N)CC1=CC=C(Br)C=C1 PEMUHKUIQHFMTH-QMMMGPOBSA-N 0.000 claims description 5
- JZRBSTONIYRNRI-VIFPVBQESA-N 3-methylphenylalanine Chemical compound CC1=CC=CC(C[C@H](N)C(O)=O)=C1 JZRBSTONIYRNRI-VIFPVBQESA-N 0.000 claims description 5
- IRZQDMYEJPNDEN-UHFFFAOYSA-N 3-phenyl-2-aminobutanoic acid Natural products OC(=O)C(N)C(C)C1=CC=CC=C1 IRZQDMYEJPNDEN-UHFFFAOYSA-N 0.000 claims description 5
- CMUHFUGDYMFHEI-QMMMGPOBSA-N 4-amino-L-phenylalanine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(N)C=C1 CMUHFUGDYMFHEI-QMMMGPOBSA-N 0.000 claims description 5
- PZNQZSRPDOEBMS-QMMMGPOBSA-N 4-iodo-L-phenylalanine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(I)C=C1 PZNQZSRPDOEBMS-QMMMGPOBSA-N 0.000 claims description 5
- 108010067060 Immunoglobulin Variable Region Proteins 0.000 claims description 5
- WTDRDQBEARUVNC-LURJTMIESA-N L-DOPA Chemical compound OC(=O)[C@@H](N)CC1=CC=C(O)C(O)=C1 WTDRDQBEARUVNC-LURJTMIESA-N 0.000 claims description 5
- WTDRDQBEARUVNC-UHFFFAOYSA-N L-Dopa Natural products OC(=O)C(N)CC1=CC=C(O)C(O)=C1 WTDRDQBEARUVNC-UHFFFAOYSA-N 0.000 claims description 5
- GEYBMYRBIABFTA-VIFPVBQESA-N O-methyl-L-tyrosine Chemical compound COC1=CC=C(C[C@H](N)C(O)=O)C=C1 GEYBMYRBIABFTA-VIFPVBQESA-N 0.000 claims description 5
- 229950006137 dexfosfoserine Drugs 0.000 claims description 5
- 230000005017 genetic modification Effects 0.000 claims description 5
- 235000013617 genetically modified food Nutrition 0.000 claims description 5
- TVIDEEHSOPHZBR-AWEZNQCLSA-N para-(benzoyl)-phenylalanine Chemical compound C1=CC(C[C@H](N)C(O)=O)=CC=C1C(=O)C1=CC=CC=C1 TVIDEEHSOPHZBR-AWEZNQCLSA-N 0.000 claims description 5
- DCWXELXMIBXGTH-QMMMGPOBSA-N phosphonotyrosine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(OP(O)(O)=O)C=C1 DCWXELXMIBXGTH-QMMMGPOBSA-N 0.000 claims description 5
- 102000017727 Immunoglobulin Variable Region Human genes 0.000 claims description 4
- 239000001963 growth medium Substances 0.000 claims description 4
- 101150047875 RF2 gene Proteins 0.000 claims description 3
- 101710175727 Peptide chain release factor 1 Proteins 0.000 claims 3
- 230000004083 survival effect Effects 0.000 abstract 1
- 108020004566 Transfer RNA Proteins 0.000 description 70
- 238000009482 thermal adhesion granulation Methods 0.000 description 57
- -1 or equivalently Proteins 0.000 description 25
- 150000007523 nucleic acids Chemical class 0.000 description 24
- 102000039446 nucleic acids Human genes 0.000 description 22
- 108020004707 nucleic acids Proteins 0.000 description 22
- 108020005038 Terminator Codon Proteins 0.000 description 21
- 238000010348 incorporation Methods 0.000 description 20
- 230000015572 biosynthetic process Effects 0.000 description 19
- 230000014616 translation Effects 0.000 description 18
- 239000000427 antigen Substances 0.000 description 17
- 108091007433 antigens Proteins 0.000 description 17
- 102000036639 antigens Human genes 0.000 description 17
- 238000004519 manufacturing process Methods 0.000 description 17
- 108091033409 CRISPR Proteins 0.000 description 16
- 125000003275 alpha amino acid group Chemical group 0.000 description 16
- 239000000047 product Substances 0.000 description 16
- 239000006166 lysate Substances 0.000 description 15
- 230000006870 function Effects 0.000 description 14
- 238000003786 synthesis reaction Methods 0.000 description 13
- 102000003960 Ligases Human genes 0.000 description 12
- 108090000364 Ligases Proteins 0.000 description 12
- 238000004458 analytical method Methods 0.000 description 12
- RAXXELZNTBOGNW-UHFFFAOYSA-N imidazole Natural products C1=CNC=N1 RAXXELZNTBOGNW-UHFFFAOYSA-N 0.000 description 12
- 229930027917 kanamycin Natural products 0.000 description 12
- 229960000318 kanamycin Drugs 0.000 description 12
- SBUJHOSQTJFQJX-NOAMYHISSA-N kanamycin Chemical compound O[C@@H]1[C@@H](O)[C@H](O)[C@@H](CN)O[C@@H]1O[C@H]1[C@H](O)[C@@H](O[C@@H]2[C@@H]([C@@H](N)[C@H](O)[C@@H](CO)O2)O)[C@H](N)C[C@@H]1N SBUJHOSQTJFQJX-NOAMYHISSA-N 0.000 description 12
- 229930182823 kanamycin A Natural products 0.000 description 12
- 238000013519 translation Methods 0.000 description 12
- 238000010354 CRISPR gene editing Methods 0.000 description 11
- 102000004190 Enzymes Human genes 0.000 description 11
- 108090000790 Enzymes Proteins 0.000 description 11
- 230000027455 binding Effects 0.000 description 11
- 238000006243 chemical reaction Methods 0.000 description 11
- 239000004301 calcium benzoate Substances 0.000 description 10
- 229940088598 enzyme Drugs 0.000 description 10
- 239000013598 vector Substances 0.000 description 10
- 238000010453 CRISPR/Cas method Methods 0.000 description 9
- 108090000171 Interleukin-18 Proteins 0.000 description 9
- 102000003810 Interleukin-18 Human genes 0.000 description 9
- 230000004071 biological effect Effects 0.000 description 9
- LOKCTEFSRHRXRJ-UHFFFAOYSA-I dipotassium trisodium dihydrogen phosphate hydrogen phosphate dichloride Chemical compound P(=O)(O)(O)[O-].[K+].P(=O)(O)([O-])[O-].[Na+].[Na+].[Cl-].[K+].[Cl-].[Na+] LOKCTEFSRHRXRJ-UHFFFAOYSA-I 0.000 description 9
- 101150061398 fabR gene Proteins 0.000 description 9
- 230000012010 growth Effects 0.000 description 9
- 239000002953 phosphate buffered saline Substances 0.000 description 9
- 229920000642 polymer Polymers 0.000 description 9
- 235000002374 tyrosine Nutrition 0.000 description 9
- 108091060545 Nonsense suppressor Proteins 0.000 description 8
- 125000000539 amino acid group Chemical group 0.000 description 8
- 239000000872 buffer Substances 0.000 description 8
- 150000001720 carbohydrates Chemical group 0.000 description 8
- 238000004895 liquid chromatography mass spectrometry Methods 0.000 description 8
- 108020004999 messenger RNA Proteins 0.000 description 8
- 210000003705 ribosome Anatomy 0.000 description 8
- 239000000243 solution Substances 0.000 description 8
- 239000006228 supernatant Substances 0.000 description 8
- 108020004414 DNA Proteins 0.000 description 7
- DHMQDGOQFOQNFH-UHFFFAOYSA-N Glycine Chemical compound NCC(O)=O DHMQDGOQFOQNFH-UHFFFAOYSA-N 0.000 description 7
- 108020005004 Guide RNA Proteins 0.000 description 7
- 239000000203 mixture Substances 0.000 description 7
- 230000004048 modification Effects 0.000 description 7
- 238000012986 modification Methods 0.000 description 7
- 230000006798 recombination Effects 0.000 description 7
- 238000005215 recombination Methods 0.000 description 7
- 239000000523 sample Substances 0.000 description 7
- 238000006467 substitution reaction Methods 0.000 description 7
- 230000001131 transforming effect Effects 0.000 description 7
- 229960000575 trastuzumab Drugs 0.000 description 7
- 108091028043 Nucleic acid sequence Proteins 0.000 description 6
- 230000006229 amino acid addition Effects 0.000 description 6
- 229960001230 asparagine Drugs 0.000 description 6
- 238000003556 assay Methods 0.000 description 6
- 125000000852 azido group Chemical group *N=[N+]=[N-] 0.000 description 6
- 238000005119 centrifugation Methods 0.000 description 6
- 238000010367 cloning Methods 0.000 description 6
- 238000000855 fermentation Methods 0.000 description 6
- 230000004151 fermentation Effects 0.000 description 6
- 125000000524 functional group Chemical group 0.000 description 6
- 150000004676 glycans Chemical class 0.000 description 6
- 239000012139 lysis buffer Substances 0.000 description 6
- 150000002923 oximes Chemical class 0.000 description 6
- 238000011218 seed culture Methods 0.000 description 6
- 238000002415 sodium dodecyl sulfate polyacrylamide gel electrophoresis Methods 0.000 description 6
- 241000894007 species Species 0.000 description 6
- OUYCCCASQSFEME-UHFFFAOYSA-N tyrosine Natural products OC(=O)C(N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-UHFFFAOYSA-N 0.000 description 6
- 150000003668 tyrosines Chemical class 0.000 description 6
- RCTJXPOZTBLMNZ-VIFPVBQESA-N (2s)-3-(4-azidophenyl)-2-(methylamino)propanoic acid Chemical compound CN[C@H](C(O)=O)CC1=CC=C(N=[N+]=[N-])C=C1 RCTJXPOZTBLMNZ-VIFPVBQESA-N 0.000 description 5
- 108700004991 Cas12a Proteins 0.000 description 5
- 108060003951 Immunoglobulin Proteins 0.000 description 5
- DCXYFEDJOCDNAF-REOHCLBHSA-N L-asparagine Chemical compound OC(=O)[C@@H](N)CC(N)=O DCXYFEDJOCDNAF-REOHCLBHSA-N 0.000 description 5
- WHUUTDBJXJRKMK-VKHMYHEASA-N L-glutamic acid Chemical compound OC(=O)[C@@H](N)CCC(O)=O WHUUTDBJXJRKMK-VKHMYHEASA-N 0.000 description 5
- ZDXPYRJPNDTMRX-VKHMYHEASA-N L-glutamine Chemical compound OC(=O)[C@@H](N)CCC(N)=O ZDXPYRJPNDTMRX-VKHMYHEASA-N 0.000 description 5
- 102000016943 Muramidase Human genes 0.000 description 5
- 108010014251 Muramidase Proteins 0.000 description 5
- 108010062010 N-Acetylmuramoyl-L-alanine Amidase Proteins 0.000 description 5
- 239000004473 Threonine Substances 0.000 description 5
- 125000000304 alkynyl group Chemical group 0.000 description 5
- 238000013459 approach Methods 0.000 description 5
- SRBFZHDQGSBBOR-UHFFFAOYSA-N beta-D-Pyranose-Lyxose Natural products OC1COC(O)C(O)C1O SRBFZHDQGSBBOR-UHFFFAOYSA-N 0.000 description 5
- FPPNZSSZRUTDAP-UWFZAAFLSA-N carbenicillin Chemical compound N([C@H]1[C@H]2SC([C@@H](N2C1=O)C(O)=O)(C)C)C(=O)C(C(O)=O)C1=CC=CC=C1 FPPNZSSZRUTDAP-UWFZAAFLSA-N 0.000 description 5
- 229960003669 carbenicillin Drugs 0.000 description 5
- 239000003153 chemical reaction reagent Substances 0.000 description 5
- 239000003795 chemical substances by application Substances 0.000 description 5
- 102000018358 immunoglobulin Human genes 0.000 description 5
- 235000010335 lysozyme Nutrition 0.000 description 5
- 229960000274 lysozyme Drugs 0.000 description 5
- 239000004325 lysozyme Substances 0.000 description 5
- 239000003550 marker Substances 0.000 description 5
- 229960005190 phenylalanine Drugs 0.000 description 5
- 235000008729 phenylalanine Nutrition 0.000 description 5
- 229920001282 polysaccharide Polymers 0.000 description 5
- 239000005017 polysaccharide Substances 0.000 description 5
- 230000010076 replication Effects 0.000 description 5
- 239000000126 substance Substances 0.000 description 5
- 230000001629 suppression Effects 0.000 description 5
- 230000001225 therapeutic effect Effects 0.000 description 5
- 229960002898 threonine Drugs 0.000 description 5
- YBJHBAHKTGYVGT-ZKWXMUAHSA-N (+)-Biotin Chemical compound N1C(=O)N[C@@H]2[C@H](CCCCC(=O)O)SC[C@@H]21 YBJHBAHKTGYVGT-ZKWXMUAHSA-N 0.000 description 4
- GAJBPZXIKZXTCG-UHFFFAOYSA-N 2-amino-3-[4-(azidomethyl)phenyl]propanoic acid Chemical compound [NH3+]C(CC1=CC=C(CN=[N+]=[N-])C=C1)C([O-])=O GAJBPZXIKZXTCG-UHFFFAOYSA-N 0.000 description 4
- IJGRMHOSHXDMSA-UHFFFAOYSA-N Atomic nitrogen Chemical compound N#N IJGRMHOSHXDMSA-UHFFFAOYSA-N 0.000 description 4
- WHUUTDBJXJRKMK-UHFFFAOYSA-N Glutamic acid Natural products OC(=O)C(N)CCC(O)=O WHUUTDBJXJRKMK-UHFFFAOYSA-N 0.000 description 4
- OAKJQQAXSVQMHS-UHFFFAOYSA-N Hydrazine Chemical compound NN OAKJQQAXSVQMHS-UHFFFAOYSA-N 0.000 description 4
- 108010002350 Interleukin-2 Proteins 0.000 description 4
- 102000000588 Interleukin-2 Human genes 0.000 description 4
- 101000972623 Kluyveromyces lactis (strain ATCC 8585 / CBS 2359 / DSM 70799 / NBRC 1267 / NRRL Y-1140 / WM37) Killer toxin subunits alpha/beta Proteins 0.000 description 4
- KDXKERNSBIXSRK-YFKPBYRVSA-N L-lysine Chemical compound NCCCC[C@H](N)C(O)=O KDXKERNSBIXSRK-YFKPBYRVSA-N 0.000 description 4
- QIVBCDIJIAJPQS-VIFPVBQESA-N L-tryptophane Chemical compound C1=CC=C2C(C[C@H](N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-VIFPVBQESA-N 0.000 description 4
- 101150110096 RF1 gene Proteins 0.000 description 4
- MTCFGRXMJLQNBG-UHFFFAOYSA-N Serine Natural products OCC(N)C(O)=O MTCFGRXMJLQNBG-UHFFFAOYSA-N 0.000 description 4
- 102100040247 Tumor necrosis factor Human genes 0.000 description 4
- 150000001299 aldehydes Chemical class 0.000 description 4
- PYMYPHUHKUWMLA-WDCZJNDASA-N arabinose Chemical compound OC[C@@H](O)[C@@H](O)[C@H](O)C=O PYMYPHUHKUWMLA-WDCZJNDASA-N 0.000 description 4
- PYMYPHUHKUWMLA-UHFFFAOYSA-N arabinose Natural products OCC(O)C(O)C(O)C=O PYMYPHUHKUWMLA-UHFFFAOYSA-N 0.000 description 4
- 150000001540 azides Chemical class 0.000 description 4
- 230000021615 conjugation Effects 0.000 description 4
- OCCYFTDHSHTFER-UHFFFAOYSA-N dbco-amine Chemical compound NCCC(=O)N1CC2=CC=CC=C2C#CC2=CC=CC=C12 OCCYFTDHSHTFER-UHFFFAOYSA-N 0.000 description 4
- 238000012217 deletion Methods 0.000 description 4
- 230000037430 deletion Effects 0.000 description 4
- ZDXPYRJPNDTMRX-UHFFFAOYSA-N glutamine Natural products OC(=O)C(N)CCC(N)=O ZDXPYRJPNDTMRX-UHFFFAOYSA-N 0.000 description 4
- 235000004554 glutamine Nutrition 0.000 description 4
- 125000000404 glutamine group Chemical class N[C@@H](CCC(N)=O)C(=O)* 0.000 description 4
- 238000001727 in vivo Methods 0.000 description 4
- 230000006698 induction Effects 0.000 description 4
- 125000000468 ketone group Chemical group 0.000 description 4
- 125000005647 linker group Chemical group 0.000 description 4
- 235000018977 lysine Nutrition 0.000 description 4
- BDAGIHXWWSANSR-UHFFFAOYSA-N methanoic acid Natural products OC=O BDAGIHXWWSANSR-UHFFFAOYSA-N 0.000 description 4
- 239000008194 pharmaceutical composition Substances 0.000 description 4
- 230000002829 reductive effect Effects 0.000 description 4
- 125000001424 substituent group Chemical group 0.000 description 4
- 238000012360 testing method Methods 0.000 description 4
- MTCFGRXMJLQNBG-REOHCLBHSA-N (2S)-2-amino-3-hydroxypropanoic acid Chemical compound OC[C@H](N)C(O)=O 0.000 description 3
- YFKPJUHXCHQTCQ-ZETCQYMHSA-N (2S)-6-amino-2-(2-azidoethoxycarbonylamino)hexanoic acid Chemical compound NCCCC[C@H](NC(=O)OCCN=[N+]=[N-])C(O)=O YFKPJUHXCHQTCQ-ZETCQYMHSA-N 0.000 description 3
- WEVYAHXRMPXWCK-UHFFFAOYSA-N Acetonitrile Chemical compound CC#N WEVYAHXRMPXWCK-UHFFFAOYSA-N 0.000 description 3
- DCXYFEDJOCDNAF-UHFFFAOYSA-N Asparagine Natural products OC(=O)C(N)CC(N)=O DCXYFEDJOCDNAF-UHFFFAOYSA-N 0.000 description 3
- 108010077805 Bacterial Proteins Proteins 0.000 description 3
- 239000004471 Glycine Substances 0.000 description 3
- 108700005091 Immunoglobulin Genes Proteins 0.000 description 3
- 108090000172 Interleukin-15 Proteins 0.000 description 3
- 102000003812 Interleukin-15 Human genes 0.000 description 3
- 102100030703 Interleukin-22 Human genes 0.000 description 3
- XUJNEKJLAYXESH-REOHCLBHSA-N L-Cysteine Chemical compound SC[C@H](N)C(O)=O XUJNEKJLAYXESH-REOHCLBHSA-N 0.000 description 3
- ROHFNLRQFUQHCH-YFKPBYRVSA-N L-leucine Chemical compound CC(C)C[C@H](N)C(O)=O ROHFNLRQFUQHCH-YFKPBYRVSA-N 0.000 description 3
- FFEARJCKVFRZRR-BYPYZUCNSA-N L-methionine Chemical compound CSCC[C@H](N)C(O)=O FFEARJCKVFRZRR-BYPYZUCNSA-N 0.000 description 3
- COLNVLDHVKWLRT-QMMMGPOBSA-N L-phenylalanine Chemical compound OC(=O)[C@@H](N)CC1=CC=CC=C1 COLNVLDHVKWLRT-QMMMGPOBSA-N 0.000 description 3
- AYFVYJQAPQTCCC-GBXIJSLDSA-N L-threonine Chemical compound C[C@@H](O)[C@H](N)C(O)=O AYFVYJQAPQTCCC-GBXIJSLDSA-N 0.000 description 3
- OUYCCCASQSFEME-QMMMGPOBSA-N L-tyrosine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-QMMMGPOBSA-N 0.000 description 3
- ROHFNLRQFUQHCH-UHFFFAOYSA-N Leucine Natural products CC(C)CC(N)C(O)=O ROHFNLRQFUQHCH-UHFFFAOYSA-N 0.000 description 3
- KDXKERNSBIXSRK-UHFFFAOYSA-N Lysine Natural products NCCCCC(N)C(O)=O KDXKERNSBIXSRK-UHFFFAOYSA-N 0.000 description 3
- ONIBWKKTOPOVIA-UHFFFAOYSA-N Proline Natural products OC(=O)C1CCCN1 ONIBWKKTOPOVIA-UHFFFAOYSA-N 0.000 description 3
- 238000002835 absorbance Methods 0.000 description 3
- 150000001345 alkine derivatives Chemical group 0.000 description 3
- 150000001408 amides Chemical class 0.000 description 3
- 235000009582 asparagine Nutrition 0.000 description 3
- 229940041514 candida albicans extract Drugs 0.000 description 3
- 125000002915 carbonyl group Chemical group [*:2]C([*:1])=O 0.000 description 3
- 239000002299 complementary DNA Substances 0.000 description 3
- 238000010828 elution Methods 0.000 description 3
- 238000002474 experimental method Methods 0.000 description 3
- 230000002068 genetic effect Effects 0.000 description 3
- 235000013922 glutamic acid Nutrition 0.000 description 3
- 239000004220 glutamic acid Substances 0.000 description 3
- 238000002744 homologous recombination Methods 0.000 description 3
- 230000006801 homologous recombination Effects 0.000 description 3
- 230000001976 improved effect Effects 0.000 description 3
- 239000007788 liquid Substances 0.000 description 3
- 239000002609 medium Substances 0.000 description 3
- 239000002184 metal Substances 0.000 description 3
- 229910052751 metal Inorganic materials 0.000 description 3
- 229930182817 methionine Natural products 0.000 description 3
- 238000010369 molecular cloning Methods 0.000 description 3
- 239000011022 opal Substances 0.000 description 3
- 239000008188 pellet Substances 0.000 description 3
- COLNVLDHVKWLRT-UHFFFAOYSA-N phenylalanine Natural products OC(=O)C(N)CC1=CC=CC=C1 COLNVLDHVKWLRT-UHFFFAOYSA-N 0.000 description 3
- 229920000570 polyether Polymers 0.000 description 3
- 229920001223 polyethylene glycol Polymers 0.000 description 3
- 235000013930 proline Nutrition 0.000 description 3
- 238000001243 protein synthesis Methods 0.000 description 3
- 230000009257 reactivity Effects 0.000 description 3
- 239000011347 resin Substances 0.000 description 3
- 229920005989 resin Polymers 0.000 description 3
- 238000010561 standard procedure Methods 0.000 description 3
- 239000000758 substrate Substances 0.000 description 3
- 235000000346 sugar Nutrition 0.000 description 3
- 230000002103 transcriptional effect Effects 0.000 description 3
- 229960005486 vaccine Drugs 0.000 description 3
- 239000012138 yeast extract Substances 0.000 description 3
- VRYALKFFQXWPIH-PBXRRBTRSA-N (3r,4s,5r)-3,4,5,6-tetrahydroxyhexanal Chemical compound OC[C@@H](O)[C@@H](O)[C@H](O)CC=O VRYALKFFQXWPIH-PBXRRBTRSA-N 0.000 description 2
- NEMHIKRLROONTL-UHFFFAOYSA-N 2-amino-3-(4-azidophenyl)propanoic acid Chemical compound OC(=O)C(N)CC1=CC=C(N=[N+]=[N-])C=C1 NEMHIKRLROONTL-UHFFFAOYSA-N 0.000 description 2
- LSFLSTKLVZEYDR-UHFFFAOYSA-N 2-amino-3-[4-(azidomethyl)pyridin-2-yl]propanoic acid Chemical compound NC(C(=O)O)CC1=NC=CC(=C1)CN=[N+]=[N-] LSFLSTKLVZEYDR-UHFFFAOYSA-N 0.000 description 2
- SJHUEOARVQCKOH-UHFFFAOYSA-N 2-amino-3-[5-(azidomethyl)pyridin-2-yl]propanoic acid Chemical compound OC(=O)C(N)CC1=CC=C(CN=[N+]=[N-])C=N1 SJHUEOARVQCKOH-UHFFFAOYSA-N 0.000 description 2
- PIVPBUHWDALSQO-UHFFFAOYSA-N 2-amino-3-[6-(azidomethyl)pyridin-3-yl]propanoic acid Chemical compound NC(C(=O)O)CC=1C=NC(=CC=1)CN=[N+]=[N-] PIVPBUHWDALSQO-UHFFFAOYSA-N 0.000 description 2
- AUARUCAREKTRCL-UHFFFAOYSA-N 2-amino-5-azidopentanoic acid Chemical compound OC(=O)C(N)CCCN=[N+]=[N-] AUARUCAREKTRCL-UHFFFAOYSA-N 0.000 description 2
- VRYALKFFQXWPIH-HSUXUTPPSA-N 2-deoxy-D-galactose Chemical compound OC[C@@H](O)[C@H](O)[C@H](O)CC=O VRYALKFFQXWPIH-HSUXUTPPSA-N 0.000 description 2
- OUCMTIKCFRCBHK-UHFFFAOYSA-N 3,3-dibenzylcyclooctyne Chemical compound C1CCCCC#CC1(CC=1C=CC=CC=1)CC1=CC=CC=C1 OUCMTIKCFRCBHK-UHFFFAOYSA-N 0.000 description 2
- OSWFIVFLDKOXQC-UHFFFAOYSA-N 4-(3-methoxyphenyl)aniline Chemical compound COC1=CC=CC(C=2C=CC(N)=CC=2)=C1 OSWFIVFLDKOXQC-UHFFFAOYSA-N 0.000 description 2
- ZKHQWZAMYRWXGA-KQYNXXCUSA-J ATP(4-) Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@@H]1O[C@H](COP([O-])(=O)OP([O-])(=O)OP([O-])([O-])=O)[C@@H](O)[C@H]1O ZKHQWZAMYRWXGA-KQYNXXCUSA-J 0.000 description 2
- 241000093740 Acidaminococcus sp. Species 0.000 description 2
- ZKHQWZAMYRWXGA-UHFFFAOYSA-N Adenosine triphosphate Natural products C1=NC=2C(N)=NC=NC=2N1C1OC(COP(O)(=O)OP(O)(=O)OP(O)(O)=O)C(O)C1O ZKHQWZAMYRWXGA-UHFFFAOYSA-N 0.000 description 2
- 239000004475 Arginine Substances 0.000 description 2
- 241000193449 Clostridium tetani Species 0.000 description 2
- SHZGCJCMOBCMKK-UHFFFAOYSA-N D-mannomethylose Natural products CC1OC(O)C(O)C(O)C1O SHZGCJCMOBCMKK-UHFFFAOYSA-N 0.000 description 2
- 102000016607 Diphtheria Toxin Human genes 0.000 description 2
- 108010053187 Diphtheria Toxin Proteins 0.000 description 2
- 102000003951 Erythropoietin Human genes 0.000 description 2
- 108090000394 Erythropoietin Proteins 0.000 description 2
- 101100521102 Escherichia coli (strain K12) priC gene Proteins 0.000 description 2
- 229920002444 Exopolysaccharide Polymers 0.000 description 2
- 108010046276 FLP recombinase Proteins 0.000 description 2
- PNNNRSAQSRJVSB-SLPGGIOYSA-N Fucose Natural products C[C@H](O)[C@@H](O)[C@H](O)[C@H](O)C=O PNNNRSAQSRJVSB-SLPGGIOYSA-N 0.000 description 2
- 101000597577 Gluconacetobacter diazotrophicus (strain ATCC 49037 / DSM 5601 / CCUG 37298 / CIP 103539 / LMG 7603 / PAl5) Outer membrane protein Proteins 0.000 description 2
- 241000282412 Homo Species 0.000 description 2
- 238000006736 Huisgen cycloaddition reaction Methods 0.000 description 2
- 108010065805 Interleukin-12 Proteins 0.000 description 2
- 102000013462 Interleukin-12 Human genes 0.000 description 2
- 108090000176 Interleukin-13 Proteins 0.000 description 2
- 102000049772 Interleukin-16 Human genes 0.000 description 2
- 101800003050 Interleukin-16 Proteins 0.000 description 2
- 102000013691 Interleukin-17 Human genes 0.000 description 2
- 108050003558 Interleukin-17 Proteins 0.000 description 2
- 108010002386 Interleukin-3 Proteins 0.000 description 2
- 108010002616 Interleukin-5 Proteins 0.000 description 2
- 108010002586 Interleukin-7 Proteins 0.000 description 2
- ONIBWKKTOPOVIA-BYPYZUCNSA-N L-Proline Chemical compound OC(=O)[C@@H]1CCCN1 ONIBWKKTOPOVIA-BYPYZUCNSA-N 0.000 description 2
- QNAYBMKLOCPYGJ-REOHCLBHSA-N L-alanine Chemical compound C[C@H](N)C(O)=O QNAYBMKLOCPYGJ-REOHCLBHSA-N 0.000 description 2
- ODKSFYDXXFIFQN-BYPYZUCNSA-P L-argininium(2+) Chemical compound NC(=[NH2+])NCCC[C@H]([NH3+])C(O)=O ODKSFYDXXFIFQN-BYPYZUCNSA-P 0.000 description 2
- CKLJMWTZIZZHCS-REOHCLBHSA-N L-aspartic acid Chemical compound OC(=O)[C@@H](N)CC(O)=O CKLJMWTZIZZHCS-REOHCLBHSA-N 0.000 description 2
- SHZGCJCMOBCMKK-DHVFOXMCSA-N L-fucopyranose Chemical compound C[C@@H]1OC(O)[C@@H](O)[C@H](O)[C@@H]1O SHZGCJCMOBCMKK-DHVFOXMCSA-N 0.000 description 2
- HNDVDQJCIGZPNO-YFKPBYRVSA-N L-histidine Chemical compound OC(=O)[C@@H](N)CC1=CN=CN1 HNDVDQJCIGZPNO-YFKPBYRVSA-N 0.000 description 2
- AGPKZVBTJJNPAG-WHFBIAKZSA-N L-isoleucine Chemical compound CC[C@H](C)[C@H](N)C(O)=O AGPKZVBTJJNPAG-WHFBIAKZSA-N 0.000 description 2
- KZSNJWFQEVHDMF-BYPYZUCNSA-N L-valine Chemical compound CC(C)[C@H](N)C(O)=O KZSNJWFQEVHDMF-BYPYZUCNSA-N 0.000 description 2
- 241000904817 Lachnospiraceae bacterium Species 0.000 description 2
- 239000004472 Lysine Substances 0.000 description 2
- 108010052285 Membrane Proteins Proteins 0.000 description 2
- 241000542065 Moraxella bovoculi Species 0.000 description 2
- 108010021466 Mutant Proteins Proteins 0.000 description 2
- 102000008300 Mutant Proteins Human genes 0.000 description 2
- 102100031789 Myeloid-derived growth factor Human genes 0.000 description 2
- 108091034117 Oligonucleotide Proteins 0.000 description 2
- 101710116435 Outer membrane protein Proteins 0.000 description 2
- 239000002202 Polyethylene glycol Substances 0.000 description 2
- 108010076504 Protein Sorting Signals Proteins 0.000 description 2
- 108010091086 Recombinases Proteins 0.000 description 2
- 102000018120 Recombinases Human genes 0.000 description 2
- 108010034546 Serratia marcescens nuclease Proteins 0.000 description 2
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 2
- 101100166144 Staphylococcus aureus cas9 gene Proteins 0.000 description 2
- 241000193996 Streptococcus pyogenes Species 0.000 description 2
- 238000010459 TALEN Methods 0.000 description 2
- AYFVYJQAPQTCCC-UHFFFAOYSA-N Threonine Natural products CC(O)C(N)C(O)=O AYFVYJQAPQTCCC-UHFFFAOYSA-N 0.000 description 2
- 108091028113 Trans-activating crRNA Proteins 0.000 description 2
- QIVBCDIJIAJPQS-UHFFFAOYSA-N Tryptophan Natural products C1=CC=C2C(CC(N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-UHFFFAOYSA-N 0.000 description 2
- KZSNJWFQEVHDMF-UHFFFAOYSA-N Valine Natural products CC(C)C(N)C(O)=O KZSNJWFQEVHDMF-UHFFFAOYSA-N 0.000 description 2
- 108010017070 Zinc Finger Nucleases Proteins 0.000 description 2
- 238000010958 [3+2] cycloaddition reaction Methods 0.000 description 2
- 125000002777 acetyl group Chemical group [H]C([H])([H])C(*)=O 0.000 description 2
- 230000004913 activation Effects 0.000 description 2
- 235000004279 alanine Nutrition 0.000 description 2
- 150000001294 alanine derivatives Chemical class 0.000 description 2
- 150000001336 alkenes Chemical class 0.000 description 2
- PMMURAAUARKVCB-UHFFFAOYSA-N alpha-D-ara-dHexp Natural products OCC1OC(O)CC(O)C1O PMMURAAUARKVCB-UHFFFAOYSA-N 0.000 description 2
- HSFWRNGVRCDJHI-UHFFFAOYSA-N alpha-acetylene Natural products C#C HSFWRNGVRCDJHI-UHFFFAOYSA-N 0.000 description 2
- ODKSFYDXXFIFQN-UHFFFAOYSA-N arginine Natural products OC(=O)C(N)CCCNC(N)=N ODKSFYDXXFIFQN-UHFFFAOYSA-N 0.000 description 2
- 235000009697 arginine Nutrition 0.000 description 2
- 235000003704 aspartic acid Nutrition 0.000 description 2
- 125000003236 benzoyl group Chemical group [H]C1=C([H])C([H])=C(C([H])=C1[H])C(*)=O 0.000 description 2
- OQFSQFPPLPISGP-UHFFFAOYSA-N beta-carboxyaspartic acid Natural products OC(=O)C(N)C(C(O)=O)C(O)=O OQFSQFPPLPISGP-UHFFFAOYSA-N 0.000 description 2
- 230000003115 biocidal effect Effects 0.000 description 2
- 229960002685 biotin Drugs 0.000 description 2
- 235000020958 biotin Nutrition 0.000 description 2
- 239000011616 biotin Substances 0.000 description 2
- 235000014633 carbohydrates Nutrition 0.000 description 2
- 230000010261 cell growth Effects 0.000 description 2
- 230000005754 cellular signaling Effects 0.000 description 2
- WHTVZRBIWZFKQO-UHFFFAOYSA-N chloroquine Chemical compound ClC1=CC=C2C(NC(C)CCCN(CC)CC)=CC=NC2=C1 WHTVZRBIWZFKQO-UHFFFAOYSA-N 0.000 description 2
- 238000004587 chromatography analysis Methods 0.000 description 2
- 230000002759 chromosomal effect Effects 0.000 description 2
- 230000000295 complement effect Effects 0.000 description 2
- XUJNEKJLAYXESH-UHFFFAOYSA-N cysteine Natural products SCC(N)C(O)=O XUJNEKJLAYXESH-UHFFFAOYSA-N 0.000 description 2
- 235000018417 cysteine Nutrition 0.000 description 2
- 125000000151 cysteine group Chemical group N[C@@H](CS)C(=O)* 0.000 description 2
- 210000000172 cytosol Anatomy 0.000 description 2
- 231100000135 cytotoxicity Toxicity 0.000 description 2
- 230000003013 cytotoxicity Effects 0.000 description 2
- 230000003247 decreasing effect Effects 0.000 description 2
- 230000007812 deficiency Effects 0.000 description 2
- 238000000326 densiometry Methods 0.000 description 2
- 238000010511 deprotection reaction Methods 0.000 description 2
- 125000004989 dicarbonyl group Chemical group 0.000 description 2
- 230000029087 digestion Effects 0.000 description 2
- 239000000539 dimer Substances 0.000 description 2
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 2
- 230000009977 dual effect Effects 0.000 description 2
- 238000001962 electrophoresis Methods 0.000 description 2
- 239000012149 elution buffer Substances 0.000 description 2
- 238000005516 engineering process Methods 0.000 description 2
- 230000002255 enzymatic effect Effects 0.000 description 2
- 229940105423 erythropoietin Drugs 0.000 description 2
- 125000002534 ethynyl group Chemical group [H]C#C* 0.000 description 2
- 235000019253 formic acid Nutrition 0.000 description 2
- 230000013595 glycosylation Effects 0.000 description 2
- 238000006206 glycosylation reaction Methods 0.000 description 2
- 238000003306 harvesting Methods 0.000 description 2
- 125000000623 heterocyclic group Chemical group 0.000 description 2
- 238000004128 high performance liquid chromatography Methods 0.000 description 2
- HNDVDQJCIGZPNO-UHFFFAOYSA-N histidine Natural products OC(=O)C(N)CC1=CN=CN1 HNDVDQJCIGZPNO-UHFFFAOYSA-N 0.000 description 2
- 229930195733 hydrocarbon Natural products 0.000 description 2
- 150000002430 hydrocarbons Chemical class 0.000 description 2
- 230000007062 hydrolysis Effects 0.000 description 2
- 238000006460 hydrolysis reaction Methods 0.000 description 2
- 230000028993 immune response Effects 0.000 description 2
- 239000012535 impurity Substances 0.000 description 2
- 230000002779 inactivation Effects 0.000 description 2
- NOESYZHRGYRDHS-UHFFFAOYSA-N insulin Chemical compound N1C(=O)C(NC(=O)C(CCC(N)=O)NC(=O)C(CCC(O)=O)NC(=O)C(C(C)C)NC(=O)C(NC(=O)CN)C(C)CC)CSSCC(C(NC(CO)C(=O)NC(CC(C)C)C(=O)NC(CC=2C=CC(O)=CC=2)C(=O)NC(CCC(N)=O)C(=O)NC(CC(C)C)C(=O)NC(CCC(O)=O)C(=O)NC(CC(N)=O)C(=O)NC(CC=2C=CC(O)=CC=2)C(=O)NC(CSSCC(NC(=O)C(C(C)C)NC(=O)C(CC(C)C)NC(=O)C(CC=2C=CC(O)=CC=2)NC(=O)C(CC(C)C)NC(=O)C(C)NC(=O)C(CCC(O)=O)NC(=O)C(C(C)C)NC(=O)C(CC(C)C)NC(=O)C(CC=2NC=NC=2)NC(=O)C(CO)NC(=O)CNC2=O)C(=O)NCC(=O)NC(CCC(O)=O)C(=O)NC(CCCNC(N)=N)C(=O)NCC(=O)NC(CC=3C=CC=CC=3)C(=O)NC(CC=3C=CC=CC=3)C(=O)NC(CC=3C=CC(O)=CC=3)C(=O)NC(C(C)O)C(=O)N3C(CCC3)C(=O)NC(CCCCN)C(=O)NC(C)C(O)=O)C(=O)NC(CC(N)=O)C(O)=O)=O)NC(=O)C(C(C)CC)NC(=O)C(CO)NC(=O)C(C(C)O)NC(=O)C1CSSCC2NC(=O)C(CC(C)C)NC(=O)C(NC(=O)C(CCC(N)=O)NC(=O)C(CC(N)=O)NC(=O)C(NC(=O)C(N)CC=1C=CC=CC=1)C(C)C)CC1=CN=CN1 NOESYZHRGYRDHS-UHFFFAOYSA-N 0.000 description 2
- 229940079322 interferon Drugs 0.000 description 2
- 108010074108 interleukin-21 Proteins 0.000 description 2
- 150000002500 ions Chemical class 0.000 description 2
- JEIPFZHSYJVQDO-UHFFFAOYSA-N iron(III) oxide Inorganic materials O=[Fe]O[Fe]=O JEIPFZHSYJVQDO-UHFFFAOYSA-N 0.000 description 2
- AGPKZVBTJJNPAG-UHFFFAOYSA-N isoleucine Natural products CCC(C)C(N)C(O)=O AGPKZVBTJJNPAG-UHFFFAOYSA-N 0.000 description 2
- 229960000310 isoleucine Drugs 0.000 description 2
- 238000002372 labelling Methods 0.000 description 2
- 101150078797 luxS gene Proteins 0.000 description 2
- 125000003588 lysine group Chemical class [H]N([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])C([H])(N([H])[H])C(*)=O 0.000 description 2
- 238000001819 mass spectrum Methods 0.000 description 2
- 239000000463 material Substances 0.000 description 2
- 239000011159 matrix material Substances 0.000 description 2
- 125000001360 methionine group Chemical class N[C@@H](CCSC)C(=O)* 0.000 description 2
- 125000002496 methyl group Chemical group [H]C([H])([H])* 0.000 description 2
- 229910052757 nitrogen Inorganic materials 0.000 description 2
- 238000010899 nucleation Methods 0.000 description 2
- 238000005457 optimization Methods 0.000 description 2
- 125000003544 oxime group Chemical group 0.000 description 2
- 230000036961 partial effect Effects 0.000 description 2
- 101150113529 pgpC gene Proteins 0.000 description 2
- 150000002993 phenylalanine derivatives Chemical class 0.000 description 2
- 150000002994 phenylalanines Chemical class 0.000 description 2
- 230000004481 post-translational protein modification Effects 0.000 description 2
- OXCMYAYHXIHQOA-UHFFFAOYSA-N potassium;[2-butyl-5-chloro-3-[[4-[2-(1,2,4-triaza-3-azanidacyclopenta-1,4-dien-5-yl)phenyl]phenyl]methyl]imidazol-4-yl]methanol Chemical compound [K+].CCCCC1=NC(Cl)=C(CO)N1CC1=CC=C(C=2C(=CC=CC=2)C2=N[N-]N=N2)C=C1 OXCMYAYHXIHQOA-UHFFFAOYSA-N 0.000 description 2
- 238000002360 preparation method Methods 0.000 description 2
- 230000008569 process Effects 0.000 description 2
- 230000000069 prophylactic effect Effects 0.000 description 2
- 238000000751 protein extraction Methods 0.000 description 2
- 238000000746 purification Methods 0.000 description 2
- 108020003175 receptors Proteins 0.000 description 2
- 102000005962 receptors Human genes 0.000 description 2
- 230000009467 reduction Effects 0.000 description 2
- 230000004044 response Effects 0.000 description 2
- 150000003839 salts Chemical class 0.000 description 2
- DUIOPKIIICUYRZ-UHFFFAOYSA-N semicarbazide Chemical compound NNC(N)=O DUIOPKIIICUYRZ-UHFFFAOYSA-N 0.000 description 2
- 150000003354 serine derivatives Chemical class 0.000 description 2
- 150000003384 small molecules Chemical class 0.000 description 2
- 125000006850 spacer group Chemical group 0.000 description 2
- 239000011550 stock solution Substances 0.000 description 2
- 150000003568 thioethers Chemical class 0.000 description 2
- 125000003396 thiol group Chemical group [H]S* 0.000 description 2
- 230000000699 topical effect Effects 0.000 description 2
- 231100000419 toxicity Toxicity 0.000 description 2
- 230000001988 toxicity Effects 0.000 description 2
- 230000009466 transformation Effects 0.000 description 2
- 239000004474 valine Substances 0.000 description 2
- 239000011782 vitamin Substances 0.000 description 2
- 235000013343 vitamin Nutrition 0.000 description 2
- 229940088594 vitamin Drugs 0.000 description 2
- 229930003231 vitamin Natural products 0.000 description 2
- JPZXHKDZASGCLU-LBPRGKRZSA-N β-(2-naphthyl)-alanine Chemical compound C1=CC=CC2=CC(C[C@H](N)C(O)=O)=CC=C21 JPZXHKDZASGCLU-LBPRGKRZSA-N 0.000 description 2
- HCBJYCNPDWLIGG-VIFPVBQESA-N (2S)-2-amino-3-[5-[(6-methyl-1,2,4,5-tetrazin-3-yl)amino]pyridin-3-yl]propanoic acid Chemical compound Cc1nnc(Nc2cncc(C[C@H](N)C(O)=O)c2)nn1 HCBJYCNPDWLIGG-VIFPVBQESA-N 0.000 description 1
- PPDNGMUGVMESGE-JTQLQIEISA-N (2s)-2-amino-3-(4-ethynylphenyl)propanoic acid Chemical compound OC(=O)[C@@H](N)CC1=CC=C(C#C)C=C1 PPDNGMUGVMESGE-JTQLQIEISA-N 0.000 description 1
- WHTVZRBIWZFKQO-AWEZNQCLSA-N (S)-chloroquine Chemical compound ClC1=CC=C2C(N[C@@H](C)CCCN(CC)CC)=CC=NC2=C1 WHTVZRBIWZFKQO-AWEZNQCLSA-N 0.000 description 1
- VGONTNSXDCQUGY-RRKCRQDMSA-N 2'-deoxyinosine Chemical group C1[C@H](O)[C@@H](CO)O[C@H]1N1C(N=CNC2=O)=C2N=C1 VGONTNSXDCQUGY-RRKCRQDMSA-N 0.000 description 1
- GOJUJUVQIVIZAV-UHFFFAOYSA-N 2-amino-4,6-dichloropyrimidine-5-carbaldehyde Chemical group NC1=NC(Cl)=C(C=O)C(Cl)=N1 GOJUJUVQIVIZAV-UHFFFAOYSA-N 0.000 description 1
- 125000003903 2-propenyl group Chemical group [H]C([*])([H])C([H])=C([H])[H] 0.000 description 1
- 125000001494 2-propynyl group Chemical group [H]C#CC([H])([H])* 0.000 description 1
- DVGKRPYUFRZAQW-UHFFFAOYSA-N 3 prime Natural products CC(=O)NC1OC(CC(O)C1C(O)C(O)CO)(OC2C(O)C(CO)OC(OC3C(O)C(O)C(O)OC3CO)C2O)C(=O)O DVGKRPYUFRZAQW-UHFFFAOYSA-N 0.000 description 1
- 108010082808 4-1BB Ligand Proteins 0.000 description 1
- QCVGEOXPDFCNHA-UHFFFAOYSA-N 5,5-dimethyl-2,4-dioxo-1,3-oxazolidine-3-carboxamide Chemical compound CC1(C)OC(=O)N(C(N)=O)C1=O QCVGEOXPDFCNHA-UHFFFAOYSA-N 0.000 description 1
- HBAQYPYDRFILMT-UHFFFAOYSA-N 8-[3-(1-cyclopropylpyrazol-4-yl)-1H-pyrazolo[4,3-d]pyrimidin-5-yl]-3-methyl-3,8-diazabicyclo[3.2.1]octan-2-one Chemical class C1(CC1)N1N=CC(=C1)C1=NNC2=C1N=C(N=C2)N1C2C(N(CC1CC2)C)=O HBAQYPYDRFILMT-UHFFFAOYSA-N 0.000 description 1
- 208000030507 AIDS Diseases 0.000 description 1
- 101150067361 Aars1 gene Proteins 0.000 description 1
- 206010001488 Aggression Diseases 0.000 description 1
- 102000002260 Alkaline Phosphatase Human genes 0.000 description 1
- 108020004774 Alkaline Phosphatase Proteins 0.000 description 1
- 108700028369 Alleles Proteins 0.000 description 1
- 102100021266 Alpha-(1,6)-fucosyltransferase Human genes 0.000 description 1
- 102100026277 Alpha-galactosidase A Human genes 0.000 description 1
- 108091093088 Amplicon Proteins 0.000 description 1
- 108020005098 Anticodon Proteins 0.000 description 1
- OZDNDGXASTWERN-CTNGQTDRSA-N Apovincamine Chemical compound C1=CC=C2C(CCN3CCC4)=C5[C@@H]3[C@]4(CC)C=C(C(=O)OC)N5C2=C1 OZDNDGXASTWERN-CTNGQTDRSA-N 0.000 description 1
- 108090001008 Avidin Proteins 0.000 description 1
- 108010028006 B-Cell Activating Factor Proteins 0.000 description 1
- 241000283690 Bos taurus Species 0.000 description 1
- 102100021935 C-C motif chemokine 26 Human genes 0.000 description 1
- 102100032937 CD40 ligand Human genes 0.000 description 1
- 108010009575 CD55 Antigens Proteins 0.000 description 1
- 102100025221 CD70 antigen Human genes 0.000 description 1
- 108091079001 CRISPR RNA Proteins 0.000 description 1
- 241000282465 Canis Species 0.000 description 1
- 241000283707 Capra Species 0.000 description 1
- OKTJSMMVPCPJKN-UHFFFAOYSA-N Carbon Chemical compound [C] OKTJSMMVPCPJKN-UHFFFAOYSA-N 0.000 description 1
- 239000004215 Carbon black (E152) Substances 0.000 description 1
- 108091092236 Chimeric RNA Proteins 0.000 description 1
- 229920013669 Clearsite Polymers 0.000 description 1
- RYGMFSIKBFXOCR-UHFFFAOYSA-N Copper Chemical compound [Cu] RYGMFSIKBFXOCR-UHFFFAOYSA-N 0.000 description 1
- 108010051219 Cre recombinase Proteins 0.000 description 1
- 239000004971 Cross linker Substances 0.000 description 1
- 150000008574 D-amino acids Chemical class 0.000 description 1
- 230000004568 DNA-binding Effects 0.000 description 1
- 208000020401 Depressive disease Diseases 0.000 description 1
- BWGNESOTFCXPMA-UHFFFAOYSA-N Dihydrogen disulfide Chemical compound SS BWGNESOTFCXPMA-UHFFFAOYSA-N 0.000 description 1
- 108010000912 Egg Proteins Proteins 0.000 description 1
- 102000002322 Egg Proteins Human genes 0.000 description 1
- 108010042407 Endonucleases Proteins 0.000 description 1
- 102000004533 Endonucleases Human genes 0.000 description 1
- 241000283073 Equus caballus Species 0.000 description 1
- 241001125671 Eretmochelys imbricata Species 0.000 description 1
- 241001522878 Escherichia coli B Species 0.000 description 1
- 241001646716 Escherichia coli K-12 Species 0.000 description 1
- 108700039887 Essential Genes Proteins 0.000 description 1
- 241000282324 Felis Species 0.000 description 1
- 241000233866 Fungi Species 0.000 description 1
- 108091006027 G proteins Proteins 0.000 description 1
- 102000030782 GTP binding Human genes 0.000 description 1
- 108091000058 GTP-Binding Proteins 0.000 description 1
- WQZGKKKJIJFFOK-GASJEMHNSA-N Glucose Natural products OC[C@H]1OC(O)[C@H](O)[C@@H](O)[C@@H]1O WQZGKKKJIJFFOK-GASJEMHNSA-N 0.000 description 1
- 108010017080 Granulocyte Colony-Stimulating Factor Proteins 0.000 description 1
- 102000004269 Granulocyte Colony-Stimulating Factor Human genes 0.000 description 1
- 108010017213 Granulocyte-Macrophage Colony-Stimulating Factor Proteins 0.000 description 1
- 102100039620 Granulocyte-macrophage colony-stimulating factor Human genes 0.000 description 1
- 108010051696 Growth Hormone Proteins 0.000 description 1
- 102000018997 Growth Hormone Human genes 0.000 description 1
- 101000819490 Homo sapiens Alpha-(1,6)-fucosyltransferase Proteins 0.000 description 1
- 101000897493 Homo sapiens C-C motif chemokine 26 Proteins 0.000 description 1
- 101000934356 Homo sapiens CD70 antigen Proteins 0.000 description 1
- 101001002470 Homo sapiens Interferon lambda-1 Proteins 0.000 description 1
- 101000853002 Homo sapiens Interleukin-25 Proteins 0.000 description 1
- 101000853000 Homo sapiens Interleukin-26 Proteins 0.000 description 1
- 101000998139 Homo sapiens Interleukin-32 Proteins 0.000 description 1
- 101001128431 Homo sapiens Myeloid-derived growth factor Proteins 0.000 description 1
- 101000684503 Homo sapiens Sentrin-specific protease 3 Proteins 0.000 description 1
- 101100369992 Homo sapiens TNFSF10 gene Proteins 0.000 description 1
- 101000638161 Homo sapiens Tumor necrosis factor ligand superfamily member 6 Proteins 0.000 description 1
- 101000638255 Homo sapiens Tumor necrosis factor ligand superfamily member 8 Proteins 0.000 description 1
- 102000002265 Human Growth Hormone Human genes 0.000 description 1
- 108010000521 Human Growth Hormone Proteins 0.000 description 1
- 239000000854 Human Growth Hormone Substances 0.000 description 1
- AVXURJPOCDRRFD-UHFFFAOYSA-N Hydroxylamine Chemical compound ON AVXURJPOCDRRFD-UHFFFAOYSA-N 0.000 description 1
- 102000004877 Insulin Human genes 0.000 description 1
- 108090001061 Insulin Proteins 0.000 description 1
- 108090000723 Insulin-Like Growth Factor I Proteins 0.000 description 1
- 102000014429 Insulin-like growth factor Human genes 0.000 description 1
- 102000006992 Interferon-alpha Human genes 0.000 description 1
- 108010047761 Interferon-alpha Proteins 0.000 description 1
- 102000003996 Interferon-beta Human genes 0.000 description 1
- 108090000467 Interferon-beta Proteins 0.000 description 1
- 108010074328 Interferon-gamma Proteins 0.000 description 1
- 102000008070 Interferon-gamma Human genes 0.000 description 1
- 102000000589 Interleukin-1 Human genes 0.000 description 1
- 108010002352 Interleukin-1 Proteins 0.000 description 1
- 102000003814 Interleukin-10 Human genes 0.000 description 1
- 108090000174 Interleukin-10 Proteins 0.000 description 1
- 108090000177 Interleukin-11 Proteins 0.000 description 1
- 102100039879 Interleukin-19 Human genes 0.000 description 1
- 108050009288 Interleukin-19 Proteins 0.000 description 1
- 102100036679 Interleukin-26 Human genes 0.000 description 1
- 108010066979 Interleukin-27 Proteins 0.000 description 1
- 101710181613 Interleukin-31 Proteins 0.000 description 1
- 108010067003 Interleukin-33 Proteins 0.000 description 1
- 101710181549 Interleukin-34 Proteins 0.000 description 1
- 108091007973 Interleukin-36 Proteins 0.000 description 1
- 108090000978 Interleukin-4 Proteins 0.000 description 1
- 108090001005 Interleukin-6 Proteins 0.000 description 1
- 108090001007 Interleukin-8 Proteins 0.000 description 1
- 108010002335 Interleukin-9 Proteins 0.000 description 1
- 229930194542 Keto Chemical class 0.000 description 1
- 150000008575 L-amino acids Chemical class 0.000 description 1
- SRBFZHDQGSBBOR-HWQSCIPKSA-N L-arabinopyranose Chemical compound O[C@H]1COC(O)[C@H](O)[C@H]1O SRBFZHDQGSBBOR-HWQSCIPKSA-N 0.000 description 1
- 125000000174 L-prolyl group Chemical group [H]N1C([H])([H])C([H])([H])C([H])([H])[C@@]1([H])C(*)=O 0.000 description 1
- 101150027568 LC gene Proteins 0.000 description 1
- 241000448224 Lachnospiraceae bacterium MA2020 Species 0.000 description 1
- 241000029590 Leptotrichia wadei Species 0.000 description 1
- 241000390917 Listeria newyorkensis Species 0.000 description 1
- 108010046938 Macrophage Colony-Stimulating Factor Proteins 0.000 description 1
- 102000007651 Macrophage Colony-Stimulating Factor Human genes 0.000 description 1
- 102000018697 Membrane Proteins Human genes 0.000 description 1
- 241001465754 Metazoa Species 0.000 description 1
- 241001529936 Murinae Species 0.000 description 1
- 101000597780 Mus musculus Tumor necrosis factor ligand superfamily member 18 Proteins 0.000 description 1
- SSURCGGGQUWIHH-UHFFFAOYSA-N NNON Chemical compound NNON SSURCGGGQUWIHH-UHFFFAOYSA-N 0.000 description 1
- 241000588650 Neisseria meningitidis Species 0.000 description 1
- 206010028980 Neoplasm Diseases 0.000 description 1
- 238000000636 Northern blotting Methods 0.000 description 1
- 101710163270 Nuclease Proteins 0.000 description 1
- 108091005461 Nucleic proteins Proteins 0.000 description 1
- 108010042215 OX40 Ligand Proteins 0.000 description 1
- 108700026244 Open Reading Frames Proteins 0.000 description 1
- 108090000526 Papain Proteins 0.000 description 1
- 108090000284 Pepsin A Proteins 0.000 description 1
- 102000057297 Pepsin A Human genes 0.000 description 1
- 229920002352 Peptidyl-tRNA Polymers 0.000 description 1
- 108010064851 Plant Proteins Proteins 0.000 description 1
- 101000983333 Plasmodium falciparum (isolate NF54) 25 kDa ookinete surface antigen Proteins 0.000 description 1
- 239000004721 Polyphenylene oxide Substances 0.000 description 1
- 241000288906 Primates Species 0.000 description 1
- 239000004365 Protease Substances 0.000 description 1
- 201000004681 Psoriasis Diseases 0.000 description 1
- ASNFTDCKZKHJSW-REOHCLBHSA-N Quisqualic acid Chemical class OC(=O)[C@@H](N)CN1OC(=O)NC1=O ASNFTDCKZKHJSW-REOHCLBHSA-N 0.000 description 1
- 102000014128 RANK Ligand Human genes 0.000 description 1
- 108010025832 RANK Ligand Proteins 0.000 description 1
- 108090000783 Renin Proteins 0.000 description 1
- 102100028255 Renin Human genes 0.000 description 1
- 108700008625 Reporter Genes Proteins 0.000 description 1
- 108091028664 Ribonucleotide Proteins 0.000 description 1
- 102000002278 Ribosomal Proteins Human genes 0.000 description 1
- 108010000605 Ribosomal Proteins Proteins 0.000 description 1
- 240000004808 Saccharomyces cerevisiae Species 0.000 description 1
- 102100023645 Sentrin-specific protease 3 Human genes 0.000 description 1
- 241000194017 Streptococcus Species 0.000 description 1
- 241000193985 Streptococcus agalactiae Species 0.000 description 1
- 241000193998 Streptococcus pneumoniae Species 0.000 description 1
- 108091008874 T cell receptors Proteins 0.000 description 1
- 102000016266 T-Cell Antigen Receptors Human genes 0.000 description 1
- 102000046283 TNF-Related Apoptosis-Inducing Ligand Human genes 0.000 description 1
- 108700012411 TNFSF10 Proteins 0.000 description 1
- 239000004098 Tetracycline Substances 0.000 description 1
- 102000003978 Tissue Plasminogen Activator Human genes 0.000 description 1
- 108090000373 Tissue Plasminogen Activator Proteins 0.000 description 1
- 108010043645 Transcription Activator-Like Effector Nucleases Proteins 0.000 description 1
- 108700019146 Transgenes Proteins 0.000 description 1
- 108010075344 Tryptophan synthase Proteins 0.000 description 1
- 108060008682 Tumor Necrosis Factor Proteins 0.000 description 1
- 102100024584 Tumor necrosis factor ligand superfamily member 12 Human genes 0.000 description 1
- 101710097155 Tumor necrosis factor ligand superfamily member 12 Proteins 0.000 description 1
- 102100036922 Tumor necrosis factor ligand superfamily member 13B Human genes 0.000 description 1
- 102100035283 Tumor necrosis factor ligand superfamily member 18 Human genes 0.000 description 1
- 102100026890 Tumor necrosis factor ligand superfamily member 4 Human genes 0.000 description 1
- 102100031988 Tumor necrosis factor ligand superfamily member 6 Human genes 0.000 description 1
- 102100032100 Tumor necrosis factor ligand superfamily member 8 Human genes 0.000 description 1
- 102100032101 Tumor necrosis factor ligand superfamily member 9 Human genes 0.000 description 1
- 108010067390 Viral Proteins Proteins 0.000 description 1
- 241000700605 Viruses Species 0.000 description 1
- SMEGJBVQLJJKKX-HOTMZDKISA-N [(2R,3S,4S,5R,6R)-5-acetyloxy-3,4,6-trihydroxyoxan-2-yl]methyl acetate Chemical compound CC(=O)OC[C@@H]1[C@H]([C@@H]([C@H]([C@@H](O1)O)OC(=O)C)O)O SMEGJBVQLJJKKX-HOTMZDKISA-N 0.000 description 1
- JLCPHMBAVCMARE-UHFFFAOYSA-N [3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-hydroxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methyl [5-(6-aminopurin-9-yl)-2-(hydroxymethyl)oxolan-3-yl] hydrogen phosphate Polymers 
Cc1cn(C2CC(OP(O)(=O)OCC3OC(CC3OP(O)(=O)OCC3OC(CC3O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c3nc(N)[nH]c4=O)C(COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3CO)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cc(C)c(=O)[nH]c3=O)n3cc(C)c(=O)[nH]c3=O)n3ccc(N)nc3=O)n3cc(C)c(=O)[nH]c3=O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)O2)c(=O)[nH]c1=O JLCPHMBAVCMARE-UHFFFAOYSA-N 0.000 description 1
- 230000002378 acidificating effect Effects 0.000 description 1
- 230000009471 action Effects 0.000 description 1
- UDMBCSSLTHHNCD-KQYNXXCUSA-N adenosine 5'-monophosphate Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@@H]1O[C@H](COP(O)(O)=O)[C@@H](O)[C@H]1O UDMBCSSLTHHNCD-KQYNXXCUSA-N 0.000 description 1
- 238000001042 affinity chromatography Methods 0.000 description 1
- 108010030291 alpha-Galactosidase Proteins 0.000 description 1
- 150000001412 amines Chemical class 0.000 description 1
- 229940093740 amino acid and derivative Drugs 0.000 description 1
- 125000003277 amino group Chemical group 0.000 description 1
- 125000002344 aminooxy group Chemical group [H]N([H])O[*] 0.000 description 1
- 230000000202 analgesic effect Effects 0.000 description 1
- 239000003242 anti bacterial agent Substances 0.000 description 1
- 239000002518 antifoaming agent Substances 0.000 description 1
- 210000000612 antigen-presenting cell Anatomy 0.000 description 1
- 239000003430 antimalarial agent Substances 0.000 description 1
- 229940033495 antimalarials Drugs 0.000 description 1
- 239000002246 antineoplastic agent Substances 0.000 description 1
- 230000006907 apoptotic process Effects 0.000 description 1
- OZDNDGXASTWERN-UHFFFAOYSA-N apovincamine Natural products C1=CC=C2C(CCN3CCC4)=C5C3C4(CC)C=C(C(=O)OC)N5C2=C1 OZDNDGXASTWERN-UHFFFAOYSA-N 0.000 description 1
- 230000036528 appetite Effects 0.000 description 1
- 235000019789 appetite Nutrition 0.000 description 1
- 125000000637 arginyl group Chemical class N[C@@H](CCCNC(N)=N)C(=O)* 0.000 description 1
- 125000003710 aryl alkyl group Chemical group 0.000 description 1
- 150000001510 aspartic acids Chemical class 0.000 description 1
- 125000004429 atom Chemical group 0.000 description 1
- QVGXLLKOCUKJST-UHFFFAOYSA-N atomic oxygen Chemical compound [O] QVGXLLKOCUKJST-UHFFFAOYSA-N 0.000 description 1
- 230000002238 attenuated effect Effects 0.000 description 1
- 230000003416 augmentation Effects 0.000 description 1
- 238000000376 autoradiography Methods 0.000 description 1
- 210000003578 bacterial chromosome Anatomy 0.000 description 1
- 230000003542 behavioural effect Effects 0.000 description 1
- 230000009286 beneficial effect Effects 0.000 description 1
- 108091008324 binding proteins Proteins 0.000 description 1
- 230000033228 biological regulation Effects 0.000 description 1
- 238000001815 biotherapy Methods 0.000 description 1
- 150000001615 biotins Chemical class 0.000 description 1
- 210000000988 bone and bone Anatomy 0.000 description 1
- 210000000481 breast Anatomy 0.000 description 1
- 125000001246 bromo group Chemical group Br* 0.000 description 1
- 238000005251 capillar electrophoresis Methods 0.000 description 1
- 229910052799 carbon Inorganic materials 0.000 description 1
- VQXINLNPICQTLR-UHFFFAOYSA-N carbonyl diazide Chemical compound [N-]=[N+]=NC(=O)N=[N+]=[N-] VQXINLNPICQTLR-UHFFFAOYSA-N 0.000 description 1
- 125000003178 carboxy group Chemical group [H]OC(*)=O 0.000 description 1
- 239000000969 carrier Substances 0.000 description 1
- 230000015556 catabolic process Effects 0.000 description 1
- 238000006555 catalytic reaction Methods 0.000 description 1
- 238000004113 cell culture Methods 0.000 description 1
- 230000030833 cell death Effects 0.000 description 1
- 230000006037 cell lysis Effects 0.000 description 1
- 230000004663 cell proliferation Effects 0.000 description 1
- 230000008859 change Effects 0.000 description 1
- 239000002738 chelating agent Substances 0.000 description 1
- 125000003636 chemical group Chemical group 0.000 description 1
- 239000003638 chemical reducing agent Substances 0.000 description 1
- 229960003677 chloroquine Drugs 0.000 description 1
- 238000013375 chromatographic separation Methods 0.000 description 1
- 210000000349 chromosome Anatomy 0.000 description 1
- 230000002060 circadian Effects 0.000 description 1
- 230000019771 cognition Effects 0.000 description 1
- 208000010877 cognitive disease Diseases 0.000 description 1
- 229910052802 copper Inorganic materials 0.000 description 1
- 239000010949 copper Substances 0.000 description 1
- 239000007819 coupling partner Substances 0.000 description 1
- 238000002425 crystallisation Methods 0.000 description 1
- 230000008025 crystallization Effects 0.000 description 1
- 125000004122 cyclic group Chemical group 0.000 description 1
- 230000009089 cytolysis Effects 0.000 description 1
- 230000001086 cytosolic effect Effects 0.000 description 1
- 239000002254 cytotoxic agent Substances 0.000 description 1
- 229940127089 cytotoxic agent Drugs 0.000 description 1
- 231100000599 cytotoxic agent Toxicity 0.000 description 1
- 230000006378 damage Effects 0.000 description 1
- 230000006324 decarbonylation Effects 0.000 description 1
- 238000006606 decarbonylation reaction Methods 0.000 description 1
- 239000005547 deoxyribonucleotide Substances 0.000 description 1
- 125000002637 deoxyribonucleotide group Chemical group 0.000 description 1
- 230000001419 dependent effect Effects 0.000 description 1
- 235000013367 dietary fats Nutrition 0.000 description 1
- 230000004069 differentiation Effects 0.000 description 1
- 238000010790 dilution Methods 0.000 description 1
- 239000012895 dilution Substances 0.000 description 1
- 201000010099 disease Diseases 0.000 description 1
- 208000035475 disorder Diseases 0.000 description 1
- 238000011143 downstream manufacturing Methods 0.000 description 1
- 239000003814 drug Substances 0.000 description 1
- 229940079593 drug Drugs 0.000 description 1
- 239000012636 effector Substances 0.000 description 1
- 235000014103 egg white Nutrition 0.000 description 1
- 210000000969 egg white Anatomy 0.000 description 1
- 230000008030 elimination Effects 0.000 description 1
- 238000003379 elimination reaction Methods 0.000 description 1
- 210000001671 embryonic stem cell Anatomy 0.000 description 1
- 230000002124 endocrine Effects 0.000 description 1
- 239000002158 endotoxin Substances 0.000 description 1
- 239000003623 enhancer Substances 0.000 description 1
- 230000002708 enhancing effect Effects 0.000 description 1
- 230000035558 fertility Effects 0.000 description 1
- 235000012041 food component Nutrition 0.000 description 1
- 230000008014 freezing Effects 0.000 description 1
- 238000007710 freezing Methods 0.000 description 1
- 230000004927 fusion Effects 0.000 description 1
- 238000001502 gel electrophoresis Methods 0.000 description 1
- 108091006104 gene-regulatory proteins Proteins 0.000 description 1
- 102000034356 gene-regulatory proteins Human genes 0.000 description 1
- 238000007429 general method Methods 0.000 description 1
- 238000010353 genetic engineering Methods 0.000 description 1
- 239000008103 glucose Substances 0.000 description 1
- 229960002743 glutamine Drugs 0.000 description 1
- 150000002308 glutamine derivatives Chemical class 0.000 description 1
- 230000005484 gravity Effects 0.000 description 1
- 239000003102 growth factor Substances 0.000 description 1
- 239000000122 growth hormone Substances 0.000 description 1
- 230000009036 growth inhibition Effects 0.000 description 1
- 230000003394 haemopoietic effect Effects 0.000 description 1
- 230000037308 hair color Effects 0.000 description 1
- 230000036541 health Effects 0.000 description 1
- 125000004475 heteroaralkyl group Chemical group 0.000 description 1
- 125000000487 histidyl group Chemical class [H]N([H])C(C(=O)O*)C([H])([H])C1=C([H])N([H])C([H])=N1 0.000 description 1
- 230000003054 hormonal effect Effects 0.000 description 1
- 239000005556 hormone Substances 0.000 description 1
- 229940088597 hormone Drugs 0.000 description 1
- 230000008348 humoral response Effects 0.000 description 1
- 238000009396 hybridization Methods 0.000 description 1
- 125000000717 hydrazino group Chemical group [H]N([*])N([H])[H] 0.000 description 1
- 150000007857 hydrazones Chemical class 0.000 description 1
- 125000001165 hydrophobic group Chemical group 0.000 description 1
- 125000002887 hydroxy group Chemical group [H]O* 0.000 description 1
- 150000002443 hydroxylamines Chemical class 0.000 description 1
- 230000003463 hyperproliferative effect Effects 0.000 description 1
- 150000007976 iminium ions Chemical class 0.000 description 1
- 230000001900 immune effect Effects 0.000 description 1
- 210000000987 immune system Anatomy 0.000 description 1
- 230000036039 immunity Effects 0.000 description 1
- 230000016784 immunoglobulin production Effects 0.000 description 1
- 230000006054 immunological memory Effects 0.000 description 1
- 238000001114 immunoprecipitation Methods 0.000 description 1
- 230000006872 improvement Effects 0.000 description 1
- 238000000338 in vitro Methods 0.000 description 1
- 238000000099 in vitro assay Methods 0.000 description 1
- 238000005462 in vivo assay Methods 0.000 description 1
- 239000012678 infectious agent Substances 0.000 description 1
- 208000015181 infectious disease Diseases 0.000 description 1
- 230000002401 inhibitory effect Effects 0.000 description 1
- 229910052500 inorganic mineral Inorganic materials 0.000 description 1
- 238000003780 insertion Methods 0.000 description 1
- 230000037431 insertion Effects 0.000 description 1
- 229940125396 insulin Drugs 0.000 description 1
- 230000010354 integration Effects 0.000 description 1
- 229960003130 interferon gamma Drugs 0.000 description 1
- 229960001388 interferon-beta Drugs 0.000 description 1
- 102000004114 interleukin 20 Human genes 0.000 description 1
- 108090000681 interleukin 20 Proteins 0.000 description 1
- 108010074109 interleukin-22 Proteins 0.000 description 1
- 102000003898 interleukin-24 Human genes 0.000 description 1
- 108090000237 interleukin-24 Proteins 0.000 description 1
- 239000000543 intermediate Substances 0.000 description 1
- 230000003834 intracellular effect Effects 0.000 description 1
- 125000002346 iodo group Chemical group I* 0.000 description 1
- YOBAEOGBNPPUQV-UHFFFAOYSA-N iron;trihydrate Chemical compound O.O.O.[Fe].[Fe] YOBAEOGBNPPUQV-UHFFFAOYSA-N 0.000 description 1
- 150000002519 isoleucine derivatives Chemical class 0.000 description 1
- 125000001449 isopropyl group Chemical group [H]C([H])([H])C([H])(*)C([H])([H])[H] 0.000 description 1
- 238000005304 joining Methods 0.000 description 1
- 230000002147 killing effect Effects 0.000 description 1
- 150000002614 leucines Chemical class 0.000 description 1
- 230000000670 limiting effect Effects 0.000 description 1
- 150000002632 lipids Chemical class 0.000 description 1
- 229920006008 lipopolysaccharide Polymers 0.000 description 1
- 150000002668 lysine derivatives Chemical class 0.000 description 1
- 201000004792 malaria Diseases 0.000 description 1
- 230000007246 mechanism Effects 0.000 description 1
- 230000001404 mediated effect Effects 0.000 description 1
- 230000015654 memory Effects 0.000 description 1
- 230000004060 metabolic process Effects 0.000 description 1
- 150000002739 metals Chemical class 0.000 description 1
- 125000000956 methoxy group Chemical group [H]C([H])([H])O* 0.000 description 1
- 239000011707 mineral Substances 0.000 description 1
- 235000010755 mineral Nutrition 0.000 description 1
- LSSWKSPJKJQQHH-UHFFFAOYSA-N n-aminooxyhydroxylamine Chemical compound NONO LSSWKSPJKJQQHH-UHFFFAOYSA-N 0.000 description 1
- SQDFHQJTAWCFIB-UHFFFAOYSA-N n-methylidenehydroxylamine Chemical group ON=C SQDFHQJTAWCFIB-UHFFFAOYSA-N 0.000 description 1
- 125000000449 nitro group Chemical group [O-][N+](*)=O 0.000 description 1
- 231100000252 nontoxic Toxicity 0.000 description 1
- 230000003000 nontoxic effect Effects 0.000 description 1
- 239000012038 nucleophile Substances 0.000 description 1
- 235000003170 nutritional factors Nutrition 0.000 description 1
- 229920001542 oligosaccharide Polymers 0.000 description 1
- 150000002482 oligosaccharides Chemical class 0.000 description 1
- 238000000424 optical density measurement Methods 0.000 description 1
- 210000000056 organ Anatomy 0.000 description 1
- 125000002971 oxazolyl group Chemical group 0.000 description 1
- 229910052760 oxygen Inorganic materials 0.000 description 1
- 239000001301 oxygen Substances 0.000 description 1
- 229940055729 papain Drugs 0.000 description 1
- 235000019834 papain Nutrition 0.000 description 1
- 244000045947 parasite Species 0.000 description 1
- 229940111202 pepsin Drugs 0.000 description 1
- 230000026731 phosphorylation Effects 0.000 description 1
- 238000006366 phosphorylation reaction Methods 0.000 description 1
- 230000004962 physiological condition Effects 0.000 description 1
- 230000019612 pigmentation Effects 0.000 description 1
- 235000021118 plant-derived protein Nutrition 0.000 description 1
- 238000002264 polyacrylamide gel electrophoresis Methods 0.000 description 1
- 230000002028 premature Effects 0.000 description 1
- 101150093386 prfA gene Proteins 0.000 description 1
- 125000002924 primary amino group Chemical group [H]N([H])* 0.000 description 1
- 238000012545 processing Methods 0.000 description 1
- 150000003147 proline derivatives Chemical class 0.000 description 1
- 230000001737 promoting effect Effects 0.000 description 1
- 230000004952 protein activity Effects 0.000 description 1
- 238000003498 protein array Methods 0.000 description 1
- 239000003531 protein hydrolysate Substances 0.000 description 1
- 238000001742 protein purification Methods 0.000 description 1
- 125000003373 pyrazinyl group Chemical group 0.000 description 1
- 125000003226 pyrazolyl group Chemical group 0.000 description 1
- 125000004076 pyridyl group Chemical group 0.000 description 1
- 230000002285 radioactive effect Effects 0.000 description 1
- 230000008707 rearrangement Effects 0.000 description 1
- 238000011084 recovery Methods 0.000 description 1
- 230000001105 regulatory effect Effects 0.000 description 1
- 230000008439 repair process Effects 0.000 description 1
- 238000011160 research Methods 0.000 description 1
- 238000003757 reverse transcription PCR Methods 0.000 description 1
- 230000002441 reversible effect Effects 0.000 description 1
- 230000033764 rhythmic process Effects 0.000 description 1
- 239000002336 ribonucleotide Substances 0.000 description 1
- 125000002652 ribonucleotide group Chemical group 0.000 description 1
- 108020004418 ribosomal RNA Proteins 0.000 description 1
- 238000007363 ring formation reaction Methods 0.000 description 1
- 238000007480 sanger sequencing Methods 0.000 description 1
- 229920006395 saturated elastomer Polymers 0.000 description 1
- 229930195734 saturated hydrocarbon Natural products 0.000 description 1
- 230000003248 secreting effect Effects 0.000 description 1
- 125000001439 semicarbazido group Chemical group [H]N([H])C(=O)N([H])N([H])* 0.000 description 1
- 150000007659 semicarbazones Chemical class 0.000 description 1
- 238000000926 separation method Methods 0.000 description 1
- 238000002864 sequence alignment Methods 0.000 description 1
- 238000012163 sequencing technique Methods 0.000 description 1
- 230000009450 sialylation Effects 0.000 description 1
- 238000002741 site-directed mutagenesis Methods 0.000 description 1
- 239000011780 sodium chloride Substances 0.000 description 1
- 230000009870 specific binding Effects 0.000 description 1
- 238000001228 spectrum Methods 0.000 description 1
- 238000003860 storage Methods 0.000 description 1
- 229940031000 streptococcus pneumoniae Drugs 0.000 description 1
- 125000003107 substituted aryl group Chemical group 0.000 description 1
- 150000008163 sugars Chemical class 0.000 description 1
- 239000013589 supplement Substances 0.000 description 1
- 229920001059 synthetic polymer Polymers 0.000 description 1
- 229960002180 tetracycline Drugs 0.000 description 1
- 229930101283 tetracycline Natural products 0.000 description 1
- 235000019364 tetracycline Nutrition 0.000 description 1
- 150000003522 tetracyclines Chemical class 0.000 description 1
- 150000004905 tetrazines Chemical class 0.000 description 1
- 238000010257 thawing Methods 0.000 description 1
- 125000000335 thiazolyl group Chemical group 0.000 description 1
- 125000001544 thienyl group Chemical group 0.000 description 1
- 150000003588 threonines Chemical class 0.000 description 1
- 210000001519 tissue Anatomy 0.000 description 1
- 229960000187 tissue plasminogen activator Drugs 0.000 description 1
- 238000006257 total synthesis reaction Methods 0.000 description 1
- 231100000331 toxic Toxicity 0.000 description 1
- 230000002588 toxic effect Effects 0.000 description 1
- 238000012546 transfer Methods 0.000 description 1
- 125000001425 triazolyl group Chemical group 0.000 description 1
- 241001515965 unidentified phage Species 0.000 description 1
- 229930195735 unsaturated hydrocarbon Natural products 0.000 description 1
- 150000003679 valine derivatives Chemical class 0.000 description 1
- 230000035899 viability Effects 0.000 description 1
- 230000003612 virological effect Effects 0.000 description 1
- 239000011534 wash buffer Substances 0.000 description 1
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 1
- 229920003169 water-soluble polymer Polymers 0.000 description 1
- 238000001262 western blot Methods 0.000 description 1
- 238000012070 whole genome sequencing analysis Methods 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/70—Vectors or expression systems specially adapted for E. coli
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12P—FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
- C12P21/00—Preparation of peptides or proteins
- C12P21/02—Preparation of peptides or proteins having a known sequence of two or more amino acids, e.g. glutathione
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12R—INDEXING SCHEME ASSOCIATED WITH SUBCLASSES C12C - C12Q, RELATING TO MICROORGANISMS
- C12R2001/00—Microorganisms ; Processes using microorganisms
- C12R2001/01—Bacteria or Actinomycetales ; using bacteria or Actinomycetales
- C12R2001/185—Escherichia
- C12R2001/19—Escherichia coli
Definitions
- Release Factor 1 is a termination complex protein that facilitates translation termination by recognizing the amber codon in an mRNA molecule. RF1 terminates translation in response to the amber codon, i.e., the TAG stop codon.
- RF1 non-natural amino acid
- an RF1-deficient E. coli cell comprising at least one stop codon mutation from TAG to a non-TAG stop codon, a functional release factor 2 (RF2), and an oxidative cytoplasm, wherein the functional RF2 has greater RF2 activity than a control.
- the number of stop codon mutations is no greater than 20, for example, between 2 and 10.
- At least one of the coding sequences selected from the group consisting of hda, lpxK, coaD, lolA, mreC, murF, and hemA comprises a non-TAG stop codon
- the cell has increased RF2 activity or expression as compared to a control E. coli cell.
- 2 to 7 of the coding sequences comprise non-TAG stop codons.
- the cell further comprises a gene encoding a protein of interest selected from an antibody, an antibody fragment, an antibody light chain, an antibody heavy chain, a cytokine, a cytokine fragment, or an immunogenic polypeptide.
- the immunogenic polypeptide is a carrier protein.
- the carrier protein comprises an amino acid sequence having at least 80% sequence identity to SEQ ID NO:33.
- the cytokine is selected from an interleukin, an interferon, a transforming growth factor, or a chemokine.
- the protein of interest comprises one or more non-natural amino acids (NNAAs).
- the one or more NNAAs are selected from the group consisting of p-acetyl-L-phenylalanine, O-methyl-L-tyrosine, an L-3-(2-naphthyl)alanine, 3-methyl-phenylalanine, O-4-allyl-L-tyrosine, 4-propyl-L-tyrosine, a tri-O-acetyl-GlcNAcβ-serine, L-Dopa, fluorinated phenylalanine, isopropyl-L-phenylalanine, p-azido-L-phenylalanine, p-azido-methyl-L-phenylalanine, p-acyl-L-phenylalanine, p-benzoyl-L-phenylalanine, L-phosphoserine, phosphonoserine, phosphonotyrosine, p-iodo-phenylalanine, p-bromoph
- kits comprising the RF1-deficient E. coli cell described above, and the kit further comprises a bacteria growth medium.
- the kit further comprises a plasmid encoding a protein of interest.
- the kit further comprises a plasmid encoding an aminoacyl-tRNA synthetase (RS) specific for pAMF and a tRNA specific for p-azidophenylalanine.
- RS aminoacyl-tRNA synthetase
- a method for expressing a soluble, recombinant protein in an RF1-deficient E. coli bacterial cell comprising the steps of: culturing the RF1-deficient E.
- the RF1-deficient E. coli bacterial cell comprises an oxidative cytoplasm, which allows recombinant proteins, especially those comprising disulfide bonds, to be expressed as soluble proteins.
- a method for expressing a protein of interest comprising culturing the RF1-deficient E. coli bacterial cell disclosed above, wherein the RF1-deficient E. coli bacterial cell comprises an expression cassette comprising a coding sequence for the protein of interest.
- the recombinant protein comprises one or more NNAAs.
- the stop codons are non-TAG stop codons due to genetic modifications of the stop codons in the wild-type coding sequences of hda, lpxK, coaD, lolA, mreC, murF, and hemA
- the coding sequences of hda, lpxK, coaD, lolA, mreC, murF, and hemA polypeptides comprise polynucleotide sequences of SEQ ID NO: 5, 7, 9, 11, 13, 15, and 17, respectively.
- the RF1-deficient E. coli cell contains an oxidative cytoplasm. In some embodiments, the RF1-deficient E.
- the E. coli cell comprises a stop codon mutation in one or more of the sucB, atpE, fabH, and ubiF coding sequences.
- the RF1-deficient E. coli cell further comprises a ΔfabR mutation.
- the protein of interest is selected from an antibody, an antibody fragment, an antibody light chain, an antibody heavy chain, a cytokine, a cytokine fragment, or an immunogenic polypeptide.
- the immunogenic polypeptide is a carrier protein.
- RF1-deficient E. coli cells can produce proteins comprising NNAAs in high yield.
- RF1-deficient E. coli cells comprise coding sequences in which the TAG stop codons have been mutated to non-TAG stop codons.
- the coding sequences encode one or more proteins selected from hda, lpxK, coaD, lolA, mreC, murF, and hemA.
- the E. coli cells have increased RF2 activity or expression as compared to a control E. coli cell.
- the RF2 protein has a T246A mutation with reference to SEQ ID NO: 2, and said mutation confers the increased RF2 activity.
- the E. coli cells are from the K12 strain. [0015] Unlike previous attempts to delete RF1 from the genome, the methods and compositions disclosed in this application advantageously produce RF1-deficient E. coli cells that maintain a fast growth rate but with significantly fewer mutations. The approach repairs RF2 activity by producing an RF2 with the T246A mutation with reference to SEQ ID NO: 2.
- the methods of the present disclosure involve making only 7-12 stop codon mutations in essential genes (mutating TAG to TAA or TGA).
- the methods further comprise knocking in an NNAA aminoacyl-tRNA synthetase and tRNA so the amber suppressor tRNA can release ribosomes stalled at the TAG stop codon and knocking out the fabR gene for enhanced growth of the RF1-deficient E. coli cells.
- a polynucleotide sequence is “operably linked” to another polynucleotide sequence when it is placed into a functional relationship with that other polynucleotide sequence.
- a promoter or enhancer is operably linked to a coding sequence if it regulates the transcription of the sequence; or a ribosome binding site is operably linked to a coding sequence if positioned to facilitate translation.
- operably linked means that the polynucleotide sequences being linked are contiguous and, in the case of a secretory leader, contiguous and in the same open reading frame. Linking is accomplished by ligation at convenient restriction sites. If such sites do not exist, synthetic oligonucleotide adaptors or linkers are used in accordance with conventional practice.
- a non-TAG stop codon refers to a trinucleotide at the 3′ terminus of a coding sequence that is not TAG.
- a non-TAG stop codon may be TAA or TGA.
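The TAG-to-non-TAG recoding described in this list can be sketched in Python. This is a minimal illustration only, assuming each coding sequence is an in-frame DNA string that ends with its stop codon; the function names `stop_codon` and `recode_amber_stop` and the toy sequence are hypothetical and do not come from the disclosure.

```python
# Non-TAG stop codons named in the text: TAA and TGA.
NON_TAG_STOPS = {"TAA", "TGA"}


def stop_codon(cds: str) -> str:
    """Return the final trinucleotide (3' terminus) of an in-frame coding sequence."""
    cds = cds.upper()
    if len(cds) < 3 or len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a positive multiple of 3")
    return cds[-3:]


def recode_amber_stop(cds: str, replacement: str = "TAA") -> str:
    """Replace a terminal TAG with a non-TAG stop codon; leave other sequences unchanged."""
    if replacement not in NON_TAG_STOPS:
        raise ValueError("replacement must be TAA or TGA")
    cds = cds.upper()
    if stop_codon(cds) != "TAG":
        return cds
    return cds[:-3] + replacement


# Toy sequence (not a real gene): ATG AAA CGC TAG
toy = "ATGAAACGCTAG"
recoded = recode_amber_stop(toy)
print(stop_codon(toy), "->", stop_codon(recoded))
```

Only the terminal codon is touched, so the encoded polypeptide is unchanged; editing a real genome would additionally require checking for overlapping genes, which this sketch does not attempt.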
- a control cell refers to a cell from the same genetic background as the cell in which a gene of interest has been genetically modified, except that the control cell comprises the wild-type version of the gene of interest. For example, a control cell for an RF-1 deficient E. coli cell from the K12 strain in which RF2 has been mutagenized is an RF-1 deficient cell derived from the K12 strain in which the RF2 is the wild-type.
- aminoacylation refers to the attachment of an amino acid to a tRNA, a process commonly referred to as charging a tRNA with its correct amino acid.
- Aminoacylation is typically a two-step process catalyzed by the aminoacyl-tRNA synthetases. The first step is the formation of an aminoacyl-AMP (aminoacyl-adenylate) on the enzyme through the hydrolysis of adenosine triphosphate (ATP). The second step is the transfer of the activated amino acid residue from the adenylate to a tRNA.
- aminoacyl-AMP aminoacyl-adenylate
- ATP adenosine triphosphate
- a tRNA that undergoes aminoacylation or has been aminoacylated is one that has been charged with an amino acid, and an amino acid that undergoes aminoacylation or has been aminoacylated has been charged to a tRNA molecule.
- aminoacyl-tRNA synthetase refers to an enzyme that catalyzes the formation of a covalent linkage between an amino acid and a tRNA molecule. This results in an aminoacylated tRNA molecule, which is a tRNA molecule that has its respective amino acid attached via an ester bond.
- tRNA refers to aminoacylation of a tRNA with an amino acid, both natural and non-natural, where the aminoacylation permits a ribosome to incorporate the amino acid into a polypeptide that is being translated from mRNA.
- biologically active adduct refers to a molecule that can perform a function in a cell or an organism.
- the function may include cell proliferation, apoptosis, post-translational modification (e.g., phosphorylation), cell signaling activation, cell signaling inactivation, cell death, cell labeling, etc.
- preferentially aminoacylates refers to the preference of a tRNA synthetase to aminoacylate (charge) a particular tRNA molecule with a predetermined amino acid molecule compared to another amino acid molecule.
- the tRNA synthetase can selectively aminoacylate a non-natural amino acid (NNAA) over a naturally occurring amino acid, for example, the tRNA synthetase can aminoacylate a specific NNAA at a frequency of greater than 90%, e.g., 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100%, compared to any or all other natural amino acids.
- nucleic acid or “polynucleotide” refers to polymers of deoxyribonucleotides (DNA) or ribonucleotides (RNA) in either single- or double-stranded form. Unless specifically limited, the term encompasses nucleic acids containing known analogues of natural nucleic acids that have similar binding properties as the reference nucleic acid and are metabolized in a manner similar to naturally occurring nucleotides. Unless otherwise indicated, a particular nucleic acid sequence also implicitly encompasses conservatively modified variants thereof (e.g., degenerate codon substitutions), alleles, orthologs, SNPs, and complementary sequences as well as the sequence explicitly indicated.
- DNA deoxyribonucleotides
- RNA ribonucleotides
- degenerate codon substitutions may be achieved by generating sequences in which the third position of one or more selected (or all) codons is substituted with mixed-base and/or deoxyinosine residues (Batzer et al., Nucleic Acid Res. 19:5081 (1991); Ohtsuka et al., J. Biol. Chem. 260:2605-2608 (1985); and Rossolini et al., Mol. Cell. Probes 8:91-98 (1994)).
- nucleic acid or polynucleotide is used interchangeably with “gene,” “cDNA,” and “mRNA encoded by a gene.”
- the terms “peptide,” “protein,” and “polypeptide” are used herein interchangeably and refer to a polymer of amino acid residues. The terms apply to amino acid polymers in which one or more amino acid residues are artificial chemical mimetics of corresponding naturally occurring amino acids, as well as to naturally occurring amino acid polymers and non-naturally occurring amino acid polymers. As used herein, the terms encompass amino acid chains of any length, including full-length proteins and truncated proteins, wherein the amino acid residues are linked by covalent peptide bonds.
- sequence identity refers to two or more sequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same, when compared and aligned for maximum correspondence over a comparison window, as measured using a sequence comparison algorithm, e.g., BLASTP. For purposes of this document, the percent identity is determined over the full-length wild-type sequence such as the reference sequence set forth in SEQ ID NO:1.
- the method for calculating the sequence identity as provided herein is the BLASTP program having its defaults set at a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see, e.g., Henikoff and Henikoff, 1992, Proc Natl Acad Sci USA 89:10915).
- Exemplary determination of sequence alignment and % sequence identity can employ the BESTFIT or GAP programs in the GCG Wisconsin Software package (Accelrys, Madison WI), using default parameters provided.
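The percent-identity calculation over the full-length reference sequence can be illustrated with a short Python sketch. It assumes the two sequences have already been aligned by a separate tool (for example BLASTP) and are supplied as equal-length strings with `-` marking gaps; it does not implement BLASTP itself. The function name `percent_identity` and the toy sequences are hypothetical.

```python
def percent_identity(aligned_ref: str, aligned_query: str) -> float:
    """Percent of reference residues matched by the query in a pre-computed alignment.

    The denominator is the number of residues in the reference (gaps excluded),
    i.e. identity is reported over the full-length reference sequence.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    ref_len = sum(1 for r in aligned_ref if r != "-")
    if ref_len == 0:
        raise ValueError("reference contains no residues")
    matches = sum(
        1 for r, q in zip(aligned_ref, aligned_query) if r != "-" and r == q
    )
    return 100.0 * matches / ref_len


# Toy 10-residue reference with one substitution in the query.
print(percent_identity("MKTAYIAKQR", "MKTAYLAKQR"))
```

A threshold such as "at least 80% sequence identity" then reduces to comparing the returned value with 80.0.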
- substitution at amino acid position refers to an amino acid residue at a specific position of the amino acid sequence of a protein being replaced by another, different amino acid.
- the term “X20Y” refers to the substitution of the wild-type (reference) amino acid X at position 20 of the protein with amino acid Y.
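The “X20Y” notation can be made concrete with a small Python sketch that parses the notation and applies it to a protein string. Positions are taken as 1-based, as in the notation; the function name `apply_substitution` and the toy sequence are hypothetical, and the actual reference sequences (such as SEQ ID NO: 2 for the T246A mutation) are not reproduced here.

```python
import re

# One-letter wild-type residue, 1-based position, one-letter replacement residue.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitution(protein: str, notation: str) -> str:
    """Apply a substitution written as 'X20Y' to a protein sequence."""
    match = _SUBSTITUTION.match(notation)
    if match is None:
        raise ValueError(f"not a valid substitution: {notation!r}")
    wild_type, position, replacement = match.group(1), int(match.group(2)), match.group(3)
    if position < 1 or position > len(protein):
        raise ValueError("position lies outside the sequence")
    if protein[position - 1] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, found {protein[position - 1]}"
        )
    return protein[: position - 1] + replacement + protein[position:]


# Toy sequence: threonine at position 3 replaced by alanine.
print(apply_substitution("MKTAYIAKQR", "T3A"))
```

Checking the wild-type residue before substituting guards against applying a mutation to the wrong reference sequence or numbering scheme.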
- the term “functional variant” refers to a molecule (a polypeptide or polynucleotide) that contains mutations as compared to a reference molecule while retaining at least some of the biological activity of the reference molecule.
- the biological activity can be determined by comparing the activity, function and/or structure of the reference molecule expressed by the methods described herein to the activity of a reference molecule. For example, if the reference molecule is an IgG, a functional variant of the reference molecule comprises a properly folded and assembled IgG molecule.
- the biological activity of the reference molecule and its variants can be determined using an in vitro or in vivo assay that is appropriate for the reference molecule.
- the biological activity of a functional variant is at least 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95% or 99% of the activity of a reference protein when assessed using the same or a similar assay.
- antibody refers to a protein functionally defined as a binding protein and structurally defined as comprising an amino acid sequence that is recognized by one of skill as being derived from the framework region of an immunoglobulin-encoding gene of an animal producing antibodies.
- An antibody can consist of one or more polypeptides substantially encoded by immunoglobulin genes or fragments of immunoglobulin genes.
- the recognized immunoglobulin genes include the kappa, lambda, alpha, gamma, delta, epsilon, and mu constant region genes, as well as myriad immunoglobulin variable region genes.
- Light chains are classified as either kappa or lambda.
- Heavy chains are classified as gamma, mu, alpha, delta, or epsilon, which in turn define the immunoglobulin classes, IgG, IgM, IgA, IgD, and IgE, respectively.
- a typical immunoglobulin (antibody) structural unit is known to comprise a tetramer.
- Each tetramer is composed of two identical pairs of polypeptide chains, each pair having one “light” (about 25 kD) and one “heavy” chain (about 50-70 kD).
- the N-terminus of each chain defines a variable region of about 100 to 110 or more amino acids primarily responsible for antigen recognition.
- the terms variable light chain (VL) and variable heavy chain (VH) refer to these light and heavy chains respectively.
- VL variable light chain
- VH variable heavy chain
- Fab fragment is an antibody fragment that contains the portion of the full-length antibody that results from digestion of a full-length immunoglobulin with papain, or a fragment having the same structure that is produced synthetically, e.g., recombinantly.
- a Fab fragment contains a light chain (containing a variable (VL) and constant (CL) region domain) and another chain containing a variable domain of a heavy chain (VH) and one constant region domain portion of the heavy chain (CH1).
- a F(ab')2 fragment is an antibody fragment that results from digestion of an immunoglobulin with pepsin at pH 4.0-4.5, or a synthetically, e.g., recombinantly, produced antibody having the same structure.
- the F(ab')2 fragment contains two Fab fragments but where each heavy chain portion contains additional amino acids, including cysteine residues that form disulfide linkages joining the two fragments.
- naturally occurring amino acid refers to any one of the 20 amino acids encoded by the genetic code, namely arginine, Arg, R; histidine, His, H; lysine, Lys, K; aspartic acid, Asp, D; glutamic acid, Glu, E; serine, Ser, S; threonine, Thr, T; asparagine, Asn, N; glutamine, Gln, Q; cysteine, Cys, C; glycine, Gly, G; proline, Pro, P; alanine, Ala, A; isoleucine, Ile, I; leucine, Leu, L; methionine, Met, M; phenylalanine, Phe, F; tryptophan, Trp, W; tyrosine, Tyr, Y; and valine, Val, V.
- a “null mutation” refers to a mutation in a gene that results in a non-functional gene.
- the null mutation can cause the complete lack of production of associated gene product or the production of a product that lacks the function of the wild type protein.
- oxidative cytoplasm refers to the cytosol of a cell in which a substrate is more likely to become oxidized than reduced.
- the term “recombinant nucleic acid” has its conventional meaning.
- a recombinant nucleic acid, or equivalently, polynucleotide is one that is inserted into a heterologous location such that it is not associated with nucleotide sequences that normally flank the nucleic acid as it is found in nature (for example, a nucleic acid inserted into a vector or a genome of a heterologous organism).
- a nucleic acid sequence that does not appear in nature, for example, a variant of a naturally occurring gene, is recombinant.
- a cell containing a recombinant nucleic acid, or protein expressed in vitro or in vivo from a recombinant nucleic acid are also “recombinant.”
- recombinant nucleic acids include a protein-encoding DNA sequence that is (i) operably linked to a heterologous promoter and/or (ii) encodes a fusion polypeptide with a protein sequence and a heterologous signal peptide sequence.
- carrier protein refers to a non-toxic or detoxified polypeptide containing a T-cell activating epitope which is able to be attached to an antigen (e.g., a polysaccharide) to enhance the humoral response to the conjugated antigen in a subject.
- an antigen e.g., a polysaccharide
- the term includes any of the bacterial proteins used as epitope carriers in FDA-approved vaccines.
- the carrier protein is Corynebacterium diphtheriae toxin, Clostridium tetani tetanospasmin, Haemophilus influenzae protein D (PD, HiD), outer membrane protein complex of serogroup B meningococcus (OMPC), CRM197, or malaria ookinete specific surface protein Pfs25.
- the carrier protein is BB, derived from the G protein of Streptococcus strain G148.
- the term “immunogenic polypeptide” refers to a polypeptide comprising at least one T-cell activating epitope, wherein the T-cell epitope is derived from a protein capable of inducing immunologic memory in animals.
- T-cell activating epitope refers to a structural unit of molecular structure which is capable of inducing T-cell immunity.
- carrier proteins which include T-cell activating epitopes are well known and documented for conjugates.
- a T-cell activating epitope in the carrier protein enables the covalently attached antigen to be processed by antigen-presenting cells and presented to CD4+ T cells to induce immunological memory against the antigen.
- cytokine or “cytokines” refers to the general class of biological molecules which affect cells of the immune system.
- cytokines include, but are not limited to, interferons and interleukins (IL), for example, IL-2, IL-12, IL-15, IL-18, and IL-21.
- IL interferons and interleukins
- aminoacyl-tRNAs because they are adhered to specific amino acids corresponding to each tRNA's anticodon.
- aminoacyl-tRNAs In the standard genetic code, there are three mRNA stop codons: UAG (“amber”), UAA (“ochre”), and UGA (“opal” or “umber”). These codons are not decoded by tRNAs but are recognized by release factors.
- the release factors upon recognizing the stop codons release the newly synthesized peptides from the ribosome, thus terminating the translation.
- two types of release factors recognize stop codons: Release factor 1 (RF-1) and Release factor 2 (RF-2).
- the E. coli RF1 (SEQ ID NO: 1) and RF2 (SEQ ID NO: 2) are structurally similar and have related but distinct functions. RF1 and RF2 share a universally conserved GGQ motif that interacts with the peptidyl transferase center of the ribosome to promote catalysis (Youngman et al., Annu. Rev. Microbiol. 62, 353–373 (2008)).
- RF1 and RF2 promote termination through induced-fit mechanisms.
- One main difference between the two release factors is that RF1 recognizes the UAA and UAG stop codons, while RF2 recognizes UAA and UGA stop codons.
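The codon specificity of the two release factors, and the consequence of removing RF1, can be summarized in a short Python sketch. The mapping restates only what the text says (RF1: UAA and UAG; RF2: UAA and UGA); the names `RELEASE_FACTOR_CODONS` and `factors_recognizing` are hypothetical.

```python
# Stop codons recognized by each E. coli release factor, as stated in the text.
RELEASE_FACTOR_CODONS = {
    "RF1": {"UAA", "UAG"},
    "RF2": {"UAA", "UGA"},
}


def factors_recognizing(codon: str, available=("RF1", "RF2")) -> list:
    """List the available release factors that recognize a stop codon.

    DNA spelling (TAG) is accepted and converted to RNA spelling (UAG).
    """
    codon = codon.upper().replace("T", "U")
    return sorted(f for f in available if codon in RELEASE_FACTOR_CODONS[f])


# Compare a wild-type cell with an RF1-deficient cell (only RF2 available).
for codon in ("UAA", "UAG", "UGA"):
    print(codon, factors_recognizing(codon), factors_recognizing(codon, available=("RF2",)))
```

In the RF1-deficient case the list for UAG is empty, which is why TAG stop codons in essential genes are recoded to TAA or TGA and why the amber codon becomes free for NNAA incorporation.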
- stop codon refers to a trinucleotide in a DNA or mRNA sequence that signals a halt to protein synthesis in the cell.
- TAG DNA trinucleotide
- UAG RNA trinucleotide
- E. coli cells [0045]
- the E. coli strain in this disclosure can be any E. coli strain known to one of skill in the art. In some embodiments, the E. coli strain is an A (K-12), B, C or D strain.
- RF1-deficient E. coli cells [0046] The E. coli cells disclosed herein are RF1-deficient.
- RF1-deficient E. coli cells are generated by knocking out or introducing mutations to the wild type RF1 gene from the genome of the E. coli strain using methods that are well known in the art and as further described below; see the section below entitled “Methods of introducing mutations to E. coli.”
- an RF-1 deficient E. coli cell may possess 50% or less, 40% or less, 30% or less, 20% or less, 15% or less, 10% or less, 5% or less, or 0% of the RF-1 activity (catalyzing translational termination from a ribosomal complex stalled at the amber codon) of the control E. coli cell.
- E. coli cells (e.g., RF1-deficient E. coli cells) having an oxidative cytoplasm can be selected based on their ability to support production of a polypeptide having one or more disulfide bonds.
- E. coli cells having an oxidative cytoplasm can facilitate the formation of disulfide bonds that are required for the proper folding and functioning of those polypeptides. Accordingly, in some embodiments, selecting the E. coli having an oxidative cytoplasm can be conducted by transforming the bacteria with a gene encoding a polypeptide (a “test” polypeptide, for example, an antibody light chain) normally containing at least one disulfide bond.
- a coding sequence for an LC protein, e.g., the anti-MUC1 antibody light chain described above (SEQ ID NO: 15 as disclosed in WO 2020/097385), can be engineered into an expression cassette under a suitable promoter and transformed into the candidate E. coli cells.
- the soluble protein fraction that contains the LC is measured.
- a suitable E. coli strain can be selected if it is able to express the LC in a soluble form of at least 1 mg/100 mL. Methods for preparing a bacterial lysate and measuring the amount of protein expression (e.g., the expression of LC) in the lysate are well known. In some embodiments, the E. coli cells can be treated with a lysis agent to produce a lysate.
- Cytoplasmic proteins can be released by treating the lysate with enzymes, such as benzonase and egg white lysozyme.
- the insoluble protein fraction can be separated from the soluble fraction by e.g., centrifugation.
- the soluble protein fraction (containing the LC) can be collected and analyzed by SDS-PAGE.
- the amount of LC protein in the soluble protein fraction can then be quantified by, e.g., densitometry.
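- The selection criterion above (soluble LC of at least 1 mg/100 mL) reduces to a unit conversion and a comparison. The sketch below assumes the soluble LC mass has already been quantified (e.g., by densitometry) and that the culture volume is known; the function names and the example numbers are hypothetical.

```python
def soluble_lc_mg_per_100_ml(lc_mass_mg, culture_volume_ml):
    """Normalize a measured soluble LC mass (mg) to mg per 100 mL of culture."""
    if culture_volume_ml <= 0:
        raise ValueError("culture volume must be positive")
    return lc_mass_mg * 100.0 / culture_volume_ml


def meets_selection_threshold(lc_mass_mg, culture_volume_ml, threshold=1.0):
    """True if soluble LC expression is at least `threshold` mg/100 mL.

    The default of 1.0 mg/100 mL is the value given in the text.
    """
    return soluble_lc_mg_per_100_ml(lc_mass_mg, culture_volume_ml) >= threshold


# Hypothetical measurement: 5 mg of soluble LC recovered from a 250 mL culture.
print(soluble_lc_mg_per_100_ml(5.0, 250.0))
print(meets_selection_threshold(5.0, 250.0))
```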
- an RF1-deficient E. coli cell having an oxidative cytoplasm comprises at least one stop codon mutation: TAG to non-TAG stop codon.
- the stop codon mutation is TAG to TAA.
- RF1-deficient E. coli cells comprising one or more stop codon mutations are grown to produce a recombinant protein comprising an NNAA, as described herein.
- at least one stop codon mutation is introduced to a coding sequence using TAG as the stop codon.
- Nonlimiting examples of such coding sequences include those that encode polypeptides hda, lpxK, coaD, lolA, mreC, murF, hemA, sucB, atpE, fabH, ubiF, or any combination thereof.
- at least one stop codon mutation is introduced to one or more, or all, of the coding sequences encoding the polypeptides hda, lpxK, coaD, lolA, mreC, murF, and/or hemA.
- At least one, at least two, at least three, at least four, at least five, at least six, or all of the coding sequences hda, lpxK, coaD, lolA, mreC, murF, hemA comprise stop codon mutations and the coding sequences comprise TAA or TGA as stop codons instead of TAG.
- the number of stop codon mutations in the coding sequences hda, lpxK, coaD, lolA, mreC, murF, and hemA is in a range from 1 to 7, from 2 to 6, from 3 to 7, from 4 to 7, from 5 to 7, or from 6 to 7.
- all TAG codons in the coding sequences encoding polypeptides hda, lpxK, coaD, lolA, mreC, murF, and hemA have been mutated to TAA or TGA. In some embodiments, all TAG codons in the coding sequences encoding polypeptides hda, lpxK, coaD, lolA, mreC, murF, and hemA have been mutated to TAA. In some embodiments, the number of the coding sequences of hda, lpxK, coaD, lolA, mreC, murF, and hemA that comprise non-TAG stop codons is 2, 3, 4, 5, 6, or 7.
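- At the sequence level, the TAG to TAA (or TGA) stop codon mutation described above is a replacement of the terminal codon of a coding sequence. The following is a minimal sketch on a toy sequence; the function name and the example sequence are hypothetical and are not any of the sequences of the disclosure.

```python
def recode_terminal_stop(cds, replacement="TAA"):
    """Replace a terminal TAG stop codon with a non-TAG stop codon.

    `cds` is a DNA coding sequence read in frame from the start codon.
    Sequences that already end in a non-TAG codon are returned unchanged.
    """
    cds = cds.upper()
    if len(cds) < 3 or len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a positive multiple of 3")
    if replacement not in {"TAA", "TGA"}:
        raise ValueError("replacement must be a non-TAG stop codon (TAA or TGA)")
    if cds[-3:] == "TAG":
        return cds[:-3] + replacement
    return cds


# Toy coding sequence (Met-Lys-stop), for illustration only.
print(recode_terminal_stop("ATGAAATAG"))
```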
- an RF1-deficient E. coli cell comprises at least three, at least five, at least 10, at least 20, at least 30, at least 50, or at least 100 contiguous nucleotides starting from the 3-prime terminus of the coding sequence of one or more stop codon variants.
- Said one or more stop codon variants include the hda stop codon variant (SEQ ID NO: 5), the lpxK stop codon variant (SEQ ID NO: 7), the coaD stop codon variant (SEQ ID NO: 9), the lolA stop codon variant (SEQ ID NO: 11), the mreC stop codon variant (SEQ ID NO: 13), the murF stop codon variant (SEQ ID NO: 15), the hemA stop codon variant (SEQ ID NO: 17), the sucB stop codon variant (SEQ ID NO: 19), the atpE stop codon variant (SEQ ID NO: 21), the fabH stop codon variant (SEQ ID NO: 23), and/or the ubiF stop codon variant (SEQ ID NO: 25).
- the one or more stop codon variants encode one or more of the following polypeptides: hda (SEQ ID NO: 34), lpxK (SEQ ID NO: 35), coaD (SEQ ID NO: 36), lolA (SEQ ID NO: 37), mreC (SEQ ID NO: 38), murF (SEQ ID NO: 39), hemA (SEQ ID NO: 40), sucB (SEQ ID NO: 41), atpE (SEQ ID NO: 42), and/or fabH (SEQ ID NO: 43).
- the one or more stop codon variants encode one or more polypeptides that are functional variants of the above polypeptides.
- Such functional variants may have at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, or at least 99% amino acid sequence identity to one of hda (SEQ ID NO: 34), lpxK (SEQ ID NO: 35), coaD (SEQ ID NO: 36), lolA (SEQ ID NO: 37), mreC (SEQ ID NO: 38), murF (SEQ ID NO: 39), hemA (SEQ ID NO: 40), sucB (SEQ ID NO: 41), atpE (SEQ ID NO: 42), and fabH (SEQ ID NO: 43).
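- The disclosure does not specify here how percent sequence identity is computed. As one common convention, offered only as an assumption, identity can be taken as the number of identical positions divided by the number of aligned columns. The sketch below expects two sequences that have already been aligned (equal length, gaps written as "-"); the function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two pre-aligned sequences of equal length.

    Gaps are written as '-'. Columns where both sequences have a gap are
    ignored; a residue opposite a gap counts as a mismatch. Other denominators
    (e.g., the length of the shorter sequence) are also used in practice.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


# Toy 10-residue peptides differing at one position.
print(percent_identity("MKTAYIAKQR", "MKTAYIAKQK"))
```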
- Mutations in RF2 [0051]
- an RF1-deficient E. coli cell is derived from the K12 strain and has increased RF2 activity as compared to a control cell.
- an RF1-deficient E. coli cell disclosed herein comprises a T246X mutation as compared to the wild-type RF2 (SEQ ID NO: 2), wherein X represents any naturally occurring amino acid (e.g., any one of G, A, V, L, and I) or modified amino acid.
- the T246X mutation is T246A.
- the E. coli cell comprises an RF2 variant having the sequence of SEQ ID NO: 3.
- the RF1-deficient E. coli cell is from a B10 strain; an E.
- the RF1-deficient E. coli cell further comprises a fabR null mutation (ΔfabR mutation) as described in Mukai et al., Sci. Rep. 2015; 5: 9699, doi: 10.1038/srep09699.
- the mutation improves E. coli cell growth and increases the production efficiency of a protein of interest.
- the fabR null mutation was introduced using homologous recombination with lambda red recombinase.
- the fabR gene was replaced with a linear piece of DNA that contained a selection marker flanked with arms homologous to the 5’ and 3’ regions of the fabR gene. See, Datsenko, K. A., & Wanner, B. L. (2000). One-step inactivation of chromosomal genes in Escherichia coli K-12 using PCR products. Proceedings of the National Academy of Sciences of the United States of America, 97(12), 6640–6645. doi.org/10.1073/pnas.120163297.
- Non-natural amino acids [0053]
- the RF1-deficient E. coli cells can be used as host cells to produce non-natural amino acid (NNAA)-containing recombinant proteins.
- Suitable non-natural amino acids that can be incorporated in the antibodies include, for example, those disclosed in U.S. Pat. No. 10,179,909; U.S. Pat. No. 9,938,516; U.S. Pat. No. 9,682,934; U.S. Pat. No. 10,596,270; and U.S. Pat. No. 10,610,571, the entire contents of which are herein incorporated by reference.
- the non-natural amino acid may comprise a reactive group useful for forming a covalent bond to a linker or a biologically active adduct (aka a payload), as described below.
- the reactive group is selected from the group consisting of amino, carboxy, acetyl, hydrazino, hydrazido, semicarbazido, sulfanyl, azido and alkynyl.
- the non-natural amino acids may be L-amino acids, or D-amino acids, or racemic amino acids.
- the non-natural amino acids described herein include D-versions of the natural amino acids and racemic versions of the natural amino acids.
- the non-natural amino acid is according to any of the following formulas: [0056] In the above formulas, the wavy lines indicate bonds that connect to the remainder of the polypeptide chains of the antibodies.
- non-natural amino acids can be incorporated into polypeptide chains just as natural amino acids are incorporated into the same polypeptide chains.
- the non-natural amino acids are incorporated into the polypeptide chain via amide bonds as indicated in the formulas.
- R designates any functional group without limitation, so long as the amino acid residue is not identical to a natural amino acid residue.
- R can be a hydrophobic group, a hydrophilic group, a polar group, an acidic group, a basic group, a chelating group, a reactive group, a therapeutic moiety or a labeling moiety.
- each L represents a linker (e.g., a divalent linker), as further described below.
- the non-naturally encoded amino acids include side chain functional groups that react efficiently and selectively with functional groups not found in the 20 common amino acids (including but not limited to, azido, ketone, aldehyde and aminooxy groups) to form stable conjugates.
- an antigen-binding polypeptide that includes a non-naturally encoded amino acid containing an azido functional group can be reacted with a polymer (including but not limited to, poly(ethylene glycol)) or, alternatively, a second polypeptide containing an alkyne moiety to form a stable conjugate resulting from the selective reaction of the azide and the alkyne functional groups to form a Huisgen [3+2] cycloaddition product.
- a strong nucleophile (including, but not limited to, hydrazine, hydrazide, aminooxy, hydroxylamine, or semicarbazide) can be reacted with an aldehyde or ketone group present in a non-naturally encoded amino acid to form a hydrazone, oxime, or semicarbazone, as applicable, which in some cases can be further reduced by treatment with an appropriate reducing agent.
- non-naturally encoded amino acids that may be suitable for use in the present invention and that are useful for reactions with water soluble polymers include, but are not limited to, those with carbonyl, aminooxy, hydrazine, hydrazide, semicarbazide, azide and alkyne reactive groups.
- non-naturally encoded amino acids comprise a saccharide moiety.
- examples of such amino acids include N-acetyl-L-glucosaminyl-L-serine, N-acetyl-L-galactosaminyl-L-serine, N-acetyl-L-glucosaminyl-L-threonine, N-acetyl-L-glucosaminyl-L-asparagine and O-mannosaminyl-L-serine.
- amino acids also include examples where the naturally-occurring N- or O-linkage between the amino acid and the saccharide is replaced by a covalent linkage not commonly found in nature, including but not limited to, an alkene, an oxime, a thioether, an amide and the like.
- amino acids also include amino acids linked to saccharides that are not commonly found in naturally-occurring proteins, such as 2-deoxy-glucose, 2-deoxygalactose and the like.
- Many of the non-natural amino acids suitable for use in the present invention are commercially available, e.g., from Sigma (USA) or Aldrich (Milwaukee, Wis., USA).
- Tyrosine analogs include, but are not limited to, para-substituted tyrosines, ortho-substituted tyrosines, and meta-substituted tyrosines, where the substituted tyrosine comprises, including but not limited to, a keto group (including but not limited to, an acetyl group), a benzoyl group, an amino group, a hydrazine, a hydroxyamine, a thiol group, a carboxy group, an isopropyl group, a methyl group, a C6-C20 straight chain or branched hydrocarbon, a saturated or unsaturated hydrocarbon, an O-methyl group, a polyether group, a nitro group, an alkynyl group or the like.
- Glutamine analogs that may be suitable for use in the present invention include, but are not limited to, α-hydroxy derivatives, γ-substituted derivatives, cyclic derivatives, and amide substituted glutamine derivatives.
- Example phenylalanine analogs that may be suitable for use in the present invention include, but are not limited to, para-substituted phenylalanines, ortho-substituted phenylalanines, and meta-substituted phenylalanines, where the substituent comprises, including but not limited to, a hydroxy group, a methoxy group, a methyl group, an allyl group, an aldehyde, an azido, an iodo, a bromo, a keto group (including but not limited to, an acetyl group), a benzoyl, an alkynyl group, or the like.
- non-natural amino acids include, but are not limited to, an azidoethoxycarbonyl lysine (AEK), a p-acetyl-L-phenylalanine, an O-methyl-L-tyrosine, an L-3-(2-naphthyl)alanine, a 3-methyl-phenylalanine, an O-4-allyl-L-tyrosine, a 4-propyl-L-tyrosine, a tri-O-acetyl-GlcNAcβ-serine, an L-Dopa, a fluorinated phenylalanine, an isopropyl-L-phenylalanine, a p-azido-L-phenylalanine, a p-azido-methyl-L-phenylalanine, a p-acyl-L-phenylalanine, a p-benzoyl-
- WO 2002/085923 entitled “In vivo incorporation of non-natural amino acids.” See also Kiick et al., (2002) Incorporation of azides into recombinant proteins for chemoselective modification by the Staudinger ligation, PNAS 99:19-24, for additional methionine analogs.
- non-natural amino acids include, but are not limited to, p-acetyl-L-phenylalanine, O-methyl-L-tyrosine, L-3-(2-naphthyl)alanine, 3-methyl-phenylalanine, O-4-allyl-L-tyrosine, 4-propyl-L-tyrosine, tri-O-acetyl-GlcNAcβ-serine, L-Dopa, fluorinated phenylalanine, isopropyl-L-phenylalanine, p-azido-methyl-L-phenylalanine, p-azido-L-phenylalanine, p-acyl-L-phenylalanine, p-benzoyl-L-phenylalanine, L-phosphoserine, phosphonoserine, phosphonotyrosine, p-iodo-phenylalanine, p-
- N-acetyl-L-glucosaminyl-L-serine, N-acetyl-L-galactosaminyl-L-serine, N-acetyl-L-glucosaminyl-L-threonine, N-acetyl-L-glucosaminyl-L-asparagine and O-mannosaminyl-L-serine.
- the non-natural amino acids are selected from p-acetyl-phenylalanine, p-ethynyl-phenylalanine, p-propargyloxyphenylalanine, p-azido-methyl-phenylalanine, and p-azido-phenylalanine.
- the non-natural amino acid is p-azido-phenylalanine.
- the first reactive group is an alkynyl moiety (including but not limited to, in the non-natural amino acid p-propargyloxyphenylalanine, where the propargyl group is also sometimes referred to as an acetylene moiety) and the second reactive group is an azido moiety, and [3+2] cycloaddition chemistry can be used.
- the first reactive group is the azido moiety (including but not limited to, in the non-natural amino acid p-azido-L-phenylalanine) and the second reactive group is the alkynyl moiety.
- the non-natural amino acids used in the methods and compositions described herein have at least one of the following four properties: (1) at least one functional group on the sidechain of the non-natural amino acid has at least one characteristic and/or activity and/or reactivity orthogonal to the chemical reactivity of the 20 common, genetically-encoded amino acids (i.e., alanine, arginine, asparagine, aspartic acid, cysteine, glutamine, glutamic acid, glycine, histidine, isoleucine, leucine, lysine, methionine, phenylalanine, proline, serine, threonine, tryptophan, tyrosine, and valine), or at least orthogonal to the chemical reactivity of the naturally occurring amino acids present in the polypeptide that includes the non-natural amino acid; (2) the introduced non-natural amino acids are substantially chemically inert toward the 20 common, genetically-encoded amino acids; (3) the non-natural amino acid can be
- Non-natural amino acids may also include protected or masked oximes or protected or masked groups that can be transformed into an oxime group after deprotection of the protected group or unmasking of the masked group.
- Non-natural amino acids may also include protected or masked carbonyl or dicarbonyl groups, which can be transformed into a carbonyl or dicarbonyl group after deprotection of the protected group or unmasking of the masked group and thereby are available to react with hydroxylamines or oximes to form oxime groups.
- non-natural amino acids that may be used in the methods and compositions described herein include, but are not limited to, amino acids comprising a photoactivatable cross-linker, spin-labeled amino acids, fluorescent amino acids, metal binding amino acids, metal-containing amino acids, radioactive amino acids, amino acids with novel functional groups, amino acids that covalently or non-covalently interact with other molecules, photocaged and/or photoisomerizable amino acids, amino acids comprising biotin or a biotin analogue, glycosylated amino acids such as a sugar substituted serine, other carbohydrate modified amino acids, keto-containing amino acids, aldehyde-containing amino acids, amino acids comprising polyethylene glycol or other polyethers, heavy atom substituted amino acids, chemically cleavable and/or photocleavable amino acids, amino acids with elongated side chains as compared to natural amino acids, including but not limited to, polyethers or long chain hydrocarbons, including but not limited to, greater
- non-natural amino acids comprise a saccharide moiety.
- amino acids include N-acetyl-L-glucosaminyl-L-serine, N-acetyl-L-galactosaminyl-L-serine, N-acetyl-L-glucosaminyl-L-threonine, N-acetyl-L-glucosaminyl-L-asparagine and O-mannosaminyl-L-serine.
- amino acids also include examples where the naturally-occurring N- or O-linkage between the amino acid and the saccharide is replaced by a covalent linkage not commonly found in nature, including but not limited to, an alkene, an oxime, a thioether, an amide and the like.
- amino acids also include saccharides that are not commonly found in naturally-occurring proteins such as 2-deoxy-glucose, 2-deoxygalactose and the like.
- the non-natural amino acid is one selected from the group of non-natural amino acids shown in FIG. 8A-8D of WO 2021/222719, the entire disclosure of said international application publication is herein incorporated by reference.
- non-natural amino acids may be in the form of a salt or may be incorporated into a non-natural amino acid polypeptide, polymer, polysaccharide, or a polynucleotide and optionally post-translationally modified.
- the non-natural amino acid is para-azidomethyl-L-phenylalanine (pAMF), Azidoethoxycarbonyl lysine (AEK), or p-acetyl-L-phenylalanine (pAcF).
- the non-natural amino acid is (S)-2-amino-3-(5-(6-methyl-1,2,4,5-tetrazin-3-ylamino)pyridin-3-yl)propanoic acid.
- Incorporation of non-natural amino acids [0069] Methods for incorporating non-natural amino acids into a protein of interest for production in the RF1-deficient E. coli cells described herein are well known, e.g., as described in U.S. Pat. No. 9,988,619 and U.S. Pat. No. 9,938,516, the entire contents of which are herein incorporated by reference. [0070] In one approach, the coding sequence of the protein of interest is modified to contain at least one non-natural amino acid codon.
- the non-natural amino acid codon is one that does not result in the incorporation of any of the 20 natural amino acids.
- the non-natural amino acid codon is an amber, opal, or ochre stop codon, which is repurposed to charge a non-natural amino acid to its cognate tRNA by a tRNA synthetase instead of terminating translation.
- one or more codons encoding one or more natural amino acids at desired NNAA incorporation sites are mutated to one or more TAG codons, which are then repurposed for incorporation of one or more NNAAs.
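- The mutation of a sense codon to TAG at a desired incorporation site, as described above, can be sketched as an in-frame codon replacement. This is an illustration under stated assumptions: the coding sequence is read in frame from its first nucleotide, its last codon is the terminal stop, and residues are numbered from 1. The function name and the toy sequence are hypothetical.

```python
def introduce_amber_codon(cds, residue_number):
    """Replace the codon at a 1-based residue position with TAG.

    The last codon of `cds` is assumed to be the terminal stop codon and
    cannot be selected as an incorporation site.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    n_codons = len(cds) // 3
    if not 1 <= residue_number < n_codons:
        raise ValueError("residue_number must address a sense codon")
    start = 3 * (residue_number - 1)
    return cds[:start] + "TAG" + cds[start + 3:]


# Toy sequence Met-Lys-Gly-stop; the Lys codon (residue 2) becomes TAG.
print(introduce_amber_codon("ATGAAAGGGTAA", 2))
```

- In a real design the terminal stop codon would be a non-TAG codon (TAA or TGA), so that the introduced TAG is the only amber codon in the coding sequence.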
- a non-natural amino acid can be charged to a tRNA by a tRNA synthetase, which preferentially aminoacylates the tRNA with the non-natural amino acid as compared to any of the 20 natural amino acids.
- tRNA synthetases having such function are known; for example, U.S. Pat. No. 9,938,516 discloses tRNA synthetases that selectively incorporate the non-natural amino acid para-azidomethyl-L-phenylalanine (pAMF).
- tRNA synthetases that can selectively incorporate other non-natural amino acids, for example, Azidoethoxycarbonyl lysine (AEK) or p-acetyl-L-phenylalanine (pAcF), are also well known; see, for example, Chen et al., Angew Chem Int Ed Engl. 2009; 48(22):4052-5 (doi: 10.1002/anie.200900683); and Li et al., Proc Natl Acad Sci U S A, 2003 Jan 7;100(1):56-61. The entire contents of said publications are herein incorporated by reference.
- non-natural amino acids that can be incorporated into the antibodies disclosed herein include aralkyl, heterocyclyl, and heteroaralkyl, and lysine-derivative unnatural amino acids.
- non-natural amino acids comprise pyridyl, pyrazinyl, pyrazolyl, triazolyl, oxazolyl, thiazolyl, thiophenyl, or another heterocycle.
- amino acids in some embodiments comprise azides, tetrazines, or another chemical group capable of conjugation to a coupling partner, such as a water-soluble moiety.
- tRNA synthetases that are capable of selectively incorporating a non-natural amino acid may also be obtained by genetically modifying a wild type tRNA synthetase to produce mutant tRNA synthetases. Each of these mutant tRNA synthetases can then be tested for its activity in selectively incorporating the non-natural amino acid using a reporter gene, which contains the desired non-natural amino acid codon. The activity of the mutant tRNA synthetase variant in the presence of the non-natural amino acid (e.g., pAMF) as compared to the 20 common naturally occurring amino acids can be measured by detecting the presence or absence of the reporter protein.
- the non-natural amino acid codon is a synthetic codon
- the unnatural amino acid is incorporated into a protein (e.g., an antibody) using an orthogonal synthetase/tRNA pair.
- the orthogonal synthetase may be a synthetase that is modified from any of the natural amino acid synthetases.
- the orthogonal synthetase may be a proline synthetase, a modified serine synthetase, a modified tryptophan synthetase, or a modified phosphoserine synthetase.
- the orthogonal tRNA may also be modified from any of the natural amino acid tRNAs.
- the orthogonal tRNA may be a modified alanine tRNA, a modified arginine tRNA, a modified aspartic acid tRNA, a modified cysteine tRNA, a modified glutamine tRNA, a modified glutamic acid tRNA, a modified glycine tRNA, a modified histidine tRNA, a modified leucine tRNA, a modified isoleucine tRNA, a modified lysine tRNA, a modified methionine tRNA, a modified phenylalanine tRNA, a modified proline tRNA, a modified serine tRNA, a modified threonine tRNA, or a modified tryptophan tRNA.
- a modified tyrosine tRNA in some embodiments, a modified valine tRNA, or a modified phosphoserine tRNA.
- the tRNA is encoded by nucleotides 1018-1163 of SEQ ID NO: 32.
- the tRNA shares at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to a tRNA encoded by the polynucleotide sequence comprising or consisting of nucleotides 1018-1163 of SEQ ID NO: 32.
- the RNA synthetase is encoded by the polynucleotide sequence comprising or consisting of nucleotides 152-1072 of SEQ ID NO: 32. In some embodiments, the RNA synthetase shares at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to a polypeptide encoded by the polynucleotide sequence comprising or consisting of nucleotides 152-1072 of SEQ ID NO: 32.
- Codon optimization [0075]
- Codon optimization may be used to increase the rate of translation of the protein of interest or to produce recombinant RNA transcripts having desirable properties, such as a longer half-life, as compared with transcripts produced using a non-optimized sequence.
- the protein coding sequences may be optimized to maximize expression efficiency in E. coli.
- bases that are in the vicinity of the non-natural amino acid codon may affect the mRNA conformation in the P site in the ribosome and thus have an impact on the efficiency of incorporating the non-natural amino acid.
- the codon that is immediately 3’ to the non-natural amino acid codon is optimized to maximize protein expression.
- the optimal codon can be selected by comparing the yield of proteins produced from expressing the coding sequences having different codons for the same amino acid in the same location. Coding sequences that produce the desired high yield are then selected.
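- The comparison described above requires a set of coding sequences that differ only in the codon immediately 3’ to the non-natural amino acid codon while encoding the same amino acid. The sketch below enumerates such variants using the standard genetic code, assuming TAG as the non-natural amino acid codon; yields would still have to be measured experimentally. All names and the toy sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G ('*' marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def downstream_codon_variants(cds, amber_codon_index):
    """List coding sequences that vary the codon immediately 3' to a TAG site.

    `amber_codon_index` is the 0-based codon index of the TAG codon. Every
    variant encodes the same amino acid at the downstream position.
    """
    cds = cds.upper()
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    if codons[amber_codon_index] != "TAG":
        raise ValueError("no TAG codon at the given index")
    nxt = amber_codon_index + 1
    residue = CODON_TABLE[codons[nxt]]
    if residue == "*":
        raise ValueError("the codon 3' to the TAG site is a stop codon")
    return [
        "".join(codons[:nxt] + [codon] + codons[nxt + 1:])
        for codon, aa in CODON_TABLE.items()
        if aa == residue
    ]


# Toy sequence Met-TAG-Lys-stop: lysine has two codons, AAA and AAG.
for variant in downstream_codon_variants("ATGTAGAAATAA", 1):
    print(variant)
```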
- Proteins of interest [0076]
- the RF1-deficient E. coli cell can be used to produce a protein of interest, for example, a recombinant protein comprising one or more NNAAs as described above.
- the protein of interest can be eukaryotic or prokaryotic proteins, viral proteins, or plant proteins. In some embodiments, the protein of interest is of mammalian origin, including murine, bovine, ovine, feline, porcine, canine, goat, equine, and primate origin.
- the protein of interest is of human origin.
- the protein of interest is an antibody, such as a single chain antibody, a fragment of an antibody, or an antibody consisting of multiple polypeptide chains.
- the antibody is an IgA, an IgD, an IgE, an IgG, or an IgM antibody.
- the protein of interest is a light chain or heavy chain of an antibody.
- the protein of interest is an scFv.
- the protein of interest is a Fab fragment.
- the protein of interest is a monoclonal antibody.
- the antibody is a humanized antibody or a human antibody.
- an antibody of the disclosure may be chemically modified (e.g., one or more chemical moieties can be attached to the antibody) or be modified, e.g., produced in cell lines and/or in cell culture conditions to alter its glycosylation (e.g., hypofucosylation, afucosylation, or increased sialylation) to alter one or more functional properties of the antibody.
- the antibody can be linked to one of a variety of polymers, for example, polyethylene glycol.
- the antibody is aglycosylated.
- an antibody may comprise mutations to facilitate linkage to a chemical moiety and/or to alter residues that are subject to post-translational modifications, e.g., glycosylation.
- the Fc region of the antibody contains no fucose, i.e., the Fc region is afucosylated.
- Afucosylated antibodies can be produced using cell lines that express a heterologous enzyme that depletes the fucose pool inside the cell (e.g., GlymaxX® by ProBioGen AG, Berlin, Germany).
- Non-fucosylated antibodies can also be produced using a host cell line in which the endogenous α-1,6-fucosyltransferase (FUT8) gene is deleted. See Satoh, M. et al., “Non-fucosylated therapeutic antibodies as next-generation therapeutic antibodies,” Expert Opinion on Biological Therapy, 6:11, 1161-1173, DOI: 10.1517/14712598.6.11.1161. [0081] Antibodies produced using the methods in this disclosure can be conjugated to a biologically active adduct (aka a payload) using a chemical reaction such as click chemistry.
- the antibody comprises one or more non-natural amino acids (as described above) at specific sites in the protein sequence, and the biologically active adduct can be conjugated to these non-natural amino acids.
- a biologically active adduct can be conjugated to the non-natural amino acid using a chemical reaction such as click chemistry.
- the pAMF-containing antibody produced using the methods disclosed herein can be purified by standard procedures.
- the purified protein is subjected to a click chemistry reaction (e.g., a copper(I)-catalyzed azide-alkyne 1,3-cycloaddition reaction or a copper-free azide-alkyne 1,3-cycloaddition reaction) to directly conjugate a biologically active adduct to the pAMF residue.
- Exemplary biologically active adducts for use in the present invention include, but are not limited to, small molecules, oligonucleotides, peptides, amino acids, nucleic acids, sugars, oligosaccharides, polymers, synthetic polymers, chelators, fluorophores, chromophores, other detectable agents, drug moieties, cytotoxic agents, and the like.
- the protein of interest is an immunogenic polypeptide.
- the immunogenic polypeptide is a carrier protein.
- a carrier protein disclosed herein comprises at least one T-cell activating epitope.
- the T-cell activating epitope is from a protein selected from the group consisting of Corynebacterium diphtheriae toxin, Clostridium tetani tetanospasmin, Haemophilus influenzae protein D (PD, HiD), outer membrane protein complex of serogroup B meningococcus (OMPC), and CRM197.
- a carrier protein comprises a polypeptide that can be conjugated to an antigen to provide a T-cell dependent immune response.
- the antigen comprises a T-cell independent antigen selected from the group consisting of a hapten, a bacterial capsular polysaccharide, a bacterial lipopolysaccharide, and a tumor-derived glycan.
- the antigen comprises a bacterial non-capsular polysaccharide, such as an exopolysaccharide (e.g., the S. aureus exopolysaccharide).
- the antigen is a bacterial polysaccharide and the bacteria is selected from the group consisting of Streptococcus pneumoniae, Neisseria meningitidis, Haemophilus influenzae (e.g.
- At least one of the non-natural amino acids is selected from the group consisting of 2-amino-3-(4-azidophenyl)propanoic acid (pAF), 2-amino-3-(4-(azidomethyl)phenyl)propanoic acid (pAMF), 2-amino-3-(5-(azidomethyl)pyridin-2-yl)propanoic acid, 2-amino-3-(4-(azidomethyl)pyridin-2-yl)propanoic acid, 2-amino-3-(6-(azidomethyl)pyridin-3-yl)propanoic acid, 2-amino-5-azidopentanoic acid, and 2-amino-3-(4-(azidomethyl)phenyl)propanoic acid.
- a carrier protein comprises a polypeptide that comprises one or more NNAAs as disclosed above.
- the carrier protein comprises at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, or at least 9 NNAA residues.
- the non-natural amino acid is selected from the group consisting of 2-amino-3-(4-azidophenyl)propanoic acid (pAF), 2-amino-3-(4-(azidomethyl)phenyl)propanoic acid (pAMF), 2-amino-3-(5-(azidomethyl)pyridin-2-yl)propanoic acid, 2-amino-3-(4-(azidomethyl)pyridin-2-yl)propanoic acid, 2-amino-3-(6-(azidomethyl)pyridin-3-yl)propanoic acid, 2-amino-5-azidopentanoic acid, 2-amino-3-(4-(azidomethyl)phenyl)propanoic acid, and any combination thereof.
- a carrier protein comprises a polypeptide that comprises at least one NNAA, the NNAA comprising a bio-orthogonal reactive moiety through which the antigen is conjugated to the carrier protein.
- the polypeptide comprises at least two non-natural amino acids comprising a bio-orthogonal reactive moiety through which the antigen is conjugated to the polypeptide.
- a carrier protein disclosed herein comprises at least one T-cell activating epitope and at least one, and preferably at least two, NNAA, wherein the antigen is conjugated to the NNAA and further wherein the at least one NNAA is a 2,3-disubstituted propanoic acid bearing an amino substituent at the 2-position and an azido-containing substituent, a 1,2,4,5-tetrazinyl-containing substituent, or an ethynyl-containing substituent at the 3-position.
- the carrier protein comprises an amino acid sequence having at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% amino acid sequence identity to CRM197 (SEQ ID NO: 33).
- the carrier protein comprises one or more NNAA.
- the carrier protein comprises or consists of a polypeptide having an amino acid sequence of SEQ ID NO: 33.
- At least one of the K25, K34, K38, K40, K213, K215, K228, K245, K265, K386, K523 and K527 of SEQ ID NO: 33 is substituted by an NNAA. In some embodiments, at least one of the K34, K213, K245, K265, K386, and K527 of SEQ ID NO: 33 is substituted by an NNAA.
- 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12 lysines selected from the group consisting of K25, K34, K38, K40, K213, K215, K228, K245, K265, K386, K523 and K527 of SEQ ID NO: 33 are substituted by 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12 NNAA(s).
- 1, 2, 3, 4, 5, or 6 lysines selected from the group consisting of K34, K213, K245, K265, K386, and K527 of SEQ ID NO: 33 are substituted by 1, 2, 3, 4, 5, or 6 NNAA(s).
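The lysine-to-NNAA substitutions listed above can be illustrated at the DNA level. The following is a minimal sketch, assuming that each NNAA is introduced by replacing the corresponding lysine codon with the amber codon TAG and that residue positions are 1-based from the first codon of the coding sequence supplied. The function name and the toy sequence are hypothetical and are not taken from SEQ ID NO: 33.

```python
LYSINE_CODONS = {"AAA", "AAG"}


def substitute_lysines_with_amber(cds: str, positions: list[int]) -> str:
    """Return a copy of `cds` with the codon at each 1-based residue
    position replaced by the amber codon TAG.

    Raises ValueError if the sequence length is not a multiple of three,
    if a position is out of range, or if the codon at a requested
    position does not encode lysine.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    for pos in positions:
        if not 1 <= pos <= len(codons):
            raise ValueError(f"position {pos} is outside the sequence")
        if codons[pos - 1] not in LYSINE_CODONS:
            raise ValueError(f"codon {pos} ({codons[pos - 1]}) is not lysine")
        codons[pos - 1] = "TAG"
    return "".join(codons)


# Toy coding sequence (Met-Lys-Ala-Lys-stop), used only for illustration.
toy_cds = "ATGAAAGCTAAGTAA"
recoded = substitute_lysines_with_amber(toy_cds, [2, 4])
print(recoded)
```

The lysine check is a guard against numbering offsets, for example when the numbering of a reference sequence does or does not count an initiator methionine.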
- proteins of interest which can be produced include the following proteins: mammalian polypeptides including molecules such as, e.g., renin, growth hormone, receptors for hormones or growth factors; CD proteins such as CD-3, CD4, CD8, and CD-19; interleukins; interferons; T-cell receptors; surface membrane proteins; decay accelerating factor; viral antigen such as, for example, a portion of the AIDS envelope; transport proteins; homing receptors; addressins; regulatory proteins; antibodies; and fragments of any of the above-listed polypeptides.
- the protein of interest is a cytokine.
- the protein is selected from the group consisting of IL-1-like, IL-1α, IL-1β, IL-1RA, IL-18, IL-2, IL-4, IL-7, IL-9, IL-13, IL-15, IL-3, IL-5, IL-16, IL-17, IFN-α, IFN-β, IFN-γ, TNF, CD154, LT-β, TNF-α, TNF-β, 4-1BBL, APRIL, CD70, CD153, CD178, GITRL, LIGHT, OX40L, TALL-1, TRAIL, TWEAK, TRANCE, TGF-α, TGF-β1, TGF-β2, TGF-β3, Epo, Tpo, Flt-3L, SCF, M-CSF, and MSP.
- the protein is IL-1, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12, IL-13, IL-14, IL-15, IL-16, IL-17, IL-18, IL-19, IL-20, IL-21, IL-22, IL-23, IL-24, IL-25, IL-26, IL-27, IL-28, IL-29, IL-30, IL-31, IL-32, IL-33, IL-34, IL-35, or IL-36.
- the protein is IL-2.
- the protein is an interferon.
- the protein is interferon alpha. In certain embodiments, the protein is interferon beta. In certain embodiments, the protein is interferon gamma. In certain embodiments, the protein is a tumor necrosis factor. In certain embodiments, the protein is TNF alpha. In certain embodiments, the protein is TNF beta. In certain embodiments, the protein is a transforming growth factor. In certain embodiments, the protein is a chemokine. In certain embodiments, the protein is G-CSF. In certain embodiments, the protein is GM-CSF. In certain embodiments, the protein is erythropoietin. In certain embodiments, the protein is alpha-galactosidase A.
- the protein is tissue plasminogen activator. In certain embodiments, the protein is insulin. In certain embodiments, the protein is insulin-like growth factor. In certain embodiments, the protein is human growth hormone. In certain embodiments, the protein is erythropoietin.
- a protein produced by the methods and compositions disclosed herein can be used for one or more of the following purposes or effects: inhibiting the growth, infection or function of, or killing, infectious agents, including, without limitation, bacteria, viruses, fungi and other parasites; effecting (suppressing or enhancing) bodily characteristics, including, without limitation, height, weight, hair color, eye color, skin, fat to lean ratio or other tissue pigmentation, or organ or body part size or shape (such as, for example, breast augmentation or diminution, change in bone form or shape); effecting biorhythms or circadian cycles or rhythms; effecting the fertility of male or female subjects; effecting the metabolism, catabolism, anabolism, processing, utilization, storage or elimination of dietary fat, lipid, protein, carbohydrate, vitamins, minerals, cofactors or other nutritional factors or component(s); effecting behavioral characteristics, including, without limitation, appetite, libido, stress, cognition (including cognitive disorders), depression (including depressive disorders) and violent behaviors; providing analgesic effects or other
- polypeptides and proteins produced by the invention can be used for any purpose known to one of skill in the art.
- Preferred uses include medical uses, including diagnostic uses, prophylactic and therapeutic uses.
- the proteins can be prepared for topical or other type of administration.
- Another preferred medical use is for the preparation of vaccines.
- the proteins produced by the invention are solubilized or suspended in pharmacologically acceptable solutions to form pharmaceutical compositions for administration to a subject. Appropriate buffers for medical purposes and methods of administration of the pharmaceutical compositions are further set forth below. It will be understood by a person of skill in the art that medical compositions can also be administered to subjects other than humans, such as for veterinary purposes.
- a protein of interest, such as an antibody, produced by the invention, including those incorporating non-natural amino acids, can be used for any purpose known to one of skill in the art.
- Preferred uses include medical uses, including diagnostic uses, prophylactic, and therapeutic uses.
- the antibodies can be prepared for topical or other type of administration.
- the proteins produced by the invention are solubilized or suspended in pharmacologically acceptable solutions to form pharmaceutical compositions for administration to a subject. Appropriate buffers for medical purposes and methods of administration of the pharmaceutical compositions are further set forth below. It will be understood by a person of skill in the art that medical compositions can also be administered to subjects other than humans, such as for veterinary purposes.
- the proteins described herein include the wild-type prototype protein, as well as homologs, polymorphic variations and recombinantly created muteins.
- the name “RF1” includes the wild-type prototype protein from E. coli (e.g., SEQ ID NO: 1), as well as homologs from other species, polymorphic variations and recombinantly created muteins. Proteins such as RF1 are defined as having similar functions if they have substantially the same biological activity as the wild-type protein when assessed using the same type of assay.
- substantially the same biological activity means that the activity of a protein is at least 70%, at least 80%, at least 85%, at least 90%, at least 92%, or at least 95% of that of the corresponding reference protein (e.g., the corresponding wild-type protein).
- Proteins are defined as homologs having similar amino acid sequences if each has an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to the sequence of the corresponding prototype protein, such as hda, lpxK, coaD, lolA, mreC, murF, hemA, sucB, atpE, fabH, and ubiF.
- sequence identity of a protein is determined using the BLASTP program with the default word length of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff and Henikoff, Proc. Natl. Acad. Sci. USA 89:10915-10919, 1992).
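The percent-identity thresholds above can be made concrete with a short calculation. The following is a minimal sketch and does not run BLASTP: it assumes the two sequences have already been aligned (for example, by BLASTP) and are supplied as equal-length strings with "-" marking gaps, and it reports identical positions divided by alignment length. The function names and toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a pairwise alignment.

    Both inputs must be the same length; '-' marks a gap. Identity is
    the number of columns with the same residue in both sequences,
    divided by the total number of alignment columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(
    aligned_a: str, aligned_b: str, threshold: float = 80.0
) -> bool:
    """True if the alignment reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy 10-residue alignment with one mismatch (illustration only).
print(percent_identity("MKTAYIAKQR", "MKTAYLAKQR"))
print(meets_identity_threshold("MK-AY", "MKTAY", threshold=85.0))
```

Counting gap columns in the denominator is one common convention; other conventions (for example, dividing by the shorter sequence length) give different values, so the convention should be stated alongside any reported identity.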
- a conventional test to determine whether a protein homolog, polymorphic variant, or recombinant mutein is inclusive of a protein having the function described herein is specific binding to polyclonal antibodies generated against the prototype protein. Typically, a specific or selective reaction will be at least twice the background signal or noise and more typically more than 10 to 100 times background.
- an hda protein includes a protein that binds to polyclonal antibodies generated against the prototype protein of SEQ ID NO: 4.
- Methods and conditions for culturing E. coli bacterial cells e.g., the RF-1 deficient E. coli cells
- the gene modifications, e.g., the knock-out of RF1, the stop codon mutations to the coding sequences of the genes of interest, or the gain-of-function point mutations in RF2, can be performed with site-specific recombination.
- Site-specific recombination uses enzymes possessing both endonuclease activity and ligase activity; the enzymes recognize a particular DNA sequence and replace it with another corresponding DNA sequence, see Yang W. and Mizuuchi K., Structure, 1997, Vol. 5, 1401-1406(9).
- Site-specific recombination systems are well known in the art, including, e.g., the Int/att system from bacteriophage λ, the Cre/LoxP system from P1 bacteriophage, and the FLP-FRT system from yeast.
- site-specific integration into bacterial chromosomes has been reported (see, e.g., Fukushige et al., Proc. Natl. Acad. Sci., 89. 7905-7907 (1992); Baubonis et al., Nucleic Acids Research. 21, 2025-2029 (1993); Hasan et al., Gene, 150. 51-56 (1994)).
- Genes encoding the Cre or Flp recombinases can be provided in trans under the control of either constitutive or inducible promoters, or purified recombinase has been introduced (see, e.g., Baubonis et al., supra; Dang et al., Develop. Genet. 13, 367-375 (1992); Chou et al., Genetics. 131. 643-653 (1992); Morris et al., Nucleic Acids Res. 19. 5895-5900 (1991)).
- the genomic manipulations disclosed herein are performed with a modified site-specific recombination protocol from Kirill A. Datsenko and Barry L.
- knocking out a gene, for example RF1 or fabR, can be performed as follows.
- a PCR amplicon is generated comprising an antibiotic resistance gene flanked by two FRT sites and homology extensions, which are homologous to the two ends of the gene to be knocked out.
- the gene to be knocked out is then replaced by the antibiotic resistance gene through Red-mediated recombination in these flanking homology regions.
- the resistance gene can be eliminated using a helper plasmid expressing the FLP recombinase, which acts on the directly repeated FRT (FLP recognition target) sites flanking the resistance gene.
- the Red and FLP helper plasmids can be cured simply by growth at 37 °C because they are temperature-sensitive replicons. Knocking in a gene, if needed, can be performed by standard molecular cloning techniques that are well known to one skilled in the art.
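The knock-out steps above (amplicon with homology extensions, replacement of the gene, then FLP-mediated removal of the resistance gene) can be traced on toy strings. The following is a minimal in-silico sketch, not a laboratory protocol: recombination is modeled as simple string replacement, the 4-nt homology arms are far shorter than would be used in practice, and the "frt" and "kanR" tokens are placeholders rather than real sequences. All function names are hypothetical.

```python
def build_knockout_amplicon(genome: str, gene_start: int, gene_end: int,
                            cassette: str, arm_len: int) -> str:
    """Cassette flanked by homology arms copied from either side of the
    gene (0-based, half-open coordinates)."""
    if gene_start < arm_len or gene_end + arm_len > len(genome):
        raise ValueError("not enough flanking sequence for homology arms")
    upstream = genome[gene_start - arm_len:gene_start]
    downstream = genome[gene_end:gene_end + arm_len]
    return upstream + cassette + downstream


def recombine(genome: str, amplicon: str, arm_len: int) -> str:
    """Replace the region between the two homology arms with the amplicon."""
    up, down = amplicon[:arm_len], amplicon[-arm_len:]
    i = genome.find(up)
    j = genome.find(down, i + arm_len) if i != -1 else -1
    if i == -1 or j == -1:
        raise ValueError("homology arms not found in genome")
    return genome[:i] + amplicon + genome[j + arm_len:]


def flp_excise(seq: str, frt: str) -> str:
    """Remove everything between two directly repeated FRT sites,
    leaving a single FRT site behind."""
    first, last = seq.find(frt), seq.rfind(frt)
    if first == -1 or first == last:
        return seq  # fewer than two sites: nothing to excise
    return seq[:first] + seq[last:]


# Toy genome: left flank + gene + right flank (illustration only).
left, gene, right = "TTGACAGC", "ATGAAATAG", "CGTCAATG"
genome = left + gene + right
cassette = "frt" + "kanR" + "frt"  # placeholder tokens, not real DNA

amplicon = build_knockout_amplicon(genome, len(left), len(left) + len(gene),
                                   cassette, arm_len=4)
replaced = recombine(genome, amplicon, arm_len=4)
scarred = flp_excise(replaced, "frt")
print(replaced)
print(scarred)
```

In this toy model the final strain retains the flanking sequence and a single placeholder FRT site in place of the gene, mirroring the scar left after the resistance gene is removed.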
- the nucleic acid modification is introduced by a (modified) CRISPR/Cas complex or system.
- the CRISPR/Cas system or complex is a class 2 CRISPR/Cas system.
- said CRISPR/Cas system or complex is a type II, type V, or type VI CRISPR/Cas system or complex.
- the CRISPR/Cas system does not require the generation of customized proteins to target specific sequences; rather, a single Cas protein can be programmed by an RNA guide (gRNA) to recognize a specific nucleic acid target. In other words, the Cas protein can be recruited to a specific nucleic acid target locus (which may comprise or consist of RNA and/or DNA) of interest using said short RNA guide.
- CRISPR/Cas or CRISPR system refers collectively to transcripts and other elements involved in the expression of or directing the activity of CRISPR-associated (“Cas”) genes, including sequences encoding a Cas gene and one or more of, a tracr (trans-activating CRISPR) sequence (e.g., tracrRNA or an active partial tracrRNA), a tracr-mate sequence (encompassing a “direct repeat” and a tracrRNA-processed partial direct repeat in the context of an endogenous CRISPR system), a guide sequence (also referred to as a “spacer” in the context of an endogenous CRISPR system), or “RNA(s)” as that term is herein used (e.g., RNA(s) to guide Cas, such as Cas9, e.g., CRISPR RNA and, where applicable, transactivating (tracr) RNA or a single guide RNA (s
- a CRISPR system is characterized by elements that promote the formation of a CRISPR complex at the site of a target sequence (also referred to as a protospacer in the context of an endogenous CRISPR system).
- target sequence refers to a sequence to which a guide sequence is designed to have complementarity, where hybridization between a target sequence and a guide sequence promotes the formation of a CRISPR complex.
- a target sequence may comprise any polynucleotide, such as DNA or RNA polynucleotides.
- the gRNA is a chimeric guide RNA or single guide RNA (sgRNA).
- the gRNA comprises a guide sequence and a tracr mate sequence (or direct repeat). In certain embodiments, the gRNA comprises a guide sequence, a tracr mate sequence (or direct repeat), and a tracr sequence. In certain embodiments, the CRISPR/Cas system or complex as described herein does not comprise and/or does not rely on the presence of a tracr sequence (e.g., if the Cas protein is Cas12a).
- the Cas protein as referred to herein such as but not limited to Cas9, Cas12a (formerly referred to as Cpf1), Cas12b (formerly referred to as C2c1), Cas13a (formerly referred to as C2c2), C2c3, Cas13b protein, may originate from any suitable source and hence may include different orthologues, originating from a variety of (prokaryotic) organisms, as is well documented in the art.
- the Cas protein is (modified) Cas9, preferably (modified) Staphylococcus aureus Cas9 (SaCas9) or (modified) Streptococcus pyogenes Cas9 (SpCas9).
- the Cas protein is Cas12a, optionally from Acidaminococcus sp., such as Acidaminococcus sp. BV3L6 Cpf1 (AsCas12a) or Lachnospiraceae bacterium Cas12a, such as Lachnospiraceae bacterium MA2020 or Lachnospiraceae bacterium MD2006 (LBCas12a). See U.S. Pat. No. 10,669,540, incorporated herein by reference in its entirety.
- the Cas12a protein may be from Moraxella bovoculi AAX08_00205 [Mb2Cas12a] or Moraxella bovoculi AAX11_00205 [Mb3Cas12a]. See WO 2017/189308, incorporated herein by reference in its entirety.
- the Cas protein is (modified) C2c2, preferably Leptotrichia wadei C2c2 (LwC2c2) or Listeria newyorkensis FSL M6-0635 C2c2 (LbFSLC2c2).
- the (modified) Cas protein is C2c1.
- the (modified) Cas protein is C2c3.
- the (modified) Cas protein is Cas13b.
- Other Cas enzymes are available to a person skilled in the art. Methods of using a CRISPR/Cas system to eliminate gene expression are well known and are also described in, e.g., U.S. Pat. Pub. No. 2014/0170753, the disclosure of which is hereby incorporated by reference in its entirety.
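As an illustration of how a guide sequence relates to a target sequence, candidate target sites for SpCas9 can be enumerated computationally. The following is a minimal sketch, assuming SpCas9 with a 20-nt guide and a 5'-NGG-3' protospacer-adjacent motif immediately 3' of the target; it scans the forward strand only and does no off-target or efficiency scoring. The function name and toy sequence are hypothetical.

```python
import re


def find_spcas9_targets(seq: str, guide_len: int = 20) -> list[tuple[int, str, str]]:
    """Return (start, protospacer, PAM) for every forward-strand site
    where `guide_len` nucleotides are immediately followed by NGG.

    A lookahead is used so that overlapping candidate sites are all
    reported.
    """
    seq = seq.upper()
    pattern = re.compile(rf"(?=([ACGT]{{{guide_len}}})([ACGT]GG))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


# Toy sequence containing one candidate site (illustration only).
toy = "TT" + "ACGTACGTACGTACGTACGT" + "AGG" + "TT"
for start, protospacer, pam in find_spcas9_targets(toy):
    print(start, protospacer, pam)
```

To cover both strands, the same function can be applied to the reverse complement of the sequence; other Cas proteins use different motifs and guide lengths, so the pattern would need to be changed accordingly.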
- Additional methods of knocking out a target gene include, but are not limited to, homologous recombination technology, transcription activator-like effector nuclease (TALEN) technology, and zinc-finger nuclease (ZFN) technology.
- a nucleic acid encoding a protein of interest as disclosed herein can be inserted into a replicable vector for expression in the E. coli under the control of a suitable prokaryotic promoter.
- a vector typically comprises one or more of the following: a signal sequence, an origin of replication, one or more marker genes and a promoter.
- a promoter disclosed herein may comprise any appropriate promoter sequence suitable for a eukaryotic or prokaryotic host cell, which shows transcriptional activity, including mutant, truncated, and hybrid promoters, and may be obtained from polynucleotides encoding extracellular or intracellular polypeptides either endogenous (native) or heterologous (foreign) to the cell.
- the promoter may be a constitutive or an inducible promoter. In some embodiments, the promoter is a constitutive promoter. In some embodiments, the promoter is an inducible promoter.
- Suitable prokaryotic promoters useful for practice of this invention include, but are not limited to, the promoters of Pc0, PL59, MTL, ParaBAD, lac, T3, T7, lambda Pr'P1', trp, the spc ribosomal protein operon promoter Pspc, the β-lactamase gene promoter Pbla of plasmid pBR322, the PL promoter of phage λ, the replication control promoters PRNAI and PRNAII of plasmid pBR322, the P1 and P2 promoters of the rrnB ribosomal RNA operon, the tet promoter, and the pACYC promoter.
- the promoters may differ in strength, that is, in the amount of transcript each can produce. A promoter can be a weak, medium-strength, or strong promoter. The strength of a promoter can be measured as the amount of transcription of a gene product initiated at that promoter, relative to a suitable control.
- a suitable control could use the same expression construct, except that the “wild type” version of the promoter, or a promoter from a “housekeeping” gene, is used in place of the promoter to be tested.
- the promoter strength is determined by measuring the amount of transcripts from the promoter as compared to a control promoter.
- host cells containing an expression construct with the promoter to be tested (“test host cells”) and control host cells containing a control expression construct can be grown in culture in replicates. The total RNA of the host cells and controls can be extracted and measured by absorbance at 260 nm.
- cDNA can then be synthesized from equal amounts of total RNA from the test host cells and the control host cells.
- RT-PCR can be performed to amplify the cDNA corresponding to the transcript produced from the promoter.
- An exemplary method is described in De Mey et al. ("Promoter knock-in: a novel rational method for the fine tuning of genes", BMC Biotechnol. 2010 Mar 24; 10:26).
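The comparison of a test promoter against a control promoter reduces to a ratio of transcript amounts. The following is a minimal sketch, assuming that transcript abundance has already been quantified for each replicate culture in arbitrary but matching units; the function name and the example numbers are hypothetical and are not measured data.

```python
from statistics import mean


def relative_promoter_strength(test_signal: list[float],
                               control_signal: list[float]) -> float:
    """Mean transcript signal from the test promoter divided by the mean
    signal from the control promoter, using replicate measurements."""
    if not test_signal or not control_signal:
        raise ValueError("at least one replicate is required for each group")
    control_mean = mean(control_signal)
    if control_mean <= 0:
        raise ValueError("control signal must be positive")
    return mean(test_signal) / control_mean


# Made-up replicate values for illustration only.
test_replicates = [8.0, 10.0, 12.0]
control_replicates = [4.0, 5.0, 6.0]
print(relative_promoter_strength(test_replicates, control_replicates))
```

A ratio above 1 indicates more transcript than the control promoter under the same conditions; where a promoter falls among weak, medium, and strong depends on cutoffs that are not specified here.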
- the various transgenes are expressed in the E. coli under the control of promoters of different strength to regulate proper production of the recombinant proteins. This is useful for maintaining an oxidative cytoplasm in the bacteria and maximizing protein yield.
- a strong promoter, T7, is used to drive the expression of the protein of interest to ensure maximal yield.
- the E. coli strain expresses a recombinant T7 polymerase under the control of the paraBAD promoter, which allows tight regulation and control of expression of the protein of interest, e.g., through the addition or absence of arabinose. See Guzman et al., J. Bacteriol. July 1995; 177(14): 4121-4130.
- clones of the E. coli carrying the desired modifications as disclosed herein can be selected by limiting dilution.
- these clones can be sequenced to confirm that the desired mutations are present in various genes, e.g., the coding sequences for hda, lpxK, coaD, lolA, mreC, murF, and hemA.
- whole genome sequencing can be performed to determine the location of the insertion or mutation in the chromosomes.
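Once clones have been sequenced, confirming that the intended stop codon mutations are present is a simple check on the last codon of each coding sequence. The following is a minimal sketch, assuming each coding sequence is supplied 5' to 3' including its stop codon; the function names are hypothetical, the toy sequences are not the real gene sequences, and the default replacement codon TAA is an assumption (any non-TAG stop codon would satisfy the check).

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def terminal_stop_codon(cds: str) -> str:
    """Return the stop codon that terminates a coding sequence."""
    cds = cds.upper()
    if len(cds) < 3 or len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    stop = cds[-3:]
    if stop not in STOP_CODONS:
        raise ValueError(f"sequence does not end in a stop codon: {stop}")
    return stop


def genes_still_ending_in_tag(cds_by_gene: dict[str, str]) -> list[str]:
    """Names of genes whose coding sequence still terminates with TAG."""
    return sorted(
        gene for gene, cds in cds_by_gene.items()
        if terminal_stop_codon(cds) == "TAG"
    )


def recode_amber_stop(cds: str, replacement: str = "TAA") -> str:
    """Replace a terminal TAG with a non-TAG stop codon; other stops are kept."""
    if replacement not in STOP_CODONS - {"TAG"}:
        raise ValueError("replacement must be TAA or TGA")
    if terminal_stop_codon(cds) != "TAG":
        return cds.upper()
    return cds.upper()[:-3] + replacement


# Toy sequences keyed by gene name (illustration only).
toy = {"hda": "ATGGCTTAA", "lpxK": "ATGAAATAG", "coaD": "ATGCCCTGA"}
print(genes_still_ending_in_tag(toy))
print(recode_amber_stop(toy["lpxK"]))
```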
- a mutation introduced in one or more of the genes does not abolish protein expression of RF1 or fabR, but results in a mutein that lacks the activity that the corresponding wild-type protein possesses, e.g., the activity of RF1 in catalyzing translational termination from a ribosomal complex stalled at the amber codon.
- an RF1-deficient E. coli cell disclosed herein is produced using other genetic engineering methods to reduce the endogenous RF1 protein activity, including but not limited to, 1) replacing the endogenous RF1 promoter with a promoter with weaker promoter activity to reduce the transcription of the RF1 gene; or 2) replacing the endogenous RF1 ribosomal binding site with an attenuated ribosomal binding site to reduce the translation of the RF1 transcript.
- the various muteins generated can be tested to confirm the extent of the loss of the activity of the wild-type protein.
- each of the coding sequences for the muteins can be separately expressed in a host strain, and the muteins are purified and tested for their activities as described below.
- a mutation, such as a T246A substitution in the RF2 protein, confers increased RF2 activity, for example an increase of at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, or at least 70% in activity compared to the corresponding wild-type RF2 protein.
- the various mutant proteins (“muteins”) generated can be tested to confirm that their activity is increased relative to that of the wild-type RF2 protein.
- each of the coding sequences for the muteins can be separately expressed in a host strain, and the muteins are purified and tested for their activities as described below.
- the activity of RF1 and RF2 can be assayed using a peptidyl-tRNA hydrolysis assay. This assay measures the rate at which release factors can catalyze translational termination from a ribosomal complex stalled at different stop codons.
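A rate measured in such an assay can be turned into a percent change in activity relative to the wild-type protein. The following is a minimal sketch under a stated assumption that is not taken from this disclosure: peptide release is treated as a single first-order process, so the fraction released at time t is 1 - exp(-k·t), and k is estimated by a least-squares line through the origin on -ln(1 - fraction). The function names are hypothetical and the time courses are synthetic, not measured.

```python
import math


def first_order_rate_constant(times: list[float],
                              fraction_released: list[float]) -> float:
    """Estimate k from fraction released = 1 - exp(-k * t) using a
    least-squares fit through the origin of -ln(1 - f) against t."""
    if len(times) != len(fraction_released) or not times:
        raise ValueError("times and fractions must be non-empty and equal length")
    numerator = 0.0
    denominator = 0.0
    for t, f in zip(times, fraction_released):
        if not 0.0 <= f < 1.0:
            raise ValueError("fractions must lie in [0, 1)")
        numerator += t * -math.log(1.0 - f)
        denominator += t * t
    if denominator == 0.0:
        raise ValueError("at least one non-zero time point is required")
    return numerator / denominator


def percent_increase(mutant: float, wild_type: float) -> float:
    """Percent increase of the mutant value over the wild-type value."""
    if wild_type <= 0:
        raise ValueError("wild-type value must be positive")
    return 100.0 * (mutant - wild_type) / wild_type


# Synthetic time courses generated from known rate constants.
times = [0.0, 1.0, 2.0, 4.0]
wild_type_course = [1.0 - math.exp(-0.50 * t) for t in times]
mutein_course = [1.0 - math.exp(-0.75 * t) for t in times]

k_wt = first_order_rate_constant(times, wild_type_course)
k_mut = first_order_rate_constant(times, mutein_course)
print(round(k_wt, 3), round(k_mut, 3), round(percent_increase(k_mut, k_wt), 1))
```

With real data the first-order assumption should be checked against the time course before the fitted constant is compared between proteins.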
- FabR activity can be tested using a gel shift assay that measures the binding of this transcriptional regulator to its cognate promoter DNA binding sequence. See, for instance, Mol Microbiol. 2011 April; 80(1): 195–218. doi:10.1111/j.1365-2958.2011.07564.x.
Measuring expression level
- Various methods can be used to determine the protein expression level of the various modified genes in the E. coli and/or confirm whether a gene (for example, RF1) has been knocked out or inserted.
- expression of a gene can be determined by conventional Northern blotting to quantitate the transcription of mRNA.
- Various labels may be employed, most commonly radioisotopes. However, other techniques may also be employed, such as using biotin-modified nucleotides for introduction into a polynucleotide. The biotin then serves as the site for binding to avidin or antibodies, which may be labeled with a wide variety of labels, such as radionuclides, fluorescers, enzymes, or the like.
- the expressed protein can be purified and quantified using gel electrophoresis (e.g., PAGE), Western analysis or capillary electrophoresis (e.g., Caliper LabChip).
- Protein synthesis in cell-free translation reactions may be monitored by the incorporation of radiolabeled amino acids, typically 35S-labeled methionine or 14C-labeled leucine.
- Radiolabeled proteins can be visualized for molecular size and quantitated by autoradiography after electrophoresis or isolated by immunoprecipitation.
- the incorporation of recombinant His tags affords another means of purification, i.e., purification by Ni2+ affinity column chromatography.
- Protein production from expression systems can be measured as soluble protein yield or by using an assay of enzymatic or binding activity.
- where the protein to be quantified possesses defined biological activity, for example, enzymatic activity (such as alkaline phosphatase) or growth inhibition activity, the expression of the protein of interest can be confirmed by assaying its activity by incubating with proper substrates.
- Similar methods can also be used to measure the expression level of a protein of interest (e.g., an NNAA-containing protein) in the RF1-deficient E. coli cells as disclosed herein.
Kits
- This disclosure also provides kits that comprise RF1-deficient E.
- the kit further comprises one plasmid encoding an aminoacyl-tRNA synthetase (RS) that preferentially aminoacylates an NNAA and one plasmid encoding a tRNA that can be specifically charged with said NNAA.
- the kit further comprises one plasmid encoding an aminoacyl-tRNA synthetase (RS) that preferentially aminoacylates pAMF and one plasmid encoding a tRNA that can be specifically charged with p-azidomethylphenylalanine.
- the kit further comprises a plasmid encoding an aminoacyl-tRNA synthetase (RS) that preferentially aminoacylates the NNAA (e.g., pAMF) and a tRNA that can be specifically charged with said NNAA (for example, p-azidomethylphenylalanine).
- the plasmid is a multicistronic expression cassette comprising one copy of the RS and three copies of the tRNA, i.e., the plasmid uses a single promoter to produce one transcript encoding one copy of the RS and three copies of the tRNA.
- the kit may further comprise one or more reagents necessary for preparing an RF1-deficient E. coli cell of the disclosure.
- the kit may comprise one or more reagents necessary for one or more of: 1) knocking out RF1, 2) introducing the gain-of-function mutation in RF2, 3) introducing one or more stop codon mutations to the coding sequence of hda, lpxK, coaD, lolA, mreC, murF, hemA, sucB, atpE, fabH, and ubiF, or 4) knocking out fabR of a host E. coli cell.
- a kit may comprise agents necessary for improving the growth of the modified E. coli cells as disclosed above.
Exemplary embodiments
- This disclosure includes the following non-limiting embodiments:
- Embodiment 1 is an RF1-deficient E.
- Embodiment 2 is the RF1-deficient E. coli cell of embodiment 1, wherein the number of stop codon mutations is no greater than 20.
- Embodiment 3 is the RF1-deficient E. coli cell of embodiment 1, wherein the number of stop codon mutations is in the range of between 2 and 10.
- Embodiment 4 is an RF1-deficient E.
- Embodiment 5 is the RF1-deficient E. coli cell of embodiment 4, wherein 2 to 7 of the coding sequences comprise non-TAG stop codons.
- Embodiment 6 is the RF1-deficient E.
- Embodiment 7 is the RF1-deficient E. coli cell of any one of embodiments 2-6, wherein the coding sequences of hda, lpxK, coaD, lolA, mreC, murF, and hemA comprise at least the last 10 nucleotides of SEQ ID NO: 5, 7, 9, 11, 13, 15, 17, respectively.
- Embodiment 8 is the RF1-deficient E.
- Embodiment 9 is the RF1-deficient E. coli cell of any of the previous embodiments, wherein the cell has an oxidative cytoplasm.
- Embodiment 10 is the RF1-deficient E. coli cell of any of the previous embodiments, wherein the cell is a K-12 E. coli cell.
- Embodiment 11 is the RF1-deficient E.
- Embodiment 12 is the RF1-deficient E. coli cell of embodiment 11, wherein the RF2 comprises a T246X mutation as compared to SEQ ID NO: 2.
- Embodiment 13 is the RF1-deficient E. coli cell of embodiment 12, wherein T246X is T246A.
- Embodiment 14 is the RF1-deficient E.
- Embodiment 15 is the RF1-deficient E. coli cell of any of the previous embodiments, wherein the cell further expresses an aminoacyl-tRNA synthetase.
- Embodiment 17 is the RF1-deficient E. coli cell of embodiment 13, wherein the aminoacyl-tRNA synthetase recognizes the TAG stop codon.
- Embodiment 18 is the RF1-deficient E. coli cell of embodiment 16 or 17, wherein the aminoacyl-tRNA synthetase preferentially aminoacylates to a degree of greater than 90% a tRNA with a non-natural amino acid as compared to the 20 common naturally occurring amino acids.
- Embodiment 19 is the RF1-deficient E. coli cell of embodiment 18, wherein the non-natural amino acid is para-azido-methyl-L-phenylalanine (pAMF).
- Embodiment 20 is the RF1-deficient E. coli cell of any of the previous embodiments, wherein the cell further comprises a gene encoding a protein of interest.
- Embodiment 21 is the RF1-deficient E. coli cell of embodiment 20, wherein the protein of interest is selected from an antibody, an antibody fragment, an antibody light chain, an antibody heavy chain, a cytokine, a cytokine fragment, and an immunogenic polypeptide.
- Embodiment 22 is the RF1-deficient E. coli cell of embodiment 21, wherein the antibody is a monoclonal antibody.
- Embodiment 23 is the RF1-deficient E. coli cell of embodiment 21 or 22, wherein the antibody is an IgA, an IgD, an IgE, an IgG, or an IgM.
- Embodiment 24 is the RF1-deficient E.
- Embodiment 25 is the RF1-deficient E. coli cell of any one of embodiments 21-24, wherein the antibody is aglycosylated.
- Embodiment 26 is the RF1-deficient E. coli cell of embodiment 21, wherein the antibody fragment is selected from an Fv fragment, a Fab fragment, a F(ab')2 fragment, a Fab' fragment, an scFv (sFv) fragment, and an scFv-Fc fragment.
- Embodiment 27 is the RF1-deficient E.
- Embodiment 28 is the RF1-deficient E. coli cell of embodiment 21, wherein the antibody light chain is a light chain of an anti-HER2 antibody.
- Embodiment 28 is the RF1-deficient E. coli cell of embodiment 21, wherein the immunogenic polypeptide is a carrier protein.
- Embodiment 29 is the RF1-deficient E. coli cell of embodiment 28, wherein the carrier protein comprises at least one T-cell activating epitope from a protein selected from the group consisting of Corynebacterium diphtheriae toxin, Clostridium tetani tetanospasmin, Haemophilus influenzae protein D, and CRM197.
- Embodiment 30 is the RF1-deficient E.
- Embodiment 31 is the RF1-deficient E. coli cell of embodiment 21, wherein the cytokine is selected from the group consisting of interleukins, interferons, transforming growth factors, and chemokines.
- Embodiment 32 is the RF1-deficient E. coli cell of any one of embodiments 20-31, wherein the protein of interest comprises one or more non-natural amino acids (NNAAs).
- Embodiment 33 is the RF1-deficient E.
- the one or more NNAAs is selected from the group consisting of p-acetyl-L-phenylalanine, O-methyl-L-tyrosine, an L-3-(2-naphthyl)alanine, 3-methyl-phenylalanine, O-4-allyl-L-tyrosine, 4-propyl-L-tyrosine, a tri-O-acetyl-GlcNAcβ-serine, L-Dopa, fluorinated phenylalanine, isopropyl-L-phenylalanine, p-azido-L-phenylalanine, p-azido-methyl-L-phenylalanine, p-acyl-L-phenylalanine, p-benzoyl-L-phenylalanine, L-phosphoserine, phosphonoserine, phosphonotyrosine, p-iodo-phenylalanine
- Embodiment 34 is the RF1-deficient E. coli cell of embodiment 32, wherein the one or more NNAAs is p-azido-methyl-L-phenylalanine.
- Embodiment 35 is the RF1-deficient E. coli cell of embodiment 20, wherein the gene encoding the protein of interest is operably linked to an inducible promoter.
- Embodiment 36 is the RF1-deficient E. coli cell of embodiment 35, wherein the inducible promoter is a T7 promoter.
- Embodiment 37 is the RF1-deficient E. coli cell of any one of embodiments 1-36, wherein the RF-1 deficient E.
- Embodiment 38 is a kit comprising the RF1-deficient E. coli cell of any of embodiments 4-36, wherein the kit further comprises a bacteria growth medium.
- Embodiment 39 is the kit of embodiment 37, wherein the kit further comprises a plasmid encoding a protein of interest.
- Embodiment 40 is the kit of embodiment 38 or 39, wherein the kit further comprises a plasmid encoding an aminoacyl-tRNA synthetase (RS) specific for pAMF and a tRNA specific for p-azidophenylalanine.
- Embodiment 41 is a method for expressing a soluble, recombinant protein in an RF1-deficient E. coli bacterial cell comprising the steps of: culturing the RF1-deficient E. coli bacterial cell and an expression cassette for expressing the recombinant protein, wherein the coding sequences for one or more or all of hda, lpxK, coaD, lolA, mreC, murF, and hemA in the RF1-deficient E. coli cell comprise non-TAG stop codons, and wherein the RF1-deficient E. coli cell has increased RF2 activity or expression as compared to a control E. coli cell.
- Embodiment 42 is the method of embodiment 41, wherein the RF1-deficient E. coli bacterial cell comprises an oxidative cytoplasm.
- Embodiment 43 is the method of embodiment 41, wherein the number of the coding sequences of hda, lpxK, coaD, lolA, mreC, murF, and hemA that comprise non-TAG stop codons is 2, 3, 4, 5, 6, or 7.
- Embodiment 44 is a method for expressing a protein of interest comprising culturing the RF1-deficient E. coli bacterial cell of any one of embodiments 1-36, wherein the RF1-deficient E.
- Embodiment 45 is the method of embodiment 41, wherein the recombinant protein comprises one or more NNAAs.
- Embodiment 46 is the method of embodiment 41, wherein the stop codons are non-TAG stop codons due to genetic modifications of the stop codons in the wild-type coding sequences of hda, lpxK, coaD, lolA, mreC, murF, and hemA.
- Embodiment 47 is the method of embodiment 41 or 46, wherein the coding sequences of hda, lpxK, coaD, lolA, mreC, murF, and hemA polypeptides comprise polynucleotide sequences of SEQ ID NO: 5, 7, 9, 11, 13, 15, and 17, respectively.
- Embodiment 48 is the method of any one of embodiments 41-47, wherein the cell further comprises the aminoacyl-tRNA synthetase.
- Embodiment 49 is the method of embodiment 48, wherein the aminoacyl-tRNA synthetase recognizes the TAG stop codon.
- Embodiment 50 is the method of any one of embodiments 41-49, wherein the RF1-deficient E. coli cell contains an oxidative cytoplasm.
- Embodiment 51 is the method of any one of embodiments 41-50, wherein the RF1-deficient E. coli cell is a K-12 cell.
- Embodiment 52 is the method of any one of embodiments 41-51, wherein the RF1-deficient E. coli cell comprises a T246A mutation in the RF2 coding sequence.
- Embodiment 53 is the method of any one of embodiments 41-52, wherein the RF1-deficient E.
- Embodiment 54 is the method of any one of embodiments 41-53, wherein the RF1-deficient E. coli cell further comprises a ΔfabR mutation.
- Embodiment 55 is the method of any one of embodiments 41-54, wherein the RF1-deficient E. coli strain further expresses an aminoacyl-tRNA synthetase.
- Embodiment 56 is the method of embodiment 55, wherein the aminoacyl-tRNA synthetase preferentially aminoacylates to a degree of greater than 90% a tRNA with a non-natural amino acid as compared to the twenty common naturally occurring amino acids.
- Embodiment 57 is the method of any one of embodiments 55-56, wherein the aminoacyl-tRNA synthetase recognizes the TAG stop codon.
- Embodiment 58 is the method of any one of embodiments 44-57, wherein the protein of interest is selected from an antibody, an antibody fragment, an antibody light chain, an antibody heavy chain, a cytokine, a cytokine fragment, and an immunogenic polypeptide.
- Embodiment 59 is the method of embodiment 58, wherein the protein of interest is selected from an antibody, an antibody fragment, an antibody light chain, and an antibody heavy chain.
- Embodiment 60 is the method of embodiment 58 or 59, wherein the antibody is a monoclonal antibody.
- Embodiment 61 is the method of any one of embodiments 58-60, wherein the antibody is an IgA, an IgD, an IgE, an IgG, or an IgM.
- Embodiment 62 is the method of any one of embodiments 58-61, wherein the antibody is humanized or human.
- Embodiment 63 is the method of any one of embodiments 58-62, wherein the antibody is aglycosylated.
- Embodiment 64 is the method of embodiment 58 or 59, wherein the antibody fragment is selected from an Fv fragment, a Fab fragment, a F(ab’)2 fragment, a Fab’ fragment, an scFv (sFv) fragment, and an scFv-Fc fragment.
- Embodiment 65 is the method of embodiment 58, wherein the immunogenic polypeptide is a carrier protein.
- Embodiment 66 is the method of embodiment 65, wherein the carrier protein comprises at least one T-cell activating epitope from a protein selected from the group consisting of Corynebacterium diphtheriae toxin, Clostridium tetani tetanospasmin, Haemophilus influenzae protein D, and CRM197.
- Embodiment 67 is the method of embodiment 65, wherein the carrier protein comprises an amino acid sequence having at least 80% sequence identity to SEQ ID NO: 33.
- Embodiment 68 is the method of embodiment 58, wherein the cytokine is selected from the group consisting of interleukins, interferons, transforming growth factors, and chemokines.
- Embodiment 69 is the method of any one of embodiments 41-64, wherein the gene encoding the protein of interest is operably linked to an inducible promoter.
- Embodiment 70 is the method of embodiment 69, wherein the inducible promoter is a T7 promoter.
- It is understood that the examples and embodiments described herein are for illustrative purposes only, and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application and scope of the appended claims. All publications, sequence accession numbers, patents, and patent applications cited herein are hereby incorporated by reference in their entirety for all purposes. EXAMPLES Example 1.
- pJ411 trastuzumab LC E213TAG was generated.
- The LC gene was cloned into an operon with a T7 promoter and T7 terminator.
- The codon for residue 213 was mutated to TAG to specify the location of the NNAA.
- This plasmid has a high copy pUC origin of replication and contains the gene for kanamycin resistance. Plasmid sequences were verified by sequencing.
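The codon replacement described above can be illustrated with a minimal Python sketch. The function name `mutate_codon_to_tag` and the toy open reading frame are hypothetical and are not taken from the disclosure; the sketch also assumes the residue number maps directly onto the codon index of the sequence supplied, which may not hold for antibody numbering schemes or constructs with leader sequences.

```python
def mutate_codon_to_tag(coding_seq: str, residue_number: int) -> str:
    """Return the coding sequence with the codon of a 1-based residue replaced by TAG."""
    seq = coding_seq.upper()
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    n_codons = len(seq) // 3
    if not 1 <= residue_number <= n_codons:
        raise ValueError("residue number is outside the coding sequence")
    start = (residue_number - 1) * 3  # 0-based index of the first base of the codon
    return seq[:start] + "TAG" + seq[start + 3:]


# Toy ORF for illustration only (not the trastuzumab LC gene):
# ATG GAA AAA GAG TAA  ->  Met Glu Lys Glu stop
toy_orf = "ATGGAAAAAGAGTAA"
amber_mutant = mutate_codon_to_tag(toy_orf, 4)  # replaces the GAG codon at residue 4
print(amber_mutant)
```

The natural stop codon at the end of the toy sequence is left untouched; only the selected internal codon becomes an amber codon.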
- Cotranslational p-azidomethylphenylalanine (pAMF) incorporation also requires expression of an amber suppressor tRNA orthogonal to natural E. coli aminoacyl-tRNA synthetases (AAtRS) and an orthogonal pAMF AAtRS that specifically recognizes the amber suppressor tRNA and the pAMF NNAA.
- AAtRS E. coli aminoacyl-tRNA synthetases
- The coding sequence for an aminoacyl-tRNA synthetase that preferentially aminoacylates a tRNA with pAMF and the coding sequence for a tRNA that can be specifically charged with p-azidomethylphenylalanine (pAMF) were cloned into a multicistronic expression cassette.
- One copy of the RS coding sequence and three copies of the pAzF (p-azidophenylalanine) tRNA coding sequence were cloned in a dual promoter system consisting of an inducible T7 promoter followed by a constitutive Pc0 promoter.
- The vector had a medium copy (p15a origin) with a β-lactamase selection marker. Both the origin and marker are compatible with pJ411. Plasmid sequences were verified by sequencing.
- Example 3 Shake flask production of trastuzumab light chain containing 1 pAMF non-natural amino acid in E. coli strains lacking release factor 1 [0196] To assess amber suppression efficiency, the expression of a trastuzumab light chain (LC) construct containing 1 amber (TAG) codon at position E213 (LC E213 TAG) was tested in ΔRF1 Snuggle E. coli strains.
- LC trastuzumab light chain
- The E. coli strain for expression of LC E213 TAG was generated by transforming strain 711 or 713 (ΔRF1) with the product plasmid. Single colonies were grown in overnight seed cultures at 37 °C in Terrific Broth (TB) containing 50 µg/mL kanamycin (TB + Kan). The next day, seed cultures were diluted 1:40 into larger expression cultures (25-250 mL) of fresh TB + Kan and grown at 37 °C until they reached an OD600 of 1.2-1.5.
- The fermentation process began by taking a 2 mL vial of the cell bank and inoculating a shake flask with I17-SF Shake Flask Media containing an added 24 g/L Bacto Yeast Extract, 50 µg/mL of kanamycin and 100 µg/mL of carbenicillin at about 8% (v/v) seeding density.
- The flask culture was used to inoculate a 500 mL bioreactor at a seeding density of 5% (v/v) in batched media, which consists of 50 µg/mL of kanamycin, 100 µg/mL of carbenicillin, 0.05% (v/v) A204 antifoam and 2% (v/v) 5x I17 Media + 120 g/L Bacto Yeast Extract in DI H2O.
- Tables 2 and 3 describe the components of I17-SF Media, and Tables 4, 5 and 6 describe the components of 5x I17 Media.
- Table 2 Components of I17-SF Shake Flask Media Solution
- Table 3 Components of 10x Base Salts for I17-SF Shake Flask Media Solution
- Table 4 Components of 5x I17 Media Solution
- Table 5 Components of concentrated stock solution of vitamins for 5x I17 Media Solution
- Table 6 Components of concentrated stock solution of trace metals for 5x I17 Media Solution [0198]
- The bioreactor temperature, dissolved oxygen and pH setpoints were 37 °C, 30% and 7, respectively. Once the cells grew to an OD at 595 nm between 3 and 5 in the batch phase, the fed-batch phase began by feeding 5x I17 Media + 120 g/L Bacto Yeast Extract at an exponential rate of 0.2 h⁻¹.
- The feed rate was adjusted to ensure that all glucose was depleted prior to induction.
- The temperature of the bioreactor was decreased to 25 °C and the exponential feed rate was decreased to 0.02 h⁻¹.
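As a rough illustration of the exponential feed described above, the sketch below assumes the common fed-batch form F(t) = F0 · exp(µ t), where µ is the exponential rate (0.2 h⁻¹ before induction, 0.02 h⁻¹ after the temperature shift). The initial feed rate F0 and its units are hypothetical, since the text does not state them, and the interpretation of "exponential rate" as µ in this formula is itself an assumption.

```python
import math


def exponential_feed_rate(initial_rate: float, mu_per_h: float, hours: float) -> float:
    """Feed rate after `hours`, assuming F(t) = F0 * exp(mu * t)."""
    if initial_rate < 0 or hours < 0:
        raise ValueError("initial rate and elapsed time must be non-negative")
    return initial_rate * math.exp(mu_per_h * hours)


def doubling_time_h(mu_per_h: float) -> float:
    """Time for the feed rate to double at exponential rate mu (h^-1)."""
    if mu_per_h <= 0:
        raise ValueError("mu must be positive")
    return math.log(2.0) / mu_per_h


# Hypothetical starting feed rate of 1.0 (arbitrary units, e.g. mL/h).
for mu in (0.2, 0.02):
    rate_after_5h = exponential_feed_rate(1.0, mu, 5.0)
    print(f"mu = {mu} 1/h: rate after 5 h = {rate_after_5h:.3f}, "
          f"doubling time = {doubling_time_h(mu):.1f} h")
```

Under this assumption, lowering the rate from 0.2 h⁻¹ to 0.02 h⁻¹ lengthens the doubling time of the feed tenfold.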
- The induction phase began by adding pAMF to a target concentration of 2 mM based on the culture volume of the bioreactor before induction and L-arabinose to a target concentration of 4 g/L based on the starting volume of the bioreactor.
- The induction phase lasted 24-48 hours before harvest.
- The culture was collected and centrifuged at 18,592 × g and 2-8 °C for 30 min in a floor centrifuge.
- Table 8 Components of S30-5 Buffer [0200]
- The cell resuspension was then passed twice through an Avestin Homogenizer (EmulsiFlex-C5) at 14,000 psi and 3,500 psi to disrupt the cells and generate the crude lysate.
- The crude lysate was further clarified by centrifuging at 18,000-30,000 × g and 2-8 °C for 30 minutes in a floor centrifuge. The supernatant was collected and centrifuged once more at 18,000-30,000 × g and 2-8 °C for 30 minutes, and then the clarified lysate was aliquoted, flash frozen in liquid nitrogen and stored at -80 °C.
- Example 5 Components of S30-5 Buffer
- Intact LC-MS was performed on an Agilent 6520 QTOF mass spectrometer coupled to an Agilent 1200 series HPLC. Prior to analysis, the QTOF MS was calibrated in Extended Dynamic Range mode (2 GHz) in Standard (3200 m/z) range. For each sample, 10-15 pmol of protein was separated over a reverse phase column prior to introduction into the MS source.
- The HPLC mobile phase consisted of 0.1% formic acid in H2O (A) and 0.1% formic acid in acetonitrile (B).
- Proteins were separated using a gradient method starting at 10% B and increasing to 95% B over 10 minutes. After separation, proteins were analyzed by the QTOF MS operating in positive ion MS (Seg) mode in a mass range of 500-3200 m/z with a scan rate of 1 spectrum per second. Data were analyzed in Agilent MassHunter Qualitative Analysis software. Proteins were identified by the existence of a peak in the total ion chromatogram. Mass spectra for the LC E213 TAG peak were deconvoluted using the Maximum Entropy algorithm with a mass range of 10,000-60,000 Daltons and a 1.00 Dalton mass step.
- Truncated, full-length, and light chain dimer peak identities were confirmed by comparing peak mass to the predicted protein masses that had been calculated using gpmaw3.
- The peaks for each species in the deconvoluted mass spectra were integrated and used to calculate the percent truncated and full-length species.
- Percent truncated species was calculated according to the following formula: $\%\,\text{truncated} = \dfrac{A_{\text{truncated}}}{A_{\text{total}}} \times 100$, where $A_{\text{truncated}}$ is the truncated peak area and $A_{\text{total}}$, the sum of all peak areas, is equal to the sum of the truncated peak area, the full-length peak area, and two times the dimer peak area. [0202] In wild-type E.
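The percent-truncated calculation described above can be sketched in Python as follows. The function name and the example peak areas are hypothetical; only the weighting (truncated area plus full-length area plus two times the dimer area in the denominator) is taken from the text.

```python
def percent_truncated(truncated_area: float,
                      full_length_area: float,
                      dimer_area: float) -> float:
    """Percent truncated species from integrated deconvoluted peak areas.

    The denominator follows the text: truncated + full-length + 2 x dimer.
    """
    total = truncated_area + full_length_area + 2.0 * dimer_area
    if total <= 0:
        raise ValueError("sum of peak areas must be positive")
    return 100.0 * truncated_area / total


# Made-up peak areas for illustration only.
print(percent_truncated(truncated_area=10.0, full_length_area=80.0, dimer_area=5.0))
```

The dimer peak is counted twice in the denominator because each dimer contains two light chains.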
- The plasmid for expressing the pAcPhe RS and tRNA was generated by cloning one copy of the pAcPhe RS and three copies of the pAzF tRNA behind a dual promoter system consisting of an inducible T7 promoter followed by a constitutive Pc0 promoter.
- The vector had a medium copy (p15a origin) with a β-lactamase selection marker. Both the origin and marker are compatible with pJ411. Plasmid sequences were verified by sequencing.
- The E. coli strain for expression of LC with pAcPhe was generated by transforming strain 711 or 713 (ΔRF1) with the LC and RS/tRNA plasmids.
- IL18 can be expressed solubly with an N-terminal SUMO tag.
- The E. coli strains for expression of IL18 D157TAG were generated by transforming RF1+ strain SBDG674 and RF1 KO strain SBDG675 with both the RS/tRNA plasmid and the product plasmid carrying the gene for SUMO-IL18 D157 TAG in the pJ411 vector.
- Single colonies were grown in overnight seed cultures at 37 °C in Terrific Broth (TB) containing 50 µg/mL kanamycin and 100 µg/mL carbenicillin. The next day, seed cultures were diluted 1:40 into larger expression cultures (25-250 mL) of fresh TB with 50 µg/mL kanamycin and 100 µg/mL carbenicillin and grown at 37 °C until they reached an OD600 of 1.2-1.5. At that time, protein expression was induced by adding arabinose to a final concentration of 0.2%, pAMF was added to a final concentration of 2 mM and the temperature was lowered to 25 °C for 18-20 hours. Cells were then harvested by centrifugation at 6,000 × g for 10 minutes.
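The induction additions above reduce to a dilution calculation. The sketch below is illustrative only: the stock concentrations (20% arabinose, 100 mM pAMF) and the 100 mL culture volume are hypothetical values chosen for the example, not figures from the text, and each addition is treated independently of the other.

```python
def stock_volume_to_add_ml(culture_volume_ml: float,
                           final_conc: float,
                           stock_conc: float) -> float:
    """Volume of stock (mL) that brings a culture to `final_conc`.

    Accounts for the volume added: stock * v / (V + v) = final.
    `final_conc` and `stock_conc` must share the same units.
    """
    if stock_conc <= final_conc:
        raise ValueError("stock must be more concentrated than the target")
    return final_conc * culture_volume_ml / (stock_conc - final_conc)


culture_ml = 100.0  # hypothetical expression culture volume
arabinose_ml = stock_volume_to_add_ml(culture_ml, final_conc=0.2, stock_conc=20.0)  # percent
pamf_ml = stock_volume_to_add_ml(culture_ml, final_conc=2.0, stock_conc=100.0)      # mM
print(f"arabinose stock: {arabinose_ml:.2f} mL, pAMF stock: {pamf_ml:.2f} mL")
```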
- TB Terrific Broth
- pJ411 CRM197 TAG has a high copy pUC origin of replication and contains the gene for Kanamycin resistance.
- The NNAA CRM197 plasmid sequences were verified by sequencing.
- Co-translational NNAA incorporation also requires expression of an amber suppressor tRNA orthogonal to existing aminoacyl tRNA synthetases (AAtRS) and an orthogonal AAtRS that specifically recognizes the amber suppressor tRNA and the NNAA.
- AAtRS amber suppressor tRNA orthogonal to existing aminoacyl tRNA synthetases
- SBDG713 RF1- strain
- The NNAA CRM197 strain was produced by transforming the strains SBDG711 and SBDG713 with pJ411 CRM197 TAG and p15a NNAA RS/tRNA and selecting for transformants on plates with kanamycin. Expression of NNAA CRM197 in shake flasks and high-density fermentation proceeded as described for NNAA-containing LC in Examples 6 and 4, respectively. [0207] After expression of NNAA CRM197, cells were harvested by centrifugation at 7,000 × g for 7 minutes. Cell pellets were resuspended in PBS, 0.1 µg/mL lysozyme, and 0.05 U/mL benzonase.
- CRM197 protein lysates from each expression were subsequently applied to 100 µL Ni-NTA resin by gravity flow, washed with 3 column volumes of PBS and 10 mM imidazole (wash buffer), and then eluted with 10 column volumes of PBS and 200 mM imidazole (elution buffer). Elution fractions were analyzed by SDS-PAGE for purity, and fractions that consisted mostly of CRM197 were pooled. Protein concentration was calculated using a Nanodrop spectrophotometer that had been blanked with elution buffer.
- The total protein yield was calculated by dividing the sample absorbance at 280 nm by the predicted protein molar absorbance (SnapGene) of 0.92 mg/mL and multiplying the resulting protein concentration by the sample volume in mL.
- Final titers (Table 12) were calculated based on the initial culture volume of 40 mL. Table 12.
- Table of CRM197 titers and conjugate-to-protein ratios [0209] Titers of CRM197 K25 pAMF (1X pAMF) expressed in SBDG711 and SBDG713 were approximately equal, at 36 and 44 +/- 3 mg/L, respectively.
- Titers of CRM197 K25/K215 pAMF (2X pAMF) and CRM197 K25/K215/K228 pAMF (3X pAMF) derived from strain SBDG713 were similar, at 40 +/- 2 and 39 +/- 12 mg/L, respectively.
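The yield and titer arithmetic described for these CRM197 preparations can be sketched as below. The sketch assumes the stated value of 0.92 is the predicted A280 of a 1 mg/mL solution, so that A280 / 0.92 gives mg/mL; that reading, the function names, and the example absorbance and elution volume are assumptions for illustration. The 40 mL culture volume is taken from the text.

```python
def protein_yield_mg(a280: float,
                     sample_volume_ml: float,
                     a280_per_mg_per_ml: float = 0.92) -> float:
    """Total protein (mg): (A280 / predicted absorbance at 1 mg/mL) x sample volume."""
    if a280_per_mg_per_ml <= 0:
        raise ValueError("absorbance coefficient must be positive")
    concentration_mg_per_ml = a280 / a280_per_mg_per_ml
    return concentration_mg_per_ml * sample_volume_ml


def titer_mg_per_l(yield_mg: float, culture_volume_ml: float = 40.0) -> float:
    """Titer (mg/L) normalized to the initial culture volume."""
    if culture_volume_ml <= 0:
        raise ValueError("culture volume must be positive")
    return yield_mg / (culture_volume_ml / 1000.0)


# Hypothetical pooled elution: A280 of 1.84 in 0.8 mL.
total_mg = protein_yield_mg(a280=1.84, sample_volume_ml=0.8)
print(f"yield = {total_mg:.2f} mg, titer = {titer_mg_per_l(total_mg):.1f} mg/L")
```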
- SDS-PAGE analysis of the wash and elution fractions from the IMAC capture of the 2X pAMF, 3X pAMF, and 4X pAMF proteins expressed in the RF1+ strains revealed no prominent CRM197 band.
- CRM197 proteins were analyzed by intact LC-MS before and after strain-promoted azide-alkyne click conjugation with a small molecule dibenzylcyclooctyne (DBCO) amine. Protein concentrations were brought to 1 mg/mL in PBS. The DBCO-amine was added at a DBCO-amine to pAMF ratio of 3:1, and 500 mM NaCl was added to the reaction to improve DBCO-amine solubility. The conjugation reaction was incubated overnight at 30°C prior to intact LC-MS analysis.
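The 3:1 DBCO-amine to pAMF ratio above translates into a reagent concentration once the protein molar mass and the number of pAMF sites are known. The sketch below is illustrative: the 50,000 Da molar mass is a round placeholder rather than the actual mass of any CRM197 variant, and the function name is hypothetical.

```python
def dbco_concentration_um(protein_mg_per_ml: float,
                          protein_mass_da: float,
                          pamf_sites: int,
                          dbco_to_pamf_ratio: float = 3.0) -> float:
    """DBCO-amine concentration (micromolar) for a given DBCO:pAMF molar ratio."""
    if protein_mass_da <= 0 or pamf_sites < 0:
        raise ValueError("mass must be positive and site count non-negative")
    # mg/mL equals g/L, so g/L divided by g/mol gives mol/L.
    protein_molar = protein_mg_per_ml / protein_mass_da
    pamf_molar = protein_molar * pamf_sites
    return dbco_to_pamf_ratio * pamf_molar * 1e6


# Placeholder molar mass of 50 kDa, protein at 1 mg/mL, two pAMF sites.
print(dbco_concentration_um(1.0, 50_000.0, pamf_sites=2))
```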
- DBCO dibenzylcyclooctyne
- Intact LC-MS analysis and deconvolution were performed as described in Example 5.
- Intact LC-MS analysis revealed the presence of a peak at the expected theoretical mass for each CRM197 sample, showing that pAMF, and not another amino acid, had been incorporated at the site of each respective TAG codon.
- Each conjugated CRM197 protein sample showed the expected mass shift corresponding to its conjugation with 1-4 DBCO-amine molecules. No unconjugated protein could be detected, further showing that only pAMF was incorporated at the TAG codon(s).
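The expected mass shift can be estimated by adding the reagent mass once per pAMF site, since azide-alkyne cycloaddition is an addition reaction with no leaving group. In the sketch below, the default adduct mass of 276.34 Da is an assumed average mass for DBCO-amine (C18H16N2O) and should be checked against the reagent actually used; the unconjugated protein mass is a placeholder.

```python
def expected_conjugate_mass(protein_mass_da: float,
                            n_conjugated: int,
                            adduct_mass_da: float = 276.34) -> float:
    """Expected intact mass after n click conjugations (pure addition, no mass loss)."""
    if n_conjugated < 0:
        raise ValueError("number of conjugations must be non-negative")
    return protein_mass_da + n_conjugated * adduct_mass_da


unconjugated = 58_000.0  # placeholder mass in Da, not a measured value
for n in range(1, 5):    # 1-4 DBCO-amine molecules, as in the text
    shifted = expected_conjugate_mass(unconjugated, n)
    print(f"{n} x DBCO-amine: {shifted:.2f} Da (shift {shifted - unconjugated:.2f} Da)")
```

Comparing each deconvoluted peak with these expected values is one way to assign the number of conjugated sites.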
- the conjugate-to-protein ratio (CPR) was calculated by multiplying the theoretical CPR (i.e.
- Non-natural amino acid (NNAA)-containing IgG production requires the concurrent synthesis of Heavy Chain (HC) and Light Chain (LC) polypeptides, with the HC and/or the LC containing an NNAA.
- pJ411-HC-LC TAG will be generated by mutating codons to TAG at the desired NNAA incorporation sites in the HC and LC genes.
- These new genes for the HC and LC would be cloned into a single bicistronic operon with a T7 promoter and T7 terminator.
- This plasmid will have a high copy pUC origin of replication and contain the gene for Kanamycin resistance. Plasmid sequences will be verified by Sanger sequencing. Both the HC and LC will have independent ribosomal binding sites. To optimize the ratio of HC and LC, mutations could be made in the ribosomal binding site of either gene.
- Co-translational NNAA incorporation also requires expression of an amber suppressor tRNA orthogonal to existing aminoacyl tRNA synthetases (AAtRS) and an orthogonal AAtRS that specifically recognizes the amber suppressor tRNA and the NNAA.
- AAtRS amber suppressor tRNA orthogonal to existing aminoacyl tRNA synthetases
Abstract
This disclosure provides RF1-deficient E. coli cells having an oxidative cytoplasm and with a limited number of stop codon mutations in coding sequences of the genes that are essential for survival of E. coli cells. These mutations convert TAG stop codons to non-TAG stop codons so that translational termination can be properly executed by RF2. In some embodiments the RF1-deficient E. coli cells comprise an RF2 variant that has greater activity than the RF2 in the control cell.
Description
PATENT
Attorney Docket No. 091200-1423627-007110PC
Client Reference No. 0218WO
RF1 KO E. COLI STRAINS
REFERENCE TO RELATED APPLICATION
[0001] This application claims priority to U.S. Provisional Patent Application No. 63/456,770, filed April 3, 2023. The entire disclosure of said provisional application is herein incorporated by reference for all purposes.
BACKGROUND
[0002] Release Factor 1 (RF1) is a termination complex protein that facilitates translation termination by recognizing the amber codon, i.e., the TAG stop codon, in an mRNA molecule. The action of RF1 is in direct competition with non-natural amino acid (NNAA) incorporation in response to TAG stop codons. RF1 recognition of the amber stop codon can result in premature truncation products at the site of non-native amino acid incorporation and thus decrease protein yield. Therefore, deleting RF1 can promote NNAA incorporation into recombinant proteins.
[0003] In the past 20 years, efforts have been made to knock out the RF1 gene in E. coli cells with a reducing cytoplasm to produce proteins comprising NNAAs. However, significant cytotoxicity was observed in these E. coli cells. The first source of cytotoxicity is the cells’ inability to terminate the translation of proteins ending in TAG. It is possible to alleviate the toxicity by mutating the TAG stop codons to either TGA or TAA, such that Release Factor 2 (RF2), which remains present in the cell, can recognize TGA or TAA and terminate translation. However, this approach is labor-intensive and has yet to produce the desired outcome. Researchers reportedly used brute force to make an RF1 KO strain by mutating a large number (>50) or all of the TAG stop codons to either TGA or TAA. Still, the mutant E. coli cells grew significantly slower than the unmutated cells. See Table 1, Mukai et al., 2015, Scientific Reports 5: 9699 | DOI: 10.1038/srep09699.
Thus, this approach cannot meet the needs of commercial protein production.
[0004] The second source of toxicity in K12 strains results from the poor activity of RF2 in terminating translation at TAA codons. Researchers were able to delete RF1 from the genome of E. coli B strains but not K12 strains. The failure to delete the RF1 gene in K12 is due to a mutation (T246A) present in the K12 RF2 gene that interferes with the recognition of TAA. Repairing the mutation resulted in slower growth of the RF1-deficient cells. Thus, this approach also cannot meet the need for producing commercially relevant proteins in K12 strains. As such, there exists a need for an E. coli strain that can produce proteins comprising NNAAs with high yield.
DESCRIPTION OF THE DRAWINGS
[0005] FIG. 1 shows data of soluble lysates from strains SBDG711 and SBDG713 expressing cross-reacting material (CRM197). Each lane represents the lysate from a single colony. Experiments were performed in biological duplicate. 1X pAMF = CRM197 K25 pAMF, 2X pAMF = CRM197 K25/K215 pAMF, 3X pAMF = CRM197 K25/K215/K228 pAMF, 4X pAMF = CRM197 K25/K215/K228/K386 pAMF.
SUMMARY OF INVENTION
[0006] It is understood that the examples and embodiments described herein are for illustrative purposes only and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application and scope of the appended claims. All publications, sequence accession numbers, patents, and patent applications cited herein are hereby incorporated by reference in their entirety for all purposes.
[0007] In one aspect, provided herein is an RF1-deficient E. coli cell comprising at least one stop codon mutation from TAG to a non-TAG stop codon, a functional release factor 2 (RF2), and an oxidative cytoplasm, wherein the functional RF2 has greater RF2 activity than a control. In some embodiments, the number of stop codon mutations is no greater than 20, for example, between 2 and 10.
[0008] In some embodiments, at least one of the coding sequences selected from the group consisting of hda, lpxK, coaD, lolA, mreC, murF, and hemA comprises a non-TAG stop codon, and the cell has increased RF2 activity or expression as compared to a control E. coli cell.
In some embodiments, 2 to 7 of the coding sequences comprise non-TAG stop codons.
[0009] In some embodiments, the cell further comprises a gene encoding a protein of interest selected from an antibody, an antibody fragment, an antibody light chain, an antibody heavy chain, a cytokine, a cytokine fragment, or an immunogenic polypeptide. In some embodiments, the immunogenic polypeptide is a carrier protein. In some embodiments, the carrier protein comprises an amino acid sequence having at least 80% sequence identity to SEQ ID NO: 33. In some embodiments, the cytokine is selected from an interleukin, an interferon, a transforming growth factor, or a chemokine. In some embodiments, the protein of interest comprises one or more non-natural amino acids (NNAAs). In some embodiments, the one or more NNAAs is selected from the group consisting of p-acetyl-L-phenylalanine, O-methyl-L-tyrosine, an L-3-(2-naphthyl)alanine, 3-methyl-phenylalanine, O-4-allyl-L-tyrosine, 4-propyl-L-tyrosine, a tri-O-acetyl-GlcNAcβ-serine, L-Dopa, fluorinated phenylalanine, isopropyl-L-phenylalanine, p-azido-L-phenylalanine, p-azido-methyl-L-phenylalanine, p-acyl-L-phenylalanine, p-benzoyl-L-phenylalanine, L-phosphoserine, phosphonoserine, phosphonotyrosine, p-iodo-phenylalanine, p-bromophenylalanine, p-amino-L-phenylalanine, isopropyl-L-phenylalanine, and p-propargyloxy-phenylalanine.
[0010] In another aspect, provided herein is a kit comprising the RF1-deficient E. coli cell described above, and the kit further comprises a bacterial growth medium. In some embodiments, the kit further comprises a plasmid encoding a protein of interest. In some embodiments, the kit further comprises a plasmid encoding an aminoacyl-tRNA synthetase (RS) specific for pAMF and a tRNA specific for p-azidophenylalanine.
[0011] In another aspect, provided herein is a method for expressing a soluble, recombinant protein in an RF1-deficient E. coli bacterial cell comprising the steps of: culturing the RF1-deficient E. coli bacterial cell and an expression cassette for expressing the recombinant protein, wherein the coding sequences for one or more or all of hda, lpxK, coaD, lolA, mreC, murF, and hemA in the RF1-deficient E. coli cell comprise non-TAG stop codons, and wherein the RF1-deficient E. coli cell has increased RF2 activity or expression as compared to a control E. coli cell. In some embodiments, the RF1-deficient E. coli bacterial cell comprises an oxidative cytoplasm, which allows recombinant proteins, especially those comprising disulfide bonds, to be expressed as soluble proteins.
[0012] In another aspect, provided herein is a method for expressing a protein of interest comprising culturing the RF1-deficient E. coli bacterial cell disclosed above, wherein the RF1-deficient E. coli bacterial cell comprises an expression cassette comprising a coding sequence for the protein of interest. In some embodiments, the recombinant protein comprises one or more NNAAs.
[0013] In some embodiments, the stop codons are non-TAG stop codons due to genetic modifications of the stop codons in the wild-type coding sequences of hda, lpxK, coaD, lolA, mreC, murF, and hemA. In some embodiments, the coding sequences of hda, lpxK, coaD, lolA, mreC, murF, and hemA polypeptides comprise polynucleotide sequences of SEQ ID NO: 5, 7, 9, 11, 13, 15, and 17, respectively. In some embodiments, the RF1-deficient E. coli cell contains an oxidative cytoplasm. In some embodiments, the RF1-deficient E. coli cell comprises a stop codon mutation in one or more of sucB, atpE, fabH, and ubiF coding sequences. In some embodiments, the RF1-deficient E. coli cell further comprises a ΔfabR mutation. In some embodiments, the protein of interest is selected from an antibody, an antibody fragment, an antibody light chain, an antibody heavy chain, a cytokine, a cytokine fragment, or an immunogenic polypeptide. In some embodiments, the immunogenic polypeptide is a carrier protein.
DETAILED DESCRIPTION
INTRODUCTION
[0014] The present disclosure provides genetically engineered Release Factor 1 (RF1) knock-out (KO) Escherichia coli (E. coli) strains that have an oxidative cytoplasm. These RF1-deficient E. coli cells can produce proteins comprising NNAAs in high yield. In some embodiments, RF1-deficient E. coli cells comprise coding sequences in which the TAG stop codons have been mutated to non-TAG stop codons. In some embodiments, the coding sequences encode one or more proteins selected from hda, lpxK, coaD, lolA, mreC, murF, and hemA. In some embodiments, the E. coli cells have increased RF2 activity or expression as compared to a control E. coli cell. In some embodiments, the RF2 protein has a T246A mutation with reference to SEQ ID NO: 2, and said mutation confers the increased RF2 activity. In some embodiments, the E. coli cells are from the K12 strain.
[0015] Unlike previous attempts to delete RF1 from the genome, the methods and compositions disclosed in this application advantageously produce RF1-deficient E. coli cells that maintain a fast growth rate but with significantly fewer mutations. The approach repairs RF2 activity by producing an RF2 with the T246A mutation with reference to SEQ ID NO: 2. Unlike previous efforts in the art that used brute force to mutate many (over 50) or all TAGs, the methods of the present disclosure involve making only 7-12 stop codon mutations in essential genes (mutating TAG to TAA or TGA). The methods further comprise knocking in an NNAA aminoacyl-tRNA synthetase and tRNA, so that the amber suppressor tRNA can release ribosomes stalled at the TAG stop codon, and knocking out the fabR gene for enhanced growth of the RF1-deficient E. coli cells.
DEFINITION
[0016] A polynucleotide sequence is “operably linked” to another polynucleotide sequence when it is placed into a functional relationship with that other polynucleotide sequence. For example, a promoter or enhancer is operably linked to a coding sequence if it regulates the transcription of the sequence; or a ribosome binding site is operably linked to a coding sequence if positioned to facilitate translation. Generally, “operably linked” means that the polynucleotide sequences being linked are contiguous and, in the case of a secretory leader, contiguous and in the same open reading frame. Linking is accomplished by ligation at convenient restriction sites. If such sites do not exist, synthetic oligonucleotide adaptors or linkers are used in accordance with conventional practice.
[0017] The term “a non-TAG stop codon” refers to a trinucleotide at the 3′ terminus of a coding sequence that is not TAG. A non-TAG stop codon may be TAA or TGA.
[0018] The term “a control cell” refers to a cell from the same genetic background as the cell in which a gene of interest has been genetically modified, except that the control cell comprises the wild-type version of the gene of interest. For example, a control cell for an RF1-deficient E. coli cell from the K12 strain in which RF2 has been mutagenized is an RF1-deficient cell derived from the K12 strain in which the RF2 is the wild-type.
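The stop codon change described in this section (TAG to TAA or TGA at the 3′ terminus of a coding sequence) can be illustrated with a short Python sketch. The function name and the toy sequence are hypothetical; real genome edits would also need to account for overlapping genes and regulatory context, which this sketch ignores.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def replace_tag_stop(coding_seq: str, new_stop: str = "TAA") -> str:
    """Replace a terminal TAG stop codon with a non-TAG stop codon (TAA or TGA).

    Sequences that already end in TAA or TGA are returned unchanged.
    """
    seq = coding_seq.upper()
    if new_stop not in ("TAA", "TGA"):
        raise ValueError("a non-TAG stop codon must be TAA or TGA")
    if len(seq) % 3 != 0 or seq[-3:] not in STOP_CODONS:
        raise ValueError("sequence must be in frame and end with a stop codon")
    if seq[-3:] != "TAG":
        return seq
    return seq[:-3] + new_stop


# Toy coding sequence for illustration only: Met-Lys-amber stop.
print(replace_tag_stop("ATGAAATAG"))
print(replace_tag_stop("ATGAAATAG", new_stop="TGA"))
```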
[0019] The term “aminoacylation” or “aminoacylate” refers to the attachment of an amino acid to a tRNA, a process commonly referred to as charging a tRNA with its correct amino acid. Aminoacylation is typically a two-step process catalyzed by the aminoacyl-tRNA synthetases. The first step is the formation of an aminoacyl-AMP (aminoacyl-adenylate) on the enzyme through the hydrolysis of adenosine triphosphate (ATP). The second step is the transfer of the activated amino acid residue from the adenylate to a tRNA. As it pertains to this invention, a tRNA that undergoes aminoacylation or has been aminoacylated is one that has been charged with an amino acid, and an amino acid that undergoes aminoacylation or has been aminoacylated has been charged to a tRNA molecule.
[0020] The term “aminoacyl-tRNA synthetase,” “tRNA synthetase,” “synthetase,” “aaRS,” or “RS” refers to an enzyme that catalyzes the formation of a covalent linkage between an amino acid and a tRNA molecule. This results in an aminoacylated tRNA molecule, which is a tRNA molecule that has its respective amino acid attached via an ester bond.
[0021] The term “charged” in the context of tRNA refers to aminoacylation of a tRNA with an amino acid, both natural and non-natural, where the aminoacylation permits a ribosome to incorporate the amino acid into a polypeptide that is being translated from mRNA.
[0022] The term “biologically active adduct” refers to a molecule that can perform a function in a cell or an organism. For example, the function may include cell proliferation, apoptosis, post-translational modification (e.g., phosphorylation), cell signaling activation, cell signaling inactivation, cell death, cell labeling, etc.
[0023] The term “preferentially aminoacylates” refers to the preference of a tRNA synthetase to aminoacylate (charge) a particular tRNA molecule with a predetermined amino acid molecule compared to another amino acid molecule. In other words, the tRNA synthetase can selectively aminoacylate a non-natural amino acid (NNAA) over a naturally occurring amino acid; for example, the tRNA synthetase can aminoacylate a specific NNAA at a frequency of greater than 90%, e.g., 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100%, compared to any or all other natural amino acids.
[0024] The term “nucleic acid” or “polynucleotide” refers to polymers of deoxyribonucleotides (DNA) or ribonucleotides (RNA) in either single- or double-stranded form.
Unless specifically limited, the term encompasses nucleic acids containing known analogues of natural nucleic acids that have similar binding properties as the reference nucleic acid and are metabolized in a manner similar to naturally occurring nucleotides. Unless otherwise indicated, a particular nucleic acid sequence also implicitly encompasses conservatively modified variants thereof (e.g., degenerate codon substitutions), alleles, orthologs, SNPs, and complementary sequences as well as the sequence explicitly indicated. Specifically, degenerate codon substitutions may be achieved by
generating sequences in which the third position of one or more selected (or all) codons is substituted with mixed-base and/or deoxyinosine residues (Batzer et al., Nucleic Acid Res. 19:5081 (1991); Ohtsuka et al., J. Biol. Chem. 260:2605-2608 (1985); and Rossolini et al., Mol. Cell. Probes 8:91-98 (1994)). The terms “nucleic acid” and “polynucleotide” are used interchangeably with “gene,” “cDNA,” and “mRNA encoded by a gene.”

[0025] The terms “peptide,” “protein,” and “polypeptide” are used herein interchangeably and refer to a polymer of amino acid residues. The terms apply to amino acid polymers in which one or more amino acid residues are artificial chemical mimetics of corresponding naturally occurring amino acids, as well as to naturally occurring amino acid polymers and non-naturally occurring amino acid polymers. As used herein, the terms encompass amino acid chains of any length, including full-length proteins and truncated proteins, wherein the amino acid residues are linked by covalent peptide bonds.

[0026] The term “mutein” refers to a protein comprising an amino acid substitution relative to a wild-type or reference amino acid sequence.

[0027] The term “sequence identity” or “percent identity,” in the context of two or more nucleic acid or polypeptide sequences, refers to two or more sequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same, when compared and aligned for maximum correspondence over a comparison window, as measured using a sequence comparison algorithm, e.g., BLASTP. For purposes of this document, the percent identity is determined over the full-length wild-type sequence such as the reference sequence set forth in SEQ ID NO:1.
The method for calculating the sequence identity as provided herein is the BLASTP program having its defaults set at a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see, e.g., Henikoff and Henikoff, 1992, Proc Natl Acad Sci USA 89:10915). Exemplary determination of sequence alignment and % sequence identity can employ the BESTFIT or GAP programs in the GCG Wisconsin Software package (Accelrys, Madison WI), using default parameters provided.

[0028] The term “substitution at amino acid position” refers to an amino acid residue at a specific position of the amino acid sequence of a protein being replaced by another, different amino acid. For example, the term “X20Y” refers to the substitution of the wild-type (reference) amino acid X at position 20 of the protein with amino acid Y.
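The substitution notation of paragraph [0028] can be illustrated with a minimal Python sketch. The function names, the one-letter-code pattern, and the toy sequence in the usage note are hypothetical and chosen only for illustration; they are not part of the disclosure.

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    reference: str    # one-letter code of the wild-type (reference) residue
    position: int     # 1-based position in the reference sequence
    replacement: str  # one-letter code of the substituting residue


# Assumed label format: one letter, a position number, one letter (e.g. "T246A").
_LABEL = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label: str) -> Substitution:
    """Split a label such as 'X20Y' into reference residue, position, replacement."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    reference, position, replacement = match.groups()
    return Substitution(reference, int(position), replacement)


def apply_substitution(sequence: str, label: str) -> str:
    """Return a copy of `sequence` carrying the substitution named by `label`."""
    sub = parse_substitution(label)
    index = sub.position - 1  # convert the 1-based position to a 0-based index
    if not 0 <= index < len(sequence):
        raise ValueError(f"position {sub.position} is outside the sequence")
    if sequence[index] != sub.reference:
        raise ValueError(
            f"expected {sub.reference} at position {sub.position}, "
            f"found {sequence[index]}"
        )
    return sequence[:index] + sub.replacement + sequence[index + 1:]
```

For example, `parse_substitution("T246A")` yields reference residue T, position 246, and replacement A, and `apply_substitution("MKTAY", "T3A")` replaces the threonine at position 3 of the toy sequence with alanine. The check against the reference residue guards against applying a label to the wrong reference sequence.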
[0029] The term “functional variant” refers to a molecule (a polypeptide or polynucleotide) that contains mutations as compared to a reference molecule while retaining at least some of the biological activity of the reference molecule. The biological activity can be determined by comparing the activity, function and/or structure of the reference molecule expressed by the methods described herein to the activity of a reference molecule. For example, if the reference molecule is an IgG, a functional variant of the reference molecule comprises a properly folded and assembled IgG molecule. In some embodiments, the biological activity of the reference molecule and its variants can be determined using an in vitro or in vivo assay that is appropriate for the reference molecule. In some embodiments, the biological activity of a functional variant is at least 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95% or 99% of the activity of a reference protein when assessed using the same or a similar assay.

[0030] The term “antibody” refers to a protein functionally defined as a binding protein and structurally defined as comprising an amino acid sequence that is recognized by one of skill as being derived from the framework region of an immunoglobulin encoding gene of an animal producing antibodies. An antibody can consist of one or more polypeptides substantially encoded by immunoglobulin genes or fragments of immunoglobulin genes. The recognized immunoglobulin genes include the kappa, lambda, alpha, gamma, delta, epsilon, and mu constant region genes, as well as myriad immunoglobulin variable region genes. Light chains are classified as either kappa or lambda. Heavy chains are classified as gamma, mu, alpha, delta, or epsilon, which in turn define the immunoglobulin classes, IgG, IgM, IgA, IgD, and IgE, respectively.

[0031] A typical immunoglobulin (antibody) structural unit is known to comprise a tetramer.
Each tetramer is composed of two identical pairs of polypeptide chains, each pair having one “light” (about 25 kD) and one “heavy” chain (about 50-70 kD). The N-terminus of each chain defines a variable region of about 100 to 110 or more amino acids primarily responsible for antigen recognition. The terms variable light chain (VL) and variable heavy chain (VH) refer to these light and heavy chains respectively.

[0032] As used herein, the term “Fab fragment” refers to an antibody fragment that contains the portion of the full-length antibody that results from digestion of a full-length immunoglobulin with papain, or a fragment having the same structure that is produced synthetically, e.g., recombinantly. A Fab fragment contains a light chain (containing a variable (VL) and constant
(CL) region domain) and another chain containing a variable domain of a heavy chain (VH) and one constant region domain portion of the heavy chain (CH1).

[0033] As used herein, an F(ab')2 fragment is an antibody fragment that results from digestion of an immunoglobulin with pepsin at pH 4.0-4.5, or a synthetically, e.g., recombinantly, produced antibody having the same structure. The F(ab')2 fragment contains two Fab fragments, but each heavy chain portion contains additional amino acids, including cysteine residues that form disulfide linkages joining the two fragments.

[0034] The term “naturally occurring amino acid” or “natural amino acid” refers to any one of the 20 amino acids encoded by the genetic code, namely: arginine, Arg, R; histidine, His, H; lysine, Lys, K; aspartic acid, Asp, D; glutamic acid, Glu, E; serine, Ser, S; threonine, Thr, T; asparagine, Asn, N; glutamine, Gln, Q; cysteine, Cys, C; glycine, Gly, G; proline, Pro, P; alanine, Ala, A; isoleucine, Ile, I; leucine, Leu, L; methionine, Met, M; phenylalanine, Phe, F; tryptophan, Trp, W; tyrosine, Tyr, Y; and valine, Val, V.

[0035] A “null mutation” refers to a mutation in a gene that results in a non-functional gene. The null mutation can cause the complete lack of production of the associated gene product or the production of a product that lacks the function of the wild-type protein.

[0036] When a protein is referred to as in a “reduced state,” the protein is in a state having more electrons than its oxidized form.

[0037] The term “oxidative cytoplasm” refers to the cytosol of a cell in which a substrate is more likely to become oxidized than reduced.

[0038] As used herein, the term “recombinant nucleic acid” has its conventional meaning.
A recombinant nucleic acid, or equivalently, polynucleotide, is one that is inserted into a heterologous location such that it is not associated with nucleotide sequences that normally flank the nucleic acid as it is found in nature (for example, a nucleic acid inserted into a vector or a genome of a heterologous organism). Likewise, a nucleic acid sequence that does not appear in nature, for example, a variant of a naturally occurring gene, is recombinant. A cell containing a recombinant nucleic acid, or a protein expressed in vitro or in vivo from a recombinant nucleic acid, is also “recombinant.” Examples of recombinant nucleic acids include a protein-encoding DNA
sequence that is (i) operably linked to a heterologous promoter and/or (ii) encodes a fusion polypeptide with a protein sequence and a heterologous signal peptide sequence.

[0039] As used herein, “carrier protein” refers to a non-toxic or detoxified polypeptide containing a T-cell activating epitope which is able to be attached to an antigen (e.g., a polysaccharide) to enhance the humoral response to the conjugated antigen in a subject. The term includes any of the bacterial proteins used as epitope carriers in FDA-approved vaccines. In some embodiments, the carrier protein is Corynebacterium diphtheriae toxin, Clostridium tetani tetanospasmin, Haemophilus influenzae protein D (PD, HiD), outer membrane protein complex of serogroup B meningococcus (OMPC), CRM197, or malaria ookinete specific surface protein Pfs25. In another embodiment, the carrier protein is BB, derived from the G protein of Streptococcus strain G148.

[0040] As used herein, the term “immunogenic polypeptide” refers to a polypeptide comprising at least one T-cell activating epitope, wherein the T-cell epitope is derived from a protein capable of inducing immunologic memory in animals.

[0041] The term “T-cell activating epitope” refers to a structural unit of molecular structure which is capable of inducing T-cell immunity. The function of carrier proteins which include T-cell activating epitopes is well known and documented for conjugates. Without wishing to be bound by theory, a T-cell activating epitope in the carrier protein enables the covalently attached antigen to be processed by antigen-presenting cells and presented to CD4+ T cells to induce immunological memory against the antigen.

[0042] The term “cytokine” or “cytokines” refers to the general class of biological molecules which affect cells of the immune system. Exemplary cytokines include, but are not limited to, interferons and interleukins (IL), for example, IL-2, IL-12, IL-15, IL-18, and IL-21.

RF1-DEFICIENT E. COLI CELLS

RF1

[0043] During translation of mRNA, most codons are recognized by “charged” tRNA molecules, called aminoacyl-tRNAs because they are attached to specific amino acids corresponding to each tRNA's anticodon. In the standard genetic code, there are three mRNA stop codons: UAG (“amber”), UAA (“ochre”), and UGA (“opal” or “umber”). These codons are not decoded by
tRNAs but are recognized by release factors. Upon recognizing a stop codon, a release factor releases the newly synthesized peptide from the ribosome, thereby terminating translation. In E. coli, two release factors recognize stop codons: release factor 1 (RF1) and release factor 2 (RF2).

[0044] The E. coli RF1 (SEQ ID NO: 1) and RF2 (SEQ ID NO: 2) are structurally similar and have related but distinct functions. RF1 and RF2 share a universally conserved GGQ motif that interacts with the peptidyl transferase center of the ribosome to promote catalysis (Youngman et al., Annu. Rev. Microbiol. 62, 353–373 (2008)). Both RF1 and RF2 promote termination through induced-fit mechanisms. One main difference between the two release factors is that RF1 recognizes the UAA and UAG stop codons, while RF2 recognizes the UAA and UGA stop codons. As used herein, the term “stop codon” refers to a trinucleotide in a DNA or mRNA sequence that signals a halt to protein synthesis in the cell. For example, both TAG (DNA trinucleotide) and UAG (RNA trinucleotide) are referred to as stop codons. As discussed above, deleting RF1 is beneficial for production of NNAA-incorporated recombinant proteins. To maintain the translational termination activities required for growth and function of these RF1-deficient E. coli cells, the stop codons that are normally recognized only by RF1 need to be converted to stop codons that can be recognized by RF2; that is, the TAG stop codons need to be mutated to either TGA or TAA, both of which are recognized by RF2.

E. coli cells

[0045] The E. coli strain in this disclosure can be any E. coli strain known to one of skill in the art. In some embodiments, the E. coli strain is an A (K-12), B, C or D strain.

RF1-deficient E. coli cells

[0046] The E. coli cells disclosed herein are RF1-deficient. In some embodiments, RF1-deficient E.
coli cells are generated by knocking out, or introducing mutations into, the wild-type RF1 gene in the genome of the E. coli strain using methods that are well known in the art and as further described below; see the section below entitled “Methods of introducing mutations to E. coli.” Thus, an RF1-deficient E. coli cell may possess 50% or less, 40% or less, 30% or less, 20% or less, 15% or less, 10% or less, 5% or less, or 0% of the RF1 activity (catalyzing translational termination from a ribosomal complex stalled at the amber codon) of a control E. coli cell.
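The stop codon recoding requirement described in paragraph [0044] can be sketched in Python as follows. This is an illustration only, under stated assumptions: the function names are hypothetical, the coding sequence is given as a DNA string that is in frame and ends with its stop codon, and only the terminal codon is examined. It is not the genome editing method of the disclosure.

```python
# Stop codon recognition in E. coli as stated in paragraph [0044] (DNA alphabet).
RF1_CODONS = {"TAA", "TAG"}
RF2_CODONS = {"TAA", "TGA"}
STOP_CODONS = RF1_CODONS | RF2_CODONS


def release_factors_for(codon: str) -> set:
    """Return the release factors that recognize a DNA or RNA stop codon."""
    dna = codon.upper().replace("U", "T")
    if dna not in STOP_CODONS:
        raise ValueError(f"{codon!r} is not a stop codon")
    factors = set()
    if dna in RF1_CODONS:
        factors.add("RF1")
    if dna in RF2_CODONS:
        factors.add("RF2")
    return factors


def recode_amber_stop(cds: str, replacement: str = "TAA") -> str:
    """Replace a terminal TAG stop codon with an RF2-readable stop codon.

    A coding sequence that already ends in TAA or TGA is returned unchanged.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    if replacement not in RF2_CODONS:
        raise ValueError("replacement must be TAA or TGA")
    terminal = cds[-3:]
    if terminal not in STOP_CODONS:
        raise ValueError("coding sequence does not end with a stop codon")
    if terminal != "TAG":
        return cds
    return cds[:-3] + replacement
```

For example, `release_factors_for("UAG")` returns only RF1, which is why a TAG-terminated gene loses its termination signal in an RF1-deficient cell, and `recode_amber_stop("ATGAAATAG")` returns the toy sequence `ATGAAATAA`, which RF2 can terminate.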
E. coli strain with oxidative cytosol

[0047] The RF1-deficient E. coli strain disclosed herein contains an oxidative cytoplasm. In one embodiment, an E. coli cell is a Snuggle E. coli cell. In some embodiments, the E. coli cell is a Shuffle E. coli cell. Both types of E. coli cells are described in WO 2020/097385, the entire content of which is herein incorporated by reference.

[0048] In some embodiments, E. coli cells (e.g., RF1-deficient E. coli cells) having an oxidative cytoplasm can be selected based on their ability to support production of a polypeptide having one or more disulfide bonds. It is often difficult to produce these polypeptides in E. coli cells having a reductive cytoplasm. In contrast, E. coli cells having an oxidative cytoplasm as described above can facilitate the formation of disulfide bonds that are required for the proper folding and functioning of those polypeptides. Accordingly, in some embodiments, selecting the E. coli having an oxidative cytoplasm can be conducted by transforming the bacteria with a gene encoding a polypeptide (a “test” polypeptide, for example, an antibody light chain (LC)) normally containing at least one disulfide bond. As one illustrative example, a coding sequence for an LC protein, e.g., an anti-MUC1 antibody light chain (SEQ ID NO: 15 as disclosed in WO 2020/097385), can be engineered into an expression cassette under a suitable promoter and transformed into the candidate E. coli cells. The soluble protein fraction that contains the LC is measured. A suitable E. coli strain can be selected if it is able to express the LC in a soluble form at a level of at least 1 mg/100 mL. Methods for preparing a bacterial lysate and measuring the amount of protein expression (e.g., the expression of LC) in the lysate are well known. In some embodiments, the E. coli cells can be treated with a lysis agent to produce a lysate.
Cytoplasmic proteins can be released by treating the lysate with enzymes, such as benzonase and egg white lysozyme. The insoluble protein fraction can be separated from the soluble fraction by, e.g., centrifugation. The soluble protein fraction (containing the LC) can be collected and analyzed by SDS-PAGE. The amount of LC protein in the soluble protein fraction can then be quantified by, e.g., densitometry. Methods for selecting bacterial cells with oxidative cytoplasm are described in WO 2020/097385, the entire content of which is herein incorporated by reference.

Stop codon mutations: TAG stop codon → non-TAG stop codons

[0049] In some embodiments, an RF1-deficient E. coli cell having an oxidative cytoplasm comprises at least one stop codon mutation: TAG to non-TAG stop codon. In some embodiments,
the stop codon mutation is TAG to TAA. In some embodiments, RF1-deficient E. coli cells comprising one or more stop codon mutations are grown to produce a recombinant protein comprising an NNAA, as described herein. In some embodiments, at least one stop codon mutation is introduced to a coding sequence using TAG as the stop codon. Nonlimiting examples of such coding sequences include those that encode polypeptides hda, lpxK, coaD, lolA, mreC, murF, hemA, sucB, atpE, fabH, ubiF, or any combination thereof. In some embodiments, at least one stop codon mutation is introduced to one or more or all of the coding sequences encoding the polypeptides hda, lpxK, coaD, lolA, mreC, murF, and/or hemA. In some embodiments, at least one, at least two, at least three, at least four, at least five, at least six, or all of the coding sequences hda, lpxK, coaD, lolA, mreC, murF, and hemA comprise stop codon mutations and the coding sequences comprise TAA or TGA as stop codons instead of TAG. In some embodiments, the number of stop codon mutations in the coding sequences hda, lpxK, coaD, lolA, mreC, murF, and hemA is in a range from 1 to 7, from 2 to 6, from 3 to 7, from 4 to 7, from 5 to 7, or from 6 to 7. In some embodiments, all TAG codons in the coding sequences encoding polypeptides hda, lpxK, coaD, lolA, mreC, murF, and hemA have been mutated to TAA or TGA. In some embodiments, all TAG codons in the coding sequences encoding polypeptides hda, lpxK, coaD, lolA, mreC, murF, and hemA have been mutated to TAA. In some embodiments, the number of the coding sequences of hda, lpxK, coaD, lolA, mreC, murF, and hemA that comprise non-TAG stop codons is 2, 3, 4, 5, 6, or 7.

[0050] In some embodiments, an RF1-deficient E. coli cell comprises at least three, at least five, at least 10, at least 20, at least 30, at least 50, or at least 100 contiguous nucleotides starting from the 3-prime terminus of the coding sequence of one or more stop codon variants.
Said one or more stop codon variants include the hda stop codon variant (SEQ ID NO: 5), the lpxK stop codon variant (SEQ ID NO: 7), the coaD stop codon variant (SEQ ID NO: 9), the lolA stop codon variant (SEQ ID NO: 11), the mreC stop codon variant (SEQ ID NO: 13), the murF stop codon variant (SEQ ID NO: 15), the hemA stop codon variant (SEQ ID NO: 17), the sucB stop codon variant (SEQ ID NO: 19), the atpE stop codon variant (SEQ ID NO: 21), the fabH stop codon variant (SEQ ID NO: 23), and/or the ubiF stop codon variant (SEQ ID NO: 25). In some embodiments, the one or more stop codon variants encode one or more of the following polypeptides: hda (SEQ ID NO: 34), lpxK (SEQ ID NO: 35), coaD (SEQ ID NO: 36), lolA (SEQ ID NO: 37), mreC (SEQ ID NO: 38), murF (SEQ ID NO: 39), hemA (SEQ ID NO: 40), sucB (SEQ ID NO: 41), atpE
(SEQ ID NO: 42), and/or fabH (SEQ ID NO: 43). In some embodiments, the one or more stop codon variants encode one or more polypeptides that are functional variants of the above polypeptides. Such functional variants may have at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, or at least 99% amino acid sequence identity to one of hda (SEQ ID NO: 34), lpxK (SEQ ID NO: 35), coaD (SEQ ID NO: 36), lolA (SEQ ID NO: 37), mreC (SEQ ID NO: 38), murF (SEQ ID NO: 39), hemA (SEQ ID NO: 40), sucB (SEQ ID NO: 41), atpE (SEQ ID NO: 42), and fabH (SEQ ID NO: 43).

Mutations in RF2

[0051] In some embodiments, the RF1-deficient E. coli cell is derived from the K-12 strain and has increased RF2 activity as compared to a control cell. In some embodiments, an RF1-deficient E. coli cell disclosed herein comprises a T246X mutation as compared to the wild-type RF2 (SEQ ID NO: 2), wherein X represents any naturally occurring amino acid (e.g., any one of G, A, V, L, and I) or modified amino acid. In some embodiments, the T246X mutation is T246A. In some embodiments, the E. coli cell comprises an RF2 variant having the sequence of SEQ ID NO: 3. In some embodiments, the RF1-deficient E. coli cell is from a B10 strain; an E. coli cell from a B10 strain naturally contains the T246A substitution relative to SEQ ID NO: 2.

Other modifications

[0052] In some embodiments, the RF1-deficient E. coli cell further comprises a fabR null mutation (ΔfabR mutation) as described in Mukai et al., Sci. Rep. 2015; 5: 9699, doi: 10.1038/srep09699. The mutation improves E. coli cell growth and increases the production efficiency of a protein of interest. The fabR null mutation was introduced using homologous recombination with lambda red recombinase. The fabR gene was replaced with a linear piece of DNA that contained a selection marker flanked with arms homologous to the 5’ and 3’ regions of the fabR gene. See, Datsenko, K. A., & Wanner, B. L. (2000).
One-step inactivation of chromosomal genes in Escherichia coli K-12 using PCR products. Proceedings of the National Academy of Sciences of the United States of America, 97(12), 6640–6645. doi.org/10.1073/pnas.120163297.

Non-natural amino acids

[0053] The RF1-deficient E. coli cells can be used as host cells to produce non-natural amino acid (NNAA)-containing recombinant proteins. Suitable non-natural amino acids that can be
incorporated in the antibodies include, for example, those disclosed in U.S. Pat. No. 10,179,909; U.S. Pat. No. 9,938,516; U.S. Pat. No. 9,682,934; U.S. Pat. No. 10,596,270; and U.S. Pat. No. 10,610,571, the entire contents of which are herein incorporated by reference.

[0054] The non-natural amino acid may comprise a reactive group useful for forming a covalent bond to a linker or a biologically active adduct (also known as a payload), as described below. In certain embodiments, the reactive group is selected from the group consisting of amino, carboxy, acetyl, hydrazino, hydrazido, semicarbazido, sulfanyl, azido and alkynyl. The non-natural amino acids may be L-amino acids, or D-amino acids, or racemic amino acids. In certain embodiments, the non-natural amino acids described herein include D-versions of the natural amino acids and racemic versions of the natural amino acids.

[0055] In certain embodiments, the non-natural amino acid is according to any of the following formulas:
[0056] In the above formulas, the wavy lines indicate bonds that connect to the remainder of the polypeptide chains of the antibodies. These non-natural amino acids can be incorporated into polypeptide chains just as natural amino acids are incorporated into the same polypeptide chains. In certain embodiments, the non-natural amino acids are incorporated into the polypeptide chain via amide bonds as indicated in the formulas. In the above formulas, R designates any functional group without limitation, so long as the amino acid residue is not identical to a natural amino acid residue. In certain embodiments, R can be a hydrophobic group, a hydrophilic group, a polar
group, an acidic group, a basic group, a chelating group, a reactive group, a therapeutic moiety or a labeling moiety. In the above formulas, each L represents a linker (e.g., a divalent linker), as further described below.

[0057] In some embodiments, the non-naturally encoded amino acids include side chain functional groups that react efficiently and selectively with functional groups not found in the 20 common amino acids (including but not limited to, azido, ketone, aldehyde and aminooxy groups) to form stable conjugates. For example, an antigen-binding polypeptide that includes a non-naturally encoded amino acid containing an azido functional group can be reacted with a polymer (including but not limited to, poly(ethylene glycol)) or, alternatively, a second polypeptide containing an alkyne moiety to form a stable conjugate resulting from the selective reaction of the azide and the alkyne functional groups to form a Huisgen [3+2] cycloaddition product. In some embodiments, a strong nucleophile (including, but not limited to, hydrazine, hydrazide, aminooxy, hydroxylamine, or semicarbazide) can be reacted with an aldehyde or ketone group present in a non-naturally encoded amino acid to form a hydrazone, oxime, or semicarbazone, as applicable, which in some cases can be further reduced by treatment with an appropriate reducing agent.

[0058] Exemplary non-naturally encoded amino acids that may be suitable for use in the present invention and that are useful for reactions with water soluble polymers include, but are not limited to, those with carbonyl, aminooxy, hydrazine, hydrazide, semicarbazide, azide and alkyne reactive groups. In some embodiments, non-naturally encoded amino acids comprise a saccharide moiety. Examples of such amino acids include N-acetyl-L-glucosaminyl-L-serine, N-acetyl-L-galactosaminyl-L-serine, N-acetyl-L-glucosaminyl-L-threonine, N-acetyl-L-glucosaminyl-L-asparagine and O-mannosaminyl-L-serine.
Examples of such amino acids also include examples where the naturally occurring N- or O-linkage between the amino acid and the saccharide is replaced by a covalent linkage not commonly found in nature, including but not limited to, an alkene, an oxime, a thioether, an amide and the like. Examples of such amino acids also include amino acids linked to saccharides that are not commonly found in naturally occurring proteins, such as 2-deoxy-glucose, 2-deoxygalactose and the like.

[0059] Many of the non-natural amino acids suitable for use in the present invention are commercially available, e.g., from Sigma (USA) or Aldrich (Milwaukee, Wis., USA). Those that are not commercially available are optionally synthesized as provided herein or as provided in
various publications or using standard methods known to those of skill in the art. For organic synthesis techniques, see, e.g., Organic Chemistry by Fessenden and Fessenden (1982, Second Edition, Willard Grant Press, Boston Mass.); see also U.S. Patent Application Publications 2003/0082575 and 2003/0108885, which are incorporated by reference herein; Advanced Organic Chemistry by March (Third Edition, 1985, Wiley and Sons, New York); and Advanced Organic Chemistry by Carey and Sundberg (Third Edition, Parts A and B, 1990, Plenum Press, New York). Additional publications describing the synthesis of non-natural amino acids include, e.g., WO 2002/085923 entitled “In vivo incorporation of Non-natural Amino Acids;” Matsoukas et al., (1995) J. Med. Chem., 38, 4660-4669; King, F. E. & Kidd, D. A. A. (1949) A New Synthesis of Glutamine and of γ-Dipeptides of Glutamic Acid from Phthalylated Intermediates. J. Chem. Soc., 3315-3319; Friedman, O. M. & Chatterji, R. (1959) Synthesis of Derivatives of Glutamine as Model Substrates for Anti-Tumor Agents. J. Am. Chem. Soc. 81, 3750-3752; Craig, J. C. et al. (1988) Absolute Configuration of the Enantiomers of 7-Chloro-4 [[4-(diethylamino)-1-methylbutyl]amino]quinoline (Chloroquine). J. Org. Chem. 53, 1167-1170; Azoulay, M., Vilmont, M. & Frappier, F. (1991) Glutamine analogues as Potential Antimalarials, Eur. J. Med. Chem. 26, 201-5; Koskinen, A. M. P. & Rapoport, H. (1989) Synthesis of 4-Substituted Prolines as Conformationally Constrained Amino Acid Analogues. J. Org. Chem. 54, 1859-1866; Christie, B. D. & Rapoport, H. (1985) Synthesis of Optically Pure Pipecolates from L-Asparagine. Application to the Total Synthesis of (+)-Apovincamine through Amino Acid Decarbonylation and Iminium Ion Cyclization. J. Org. Chem. 1989:1859-1866; Barton et al., (1987) Synthesis of Novel α-Amino-Acids and Derivatives Using Radical Chemistry: Synthesis of L- and D-α-Amino-Adipic Acids, L-α-aminopimelic Acid and Appropriate Unsaturated Derivatives.
Tetrahedron Lett. 43:4297-4308; and, Subasinghe et al., (1992) Quisqualic acid analogues: synthesis of beta-heterocyclic 2-aminopropanoic acid derivatives and their activity at a novel quisqualate-sensitized site. J. Med. Chem. 35:4602-7. See also, patent applications entitled “Protein Arrays,” filed Dec. 22, 2003, Ser. No. 10/744,899 and Ser. No. 60/435,821 filed on Dec. 22, 2002.

[0060] Many non-natural amino acids are based on natural amino acids, such as tyrosine, glutamine, phenylalanine, and the like and are suitable for use in the present invention. Tyrosine analogs include, but are not limited to, para-substituted tyrosines, ortho-substituted tyrosines, and meta-substituted tyrosines, where the substituted tyrosine comprises, including but not limited to, a keto group (including but not limited to, an acetyl group), a benzoyl group, an amino group, a
hydrazine, a hydroxylamine, a thiol group, a carboxy group, an isopropyl group, a methyl group, a C6-C20 straight chain or branched hydrocarbon, a saturated or unsaturated hydrocarbon, an O-methyl group, a polyether group, a nitro group, an alkynyl group or the like. In addition, multiply substituted aryl rings are also contemplated. Glutamine analogs that may be suitable for use in the present invention include, but are not limited to, α-hydroxy derivatives, γ-substituted derivatives, cyclic derivatives, and amide substituted glutamine derivatives. Example phenylalanine analogs that may be suitable for use in the present invention include, but are not limited to, para-substituted phenylalanines, ortho-substituted phenylalanines, and meta-substituted phenylalanines, where the substituent comprises, including but not limited to, a hydroxy group, a methoxy group, a methyl group, an allyl group, an aldehyde, an azido, an iodo, a bromo, a keto group (including but not limited to, an acetyl group), a benzoyl, an alkynyl group, or the like. Specific examples of non-natural amino acids that may be suitable for use in the present invention include, but are not limited to, an azidoethoxycarbonyl lysine (AEK), a p-acetyl-L-phenylalanine, an O-methyl-L-tyrosine, an L-3-(2-naphthyl)alanine, a 3-methyl-phenylalanine, an O-4-allyl-L-tyrosine, a 4-propyl-L-tyrosine, a tri-O-acetyl-GlcNAcβ-serine, an L-Dopa, a fluorinated phenylalanine, an isopropyl-L-phenylalanine, a p-azido-L-phenylalanine, a p-azido-methyl-L-phenylalanine, a p-acyl-L-phenylalanine, a p-benzoyl-L-phenylalanine, an L-phosphoserine, a phosphonoserine, a phosphonotyrosine, a p-iodo-phenylalanine, a p-bromophenylalanine, a p-amino-L-phenylalanine, an isopropyl-L-phenylalanine, and a p-propargyloxy-phenylalanine, and the like.
Examples of structures of a variety of non-natural amino acids that may be suitable for use in the present invention are provided in, for example, WO 2002/085923 entitled “In vivo incorporation of non-natural amino acids.” See also Kiick et al., (2002) Incorporation of azides into recombinant proteins for chemoselective modification by the Staudinger ligation, PNAS 99:19-24, for additional methionine analogs.

[0061] Particular examples of useful non-natural amino acids include, but are not limited to, p-acetyl-L-phenylalanine, O-methyl-L-tyrosine, L-3-(2-naphthyl)alanine, 3-methyl-phenylalanine, O-4-allyl-L-tyrosine, 4-propyl-L-tyrosine, tri-O-acetyl-GlcNAcβ-serine, L-Dopa, fluorinated phenylalanine, isopropyl-L-phenylalanine, p-azido-methyl-L-phenylalanine, p-azido-L-phenylalanine, p-acyl-L-phenylalanine, p-benzoyl-L-phenylalanine, L-phosphoserine, phosphonoserine, phosphonotyrosine, p-iodo-phenylalanine, p-bromophenylalanine, p-amino-L-phenylalanine, isopropyl-L-phenylalanine, and p-propargyloxy-phenylalanine. Further useful
examples include N-acetyl-L-glucosaminyl-L-serine, N-acetyl-L-galactosaminyl-L-serine, N-acetyl-L-glucosaminyl-L-threonine, N-acetyl-L-glucosaminyl-L-asparagine and O-mannosaminyl-L-serine.

[0062] In particular embodiments, the non-natural amino acids are selected from p-acetyl-phenylalanine, p-ethynyl-phenylalanine, p-propargyloxyphenylalanine, p-azido-methyl-phenylalanine, and p-azido-phenylalanine. In one embodiment, the non-natural amino acid is p-azido-phenylalanine.

[0063] In certain embodiments, the first reactive group is an alkynyl moiety (including but not limited to, in the non-natural amino acid p-propargyloxyphenylalanine, where the propargyl group is also sometimes referred to as an acetylene moiety) and the second reactive group is an azido moiety, and [3+2] cycloaddition chemistry can be used. In certain embodiments, the first reactive group is the azido moiety (including but not limited to, in the non-natural amino acid p-azido-L-phenylalanine) and the second reactive group is the alkynyl moiety.
[0064] The non-natural amino acids used in the methods and compositions described herein have at least one of the following four properties: (1) at least one functional group on the sidechain of the non-natural amino acid has at least one characteristic and/or activity and/or reactivity orthogonal to the chemical reactivity of the 20 common, genetically encoded amino acids (i.e., alanine, arginine, asparagine, aspartic acid, cysteine, glutamine, glutamic acid, glycine, histidine, isoleucine, leucine, lysine, methionine, phenylalanine, proline, serine, threonine, tryptophan, tyrosine, and valine), or at least orthogonal to the chemical reactivity of the naturally occurring amino acids present in the polypeptide that includes the non-natural amino acid; (2) the introduced non-natural amino acids are substantially chemically inert toward the 20 common, genetically encoded amino acids; (3) the non-natural amino acid can be stably incorporated into a polypeptide, preferably with the stability commensurate with the naturally occurring amino acids or under typical physiological conditions, and further preferably such incorporation can occur via an in vivo system; and (4) the non-natural amino acid includes an oxime functional group or a functional group that can be transformed into an oxime group by reacting with a reagent, preferably under conditions that do not destroy the biological properties of the polypeptide that includes the non-natural amino acid (unless of course such a destruction of biological properties is the purpose of the modification/transformation), or where the transformation can occur under aqueous conditions
at a pH between about 4 and about 8, or where the reactive site on the non-natural amino acid is an electrophilic site. Any number of non-natural amino acids can be introduced into the polypeptide. Non-natural amino acids may also include protected or masked oximes or protected or masked groups that can be transformed into an oxime group after deprotection of the protected group or unmasking of the masked group. Non-natural amino acids may also include protected or masked carbonyl or dicarbonyl groups, which can be transformed into a carbonyl or dicarbonyl group after deprotection of the protected group or unmasking of the masked group and thereby are available to react with hydroxylamines or oximes to form oxime groups. [0065] In further embodiments, non-natural amino acids that may be used in the methods and compositions described herein include, but are not limited to, amino acids comprising a photoactivatable cross-linker, spin-labeled amino acids, fluorescent amino acids, metal binding amino acids, metal-containing amino acids, radioactive amino acids, amino acids with novel functional groups, amino acids that covalently or non-covalently interact with other molecules, photocaged and/or photoisomerizable amino acids, amino acids comprising biotin or a biotin analogue, glycosylated amino acids such as a sugar substituted serine, other carbohydrate modified amino acids, keto-containing amino acids, aldehyde-containing amino acids, amino acids comprising polyethylene glycol or other polyethers, heavy atom substituted amino acids, chemically cleavable and/or photocleavable amino acids, amino acids with elongated side chains as compared to natural amino acids (including but not limited to polyethers or long chain hydrocarbons, for example of greater than about 5 or greater than about 10 carbons), carbon-linked sugar-containing amino acids, redox-active amino acids, amino thioacid containing amino acids, and amino acids comprising one or more toxic 
moiety. [0066] In some embodiments, non-natural amino acids comprise a saccharide moiety. Examples of such amino acids include N-acetyl-L-glucosaminyl-L-serine, N-acetyl-L-galactosaminyl-L-serine, N-acetyl-L-glucosaminyl-L-threonine, N-acetyl-L-glucosaminyl-L-asparagine and O-mannosaminyl-L-serine. Examples of such amino acids also include examples where the naturally-occurring N- or O-linkage between the amino acid and the saccharide is replaced by a covalent linkage not commonly found in nature, including but not limited to, an alkene, an oxime, a thioether, an amide and the like. Examples of such amino acids also include saccharides that are
not commonly found in naturally-occurring proteins such as 2-deoxy-glucose, 2-deoxygalactose and the like. [0067] In particular embodiments, the non-natural amino acid is one selected from the group of non-natural amino acids shown in FIG. 8A-8D of WO 2021/222719, the entire disclosure of which is herein incorporated by reference. Such non-natural amino acids may be in the form of a salt or may be incorporated into a non-natural amino acid polypeptide, polymer, polysaccharide, or a polynucleotide and optionally post-translationally modified. [0068] In some embodiments, the non-natural amino acid is para-azidomethyl-L-phenylalanine (pAMF), azidoethoxycarbonyl lysine (AEK), or p-acetyl-L-phenylalanine (pAcF). In some embodiments, the non-natural amino acid is (S)-2-amino-3-(5-(6-methyl-1,2,4,5-tetrazin-3-ylamino)pyridin-3-yl)propanoic acid. Incorporation of non-natural amino acids [0069] Methods for incorporating non-natural amino acids into a protein of interest for production in the RF1-deficient E. coli cells described herein are well known, e.g., as described in U.S. Pat. No. 9,988,619 and U.S. Pat. No. 9,938,516, the entire contents of which are herein incorporated by reference. [0070] In one approach, the coding sequence of the protein of interest is modified to contain at least one non-natural amino acid codon. The non-natural amino acid codon is one that does not result in the incorporation of any of the 20 natural amino acids. In some embodiments, the non-natural amino acid codon is an amber, opal, or ochre stop codon, which is repurposed so that, instead of terminating translation, it is decoded by a cognate tRNA that has been charged with a non-natural amino acid by a tRNA synthetase. As illustrated in Examples 2 and 8, in some cases, one or more codons encoding one or more natural amino acids at the desired NNAA incorporation sites are mutated to one or more TAG codons, which are then repurposed to incorporate one or more NNAAs. 
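As a purely illustrative sketch of the codon replacement described in paragraph [0070], the following Python function swaps the codons at chosen residue positions of a coding sequence for the amber codon TAG. The function name, its arguments, and the toy sequence are hypothetical and are not taken from the Examples; the sketch assumes an in-frame DNA coding sequence and 1-based residue numbering.

```python
def introduce_amber_codons(cds: str, positions: list[int]) -> str:
    """Return a copy of `cds` with the codon at each 1-based residue
    position replaced by the amber codon TAG (hypothetical helper)."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    # Split the coding sequence into in-frame codons.
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    for pos in positions:
        if not 1 <= pos <= len(codons):
            raise ValueError(f"residue position {pos} is outside the sequence")
        codons[pos - 1] = "TAG"
    return "".join(codons)


# Toy example: replace residue 2 (AAA, lysine) of a four-codon sequence.
print(introduce_amber_codons("ATGAAAGGCTAA", [2]))
```

In practice the edited sequence would still need to be expressed together with a suitable orthogonal synthetase/tRNA pair; the sketch covers only the sequence edit.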
[0071] A non-natural amino acid can be charged to a tRNA by a tRNA synthetase that preferentially aminoacylates the tRNA with the non-natural amino acid as compared to any of the 20 natural amino acids. tRNA synthetases having such function are known; for example, U.S. Pat. No. 9,938,516 discloses tRNA synthetases that selectively incorporate the non-natural amino acid para-azidomethyl-L-phenylalanine (pAMF). tRNA synthetases that can selectively incorporate other
non-natural amino acids, for example, azidoethoxycarbonyl lysine (AEK) or p-acetyl-L-phenylalanine (pAcF), are also well known; see, for example, Chen et al., Angew Chem Int Ed Engl. 2009; 48(22):4052-5 (doi: 10.1002/anie.200900683); and Li et al., Proc Natl Acad Sci U S A, 2003 Jan 7;100(1):56-61. The entire contents of said publications are herein incorporated by reference. Additional exemplary non-natural amino acids that can be incorporated into the antibodies disclosed herein include aralkyl, heterocyclyl, and heteroaralkyl, and lysine-derivative unnatural amino acids. In some embodiments, such non-natural amino acids comprise pyridyl, pyrazinyl, pyrazolyl, triazolyl, oxazolyl, thiazolyl, thiophenyl, or another heterocycle. Such amino acids in some embodiments comprise azides, tetrazines, or another chemical group capable of conjugation to a coupling partner, such as a water-soluble moiety. [0072] tRNA synthetases that are capable of selectively incorporating a non-natural amino acid may also be obtained by genetically modifying a wild type tRNA synthetase to produce mutant tRNA synthetases. Each of these mutant tRNA synthetases can then be tested for its activity in selectively incorporating the non-natural amino acid using a reporter gene that contains the desired non-natural amino acid codon. The activity of the mutant tRNA synthetase in the presence of the non-natural amino acid (e.g., pAMF) as compared to the 20 common naturally occurring amino acids can be measured by detecting the presence or absence of the reporter protein. One exemplary method of generating mutant tRNA synthetases for incorporating non-natural amino acids is disclosed in U.S. Pat. No. 9,938,516, the entire content of which is herein incorporated by reference. [0073] In some embodiments, the non-natural amino acid codon is a synthetic codon, and the unnatural amino acid is incorporated into a protein (e.g., an antibody) using an orthogonal synthetase/tRNA pair. 
The orthogonal synthetase may be a synthetase that is modified from any of the natural amino acid synthetases. For example, the orthogonal synthetase may be a modified proline synthetase, a modified serine synthetase, a modified tryptophan synthetase, or a modified phosphoserine synthetase. The orthogonal tRNA may also be modified from any of the natural amino acid tRNAs. For example, the orthogonal tRNA may be a modified alanine tRNA, a modified arginine tRNA, a modified aspartic acid tRNA, a modified cysteine tRNA, a modified glutamine tRNA, a modified glutamic acid tRNA, a modified glycine tRNA, a modified histidine tRNA, a modified leucine tRNA, a modified isoleucine tRNA, a modified lysine tRNA, a modified
methionine tRNA, a modified phenylalanine tRNA, a modified proline tRNA, a modified serine tRNA, a modified threonine tRNA, or a modified tryptophan tRNA. In some embodiments, the orthogonal tRNA is a modified tyrosine tRNA, a modified valine tRNA, or a modified phosphoserine tRNA. [0074] In some embodiments, the tRNA is encoded by nucleotides 1018-1163 of SEQ ID NO: 32. In some embodiments, the tRNA shares at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to a tRNA encoded by the polynucleotide sequence comprising or consisting of nucleotides 1018-1163 of SEQ ID NO: 32. In some embodiments, the tRNA synthetase is encoded by the polynucleotide sequence comprising or consisting of nucleotides 152-1072 of SEQ ID NO: 32. In some embodiments, the tRNA synthetase shares at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to a polypeptide encoded by the polynucleotide sequence comprising or consisting of nucleotides 152-1072 of SEQ ID NO: 32. Codon optimization [0075] Codon optimization may be used to increase the rate of translation of the protein of interest or to produce recombinant RNA transcripts having desirable properties, such as a longer half-life, as compared with transcripts produced using a non-optimized sequence. The protein coding sequences may be optimized to maximize expression efficiency in E. coli. In particular, when expressing a polynucleotide having a non-natural amino acid codon in E. coli, bases that are in the vicinity of the non-natural amino acid codon may affect the mRNA conformation in the P site in the ribosome and thus have an impact on the efficiency of incorporating the non-natural amino acid. Thus, it is desirable to optimize the codons for the amino acids in the vicinity of the non-natural amino acid codon to maximize the yield of the antibodies having the non-natural amino acids. In some embodiments, the codon that is immediately 3’ to the non-natural amino acid codon is optimized to maximize protein expression. 
The optimal codon can be selected by comparing the yield of proteins produced from expressing the coding sequences having different codons for the same amino acid in the same location. Coding sequences that produce the desired high yield are then selected. Proteins of interest [0076] The RF1-deficient E. coli cell can be used to produce a protein of interest, for example, a recombinant protein comprising one or more NNAAs as described above.
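The codon comparison described in paragraph [0075] can be sketched as follows, under stated assumptions. The sketch enumerates every synonymous codon for the residue immediately 3' to an in-frame TAG codon, producing one candidate coding sequence per codon, and then picks the codon with the highest measured yield. The function names and the yield values are hypothetical; yields would have to come from actual expression experiments.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def downstream_codon_variants(cds: str, amber_index: int) -> dict[str, str]:
    """Map each synonymous codon for the residue 3' of the TAG codon at
    0-based codon index `amber_index` to the resulting coding sequence."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    if not 0 <= amber_index < len(codons) - 1:
        raise ValueError("amber_index must leave room for a downstream codon")
    if codons[amber_index] != "TAG":
        raise ValueError("codon at amber_index is not TAG")
    nxt = amber_index + 1
    residue = CODON_TABLE[codons[nxt]]
    if residue == "*":
        raise ValueError("downstream codon is a stop codon")
    variants = {}
    for codon, aa in CODON_TABLE.items():
        if aa == residue:  # same amino acid, possibly different codon
            variants[codon] = "".join(codons[:nxt] + [codon] + codons[nxt + 1:])
    return variants


def best_codon(measured_yields: dict[str, float]) -> str:
    """Return the codon whose construct gave the highest measured yield."""
    return max(measured_yields, key=measured_yields.get)


variants = downstream_codon_variants("ATGTAGGGCTAA", 1)
for codon, sequence in sorted(variants.items()):
    print(codon, sequence)
```

Only the protein sequence is held constant here; any effect of a given codon on incorporation efficiency is an empirical question that the sketch does not attempt to predict.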
[0077] The protein of interest can be a eukaryotic or prokaryotic protein, a viral protein, or a plant protein. In some embodiments, the protein of interest is of mammalian origin, including murine, bovine, ovine, feline, porcine, canine, goat, equine, and primate origin. In some embodiments, the protein of interest is of human origin. [0078] In some embodiments, the protein of interest is an antibody, such as a single chain antibody, a fragment of an antibody, or an antibody consisting of multiple polypeptide chains. In some embodiments, the antibody is an IgA, an IgD, an IgE, an IgG, or an IgM antibody. In some embodiments, the protein of interest is a light chain or heavy chain of an antibody. In some embodiments, the protein of interest is an scFv. In some embodiments, the protein of interest is a Fab fragment. In some embodiments, the protein of interest is a monoclonal antibody. In some embodiments, the antibody is a humanized antibody or a human antibody. [0079] Furthermore, in some embodiments, an antibody of the disclosure may be chemically modified (e.g., one or more chemical moieties can be attached to the antibody) or be modified, e.g., produced in cell lines and/or in cell culture conditions to alter its glycosylation (e.g., hypofucosylation, afucosylation, or increased sialylation) to alter one or more functional properties of the antibody. For example, the antibody can be linked to one of a variety of polymers, for example, polyethylene glycol. In some embodiments, the antibody is aglycosylated. In some embodiments, an antibody may comprise mutations to facilitate linkage to a chemical moiety and/or to alter residues that are subject to post-translational modifications, e.g., glycosylation. [0080] In some embodiments, the Fc region of the antibody contains no fucose (i.e., the Fc region is afucosylated). 
Afucosylated antibodies can be produced using cell lines that express a heterologous enzyme that depletes the fucose pool inside the cell (e.g., GlymaxX® by ProBioGen AG, Berlin, Germany). Non-fucosylated antibodies can also be produced using a host cell line in which the endogenous α-1,6-fucosyltransferase (FUT8) gene is deleted. See Satoh, M. et al., “Non-fucosylated therapeutic antibodies as next-generation therapeutic antibodies,” Expert Opinion on Biological Therapy, 6:11, 1161-1173, DOI: 10.1517/14712598.6.11.1161. [0081] Antibodies produced using the methods in this disclosure can be conjugated to a biologically active adduct (also known as a payload) using a chemical reaction such as click chemistry. In some cases, the antibody comprises one or more non-natural amino acids (as described above) at specific sites in the protein sequence, and the biologically active adduct can be conjugated to
these non-natural amino acids. Once an antibody containing the non-natural amino acids at the desired amino acid locations has been obtained, a biologically active adduct can be conjugated to the non-natural amino acid using a chemical reaction such as click chemistry. For instance, the pAMF-containing antibody produced using the methods disclosed herein can be purified by standard procedures. Then, the purified protein is subjected to a click chemistry reaction (e.g., a copper(I)-catalyzed azide-alkyne 1,3-cycloaddition reaction or a copper-free azide-alkyne 1,3-cycloaddition reaction) to directly conjugate a biologically active adduct to the pAMF residue. [0082] Exemplary biologically active adducts for use in the present invention include, but are not limited to, small molecules, oligonucleotides, peptides, amino acids, nucleic acids, sugars, oligosaccharides, polymers, synthetic polymers, chelators, fluorophores, chromophores, other detectable agents, drug moieties, cytotoxic agents, and the like. [0083] In some embodiments, the protein of interest is an immunogenic polypeptide. In some embodiments, the immunogenic polypeptide is a carrier protein. In some embodiments, a carrier protein disclosed herein comprises at least one T-cell activating epitope. In some embodiments, the T-cell activating epitope is from a protein selected from the group consisting of Corynebacterium diphtheriae toxin, Clostridium tetani tetanospasmin, Haemophilus influenzae protein D (PD, HiD), outer membrane protein complex of serogroup B meningococcus (OMPC), and CRM197. [0084] In some embodiments, a carrier protein comprises a polypeptide that can be conjugated to an antigen to provide a T-cell dependent immune response. In some embodiments, the antigen comprises a T-cell independent antigen selected from the group consisting of a hapten, a bacterial capsular polysaccharide, a bacterial lipopolysaccharide, or a tumor-derived glycan. 
In another embodiment, the antigen comprises a bacterial non-capsular polysaccharide, such as an exopolysaccharide, e.g., the S. aureus exopolysaccharide. In another embodiment, the antigen is a bacterial polysaccharide and the bacterium is selected from the group consisting of Streptococcus pneumoniae, Neisseria meningitidis, Haemophilus influenzae (e.g., Hib), Streptococcus pyogenes, and Streptococcus agalactiae. In another embodiment, at least one of the non-natural amino acids is selected from the group consisting of 2-amino-3-(4-azidophenyl)propanoic acid (pAF), 2-amino-3-(4-(azidomethyl)phenyl)propanoic acid (pAMF), 2-amino-3-(5-(azidomethyl)pyridin-2-yl)propanoic acid, 2-amino-3-(4-(azidomethyl)pyridin-2-yl)propanoic acid, 2-amino-3-(6-(azidomethyl)pyridin-3-yl)propanoic acid, 2-amino-5-azidopentanoic acid, and 2-amino-3-(4-(azidomethyl)phenyl)propanoic acid. [0085] In some embodiments, a carrier protein comprises a polypeptide that comprises one or more NNAAs as disclosed above. In another embodiment, the carrier protein comprises at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, or at least 9 NNAA residues. In another embodiment, the non-natural amino acid is selected from the group consisting of 2-amino-3-(4-azidophenyl)propanoic acid (pAF), 2-amino-3-(4-(azidomethyl)phenyl)propanoic acid (pAMF), 2-amino-3-(5-(azidomethyl)pyridin-2-yl)propanoic acid, 2-amino-3-(4-(azidomethyl)pyridin-2-yl)propanoic acid, 2-amino-3-(6-(azidomethyl)pyridin-3-yl)propanoic acid, 2-amino-5-azidopentanoic acid, or 2-amino-3-(4-(azidomethyl)phenyl)propanoic acid, and any combination thereof. In some embodiments, a carrier protein comprises a polypeptide that comprises at least one NNAA, the NNAA comprising a bio-orthogonal reactive moiety through which the antigen is conjugated to the carrier protein. In another embodiment, the polypeptide comprises at least two non-natural amino acids comprising a bio-orthogonal reactive moiety through which the antigen is conjugated to the polypeptide. [0086] In some embodiments, a carrier protein disclosed herein comprises at least one T-cell activating epitope and at least one, and preferably at least two, NNAA, wherein the antigen is conjugated to the NNAA and further wherein the at least one NNAA is a 2,3-disubstituted propanoic acid bearing an amino substituent at the 2-position and an azido-containing substituent, a 1,2,4,5-tetrazinyl-containing substituent, or an ethynyl-containing substituent at the 3-position. 
[0087] In some embodiments, the carrier protein comprises an amino acid sequence having at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% amino acid sequence identity to CRM197 (SEQ ID NO: 33). In some embodiments, the carrier protein comprises one or more NNAAs. In some embodiments, the carrier protein comprises or consists of a polypeptide having an amino acid sequence of SEQ ID NO: 33. In some embodiments, at least one of K25, K34, K38, K40, K213, K215, K228, K245, K265, K386, K523 and K527 of SEQ ID NO: 33 is substituted by an NNAA. In some embodiments, at least one of K34, K213, K245, K265, K386, and K527 of SEQ ID NO: 33 is substituted by an NNAA. In some embodiments, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12 lysines selected from the group consisting of K25, K34, K38, K40, K213, K215, K228, K245,
K265, K386, K523 and K527 of SEQ ID NO: 33 are substituted by 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12 NNAA(s). In some embodiments, 1, 2, 3, 4, 5, or 6 lysines of K34, K213, K245, K265, K386, and K527 of SEQ ID NO: 33 are substituted by 1, 2, 3, 4, 5, or 6 NNAA(s). [0088] Additional examples of proteins of interest which can be produced include the following proteins: mammalian polypeptides including molecules such as, e.g., renin, growth hormone, receptors for hormones or growth factors; CD proteins such as CD3, CD4, CD8, and CD19; interleukins; interferons; T-cell receptors; surface membrane proteins; decay accelerating factor; viral antigen such as, for example, a portion of the AIDS envelope; transport proteins; homing receptors; addressins; regulatory proteins; antibodies; and fragments of any of the above-listed polypeptides. [0089] In certain embodiments, the protein of interest is a cytokine. In certain embodiments, the protein is selected from the group consisting of IL-1-like, IL-1α, IL-1β, IL-1RA, IL-18, IL-2, IL-4, IL-7, IL-9, IL-13, IL-15, IL-3, IL-5, IL-16, IL-17, IFN-α, IFN-β, IFN-γ, TNF, CD154, LT-β, TNF-α, TNF-β, 4-1BBL, APRIL, CD70, CD153, CD178, GITRL, LIGHT, OX40L, TALL-1, TRAIL, TWEAK, TRANCE, TGF-β, TGF-β1, TGF-β2, TGF-β3, Epo, Tpo, Flt-3L, SCF, M-CSF, and MSP. In certain embodiments, the protein is IL-1, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12, IL-13, IL-14, IL-15, IL-16, IL-17, IL-18, IL-19, IL-20, IL-21, IL-22, IL-23, IL-24, IL-25, IL-26, IL-27, IL-28, IL-29, IL-30, IL-31, IL-32, IL-33, IL-34, IL-35, or IL-36. In certain embodiments, the protein is IL-2. In certain embodiments, the protein is an interferon. In certain embodiments, the protein is interferon alpha. In certain embodiments, the protein is interferon beta. In certain embodiments, the protein is interferon gamma. In certain embodiments, the protein is a tumor necrosis factor. In certain embodiments, the protein is TNF alpha. 
In certain embodiments, the protein is TNF beta. In certain embodiments, the protein is a transforming growth factor. In certain embodiments, the protein is a chemokine. In certain embodiments, the protein is G-CSF. In certain embodiments, the protein is GM-CSF. In certain embodiments, the protein is erythropoietin. In certain embodiments, the protein is alpha-galactosidase A. In certain embodiments, the protein is tissue plasminogen activator. In certain embodiments, the protein is insulin. In certain embodiments, the protein is insulin-like growth factor. In certain embodiments, the protein is human growth hormone.
[0090] A protein produced by the methods and compositions disclosed herein can be used for one or more of the following purposes or effects: inhibiting the growth, infection or function of, or killing, infectious agents, including, without limitation, bacteria, viruses, fungi and other parasites; effecting (suppressing or enhancing) bodily characteristics, including, without limitation, height, weight, hair color, eye color, skin, fat to lean ratio or other tissue pigmentation, or organ or body part size or shape (such as, for example, breast augmentation or diminution, change in bone form or shape); effecting biorhythms or circadian cycles or rhythms; effecting the fertility of male or female subjects; effecting the metabolism, catabolism, anabolism, processing, utilization, storage or elimination of dietary fat, lipid, protein, carbohydrate, vitamins, minerals, cofactors or other nutritional factors or component(s); effecting behavioral characteristics, including, without limitation, appetite, libido, stress, cognition (including cognitive disorders), depression (including depressive disorders) and violent behaviors; providing analgesic effects or other pain-reducing effects; promoting differentiation and growth of embryonic stem cells in lineages other than hematopoietic lineages; hormonal or endocrine activity; in the case of enzymes, correcting deficiencies of the enzyme and treating deficiency-related diseases; treatment of hyperproliferative disorders (such as, for example, psoriasis); immunoglobulin-like activity (such as, the ability to bind antigens or complement); and the ability to act as an antigen in vaccine composition to raise an immune response against such protein or another material or entity which is cross-reactive with such protein. [0091] The polypeptides and proteins produced by the invention can be used for any purpose known to one of skill in the art. 
Preferred uses include medical uses, including diagnostic uses, prophylactic and therapeutic uses. For example, the proteins can be prepared for topical or other type of administration. Another preferred medical use is for the preparation of vaccines. Accordingly, the proteins produced by the invention are solubilized or suspended in pharmacologically acceptable solutions to form pharmaceutical compositions for administration to a subject. Appropriate buffers for medical purposes and methods of administration of the pharmaceutical compositions are further set forth below. It will be understood by a person of skill in the art that medical compositions can also be administered to subjects other than humans, such as for veterinary purposes.
[0092] A protein of interest, such as an antibody, produced by the invention, including those incorporating non-natural amino acids, can be used for any purpose known to one of skill in the art. Preferred uses include medical uses, including diagnostic uses, prophylactic, and therapeutic uses. For example, the antibodies can be prepared for topical or other type of administration. Accordingly, the proteins produced by the invention are solubilized or suspended in pharmacologically acceptable solutions to form pharmaceutical compositions for administration to a subject. Appropriate buffers for medical purposes and methods of administration of the pharmaceutical compositions are further set forth below. It will be understood by a person of skill in the art that medical compositions can also be administered to subjects other than humans, such as for veterinary purposes. Methods General methods [0093] Unless defined otherwise, all technical and scientific terms used herein have the meaning commonly understood by one of ordinary skill in the art to which this invention belongs. Practitioners are particularly directed to Green, M.R., and Sambrook, J., eds., Molecular Cloning: A Laboratory Manual, 4th ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (2012), and Ausubel, F. M., et al., Current Protocols in Molecular Biology (Supplement 99), John Wiley & Sons, New York (2012), which are incorporated herein by reference, for definitions and terms of the art. Standard methods also appear in Bindereif, Schön, & Westhof (2005) Handbook of RNA Biochemistry, Wiley-VCH, Weinheim, Germany, which describes detailed methods for RNA manipulation and analysis and is incorporated herein by reference. Examples of appropriate molecular techniques for generating recombinant nucleic acids, and instructions sufficient to direct persons of skill through many cloning exercises are found in Green, M.R., and Sambrook, J., (Id.); Ausubel, F. 
M., et al., (Id.); Berger and Kimmel, Guide to Molecular Cloning Techniques, Methods in Enzymology (Volume 152 Academic Press, Inc., San Diego, Calif. 1987); and PCR Protocols: A Guide to Methods and Applications (Academic Press, San Diego, Calif.1990), which are incorporated by reference herein. [0094] Methods for protein purification, chromatography, electrophoresis, centrifugation, and crystallization are described in Coligan et al. (2000) Current Protocols in Protein Science, Vol.1, John Wiley and Sons, Inc., New York. Methods for cell-free synthesis are described in Spirin &
Swartz (2008) Cell-free Protein Synthesis, Wiley-VCH, Weinheim, Germany. Methods for incorporation of non-native amino acids into proteins using cell-free synthesis are described in Shimizu et al. (2006) FEBS Journal, 273, 4133-4140. [0095] When the proteins described herein are referred to by name, it is understood that this includes proteins with similar functions and similar amino acid sequences. Thus, the proteins described herein include the wild-type prototype protein, as well as homologs, polymorphic variations and recombinantly created muteins. For example, the name “RF1” includes the wild-type prototype protein from E. coli (e.g., SEQ ID NO: 1), as well as homologs from other species, polymorphic variations and recombinantly created muteins. Proteins such as RF1 are defined as having similar functions if they have substantially the same biological activity as the wild-type protein when assessed using the same type of assay. The term “substantially the same biological activity” means that the activity of a protein is at least 70%, at least 80%, at least 85%, at least 90%, at least 92%, or at least 95% of that of the corresponding reference protein (e.g., the corresponding wild-type protein). Proteins are defined as homologs having similar amino acid sequences if each has an amino acid sequence with at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to the sequence of the corresponding prototype protein, such as hda, lpxK, coaD, lolA, mreC, murF, hemA, sucB, atpE, fabH, and ubiF. The sequence identity of a protein is determined using the BLASTP program with the default word length of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff and Henikoff, Proc. Natl. Acad. Sci. USA 89:10915-10919, 1992). 
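To make the identity thresholds in paragraph [0095] concrete, the following sketch computes percent identity from two sequences that have already been aligned, with "-" marking gaps, and compares the result with a chosen threshold. It is not a reimplementation of BLASTP and does not perform the alignment itself; the function names and the toy alignment are hypothetical, and identity is taken here as identical columns divided by alignment length.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over an existing pairwise alignment.

    Both strings must have the same length; '-' denotes a gap.
    Identity = identical (non-gap) columns / total alignment columns.
    """
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must be non-empty and equal in length")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 80.0) -> bool:
    """True if the aligned pair reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy alignment: five identical columns out of six.
print(round(percent_identity("MKT-LV", "MKTALV"), 2))
```

A real comparison would take the aligned strings from BLASTP output run with the parameters stated in the text.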
[0096] A conventional test to determine whether a protein homolog, polymorphic variant, or recombinant mutein is a protein having the function described herein is specific binding to polyclonal antibodies generated against the prototype protein. Typically, a specific or selective reaction will be at least twice background signal or noise and more typically more than 10 to 100 times background. For example, an hda protein includes a protein that binds to polyclonal antibodies generated against the prototype protein of SEQ ID NO: 4. [0097] Methods and conditions for culturing E. coli bacterial cells (e.g., the RF1-deficient E. coli cells) to express a recombinant protein are well known. See, for example, Rosano, G. and Ceccarelli, E., Front. Microbiol., 16 April 2014, Sec. Microbiology, Vol. 5 - 2014, available at doi.org/10.3389/fmicb.2014.00172.
Methods of introducing mutations to E. coli [0098] In some embodiments, the gene modifications, e.g., the knock-out of RF1, the stop codon mutations to the coding sequences of the genes of interest, or the gain of function point mutations in RF2, can be performed by site-specific recombination. Site-specific recombination uses enzymes that possess both endonuclease activity and ligase activity; these enzymes recognize a specific DNA sequence and replace it with another corresponding DNA sequence, see Yang W. and Mizuuchi K., Structure, 1997, Vol. 5, 1401-1406(9). Site-specific recombination systems are well known in the art, including, e.g., the Int/att system from bacteriophage λ, the Cre/LoxP system from P1 bacteriophage, and the FLP-FRT system from yeast. For instance, site-specific integration into bacterial chromosomes has been reported (see, e.g., Fukushige et al., Proc. Natl. Acad. Sci. 89, 7905-7907 (1992); Baubonis et al., Nucleic Acids Research 21, 2025-2029 (1993); Hasan et al., Gene 150, 51-56 (1994)). Specific deletions of chromosomal sequences and rearrangements have also been engineered, and excision of foreign DNA as a plasmid from λ vectors is presently possible (see, e.g., Barinaga, Science 265, 27-28 (1994); Sauer, Methods in Enzymol. 225, 890-900 (1993)). Cloning schemes have been generated so that recombination either reconstitutes or inactivates a functional transcription unit by either deletion or inversion of sequences between recombination sites (see, e.g., Odell et al., Plant Physiol. 106, 447-458 (1994); Gu et al., Cell 73, 1155-1164 (1993)). [0099] Genes encoding the Cre or Flp recombinases can be provided in trans under the control of either constitutive or inducible promoters, or purified recombinase can be introduced (see, e.g., Baubonis et al., supra; Dang et al., Develop. Genet. 13, 367-375 (1992); Chou et al., Genetics 131, 643-653 (1992); Morris et al., Nucleic Acids Res. 19, 5895-5900 (1991)). 
[0100] In some embodiments, the genomic manipulations disclosed herein are performed with a modified site-specific recombination protocol from Kirill A. Datsenko and Barry L. Wanner, Proc Natl Acad Sci U S A. 2000 Jun 6; 97(12): 6640–6645. In one embodiment, knocking out a gene, for example, RF1 or fabR, can be performed as follows. A PCR amplicon is generated comprising an antibiotic resistance gene flanked by two FRT sites and homology extensions, which are homologous to the two ends of the gene to be knocked out. After transforming cells with this PCR product, the gene to be knocked out is replaced by the antibiotic resistance gene through Red-mediated recombination in these flanking homology regions. After selection,
the resistance gene can be eliminated using a helper plasmid expressing the FLP recombinase, which acts on the directly repeated FRT (FLP recognition target) sites flanking the resistance gene. The Red and FLP helper plasmids can be cured simply by growth at 37 °C because they are temperature-sensitive replicons. Knocking in a gene, if needed, can be performed by standard molecular cloning techniques that are well known to one skilled in the art. [0101] In certain embodiments, the nucleic acid modification is introduced by a (modified) CRISPR/Cas complex or system. In certain embodiments, the CRISPR/Cas system or complex is a class 2 CRISPR/Cas system. In certain embodiments, said CRISPR/Cas system or complex is a type II, type V, or type VI CRISPR/Cas system or complex. The CRISPR/Cas system does not require the generation of customized proteins to target specific sequences; rather, a single Cas protein can be programmed by an RNA guide (gRNA) to recognize a specific nucleic acid target. In other words, the Cas enzyme protein can be recruited to a specific nucleic acid target locus (which may comprise or consist of RNA and/or DNA) of interest using said short RNA guide. 
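The knockout amplicon described in paragraph [0100] is made with primers that each carry a homology extension at the 5' end and a cassette priming site at the 3' end. The following is a minimal sketch of that primer layout, assuming a linear genome string, 0-based gene coordinates on the forward strand, and priming sequences supplied by the user. The function names, the 50-nt default extension, and the toy sequences are illustrative assumptions, not values taken from this disclosure.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence containing only A, C, G, T."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def knockout_primers(genome: str, gene_start: int, gene_end: int,
                     cassette_fwd_site: str, cassette_rev_site: str,
                     homology_len: int = 50) -> tuple[str, str]:
    """Build forward and reverse primers (5'->3') for a resistance cassette.

    gene_start / gene_end: 0-based, end-exclusive coordinates of the gene.
    cassette_fwd_site / cassette_rev_site: 5'->3' priming sequences that
    anneal to the two ends of the FRT-flanked cassette (user supplied).
    """
    genome = genome.upper()
    if not 0 <= gene_start < gene_end <= len(genome):
        raise ValueError("invalid gene coordinates")
    if gene_start < homology_len or gene_end + homology_len > len(genome):
        raise ValueError("not enough flanking sequence for homology arms")
    upstream = genome[gene_start - homology_len:gene_start]
    downstream = genome[gene_end:gene_end + homology_len]
    # Forward primer: upstream homology, then the cassette priming site.
    forward = upstream + cassette_fwd_site.upper()
    # Reverse primer: reverse complement of downstream homology, then site.
    reverse = reverse_complement(downstream) + cassette_rev_site.upper()
    return forward, reverse


toy_genome = "TTTTACGT" + "ATGGGGTAA" + "GACCAAAA"
print(knockout_primers(toy_genome, 8, 17, "AAAT", "CCCG", homology_len=4))
```

The sketch ignores genome circularity and genes on the reverse strand, both of which a real design tool would need to handle.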
[0102] In general, the CRISPR/Cas or CRISPR system, as used herein and in the foregoing documents, refers collectively to transcripts and other elements involved in the expression of or directing the activity of CRISPR-associated (“Cas”) genes, including sequences encoding a Cas gene and one or more of: a tracr (trans-activating CRISPR) sequence (e.g., tracrRNA or an active partial tracrRNA), a tracr-mate sequence (encompassing a “direct repeat” and a tracrRNA-processed partial direct repeat in the context of an endogenous CRISPR system), a guide sequence (also referred to as a “spacer” in the context of an endogenous CRISPR system), or “RNA(s)” as that term is used herein (e.g., RNA(s) to guide Cas, such as Cas9, e.g., CRISPR RNA and, where applicable, trans-activating (tracr) RNA or a single guide RNA (sgRNA) (chimeric RNA)), or other sequences and transcripts from a CRISPR locus. In general, a CRISPR system is characterized by elements that promote the formation of a CRISPR complex at the site of a target sequence (also referred to as a protospacer in the context of an endogenous CRISPR system). In the context of formation of a CRISPR complex, “target sequence” refers to a sequence to which a guide sequence is designed to have complementarity, where hybridization between a target sequence and a guide sequence promotes the formation of a CRISPR complex. A target sequence may comprise any polynucleotide, such as DNA or RNA polynucleotides.
[0105] In certain embodiments, the gRNA is a chimeric guide RNA or single guide RNA (sgRNA). In certain embodiments, the gRNA comprises a guide sequence and a tracr mate sequence (or direct repeat). In certain embodiments, the gRNA comprises a guide sequence, a tracr mate sequence (or direct repeat), and a tracr sequence. In certain embodiments, the CRISPR/Cas system or complex as described herein does not comprise and/or does not rely on the presence of a tracr sequence (e.g., if the Cas protein is Cas12a).
[0106] The Cas protein as referred to herein, such as but not limited to Cas9, Cas12a (formerly referred to as Cpf1), Cas12b (formerly referred to as C2c1), Cas13a (formerly referred to as C2c2), C2c3, or Cas13b protein, may originate from any suitable source and hence may include different orthologues, originating from a variety of (prokaryotic) organisms, as is well documented in the art. In certain embodiments, the Cas protein is (modified) Cas9, preferably (modified) Staphylococcus aureus Cas9 (SaCas9) or (modified) Streptococcus pyogenes Cas9 (SpCas9). In certain embodiments, the Cas protein is Cas12a, optionally from Acidaminococcus sp., such as Acidaminococcus sp. BV3L6 Cpf1 (AsCas12a), or Lachnospiraceae bacterium Cas12a, such as Lachnospiraceae bacterium MA2020 or Lachnospiraceae bacterium MD2006 (LbCas12a). See U.S. Pat. No. 10,669,540, incorporated herein by reference in its entirety. Alternatively, the Cas12a protein may be from Moraxella bovoculi AAX08_00205 [Mb2Cas12a] or Moraxella bovoculi AAX11_00205 [Mb3Cas12a]. See WO 2017/189308, incorporated herein by reference in its entirety. In certain embodiments, the Cas protein is (modified) C2c2, preferably Leptotrichia wadei C2c2 (LwC2c2) or Listeria newyorkensis FSL M6-0635 C2c2 (LbFSLC2c2). In certain embodiments, the (modified) Cas protein is C2c1. In certain embodiments, the (modified) Cas protein is C2c3. In certain embodiments, the (modified) Cas protein is Cas13b.
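To make the guide-sequence concept concrete for one of the Cas proteins named above, the following Python sketch lists candidate protospacers for SpCas9, which recognizes targets lying immediately 5' of an NGG protospacer-adjacent motif. It is a minimal sketch and not part of this disclosure: the function names and the 20-nt guide length default are assumptions, no off-target or efficiency scoring is attempted, and other Cas proteins (for example Cas12a) would need different PAM rules.

```python
import re


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T string."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def find_spcas9_protospacers(seq: str, guide_len: int = 20):
    """List (strand, start, protospacer) for every site followed by NGG.

    `start` is the 0-based index on the strand that was scanned; for the
    '-' strand this is an index into the reverse complement of `seq`.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidates are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{guide_len}}})[ACGT]GG)")
    hits = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(strand_seq):
            hits.append((strand, match.start(), match.group(1)))
    return hits
```

A call such as `find_spcas9_protospacers(target_region)` returns every candidate on both strands; selecting among them would still require the specificity checks that are routine in the art.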
Other Cas enzymes are available to a person skilled in the art. Methods of using a CRISPR/Cas system to eliminate gene expression are well known and are also described in, e.g., U.S. Pat. Pub. No. 2014/0170753, the disclosure of which is hereby incorporated by reference in its entirety.
[0103] Additional methods of knocking out a target gene include, but are not limited to, homologous recombination technology, transcription activator-like effector nuclease (TALEN) technology, and zinc-finger nuclease (ZFN) technology. These methods are also well known in the art.
Vectors and Promoters
[0104] A nucleic acid encoding a protein of interest as disclosed herein can be inserted into a replicable vector for expression in the E. coli under the control of a suitable prokaryotic promoter. Many vectors are available for this purpose, and one of skill in the art can readily select an appropriate vector. Besides the gene of interest, a vector typically comprises one or more of the following: a signal sequence, an origin of replication, one or more marker genes, and a promoter.
[0105] A promoter disclosed herein may comprise any appropriate promoter sequence suitable for a eukaryotic or prokaryotic host cell, which shows transcriptional activity, including mutant, truncated, and hybrid promoters, and may be obtained from polynucleotides encoding extracellular or intracellular polypeptides either endogenous (native) or heterologous (foreign) to the cell. The promoter may be a constitutive or an inducible promoter. In some embodiments, the promoter is a constitutive promoter. In some embodiments, the promoter is an inducible promoter. Suitable prokaryotic promoters useful for practice of this invention include, but are not limited to, the promoters of Pc0, PL59, MTL, ParaBAD, lac, T3, T7, lambda Pr'P1', trp, the spc ribosomal protein operon promoter Pspc, the β-lactamase gene promoter Pbla of plasmid pBR322, the PL promoter of phage λ, the replication control promoters PRNAI and PRNAII of plasmid pBR322, the P1 and P2 promoters of the rrnB ribosomal RNA operon, the tet promoter, and the pACYC promoter. Tetracycline-regulated transcriptional modulators and CMV promoters are described in WO 96/01313 and U.S. Pat. Nos. 5,168,062 and 5,385,839, the entire disclosures of which are incorporated herein by reference.
[0106] In some embodiments, the promoters may differ in strength, i.e., in the amount of transcript each can produce. A promoter can be a weak, medium-strength, or strong promoter.
The strength of a promoter can be measured as the amount of transcription of a gene product initiated at that promoter, relative to a suitable control. For constitutive promoters directing expression of a gene product in an expression construct, a suitable control could use the same expression construct, except that the “wild type” version of the promoter, or a promoter from a “housekeeping” gene, is used in place of the promoter to be tested.
[0107] In some embodiments, the promoter strength is determined by measuring the amount of transcripts from the promoter as compared to a control promoter. For example, host cells containing an expression construct with the promoter to be tested (“test host cells”) and control host cells containing a control expression construct can be grown in culture in replicates. The total RNA of the test host cells and the control host cells can be extracted and quantified by absorbance at 260 nm. cDNA can then be synthesized from equal amounts of total RNA from the test host cells and the control host cells. RT-PCR can be performed to amplify the cDNA corresponding to the transcript produced from the promoter. An exemplary method is described in De Mey et al. ("Promoter knock-in: a novel rational method for the fine tuning of genes", BMC Biotechnol. 2010 Mar 24; 10:26).
[0108] In some embodiments, the various transgenes are expressed in the E. coli under the control of promoters of different strengths to regulate proper production of the recombinant proteins. This is useful for maintaining an oxidative cytoplasm in the bacteria and maximizing protein yield. In some embodiments, the strong T7 promoter is used to drive the expression of the protein of interest to ensure maximal yield. In some embodiments, the E. coli strain expresses a recombinant T7 polymerase under the control of the paraBAD promoter, which allows tight regulation and control of the expression of the protein of interest, e.g., through the addition or absence of arabinose. See Guzman et al., J. Bacteriol. July 1995, 177(14): 4121-4130.
[0109] Optionally, clones of the E. coli carrying the desired modifications as disclosed herein can be selected by limiting dilution. Optionally, these clones can be sequenced to confirm that the desired mutations are present in various genes, e.g., the coding sequences for hda, lpxK, coaD, lolA, mreC, murF, and hemA. In some cases, whole genome sequencing can be performed to determine the location of the insertion or mutation in the chromosomes.
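The sequence confirmation mentioned in paragraph [0109] can be illustrated with a short Python sketch that checks whether each sequenced coding sequence still ends in the amber (TAG) stop codon, and that shows what a TAG-to-non-TAG recoding looks like at the sequence level. The function names are hypothetical, the default replacement codon TAA is an illustrative choice, and any sequences passed in would be the user's own sequencing results rather than anything from this disclosure.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def terminal_stop_codon(cds: str) -> str:
    """Return the stop codon that terminates an in-frame coding sequence."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of three")
    stop = cds[-3:]
    if stop not in STOP_CODONS:
        raise ValueError(f"sequence does not end in a stop codon: {stop}")
    return stop


def recode_amber_stop(cds: str, replacement: str = "TAA") -> str:
    """Replace a terminal TAG with a non-TAG stop codon; leave others as-is."""
    if replacement not in STOP_CODONS - {"TAG"}:
        raise ValueError("replacement must be TAA or TGA")
    cds = cds.upper()
    if terminal_stop_codon(cds) == "TAG":
        return cds[:-3] + replacement
    return cds


def genes_still_ending_in_tag(cds_by_gene: dict[str, str]) -> list[str]:
    """Names of genes whose sequenced coding sequence still ends in TAG."""
    return sorted(
        gene for gene, cds in cds_by_gene.items()
        if terminal_stop_codon(cds) == "TAG"
    )
```

An empty list from `genes_still_ending_in_tag` for the genes of interest would be consistent with all of their stop codons having been converted.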
Measuring the activities of the mutant proteins
[0110] In some embodiments, a mutation introduced in one or more of the genes, e.g., RF1 or fabR, does not abolish protein expression of RF1 or fabR, but results in a mutein that lacks the activity that the corresponding wild-type protein possesses, e.g., the activity of RF1 in catalyzing translational termination from a ribosomal complex stalled at the amber codon. In some embodiments, an RF1-deficient E. coli cell disclosed herein is produced using other genetic engineering methods to reduce the endogenous RF1 protein activity, including but not limited to, 1) replacing the endogenous RF1 promoter with a promoter with weaker promoter activity to reduce the transcription of the RF1 gene; or 2) replacing the endogenous RF1 ribosomal binding site with an attenuated ribosomal binding site to reduce RF1 translation. It is understood by one of skill in the art that knocking out a gene does not always require completely abolishing its activity, and may instead result in a mutein (e.g., an RF-1 mutein) that possesses 50% or less, 40% or less, 30% or less, 20% or less, 15% or less, 10% or less, 5% or less, or 0% of the activity of the corresponding wild-type protein (e.g., the wild-type RF-1 protein). The various muteins generated can be tested to confirm the extent of the loss of the activity of the wild-type protein. For example, each of the coding sequences for the muteins can be separately expressed in a host strain, and the muteins can be purified and tested for their activities as described below.
[0111] Likewise, in some embodiments, a mutation, such as a T246A substitution in the RF2 protein, confers increased RF2 activity, for example an increase in activity of at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, or at least 70% compared to the corresponding wild-type RF2 protein. The various mutant proteins (“muteins”) generated can be tested to confirm that they have increased activity relative to the wild-type RF2 protein. For example, each of the coding sequences for the muteins can be separately expressed in a host strain, and the muteins can be purified and tested for their activities as described below.
[0112] The activity of RF1 and RF2 can be assayed using a peptidyl-tRNA hydrolysis assay. This assay measures the rate at which release factors catalyze translational termination from a ribosomal complex stalled at different stop codons. An example can be seen in RNA (2010), 16:1623–1633. FabR activity can be tested using a gel shift assay that measures the binding of this transcriptional regulator to its cognate promoter DNA binding sequence. See, for instance, Mol Microbiol. 2011 April; 80(1): 195–218. doi:10.1111/j.1365-2958.2011.07564.x.
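The percentage thresholds in paragraphs [0110] and [0111] amount to simple ratios of a mutein's measured rate to the wild-type rate. The Python sketch below shows that arithmetic; the function names are hypothetical, the 20% default cutoff is just one of the listed values, and the rates would come from whatever assay is used (for example, the peptidyl-tRNA hydrolysis assay), measured in the same units for both proteins.

```python
def percent_of_wild_type(mutein_rate: float, wild_type_rate: float) -> float:
    """Mutein activity expressed as a percentage of wild-type activity."""
    if wild_type_rate <= 0:
        raise ValueError("wild-type rate must be positive")
    if mutein_rate < 0:
        raise ValueError("mutein rate cannot be negative")
    return 100.0 * mutein_rate / wild_type_rate


def is_activity_deficient(
    mutein_rate: float, wild_type_rate: float, max_percent: float = 20.0
) -> bool:
    """True if the mutein retains `max_percent` or less of wild-type activity."""
    return percent_of_wild_type(mutein_rate, wild_type_rate) <= max_percent


def percent_increase(mutein_rate: float, wild_type_rate: float) -> float:
    """Gain in activity over wild type, in percent (negative means a loss)."""
    return percent_of_wild_type(mutein_rate, wild_type_rate) - 100.0
```

For instance, a mutein measured at one quarter of the wild-type rate gives `percent_of_wild_type` of 25, which would satisfy a 30% cutoff but not a 20% cutoff.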
Measuring expression level
[0113] Various methods can be used to determine the expression level of the various modified genes in the E. coli, and/or to confirm whether a gene (for example, RF1) has been knocked out or inserted. For example, expression of a gene can be determined by conventional Northern blotting to quantitate the transcription of mRNA. Various labels may be employed, most commonly radioisotopes. However, other techniques may also be employed, such as using biotin-modified nucleotides for introduction into a polynucleotide. The biotin then serves as the site for binding to avidin or antibodies, which may be labeled with a wide variety of labels, such as radionuclides, fluorescers, enzymes, or the like.
[0114] In some embodiments, the expressed protein can be purified and quantified using gel electrophoresis (e.g., PAGE), Western analysis, or capillary electrophoresis (e.g., Caliper LabChip). Protein synthesis in cell-free translation reactions may be monitored by the incorporation of radiolabeled amino acids, typically 35S-labeled methionine or 14C-labeled leucine. Radiolabeled proteins can be visualized for molecular size and quantitated by autoradiography after electrophoresis, or isolated by immunoprecipitation. The incorporation of recombinant His tags affords another means of purification, i.e., purification by Ni2+ affinity column chromatography. Protein production from expression systems can be measured as soluble protein yield or by using an assay of enzymatic or binding activity.
[0115] In some embodiments, if the protein to be quantified possesses a defined biological activity, for example, enzymatic activity (such as alkaline phosphatase) or growth inhibition activity, the expression of the protein of interest can be confirmed by assaying its activity by incubating it with the proper substrates.
[0116] Similar methods can also be used to measure the expression level of a protein of interest (e.g., an NNAA-containing protein) in the RF1-deficient E. coli cells as disclosed herein.
Kits
[0117] This disclosure also provides kits that comprise RF1-deficient E. coli cells disclosed above and herein, a cell growth medium, a plasmid encoding a protein of interest as disclosed herein, and/or instructions for use. In some embodiments, the coding sequence for the protein of interest comprises a mutation that converts a sense codon to an amber, opal, or ochre stop codon, which can be decoded by a cognate tRNA that a tRNA synthetase has charged with an NNAA, instead of terminating translation.
[0118] In some embodiments, the kit further comprises one plasmid encoding an aminoacyl-tRNA synthetase (RS) that preferentially aminoacylates an NNAA and one plasmid encoding a tRNA that can be specifically charged with said NNAA. In some embodiments, the kit further comprises one plasmid encoding an aminoacyl-tRNA synthetase (RS) that preferentially aminoacylates pAMF and one plasmid encoding a tRNA that can be specifically charged with p-azidomethylphenylalanine. In some embodiments, the kit further comprises a plasmid encoding both an aminoacyl-tRNA synthetase (RS) that preferentially aminoacylates the NNAA (e.g., pAMF) and a tRNA that can be specifically charged with said NNAA (for example, p-azidomethylphenylalanine). In some embodiments, the plasmid is a multicistronic expression cassette comprising one copy of the RS and three copies of the tRNA, i.e., the plasmid uses a single promoter to produce one transcript encoding one copy of the RS and three copies of the tRNA.
[0119] In some embodiments, the kit may further comprise one or more reagents necessary for preparation of an RF1-deficient E. coli cell of the disclosure. Such a kit may comprise one or more reagents necessary for one or more of: 1) knocking out RF1, 2) introducing the gain-of-function mutation in RF2, 3) introducing one or more stop codon mutations to the coding sequence of hda, lpxK, coaD, lolA, mreC, murF, hemA, sucB, atpE, fabH, and ubiF, or 4) knocking out fabR of a host E. coli cell. A kit may comprise agents necessary for improving the growth of the modified E. coli cells as disclosed above.
Exemplary embodiments
[0120] This disclosure includes the following non-limiting embodiments:
[0121] Embodiment 1 is an RF1-deficient E. coli cell comprising at least one stop codon mutation from TAG to a non-TAG stop codon, a functional release factor 2 (RF2), and an oxidative cytoplasm, wherein the functional RF2 has greater RF2 activity than a control.
[0122] Embodiment 2 is the RF1-deficient E. coli cell of embodiment 1, wherein the number of stop codon mutations is no greater than 20.
[0123] Embodiment 3 is the RF1-deficient E. coli cell of embodiment 1, wherein the number of stop codon mutations is in the range of between 2 and 10.
[0124] Embodiment 4 is an RF1-deficient E. coli cell, in which at least one of the coding sequences selected from the group consisting of hda, lpxK, coaD, lolA, mreC, murF, and hemA comprises a non-TAG stop codon, and wherein the cell has increased RF2 activity or expression as compared to a control E. coli cell.
[0125] Embodiment 5 is the RF1-deficient E. coli cell of embodiment 4, wherein 2 to 7 of the coding sequences comprise non-TAG stop codons.
[0126] Embodiment 6 is the RF1-deficient E. coli cell of embodiment 4, wherein the non-TAG stop codon is due to a genetic modification of the stop codon in the corresponding wild-type coding sequence.
[0127] Embodiment 7 is the RF1-deficient E. coli cell of any one of embodiments 2-6, wherein the coding sequences of hda, lpxK, coaD, lolA, mreC, murF, and hemA comprise at least the last 10 nucleotides of SEQ ID NO: 5, 7, 9, 11, 13, 15, 17, respectively.
[0128] Embodiment 8 is the RF1-deficient E. coli cell of any of the above embodiments, wherein the stop codons of hda, lpxK, coaD, lolA, mreC, murF, and hemA comprise a non-TAG stop codon.
[0129] Embodiment 9 is the RF1-deficient E. coli cell of any of the previous embodiments, wherein the cell has an oxidative cytoplasm.
[0130] Embodiment 10 is the RF1-deficient E. coli cell of any of the previous embodiments, wherein the cell is a K-12 E. coli cell.
[0131] Embodiment 11 is the RF1-deficient E. coli cell of any of the previous embodiments, wherein the cell has increased expression of the RF2 polypeptide or increased transcription of the RF2 gene as compared to the control E. coli cell.
[0132] Embodiment 12 is the RF1-deficient E. coli cell of embodiment 11, wherein the RF2 comprises a T246X mutation as compared to SEQ ID NO: 2.
[0133] Embodiment 13 is the RF1-deficient E. coli cell of embodiment 12, wherein T246X is T246A.
[0134] Embodiment 14 is the RF1-deficient E. coli cell of any of the previous embodiments, wherein the cell further comprises a stop codon mutation in one or more of sucB, atpE, fabH, and ubiF coding sequences.
[0135] Embodiment 15 is the RF1-deficient E. coli cell of any of the previous embodiments, wherein the cell further comprises a ΔfabR mutation.
[0136] Embodiment 16 is the RF1-deficient E. coli cell of any of the previous embodiments, wherein the cell further expresses an aminoacyl-tRNA synthetase.
[0137] Embodiment 17 is the RF1-deficient E. coli cell of embodiment 13, wherein the aminoacyl-tRNA synthetase recognizes the TAG stop codon.
[0138] Embodiment 18 is the RF1-deficient E. coli cell of embodiment 16 or 17, wherein the aminoacyl-tRNA synthetase preferentially aminoacylates to a degree of greater than 90% a tRNA with a non-natural amino acid as compared to the 20 common naturally occurring amino acids.
[0139] Embodiment 19 is the RF1-deficient E. coli cell of embodiment 18, wherein the non-natural amino acid is para-azido-methyl-L-phenylalanine (pAMF).
[0140] Embodiment 20 is the RF1-deficient E. coli cell of any of the previous embodiments, wherein the cell further comprises a gene encoding a protein of interest.
[0141] Embodiment 21 is the RF1-deficient E. coli cell of embodiment 20, wherein the protein of interest is selected from an antibody, an antibody fragment, an antibody light chain, an antibody heavy chain, a cytokine, a cytokine fragment, and an immunogenic polypeptide.
[0142] Embodiment 22 is the RF1-deficient E. coli cell of embodiment 21, wherein the antibody is a monoclonal antibody.
[0143] Embodiment 23 is the RF1-deficient E. coli cell of embodiment 21 or 22, wherein the antibody is an IgA, an IgD, an IgE, an IgG, or an IgM.
[0144] Embodiment 24 is the RF1-deficient E. coli cell of any one of embodiments 21-23, wherein the antibody is humanized or human.
[0145] Embodiment 25 is the RF1-deficient E. coli cell of any one of embodiments 21-24, wherein the antibody is aglycosylated.
[0146] Embodiment 26 is the RF1-deficient E. coli cell of embodiment 21, wherein the antibody fragment is selected from an Fv fragment, a Fab fragment, a F(ab')2 fragment, a Fab' fragment, an scFv (sFv) fragment, and an scFv-Fc fragment.
[0147] Embodiment 27 is the RF1-deficient E. coli cell of embodiment 21, wherein the antibody light chain is a light chain of an anti-HER2 antibody.
[0148] Embodiment 28 is the RF1-deficient E. coli cell of embodiment 21, wherein the immunogenic polypeptide is a carrier protein.
[0149] Embodiment 29 is the RF1-deficient E. coli cell of embodiment 28, wherein the carrier protein comprises at least one T-cell activating epitope from a protein selected from the group consisting of Corynebacterium diphtheriae toxin, Clostridium tetani tetanospasmin, Haemophilus influenzae protein D, and CRM197.
[0150] Embodiment 30 is the RF1-deficient E. coli cell of embodiment 28 or 29, wherein the carrier protein comprises an amino acid sequence having at least 80% sequence identity to SEQ ID NO: 33.
[0151] Embodiment 31 is the RF1-deficient E. coli cell of embodiment 21, wherein the cytokine is selected from the group consisting of interleukins, interferons, transforming growth factors, and chemokines.
[0152] Embodiment 32 is the RF1-deficient E. coli cell of any one of embodiments 20-31, wherein the protein of interest comprises one or more non-natural amino acids (NNAAs).
[0153] Embodiment 33 is the RF1-deficient E. coli cell of embodiment 25, wherein the one or more NNAAs is selected from the group consisting of p-acetyl-L-phenylalanine, O-methyl-L-tyrosine, an L-3-(2-naphthyl)alanine, 3-methyl-phenylalanine, O-4-allyl-L-tyrosine, 4-propyl-L-tyrosine, a tri-O-acetyl-GlcNAcβ-serine, L-Dopa, fluorinated phenylalanine, isopropyl-L-phenylalanine, p-azido-L-phenylalanine, p-azido-methyl-L-phenylalanine, p-acyl-L-phenylalanine, p-benzoyl-L-phenylalanine, L-phosphoserine, phosphonoserine, phosphonotyrosine, p-iodo-phenylalanine, p-bromophenylalanine, p-amino-L-phenylalanine, isopropyl-L-phenylalanine, and p-propargyloxy-phenylalanine.
[0154] Embodiment 34 is the RF1-deficient E. coli cell of embodiment 32, wherein the one or more NNAAs is p-azido-methyl-L-phenylalanine.
[0155] Embodiment 35 is the RF1-deficient E. coli cell of embodiment 20, wherein the gene encoding the protein of interest is operably linked to an inducible promoter.
[0156] Embodiment 36 is the RF1-deficient E. coli cell of embodiment 35, wherein the inducible promoter is a T7 promoter.
[0157] Embodiment 37 is the RF1-deficient E. coli cell of any one of embodiments 1-36, wherein the RF1-deficient E. coli cell possesses 20% or less of the RF-1 activity as compared to a control E. coli cell.
[0158] Embodiment 38 is a kit comprising the RF1-deficient E. coli cell of any of embodiments 4-36, wherein the kit further comprises a bacterial growth medium.
[0159] Embodiment 39 is the kit of embodiment 37, wherein the kit further comprises a plasmid encoding a protein of interest.
[0160] Embodiment 40 is the kit of embodiment 38 or 39, wherein the kit further comprises a plasmid encoding an aminoacyl-tRNA synthetase (RS) specific for pAMF and a tRNA specific for p-azidophenylalanine.
[0161] Embodiment 41 is a method for expressing a soluble, recombinant protein in an RF1-deficient E. coli bacterial cell comprising the steps of: culturing the RF1-deficient E. coli bacterial cell and an expression cassette for expressing the recombinant protein, wherein the coding sequences for one or more or all of hda, lpxK, coaD, lolA, mreC, murF, and hemA in the RF1-deficient E. coli cell comprise non-TAG stop codons, and wherein the RF1-deficient E. coli cell has increased RF2 activity or expression as compared to a control E. coli cell.
[0162] Embodiment 42 is the method of embodiment 41, wherein the RF1-deficient E. coli bacterial cell comprises an oxidative cytoplasm.
[0163] Embodiment 43 is the method of embodiment 41, wherein the number of the coding sequences of hda, lpxK, coaD, lolA, mreC, murF, and hemA that comprise non-TAG stop codons is 2, 3, 4, 5, 6, or 7.
[0164] Embodiment 44 is a method for expressing a protein of interest comprising culturing the RF1-deficient E. coli bacterial cell of any one of embodiments 1-36, wherein the RF1-deficient E. coli bacterial cell comprises an expression cassette comprising a coding sequence for the protein of interest.
[0165] Embodiment 45 is the method of embodiment 41, wherein the recombinant protein comprises one or more NNAAs.
[0166] Embodiment 46 is the method of embodiment 41, wherein the stop codons are non-TAG stop codons due to genetic modifications of the stop codons in the wild-type coding sequences of hda, lpxK, coaD, lolA, mreC, murF, and hemA.
[0167] Embodiment 47 is the method of embodiment 41 or 46, wherein the coding sequences of hda, lpxK, coaD, lolA, mreC, murF, and hemA polypeptides comprise polynucleotide sequences of SEQ ID NO: 5, 7, 9, 11, 13, 15, and 17, respectively.
[0168] Embodiment 48 is the method of any one of embodiments 41-47, wherein the cell further comprises the aminoacyl-tRNA synthetase.
[0169] Embodiment 49 is the method of embodiment 48, wherein the aminoacyl-tRNA synthetase recognizes the TAG stop codon.
[0170] Embodiment 50 is the method of any one of embodiments 41-49, wherein the RF1-deficient E. coli cell contains an oxidative cytoplasm.
[0171] Embodiment 51 is the method of any one of embodiments 41-50, wherein the RF1-deficient E. coli cell is a K-12 cell.
[0172] Embodiment 52 is the method of any one of embodiments 41-51, wherein the RF1-deficient E. coli cell comprises a T246A mutation in the RF2 coding sequence.
[0173] Embodiment 53 is the method of any one of embodiments 41-52, wherein the RF1-deficient E. coli cell comprises a stop codon mutation in one or more of sucB, atpE, fabH, and ubiF coding sequences.
[0174] Embodiment 54 is the method of any one of embodiments 41-53, wherein the RF1-deficient E. coli cell further comprises a ΔfabR mutation.
[0175] Embodiment 55 is the method of any one of embodiments 41-54, wherein the RF1-deficient E. coli strain further expresses an aminoacyl-tRNA synthetase.
[0176] Embodiment 56 is the method of embodiment 55, wherein the aminoacyl-tRNA synthetase preferentially aminoacylates to a degree of greater than 90% a tRNA with a non-natural amino acid as compared to the twenty common naturally occurring amino acids.
[0177] Embodiment 57 is the method of any one of embodiments 55-56, wherein the aminoacyl-tRNA synthetase recognizes the TAG stop codon.
[0178] Embodiment 58 is the method of any one of embodiments 44-57, wherein the protein of interest is selected from an antibody, an antibody fragment, an antibody light chain, an antibody heavy chain, a cytokine, a cytokine fragment, and an immunogenic polypeptide.
[0179] Embodiment 59 is the method of embodiment 58, wherein the protein of interest is selected from an antibody, an antibody fragment, an antibody light chain, and an antibody heavy chain.
[0180] Embodiment 60 is the method of embodiment 58 or 59, wherein the antibody is a monoclonal antibody.
[0181] Embodiment 61 is the method of any one of embodiments 58-60, wherein the antibody is an IgA, an IgD, an IgE, an IgG, or an IgM.
[0182] Embodiment 62 is the method of any one of embodiments 58-61, wherein the antibody is humanized or human.
[0183] Embodiment 63 is the method of any one of embodiments 58-62, wherein the antibody is aglycosylated.
[0184] Embodiment 64 is the method of embodiment 58 or 59, wherein the antibody fragment is selected from an Fv fragment, a Fab fragment, a F(ab')2 fragment, a Fab' fragment, an scFv (sFv) fragment, and an scFv-Fc fragment.
[0185] Embodiment 65 is the method of embodiment 58, wherein the immunogenic polypeptide is a carrier protein.
[0186] Embodiment 66 is the method of embodiment 65, wherein the carrier protein comprises at least one T-cell activating epitope from a protein selected from the group consisting of Corynebacterium diphtheriae toxin, Clostridium tetani tetanospasmin, Haemophilus influenzae protein D, and CRM197.
[0187] Embodiment 67 is the method of embodiment 65, wherein the carrier protein comprises an amino acid sequence having at least 80% sequence identity to SEQ ID NO: 33.
[0188] Embodiment 68 is the method of embodiment 58, wherein the cytokine is selected from the group consisting of interleukins, interferons, transforming growth factors, and chemokines.
[0189] Embodiment 69 is the method of any one of embodiments 41-64, wherein the gene encoding the protein of interest is operably linked to an inducible promoter.
[0190] Embodiment 70 is the method of embodiment 69, wherein the inducible promoter is a T7 promoter.
[0191] It is understood that the examples and embodiments described herein are for illustrative purposes only, and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application and scope of the appended claims. All publications, sequence accession numbers, patents, and patent applications cited herein are hereby incorporated by reference in their entirety for all purposes.
EXAMPLES
Example 1. Production of Oxidizing RF1 KO Strain
[0192] Knock-ins, deletions, and site-directed mutagenesis were produced using lambda Red homologous recombination.
[0193] Table 1 shows the RF1 strains with oxidative cytoplasm produced during this work.
Table 1: RF1 strains with oxidative cytoplasm
Example 2. Production of plasmids for expression of pAMF-containing proteins in SBDG419 and RF1 KO derivatives
[0194] Production of a protein containing a non-natural amino acid (NNAA) requires the concurrent synthesis of the NNAA-containing protein, an amber suppressor tRNA, and a mutant aminoacyl-tRNA synthetase. To express trastuzumab LC with a non-natural amino acid at position E213, the plasmid pJ411 trastuzumab LC E213TAG was generated. To produce this plasmid, the LC gene was cloned into an operon with a T7 promoter and T7 terminator. The codon for residue 213 was mutated to TAG to specify the location of the NNAA. This plasmid has a high-copy pUC origin of replication and contains the gene for kanamycin resistance. Plasmid sequences were verified by cloning.
[0195] Cotranslational incorporation of p-azidomethylphenylalanine (pAMF) also requires expression of an amber suppressor tRNA orthogonal to natural E. coli aminoacyl-tRNA synthetases (AAtRS) and an orthogonal pAMF AAtRS that specifically recognizes the amber suppressor tRNA and the pAMF NNAA. To generate the RS/tRNA plasmid, the coding sequence for an aminoacyl-tRNA synthetase (RS) that preferentially aminoacylates pAMF and the coding sequence for a tRNA that can be specifically charged with pAMF were cloned into a multicistronic expression cassette. One copy of the RS coding sequence and three copies of the pAzF (p-azidophenylalanine) tRNA coding sequence were cloned in a dual promoter system consisting of an inducible T7 promoter followed by a constitutive Pc0 promoter. The vector had a medium copy number (p15a origin) with a β-lactamase selection marker. Both the origin and marker are compatible with pJ411. Plasmid sequences were verified by cloning.
Example 3. Shake flask production of trastuzumab light chain containing 1 pAMF non-natural amino acid in E. coli strains lacking release factor 1
[0196] To assess amber suppression efficiency, the expression of a trastuzumab light chain (LC) construct containing 1 amber (TAG) codon at position E213 (LC E213 TAG) was tested in ΔRF1 Snuggle E. coli strains. Three copies of an operon containing the aminoacyl-tRNA synthetase (RS) and tRNA specific for para-azidomethylphenylalanine (pAzMeF), driven by a strong constitutive promoter, were integrated into the genome of both the wild-type RF1 and the ΔRF1 Snuggle strain. This allowed for expression of LC E213 pAMF using a single plasmid with the LC E213TAG gene. To generate the product plasmid, the coding sequence for LC E213 TAG was cloned into a high-copy (pUC origin) plasmid with a kanamycin (Kan) selection cassette behind a T7p and strong RBS. A single copy of the para-azidophenylalanine tRNA was inserted after the coding sequence for LC E213 TAG and a short spacer sequence.
The E. coli strain for expression of LC E213 TAG was generated by transforming strain 711 or 713 (ΔRF1) with the product plasmid. Single colonies were grown in overnight seed cultures at 37 °C in Terrific Broth (TB) containing 50 µg/mL kanamycin (TB + Kan). The next day, seed cultures were diluted 1:40 into larger expression cultures (25-250 mL) of fresh TB + Kan and grown at 37 °C until they reached an OD600 of 1.2-1.5. At that time, protein expression was induced by adding arabinose to a final concentration of 0.2%, pAzMeF was added to a final concentration of 4 mM, and the temperature was lowered to 25 °C. Cultures were allowed to express for 18-20 hours. Cells were then harvested by centrifugation at 6,000 × g for 10 minutes. Cells were resuspended in bacterial protein extraction reagent (B-PER) + 0.2 mg/mL lysozyme (lysis buffer) at a ratio of 1 g dry cell weight per mL lysis buffer, incubated on ice for 20-30 minutes, then sonicated to lyse. Lysates were centrifuged at 30,000 × g for 30 minutes, and the supernatants were purified in a 96-well format on a Biomek liquid handling robot using CaptureSelect KappaXL PhyTip columns following a standard protocol. Elution fractions were pooled and subsequently analyzed via intact LC-MS.
Example 4. Expression and cell lysis for E213 pAMF-LC produced in E. coli RF1 KO with high density fermentations
[0197] The fermentation process began by taking a 2 mL vial of the cell bank and inoculating a shake flask of I17-SF Shake Flask Media, supplemented with 24 g/L Bacto Yeast Extract, 50 µg/mL of kanamycin, and 100 µg/mL of carbenicillin, at about 8% (v/v) seeding density. Once an optical density at a wavelength of 595 nm (OD 595 nm) of 2-4 was reached, the flask culture was used to inoculate a 500 mL bioreactor at a seeding density of 5% (v/v) in batched media, which consisted of 50 µg/mL of kanamycin, 100 µg/mL of carbenicillin, 0.05% (v/v) A204 antifoam, and 2% (v/v) 5x I17 Media + 120 g/L Bacto Yeast Extract in DI H2O. Tables 2 and 3 describe the components of I17-SF Media, and Tables 4, 5 and 6 describe the components of 5x I17 Media.
Table 2: Components of I17-SF Shake Flask Media Solution
Table 3: Components of 10x Base Salts for I17-SF Shake Flask Media Solution
Table 4: Components of 5x I17 Media Solution
Table 5: Components of concentrated stock solution of vitamins for 5x I17 Media Solution
Table 6: Components of concentrated stock solution of trace metals for 5x I17 Media Solution
[0198] The bioreactor temperature, dissolved oxygen and pH setpoints were 37 °C, 30% and 7, respectively. Once the cells grew to an OD 595 nm of 3-5 in the batch phase, the fed-batch phase began by feeding 5x I17 Media + 120 g/L Bacto Yeast Extract at an exponential rate of 0.2 h⁻¹. The feed rate was adjusted to ensure that all glucose was depleted prior to induction. After 18 hours of the fed-batch phase, the temperature of the bioreactor was decreased to 25 °C and the exponential feed rate was decreased to 0.02 h⁻¹. An hour later, the induction phase began by adding pAMF to a target concentration of 2 mM based on the culture volume of the bioreactor before induction and L-Arabinose to a target concentration of 4 g/L based on the starting volume of the bioreactor. The induction phase lasted 24-48 hours before harvest. At the end of the fermentation, the culture was collected and centrifuged at 18,592 × g and 2-8 °C for 30 min in a floor centrifuge. The supernatant was discarded, and the cell pellets were resuspended and washed with S30 Buffer at a concentration of 16.67% (w/w; wet weight of cells/weight of solution) and centrifuged again under the same conditions used in the initial harvest step. Table 7 describes the components of S30 Buffer.
Table 7: Components of S30 Buffer
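The exponential feed described in paragraph [0198] can be illustrated with a short Python sketch. It assumes the usual form F(t) = F0 · exp(μ · t), where μ is the exponential rate given in the text (0.2 h⁻¹). The starting feed rate F0 is not stated in this document, so the value used in the usage note is purely hypothetical, as are the function names.

```python
import math


def exponential_feed_rate(f0_ml_per_h: float, mu_per_h: float, t_h: float) -> float:
    """Feed rate (mL/h) at time t_h for an exponential profile F(t) = F0 * exp(mu * t).

    f0_ml_per_h is a hypothetical starting feed rate; the document does not state one.
    """
    return f0_ml_per_h * math.exp(mu_per_h * t_h)


def cumulative_feed_volume(f0_ml_per_h: float, mu_per_h: float, t_h: float) -> float:
    """Total volume fed (mL) from 0 to t_h, i.e. the integral of F(t): F0/mu * (exp(mu*t) - 1)."""
    return f0_ml_per_h / mu_per_h * (math.exp(mu_per_h * t_h) - 1.0)


def feed_doubling_time_h(mu_per_h: float) -> float:
    """Time (h) for the feed rate to double at exponential rate mu."""
    return math.log(2.0) / mu_per_h
```

For example, `exponential_feed_rate(1.0, 0.2, 18.0)` gives the feed rate after the 18-hour fed-batch phase for an assumed F0 of 1 mL/h, and `feed_doubling_time_h(0.2)` shows that the feed rate doubles roughly every 3.5 hours at 0.2 h⁻¹.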
[0199] After the wash, the supernatant was discarded, and the cells were resuspended with S30-5 Buffer at a concentration of 16.67% (w/w). Table 8 shows the components of the S30-5 Buffer.
Table 8: Components of S30-5 Buffer
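As a worked illustration of the 16.67% (w/w) resuspension, the following Python sketch computes how much buffer to add to a given wet cell mass. It assumes that "weight of solution" means the total mass of cells plus buffer, which is one reading of the definition in paragraph [0198]; the function name is hypothetical.

```python
def buffer_mass_for_resuspension(wet_cell_mass_g: float,
                                 cell_fraction_w_w: float = 0.1667) -> float:
    """Mass of buffer (g) needed so that wet cells make up cell_fraction_w_w of the total mass.

    Assumption: the w/w fraction is wet cell mass divided by (cell mass + buffer mass).
    """
    if not 0.0 < cell_fraction_w_w < 1.0:
        raise ValueError("cell_fraction_w_w must be between 0 and 1")
    total_mass_g = wet_cell_mass_g / cell_fraction_w_w
    return total_mass_g - wet_cell_mass_g
```

Under this assumption, 16.67% (w/w) corresponds to about 5 g of buffer per gram of wet cells.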
[0200] The cell resuspension was then passed twice through an Avestin Homogenizer (EmulsiFlex-C5) at 14,000 psi and 3,500 psi to disrupt the cells and generate the crude lysate. The crude lysate was further clarified by centrifuging at 18,000-30,000 × g and 2-8 °C for 30 minutes in a floor centrifuge. The supernatant was collected and centrifuged once more at 18,000-30,000 × g and 2-8 °C for 30 minutes, and then the clarified lysate was aliquoted, flash frozen in liquid nitrogen and stored at -80 °C.
Example 5. Analysis of trastuzumab light chain containing non-natural amino acid at residue 213 expressed in RF1 WT and ΔRF1 E. coli cells
[0201] Intact LC-MS was performed on an Agilent 6520 QTOF mass spectrometer coupled to an Agilent 1200 series HPLC. Prior to analysis, the QTOF MS was calibrated in Extended Dynamic Range mode (2 GHz) in Standard (3200 m/z) range. For each sample, 10-15 pmol of protein was separated over a reverse phase column prior to introduction into the MS source. The HPLC mobile phase consisted of 0.1% formic acid in H2O (A) and 0.1% formic acid in acetonitrile (B). Proteins were separated using a gradient method starting at 10% B and increasing to 95% B over 10 minutes. After separation, proteins were analyzed by the QTOF MS operating in positive ion MS (Seg) mode in a mass range of 500-3200 m/z with a scan rate of 1 spectrum per second. Data was analyzed in Agilent MassHunter Qualitative analysis software. Proteins were identified by the existence of a peak in the total ion chromatogram. Mass spectra for the LC E213 TAG peak were deconvoluted using the Maximum Entropy algorithm with a mass range of 10,000-60,000 Daltons and a 1.00 Dalton mass step. Truncated, full length, and light chain dimer peak identities were confirmed by comparing peak mass to the predicted protein masses that had been calculated using gpmaw3. The peaks for each species in the deconvoluted mass spectra were integrated and used to calculate the percent truncated and full-length species. Percent truncated species was calculated according to the following formula:
$$\text{Percent truncated} = \frac{\text{area of truncated peak}}{\text{sum of all peak areas}} \times 100$$
where the sum of all peak areas is equal to the sum of the truncated peak area, the full-length peak area, and two times the dimer peak area.
[0202] In wild-type E. coli Snuggle cells (bearing the prfA gene), Amber suppression at position 213 is very poor, resulting in high levels of truncation at the E213 TAG stop codon. Because the truncated protein lacks only 2 amino acids, chromatographic separation of the full length and truncated species would be extremely difficult or would likely result in very low sample recovery. Expression of the same construct in E. coli Snuggle ΔprfA cells resulted in nearly complete Amber suppression, with only 5% truncated species detected during LC-MS analysis of the purified protein from shake flask and only 6.5% truncated protein from the fermenter. This represents a 91% reduction of this impurity in shake flasks and an 83% reduction of this impurity in fermentation as a result of RF1 removal. The data are shown in Table 9. This demonstrates the clear improvement in amber suppression for the strain lacking RF1, leading to a much higher percentage of the intended product and facilitating downstream processes such as antibody production. The amount of truncated LC would make it infeasible to use position 213 for NNAA incorporation with RF1+ cells. Site 213 is therefore accessible for PFLC production only when the RF1 KO strain is used.
Table 9: Shake flask experiments were performed in duplicate (two unique colonies). Fermenter data is from a single replicate.
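The percent-truncated calculation described in paragraph [0201], and the relative reduction quoted in paragraph [0202], can be sketched in Python as follows. The function names are hypothetical, and any peak areas passed in are illustrative inputs rather than values from Table 9. The dimer area is counted twice in the denominator, as stated in the text.

```python
def percent_truncated(truncated_area: float,
                      full_length_area: float,
                      dimer_area: float = 0.0) -> float:
    """Percent truncated species from integrated deconvoluted peak areas.

    The denominator is truncated + full-length + 2 * dimer, per paragraph [0201].
    """
    total_area = truncated_area + full_length_area + 2.0 * dimer_area
    if total_area <= 0.0:
        raise ValueError("sum of peak areas must be positive")
    return 100.0 * truncated_area / total_area


def relative_reduction(before_pct: float, after_pct: float) -> float:
    """Relative reduction (%) in an impurity going from before_pct to after_pct."""
    if before_pct <= 0.0:
        raise ValueError("before_pct must be positive")
    return 100.0 * (before_pct - after_pct) / before_pct
```

For instance, hypothetical areas of 10 (truncated), 70 (full length) and 10 (dimer) give `percent_truncated(10, 70, 10)` of 10%, because the dimer contributes 20 to the denominator.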
Example 6. Shake flask production of trastuzumab light chain containing pAcPhe non-natural amino acid in E. coli strains lacking release factor 1
[0203] To assess Amber suppression efficiency with the NNAA p-acetylphenylalanine (pAcPhe), the expression of a trastuzumab light chain (LC) construct containing 1 Amber (TAG) codon at position K42, E161 or D170 was tested in WT and ΔRF1 Snuggle E. coli strains. The plasmid for expressing the pAcPhe RS and tRNA was generated by cloning one copy of the pAcPhe RS and three copies of the pAzF tRNA behind a dual promoter system consisting of an inducible T7 promoter followed by a constitutive Pc0 promoter. The vector had a medium copy (p15a) origin with a β-lactamase selection marker. Both the origin and marker are compatible with pJ411. Plasmid sequences were verified by sequencing.
[0204] The E. coli strain for expression of LC with pAcPhe was generated by transforming strain 711 or 713 (ΔRF1) with the LC and RS/tRNA plasmids. Single colonies were grown in overnight seed cultures at 37 °C in Terrific Broth (TB) containing 50 µg/mL kanamycin and 100 µg/mL carbenicillin (TB +Kan/Carb). The next day, seed cultures were diluted 1:40 into larger expression cultures (25-250 mL) of fresh TB +Kan/Carb and grown at 37 °C until they reached an OD600 of 1.2-1.5. At that time, protein expression was induced by adding arabinose to a final concentration of 0.2%, pAcPhe was added to a final concentration of 4 mM, and the temperature was lowered to 25 °C. Cultures were allowed to express for 18-20 hours. Cells were then harvested by centrifugation at 6000g for 10 minutes. Cells were resuspended in bacterial protein extraction reagent (B-PER) + 0.2 mg/mL lysozyme (lysis buffer) at a ratio of 1 g dry cell weight per mL lysis buffer, incubated on ice for 20-30 minutes, then sonicated to lyse. Lysates were centrifuged at 30,000g for 30 minutes, and the supernatants were analyzed by SDS-PAGE. Protein titer was measured by gel densitometry. As shown in Table 10, at the E161 site, the RF1 deletion improved pAcPhe LC titers by 31%. The LC titer with the D170 site was improved by 62% with RF1 removal.
Table 10: Shake flask experiments were performed in duplicate (two unique colonies). Fermenter data is from a single replicate.
Example 7. Shake Flask Production of IL18 D157pAMF in E. coli strain RF1 WT and RF- mutant
[0205] IL18 can be expressed solubly with an N-terminal SUMO tag. The E. coli strains for expression of IL18 D157TAG were generated by transforming RF1+ strain SBDG674 and RF1 KO strain SBDG675 with both the RS/tRNA plasmid and the product plasmid carrying the gene for SUMO-IL18 D157 TAG in the pJ411 vector. Single colonies were grown in overnight seed cultures at 37 °C in Terrific Broth (TB) containing 50 µg/mL kanamycin and 100 µg/mL carbenicillin. The next day, seed cultures were diluted 1:40 into larger expression cultures (25-250 mL) of fresh TB with 50 µg/mL kanamycin and 100 µg/mL carbenicillin and grown at 37 °C until they reached an OD600 of 1.2-1.5. At that time, protein expression was induced by adding arabinose to a final concentration of 0.2%, pAMF was added to a final concentration of 2 mM and the temperature was lowered to 25 °C for 18-20 hours. Cells were then harvested by centrifugation at 6000g for 10 minutes. Cells were resuspended in phosphate buffered saline (PBS) supplemented with 0.2 mg/mL lysozyme and 1 mM DTT (lysis buffer) at a ratio of 1 g dry cell weight per 10 mL lysis buffer, incubated on ice for 20-30 minutes, then sonicated to lyse. Lysates were centrifuged at 30,000g for 30 minutes, and the supernatants were applied to Ni-NTA resin that had been pre-equilibrated with PBS. After application of the supernatant, the resin was washed with PBS containing 10 mM imidazole before the protein was eluted across several fractions with PBS containing 200 mM imidazole. The purest fractions were identified by SDS-PAGE analysis, then pooled and concentrated in 10 kDa MWCO Amicon centrifuge filters. Protein was digested with Ulp1 (1:20 w/w ratio) for 1 hour at 22 °C prior to LC-MS analysis. Full length protein and truncated protein were analyzed using the method from Example 5. As shown in Table 11, in the RF1 WT cells, around 33% of the protein was truncated at codon D157. These results indicate that using the RF1 KO strain eliminated production of this truncated IL18 product and drastically improved the product quality.
Table 11: Shake flask expression of IL18 in RF1 WT and KO cells.
Example 8. Production of CRM197 with multiple pAMF NNAAs in RF1- cells
[0206] To produce the plasmid for expression of CRM197 with multiple NNAAs, pJ411 CRM197 TAG was generated by mutating codons to TAG at the desired NNAA incorporation sites, and these new genes were cloned into an operon with a T7 promoter and T7 terminator. These NNAA incorporation sites included one or more of the codons for K25, K34, K38, K40, K213, K215, K228, K245, K265, K386, K523 and K527 of SEQ ID NO: 33. pJ411 CRM197 TAG has a high copy pUC origin of replication and contains the gene for Kanamycin resistance. The NNAA CRM197 plasmid sequences were verified by sequencing. Co-translational NNAA incorporation also requires expression of an amber suppressor tRNA orthogonal to existing aminoacyl tRNA synthetases (AAtRS) and an orthogonal AAtRS that specifically recognizes the amber suppressor tRNA and the NNAA. These gene products were expressed from the genome in RF1+ strain (SBDG711) and RF1- strain (SBDG713) under the control of constitutive promoters and inducible T7 promoters. The NNAA CRM197 strain was produced by transforming the strains SBDG711 and SBDG713 with pJ411 CRM197 TAG and p15a NNAA RS/tRNA and selecting for transformants on plates with Kanamycin. Expression of NNAA CRM197 in shake flasks and high-density fermentation proceeded as described for NNAA containing LC in Examples 6 and 4, respectively.
[0207] After expression of NNAA CRM197, cells were harvested by centrifugation at 7000g for 7 minutes. Cell pellets were resuspended in PBS, 0.1 µg/mL lysozyme, and 0.05 U/mL benzonase. Cells were allowed to sit on ice for 30 minutes, and then lysed by 3 successive cycles of flash freezing in liquid nitrogen followed by thawing in a circulating water bath set to 20 °C. Lysates were then centrifuged at ~20,000g for 20 minutes to pellet insoluble components, after which the supernatant was removed and flash frozen until further analysis.
[0208] SDS-PAGE analysis of soluble lysates (2 µL) revealed the presence of a band at ~60 kDa (expected molecular weight of CRM197) in each of the lysates derived from strain SBDG713, while only the 1X pAMF CRM197 protein was visibly expressed in 1 of 2 colonies of strain SBDG711 (FIG. 1). CRM197 protein lysates from each expression were subsequently applied to 100 µL Ni-NTA resin by gravity flow, washed with 3 column volumes of PBS and 10 mM imidazole (wash buffer), and then eluted with 10 column volumes of PBS and 200 mM imidazole (elution buffer). Elution fractions were analyzed by SDS-PAGE for purity, and fractions that consisted mostly of CRM197 were pooled. Protein concentration was measured using a Nanodrop spectrophotometer that had been blanked with elution buffer. The protein concentration was calculated by dividing the sample absorbance at 280 nm by the predicted absorbance of a 1 mg/mL protein solution (0.92, from Snapgene), and the total protein yield was obtained by multiplying the resulting concentration by the sample volume in mL. Final titers (Table 12) were calculated based on the initial culture volume of 40 mL.
Table 12. Table of CRM197 titers and conjugate-to-protein ratios
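The yield and titer arithmetic in paragraph [0208] can be sketched in Python as below. This is a minimal illustration, assuming that the 0.92 value is the predicted A280 of a 1 mg/mL solution and that the absorbance reading is normalized to a 1 cm path length. The function names and the example inputs in the usage note are hypothetical and are not data from Table 12.

```python
def concentration_mg_per_ml(a280: float, a280_per_mg_per_ml: float = 0.92) -> float:
    """Protein concentration (mg/mL) from A280.

    Assumes a280_per_mg_per_ml is the predicted absorbance of a 1 mg/mL solution
    at a 1 cm path length.
    """
    return a280 / a280_per_mg_per_ml


def total_yield_mg(a280: float, sample_volume_ml: float,
                   a280_per_mg_per_ml: float = 0.92) -> float:
    """Total protein (mg) in the pooled sample: concentration times sample volume."""
    return concentration_mg_per_ml(a280, a280_per_mg_per_ml) * sample_volume_ml


def titer_mg_per_l(yield_mg: float, culture_volume_ml: float = 40.0) -> float:
    """Titer (mg/L) relative to the initial culture volume (40 mL in the text)."""
    return yield_mg / (culture_volume_ml / 1000.0)
```

With a hypothetical reading of A280 = 0.92 on a 2 mL pooled sample, `total_yield_mg(0.92, 2.0)` is 2 mg, and `titer_mg_per_l(2.0)` is 50 mg/L for a 40 mL culture.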
[0209] Titers of CRM197 K25 pAMF (1X pAMF) expressed in SBDG711 and SBDG713 were approximately equal, at 36 and 44 +/- 3 mg/L, respectively. The titers of CRM197 K25/K215 pAMF (2X pAMF) and CRM197 K25/K215/K228 pAMF (3X pAMF) derived from strain SBDG713 were similar, at 40 +/- 2 and 39 +/- 12 mg/L, respectively. The CRM197 K25/K215/K228/K386 pAMF (4X pAMF) protein expressed at a lower level of 15 +/- 1 mg/L. SDS-PAGE analysis of the wash and elution fractions from the IMAC capture of the 2X pAMF, 3X pAMF, and 4X pAMF proteins expressed in the RF1+ strains revealed no prominent CRM197 band. Therefore, titers for these proteins could not be calculated.
[0210] To show that the captured proteins contained the expected number of pAMF residues, CRM197 proteins were analyzed by intact LC-MS before and after strain-promoted azide-alkyne click conjugation with a small molecule dibenzocyclooctyne (DBCO) amine. Protein concentrations were brought to 1 mg/mL in PBS. The DBCO-amine was added at a DBCO-amine to pAMF ratio of 3:1, and 500 mM NaCl was added to the reaction to improve DBCO-amine solubility. The conjugation reaction was incubated overnight at 30 °C prior to intact LC-MS analysis. Intact LC-MS analysis and deconvolution were performed as described in Example 5. Intact LC-MS analysis revealed the presence of a peak at the expected theoretical mass for each CRM197 sample, showing that pAMF, and not another amino acid, had been incorporated at the site of each respective TAG codon. Each conjugated CRM197 protein sample showed the expected mass shift corresponding to its conjugation with 1-4 DBCO-amine molecules. No unconjugated protein could be detected, further showing that only pAMF was incorporated at the TAG codon(s). The conjugate-to-protein ratio (CPR) was calculated by multiplying the theoretical CPR (e.g., a theoretical CPR of 2.0 would be expected for CRM197 K25/K215 pAMF) by the fraction of conjugated protein, as determined by integrating the peak areas of the conjugated and unconjugated proteins in the samples that had been conjugated. Because all proteins were 100% conjugated, the CPR for each was equivalent to the theoretical number of pAMF residues in each protein.
Example 9. Production of IgG with multiple pAMF NNAAs in RF1- cells
[0211] Nonnatural amino acid (NNAA) containing IgG production requires the concurrent synthesis of Heavy Chain (HC) and Light Chain (LC) polypeptides, with the HC and/or the LC containing an NNAA. To produce the plasmid for the expression of IgG with multiple NNAAs, pJ411-HC-LC TAG will be generated by mutating codons to TAG at the desired NNAA incorporation sites in the HC and LC genes. These new genes for the HC and LC will be cloned into a single bicistronic operon with a T7 promoter and T7 terminator. This plasmid will have a high copy pUC origin of replication and contain the gene for Kanamycin resistance. Plasmid sequences will be verified by Sanger sequencing. Both the HC and LC will have independent ribosomal binding sites. To optimize the ratio of HC and LC, mutations could be made in the ribosomal binding site of either gene.
[0212] Co-translational NNAA incorporation also requires expression of an amber suppressor tRNA orthogonal to existing aminoacyl tRNA synthetases (AAtRS) and an orthogonal AAtRS that specifically recognizes the amber suppressor tRNA and the NNAA. These gene products for the NNAA RS and tRNA were expressed from the genome in RF1+ strain (SBDG711) and RF1- strain (SBDG713) under the control of constitutive promoters and inducible T7 promoters. The NNAA IgG strain will be produced by transforming the strains SBDG711 and SBDG713 with pJ411-HC-LC TAG and selecting for transformants on plates with Kanamycin. Expression of NNAA IgG in shake flasks and high-density fermentation will proceed as described for NNAA containing LC in Examples 6 and 4, respectively.
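In the sequence listing that follows, each WT gene is paired with an RF- variant that ends in a different stop codon (for example, WT coaD ends in TAG while RF- coaD ends in TAA). The Python sketch below is one simple way to check this for any pair of listed sequences. It assumes the pasted sequence ends exactly at the stop codon and ignores the spaces and mixed case used in the listing; the function names are hypothetical.

```python
def terminal_codon(sequence: str) -> str:
    """Return the last three bases of a DNA sequence, upper-cased, ignoring whitespace."""
    bases = "".join(sequence.split()).upper()
    if len(bases) < 3:
        raise ValueError("sequence must contain at least three bases")
    return bases[-3:]


def ends_with_amber(sequence: str) -> bool:
    """True if the sequence terminates in the amber stop codon TAG."""
    return terminal_codon(sequence) == "TAG"


def stop_codon_change(wt_sequence: str, variant_sequence: str) -> tuple:
    """Return (WT stop codon, variant stop codon) for a WT / RF- sequence pair."""
    return terminal_codon(wt_sequence), terminal_codon(variant_sequence)
```

Applied to the 3' ends of the coaD pair, `stop_codon_change("GTTAGCGtag", "GTTAGCGTAA")` returns `("TAG", "TAA")`. The check only inspects the final codon, so it does not detect internal TAG codons or frame shifts.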
ILLUSTRATIVE SEQUENCES WT RF1 (SEQ ID NO: 1) atgAAGCCTT CTATCGTTGC CAAACTGGAA GCCCTGCATG AACGCCATGA AGAAGTTCAG GCGTTGCTGG GTGACGCGCA AACTATCGCC GACCAGGAAC GTTTTCGCGC ATTATCACGC GAATATGCGC AGTTAAGTGA TGTTTCGCGC TGTTTTACCG ACTGGCAACA GGTTCAGGAA GATATCGAAA CCGCACAGAT GATGCTCGAT GATCCTGAAA TGCGTGAGAT GGCGCAGGAT GAACTGCGCG AAGCTAAAGA AAAAAGCGAG CAACTGGAAC AGCAATTACA GGTTCTGTTA CTGCCAAAAG ATCCTGATGA CGAACGTAAC GCCTTCCTCG AAGTCCGAGC CGGAACCGGC GGCGACGAAG CGGCGCTGTT CGCGGGCGAT CTGTTCCGTA TGTACAGCCG TTATGCCGAA GCCCGCCGCT GGCGGGTAGA AATCATGAGC GCCAGCGAGG GTGAACATGG TGGTTATAAA GAGATCATCG CCAAAATTAG CGGTGATGGT GTGTATGGTC GTCTGAAATT TGAATCCGGC GGTCATCGCG TGCAACGTGT TCCTGCTACG GAATCGCAGG GTCGTATTCA TACTTCTGCT TGTACCGTTG CGGTAATGCC AGAACTGCCT GACGCAGAAC TGCCGGACAT CAACCCAGCA GATTTACGCA TTGATACTTT CCGCTCGTCA GGGGCGGGTG GTCAGCACGT TAACACCACC GATTCGGCAA TTCGTATTAC TCACTTGCCG ACCGGGATTG TTGTTGAATG TCAGGACGAA CGTTCACAAC ATAAAAACAA AGCTAAAGCA CTTTCTGTTC TCGGTGCTCG CATCCACGCT GCTGAAATGG CAAAACGCCA ACAGGCCGAA GCGTCTACCC GTCGTAACCT GCTGGGGAGT GGCGATCGCA GCGACCGTAA CCGTACTTAC AACTTCCCGC AGGGGCGCGT TACCGATCAC CGCATCAACC TGACGCTCTA CCGCCTGGAT GAAGTGATGG AAGGTAAGCT GGATATGCTG ATTGAACCGA TTATCCAGGA ACATCAGGCC GACCAACTGG CGGCGTTGTC CGAGCAGGAA taa WT RF2 (SEQ ID NO: 2) atgTTTGAAA TTAATCCGGT AAATAATCGC ATTCAGGACC TCACGGAACG CTCCGACGTT CTTAGGGGGT ATCTTTGACT ACGACGCCAA GAAAGAGCGT CTGGAAGAAG TAAACGCCGA GCTGGAACAG CCGGATGTCT GGAACGAACC CGAACGCGCA CAGGCGCTGG GTAAAGAGCG TTCCTCCCTC GAAGCCGTTG TCGACACCCT CGACCAAATG AAACAGGGGC TGGAAGATGT TTCTGGTCTG CTGGAACTGG CTGTAGAAGC TGACGACGAA GAAACCTTTA ACGAAGCCGT TGCTGAACTC GACGCCCTGG AAGAAAAACT GGCGCAGCTT GAGTTCCGCC GTATGTTCTC TGGCGAATAT GACAGCGCCG ACTGCTACCT CGATATTCAG GCGGGGTCTG GCGGTACGGA AGCACAGGAC TGGGCGAGCA TGCTTGAGCG TATGTATCTG CGCTGGGCAG AATCGCGTGG TTTCAAAACT GAAATCATCG AAGAGTCGGA AGGTGAAGTG GCGGGTATTA AATCCGTGAC GATCAAAATC TCCGGCGATT ACGCTTACGG CTGGCTGCGT ACAGAAACCG GCGTTCACCG CCTGGTGCGT AAAAGCCCGT TTGACTCCGG CGGTCGTCGC CACACGTCGT TCAGCTCCGC GTTTGTTTAT 
CCGGAAGTTG ATGATGATAT TGATATCGAA ATCAACCCGG CGGATCTGCG CATTGACGTT TATCGCACGT CCGGCGCGGG CGGTCAGCAC GTTAACCGTA CCGAATCTGC GGTGCGTATT ACCCACATCC CGACCGGGAT CGTGACCCAG TGCCAGAACG ACCGTTCCCA GCACAAGAAC AAAGATCAGG CCATGAAGCA GATGAAAGCG AAGCTTTATG AACTGGAGAT GCAGAAGAAA AATGCCGAGA AACAGGCGAT GGAAGATAAC AAATCCGACA TCGGCTGGGG CAGCCAGATT CGTTCTTATG TCCTTGATGA CTCCCGCATT AAAGATCTGC GCACCGGGGT AGAAACCCGC AACACGCAGG CCGTGCTGGA CGGCAGCCTG GATCAATTTA TCGAAGCAAG TTTGAAAGCA GGGTTAtga RF2 T246A (SEQ ID NO: 3) ATGTTTGAAATTAATCCGGTAAATAATCGCATTCAGGACCTCACGGAACGCTCCGACGTTCTTAGGGGGT ATCTTTGACTACGACGCCAAGAAAGAGCGTCTGGAAGAAGTAAACGCCGAGCTGGAACAGCCGGATGTCT GGAACGAACCCGAACGCGCACAGGCGCTGGGTAAAGAGCGTTCCTCCCTCGAAGCCGTTGTCGACACCCT CGACCAAATGAAACAGGGGCTGGAAGATGTTTCTGGTCTGCTGGAACTGGCAGTGGAAGCGGACGATGAA GAGACTTTCAACGAAGCCGTTGCAGAGCTGGATGCGCTGGAAGAGAAACTGGCCCAGTTAGAGTTCAGAC
GCATGTTTAGCGGCGAGTATGATAGCGCGGACTGTTACCTGGACATCCAGGCTGGTAGCGGTGGCACTGA AGCGCAAGACTGGGCTAGCATGCTGGAGCGTATGTATTTGCGTTGGGCAGAGAGCCGTGGTTTTAAGACC GAGATCATCGAAGAGTCCGAGGGCGAAGTCGCCGGCATCAAGTCTGTAACCATCAAGATCTCTGGTGACT ATGCGTACGGTTGGCTGCGTACCGAAACCGGCGTGCACCGTTTGGTCCGTAAGTCACCGTTCGATTCCGG TGGCCGTCGCCACACCAGCTTTAGCAGCGCATTCGTTTACCCTGAAGTTGACGACGATATTGACATTGAG ATTAACCCGGCAGATCTGCGCATTGACGTTTACCGTGCGAGCGGTGCGGGTGGCCAACACGTGAATCGTA CCGAGAGCGCGGTTCGCATTACCCATATCCCGACGGGTATTGTCACCCAGTGCCAGAACGATCGCAGCCA GCATAAGAATAAAGATCAAGCGATGAAACAGATGAAAGCGAAGCTGTACGAATTGGAAATGCAGAAGAAA AATGCCGAGAAACAAGCGATGGAAGATAACAAGAGCGACATTGGCTGGGGTTCTCAGATTCGCAGCTACG TTCTGGACGACTCCCGCATCAAAGATCTGCGTACGGGTGTTGAAACGCGCAATACCCAAGCCGTCCTGGA CGGTTCGCTGGACCAATTTATCGAAGCGAGCCTGAAGGCCGGTCTGTAA WT hda (SEQ ID NO: 4) ctgAACACAC CGGCACAGCT CTCTTTGCCA CTTTATCTTC CTGACGACGA AACCTTTGCA AGTTTCTGGC CGGGGGATAA CTCCTCTTTA CTGGCCGCGC TGCAAAACGT GCTGCGTCAG GAACATAGCG GTTACATCTA TCTCTGGGCA CGCGAAGGCG CGGGGCGCAG CCATCTGCTG CACGCGGCTT GCGCGGAATT GTCGCAGCGT GGCGATGCGG TGGGCTATGT CCCGCTGGAT AAACGCACCT GGTTTGTTCC GGAAGTGCTC GACGGTATGG AGCATTTGTC GCTGGTCTGT ATCGACAACA TTGAGTGTAT TGCAGGCGAT GAGTTGTGGG AGATGGCGAT TTTCGATCTC TACAATCGAA TTCTGGAATC GGGCAAAACA CGGTTGTTGA TCACCGGCGA TCGTCCACCG CGGCAGTTGA ATCTGGGATT ACCGGATCTC GCGTCGCGAC TCGACTGGGG GCAGATCTAC AAATTGCAGC CACTTTCTGA TGAAGATAAG TTGCAGGCGC TACAGTTACG CGCGCGTTTG CGTGGTTTTG AACTGCCGGA AGATGTGGGG CGTTTCTTGC TGAAGCGGCT CGACAGAGAA ATGCGCACGC TATTTATGAC GTTGGATCAG TTGGATCGTG CGTCGATTAC CGCGCAACGT AAGCTGACCA TTCCGTTTGT GAAAGAAATT CTGAAGTTGt ag RF- hda (SEQ ID NO: 5) CTGAACACACCGGCACAGCTCTCTTTGCCACTTTATCTTCCTGACGACGAAACCTTTGCAAGTTTCTGGC CGGGGGATAACTCCTCTTTACTGGCCGCGCTGCAAAACGTGCTGCGTCAGGAACATAGCGGTTACATCTA TCTCTGGGCACGCGAAGGCGCGGGGCGCAGCCATCTGCTGCACGCGGCTTGCGCGGAATTGTCGCAGCGT GGCGATGCGGTGGGCTATGTCCCGCTGGATAAACGCACCTGGTTTGTTCCGGAAGTGCTCGACGGTATGG AGCATTTGTCGCTGGTCTGTATCGACAACATTGAGTGTATTGCAGGCGATGAGTTGTGGGAGATGGCGAT 
TTTCGATCTCTACAATCGAATTCTGGAATCGGGCAAAACACGGTTGTTGATCACCGGCGATCGTCCACCG CGGCAGTTGAATCTGGGATTACCGGATCTCGCGTCGCGACTCGACTGGGGGCAGATCTACAAATTGCAGC CACTTTCTGATGAAGATAAGTTGCAGGCGCTACAGTTACGCGCGCGTTTGCGTGGTTTTGAACTGCCGGA AGATGTGGGGCGTTTCTTGCTGAAGCGGCTCGACAGAGAAATGCGCACGCTATTTATGACGTTGGATCAG TTGGATCGTGCGTCGATTACCGCGCAACGTAAGCTGACCATTCCGTTTGTGAAAGAAATTCTGAAGTTGT AA WT lpxK (SEQ ID NO: 6) atgATCGAAA AAATCTGGTC TGGTGAATCC CCTTTGTGGC GGCTATTGCT GCCACTCTCC TGGTTGTATG GCCTGGTGAG TGGCGCGATC CGTCTTTGCT ATAAACTAAA ACTGAAGCGC GCCTGGCGTG CCCCCGTACC GGTTGTCGTG GTTGGTAATC TCACCGCAGG CGGCAACGGA AAAACCCCGG TCGTTGTCTG GCTGGTGGAA CAGTTGCAAC AGCGCGGTAT TCGCGTGGGG GTCGTATCGC GGGGATATGG TGGTAAGGCT GAATCTTATC CGCTGTTATT GTCGGCAGAT ACCACAACAG CACAGGCGGG TGATGAACCT GTGTTGATTT ATCAACGCAC TGATGCGCCT GTTGCGGTTT CTCCCGTTCG TTCTGATGCG GTAAAAGCCA TTCTGGCGCA ACACCCTGAT GTGCAGATCA TCGTAACCGA CGACGGTTTA CAGCATTACC GTCTGGCGCG TGATGTGGAA ATTGTCGTTA TTGATGGTGT GCGTCGCTTT GGCAATGGCT GGTGGTTGCC GGCGGGGCCA ATGCGTGAGC GAGCGGGGCG CTTAAAGTCG GTTGATGCGG TAATCGTCAA CGGCGGTGTC CCTCGCAGCG GTGAAATCCC CATGCATCTG CTGCCGGGTC AGGCGGTGAA TTTACGTACC GGTACGCGTT GTGACGTTGC TCAGCTTGAA CATGTAGTGG CGATGGCGGG GATTGGGCAT
CCGCCGCGCT TTTTTGCCAC GCTGAAGATG TGTGGCGTAC AACCGGAAAA ATGTGTACCG CTGGCCGATC ATCAGTCTTT GAACCATGCG GATGTCAGTG CGTTGGTAAG CGCCGGGCAA ACGCTGGTAA TGACTGAAAA AGATGCGGTG AAATGCCGGG CCTTTGCAGA AGAAAATTGG TGGTATTTGC CTGTAGACGC ACAGCTTTCA GGTGATGAAC CAGCGAAACT GCTTACGCAA CTAACCTTGC TGGCTTCTGG CAACtag RF- lpxK (SEQ ID NO: 7) ATGATCGAAAAAATCTGGTCTGGTGAATCCCCTTTGTGGCGGCTATTGCTGCCACTCTCCTGGTTGTATG GCCTGGTGAGTGGCGCGATCCGTCTTTGCTATAAACTAAAACTGAAGCGCGCCTGGCGTGCCCCCGTACC GGTTGTCGTGGTTGGTAATCTCACCGCAGGCGGCAACGGAAAAACCCCGGTCGTTGTCTGGCTGGTGGAA CAGTTGCAACAGCGCGGTATTCGCGTGGGGGTCGTATCGCGGGGATATGGTGGTAAGGCTGAATCTTATC CGCTGTTATTGTCGGCAGATACCACAACAGCACAGGCGGGTGATGAACCTGTGTTGATTTATCAACGCAC TGATGCGCCTGTTGCGGTTTCTCCCGTTCGTTCTGATGCGGTAAAAGCCATTCTGGCGCAACACCCTGAT GTGCAGATCATCGTAACCGACGACGGTTTACAGCATTACCGTCTGGCGCGTGATGTGGAAATTGTCGTTA TTGATGGTGTGCGTCGCTTTGGCAATGGCTGGTGGTTGCCGGCGGGGCCAATGCGTGAGCGAGCGGGGCG CTTAAAGTCGGTTGATGCGGTAATCGTCAACGGCGGTGTCCCTCGCAGCGGTGAAATCCCCATGCATCTG CTGCCGGGTCAGGCGGTGAATTTACGTACCGGTACGCGTTGTGACGTTGCTCAGCTTGAACATGTAGTGG CGATGGCGGGGATTGGGCATCCGCCGCGCTTTTTTGCCACGCTGAAGATGTGTGGCGTACAACCGGAAAA ATGTGTACCGCTGGCCGATCATCAGTCTTTGAACCATGCGGATGTCAGTGCGTTGGTAAGCGCCGGGCAA ACGCTGGTAATGACTGAAAAAGATGCGGTGAAATGCCGGGCCTTTGCAGAAGAAAATTGGTGGTATTTGC CTGTAGACGCACAGCTTTCAGGTGATGAACCAGCGAAACTGCTTACGCAACTAACCTTGCTGGCTTCTGG CAACTAA WT coaD (SEQ ID NO: 8) atgCAAAAAC GGGCGATTTA TCCGGGTACT TTCGATCCCA TTACCAATGG TCATATCGAT ATCGTGACGC GCGCCACGCA GATGTTCGAT CACGTTATTC TGGCGATTGC CGCCAGCCCC AGTAAAAAAC CGATGTTTAC CCTGGAAGAG CGTGTGGCAC TGGCACAGCA GGCAACCGCG CATCTGGGGA ACGTGGAAGT GGTCGGGTTT AGTGATTTAA TGGCGAACTT CGCCCGTAAT CAACACGCTA CGGTGCTGAT TCGTGGCCTG CGTGCGGTGG CAGATTTTGA ATATGAAATG CAGCTGGCGC ATATGAATCG CCACTTAATG CCGGAACTGG AAAGTGTGTT TCTGATGCCG TCGAAAGAGT GGTCGTTTAT CTCTTCATCG TTGGTGAAAG AGGTGGCGCG CCATCAGGGC GATGTCACCC ATTTCCTGCC GGAGAATGTC CATCAGGCGC TGATGGCGAA GTTAGCGtag RF- coaD (SEQ ID NO: 9) ATGCAAAAACGGGCGATTTATCCGGGTACTTTCGATCCCATTACCAATGGTCATATCGATATCGTGACGC 
GCGCCACGCAGATGTTCGATCACGTTATTCTGGCGATTGCCGCCAGCCCCAGTAAAAAACCGATGTTTAC CCTGGAAGAGCGTGTGGCACTGGCACAGCAGGCAACCGCGCATCTGGGGAACGTGGAAGTGGTCGGGTTT AGTGATTTAATGGCGAACTTCGCCCGTAATCAACACGCTACGGTGCTGATTCGTGGCCTGCGTGCGGTGG CAGATTTTGAATATGAAATGCAGCTGGCGCATATGAATCGCCACTTAATGCCGGAACTGGAAAGTGTGTT TCTGATGCCGTCGAAAGAGTGGTCGTTTATCTCTTCATCGTTGGTGAAAGAGGTGGCGCGCCATCAGGGC GATGTCACCCATTTCCTGCCGGAGAATGTCCATCAGGCGCTGATGGCGAAGTTAGCGTAA WT lolA (SEQ ID NO: 10) atgAAAAAAA TTGCCATCAC CTGTGCATTA CTCTCAAGCT TAGTAGCAAG CAGCGTTTGG GCTGATGCCG CAAGCGATCT GAAAAGCCGC CTGGATAAAG TCAGCAGCTT CCACGCCAGC TTCACACAAA AAGTGACTGA CGGTAGCGGC GCGGCGGTGC AGGAAGGTCA GGGCGATCTG TGGGTGAAAC GTCCAAACTT ATTCAACTGG CATATGACAC AACCTGATGA AAGCATTCTG GTTTCTGACG GTAAAACACT GTGGTTCTAT AACCCGTTCG TTGAGCAAGC TACGGCAACC TGGCTGAAAG ATGCCACCGG TAATACGCCG TTTATGCTGA TTGCCCGCAA CCAGTCCAGC GACTGGCAGC AGTACAATAT CAAACAGAAT GGCGATGACT TTGTCCTGAC GCCGAAAGCC AGCAATGGCA ATCTGAAGCA GTTCACCATT AACGTGGGAC GTGATGGCAC AATCCATCAG TTTAGCGCGG TGGAGCAGGA CGATCAGCGC AGCAGTTATC AACTGAAATC CCAGCAAAAT
GGGGCTGTGG ATGCAGCGAA ATTTACCTTC ACCCCGCCGC AAGGCGTCAC GGTAGATGAT CAACGTAAGt ag RF- lolA (SEQ ID NO: 11) ATGAAAAAAATTGCCATCACCTGTGCATTACTCTCAAGCTTAGTAGCAAGCAGCGTTTGGGCTGATGCCG CAAGCGATCTGAAAAGCCGCCTGGATAAAGTCAGCAGCTTCCACGCCAGCTTCACACAAAAAGTGACTGA CGGTAGCGGCGCGGCGGTGCAGGAAGGTCAGGGCGATCTGTGGGTGAAACGTCCAAACTTATTCAACTGG CATATGACACAACCTGATGAAAGCATTCTGGTTTCTGACGGTAAAACACTGTGGTTCTATAACCCGTTCG TTGAGCAAGCTACGGCAACCTGGCTGAAAGATGCCACCGGTAATACGCCGTTTATGCTGATTGCCCGCAA CCAGTCCAGCGACTGGCAGCAGTACAATATCAAACAGAATGGCGATGACTTTGTCCTGACGCCGAAAGCC AGCAATGGCAATCTGAAGCAGTTCACCATTAACGTGGGACGTGATGGCACAATCCATCAGTTTAGCGCGG TGGAGCAGGACGATCAGCGCAGCAGTTATCAACTGAAATCCCAGCAAAATGGGGCTGTGGATGCAGCGAA ATTTACCTTCACCCCGCCGCAAGGCGTCACGGTAGATGATCAACGTAAGTAA WT mreC (SEQ ID NO: 12) atgAAGCCAA TTTTTAGCCG TGGCCCGTCG CTACAGATTC GCCTTATTCT GGCGGTGCTG GTGGCGCTCG GCATTATTAT TGCCGACAGC CGCCTGGGGA CGTTCAGTCA AATCCGTACT TATATGGATA CCGCCGTCAG TCCTTTCTAC TTTGTTTCCA ATGCTCCTCG TGAATTGCTG GATGGCGTAT CGCAGACGCT GGCCTCGCGT GACCAATTAG AACTTGAAAA CCGGGCGTTA CGTCAGGAAC TGTTGCTGAA AAACAGTGAA CTGCTGATGC TTGGACAATA CAAACAGGAG AACGCGCGTC TGCGCGAGCT GCTGGGTTCC CCGCTGCGTC AGGATGAGCA GAAAATGGTG ACTCAGGTTA TCTCCACGGT TAACGATCCT TATAGCGATC AAGTTGTTAT CGATAAAGGT AGCGTTAATG GCGTTTATGA AGGCCAGCCG GTCATCAGCG ACAAAGGTGT TGTTGGTCAG GTGGTGGCCG TCGCTAAACT GACCAGTCGC GTGCTGCTGA TTTGTGATGC GACCCACGCG CTGCCAATCC AGGTGCTGCG CAACGATATC CGCGTAATTG CAGCCGGTAA CGGTTGTACG GATGATTTGC AGCTTGAGCA TCTGCCGGCG AATACGGATA TTCGTGTTGG TGATGTGCTG GTGACTTCCG GTCTGGGCGG TCGTTTCCCG GAAGGCTATC CGGTCGCGGT TGTCTCTTCC GTAAAACTCG ATACCCAGCG CGCTTATACT GTGATTCAGG CGCGTCCGAC TGCAGGGCTG CAACGTTTGC GTTATCTGCT GCTGCTGTGG GGGGCAGATC GTAACGGCGC TAACCCGATG ACGCCGGAAG AGGTGCATCG TGTTGCTAAT GAACGTCTGA TGCAGATGAT GCCGCAGGTA TTGCCTTCGC CAGACGCGAT GGGGCCAAAG TTACCTGAAC CGGCAACGGG GATCGCTCAG CCGACTCCGC AGCAACCGGC GACAGGAAAT GCAGCTACTG CGCCTGCTGC GCCGACACAG CCTGCTGCTA ATCGCTCTCC ACAAAGGGCT ACGCCGCCGC AAAGTGGTGC TCAACCGCCT GCGCGTGCGC CGGGAGGGCA Atag RF- mreC (SEQ ID NO: 13) 
ATGAAGCCAATTTTTAGCCGTGGCCCGTCGCTACAGATTCGCCTTATTCTGGCGGTGCTGGTGGCGCTCG GCATTATTATTGCCGACAGCCGCCTGGGGACGTTCAGTCAAATCCGTACTTATATGGATACCGCCGTCAG TCCTTTCTACTTTGTTTCCAATGCTCCTCGTGAATTGCTGGATGGCGTATCGCAGACGCTGGCCTCGCGT GACCAATTAGAACTTGAAAACCGGGCGTTACGTCAGGAACTGTTGCTGAAAAACAGTGAACTGCTGATGC TTGGACAATACAAACAGGAGAACGCGCGTCTGCGCGAGCTGCTGGGTTCCCCGCTGCGTCAGGATGAGCA GAAAATGGTGACTCAGGTTATCTCCACGGTTAACGATCCTTATAGCGATCAAGTTGTTATCGATAAAGGT AGCGTTAATGGCGTTTATGAAGGCCAGCCGGTCATCAGCGACAAAGGTGTTGTTGGTCAGGTGGTGGCCG TCGCTAAACTGACCAGTCGCGTGCTGCTGATTTGTGATGCGACCCACGCGCTGCCAATCCAGGTGCTGCG CAACGATATCCGCGTAATTGCAGCCGGTAACGGTTGTACGGATGATTTGCAGCTTGAGCATCTGCCGGCG AATACGGATATTCGTGTTGGTGATGTGCTGGTGACTTCCGGTCTGGGCGGTCGTTTCCCGGAAGGCTATC CGGTCGCGGTTGTCTCTTCCGTAAAACTCGATACCCAGCGCGCTTATACTGTGATTCAGGCGCGTCCGAC TGCAGGGCTGCAACGTTTGCGTTATCTGCTGCTGCTGTGGGGGGCAGATCGTAACGGCGCTAACCCGATG ACGCCGGAAGAGGTGCATCGTGTTGCTAATGAACGTCTGATGCAGATGATGCCGCAGGTATTGCCTTCGC
CAGACGCGATGGGGCCAAAGTTACCTGAACCGGCAACGGGGATCGCTCAGCCGACTCCGCAGCAACCGGC GACAGGAAATGCAGCTACTGCGCCTGCTGCGCCGACACAGCCTGCTGCTAATCGCTCTCCACAAAGGGCT ACGCCGCCGCAAAGTGGTGCTCAACCGCCTGCGCGTGCGCCGGGAGGGCAGTGA WT murF (SEQ ID NO: 14) atgATTAGCG TAACCCTTAG CCAACTTACC GACATTCTCA ACGGTGAACT GCAAGGTGCA GATATCACCC TTGATGCTGT AACCACTGAT ACCCGAAAAC TGACGCCGGG CTGCCTGTTT GTTGCCCTGA AAGGCGAACG TTTTGATGCC CACGATTTTG CCGACCAGGC GAAAGCTGGC GGCGCAGGCG CACTACTGGT TAGCCGTCCG CTGGACATCG ACCTGCCGCA GTTAATCGTC AAGGATACGC GTCTGGCGTT TGGTGAACTG GCTGCATGGG TTCGCCAGCA AGTTCCGGCG CGCGTGGTTG CTCTGACGGG GTCCTCCGGC AAAACCTCCG TTAAAGAGAT GACGGCGGCG ATTTTAAGCC AGTGCGGCAA CACGCTTTAT ACGGCAGGCA ATCTCAACAA CGACATCGGT GTACCGATGA CGCTGTTGCG CTTAACGCCG GAATACGATT ACGCAGTTAT TGAACTTGGC GCGAACCATC AGGGCGAAAT AGCCTGGACT GTGAGTCTGA CTCGCCCGGA AGCTGCGCTG GTCAACAACC TGGCAGCGGC GCATCTGGAA GGTTTTGGCT CGCTTGCGGG TGTCGCGAAA GCGAAAGGTG AAATCTTTAG CGGCCTGCCG GAAAACGGTA TCGCCATTAT GAACGCCGAC AACAACGACT GGCTGAACTG GCAGAGCGTA ATTGGCTCAC GCAAAGTGTG GCGTTTCTCA CCCAATGCCG CCAACAGCGA TTTCACCGCC ACCAATATCC ATGTGACCTC GCACGGTACG GAATTTACCC TACAAACCCC AACCGGTAGC GTCGATGTTC TGCTGCCGTT GCCGGGGCGT CACAATATTG CGAATGCGCT GGCAGCCGCT GCGCTCTCCA TGTCCGTGGG CGCAACGCTT GATGCTATCA AAGCGGGGCT GGCAAATCTG AAAGCTGTTC CAGGCCGTCT GTTCCCCATC CAACTGGCAG AAAACCAGTT GCTGCTCGAC GACTCCTACA ACGCCAATGT CGGTTCAATG ACTGCAGCAG TCCAGGTACT GGCTGAAATG CCGGGCTACC GCGTGCTGGT GGTGGGCGAT ATGGCGGAAC TGGGCGCTGA AAGCGAAGCC TGCCATGTAC AGGTGGGCGA GGCGGCAAAA GCTGCTGGTA TTGACCGCGT GTTAAGCGTG GGTAAACAAA GCCATGCTAT CAGCACCGCC AGCGGCGTTG GCGAACATTT TGCTGATAAA ACTGCGTTAA TTACGCGTCT TAAATTACTG ATTGCTGAGC AACAGGTAAT TACGATTTTA GTTAAGGGTT CACGTAGTGC CGCCATGGAA GAGGTAGTAC GCGCTTTACA GGAGAATGGG ACATGTtag RF- murF (SEQ ID NO: 15) ATGATTAGCGTAACCCTTAGCCAACTTACCGACATTCTCAACGGTGAACTGCAAGGTGCAGATATCACCC TTGATGCTGTAACCACTGATACCCGAAAACTGACGCCGGGCTGCCTGTTTGTTGCCCTGAAAGGCGAACG TTTTGATGCCCACGATTTTGCCGACCAGGCGAAAGCTGGCGGCGCAGGCGCACTACTGGTTAGCCGTCCG 
CTGGACATCGACCTGCCGCAGTTAATCGTCAAGGATACGCGTCTGGCGTTTGGTGAACTGGCTGCATGGG TTCGCCAGCAAGTTCCGGCGCGCGTGGTTGCTCTGACGGGGTCCTCCGGCAAAACCTCCGTTAAAGAGAT GACGGCGGCGATTTTAAGCCAGTGCGGCAACACGCTTTATACGGCAGGCAATCTCAACAACGACATCGGT GTACCGATGACGCTGTTGCGCTTAACGCCGGAATACGATTACGCAGTTATTGAACTTGGCGCGAACCATC AGGGCGAAATAGCCTGGACTGTGAGTCTGACTCGCCCGGAAGCTGCGCTGGTCAACAACCTGGCAGCGGC GCATCTGGAAGGTTTTGGCTCGCTTGCGGGTGTCGCGAAAGCGAAAGGTGAAATCTTTAGCGGCCTGCCG GAAAACGGTATCGCCATTATGAACGCCGACAACAACGACTGGCTGAACTGGCAGAGCGTAATTGGCTCAC GCAAAGTGTGGCGTTTCTCACCCAATGCCGCCAACAGCGATTTCACCGCCACCAATATCCATGTGACCTC GCACGGTACGGAATTTACCCTACAAACCCCAACCGGTAGCGTCGATGTTCTGCTGCCGTTGCCGGGGCGT CACAATATTGCGAATGCGCTGGCAGCCGCTGCGCTCTCCATGTCCGTGGGCGCAACGCTTGATGCTATCA AAGCGGGGCTGGCAAATCTGAAAGCTGTTCCAGGCCGTCTGTTCCCCATCCAACTGGCAGAAAACCAGTT GCTGCTCGACGACTCCTACAACGCCAATGTCGGTTCAATGACTGCAGCAGTCCAGGTACTGGCTGAAATG CCGGGCTACCGCGTGCTGGTGGTGGGCGATATGGCGGAACTGGGCGCTGAAAGCGAAGCCTGCCATGTAC AGGTGGGCGAGGCGGCAAAAGCTGCTGGTATTGACCGCGTGTTAAGCGTGGGTAAACAAAGCCATGCTAT CAGCACCGCCAGCGGCGTTGGCGAACATTTTGCTGATAAAACTGCGTTAATTACGCGTCTTAAATTACTG ATTGCTGAGCAACAGGTAATTACGATTTTAGTTAAGGGTTCACGTAGTGCCGCCATGGAAGAGGTAGTAC GCGCTTTACAGGAGAATGGGACATGCTAA WT hemA (SEQ ID NO: 16)
atgACCCTTT TAGCACTCGG TATCAACCAT AAAACGGCAC CTGTATCGCT GCGAGAACGT GTATCGTTTT CGCCGGATAA GCTCGATCAG GCGCTTGACA GCCTGCTTGC GCAGCCGATG GTGCAGGGCG GCGTGGTGCT GTCGACGTGC AACCGCACGG AACTTTATCT TAGCGTTGAA GAGCAGGACA ACCTGCAAGA GGCGTTAATC CGCTGGCTTT GCGATTATCA CAATCTTAAT GAAGAAGATC TGCGTAAAAG CCTCTACTGG CATCAGGATA ACGACGCGGT TAGCCATTTA ATGCGTGTTG CCAGCGGCCT GGATTCACTG GTTCTGGGGG AGCCGCAGAT CCTCGGTCAG GTTAAAAAAG CGTTTGCCGA TTCGCAAAAA GGTCATATGA AGGCCAGCGA ACTGGAACGC ATGTTCCAGA AATCTTTCTC TGTCGCGAAA CGCGTTCGCA CTGAAACAGA TATCGGTGCC AGCGCTGTGT CTGTCGCTTT TGCGGCTTGT ACGCTGGCGC GGCAGATCTT TGAATCGCTC TCTACGGTCA CAGTGTTGCT GGTAGGCGCG GGCGAAACTA TCGAGCTGGT GGCGCGTCAT CTGCGCGAAC ACAAAGTACA GAAGATGATT ATCGCCAACC GCACTCGCGA ACGTGCCCAA ATTCTGGCAG ATGAAGTCGG CGCGGAAGTG ATTGCCCTGA GTGATATCGA CGAACGTCTG CGCGAAGCCG ATATCATCAT CAGTTCCACC GCCAGCCCGT TACCGATTAT CGGGAAAGGC ATGGTGGAGC GCGCATTAAA AAGCCGTCGC AACCAACCAA TGCTGTTGGT GGATATTGCC GTTCCGCGCG ATGTTGAGCC GGAAGTTGGC AAACTGGCGA ATGCTTATCT TTATAGCGTT GATGATCTGC AAAGCATCAT TTCGCACAAC CTGGCGCAGC GTAAAGCCGC AGCGGTTGAG GCGGAAACTA TTGTCGCTCA GGAAACCAGC GAATTTATGG CGTGGCTGCG AGCACAAAGC GCCAGCGAAA CCATTCGCGA GTATCGCAGC CAGGCAGAGC AAGTTCGCGA TGAGTTAACC GCCAAAGCGT TAGCGGCCCT TGAGCAGGGC GGCGACGCGC AAGCCATTAT GCAGGATCTG GCATGGAAAC TGACTAACCG CTTGATCCAT GCGCCAACGA AATCACTTCA ACAGGCCGCC CGTGACGGGG ATAACGAACG CCTGAATATT CTGCGCGACA GCCTCGGGCT GGAGtag RF- hemA (SEQ ID NO: 17) atgACCCTTT TAGCACTCGG TATCAACCAT AAAACGGCAC CTGTATCGCT GCGAGAACGT GTATCGTTTT CGCCGGATAA GCTCGATCAG GCGCTTGACA GCCTGCTTGC GCAGCCGATG GTGCAGGGCG GCGTGGTGCT GTCGACGTGC AACCGCACGG AACTTTATCT TAGCGTTGAA GAGCAGGACA ACCTGCAAGA GGCGTTAATC CGCTGGCTTT GCGATTATCA CAATCTTAAT GAAGAAGATC TGCGTAAAAG CCTCTACTGG CATCAGGATA ACGACGCGGT TAGCCATTTA ATGCGTGTTG CCAGCGGCCT GGATTCACTG GTTCTGGGGG AGCCGCAGAT CCTCGGTCAG GTTAAAAAAG CGTTTGCCGA TTCGCAAAAA GGTCATATGA AGGCCAGCGA ACTGGAACGC ATGTTCCAGA AATCTTTCTC TGTCGCGAAA CGCGTTCGCA CTGAAACAGA TATCGGTGCC AGCGCTGTGT CTGTCGCTTT TGCGGCTTGT ACGCTGGCGC GGCAGATCTT 
TGAATCGCTC TCTACGGTCA CAGTGTTGCT GGTAGGCGCG GGCGAAACTA TCGAGCTGGT GGCGCGTCAT CTGCGCGAAC ACAAAGTACA GAAGATGATT ATCGCCAACC GCACTCGCGA ACGTGCCCAA ATTCTGGCAG ATGAAGTCGG CGCGGAAGTG ATTGCCCTGA GTGATATCGA CGAACGTCTG CGCGAAGCCG ATATCATCAT CAGTTCCACC GCCAGCCCGT TACCGATTAT CGGGAAAGGC ATGGTGGAGC GCGCATTAAA AAGCCGTCGC AACCAACCAA TGCTGTTGGT GGATATTGCC GTTCCGCGCG ATGTTGAGCC GGAAGTTGGC AAACTGGCGA ATGCTTATCT TTATAGCGTT GATGATCTGC AAAGCATCAT TTCGCACAAC CTGGCGCAGC GTAAAGCCGC AGCGGTTGAG GCGGAAACTA TTGTCGCTCA GGAAACCAGC GAATTTATGG CGTGGCTGCG AGCACAAAGC GCCAGCGAAA CCATTCGCGA GTATCGCAGC CAGGCAGAGC AAGTTCGCGA TGAGTTAACC GCCAAAGCGT TAGCGGCCCT TGAGCAGGGC GGCGACGCGC AAGCCATTAT GCAGGATCTG GCATGGAAAC TGACTAACCG CTTGATCCAT GCGCCAACGA AATCACTTCA ACAGGCCGCC CGTGACGGGG ATAACGAACG CCTGAATATT CTGCGCGACA GCCTCGGGCT GGAGtaa This is the Original 8 mutations that were required for viability of RF1 KO cells in oxidizing background. The following mutations increased cell health and growth rates
WT sucB (SEQ ID NO: 18) atgAGTAGCG TAGATATTCT GGTCCCTGAC CTGCCTGAAT CCGTAGCCGA TGCCACCGTC GCAACCTGGC ATAAAAAACC CGGCGACGCA GTCGTACGTG ATGAAGTGCT GGTAGAAATC GAAACTGACA AAGTGGTACT GGAAGTACCG GCATCAGCAG ACGGCATTCT GGATGCGGTT CTGGAAGATG AAGGTACAAC GGTAACGTCT CGTCAGATCC TTGGTCGCCT GCGTGAAGGC AACAGCGCCG GTAAAGAAAC CAGCGCCAAA TCTGAAGAGA AAGCGTCCAC TCCGGCGCAA CGCCAGCAGG CGTCTCTGGA AGAGCAAAAC AACGATGCGT TAAGCCCGGC GATCCGTCGC CTGCTGGCTG AACACAATCT CGACGCCAGC GCCATTAAAG GCACCGGTGT GGGTGGTCGT CTGACTCGTG AAGATGTGGA AAAACATCTG GCGAAAGCCC CGGCGAAAGA GTCTGCTCCG GCAGCGGCTG CTCCGGCGGC GCAACCGGCT CTGGCTGCAC GTAGTGAAAA ACGTGTCCCG ATGACTCGCC TGCGTAAGCG TGTGGCAGAG CGTCTGCTGG AAGCGAAAAA CTCCACCGCC ATGCTGACCA CGTTCAACGA AGTCAACATG AAGCCGATTA TGGATCTGCG TAAGCAGTAC GGTGAAGCGT TTGAAAAACG CCACGGCATC CGTCTGGGCT TTATGTCCTT CTACGTGAAA GCGGTGGTTG AAGCCCTGAA ACGTTACCCG GAAGTGAACG CTTCTATCGA CGGCGATGAC GTGGTTTACC ACAACTATTT CGACGTCAGC ATGGCGGTTT CTACGCCGCG CGGCCTGGTG ACGCCGGTTC TGCGTGATGT CGATACCCTC GGCATGGCAG ACATCGAGAA GAAAATCAAA GAGCTGGCAG TCAAAGGCCG TGACGGCAAG CTGACCGTTG AAGATCTGAC CGGTGGTAAC TTCACCATCA CCAACGGTGG TGTGTTCGGT TCCCTGATGT CTACGCCGAT CATCAACCCG CCGCAGAGCG CAATTCTGGG TATGCACGCT ATCAAAGATC GTCCGATGGC GGTGAATGGT CAGGTTGAGA TCCTGCCGAT GATGTACCTG GCGCTGTCCT ACGATCACCG TCTGATCGAT GGTCGCGAAT CCGTGGGCTT CCTGGTAACG ATCAAAGAGT TGCTGGAAGA TCCGACGCGT CTGCTGCTGG ACGTGtagta g RF- sucB (SEQ ID NO: 19) ATGAGTAGCGTAGATATTCTGGTCCCTGACCTGCCTGAATCCGTAGCCGATGCCACCGTCGCAACCTGGC ATAAAAAACCCGGCGACGCAGTCGTACGTGATGAAGTGCTGGTAGAAATCGAAACTGACAAAGTGGTACT GGAAGTACCGGCATCAGCAGACGGCATTCTGGATGCGGTTCTGGAAGATGAAGGTACAACGGTAACGTCT CGTCAGATCCTTGGTCGCCTGCGTGAAGGCAACAGCGCCGGTAAAGAAACCAGCGCCAAATCTGAAGAGA AAGCGTCCACTCCGGCGCAACGCCAGCAGGCGTCTCTGGAAGAGCAAAACAACGATGCGTTAAGCCCGGC GATCCGTCGCCTGCTGGCTGAACACAATCTCGACGCCAGCGCCATTAAAGGCACCGGTGTGGGTGGTCGT CTGACTCGTGAAGATGTGGAAAAACATCTGGCGAAAGCCCCGGCGAAAGAGTCTGCTCCGGCAGCGGCTG CTCCGGCGGCGCAACCGGCTCTGGCTGCACGTAGTGAAAAACGTGTCCCGATGACTCGCCTGCGTAAGCG 
TGTGGCAGAGCGTCTGCTGGAAGCGAAAAACTCCACCGCCATGCTGACCACGTTCAACGAAGTCAACATG AAGCCGATTATGGATCTGCGTAAGCAGTACGGTGAAGCGTTTGAAAAACGCCACGGCATCCGTCTGGGCT TTATGTCCTTCTACGTGAAAGCGGTGGTTGAAGCCCTGAAACGTTACCCGGAAGTGAACGCTTCTATCGA CGGCGATGACGTGGTTTACCACAACTATTTCGACGTCAGCATGGCGGTTTCTACGCCGCGCGGCCTGGTG ACGCCGGTTCTGCGTGATGTCGATACCCTCGGCATGGCAGACATCGAGAAGAAAATCAAAGAGCTGGCAG TCAAAGGCCGTGACGGCAAGCTGACCGTTGAAGATCTGACCGGTGGTAACTTCACCATCACCAACGGTGG TGTGTTCGGTTCCCTGATGTCTACGCCGATCATCAACCCGCCGCAGAGCGCAATTCTGGGTATGCACGCT ATCAAAGATCGTCCGATGGCGGTGAATGGTCAGGTTGAGATCCTGCCGATGATGTACCTGGCGCTGTCCT ACGATCACCGTCTGATCGATGGTCGCGAATCCGTGGGCTTCCTGGTAACGATCAAAGAGTTGCTGGAAGA TCCGACGCGTCTGCTGCTGGACGTGTAA WT atpE (SEQ ID NO: 20) atgGAAAACC TGAATATGGA TCTGCTGTAC ATGGCTGCCG CTGTGATGAT GGGTCTGGCG GCAATCGGTG CTGCGATCGG TATCGGCATC CTCGGGGGTA AATTCCTGGA AGGCGCAGCG CGTCAACCTG ATCTGATTCC TCTGCTGCGT ACTCAGTTCT TTATCGTTAT GGGTCTGGTG GATGCTATCC CGATGATCGC TGTAGGTCTG GGTCTGTACG TGATGTTCGC TGTCGCGtag RF- atpE (SEQ ID NO: 21)
ATGGAAAACCTGAATATGGATCTGCTGTACATGGCTGCCGCTGTGATGATGGGTCTGGCGGCAATCGGTG CTGCGATCGGTATCGGCATCCTCGGGGGTAAATTCCTGGAAGGCGCAGCGCGTCAACCTGATCTGATTCC TCTGCTGCGTACTCAGTTCTTTATCGTTATGGGTCTGGTGGATGCTATCCCGATGATCGCTGTAGGTCTG GGTCTGTACGTGATGTTCGCTGTCGCATAA WT fabH (SEQ ID NO: 22) atgTATACGA AGATTATTGG TACTGGCAGC TATCTGCCCG AACAAGTGCG GACAAACGCC GATTTGGAAA AAATGGTGGA CACCTCTGAC GAGTGGATTG TCACTCGTAC CGGTATCCGC GAACGCCACA TTGCCGCGCC AAACGAAACC GTTTCAACCA TGGGCTTTGA AGCGGCGACA CGCGCAATTG AGATGGCGGG CATTGAGAAA GACCAGATTG GCCTGATCGT TGTGGCAACG ACTTCTGCTA CGCACGCTTT CCCGAGCGCA GCTTGTCAGA TTCAAAGCAT GTTGGGCATT AAAGGTTGCC CGGCATTTGA CGTTGCAGCA GCCTGCGCAG GTTTCACCTA TGCATTAAGC GTAGCCGATC AATACGTGAA ATCTGGGGCG GTGAAGTATG CTCTGGTCGT CGGTTCCGAT GTACTGGCGC GCACCTGCGA TCCAACCGAT CGTGGGACTA TTATTATTTT TGGCGATGGC GCGGGCGCTG CGGTGCTGGC TGCCTCTGAA GAGCCGGGAA TCATTTCCAC CCATCTGCAT GCCGACGGTA GTTATGGTGA ATTGCTGACG CTGCCAAACG CCGACCGCGT GAATCCAGAG AATTCAATTC ATCTGACGAT GGCGGGCAAC GAAGTCTTCA AGGTTGCGGT AACGGAACTG GCGCACATCG TTGATGAGAC GCTGGCGGCG AATAATCTTG ACCGTTCTCA ACTGGACTGG CTGGTTCCGC ATCAGGCTAA CCTGCGTATT ATCAGTGCAA CGGCGAAAAA ACTCGGTATG TCTATGGATA ATGTCGTGGT GACGCTGGAT CGCCACGGTA ATACCTCTGC GGCCTCTGTC CCGTGCGCGC TGGATGAAGC TGTACGCGAC GGGCGCATTA AGCCGGGGCA GTTGGTTCTG CTTGAAGCCT TTGGCGGTGG ATTCACCTGG GGCTCCGCGC TGGTTCGTTT Ctag RF1- fabH (SEQ ID NO: 23) ATGTATACGAAGATTATTGGTACTGGCAGCTATCTGCCCGAACAAGTGCGGACAAACGCCGATTTGGAAA AAATGGTGGACACCTCTGACGAGTGGATTGTCACTCGTACCGGTATCCGCGAACGCCACATTGCCGCGCC AAACGAAACCGTTTCAACCATGGGCTTTGAAGCGGCGACACGCGCAATTGAGATGGCGGGCATTGAGAAA GACCAGATTGGCCTGATCGTTGTGGCAACGACTTCTGCTACGCACGCTTTCCCGAGCGCAGCTTGTCAGA TTCAAAGCATGTTGGGCATTAAAGGTTGCCCGGCATTTGACGTTGCAGCAGCCTGCGCAGGTTTCACCTA TGCATTAAGCGTAGCCGATCAATACGTGAAATCTGGGGCGGTGAAGTATGCTCTGGTCGTCGGTTCCGAT GTACTGGCGCGCACCTGCGATCCAACCGATCGTGGGACTATTATTATTTTTGGCGATGGCGCGGGCGCTG CGGTGCTGGCTGCCTCTGAAGAGCCGGGAATCATTTCCACCCATCTGCATGCCGACGGTAGTTATGGTGA ATTGCTGACGCTGCCAAACGCCGACCGCGTGAATCCAGAGAATTCAATTCATCTGACGATGGCGGGCAAC 
GAAGTCTTCAAGGTTGCGGTAACGGAACTGGCGCACATCGTTGATGAGACGCTGGCGGCGAATAATCTTG ACCGTTCTCAACTGGACTGGCTGGTTCCGCATCAGGCTAACCTGCGTATTATCAGTGCAACGGCGAAAAA ACTCGGTATGTCTATGGATAATGTCGTGGTGACGCTGGATCGCCACGGTAATACCTCTGCGGCCTCTGTC CCGTGCGCGCTGGATGAAGCTGTACGCGACGGGCGCATTAAGCCGGGGCAGTTGGTTCTGCTTGAAGCCT TTGGCGGTGGATTCACCTGGGGCTCCGCGCTGGTTCGTTTCTGA WT ubiF (SEQ ID NO: 24) atgACAAATC AACCAACGGA AATTGCCATT GTCGGCGGAG GAATGGTCGG CGGCGCACTG GCGCTGGGGC TGGCACAGCA CGGATTTGCG GTAACGGTGA TCGAGCACGC AGAACCAGCG CCGTTTGTCG CTGATAGCCA ACCGGACGTG CGGATCTCGG CGATCAGCGC GGCTTCGGTA TCATTGCTTA AAGGGTTAGG GGTCTGGGAT GCAGTACAGG CTATGCGTTG CCATCCTTAC CGCAGACTGG AAACGTGGGA GTGGGAAACG GCGCATGTGG TGTTTGACGC CGCTGAACTT AAGCTACCGC TGCTTGGCTA TATGGTGGAA AACACTGTCC TGCAACAGGC GTTGTGGCAG GCGCTGGAAG CGCATCCGAA AGTAACGTTA CGTGTGCCAG GCTCGCTGAT TGCGCTGCAT CGCCATGATG ATCTTCAGGA GCTGGAGCTG AAAGGCGGTG AAGTGATTCG CGCGAAGCTG GTGATTGGTG CCGACGGCGC AAATTCGCAG GTGCGGCAGA TGGCGGGAAT TGGCGTTCAT
GCATGGCAGT ATGCGCAGTC GTGCATGTTG ATTAGCGTCC AGTGCGAGAA CGATCCCGGC GACAGCACCT GGCAGCAATT TACTCCGGAC GGACCGCGTG CGTTTCTGCC GTTGTTTGAT AACTGGGCAT CGCTGGTGTG GTATGACTCT CCGGCGCGTA TTCGCCAGTT GCAGAATATG AATATGGCAC AGCTCCAGGC GGAAATCGCG AAGCATTTCC CGTCGCGTCT GGGTTACGTT ACACCGCTTG CCGCTGGTGC GTTTCCGCTG ACGCGTCGCC ATGCGTTGCA GTACGTGCAG CCAGGGCTTG CGCTGGTGGG CGATGCCGCG CATACCATCC ATCCGCTGGC GGGGCAGGGA GTGAATCTTG GTTATCGTGA TGTCGATGCC CTGATTGATG TTCTGGTCAA CGCCCGCAGC TACGGCGAAG CGTGGGCCAG TTATCCTGTC CTCAAGCGTT ACCAGATGCG GCGCATGGCG GATAACTTCA TTATGCAAAG CGGTATGGAT CTGTTTTATG CCGGATTCAG CAATAATCTG CCACCACTGC GTTTTATGCG TAATCTCGGG TTAATGGCGG CGGAGCGTGC TGGCGTGTTG AAACGTCAGG CGCTGAAATA TGCGTTAGGG TTGtag RF1- ubiF (SEQ ID NO: 25) ATGACAAATCAACCAACGGAAATTGCCATTGTCGGCGGAGGAATGGTCGGCGGCGCACTGGCGCTGGGGC TGGCACAGCACGGATTTGCGGTAACGGTGATCGAGCACGCAGAACCAGCGCCGTTTGTCGCTGATAGCCA ACCGGACGTGCGGATCTCGGCGATCAGCGCGGCTTCGGTATCATTGCTTAAAGGGTTAGGGGTCTGGGAT GCAGTACAGGCTATGCGTTGCCATCCTTACCGCAGACTGGAAACGTGGGAGTGGGAAACGGCGCATGTGG TGTTTGACGCCGCTGAACTTAAGCTACCGCTGCTTGGCTATATGGTGGAAAACACTGTCCTGCAACAGGC GTTGTGGCAGGCGCTGGAAGCGCATCCGAAAGTAACGTTACGTGTGCCAGGCTCGCTGATTGCGCTGCAT CGCCATGATGATCTTCAGGAGCTGGAGCTGAAAGGCGGTGAAGTGATTCGCGCGAAGCTGGTGATTGGTG CCGACGGCGCAAATTCGCAGGTGCGGCAGATGGCGGGAATTGGCGTTCATGCATGGCAGTATGCGCAGTC GTGCATGTTGATTAGCGTCCAGTGCGAGAACGATCCCGGCGACAGCACCTGGCAGCAATTTACTCCGGAC GGACCGCGTGCGTTTCTGCCGTTGTTTGATAACTGGGCATCGCTGGTGTGGTATGACTCTCCGGCGCGTA TTCGCCAGTTGCAGAATATGAATATGGCACAGCTCCAGGCGGAAATCGCGAAGCATTTCCCGTCGCGTCT GGGTTACGTTACACCGCTTGCCGCTGGTGCGTTTCCGCTGACGCGTCGCCATGCGTTGCAGTACGTGCAG CCAGGGCTTGCGCTGGTGGGCGATGCCGCGCATACCATCCATCCGCTGGCGGGGCAGGGAGTGAATCTTG GTTATCGTGATGTCGATGCCCTGATTGATGTTCTGGTCAACGCCCGCAGCTACGGCGAAGCGTGGGCCAG TTATCCTGTCCTCAAGCGTTACCAGATGCGGCGCATGGCGGATAACTTCATTATGCAAAGCGGTATGGAT CTGTTTTATGCCGGATTCAGCAATAATCTGCCACCACTGCGTTTTATGCGTAATCTCGGGTTAATGGCGG CGGAGCGTGCTGGCGTGTTGAAACGTCAGGCGCTGAAATATGCGTTAGGGTTGTAA WT pgpC (SEQ ID NO: 26) ttgGCAACTC ACGAGCGTCG TGTGGTGTTT TTTGACTTAG ATGGAACATT 
GCATCAGCAG GATATGTTCG GCAGTTTTCT GCGCTATTTA CTACGTCGCC AACCGCTGAA TGCGTTACTT GTCCTGCCGT TGTTACCGAT TATAGCCATT GCGTTATTGA TAAAAGGTCG TGCGGCACGC TGGCCGATGA GTCTGCTTCT GTGGGGGTGC ACTTTTGGTC ACAGCGAAGC ACGTTTACAG ACGTTGCAGG CCGATTTCGT GCGCTGGTTT CGCGACAATG TTACCGCCTT TCCGCTGGTT CAGGAGCGAT TAACCACCTA CCTGTTAAGT TCCGATGCTG ATATCTGGTT GATTACCGGC TCTCCGCAGC CGCTGGTTGA AGCGGTTTAT TTCGATACGC CCTGGCTGCC GCGGGTTAAT CTTATCGCCA GCCAAATTCA GCGTGGCTAT GGTGGTTGGG TATTGACGAT GCGTTGTCTG GGACATGAAA AGGTCGCACA ACTGGAGCGC AAAATCGGCA CTCCGCTGCG GCTGTACAGT GGCTATAGCG ACAGTAATCA GGACAATCCG CTGCTTTATT TCTGTCAGCA TCGTTGGCGA GTAACCCCGC GCGGTGAACT CCAGCAACTG GAAtag RF- pgpC (SEQ ID NO: 27) TTGGCAACTCACGAGCGTCGTGTGGTGTTTTTTGACTTAGATGGAACATTGCATCAGCAGGATATGTTCG GCAGTTTTCTGCGCTATTTACTACGTCGCCAACCGCTGAATGCGTTACTTGTCCTGCCGTTGTTACCGAT TATAGCCATTGCGTTATTGATAAAAGGTCGTGCGGCACGCTGGCCGATGAGTCTGCTTCTGTGGGGGTGC ACTTTTGGTCACAGCGAAGCACGTTTACAGACGTTGCAGGCCGATTTCGTGCGCTGGTTTCGCGACAATG TTACCGCCTTTCCGCTGGTTCAGGAGCGATTAACCACCTACCTGTTAAGTTCCGATGCTGATATCTGGTT GATTACCGGCTCTCCGCAGCCGCTGGTTGAAGCGGTTTATTTCGATACGCCCTGGCTGCCGCGGGTTAAT CTTATCGCCAGCCAAATTCAGCGTGGCTATGGTGGTTGGGTATTGACGATGCGTTGTCTGGGACATGAAA
AGGTCGCACAACTGGAGCGCAAAATCGGCACTCCGCTGCGGCTGTACAGTGGCTATAGCGACAGTAATCA GGACAATCCGCTGCTTTATTTCTGTCAGCATCGTTGGCGAGTAACCCCGCGCGGTGAACTCCAGCAACTG GAATAA WT luxS (SEQ ID NO: 28) atgCCGTTGT TAGATAGCTT CACAGTCGAT CATACCCGGA TGGAAGCGCC TGCAGTTCGG GTGGCGAAAA CAATGAACAC CCCGCATGGC GACGCAATCA CCGTGTTCGA TCTGCGCTTC TGCGTGCCGA ACAAAGAAGT GATGCCAGAA AGAGGGATCC ATACCCTGGA GCACCTGTTT GCTGGTTTTA TGCGTAACCA TCTTAACGGT AATGGTGTAG AGATTATCGA TATCTCGCCA ATGGGCTGCC GCACCGGTTT TTATATGAGT CTGATTGGTA CGCCAGATGA GCAGCGTGTT GCTGATGCCT GGAAAGCGGC AATGGAAGAC GTGCTGAAAG TGCAGGATCA GAATCAGATC CCGGAACTGA ACGTCTACCA GTGTGGCACT TACCAGATGC ACTCGTTGCA GGAAGCGCAG GATATTGCGC GTAGCATTCT GGAACGTGAC GTACGCATCA ACAGCAACGA AGAACTGGCA CTGCCGAAAG AGAAGTTGCA GGAACTGCAC ATCtag RF- luxS (SEQ ID NO: 29) ATGCCGTTGTTAGATAGCTTCACAGTCGATCATACCCGGATGGAAGCGCCTGCAGTTCGGGTGGCGAAAA CAATGAACACCCCGCATGGCGACGCAATCACCGTGTTCGATCTGCGCTTCTGCGTGCCGAACAAAGAAGT GATGCCAGAAAGAGGGATCCATACCCTGGAGCACCTGTTTGCTGGTTTTATGCGTAACCATCTTAACGGT AATGGTGTAGAGATTATCGATATCTCGCCAATGGGCTGCCGCACCGGTTTTTATATGAGTCTGATTGGTA CGCCAGATGAGCAGCGTGTTGCTGATGCCTGGAAAGCGGCAATGGAAGACGTGCTGAAAGTGCAGGATCA GAATCAGATCCCGGAACTGAACGTCTACCAGTGTGGCACTTACCAGATGCACTCGTTGCAGGAAGCGCAG GATATTGCGCGTAGCATTCTGGAACGTGACGTACGCATCAACAGCAACGAAGAACTGGCACTGCCGAAAG AGAAGTTGCAGGAACTGCACATCTAA WT priC (SEQ ID NO: 30) gtgAAAACCG CCCTGCTGCT GGAAAAACTG GAAGGACAGC TCGCTACGCT GCGTCAGCGT TGTGCCCCGG TGTCACAGTT CGCCACGCTA AGTGCTCGTT TCGACAGGCA TCTTTTTCAG ACTCGTGCGA CAACACTACA GGCTTGTCTC GACGAGGCGG GCGATAATCT GGCTGCGCTT CGTCATGCAG TTGAGCAGCA ACAGCTGCCG CAAGTGGCCT GGCTGGCGGA ACATCTGGCG GCACAACTGG AAGCCATCGC GCGTGAAGCC TCCGCCTGGT CATTGCGCGA GTGGGACAGT GCACCACCGA AAATTGCCCG CTGGCAGCGT AAACGTATTC AGCATCAGGA TTTTGAGCGG CGGCTACGTG AGATGGTTGC CGAACGCAGA GCCCGTCTGG CGCGGGTGAC CGATCTCGTG GAACAGCAAA CGCTGCATCG TGAAGTGGAA GCCTATGAAG CGCGCCTGGC ACGCTGCCGC CATGCGCTGG AAAAAATCGA AAACAGGTTA GCGCGTTTAA CCCGCtag RF- priC (SEQ ID NO: 31) GTGAAAACCGCCCTGCTGCTGGAAAAACTGGAAGGACAGCTCGCTACGCTGCGTCAGCGTTGTGCCCCGG 
TGTCACAGTTCGCCACGCTAAGTGCTCGTTTCGACAGGCATCTTTTTCAGACTCGTGCGACAACACTACA GGCTTGTCTCGACGAGGCGGGCGATAATCTGGCTGCGCTTCGTCATGCAGTTGAGCAGCAACAGCTGCCG CAAGTGGCCTGGCTGGCGGAACATCTGGCGGCACAACTGGAAGCCATCGCGCGTGAAGCCTCCGCCTGGT CATTGCGCGAGTGGGACAGTGCACCACCGAAAATTGCCCGCTGGCAGCGTAAACGTATTCAGCATCAGGA TTTTGAGCGGCGGCTACGTGAGATGGTTGCCGAACGCAGAGCCCGTCTGGCGCGGGTGACCGATCTCGTG GAACAGCAAACGCTGCATCGTGAAGTGGAAGCCTATGAAGCGCGCCTGGCACGCTGCCGCCATGCGCTGG AAAAAATCGAAAACAGGTTAGCGCGTTTAACCCGCTGA RS/tRNA (SEQ ID NO: 32; T promoter sequence: nucleotides 1-20; RS: 152-1072 (underlined); tRNA: 1018-1163; T7 terminator: 1210-1257 TAATACGACTCACTATAGGGGAATTGTGAGCGGATAACAATTATTATGTCTTGACATGTAGTGAGTGGGC TGGTATAATGCAGCAAGGAATCATACGCGTTAGGTCTACCCTCTAGAAATAATTTTGTTTAACTTTTAGG
AGGTAAAACATATGGATGAATTTGAAATGATTAAACGCAACACCAGCGAAATTATTAGCGAAGAAGAACT GCGCGAAGTGCTGAAAAAAGATGAAAAAAGCGCGGCGATTGGCTTTGAACCGAGCGGCAAAATTCATCTG GGCCATTATCTGCAGATTAAAAAAATGATTGATCTGCAGAACGCGGGCTTTGATATTATTATTGTGCTGG CGGATCTGCATGCGTATCTGAACCAGAAAGGCGAACTGGATGAAATTCGCAAAATTGGCGATTATAACAA AAAAGTGTTTGAAGCGATGGGCCTGAAAGCGAAATATGTGTATGGCAGCGAATGGCAGCTGGATAAAGAT TATACCCTGAACGTGTATCGCCTGGCGCTGAAAACCACCCTGAAACGCGCGCGCCGCAGCATGGAACTGA TTGCGCGCGAAGATGAAAACCCGAAAGTGGCGGAAGTGATTTATCCGATTATGCAGGTGAACGCGGGTCA TTATCTCGGCGTGGATGTGGCGGTGGGCGGCATGGAACAGCGCAAAATTCACATGCTGGCGCGCGAACTG CTGCCGAAAAAAGTGGTGTGCATTCATAACCCGGTGCTGACCGGCCTGGATGGCGAAGGCAAAATGAGCA GCAGCAAAGGCAACTTTATTGCGGTGGATGATAGCCCGGAAGAAATTCGCGCGAAAATTAAAAAAGCGTA TTGCCCGGCGGGCGTGGTGGAAGGCAACCCGATTATGGAAATTGCGAAATATTTTCTGGAATATCCGCTG ACCATTAAACGCCCGGAAAAATTTGGCGGCGATCTGACCGTGAACAGCTATGAAGAACTGGAAAGCCTGT TTAAAAACAAAGAACTGCATCCGATGCGCCTGAAAAACGCGGTGGCGGAAGAACTGATTAAAATTCTGGA ACCGATTCGCAAACGCCTGTAATAAGCGCCCCGCATTCCCGCCTTAGTTCAGAGGGCAGAACGGCGGACT CTAAATCCGCATGGCACGGGTTCAAATCCCGTAGGCGGGACCACTAATTCTTAAGAACCCGCCCACAAGG CGGGTTTTTGCTTTTCCCCCTAGCATAACCCCTTGGGGCCTCTAAACGGGTCTTGAGGGGTTTTTTG CRM197 (SEQ ID NO: 33) MGADDVVDSSKSFVMENFSSYHGTKPGYVDSIQKGIQKPKSGTQGNYDDDWKEFYSTDNKYDAAGYSVDN ENPLSGKAGGVVKVTYPGLTKVLALKVDNAETIKKELGLSLTEPLMEQVGTEEFIKRFGDGASRVVLSLP FAEGSSSVEYINNWEQAKALSVELEINFETRGKRGQDAMYEYMAQACAGNRVRRSVGSSLSCINLDWDVI RDKTKTKIESLKEHGPIKNKMSESPNKTVSEEKAKQYLEEFHQTALEHPELSELKTVTGTNPVFAGANYA AWAVNVAQVIDSETADNLEKTTAALSILPGIGSVMGIADGAVHHNTEEIVAQSIALSSLMVAQAIPLVGE LVDIGFAAYNFVESIINLFQVVHNSYNRPAYSPGHKTQPFLHDGYAVSWNTVEDSIIRTGFQGESGHDIK ITAENTPLPIAGVLLPTIPGKLDVNKSKTHISVNGRKIRMRCRAIDGDVTFCRPKSPVYVGNGVHANLHV AFHRSSSEKIHSNEISSDSIGVLGYQKTVDHTKVNSKLSLFFEIKS hda protein sequence (SEQ ID NO: 34) MNTPAQLSLP LYLPDDETFA SFWPGDNSSL LAALQNVLRQ EHSGYIYLWA REGAGRSHLL HAACAELSQR GDAVGYVPLD KRTWFVPEVL DGMEHLSLVC IDNIECIAGD ELWEMAIFDL YNRILESGKT RLLITGDRPP RQLNLGLPDL ASRLDWGQIY KLQPLSDEDK LQALQLRARL RGFELPEDVG RFLLKRLDRE MRTLFMTLDQ LDRASITAQR KLTIPFVKEI LKL ^ 
lpxK protein sequence (SEQ ID NO: 35) MIEKIWSGES PLWRLLLPLS WLYGLVSGAI RLCYKLKLKR AWRAPVPVVV VGNLTAGGNG KTPVVVWLVE QLQQRGIRVG VVSRGYGGKA ESYPLLLSAD TTTAQAGDEP VLIYQRTDAP VAVSPVRSDA VKAILAQHPD VQIIVTDDGL QHYRLARDVE IVVIDGVRRF GNGWWLPAGP MRERAGRLKS VDAVIVNGGV PRSGEIPMHL LPGQAVNLRT GTRCDVAQLE HVVAMAGIGH PPRFFATLKM CGVQPEKCVP LADHQSLNHA DVSALVSAGQ TLVMTEKDAV KCRAFAEENW WYLPVDAQLS GDEPAKLLTQ LTLLASGN coaD protein sequence (SEQ ID NO: 36) MQKRAIYPGT FDPITNGHID IVTRATQMFD HVILAIAASP SKKPMFTLEE RVALAQQATA HLGNVEVVGF SDLMANFARN QHATVLIRGL RAVADFEYEM QLAHMNRHLM PELESVFLMP SKEWSFISSS LVKEVARHQG DVTHFLPENV HQALMAKLA ^ lolA protein sequence (SEQ ID NO: 37) MKKIAITCAL LSSLVASSVW ADAASDLKSR LDKVSSFHAS FTQKVTDGSG AAVQEGQGDL
WVKRPNLFNW HMTQPDESIL VSDGKTLWFY NPFVEQATAT WLKDATGNTP FMLIARNQSS DWQQYNIKQN GDDFVLTPKA SNGNLKQFTI NVGRDGTIHQ FSAVEQDDQR SSYQLKSQQN GAVDAAKFTF TPPQGVTVDD QRK mreC protein sequence (SEQ ID NO: 38) MKPIFSRGPS LQIRLILAVL VALGIIIADS RLGTFSQIRT YMDTAVSPFY FVSNAPRELL DGVSQTLASR DQLELENRAL RQELLLKNSE LLMLGQYKQE NARLRELLGS PLRQDEQKMV TQVISTVNDP YSDQVVIDKG SVNGVYEGQP VISDKGVVGQ VVAVAKLTSR VLLICDATHA LPIQVLRNDI RVIAAGNGCT DDLQLEHLPA NTDIRVGDVL VTSGLGGRFP EGYPVAVVSS VKLDTQRAYT VIQARPTAGL QRLRYLLLLW GADRNGANPM TPEEVHRVAN ERLMQMMPQV LPSPDAMGPK LPEPATGIAQ PTPQQPATGN AATAPAAPTQ PAANRSPQRA TPPQSGAQPP ARAPGGQ ^ murF protein sequence (SEQ ID NO: 39) MISVTLSQLT DILNGELQGA DITLDAVTTD TRKLTPGCLF VALKGERFDA HDFADQAKAG GAGALLVSRP LDIDLPQLIV KDTRLAFGEL AAWVRQQVPA RVVALTGSSG KTSVKEMTAA ILSQCGNTLY TAGNLNNDIG VPMTLLRLTP EYDYAVIELG ANHQGEIAWT VSLTRPEAAL VNNLAAAHLE GFGSLAGVAK AKGEIFSGLP ENGIAIMNAD NNDWLNWQSV IGSRKVWRFS PNAANSDFTA TNIHVTSHGT EFTLQTPTGS VDVLLPLPGR HNIANALAAA ALSMSVGATL DAIKAGLANL KAVPGRLFPI QLAENQLLLD DSYNANVGSM TAAVQVLAEM PGYRVLVVGD MAELGAESEA CHVQVGEAAK AAGIDRVLSV GKQSHAISTA SGVGEHFADK TALITRLKLL IAEQQVITIL VKGSRSAAME EVVRALQENG TC hemA (SEQ ID NO: 40) MTLLALGINH KTAPVSLRER VSFSPDKLDQ ALDSLLAQPM VQGGVVLSTC NRTELYLSVE EQDNLQEALI RWLCDYHNLN EEDLRKSLYW HQDNDAVSHL MRVASGLDSL VLGEPQILGQ VKKAFADSQK GHMKASELER MFQKSFSVAK RVRTETDIGA SAVSVAFAAC TLARQIFESL STVTVLLVGA GETIELVARH LREHKVQKMI IANRTRERAQ ILADEVGAEV IALSDIDERL READIIISST ASPLPIIGKG MVERALKSRR NQPMLLVDIA VPRDVEPEVG KLANAYLYSV DDLQSIISHN LAQRKAAAVE AETIVAQETS EFMAWLRAQS ASETIREYRS QAEQVRDELT AKALAALEQG GDAQAIMQDL AWKLTNRLIH APTKSLQQAA RDGDNERLNI LRDSLGLE sucB (SEQ ID NO: 41) MSSVDILVPD LPESVADATV ATWHKKPGDA VVRDEVLVEI ETDKVVLEVP ASADGILDAV LEDEGTTVTS RQILGRLREG NSAGKETSAK SEEKASTPAQ RQQASLEEQN NDALSPAIRR LLAEHNLDAS AIKGTGVGGR LTREDVEKHL AKAPAKESAP AAAAPAAQPA LAARSEKRVP MTRLRKRVAE RLLEAKNSTA MLTTFNEVNM KPIMDLRKQY GEAFEKRHGI RLGFMSFYVK AVVEALKRYP EVNASIDGDD VVYHNYFDVS MAVSTPRGLV TPVLRDVDTL GMADIEKKIK ELAVKGRDGK LTVEDLTGGN 
FTITNGGVFG SLMSTPIINP PQSAILGMHA IKDRPMAVNG QVEILPMMYL ALSYDHRLID GRESVGFLVT IKELLEDPTR LLLDV atpE (SEQ ID NO: 42) MENLNMDLLY MAAAVMMGLA AIGAAIGIGI LGGKFLEGAA RQPDLIPLLR TQFFIVMGLV DAIPMIAVGL GLYVMFAVA fabH (SEQ ID NO: 43) MYTKIIGTGS YLPEQVRTNA DLEKMVDTSD EWIVTRTGIR ERHIAAPNET VSTMGFEAAT RAIEMAGIEK DQIGLIVVAT TSATHAFPSA ACQIQSMLGI KGCPAFDVAA ACAGFTYALS VADQYVKSGA VKYALVVGSD VLARTCDPTD RGTIIIFGDG AGAAVLAASE EPGIISTHLH ADGSYGELLT LPNADRVNPE NSIHLTMAGN EVFKVAVTEL AHIVDETLAA NNLDRSQLDW LVPHQANLRI ISATAKKLGM SMDNVVVTLD RHGNTSAASV PCALDEAVRD GRIKPGQLVL LEAFGGGFTW GSALVRF
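As an illustration only, the wild-type and RF- coding sequences listed above differ at their terminal stop codon (for example, WT atpE ends in tag while RF- atpE ends in TAA). The following minimal Python sketch shows one way such a difference might be checked. The function names and the short 3' excerpts are hypothetical conveniences introduced here and are not part of the disclosure; the excerpts are only the final nucleotides of the listed atpE and hemA sequences, not the full sequences.

```python
# Minimal sketch: classify the terminal stop codon of a coding sequence.
# Helper names are illustrative assumptions, not taken from the disclosure.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def normalize(seq: str) -> str:
    """Strip whitespace (listing blocks are space-separated) and upper-case."""
    return "".join(seq.split()).upper()


def terminal_codon(seq: str) -> str:
    """Return the last three nucleotides of the sequence."""
    cleaned = normalize(seq)
    if len(cleaned) < 3:
        raise ValueError("sequence is shorter than one codon")
    return cleaned[-3:]


def classify_stop(seq: str) -> str:
    """Return 'TAG', 'non-TAG', or 'no stop codon' for the 3' end of seq."""
    codon = terminal_codon(seq)
    if codon not in STOP_CODONS:
        return "no stop codon"
    return "TAG" if codon == "TAG" else "non-TAG"


# 3' excerpts copied from the listing (not full-length sequences).
examples = {
    "WT atpE (SEQ ID NO: 20)": "TGTCGCGtag",
    "RF- atpE (SEQ ID NO: 21)": "TGTCGCATAA",
    "WT hemA (SEQ ID NO: 16)": "GCCTCGGGCT GGAGtag",
    "RF- hemA (SEQ ID NO: 17)": "GCCTCGGGCT GGAGtaa",
}

for name, tail in examples.items():
    print(f"{name}: {terminal_codon(tail)} -> {classify_stop(tail)}")
```

Because only the final three nucleotides are inspected, the sketch does not verify reading frame or full-length identity; a complete check would compare the entire listed sequences.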
Claims
WHAT IS CLAIMED IS: 1. An RF1-deficient E. coli cell comprising at least one stop codon mutation from TAG to a non-TAG stop codon, a functional release factor 2 (RF2), and an oxidative cytoplasm, wherein the functional RF2 has greater RF2 activity than a control.
2. The RF1-deficient E. coli cell of claim 1, wherein the number of stop codon mutations is no greater than 20.
3. The RF1-deficient E. coli cell of claim 1, wherein the number of stop codon mutations is in the range of 2 to 10.
4. An RF1-deficient E. coli cell, wherein at least one of the coding sequences selected from the group consisting of hda, lpxK, coaD, lolA, mreC, murF, and hemA comprises a non-TAG stop codon, and wherein the cell has increased RF2 activity or expression as compared to a control E. coli cell.
5. The RF1-deficient E. coli cell of claim 4, wherein 2 to 7 of the coding sequences comprise non-TAG stop codons.
6. The RF1-deficient E. coli cell of claim 4, wherein the non-TAG stop codon is due to a genetic modification of the stop codon in the corresponding wild-type coding sequence.
7. The RF1-deficient E. coli cell of any one of claims 2-6, wherein the coding sequences of hda, lpxK, coaD, lolA, mreC, murF, and hemA comprise at least the last 10 nucleotides of SEQ ID NO: 5, 7, 9, 11, 13, 15, 17, respectively.
8. The RF1-deficient E. coli cell of any of the previous claims, wherein the stop codons of hda, lpxK, coaD, lolA, mreC, murF, and hemA comprise a non-TAG stop codon.
9. The RF1-deficient E. coli cell of any of the previous claims, wherein the cell has an oxidative cytoplasm.
10. The RF1-deficient E. coli cell of any of the previous claims, wherein the cell is a K-12 E. coli cell.
11. The RF1-deficient E. coli cell of any of the previous claims, wherein the cell has increased expression of the RF2 polypeptide or increased transcription of the RF2 gene as compared to the control E. coli cell.
12. The RF1-deficient E. coli cell of claim 11, wherein the RF2 comprises a T246X mutation as compared to SEQ ID NO: 2.
13. The RF1-deficient E. coli cell of claim 12, wherein T246X is T246A.
14. The RF1-deficient E. coli cell of any of the previous claims, wherein the cell further comprises a stop codon mutation in one or more of sucB, atpE, fabH, and ubiF coding sequences.
15. The RF1-deficient E. coli cell of any of the previous claims, wherein the cell further comprises a ΔfabR mutation.
16. The RF1-deficient E. coli cell of any of the previous claims, wherein the cell further expresses an aminoacyl-tRNA synthetase.
17. The RF1-deficient E. coli cell of claim 13, wherein the aminoacyl-tRNA synthetase recognizes the TAG stop codon.
18. The RF1-deficient E. coli cell of claim 16 or 17, wherein the aminoacyl-tRNA synthetase preferentially aminoacylates to a degree of greater than 90% a tRNA with a non-natural amino acid as compared to the 20 common naturally occurring amino acids.
19. The RF1-deficient E. coli cell of claim 18, wherein the non-natural amino acid is para-azido-methyl-L-phenylalanine (pAMF).
20. The RF1-deficient E. coli cell of any of the previous claims, wherein the cell further comprises a gene encoding a protein of interest.
21. The RF1-deficient E. coli cell of claim 20, wherein the protein of interest is selected from an antibody, an antibody fragment, an antibody light chain, an antibody heavy chain, a cytokine, a cytokine fragment, and an immunogenic polypeptide.
22. The RF1-deficient E. coli cell of claim 21, wherein the antibody is a monoclonal antibody.
23. The RF1-deficient E. coli cell of claim 21 or 22, wherein the antibody is an IgA, an IgD, an IgE, an IgG, or an IgM.
24. The RF1-deficient E. coli cell of any one of claims 21-23, wherein the antibody is humanized or human.
25. The RF1-deficient E. coli cell of any one of claims 21-24, wherein the antibody is aglycosylated.
26. The RF1-deficient E. coli cell of claim 21, wherein the antibody fragment is selected from an Fv fragment, a Fab fragment, a F(ab')2 fragment, a Fab' fragment, an scFv (sFv) fragment, and an scFv-Fc fragment.
27. The RF1-deficient E. coli cell of claim 21, wherein the antibody light chain is a light chain of an anti-HER2 antibody.
28. The RF1-deficient E. coli cell of claim 21, wherein the immunogenic polypeptide is a carrier protein.
29. The RF1-deficient E. coli cell of claim 28, wherein the carrier protein comprises at least one T-cell activating epitope from a protein selected from the group consisting of Corynebacterium diphtheriae toxin, Clostridium tetani tetanospasmin, Haemophilus influenzae protein D, and CRM197.
30. The RF1-deficient E. coli cell of claim 28 or 29, wherein the carrier protein comprises an amino acid sequence having at least 80% sequence identity to SEQ ID NO:33.
31. The RF1-deficient E. coli cell of claim 21, wherein the cytokine is selected from the group consisting of interleukins, interferons, transforming growth factors, and chemokines.
32. The RF1-deficient E. coli cell of any one of claims 20-31, wherein the protein of interest comprises one or more non-natural amino acids (NNAAs).
33. The RF1-deficient E. coli cell of claim 32, wherein the one or more NNAAs is selected from the group consisting of p-acetyl-L-phenylalanine, O-methyl-L-tyrosine, an L-3-(2-naphthyl)alanine, 3-methyl-phenylalanine, O-4-allyl-L-tyrosine, 4-propyl-L-tyrosine, a tri-O-acetyl-GlcNAcβ-serine, L-Dopa, fluorinated phenylalanine, isopropyl-L-phenylalanine, p-azido-L-phenylalanine, p-azido-methyl-L-phenylalanine, p-acyl-L-phenylalanine, p-benzoyl-L-phenylalanine, L-phosphoserine, phosphonoserine, phosphonotyrosine, p-iodo-phenylalanine, p-bromophenylalanine, p-amino-L-phenylalanine, isopropyl-L-phenylalanine, and p-propargyloxy-phenylalanine.
34. The RF1-deficient E. coli cell of claim 32, wherein the one or more NNAAs is p-azido-methyl-L-phenylalanine.
35. The RF1-deficient E. coli cell of claim 20, wherein the gene encoding the protein of interest is operably linked to an inducible promoter.
36. The RF1-deficient E. coli cell of claim 35, wherein the inducible promoter is a T7 promoter.
37. The RF1-deficient E. coli cell of any one of claims 1-36, wherein the RF1-deficient E. coli cell possesses 20% or less of the RF1 activity as compared to a control E. coli cell.
38. A kit comprising the RF1-deficient E. coli cell of any of claims 4-36, wherein the kit further comprises a bacteria growth medium.
39. The kit of claim 38, wherein the kit further comprises a plasmid encoding a protein of interest.
40. The kit of claim 38 or 39, wherein the kit further comprises a plasmid encoding an aminoacyl-tRNA synthetase (RS) specific for pAMF and a tRNA specific for p-azidophenylalanine.
41. A method for expressing a recombinant protein in an RF1-deficient E. coli bacterial cell comprising the steps of: culturing the RF1-deficient E. coli bacterial cell and an expression cassette for expressing the recombinant protein, wherein the coding sequences for one or more or all of hda, lpxK, coaD, lolA, mreC, murF, and hemA in the RF1-deficient E. coli cell comprise non-TAG stop codons, and wherein the RF1-deficient E. coli cell has increased RF2 activity or expression as compared to a control E. coli cell.
42. The method of claim 41, wherein the RF1-deficient E. coli bacterial cell comprises an oxidative cytoplasm.
43. The method of claim 41, wherein the number of the coding sequences of hda, lpxK, coaD, lolA, mreC, murF, and hemA that comprise non-TAG stop codons is 2, 3, 4, 5, 6, or 7.
44. A method for expressing a protein of interest comprising culturing the RF1-deficient E. coli bacterial cell of any one of claims 1-36, wherein the RF1-deficient E. coli bacterial cell comprises an expression cassette comprising a coding sequence for the protein of interest.
45. The method of claim 41, wherein the protein of interest comprises one or more NNAAs.
46. The method of claim 41, wherein the stop codons are non-TAG stop codons due to genetic modifications of the stop codons in the wild-type coding sequences of hda, lpxK, coaD, lolA, mreC, murF, and hemA.
47. The method of claim 41 or 46, wherein the coding sequences of hda, lpxK, coaD, lolA, mreC, murF, and hemA polypeptides comprise polynucleotide sequences of SEQ ID NO: 5, 7, 9, 11, 13, 15, and 17, respectively.
48. The method of any one of claims 41-47, wherein the cell further comprises an aminoacyl-tRNA synthetase.
49. The method of claim 48, wherein the aminoacyl-tRNA synthetase recognizes the TAG stop codon.
50. The method of any one of claims 41-49, wherein the RF1-deficient E. coli cell contains an oxidative cytoplasm.
51. The method of any one of claims 41-50, wherein the RF1-deficient E. coli cell is a K-12 cell.
52. The method of any one of claims 41-51, wherein the RF1-deficient E. coli cell comprises a T246A mutation in the RF2 coding sequence.
53. The method of any one of claims 41-52, wherein the RF1-deficient E. coli cell comprises a stop codon mutation in one or more of sucB, atpE, fabH, and ubiF coding sequences.
54. The method of any one of claims 41-53, wherein the RF1-deficient E. coli cell further comprises a ΔfabR mutation.
55. The method of any one of claims 41-54, wherein the RF1-deficient E. coli strain further expresses an aminoacyl-tRNA synthetase.
56. The method of claim 55, wherein the aminoacyl-tRNA synthetase preferentially aminoacylates to a degree of greater than 90% a tRNA with a non-natural amino acid as compared to the twenty common naturally occurring amino acids.
57. The method of any one of claims 55-56, wherein the aminoacyl-tRNA synthetase recognizes the TAG stop codon.
58. The method of any one of claims 44-57, wherein the protein of interest is selected from an antibody, an antibody fragment, an antibody light chain, an antibody heavy chain, a cytokine, a cytokine fragment, and an immunogenic polypeptide.
59. The method of claim 58, wherein the protein of interest is selected from an antibody, an antibody fragment, an antibody light chain, and an antibody heavy chain.
60. The method of claim 58 or 59, wherein the antibody is a monoclonal antibody.
61. The method of any one of claims 58-60, wherein the antibody is an IgA, an IgD, an IgE, an IgG, or an IgM.
62. The method of any one of claims 58-61, wherein the antibody is humanized or human.
63. The method of any one of claims 58-62, wherein the antibody is aglycosylated.
64. The method of claim 58 or 59, wherein the antibody fragment is selected from an Fv fragment, a Fab fragment, a F(ab’)2 fragment, a Fab’ fragment, an scFv (sFv) fragment, and an scFv-Fc fragment.
65. The method of claim 58, wherein the immunogenic polypeptide is a carrier protein.
66. The method of claim 65, wherein the carrier protein comprises at least one T-cell activating epitope from a protein selected from the group consisting of Corynebacterium diphtheriae toxin, Clostridium tetani tetanospasmin, Haemophilus influenzae protein D, and CRM197.
67. The method of claim 65, wherein the carrier protein comprises an amino acid sequence having at least 80% sequence identity to SEQ ID NO: 33.
68. The method of claim 58, wherein the cytokine is selected from the group consisting of interleukins, interferons, transforming growth factors, and chemokines.
69. The method of any one of claims 41-64, wherein the gene encoding the protein of interest is operably linked to an inducible promoter.
70. The method of claim 69, wherein the inducible promoter is a T7 promoter.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US202363456770P | 2023-04-03 | 2023-04-03 | |
US63/456,770 | 2023-04-03 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2024211306A1 (en) | 2024-10-10 |
Family
ID=90923861
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2024/022668 WO2024211306A1 (en) | 2023-04-03 | 2024-04-02 | Rf1 ko e. coli strains |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2024211306A1 (en) |
Citations (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5168062A (en) | 1985-01-30 | 1992-12-01 | University Of Iowa Research Foundation | Transfer vectors and microorganisms containing human cytomegalovirus immediate-early promoter-regulatory DNA sequence |
WO1996001313A1 (en) | 1994-07-01 | 1996-01-18 | Hermann Bujard | Tetracycline-regulated transcriptional modulators |
WO2002085923A2 (en) | 2001-04-19 | 2002-10-31 | The Scripps Research Institute | In vivo incorporation of unnatural amino acids |
US20140170753A1 (en) | 2012-12-12 | 2014-06-19 | Massachusetts Institute Of Technology | Crispr-cas systems and methods for altering expression of gene products |
US9682934B2 (en) | 2012-08-31 | 2017-06-20 | Sutro Biopharma, Inc. | Modified amino acids |
WO2017189308A1 (en) | 2016-04-19 | 2017-11-02 | The Broad Institute Inc. | Novel crispr enzymes and systems |
US9938516B2 (en) | 2013-10-11 | 2018-04-10 | Sutro Biopharma, Inc. | Non-natural amino acid tRNA synthetases for para-methylazido-L-phenylalanine |
US9988619B2 (en) | 2013-10-11 | 2018-06-05 | Sutro Biopharma, Inc. | Non-natural amino acid tRNA synthetases for pyridyl tetrazine |
US10596270B2 (en) | 2017-09-18 | 2020-03-24 | Sutro Biopharma, Inc. | Anti-folate receptor antibody conjugates, compositions comprising anti-folate receptor antibody conjugates, and methods of making and using anti-folate receptor antibody conjugates |
US10610571B2 (en) | 2017-08-03 | 2020-04-07 | Synthorx, Inc. | Cytokine conjugates for the treatment of proliferative and infectious diseases |
WO2020097385A1 (en) | 2018-11-08 | 2020-05-14 | Sutro Biopharma, Inc. | E coli strains having an oxidative cytoplasm |
US10669540B2 (en) | 2015-06-18 | 2020-06-02 | The Broad Institute, Inc. | CRISPR enzymes and systems |
WO2021222719A1 (en) | 2020-04-30 | 2021-11-04 | Sutro Biopharma, Inc. | Methods of producing full-length antibodies using e.coli |
WO2023015870A1 (en) * | 2021-08-11 | 2023-02-16 | 浙江新码生物医药有限公司 | Method for constructing strain for producing recombinant protein containing unnatural amino acids, and strain obtained therefrom |
Patent Citations (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5385839A (en) | 1985-01-30 | 1995-01-31 | University Of Iowa Research Foundation | Transfer vectors and microorganisms containing human cytomegalovirus immediate-early promoter regulatory DNA sequence |
US5168062A (en) | 1985-01-30 | 1992-12-01 | University Of Iowa Research Foundation | Transfer vectors and microorganisms containing human cytomegalovirus immediate-early promoter-regulatory DNA sequence |
WO1996001313A1 (en) | 1994-07-01 | 1996-01-18 | Hermann Bujard | Tetracycline-regulated transcriptional modulators |
WO2002085923A2 (en) | 2001-04-19 | 2002-10-31 | The Scripps Research Institute | In vivo incorporation of unnatural amino acids |
US20030082575A1 (en) | 2001-04-19 | 2003-05-01 | The Scripps Research Institute | In vivo incorporation of unnatural amino acids |
US20030108885A1 (en) | 2001-04-19 | 2003-06-12 | The Scripps Research Institute | Methods and compositions for the production of orthogonal tRNA-aminoacyltRNA synthetase pairs |
US9682934B2 (en) | 2012-08-31 | 2017-06-20 | Sutro Biopharma, Inc. | Modified amino acids |
US20140170753A1 (en) | 2012-12-12 | 2014-06-19 | Massachusetts Institute Of Technology | Crispr-cas systems and methods for altering expression of gene products |
US9938516B2 (en) | 2013-10-11 | 2018-04-10 | Sutro Biopharma, Inc. | Non-natural amino acid tRNA synthetases for para-methylazido-L-phenylalanine |
US9988619B2 (en) | 2013-10-11 | 2018-06-05 | Sutro Biopharma, Inc. | Non-natural amino acid tRNA synthetases for pyridyl tetrazine |
US10179909B2 (en) | 2013-10-11 | 2019-01-15 | Sutro Biopharma, Inc. | Non-natural amino acid tRNA synthetases for pyridyl tetrazine |
US10669540B2 (en) | 2015-06-18 | 2020-06-02 | The Broad Institute, Inc. | CRISPR enzymes and systems |
WO2017189308A1 (en) | 2016-04-19 | 2017-11-02 | The Broad Institute Inc. | Novel crispr enzymes and systems |
US10610571B2 (en) | 2017-08-03 | 2020-04-07 | Synthorx, Inc. | Cytokine conjugates for the treatment of proliferative and infectious diseases |
US10596270B2 (en) | 2017-09-18 | 2020-03-24 | Sutro Biopharma, Inc. | Anti-folate receptor antibody conjugates, compositions comprising anti-folate receptor antibody conjugates, and methods of making and using anti-folate receptor antibody conjugates |
WO2020097385A1 (en) | 2018-11-08 | 2020-05-14 | Sutro Biopharma, Inc. | E coli strains having an oxidative cytoplasm |
WO2021222719A1 (en) | 2020-04-30 | 2021-11-04 | Sutro Biopharma, Inc. | Methods of producing full-length antibodies using e.coli |
WO2023015870A1 (en) * | 2021-08-11 | 2023-02-16 | 浙江新码生物医药有限公司 | Method for constructing strain for producing recombinant protein containing unnatural amino acids, and strain obtained therefrom |
Non-Patent Citations (50)
Title |
---|
"Organic Chemistry by Fessenden and Fessenden", 1982, WILLARD GRANT PRESS |
"PCR Protocols: A Guide to Methods and Applications", 1990, ACADEMIC PRESS |
ASLUND F ET AL: "Efficient production of disulfide bonded proteins in the cytoplasm in "oxidizing" mutants of E. coli", INNOVATIONS, XX, XX, vol. 10, 1 November 1999 (1999-11-01), pages 11 - 12, XP002456634 * |
AUSUBEL, F. M. ET AL.: "Current Protocols in Molecular Biology", 2012, COLD SPRING HARBOR LABORATORY PRESS |
AZOULAY, M.; VILMONT, M.; FRAPPIER, F.: "Glutamine analogues as Potential Antimalarials", EUR. J. MED. CHEM., vol. 26, 1991, pages 201 - 5, XP023870208, DOI: 10.1016/0223-5234(91)90030-Q |
BARINAGA, SCIENCE, vol. 265, 1994, pages 27 - 28 |
BARTON ET AL.: "Synthesis of Novel α-Amino-Acids and Derivatives Using Radical Chemistry: Synthesis of L- and D-α-Amino-Adipic Acids, L-α-aminopimelic Acid and Appropriate Unsaturated Derivatives", TETRAHEDRON LETT., vol. 43, 1987, pages 4297 - 4308, XP001008711, DOI: 10.1016/S0040-4020(01)90305-9 |
BATZER ET AL., NUCLEIC ACID RES., vol. 19, 1991, pages 5081 |
BAUBONIS ET AL., NUCLEIC ACIDS RESEARCH, vol. 21, 1993, pages 2025 - 2029 |
BINDEREIF; SCHÖN; WESTHOF: "Handbook of RNA Biochemistry", 2005, WILEY-VCH |
CHEN ET AL., ANGEW CHEM INT ED ENGL., vol. 48, no. 22, 2009, pages 4052 - 5 |
COLIGAN ET AL.: "Current Protocols in Protein Science", vol. 1, 2000, JOHN WILEY AND SONS, INC. |
CRAIG, J. C.: "Absolute Configuration of the Enantiomers of 7-Chloro-4 [[4-(diethylamino)-1-methylbutyl]amino]quinoline (Chloroquine)", J. ORG. CHEM., vol. 53, 1988, pages 1167 - 1170 |
DANG ET AL., DEVELOP. GENET., vol. 13, 1992, pages 367 - 375 |
DATSENKO, K. A.; WANNER, B. L.: "One-step inactivation of chromosomal genes in Escherichia coli K-12 using PCR products", PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, vol. 97, no. 12, 2000, pages 6640 - 6645, XP002210218, DOI: 10.1073/pnas.120163297 |
DE MEY: "Promoter knock-in: a novel rational method for the fine tuning of genes", BMC BIOTECHNOL., vol. 10, 24 March 2010 (2010-03-24), pages 26, XP021076423, DOI: 10.1186/1472-6750-10-26 |
FRIEDMAN, O. M.; CHATTERJI, R.: "Synthesis of Derivatives of Glutamine as Model Substrates for Anti-Tumor Agents", J. AM. CHEM. SOC., vol. 81, 1959, pages 3750 - 3752, XP055737782, DOI: 10.1021/ja01523a064 |
FUKUSHIGE ET AL., PROC. NATL. ACAD. SCI., vol. 89, 1992, pages 7905 - 7907 |
GANG YIN ET AL: "RF1 attenuation enables efficient non-natural amino acid incorporation for production of homogeneous antibody drug conjugates", SCIENTIFIC REPORTS, vol. 7, no. 1, 8 June 2017 (2017-06-08), XP055527126, DOI: 10.1038/s41598-017-03192-z * |
GREEN, M.R.; SAMBROOK, J.; AUSUBEL, F. M. ET AL.: "Methods in Enzymology", vol. 152, 1987, ACADEMIC PRESS, INC., article "Berger and Kimmel, Guide to Molecular Cloning Techniques" |
GU ET AL., CELL, vol. 73, 1993, pages 1155 - 1164 |
GUZMAN ET AL., J. BACTERIOL., vol. 177, no. 14, July 1995 (1995-07-01), pages 4121 - 4130 |
HASAN ET AL., GENE, vol. 150, 1994, pages 51 - 56 |
HENIKOFF; HENIKOFF, PROC NATL ACAD SCI USA, vol. 89, 1989, pages 10915 |
HENIKOFF; HENIKOFF, PROC. NATL. ACAD. SCI. USA, vol. 89, 1992, pages 10915 - 10919 |
KIICK ET AL.: "Incorporation of azides into recombinant proteins for chemoselective modification by the Staudinger ligation", PNAS, vol. 99, 2002, pages 19 - 24, XP002904363, DOI: 10.1073/pnas.012583299 |
KING, F. E.; KIDD, D. A. A.: "A New Synthesis of Glutamine and of γ-Dipeptides of Glutamic Acid from Phthalylated Intermediates", J. CHEM. SOC., 1949, pages 3315 - 3319 |
KIRILL A. DATSENKO; BARRY L. WANNER, PROC NATL ACAD SCI U. S. A., vol. 97, no. 12, 6 June 2000 (2000-06-06), pages 6640 - 6645 |
KOSKINEN, A. M. P.; RAPOPORT, H.: "Synthesis of 4-Substituted Prolines as Conformationally Constrained Amino Acid Analogues", J. ORG. CHEM., vol. 54, 1989, pages 1859 - 1866 |
LI ET AL., PROC NATL ACAD SCI U S A, vol. 100, no. 1, 7 January 2003 (2003-01-07), pages 56 - 61 |
MATSOUKAS ET AL., J. MED. CHEM., vol. 38, 1995, pages 4660 - 4669 |
MOL MICROBIOL., vol. 80, no. 1, April 2011 (2011-04-01), pages 195 - 218 |
MORRIS ET AL., NUCLEIC ACIDS RES., vol. 19, 1991, pages 5895 - 5900 |
MUKAI ET AL., SCI. REP., vol. 5, 2015, pages 9699 |
ODELL ET AL., PLANT PHYSIOL., vol. 106, 1994, pages 447 - 458 |
OHTSUKA ET AL., J. BIOL. CHEM., vol. 260, 1985, pages 2605 - 2608 |
RAPOPORT, H.: "Synthesis of Optically Pure Pipecolates from L-Asparagine. Application to the Total Synthesis of (+)-Apovincamine through Amino Acid Decarbonylation and Iminium Ion Cyclization", J. ORG. CHEM., vol. 1989, 1985, pages 1859 - 1866 |
RNA, vol. 16, 2010, pages 1623 - 1633 |
ROSANO, G.; CECCARELLI, E., FRONT. MICROBIOL., vol. 5, 16 April 2014 (2014-04-16) |
ROSSOLINI ET AL., MOL. CELL. PROBES, vol. 8, 1994, pages 91 - 98 |
SATOH, M. ET AL.: "Non-fucosylated therapeutic antibodies as next-generation therapeutic antibodies", EXPERT OPINION ON BIOLOGICAL THERAPY, vol. 6, no. 11, pages 1161 - 1173, XP008078583, DOI: 10.1517/14712598.6.11.1161 |
SAUER, METHODS IN ENZYMOL., vol. 225, 1993, pages 890 - 900 |
SHIMIZU ET AL.: "FEBS Journal", vol. 273, 2006, pages: 4133 - 4140 |
SMOLSKAYA SVIATLANA ET AL: "Escherichia coli Extract-Based Cell-Free Expression System as an Alternative for Difficult-to-Obtain Protein Biosynthesis", INTERNATIONAL JOURNAL OF MOLECULAR SCIENCES, vol. 21, no. 3, 1 February 2020 (2020-02-01), Basel, CH, pages 928, XP093176776, ISSN: 1661-6596, DOI: 10.3390/ijms21030928 * |
SPIRIN; SWARTZ: "Cell-free Protein Synthesis", 2008, WILEY-VCH |
SUBASINGHE ET AL.: "Quisqualic acid analogues: synthesis of beta-heterocyclic 2-aminopropanoic acid derivatives and their activity at a novel quisqualate-sensitized site", J. MED. CHEM., vol. 35, 1992, pages 4602 - 7, XP002198373, DOI: 10.1021/jm00102a014 |
T. MUKAI ET AL: "Codon reassignment in the Escherichia coli genetic code", NUCLEIC ACIDS RESEARCH, vol. 38, no. 22, 1 December 2010 (2010-12-01), pages 8188 - 8195, XP055081209, ISSN: 0305-1048, DOI: 10.1093/nar/gkq707 * |
WALS KIM ET AL: "Unnatural amino acid incorporation in E. coli: current and future applications in the design of therapeutic proteins", FRONTIERS IN CHEMISTRY, vol. 2, 1 April 2014 (2014-04-01), pages 1 - 12, XP055826160, Retrieved from the Internet <URL:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3982533/pdf/fchem-02-00015.pdf> DOI: 10.3389/fchem.2014.00015 * |
YANG W.; MIZUUCHI K., STRUCTURE, vol. 5, 1997, pages 1401 - 1406 |
YOUNGMAN ET AL., ANNU. REV. MICROBIOL., vol. 62, 2008, pages 353 - 373 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP7598243B2 (en) | Methods for optimizing antibody expression | |
US20230002722A1 (en) | E. coli strains having an oxidative cytoplasm | |
WO2023103578A1 (en) | A genetically engineered bacterium and a preparation method and use thereof | |
CN113980880B (en) | Genetically engineered bacterium and application thereof, and method for producing psicose by taking glucose as raw material | |
AU2021264001A1 (en) | Methods of producing full-length antibodies using e.coli | |
US20030153049A1 (en) | Escherichia coli strain secreting human granulocyte colony stimulating factor (g-csf) | |
WO2024211306A1 (en) | Rf1 ko e. coli strains | |
JP2007501622A (en) | Methods for purifying recombinant polypeptides | |
JP4427902B2 (en) | In vivo production method of chemically diverse proteins by introducing non-conventional amino acids | |
US12351850B2 (en) | Methods of producing full-length antibodies using E. coli | |
CN112888787A (en) | Novel bacterial lpp mutants and their use in the secretory production of recombinant proteins | |
AU2001256433C1 (en) | Mutant strains capable of producing chemically diversified proteins by incorporation of non-conventional amino acids | |
WO2024116051A1 (en) | Proteins with minimal n-terminal initiator methionine | |
HK40060526B (en) | E coli strains having an oxidative cytoplasm | |
HK40060526A (en) | E coli strains having an oxidative cytoplasm | |
KR101658082B1 (en) | Method for Preparing Recombinant Protein Using EDA as a Fusion Expression Partner |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 24722869; Country of ref document: EP; Kind code of ref document: A1 |