WO2021205155A2 - C5-modified thymidines - Google Patents

Info

Publication number
WO2021205155A2
Authority
WO
WIPO (PCT)
Prior art keywords
och
oligonucleotide
compound according
halo
group
Prior art date
Application number
PCT/GB2021/050839
Other languages
French (fr)
Other versions
WO2021205155A3 (en
Inventor
Michael CHUN HAO CHEN
Michal ROBERT MATUSZEWSKI
Martin Fox
Gordon ROSS MCINROY
Original Assignee
Nuclera Nucleics Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nuclera Nucleics Ltd
Publication of WO2021205155A2
Publication of WO2021205155A3

Classifications

    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07H SUGARS; DERIVATIVES THEREOF; NUCLEOSIDES; NUCLEOTIDES; NUCLEIC ACIDS
    • C07H19/00 Compounds containing a hetero ring sharing one ring hetero atom with a saccharide radical; Nucleosides; Mononucleotides; Anhydro-derivatives thereof
    • C07H19/02 Compounds containing a hetero ring sharing one ring hetero atom with a saccharide radical; Nucleosides; Mononucleotides; Anhydro-derivatives thereof sharing nitrogen
    • C07H19/04 Heterocyclic radicals containing only nitrogen atoms as ring hetero atom
    • C07H19/06 Pyrimidine radicals
    • C07H19/10 Pyrimidine radicals with the saccharide radical esterified by phosphoric or polyphosphoric acids
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07H SUGARS; DERIVATIVES THEREOF; NUCLEOSIDES; NUCLEOTIDES; NUCLEIC ACIDS
    • C07H21/00 Compounds containing two or more mononucleotide units having separate phosphate or polyphosphate groups linked by saccharide radicals of nucleoside groups, e.g. nucleic acids
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P19/00 Preparation of compounds containing saccharide radicals
    • C12P19/26 Preparation of nitrogen-containing carbohydrates
    • C12P19/28 N-glycosides
    • C12P19/30 Nucleotides
    • C12P19/34 Polynucleotides, e.g. nucleic acids, oligoribonucleotides

Definitions

  • the invention relates to 3'-aminooxy C5 modified thymidine nucleotides.
  • the invention also relates to a method of using said nucleotides in nucleic acid synthesis.
  • the invention further relates to a kit comprising the nucleotides, a terminal transferase enzyme and an initiator oligonucleotide.
  • Nucleic acid synthesis is vital to modern biotechnology. The rapid pace of development in the biotechnology arena has been made possible by the scientific community's ability to artificially synthesise DNA, RNA and proteins.
  • Nucleic acid synthesis is the process by which strands are assembled in the correct order of bases. Once assembled, the strands are frequently required to base-pair with their complementary strands. This involves the complementary pairing of nucleoside bases via hydrogen bonds, which is known as a base pair (bp).
  • the four nucleoside bases found in DNA are Adenine (A), Thymine (T), Cytosine (C), Guanine (G).
  • The A-T base pair (AT) is weaker than the C-G base pair (GC) because it has fewer hydrogen bonds (two versus three, respectively) and weaker stacking interactions.
  • Hybridisation between AT-rich DNA strands is less favourable than between GC-rich strands, an effect that is clearly observable in the melting temperature (Tm).
  • DNA synthesis technology does not meet the demands of the biotechnology industry. Despite being a mature technology, it is practically impossible to synthesise a DNA strand greater than 200 nucleotides in length in viable yield, and most DNA synthesis companies only offer up to 120 nucleotides routinely.
  • an average protein-coding gene is of the order of 2000-3000 contiguous nucleotides
  • a chromosome is at least a million contiguous nucleotides in length and an average eukaryotic genome numbers in the billions of nucleotides.
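The length limit described above can be illustrated with a simple compounding model. This is a minimal sketch, not a figure from the source: it assumes every nucleotide addition succeeds independently with the same per-step efficiency, and the 0.99 value used in the loop is a hypothetical illustration only.

```python
def full_length_yield(step_efficiency: float, length: int) -> float:
    """Fraction of strands reaching full length after `length` additions.

    Assumes (hypothetically) that every addition step succeeds independently
    with the same probability `step_efficiency`.
    """
    if not 0.0 <= step_efficiency <= 1.0:
        raise ValueError("step_efficiency must be between 0 and 1")
    if length < 0:
        raise ValueError("length must be non-negative")
    return step_efficiency ** length


# Illustrative lengths taken from the text: 120 nt (routine), 200 nt (practical
# limit), 2000 nt (an average protein-coding gene).
for n in (120, 200, 2000):
    print(n, f"{full_length_yield(0.99, n):.3g}")
```

Under this assumed model the full-length fraction falls exponentially with strand length, which is one way to see why gene-length strands are out of reach for a stepwise process unless the per-step efficiency is very close to 1.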
  • Known methods of DNA sequencing use template-dependent DNA polymerases to add 3'-reversibly terminated nucleotides to a growing double-stranded substrate.
  • each added nucleotide contains a dye, allowing the user to identify the exact sequence of the template strand.
  • this technology is able to produce strands between 500 and 1000 bp long.
  • this technology is not suitable for de novo nucleic acid synthesis because of the requirement for an existing nucleic acid strand to act as a template.
  • TdT has been shown not to efficiently add nucleoside triphosphates containing 3'-O-reversibly terminating moieties for building up a nascent single-stranded DNA chain necessary for a de novo synthesis cycle.
  • a 3'-O-reversible terminating moiety would prevent a terminal transferase like TdT from catalysing the nucleotide transferase reaction between the 3'-end of a growing DNA strand and the 5'-triphosphate of an incoming nucleoside triphosphate.
  • the inventors have previously discovered that certain modified nucleotides can be incorporated using terminal transferases. Modified nucleotides suitable for terminal transferase extension have been disclosed in, for example, PCT/GB2018/053305.
  • a common reversible terminator is the aminooxy (O-NH2) group. The aminooxy group is converted to OH by treatment with nitrite.
  • In order to improve the quality of nucleic acid synthesis, it is desirable to shorten the length of nucleic acid fragments without reducing the melting temperature. Furthermore, reducing the disparity between the strength of AT and GC base pairs can improve downstream applications such as the assembly of oligonucleotides into larger nucleic acid products. Whilst certain modifications to thymidine nucleosides are known to increase the melting temperature, these modifications have not previously been used in enzymatic synthesis.
  • the compounds described herein enable a method of increasing the AT base pair stability by modifying the 5-position pyrimidine group.
  • An aspect of the present invention relates to a compound according to Formula (1a) or (1b): wherein, R1 is a phosphate or polyphosphate group or salt thereof, optionally containing one or more sulfur atoms;
  • R2 is H, halo, OH, NH2, COOH, COH, C1-3 alkoxy, C1-3 alkyl optionally substituted with OH, NH2 or halo atoms;
  • R3 is selected from H, OH, F, OCH3 or OCH2CH2OMe.
  • a further aspect of the present invention relates to a compound according to Formula (1c) or (1d):
  • R1 is a phosphate or polyphosphate group or salt thereof, optionally containing one or more sulfur atoms;
  • R2 is H, halo, OH, NH2, COOH, COH, C1-3 alkoxy, C1-3 alkyl optionally substituted with OH, NH2 or halo atoms;
  • R3 is selected from H, OH, F, OCH3 or OCH2CH2OMe; and R4 is H or D.
  • a further aspect of the present invention also relates to a method of nucleic acid synthesis comprising reacting compound (1a) or compound (1b) with an oligonucleotide sequence in the presence of a terminal deoxynucleotidyl transferase (TdT) enzyme.
  • a further aspect of the present invention also relates to a method of nucleic acid synthesis comprising reacting compound (1c) or compound (1d) with an oligonucleotide sequence in the presence of a terminal deoxynucleotidyl transferase (TdT) enzyme.
  • a further aspect of the present invention further relates to a kit comprising:
  • TdT terminal deoxynucleotidyl transferase
  • a further aspect of the present invention further relates to a kit comprising:
  • TdT terminal deoxynucleotidyl transferase
  • a further aspect of the present invention relates to an oligonucleotide according to Formula (2a) or (2b): wherein, R1 is an oligonucleotide;
  • R2 is H, halo, OH, NH2, COOH, COH, C1-3 alkoxy, C1-3 alkyl optionally substituted with OH, NH2 or halo atoms;
  • R3 is selected from H, OH, F, OCH3 or OCH2CH2OMe.
  • a further aspect of the present invention relates to an oligonucleotide according to Formula (2c) or (2d): wherein, R1 is an oligonucleotide;
  • R2 is H, halo, OH, NH2, COOH, COH, C1-3 alkoxy, C1-3 alkyl optionally substituted with OH, NH2 or halo atoms;
  • R3 is selected from H, OH, F, OCH3 or OCH2CH2OMe; and R4 is H or D.
  • FIG. 1: Enzymatic incorporation of 2'-deoxy-3'-aminooxy-5-(3-hydroxy-1-butynyl)-uridine triphosphate (T1) into a single stranded nucleic acid initiator.
  • An engineered terminal deoxynucleotidyl transferase (TdT) was used to incorporate either 2'-deoxythymidine triphosphate (T) or T1 into six nucleic acid sequences with different 3'-terminal sequence contexts (SEQ 1-6). Quantitative addition efficiency was observed in all cases.
  • T1 triphosphate is a suitable monomer for de novo enzymatic DNA synthesis.
  • This experiment also demonstrates that T1 behaves similarly to T in a de novo enzymatic DNA synthesis process.
  • a second addition was then performed, whereby the engineered TdT enzyme added either A, C, T, G, or T1 (all in 3'-aminooxy reversibly terminated form).
  • a cleavage solution containing uracil DNA glycosylase (UDG) and N,N'-dimethylethylenediamine (DMED) was then used to cleave DNA from the paramagnetic particles for analysis by denaturing polyacrylamide gel electrophoresis.
  • the DNA was visualised by virtue of the internal cyanine 3 dye. Lane 1: no addition control. Lanes 2 and 3: single addition of T1 triphosphate. Lanes 4-8: addition of T1 followed by A, C, T, G, or T1 for lanes 4 to 8 respectively.
  • the gel clearly shows quantitative addition of T1 to the solid support immobilised initiator strand. Furthermore, the gel demonstrates that multi-cycle de novo enzymatic DNA synthesis can be achieved when T1 forms part or the entirety of the synthesised sequence.
  • the A-T base pair is weaker than the C-G base pair due to having a lower number of hydrogen bonds (two vs three).
  • hybridisation between AT rich DNA strands is less favourable than GC rich strands - an effect that is clearly observable, for example through biophysical characterisation of the melting temperature (Tm).
  • Tm is defined as the temperature at which 50% of the oligonucleotide is duplexed with its perfect complement and 50% is free in solution. Tm is of critical importance for many biochemical techniques including PCR, in-situ hybridisation, and Southern blotting. In order for an AT-rich region to attain the same Tm as a GC-rich region, it must be longer.
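The length penalty for AT-rich regions can be made concrete with the Wallace rule, a textbook rule of thumb for short oligonucleotides (Tm ≈ 2 °C per A/T plus 4 °C per G/C). The rule is not taken from the source and is only a rough approximation; the function names below are hypothetical, and the sketch says nothing about the modified thymidines themselves.

```python
def wallace_tm(seq: str) -> int:
    """Approximate Tm in degrees C using the Wallace rule: 2*(A+T) + 4*(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("sequence may only contain A, C, G and T")
    return 2 * at + 4 * gc


def at_only_length_to_match(reference: str) -> int:
    """Number of A/T bases needed to reach at least the Wallace Tm of `reference`."""
    target = wallace_tm(reference)
    return -(-target // 2)  # ceiling division: each A/T base contributes 2 degrees


gc_rich = "GCGCGCGCGC"
print(wallace_tm(gc_rich), at_only_length_to_match(gc_rich))
```

Under this approximation a pure-AT region has to be about twice as long as a pure-GC region to reach the same estimated Tm, which is the disparity the passage describes.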
  • An aspect of the present invention relates to a compound according to Formula (1a) or (1b): wherein, R1 is a phosphate or polyphosphate group or salt thereof, optionally containing one or more sulfur atoms;
  • R2 is H, halo, OH, NH2, COOH, COH, C1-3 alkoxy, C1-3 alkyl optionally substituted with OH, NH2 or halo atoms;
  • R3 is selected from H, OH, F, OCH3 or OCH2CH2OMe.
  • the oxime can be transformed into aminooxy as part of the unblocking process.
  • R1 can be a phosphate or polyphosphate group.
  • the phosphate groups can be protonated or in salt form.
  • the phosphates can be entirely oxygen, or can contain one or more sulfur atoms.
  • R1 can be a phosphate group.
  • R1 can be a polyphosphate group.
  • R1 can also be a phosphate or polyphosphate group selected from -(PO3)x(PO2S)y(PO3)z where x, y and z are independently 0-5 and x+y+z is 1-5.
  • R1 can also be a phosphate or polyphosphate group having one or more sulfur atoms.
  • R1 can be a phosphate group having one or more sulfur atoms.
  • R1 can be a polyphosphate group having one or more sulfur atoms.
  • the sulfur atom can be in any position on any of the phosphate groups.
  • R1 can further be a monophosphate, diphosphate, triphosphate, tetraphosphate, pentaphosphate, or (alpha-thio)triphosphate group.
  • R1 can be a monophosphate group.
  • R1 can be a diphosphate group.
  • R1 can be a tetraphosphate group.
  • R1 can be a pentaphosphate group.
  • R1 can be an (alpha-thio)triphosphate group.
  • R1 can be a triphosphate group.
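The constraint on the general formula -(PO3)x(PO2S)y(PO3)z (x, y and z each 0-5, with x+y+z between 1 and 5) can be enumerated directly. The sketch below only counts index combinations; the function name is hypothetical, and it makes no claim about which combinations are chemically preferred or how they map onto the named groups.

```python
from itertools import product


def r1_index_combinations() -> list[tuple[int, int, int]]:
    """All (x, y, z) with each index in 0-5 and 1 <= x + y + z <= 5."""
    return [
        (x, y, z)
        for x, y, z in product(range(6), repeat=3)
        if 1 <= x + y + z <= 5
    ]


combos = r1_index_combinations()
# Combinations with y >= 1 contain at least one thiophosphate (PO2S) unit.
with_sulfur = [c for c in combos if c[1] >= 1]
print(len(combos), len(with_sulfur))
```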
  • a further embodiment of the present invention relates to a compound according to Formula (1a) or (1b) wherein R2 can be H, halo, OH, NH2, COOH, COH, C1-3 alkoxy, C1-3 alkyl optionally substituted with OH, NH2 or halo atoms.
  • R2 can be H, F, OH, NH2, COOH, C1-3 alkoxy, C1-3 alkyl optionally substituted with OH, NH2 or F.
  • R2 can be H, F, OH, NH2, C1-3 alkoxy, C1-3 alkyl optionally substituted with OH, NH2 or F.
  • R2 can be H, F, OH, C1-3 alkoxy, C1-3 alkyl optionally substituted with OH or F.
  • R2 can be H.
  • R2 can be halo.
  • R2 can be F.
  • R2 can be Cl.
  • R2 can be Br.
  • R2 can be I.
  • R2 can be NH2.
  • R2 can be C1-3 alkoxy.
  • R2 can be OCH3.
  • R2 can be OCH2CH3.
  • R2 can be OCH2CH2CH3.
  • R2 can be COOH.
  • R2 can be COH.
  • R2 can be C1-3 alkyl optionally substituted with OH.
  • R2 can be CH2OH.
  • R2 can be CH2CH2OH.
  • R2 can be CH2CH2CH2OH.
  • R2 can be CH2CH(OH)CH2.
  • R2 can be NH2.
  • R2 can be OH or CH2OH.
  • R2 can be OH.
  • R2 can be CH2OH.
  • a further embodiment of the present invention relates to a compound according to Formula (1a) or (1b) wherein R3 can be selected from H, OH, F, OCH3 or OCH2CH2OMe.
  • R3 can be OH.
  • R3 can be F.
  • R3 can be OCH3.
  • R3 can be OCH2CH2OMe.
  • R3 can be H.
  • the compound of Formula (1a) or (1b) of the present invention can be:
  • a further aspect of the present invention relates to a method of nucleic acid synthesis comprising reacting compound (1a) or compound (1b) with an oligonucleotide sequence in the presence of a terminal deoxynucleotidyl transferase (TdT) enzyme.
  • the extended sequence can be treated with a nitrite salt.
  • the oxime can be converted to NH2 prior to nitrite exposure, for example by hydrolysis of the oxime.
  • the terminal transferase or modified terminal transferase can be any enzyme capable of template independent strand extension.
  • the modified terminal deoxynucleotidyl transferase (TdT) enzyme can comprise amino acid modifications when compared to a wild type sequence or a truncated version thereof.
  • the terminal transferase can be the homologous amino acid sequence of a terminal deoxynucleotidyl transferase (TdT) enzyme in any species or the homologous amino acid sequence of Polμ, Polβ, Polλ, and Polθ of any species or the homologous amino acid sequence of X family polymerases of any species.
  • Homologous refers to protein sequences between two or more proteins that possess a common evolutionary origin, including proteins from superfamilies in the same species of organism as well as homologous proteins from different species. Such proteins (and their encoding nucleic acids) have sequence homology, as reflected by their sequence similarity, whether in terms of percent identity or by the presence of specific residues or motifs and conserved positions.
  • a variety of protein (and their encoding nucleic acid) sequence alignment tools may be used to determine sequence homology. For example, the Clustal Omega multiple sequence alignment program provided by the European Molecular Biology Laboratory (EMBL) can be used to determine sequence homology or homologous regions.
  • a further embodiment of the present invention relates to the oligonucleotide sequence comprising a solid-supported oligonucleotide sequence.
  • the oligonucleotide sequence comprises 2 or more nucleotides.
  • the oligonucleotide sequence can be between 10 and 500 nucleotides, such as between 20 and 200 nucleotides, in particular between 20 and 50 nucleotides long.
  • a further embodiment of the present invention relates to a method further comprising a reaction step with a nitrite salt.
  • the nitrite salt is sodium nitrite.
  • a further aspect of the present invention relates to a kit comprising:
  • TdT terminal deoxynucleotidyl transferase
  • a further aspect of the present invention relates to a compound according to Formula (1c) or (1d): wherein, R1 is a phosphate or polyphosphate group or salt thereof, optionally containing one or more sulfur atoms;
  • R2 is H, halo, OH, NH2, COOH, COH, C1-3 alkoxy, C1-3 alkyl optionally substituted with OH, NH2 or halo atoms;
  • R3 is selected from H, OH, F, OCH3 or OCH2CH2OMe; and R4 is H or D.
  • R1 can be a phosphate or polyphosphate group.
  • the phosphate groups can be protonated or in salt form.
  • the phosphates can be entirely oxygen, or can contain one or more sulfur atoms.
  • R1 can be a phosphate group.
  • R1 can be a polyphosphate group.
  • R1 can also be a phosphate or polyphosphate group selected from -(PO3)x(PO2S)y(PO3)z where x, y and z are independently 0-5 and x+y+z is 1-5.
  • R1 can also be a phosphate or polyphosphate group having one or more sulfur atoms.
  • R1 can be a phosphate group having one or more sulfur atoms.
  • R1 can be a polyphosphate group having one or more sulfur atoms. The sulfur atom can be in any position on any of the phosphate groups.
  • R1 can further be a monophosphate, diphosphate, triphosphate, tetraphosphate, pentaphosphate, or (alpha-thio)triphosphate group.
  • R1 can be a monophosphate group.
  • R1 can be a diphosphate group.
  • R1 can be a tetraphosphate group.
  • R1 can be a pentaphosphate group.
  • R1 can be an (alpha-thio)triphosphate group.
  • R1 can be a triphosphate group.
  • a further embodiment of the present invention relates to a compound according to Formula (1c) or (1d) wherein R2 can be H, halo, OH, NH2, COOH, COH, C1-3 alkoxy, C1-3 alkyl optionally substituted with OH, NH2 or halo atoms.
  • R2 can be H, F, OH, NH2, COOH, C1-3 alkoxy, C1-3 alkyl optionally substituted with OH, NH2 or F.
  • R2 can be H, F, OH, NH2, C1-3 alkoxy, C1-3 alkyl optionally substituted with OH, NH2 or F.
  • R2 can be H, F, OH, C1-3 alkoxy, C1-3 alkyl optionally substituted with OH or F.
  • R2 can be H.
  • R2 can be halo.
  • R2 can be F.
  • R2 can be Cl.
  • R2 can be Br.
  • R2 can be I.
  • R2 can be NH2.
  • R2 can be C1-3 alkoxy.
  • R2 can be OCH3.
  • R2 can be OCH2CH3.
  • R2 can be OCH2CH2CH3.
  • R2 can be COOH.
  • R2 can be COH.
  • R2 can be C1-3 alkyl optionally substituted with OH.
  • R2 can be CH2OH.
  • R2 can be CH2CH2OH.
  • R2 can be CH2CH2CH2OH.
  • R2 can be CH2CH(OH)CH2.
  • R2 can be NH2.
  • R2 can be OH or CH2OH.
  • R2 can be OH.
  • R2 can be CH2OH.
  • a further embodiment of the present invention relates to a compound according to Formula (1c) or (1d) wherein R3 can be selected from H, OH, F, OCH3 or OCH2CH2OMe.
  • R3 can be OH.
  • R3 can be F.
  • R3 can be OCH3.
  • R3 can be OCH2CH2OMe.
  • R3 can be H.
  • a further embodiment of the present invention relates to a compound according to Formula (1c) or (1d) wherein R4 can be H or D.
  • R4 can be H.
  • R4 can be D.
  • a further aspect of the present invention relates to an oligonucleotide according to Formula (2a) or (2b):
  • R1 is an oligonucleotide
  • R2 is H, halo, OH, NH2, C1-3 alkoxy, C1-3 alkyl optionally substituted with OH, NH2 or halo atoms; and R3 is selected from H, OH, F, OCH3 or OCH2CH2OMe.
  • a further embodiment of the present invention relates to an oligonucleotide according to Formula (2a) or (2b) wherein R2 can be H, halo, OH, NH2, C1-3 alkoxy, C1-3 alkyl optionally substituted with OH, NH2 or halo atoms.
  • R2 can be H.
  • R2 can be halo.
  • R2 can be F.
  • R2 can be Cl.
  • R2 can be Br.
  • R2 can be I.
  • R2 can be NH2.
  • R2 can be C1-3 alkoxy.
  • R2 can be OCH3.
  • R2 can be OCH2CH3.
  • R2 can be OCH2CH2CH3.
  • R2 can be C1-3 alkyl optionally substituted with OH.
  • R2 can be CH2OH.
  • R2 can be CH2CH2OH.
  • R2 can be CH2CH2CH2OH.
  • R2 can be CH2CH(OH)CH2.
  • R2 can be NH2.
  • R2 can be OH or CH2OH.
  • R2 can be OH.
  • R2 can be CH2OH.
  • a further embodiment of the present invention relates to an oligonucleotide according to Formula (2a) or (2b) wherein R3 can be selected from H, OH, F, or OCH3.
  • R3 can be OH.
  • R3 can be F.
  • R3 can be OCH3.
  • R3 can be H.
  • a further aspect of the present invention relates to an oligonucleotide according to Formula (2c) or (2d): wherein, R1 is an oligonucleotide;
  • R2 is H, halo, OH, NH2, COOH, COH, C1-3 alkoxy, C1-3 alkyl optionally substituted with OH, NH2 or halo atoms;
  • R3 is selected from H, OH, F, OCH3 or OCH2CH2OMe; and R4 is H or D.
  • a further embodiment of the present invention relates to an oligonucleotide according to Formula (2c) or (2d) wherein R2 can be H, halo, OH, NH2, C1-3 alkoxy, C1-3 alkyl optionally substituted with OH, NH2 or halo atoms.
  • R2 can be H.
  • R2 can be halo.
  • R2 can be F.
  • R2 can be Cl.
  • R2 can be Br.
  • R2 can be I.
  • R2 can be NH2.
  • R2 can be C1-3 alkoxy.
  • R2 can be OCH3.
  • R2 can be OCH2CH3.
  • R2 can be OCH2CH2CH3.
  • R2 can be C1-3 alkyl optionally substituted with OH.
  • R2 can be CH2OH.
  • R2 can be CH2CH2OH.
  • R2 can be CH2CH2CH2OH.
  • R2 can be CH2CH(OH)CH2.
  • R2 can be NH2.
  • R2 can be OH or CH2OH.
  • R2 can be OH.
  • R2 can be CH2OH.
  • a further embodiment of the present invention relates to an oligonucleotide according to Formula (2c) or (2d) wherein R3 can be selected from H, OH, F, or OCH3.
  • R3 can be OH.
  • R3 can be F.
  • R3 can be OCH3.
  • R3 can be H.
  • a further embodiment of the present invention relates to a compound according to Formula (2c) or (2d) wherein R4 can be H or D.
  • R4 can be H.
  • R4 can be D.
  • Described herein is a process of nucleic acid synthesis using the compounds described herein.
  • the process uses a nucleic acid polymerase, which may be a template independent polymerase or a template dependent polymerase to add a single nucleotide to one or more nucleic acid strands.
  • the strands may be immobilised on a solid support.
  • the process involves cleaving the 3'-aminooxy group and adding a further nucleotide, the base of which may or may not be T.
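The add-then-cleave cycle described above can be sketched as a small state model. This is a toy illustration under stated assumptions, not the patented process: `Strand`, `extend`, `deblock` and `synthesise` are hypothetical names, every step is assumed to go to completion, and the enzyme and cleavage chemistry are reduced to a single boolean flag on the 3' end.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strand:
    sequence: str
    blocked: bool = False  # True while the 3' end carries the aminooxy group


def extend(strand: Strand, base: str) -> Strand:
    """Add one 3'-aminooxy nucleotide; a blocked 3' end cannot be extended."""
    if strand.blocked:
        raise RuntimeError("3' end is blocked; cleave the aminooxy group first")
    return Strand(strand.sequence + base, blocked=True)


def deblock(strand: Strand) -> Strand:
    """Cleave the 3'-aminooxy group so that a further nucleotide can be added."""
    return Strand(strand.sequence, blocked=False)


def synthesise(initiator: str, target: str) -> Strand:
    """Run one extend/deblock cycle per base of `target`, starting from `initiator`."""
    strand = Strand(initiator)
    for base in target:
        strand = deblock(extend(strand, base))
    return strand


print(synthesise("ACGT", "TTA").sequence)
```

The blocked flag is what limits each cycle to a single addition: calling `extend` twice without `deblock` in between raises an error, mirroring the role of the reversible terminator.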
  • nucleic acid synthesis comprising:
  • extension reagents comprising a polymerase or terminal deoxynucleotidyl transferase (TdT) and a compound according to Formula (1a) or (1b):
  • R1 is a phosphate or polyphosphate group or a salt thereof, optionally containing one or more sulfur atoms;
  • R2 is H, halo, OH, NH2, COOH, COH, C1-3 alkoxy, C1-3 alkyl optionally substituted with OH, NH2 or halo atoms;
  • R3 is selected from H, OH, F, OCH3 or OCH2CH2OMe.
  • the nucleic acids synthesised can be any sequence. One or more, possibly all, of the thymine bases will have a modification at the 5-position. A population of different sequences can be synthesised in parallel.
  • heterocyclic bases have exocyclic NH2 groups, for example cytidine, adenine or guanine
  • these groups can optionally be masked by an orthogonal masking agent.
  • the amine masked nitrogenous heterocycles may be N4-masked cytidine, N6-amine masked adenine and N2-amine masked guanine.
  • the masking may be for example an azido (N3) group.
  • references herein to an "amine masking group" refer to any chemical group which is capable of generating or "unmasking" an amine group which is involved in hydrogen bond base-pairing with a complementary base. Most typically the unmasking will follow a chemical reaction, most suitably a simple, single step chemical reaction.
  • the amine masking group will generally be orthogonal to the 3'-O-NH2 blocking group in order to allow selective removal.
  • the purine compounds may be selected from: where R1 and R3 are as defined herein.
  • This embodiment has the advantage of reversibly masking the -NH2 group. While blocked in the -N3 state, the base (B) is impervious to deamination (e.g., deamination in the presence of sodium nitrite). The base (B) in the N-blocked form is incapable of forming secondary structures via base pairing. Thus even blocking a subset of the free amino groups in the nucleic acid polymer improves the availability of the 3'-end for further extension.
  • the canonical cytosine, adenine, guanine can be respectively recovered from 4-azido cytosine, 6-azido adenine and 2-azido guanine by exposure to a reducing agent (e.g., TCEP).
  • nucleic acid synthesis may be readily applied to methods of enzymatic nucleic acid synthesis which are well known to the person skilled in the art.
  • Non-limiting methods of nucleic acid synthesis may be found in WO 2016/128731, WO 2016/139477, WO 2017/009663, GB 1613185.6 and GB 1714827.1, the contents of each of which are herein incorporated by reference.
  • Enzymatic nucleic acid synthesis is defined as any process in which a nucleotide is added to a nucleic acid strand through enzymatic catalysis in the presence or absence of a template.
  • a method of enzymatic nucleic acid synthesis could include non-templated de novo nucleic acid synthesis utilizing a PolX family polymerase, such as terminal deoxynucleotidyl transferase, and reversibly terminated 2'-deoxynucleoside 5'-triphosphates or ribonucleoside 5'-triphosphates.
  • Another method of enzymatic nucleic acid synthesis could include templated nucleic acid synthesis, including sequencing-by-synthesis.
  • Reversibly terminated enzymatic nucleic acid synthesis is defined as any process in which a reversibly terminated nucleotide is added to a nucleic acid strand through enzymatic catalysis in the presence or absence of a template.
  • the method of enzymatic nucleic acid synthesis is selected from a method of reversibly terminated enzymatic nucleic acid synthesis and a method of templated and non-templated de novo enzymatic nucleic acid synthesis.
  • nucleoside triphosphates refer to a molecule containing a nucleoside (i.e. a base attached to a deoxyribose or ribose sugar molecule) bound to three phosphate groups.
  • Examples of nucleoside triphosphates that contain deoxyribose are: deoxyadenosine triphosphate (dATP), deoxyguanosine triphosphate (dGTP), deoxycytidine triphosphate (dCTP) or deoxythymidine triphosphate (dTTP).
  • Examples of nucleoside triphosphates that contain ribose include adenosine triphosphate (ATP), guanosine triphosphate (GTP), cytidine triphosphate (CTP) or uridine triphosphate (UTP).
  • Other types of nucleosides may be bound to three phosphates to form nucleoside triphosphates, such as naturally occurring modified nucleosides and artificial nucleosides.
  • references herein to '3'-blocked nucleoside triphosphates' refer to nucleoside triphosphates (e.g., dATP, dGTP, dCTP or dTTP) which have an additional group on the 3' end which prevents further addition of nucleotides, i.e., by replacing the 3'-OH group with a protecting group.
  • the protecting group is NH2 or a protected version thereof.
  • references herein to a 'DNA initiator sequence' refer to a small sequence of DNA which the 3'-blocked nucleoside triphosphate can be attached to, i.e., DNA will be synthesised from the end of the DNA initiator sequence.
  • the initiator sequence is between 5 and 10 nucleotides long, such as between 10 and 60 nucleotides long, in particular between 20 and 50 nucleotides long.
  • the initiator sequence is single-stranded. In an alternative embodiment, the initiator sequence is double-stranded. In another embodiment, the initiator sequence has a single stranded portion and a double stranded portion. It will be understood by persons skilled in the art that a 3'-overhang (i.e., a free 3'-end) allows for efficient addition.
  • the initiator sequence is immobilised on a solid support. This allows the enzyme and the cleaving agent to be removed without washing away the synthesised nucleic acid.
  • the initiator sequence may be attached to a solid support stable under aqueous conditions so that the method can be easily performed via a flow setup.
  • the initiator sequence is immobilised on a solid support via a reversible interacting moiety, such as a chemically-cleavable linker, an antibody/immunogenic epitope, a biotin/biotin binding protein (such as avidin or streptavidin), or glutathione-GST tag. Therefore, in a further embodiment, the method additionally comprises extracting the resultant nucleic acid by removing the reversible interacting moiety in the initiator sequence, such as by incubating with proteinase K.
  • the initiator sequence contains a base or base sequence recognisable by an enzyme.
  • a base recognised by an enzyme such as a glycosylase, may be removed to generate an abasic site which may be cleaved by chemical or enzymatic means.
  • An example of such a glycosylase system includes the presence of a uracil base in the initiator sequence, which may be excised with uracil DNA glycosylase (UDG) to leave an abasic site which may be cleaved with, for example, basic solutions, organic amines, or an endonuclease (such as endonuclease VIII), to release a nucleic acid bearing a 5'-phosphate into solution.
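The uracil-based release step can be pictured with a string model. This is a simplified sketch only: the function name and the example sequence are hypothetical, the strand is written 5' to 3' with its 5' end attached to the support, and the code assumes that strand scission at the abasic site frees everything 3' of the last uracil.

```python
def released_product(immobilised: str) -> str:
    """Return the portion 3' of the last uracil in a support-bound strand.

    Models uracil excision followed by cleavage at the resulting abasic site;
    the uracil itself is lost and the upstream part stays on the support.
    """
    position = immobilised.upper().rfind("U")
    if position == -1:
        raise ValueError("no uracil present, so no cleavage site is generated")
    return immobilised[position + 1:]


# Hypothetical initiator "TTTTTUACGT" extended with "GATTACA".
print(released_product("TTTTTUACGT" + "GATTACA"))
```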
  • a base sequence may be recognised and cleaved by a restriction enzyme.
  • the initiator sequence is immobilised on a solid support via an orthogonal chemically-cleavable linker, such as a disulfide, allyl, or azide-masked hemiaminal ether linker. Therefore, in one embodiment, where the N-masking group is not azido, the method additionally comprises extracting the resultant nucleic acid by cleaving the chemical linker through the addition of tris(2-carboxyethyl)phosphine (TCEP) or dithiothreitol (DTT) for a disulfide linker; palladium complexes for an allyl linker; or TCEP for an azide-masked hemiaminal ether linker.
  • the resultant nucleic acid is extracted and amplified by polymerase chain reaction using the nucleic acid bound to the solid support as a template.
  • the initiator sequence could therefore contain an appropriate forward primer sequence and an appropriate reverse primer could be synthesised.
  • the terminal deoxynucleotidyl transferase (TdT) of the invention is added in the presence of an extension solution comprising one or more buffers (e.g., Tris or cacodylate), one or more salts (e.g., Na+, K+, Mg2+, Mn2+, Cu2+, Zn2+, Co2+, etc., all with appropriate counterions, such as Cl-) and inorganic pyrophosphatase (e.g., the Saccharomyces cerevisiae homolog).
  • an inorganic pyrophosphatase helps to reduce the build-up of pyrophosphate due to nucleoside triphosphate hydrolysis by TdT. Therefore, the use of an inorganic pyrophosphatase has the advantage of reducing the rate of (1) backwards reaction and (2) TdT strand dismutation.
  • step (b) is performed at a pH range between 5 and 10. Therefore, it will be understood that any buffer with a buffering range of pH 5-10 could be used, for example cacodylate, Tris, HEPES or Tricine, in particular cacodylate or Tris.
  • the compounds of the invention can be used on a device for nucleic acid synthesis.
  • a solid support in the form of, for example, a planar array and further a plurality of beads onto which a plurality of immobilized initiation oligonucleotide sequences are attached.
  • the beads may be porous and a portion of the, optionally porous, beads are selected as anchors and unselected beads are exposed to harvest solution to cleave them from their solid support to release the oligonucleotide sequences into solution.
  • the term solid support can refer to an array having a plurality of beads which may or may not be immobilised.
  • the oligonucleotides may be attached to, or removed from beads whilst on the array.
  • the immobilised oligonucleotide may be attached to a bead, which remains in a fixed position on the array whilst other beads in other locations are subject to cleavage conditions to detach the oligonucleotides from the beads (the beads may or may not be immobilised).
  • the solid support can take the form of a digital microfluidic device.
  • Digital microfluidic devices consist of a plurality of electrodes arranged on a surface.
  • a dielectric layer e.g., aluminum oxide
  • a hydrophobic coating e.g., perfluorinated hydrocarbon polymer
  • the electrodes may be hardwired or formed from an active matrix thin film transistor (AM-TFT).
  • AM-TFT active matrix thin film transistor
  • the solid support can take the form of a digital microfluidic device.
  • Digital microfluidic devices consist of a plurality of electrodes arranged on a surface. These electrodes can be addressed in a passive manner or by active matrix methods. Passive addressing is a direct address where actuation signals are directly applied on individual electrode (for example by means of a hard-wired connection to that electrode in a single layer or multilayer fashion such as a printed circuit board, PCB).
  • PCB printed circuit board
  • a limitation of direct drive methods is the inability to process large numbers of droplets due to difficulties in addressing large numbers of direct drive electrodes.
  • MxN electrodes can be controlled by M+N pins, significantly reducing the number of control pins.
  • An AM-TFT digital microfluidic device comprises a dielectric layer (e.g., aluminum oxide) deposited over the electrode layer on the thin-film transistor layer followed by a hydrophobic coating (e.g., perfluorinated hydrocarbon polymer) atop the dielectric layer.
  • a dielectric layer e.g., aluminum oxide
  • a hydrophobic coating e.g., perfluorinated hydrocarbon polymer
  • aqueous droplets may be actuated across the surface immersed in oil, air, or another fluid.
  • Enzymatic oligonucleotide synthesis can be deployed on a digital microfluidic device in several ways.
  • An initiator oligonucleotide can be immobilized via the 5'-end on super paramagnetic beads or directly to the hydrophobic surface of the digital microfluidic device.
  • a plurality of distinct positions containing immobilized initiator oligonucleotides on the digital microfluidic device may be present (henceforth named synthesis zones).
  • Solutions required for enzymatic oligonucleotide synthesis are then dispensed from multiple reservoirs onto the device.
  • an addition solution containing the components necessary for the TdT-mediated incorporation of reversibly terminated nucleoside 5'-triphosphates onto immobilized initiator oligonucleotides can be dispensed from a reservoir in droplets and actuated to the aforementioned positions containing immobilized initiator oligonucleotides.
  • each reservoir (and thus each droplet containing addition solution) can contain a distinct nitrogenous base reversibly terminated nucleoside 5'-triphosphate identity or a mixture thereof in order to control the sequence synthesized on aforementioned positions containing immobilized initiator oligonucleotides.
  • the method can be implemented on continuous flow microfluidic devices.
  • One such device consists of a surface with a plurality of microwells each containing a bead. On said bead, an oligonucleotide initiator can be immobilized.
  • each microwell can contain an electrode to perform electrochemistry.
  • the use of the modified thymine bases improves the quality of the synthesised strands due to lowering the length required in order to obtain a hybridising strand.
  • T1-oxime. Scheme 3: Oxime deprotection of T1-oxime to yield T1. Prior to use, the oxime can be removed by incubation of the triphosphate T1-oxime in a solution of 1 M sodium acetate pH 5.5, 1.5% w/v methoxylamine, and ultrapure water for 60 minutes at room temperature.
  • Zinc chloride (0.49 g, 3.6 mmol) was placed in a 50 mL round bottomed flask. Acetic anhydride (8.0 mL, 86 mmol) was added, then 3-butyn-1-ol (5.4 mL, 5.0 g, 71 mmol) was added to the suspension over 20 minutes. The solution was stirred at room temperature for 30 minutes, then water (30 mL) was added over 5 minutes. The mixture was stirred vigorously for 30 minutes, when the layers were separated. The organic phase was washed with water (30 mL) and sodium bicarbonate solution (10 mL).
  • the solvent was evaporated from the bulk using a rotary evaporator, then 1 M citric acid (10 mL) and ethyl acetate (30 mL) were added, the mixture was shaken and the layers were separated. The organic layer was washed with 1 M citric acid (2 x 10 mL) and saturated sodium bicarbonate solution (10 mL). The organic layer was dried (Na2SO4) and filtered. The solvent was removed using a rotary evaporator to give a white solid (1.31 g), which was dissolved in hot acetone (15 mL). The solution was allowed to cool to room temperature, then heptane (15 mL) was added over 90 minutes.
  • the white suspension was stirred at room temperature for 1 h, filtered and the solid was washed with heptane-acetone (1:1, 2 x 5 mL) and dried under vacuum, to give 5'(O)-tert-butyldimethylsilyl-2'-deoxy-5-iodouridine as a fine white solid (652 mg). Additional heptane (20 mL) was added to the filtrate; this was stirred at room temperature for 30 minutes, then filtered and the solid was washed with heptane-acetone (3:1, 2 x 5 mL) and dried under vacuum, to give a second crop as a fine white solid (257 mg). Additional solid precipitated from the filtrate.
  • the pale yellow solution was stirred for a further 1 h, when the solvent was removed using a rotary evaporator. Toluene (10 mL) was added and the solvent was evaporated again. Toluene (20 mL) was added, and the solution was washed with saturated sodium bicarbonate solution (10 mL). The mixture was shaken and the layers were separated. The organic layer was washed with saturated sodium bicarbonate solution (10 mL), dried (Na2SO4), filtered and the solution was concentrated using a rotary evaporator to ~5 mL. The pale yellow solution was placed in the freezer (-20°C) for 16 h, then allowed to warm to room temperature, and the solvent was evaporated again.
  • the suspension was stirred at 0-5°C for 30 minutes, then allowed to warm to room temperature. After 1 h, the reaction was quenched with water (0.1 mL), and the solution was stirred for a further 30 minutes. Toluene (15 mL) was added, followed by saturated sodium bicarbonate solution (15 mL), the mixture was shaken, the layers were separated and the organic phase was washed with saturated sodium bicarbonate solution (10 mL + 2 x 5 mL), dried (Na2SO4) and filtered. The solution was concentrated using a rotary evaporator to about 10 mL, then seeded with a small quantity of triphenylphosphine oxide-diisopropylhydrazine dicarboxylate complex.
  • trans-Dichlorobis(triphenylphosphine)palladium(II) 7 mg, 0.0096 mmol
  • copper(I) iodide 7 mg, 0.053 mmol
  • the reaction flask was purged with nitrogen and heated in a heating block at 60°C for 1 h. After cooling, most of the solvent was removed using a rotary evaporator, then ethyl acetate (5 mL) and 5% disodium EDTA solution (5 mL) were added. The layers were separated, and the organic phase was washed with disodium EDTA (5 mL).
  • the organic phase was dried (Na2SO4), filtered and the solvent was removed using a rotary evaporator.
  • the product was purified on a short silica column eluting with heptane-ethyl acetate (1:1).
  • the solid product was dissolved in toluene (2 mL). Heptane (3 mL) was added while stirring over 15 minutes.
  • Ethoxytrimethylsilane (1.0 mL, 6.5 mmol) was added over 2 minutes, the solution was stirred for 30 minutes then saturated sodium bicarbonate solution (5 mL) and ethyl acetate (5 mL) were added. The phases were separated. The aqueous phase was extracted with ethyl acetate (2 x 5 mL), the combined organic phases were dried (Na2SO4) and filtered and the solvent was removed using a rotary evaporator to give a white solid.
  • the suspension was stirred at room temperature for 1 h, when most of the solvent was removed using a rotary evaporator (bath temperature 30°C).
  • the residue was dissolved in water (2 mL) and split between two 50 mL centrifuge tubes.
  • a white emulsion formed.
  • the upper layer was decanted to leave the lower, immiscible oily phase. This was dissolved in water (2 x 0.5 mL).
  • the oily residues were dissolved in water (total volume 2 mL) and the crude triphosphate solution obtained was purified by reverse phase HPLC using a Supelco Ascentis C18 column (25 cm x 10 mm, 5 µm), flow rate 3 mL/min, and a gradient of A: 100 mM triethylammonium bicarbonate pH 7.5, B: Acetonitrile; A to 30% B over 32 minutes (8 runs).
  • the product-containing fractions were combined and the solvent was evaporated, then methanol (10 mL) was added to the residue and the solvent was evaporated using a rotary evaporator.
  • T1 2'-deoxy-3'-aminooxy-5-(3-hydroxy-1-butynyl)-uridine triphosphate
  • TdT An engineered terminal deoxynucleotidyl transferase (TdT) was used to incorporate T1 into six nucleic acid sequences (SEQs 1 to 6) with different 3'-terminal sequence contexts (see Figure 1). Quantitative addition efficiency was observed in all cases. This demonstrates that the T1 triphosphate is a suitable monomer for de novo enzymatic DNA synthesis.
  • T1 2'-deoxy-3'-aminooxy-5-(3-hydroxy-1-butynyl)-uridine triphosphate
  • Initiator oligonucleotide of SEQ 7 was immobilised on super paramagnetic particles to facilitate retention through multiple cycles of DNA synthesis.
  • An engineered TdT enzyme mediated the first addition of T1 to the initiator oligonucleotide.
  • Sodium nitrite in acetate buffer was then employed to deblock the 3'-aminooxy reversible terminator, revealing a 3'-hydroxyl competent for further addition.
  • a second addition was then performed, whereby the engineered TdT enzyme added either A, C, T, G, or T1 (all in 3'-aminooxy reversibly terminated form).
  • a cleavage solution containing uracil DNA glycosylase (UDG) and N,N'-dimethylethylene diamine (DMED) was then used to cleave DNA from the paramagnetic particles for analysis by denaturing polyacrylamide gel electrophoresis.
  • the DNA was visualised by virtue of the internal cyanine 3 dye (see Figure 2). Lane 1: no addition control. Lanes 2 and 3: single addition of T1 triphosphate. Lanes 4-8: addition of T1 followed by A, C, T, G, or T1 for lanes 4 to 8 respectively.
  • an asterisk (*) indicates a phosphorothioate linkage and /icy3/ indicates an internal cyanine 3 dye.
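
As a rough illustration of the addressing point in the list above (M x N electrodes controlled by M + N pins under active-matrix addressing, versus one hard-wired connection per electrode under direct drive), the following Python sketch compares idealised pin counts. The function name `control_pins` and the example grid sizes are illustrative assumptions, and real devices may need additional pins for power, ground, or reference signals.

```python
def control_pins(rows: int, cols: int, active_matrix: bool) -> int:
    """Return an idealised control-pin count for a rows x cols electrode grid.

    Assumption: direct drive uses one pin per electrode (rows * cols), while
    active-matrix addressing uses one pin per row plus one per column.
    """
    if rows <= 0 or cols <= 0:
        raise ValueError("rows and cols must be positive")
    return rows + cols if active_matrix else rows * cols


if __name__ == "__main__":
    for m, n in [(8, 8), (64, 64), (256, 256)]:  # hypothetical grid sizes
        direct = control_pins(m, n, active_matrix=False)
        matrix = control_pins(m, n, active_matrix=True)
        print(f"{m} x {n}: direct drive {direct} pins, active matrix {matrix} pins")
```

The gap widens quickly with grid size, which is consistent with the stated difficulty of addressing large numbers of direct drive electrodes.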

Landscapes

  • Chemical & Material Sciences (AREA)
  • Organic Chemistry (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biochemistry (AREA)
  • Engineering & Computer Science (AREA)
  • Biotechnology (AREA)
  • General Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Wood Science & Technology (AREA)
  • Zoology (AREA)
  • Microbiology (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • General Chemical & Material Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Saccharide Compounds (AREA)
  • Phenolic Resins Or Amino Resins (AREA)

Abstract

The present invention relates to a compound according to Formula (1c) or (1d): wherein, R1, R2, R3 and R4 are disclosed herein.

Description

C5-modified Thymidines
FIELD OF THE INVENTION
The invention relates to 3'-aminooxy C5 modified thymidine nucleotides. The invention also relates to a method of using said nucleotides in nucleic acid synthesis. The invention further relates to a kit comprising the nucleotides, a terminal transferase enzyme and an initiator oligonucleotide.
BACKGROUND TO THE INVENTION
Nucleic acid synthesis is vital to modern biotechnology. The rapid pace of development in the biotechnology arena has been made possible by the scientific community's ability to artificially synthesise DNA, RNA and proteins.
Nucleic acid synthesis is the process by which strands are assembled in the correct order of bases. Once assembled, the strands are frequently required to base-pair with their complementary strands. This involves the complementary pairing of nucleoside bases via hydrogen bonds, which is known as a base pair (bp). The four nucleoside bases found in DNA are Adenine (A), Thymine (T), Cytosine (C), Guanine (G). In RNA the T is replaced by Uracil (U), with U being found only in RNA.
Watson-Crick base pairs follow specific hydrogen bonding patterns. The A-T base pair (AT) is weaker than the C-G base pair (GC) due to having a lower number of hydrogen bonds (two vs three, respectively) and weaker stacking interactions. As a result, hybridisation between AT rich DNA strands is less favourable than GC rich strands - an effect that is clearly observable in the melting temperature (Tm). In order for an AT rich region to attain an equal Tm to a GC rich region, it must have a longer length.
Current DNA synthesis technology does not meet the demands of the biotechnology industry. Despite being a mature technology, it is practically impossible to synthesise a DNA strand greater than 200 nucleotides in length in viable yield, and most DNA synthesis companies only offer up to 120 nucleotides routinely. In comparison, an average protein-coding gene is of the order of 2000-3000 contiguous nucleotides, a chromosome is at least a million contiguous nucleotides in length and an average eukaryotic genome numbers in the billions of nucleotides. In order to prepare nucleic acid strands thousands of base pairs in length, all major gene synthesis companies today rely on variations of a 'synthesise and stitch' technique, where overlapping 40-60-mer fragments are synthesised and stitched together by enzymatic copying and extension. Current methods generally allow up to 3 kb in length for routine production.
DNA cannot be routinely synthesised beyond 120-200 nucleotides at a time because the current methodology for generating DNA uses synthetic chemistry (i.e., phosphoramidite technology) to couple nucleotides one at a time. Even if each nucleotide-coupling step is 99% efficient, it is mathematically impossible to synthesise DNA longer than 200 nucleotides in acceptable yields. The Venter Institute illustrated this laborious process by spending 4 years and 20 million USD to synthesise the relatively small genome of a bacterium.
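
The yield argument can be made concrete with a short calculation. The Python sketch below assumes that every coupling step succeeds independently with the same efficiency and that one coupling is needed per nucleotide; the function name `full_length_yield` and the chosen lengths are illustrative, not taken from this document.

```python
def full_length_yield(coupling_efficiency: float, length: int) -> float:
    """Fraction of strands that are full length after `length` coupling steps.

    Assumption: each step succeeds independently with the same probability,
    so the full-length fraction is efficiency ** length.
    """
    if not 0.0 <= coupling_efficiency <= 1.0:
        raise ValueError("coupling_efficiency must be between 0 and 1")
    if length < 0:
        raise ValueError("length must be non-negative")
    return coupling_efficiency ** length


if __name__ == "__main__":
    for n in (60, 120, 200, 1000):  # illustrative strand lengths in nucleotides
        print(f"{n:>5} nt at 99% per step: {full_length_yield(0.99, n):.2%} full length")
```

Under these assumptions, a 200-nucleotide strand made at 99% stepwise efficiency is only about 13% full-length product, and the fraction keeps falling exponentially with length.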
Known methods of DNA sequencing use template-dependent DNA polymerases to add 3'-reversibly terminated nucleotides to a growing double-stranded substrate. In the 'sequencing-by-synthesis' process, each added nucleotide contains a dye, allowing the user to identify the exact sequence of the template strand. Albeit on double-stranded DNA, this technology is able to produce strands 500-1000 bp long. However, this technology is not suitable for de novo nucleic acid synthesis because of the requirement for an existing nucleic acid strand to act as a template.
Various attempts have been made to use a terminal deoxynucleotidyl transferase for de novo single-stranded DNA synthesis. Uncontrolled de novo single-stranded DNA synthesis, as opposed to controlled, takes advantage of TdT's deoxynucleoside triphosphate (dNTP) 3' tailing properties on single-stranded DNA to create, for example, homopolymeric adaptor sequences for next-generation sequencing library preparation. In controlled extensions, a reversible deoxynucleoside triphosphate termination technology needs to be employed to prevent uncontrolled addition of dNTPs to the 3'-end of a growing DNA strand. The development of a controlled single-stranded DNA synthesis process through TdT would be invaluable to in situ DNA synthesis for gene assembly or hybridization microarrays as it removes the need for an anhydrous environment and allows the use of various polymers incompatible with organic solvents.
However, TdT has been shown not to efficiently add nucleoside triphosphates containing 3'-O-reversibly terminating moieties for building up a nascent single-stranded DNA chain necessary for a de novo synthesis cycle. A 3'-O-reversible terminating moiety would prevent a terminal transferase like TdT from catalysing the nucleotide transferase reaction between the 3'-end of a growing DNA strand and the 5'-triphosphate of an incoming nucleoside triphosphate. The inventors have previously discovered certain modified nucleotides can be incorporated using terminal transferases. Modified nucleotides suitable for terminal transferase extension have been disclosed in for example PCT/GB2018/053305. A common reversible terminator is the aminooxy (O-NH2) group. The aminooxy group is converted to OH by treatment with nitrite.
In order to improve the quality of nucleic acid synthesis, it is desirable to shorten the length of nucleic acid fragments without reducing the melting temperature. Furthermore, reducing the disparity between the strength of AT and GC base pairs can improve downstream applications such as assembly of oligonucleotides into larger nucleic acid products. Whilst certain modifications to thymidine nucleosides are known to increase the melting temperature, these modifications have not previously been used in enzymatic synthesis.
SUMMARY OF THE INVENTION
The compounds described herein enable a method of increasing the AT base pair stability by modifying the 5-position pyrimidine group.
An aspect of the present invention relates to a compound according to Formula (1a) or (1b):
Figure imgf000004_0001
wherein, R1 is a phosphate or polyphosphate group or salt thereof, optionally containing one or more sulfur atoms;
R2 is H, halo, OH, NH2, COOH, COH, C1-3 alkoxy, C1-3 alkyl optionally substituted with OH, NH2 or halo atoms; and
R3 is selected from H, OH, F, OCH3 or OCH2CH2OMe.
A further aspect of the present invention relates to a compound according to Formula (1c) or (1d):
Figure imgf000005_0001
wherein, R1 is a phosphate or polyphosphate group or salt thereof, optionally containing one or more sulfur atoms;
R2 is H, halo, OH, NH2, COOH, COH, C1-3 alkoxy, C1-3 alkyl optionally substituted with OH, NH2 or halo atoms;
R3 is selected from H, OH, F, OCH3 or OCH2CH2OMe; and R4 is H or D.
A further aspect of the present invention also relates to a method of nucleic acid synthesis comprising reacting compound (1a) or compound (1b) with an oligonucleotide sequence in the presence of a terminal deoxynucleotidyl transferase (TdT) enzyme.
A further aspect of the present invention also relates to a method of nucleic acid synthesis comprising reacting compound (1c) or compound (1d) with an oligonucleotide sequence in the presence of a terminal deoxynucleotidyl transferase (TdT) enzyme.
A further aspect of the present invention further relates to a kit comprising:
(i) a compound according to any one of Formula (1a) or (1b);
(ii) a terminal deoxynucleotidyl transferase (TdT) enzyme; and optionally
(iii) a nitrite salt.
A further aspect of the present invention further relates to a kit comprising:
(i) a compound according to any one of Formula (1c) or (1d);
(ii) a terminal deoxynucleotidyl transferase (TdT) enzyme; and optionally
(iii) a nitrite salt.
A further aspect of the present invention relates to an oligonucleotide according to Formula (2a) or (2b):
Figure imgf000006_0001
wherein, R1 is an oligonucleotide;
R2 is H, halo, OH, NH2, COOH, COH, C1-3 alkoxy, C1-3 alkyl optionally substituted with OH, NH2 or halo atoms; and
R3 is selected from H, OH, F, OCH3 or OCH2CH2OMe.
A further aspect of the present invention relates to an oligonucleotide according to Formula (2c) or (2d):
Figure imgf000006_0002
wherein, R1 is an oligonucleotide;
R2 is H, halo, OH, NH2, COOH, COH, C1-3 alkoxy, C1-3 alkyl optionally substituted with OH, NH2 or halo atoms;
R3 is selected from H, OH, F, OCH3 or OCH2CH2OMe; and R4 is H or D.
BRIEF DESCRIPTION OF THE FIGURES
Figure 1: Enzymatic incorporation of 2'-deoxy-3'-aminooxy-5-(3-hydroxy-1-butynyl)-uridine triphosphate (T1) into a single stranded nucleic acid initiator. An engineered terminal deoxynucleotidyl transferase (TdT) was used to incorporate either 2'-deoxythymidine triphosphate (T) or T1 into six nucleic acid sequences with different 3'-terminal sequence contexts (SEQ 1-6). Quantitative addition efficiency was observed in all cases. This demonstrates that the T1 triphosphate is a suitable monomer for de novo enzymatic DNA synthesis. This experiment also demonstrates that T1 behaves similarly to T in a de novo enzymatic DNA synthesis process.
SEQ 1: /5Cy3/CAATCAGGTGAAG
SEQ 2: /5Cy3/CAATCAGGTGAAA
SEQ 3: /5Cy3/CAATCAGGTGAAT
SEQ 4: /5Cy3/CAATCAGGTGAAC
SEQ 5: /5Cy3/CAATCAGGTGTTT
SEQ 6: /5Cy3/CAATCAGGTGTTU
Figure 2. Denaturing polyacrylamide gel electrophoresis analysis of multi-cycle de novo enzymatic DNA synthesis involving 2'-deoxy-3'-aminooxy-5-(3-hydroxy-1-butynyl)-uridine triphosphate (T1). Initiator oligonucleotide of sequence 7 (SEQ 7) was immobilised on super paramagnetic particles to facilitate retention through multiple cycles of DNA synthesis. An engineered TdT enzyme mediated the first addition of T1 to the initiator oligonucleotide. Sodium nitrite in acetate buffer was then employed to deblock the 3'-aminooxy reversible terminator, revealing a 3'-hydroxyl competent for further addition. A second addition was then performed, whereby the engineered TdT enzyme added either A, C, T, G, or T1 (all in 3'-aminooxy reversibly terminated form). A cleavage solution containing uracil DNA glycosylase (UDG) and N,N'-dimethylethylene diamine (DMED) was then used to cleave DNA from the paramagnetic particles for analysis by denaturing polyacrylamide gel electrophoresis. The DNA was visualised by virtue of the internal cyanine 3 dye. Lane 1: no addition control. Lanes 2 and 3: single addition of T1 triphosphate. Lanes 4-8: addition of T1 followed by A, C, T, G, or T1 for lanes 4 to 8 respectively. The gel clearly shows quantitative addition of T1 to the solid support immobilised initiator strand. Furthermore, the gel demonstrates that multi-cycle de novo enzymatic DNA synthesis can be achieved when T1 forms part or the entirety of the synthesised sequence.
SEQ 7: T*T*T*T I I I I I I I I I I I I I I I I I I I I I I I I I I I I I TTUTTTT/icy3/TTTTT
Note that an asterisk (*) indicates a phosphorothioate linkage and /icy3/ indicates an internal cyanine 3 dye.
DETAILED DESCRIPTION OF THE INVENTION
The A-T base pair is weaker than the C-G base pair due to having a lower number of hydrogen bonds (two vs three). As a result, hybridisation between AT rich DNA strands is less favourable than GC rich strands - an effect that is clearly observable, for example through biophysical characterisation of the melting temperature (Tm). The Tm is defined as the temperature where 50% of oligonucleotide is duplexed with its perfect complement and 50% is free in solution. Tm is of critical importance for many biochemical techniques including PCR, in-situ hybridisation, and Southern blotting. In order for an AT rich region to attain an equal Tm to a GC rich region, it must have a longer length. Certain 5-position pyrimidine substituents have been shown to increase base pairing stability. A well-known example in the literature is "super T", which is a modified thymidine bearing a 5-position (3-hydroxy-1-butynyl) substituent. Super T increases the Tm by approximately 2°C per T replacement. The ability to synthesise DNA containing 5-position pyrimidine modifications to modulate Tm is therefore of great industrial utility. Notably, 5-position modified pyrimidines can be converted to canonical T through polymerase chain reaction (PCR) amplification with canonical nucleotides as the hydrogen bonding is unchanged; both canonical T and super T base pair with A.
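
The effect described above can be sketched numerically. The example below uses the Wallace rule (roughly 2°C per A or T and 4°C per G or C), a common rule of thumb for short oligonucleotides that is not taken from this document, and adds the approximately 2°C per super T replacement stated above. The symbol `X` for super T and the function name `estimate_tm` are hypothetical, and the resulting numbers are only indicative.

```python
def estimate_tm(sequence: str, super_t_bonus: float = 2.0) -> float:
    """Rough Tm estimate in degrees C for a short oligonucleotide.

    Assumptions: Wallace rule (2 per A/T, 4 per G/C); 'X' is a hypothetical
    symbol for super T, counted as a T plus `super_t_bonus` degrees each.
    """
    seq = sequence.upper()
    unknown = set(seq) - set("ACGTX")
    if unknown:
        raise ValueError(f"unsupported symbols: {sorted(unknown)}")
    super_t = seq.count("X")
    at = seq.count("A") + seq.count("T") + super_t
    gc = seq.count("G") + seq.count("C")
    return 2.0 * at + 4.0 * gc + super_t_bonus * super_t


if __name__ == "__main__":
    for oligo in ("AATTAATT", "AAXXAAXX", "GGCCGGCC"):
        print(f"{oligo}: estimated Tm {estimate_tm(oligo):.0f} C")
```

Under these assumptions, replacing the four T residues of an all-A/T 8-mer with super T raises the estimate from 16 to 24, narrowing the gap to a G/C 8-mer of the same length (32).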
An aspect of the present invention relates to a compound according to Formula (1a) or (1b):
Figure imgf000008_0001
wherein, R1 is a phosphate or polyphosphate group or salt thereof, optionally containing one or more sulfur atoms;
R2 is H, halo, OH, NH2, COOH, COH, C1-3 alkoxy, C1-3 alkyl optionally substituted with OH, NH2 or halo atoms; and
R3 is selected from H, OH, F, OCH3 or OCH2CH2OMe.
The 3'-O-aminooxy reversible terminator precursors may include those where the aminooxy is protected as an oxime, for example N=C(CH3)2. The oxime can be transformed into aminooxy as part of the unblocking process.
R1 can be a phosphate or polyphosphate group. The phosphate groups can be protonated or in salt form. The phosphates can be entirely oxygen, or can contain one or more sulfur atoms. R1 can be a phosphate group. R1 can be a polyphosphate group. R1 can also be a phosphate or polyphosphate group selected from -(PO3)x(PO2S)y(PO3)z where x, y and z are independently 0-5 and x+y+z is 1-5. R1 can also be a phosphate or polyphosphate group having one or more sulfur atoms. R1 can be a phosphate group having one or more sulfur atoms. R1 can be a polyphosphate group having one or more sulfur atoms. The sulfur atom can be in any position on any of the phosphate groups. R1 can further be a monophosphate, diphosphate, triphosphate, tetraphosphate, pentaphosphate, or (alpha-thio)triphosphate group. R1 can be a monophosphate group. R1 can be a diphosphate group. R1 can be a tetraphosphate group. R1 can be a pentaphosphate group. R1 can be an (alpha-thio)triphosphate group. R1 can be a triphosphate group.
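
The index constraint on R1 stated above (x, y and z each 0-5, with x + y + z between 1 and 5) can be enumerated directly. The Python sketch below is illustrative only: the function name `r1_compositions` is hypothetical, and it lists index combinations without making any claim about which are chemically preferred.

```python
from itertools import product


def r1_compositions(max_total: int = 5) -> list[tuple[int, int, int]]:
    """List (x, y, z) for -(PO3)x(PO2S)y(PO3)z with each index in 0-5
    and 1 <= x + y + z <= max_total."""
    return [
        (x, y, z)
        for x, y, z in product(range(6), repeat=3)
        if 1 <= x + y + z <= max_total
    ]


if __name__ == "__main__":
    combos = r1_compositions()
    print(len(combos), "allowed (x, y, z) combinations")
    # Combinations with three phosphate units in total, one of them sulfur-bearing
    print([c for c in combos if sum(c) == 3 and c[1] == 1])
```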
A further embodiment of the present invention relates to a compound according to Formula (1a) or (1b) wherein R2 can be H, halo, OH, NH2, COOH, COH, C1-3 alkoxy, C1-3 alkyl optionally substituted with OH, NH2 or halo atoms; and
R2 can be H, F, OH, NH2, COOH, C1-3 alkoxy, C1-3 alkyl optionally substituted with OH, NH2 or F.
R2 can be H, F, OH, NH2, C1-3 alkoxy, C1-3 alkyl optionally substituted with OH, NH2 or F.
R2 can be H, F, OH, C1-3 alkoxy, C1-3 alkyl optionally substituted with OH or F.
R2 can be H. R2 can be halo. R2 can be F. R2 can be Cl. R2 can be Br. R2 can be I. R2 can be NH2. R2 can be C1-3 alkoxy. R2 can be OCH3. R2 can be OCH2CH3. R2 can be OCH2CH2CH3. R2 can be COOH. R2 can be COH. R2 can be C1-3 alkyl optionally substituted with OH. R2 can be CH2OH. R2 can be CH2CH2OH. R2 can be CH2CH2CH2OH. R2 can be CH2CH(OH)CH2. R2 can be NH2. Preferably, R2 can be OH or CH2OH. R2 can be OH. R2 can be CH2OH.
A further embodiment of the present invention relates to a compound according to Formula (1a) or (1b) wherein R3 can be selected from H, OH, F, OCH3 or OCH2CH2OMe. R3 can be OH. R3 can be F. R3 can be OCH3. R3 can be OCH2CH2OMe. Preferably, R3 can be H.
The compound of Formula (1a) or (1b) of the present invention can be:
Figure imgf000010_0001
or a salt thereof.
A further aspect of the present invention relates to a method of nucleic acid synthesis comprising reacting compound (1a) or compound (1b) with an oligonucleotide sequence in the presence of a terminal deoxynucleotidyl transferase (TdT) enzyme. The extended sequence can be treated with a nitrite salt. Where the compound added is an oxime, the oxime can be converted to NH2 prior to nitrite exposure, for example by hydrolysis of the oxime.
The terminal transferase or modified terminal transferase can be any enzyme capable of template independent strand extension. The modified terminal deoxynucleotidyl transferase (TdT) enzyme can comprise amino acid modifications when compared to a wild type sequence or a truncated version thereof. The terminal transferase can be the homologous amino acid sequence of a terminal deoxynucleotidyl transferase (TdT) enzyme in any species or the homologous amino acid sequence of Polμ, Polβ, Polλ, and Polθ of any species or the homologous amino acid sequence of X family polymerases of any species.
Homologous refers to protein sequences between two or more proteins that possess a common evolutionary origin, including proteins from superfamilies in the same species of organism as well as homologous proteins from different species. Such proteins (and their encoding nucleic acids) have sequence homology, as reflected by their sequence similarity, whether in terms of percent identity or by the presence of specific residues or motifs and conserved positions. A variety of protein (and their encoding nucleic acid) sequence alignment tools may be used to determine sequence homology. For example, the Clustal Omega multiple sequence alignment program provided by the European Molecular Biology Laboratory (EMBL) can be used to determine sequence homology or homologous regions.
A further embodiment of the present invention relates to the oligonucleotide sequence comprising a solid-supported oligonucleotide sequence. The oligonucleotide sequence comprises 2 or more nucleotides. The oligonucleotide sequence can be between 10 and 500 nucleotides, such as between 20 and 200 nucleotides, in particular between 20 and 50 nucleotides long.
A further embodiment of the present invention relates to a method further comprising a reaction step with a nitrite salt. Preferably, the nitrite salt is sodium nitrite.
A further aspect of the present invention relates to a kit comprising:
(i) a compound according to any one of Formula (1a) or (1b);
(ii) a polymerase or terminal deoxynucleotidyl transferase (TdT) enzyme; and optionally
(iii) a nitrite salt.
A further aspect of the present invention relates to a compound according to Formula (1c) or (1d):
Figure imgf000011_0001
wherein, R1 is a phosphate or polyphosphate group or salt thereof, optionally containing one or more sulfur atoms;
R2 is H, halo, OFI, N H2, COOFI, COH, Ci_3 alkoxy, Ci_3 alkyl optionally substituted with OH, NH2 or halo atoms;
R3 is selected from H, OH, F, OCH3 or OCH2CH2OMe; and R4 is H or D.
R1 can be a phosphate or polyphosphate group. The phosphate groups can be protonated or in salt form. The phosphates can be entirely oxygen, or can contain one or more sulfur atoms. R1 can be a phosphate group. R1 can be a polyphosphate group. R1 can also be a phosphate or polyphosphate group selected from -(P03) x(P02S) y(P03) z where x, y and z are independently 0-5 and and x+y+z is 1- 5. R1 can also be a phosphate or polyphosphate group having one or more sulfur atoms. R1 can be a phosphate group having one or more sulfur atoms. R1 can be a polyphosphate group having one or more sulfur atoms. The sulfur atom can be in any position on any on the phosphate groups. R1 can further be a monophosphate, diphosphate, triphosphate, tetraphosphate, pentaphosphate, or (alpha-thio)triphosphate group. R1 can be a monophosphate group. R1 can be a diphosphate group. R1 can be a tetraphosphate group. R1 can be a pentaphosphate group. R1 can be an (alpha- thio)triphosphate group. R1 can be a triphosphate group.
A further embodiment of the present invention relates to a compound according to Formula (1c) or (1d) wherein R2 can be H, halo, OH, NH2, COOH, COH, C1-3 alkoxy, C1-3 alkyl optionally substituted with OH, NH2 or halo atoms.
R2 can be H, F, OH, NH2, COOH, C1-3 alkoxy, C1-3 alkyl optionally substituted with OH, NH2 or F.
R2 can be H, F, OH, NH2, C1-3 alkoxy, C1-3 alkyl optionally substituted with OH, NH2 or F.
R2 can be H, F, OH, C1-3 alkoxy, C1-3 alkyl optionally substituted with OH or F.
R2 can be H. R2 can be halo. R2 can be F. R2 can be Cl. R2 can be Br. R2 can be I. R2 can be NH2. R2 can be C1-3 alkoxy. R2 can be OCH3. R2 can be OCH2CH3. R2 can be OCH2CH2CH3. R2 can be COOH. R2 can be COH. R2 can be C1-3 alkyl optionally substituted with OH. R2 can be CH2OH. R2 can be CH2CH2OH. R2 can be CH2CH2CH2OH. R2 can be CH2CH(OH)CH2. Preferably, R2 can be OH or CH2OH. R2 can be OH. R2 can be CH2OH.
A further embodiment of the present invention relates to a compound according to Formula (1c) or (1d) wherein R3 can be selected from H, OH, F, OCH3 or OCH2CH2OMe. R3 can be OH. R3 can be F. R3 can be OCH3. R3 can be OCH2CH2OMe. Preferably, R3 can be H.
A further embodiment of the present invention relates to a compound according to Formula (1c) or (1d) wherein R4 can be H or D. R4 can be H. R4 can be D.
A further aspect of the present invention relates to an oligonucleotide according to Formula (2a) or (2b):
Figure imgf000013_0001
wherein, R1 is an oligonucleotide;
R2 is H, halo, OH, NH2, C1-3 alkoxy, C1-3 alkyl optionally substituted with OH, NH2 or halo atoms; and R3 is selected from H, OH, F, OCH3 or OCH2CH2OMe.
A further embodiment of the present invention relates to an oligonucleotide according to Formula (2a) or (2b) wherein R2 can be H, halo, OH, NH2, C1-3 alkoxy, C1-3 alkyl optionally substituted with OH, NH2 or halo atoms. R2 can be H. R2 can be halo. R2 can be F. R2 can be Cl. R2 can be Br. R2 can be I. R2 can be NH2. R2 can be C1-3 alkoxy. R2 can be OCH3. R2 can be OCH2CH3. R2 can be OCH2CH2CH3. R2 can be C1-3 alkyl optionally substituted with OH. R2 can be CH2OH. R2 can be CH2CH2OH. R2 can be CH2CH2CH2OH. R2 can be CH2CH(OH)CH2. Preferably, R2 can be OH or CH2OH. R2 can be OH. R2 can be CH2OH. A further embodiment of the present invention relates to an oligonucleotide according to Formula (2a) or (2b) wherein R3 can be selected from H, OH, F, or OCH3. R3 can be OH. R3 can be F. R3 can be OCH3. Preferably, R3 can be H.
A further aspect of the present invention relates to an oligonucleotide according to Formula (2c) or (2d):
Figure imgf000013_0002
wherein, R1 is an oligonucleotide;
R2 is H, halo, OH, NH2, COOH, COH, C1-3 alkoxy, C1-3 alkyl optionally substituted with OH, NH2 or halo atoms;
R3 is selected from H, OH, F, OCH3 or OCH2CH2OMe; and R4 is H or D.
A further embodiment of the present invention relates to an oligonucleotide according to Formula (2c) or (2d) wherein R2 can be H, halo, OH, NH2, C1-3 alkoxy, C1-3 alkyl optionally substituted with OH, NH2 or halo atoms. R2 can be H. R2 can be halo. R2 can be F. R2 can be Cl. R2 can be Br. R2 can be I. R2 can be NH2. R2 can be C1-3 alkoxy. R2 can be OCH3. R2 can be OCH2CH3. R2 can be OCH2CH2CH3. R2 can be C1-3 alkyl optionally substituted with OH. R2 can be CH2OH. R2 can be CH2CH2OH. R2 can be CH2CH2CH2OH. R2 can be CH2CH(OH)CH2. Preferably, R2 can be OH or CH2OH. R2 can be OH. R2 can be CH2OH.
A further embodiment of the present invention relates to an oligonucleotide according to Formula (2c) or (2d) wherein R3 can be selected from H, OH, F, or OCH3. R3 can be OH. R3 can be F. R3 can be OCH3. Preferably, R3 can be H.
A further embodiment of the present invention relates to an oligonucleotide according to Formula (2c) or (2d) wherein R4 can be H or D. R4 can be H. R4 can be D.
Described herein is a process of nucleic acid synthesis using the compounds described herein. The process uses a nucleic acid polymerase, which may be a template independent polymerase or a template dependent polymerase to add a single nucleotide to one or more nucleic acid strands. The strands may be immobilised on a solid support. The process involves cleaving the 3'-aminooxy group and adding a further nucleotide, the base of which may or may not be T.
Disclosed is a method of nucleic acid synthesis comprising:
(a) providing an initiator sequence;
(b) adding extension reagents comprising a polymerase or terminal deoxynucleotidyl transferase (TdT) and a compound according to Formula (1a) or (1b):
Figure imgf000015_0001
to said initiator sequence to add a single nucleotide to the initiator sequence, wherein, R1 is a phosphate or polyphosphate group or a salt thereof, optionally containing one or more sulfur atoms;
R2 is H, halo, OH, NH2, COOH, COH, C1-3 alkoxy, C1-3 alkyl optionally substituted with OH, NH2 or halo atoms; and
R3 is selected from H, OH, F, OCH3 or OCH2CH2OMe.
(c) removal of the extension reagents;
(d) optionally transforming the N=C(CH3)2 to NH2;
(e) cleaving the 3'-O-NH2 group from the extended nucleic acid polymer;
(f) adding extension reagents comprising a 3'-O-NH2 or 3'-O-N=C(CH3)2 blocked nucleoside triphosphate and a polymerase or terminal deoxynucleotidyl transferase (TdT) to said initiator sequence to add a further single nucleotide to the initiator sequence.
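The cycle of steps (a) to (f) can be pictured as a simple state machine in which a strand alternates between a blocked and a free 3'-end. The following is a toy sketch only: the class, the function names and the rule that an oxime must be converted to O-NH2 before cleavage are modelling assumptions introduced here, and no chemistry or enzyme kinetics is simulated.

```python
from dataclasses import dataclass, field


@dataclass
class Strand:
    """Toy model of an immobilised strand (hypothetical representation)."""
    bases: list = field(default_factory=list)
    blocked: bool = False   # True while the 3'-end carries O-NH2 or its oxime
    oxime: bool = False     # True while the block is 3'-O-N=C(CH3)2


def extend(strand, base, as_oxime=False):
    """Steps (b)/(f): add exactly one 3'-blocked nucleotide to a free 3'-end."""
    if strand.blocked:
        raise ValueError("3'-end is blocked; cleave before extending")
    strand.bases.append(base)
    strand.blocked = True
    strand.oxime = as_oxime


def unmask_oxime(strand):
    """Step (d): convert 3'-O-N=C(CH3)2 to 3'-O-NH2."""
    strand.oxime = False


def cleave(strand):
    """Step (e): remove the 3'-O-NH2 group, regenerating a free 3'-end."""
    if strand.oxime:
        # Modelling assumption: the oxime is unmasked before cleavage.
        raise ValueError("convert the oxime to O-NH2 before cleavage")
    strand.blocked = False


def synthesise(initiator, sequence, as_oxime=False):
    """Run one add/wash/unmask/cleave cycle per base of `sequence`."""
    strand = Strand(bases=list(initiator))   # step (a): provide the initiator
    for base in sequence:
        extend(strand, base, as_oxime)       # steps (b)/(f)
        # step (c): removal of extension reagents; no state change in this model
        if strand.oxime:
            unmask_oxime(strand)             # step (d), only when needed
        cleave(strand)                       # step (e)
    return "".join(strand.bases)


print(synthesise("ACGT", "TTA", as_oxime=True))
```

Because the 3'-block is set on every addition, the model cannot add two nucleotides in one cycle, which mirrors the single-nucleotide addition described in steps (b) and (f).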
The nucleic acids synthesised can be any sequence. One or more, possibly all, of the thymine bases will have a modification at the 5-position. A population of different sequences can be synthesised in parallel.
Where the other heterocyclic bases have exocyclic NH2 groups, for example cytidine, adenine or guanine, these groups can optionally be masked by an orthogonal masking agent. The amine masked nitrogenous heterocycles may be N4-amine masked cytidine, N6-amine masked adenine and N2-amine masked guanine. The masking group may be, for example, an azido (N3) group. Examples of suitable masking groups include azide (-N3), benzoylamine (N-benzoyl or -NHCOPh), N-methyl (-NHMe), isobutyrylamine, dimethylformamidine, 9-fluorenylmethyl carbamate, t-butyl carbamate, benzyl carbamate, acetamide (N-acetyl or -NHCOMe), trifluoroacetamide, phthalimide, benzylamine (N-benzyl or -NH-CH2-phenyl), triphenylmethylamine, benzylideneamine, tosylamide, isothiocyanate, N-allyl (such as N-dimethylallyl (-NHCH2-CH=CH2)) and N-anisoyl (-NHCOPh-OMe), such as azide (-N3), N-acetyl (-NHCOMe), N-benzyl (-NH-CH2-phenyl), N-anisoyl (-NHCOPh-OMe), N-methyl (-NHMe), N-benzoyl (-NHCOPh), N-dimethylallyl (-NHCH2-CH=CH2).
References herein to an "amine masking group" refer to any chemical group which is capable of generating or "unmasking" an amine group which is involved in hydrogen bond base-pairing with a complementary base. Most typically the unmasking will follow a chemical reaction, most suitably a simple, single step chemical reaction. The amine masking group will generally be orthogonal to the 3'-O-NH2 blocking group in order to allow selective removal. The purine compounds may be selected from:
Figure imgf000016_0001
where R1 and R3 are as defined herein.
The term 'azide' or 'azido' used herein refers to an -N3, or more specifically, an -N=N+=N- group. It will also be appreciated that azide extends to the presence of a tetrazolyl moiety. The "azide-tetrazole" equilibrium is well known to the skilled person from Lakshman et al (2010) J. Org. Chem. 75, 2461-2473. Thus, references herein to azide extend equally to tetrazole as illustrated below when applied to the R3 groups defined herein:
Figure imgf000017_0001
This embodiment has the advantage of reversibly masking the -NH2 group. While blocked in the -N3 state, the base (B) is impervious to deamination (e.g., deamination in the presence of sodium nitrite). The base (B) in the N-blocked form is incapable of forming secondary structures via base pairing. Thus even blocking a subset of the free amino groups in the nucleic acid polymer improves the availability of the 3'-end for further extension. The canonical cytosine, adenine, guanine can be respectively recovered from 4-azido cytosine, 6-azido adenine and 2-azido guanine by exposure to a reducing agent (e.g., TCEP). Thus, the -N3 group serves as an effective protecting group against deamination, especially in the presence of sodium nitrite.
It will be appreciated that the compounds of the invention may be readily applied to methods of enzymatic nucleic acid synthesis which are well known to the person skilled in the art. Non-limiting methods of nucleic acid synthesis may be found in WO 2016/128731, WO 2016/139477, WO 2017/009663, GB 1613185.6 and GB 1714827.1, the contents of each of which are herein incorporated by reference.
Enzymatic nucleic acid synthesis is defined as any process in which a nucleotide is added to a nucleic acid strand through enzymatic catalysis in the presence or absence of a template. For example, a method of enzymatic nucleic acid synthesis could include non-templated de novo nucleic acid synthesis utilizing a PolX family polymerase, such as terminal deoxynucleotidyl transferase, and reversibly terminated 2'-deoxynucleoside 5'-triphosphates or ribonucleoside 5'-triphosphates. Another method of enzymatic nucleic acid synthesis could include templated nucleic acid synthesis, including sequencing-by-synthesis. Reversibly terminated enzymatic nucleic acid synthesis is defined as any process in which a reversibly terminated nucleotide is added to a nucleic acid strand through enzymatic catalysis in the presence or absence of a template. Thus, in one embodiment, the method of enzymatic nucleic acid synthesis is selected from a method of reversibly terminated enzymatic nucleic acid synthesis and a method of templated and non-templated de novo enzymatic nucleic acid synthesis.
References herein to 'nucleoside triphosphates' refer to a molecule containing a nucleoside (i.e. a base attached to a deoxyribose or ribose sugar molecule) bound to three phosphate groups. Examples of nucleoside triphosphates that contain deoxyribose are: deoxyadenosine triphosphate (dATP), deoxyguanosine triphosphate (dGTP), deoxycytidine triphosphate (dCTP) or deoxythymidine triphosphate (dTTP). Examples of nucleoside triphosphates that contain ribose are: adenosine triphosphate (ATP), guanosine triphosphate (GTP), cytidine triphosphate (CTP) or uridine triphosphate (UTP). Other types of nucleosides may be bound to three phosphates to form nucleoside triphosphates, such as naturally occurring modified nucleosides and artificial nucleosides.
Therefore, references herein to '3'-blocked nucleoside triphosphates' refer to nucleoside triphosphates (e.g., dATP, dGTP, dCTP or dTTP) which have an additional group on the 3' end which prevents further addition of nucleotides, i.e., by replacing the 3'-OH group with a protecting group. Herein the protecting group is NH2 or a protected version thereof.
References herein to a 'DNA initiator sequence' refer to a small sequence of DNA to which the 3'-blocked nucleoside triphosphate can be attached, i.e., DNA will be synthesised from the end of the DNA initiator sequence.
In one embodiment, the initiator sequence is between 5 and 10 nucleotides long, such as between 10 and 60 nucleotides long, in particular between 20 and 50 nucleotides long.
In one embodiment, the initiator sequence is single-stranded. In an alternative embodiment, the initiator sequence is double-stranded. In another embodiment, the initiator sequence has a single stranded portion and a double stranded portion. It will be understood by persons skilled in the art that a 3'-overhang (i.e., a free 3'-end) allows for efficient addition.
In one embodiment, the initiator sequence is immobilised on a solid support. This allows the enzyme and the cleaving agent to be removed without washing away the synthesised nucleic acid. The initiator sequence may be attached to a solid support stable under aqueous conditions so that the method can be easily performed via a flow setup.
In one embodiment, the initiator sequence is immobilised on a solid support via a reversible interacting moiety, such as a chemically-cleavable linker, an antibody/immunogenic epitope, a biotin/biotin binding protein (such as avidin or streptavidin), or glutathione-GST tag. Therefore, in a further embodiment, the method additionally comprises extracting the resultant nucleic acid by removing the reversible interacting moiety in the initiator sequence, such as by incubating with proteinase K.
In one embodiment, the initiator sequence contains a base or base sequence recognisable by an enzyme. A base recognised by an enzyme, such as a glycosylase, may be removed to generate an abasic site which may be cleaved by chemical or enzymatic means. An example of such a glycosylase system includes the presence of a uracil base in the initiator sequence, which may be excised with uracil DNA glycosylase (UDG) to leave an abasic site which may be cleaved with, for example, basic solutions, organic amines, or an endonuclease (such as endonuclease VIII), to release a nucleic acid bearing a 5'-phosphate into solution. A base sequence may be recognised and cleaved by a restriction enzyme.
In a further embodiment, the initiator sequence is immobilised on a solid support via an orthogonal chemically-cleavable linker, such as a disulfide, allyl, or azide-masked hemiaminal ether linker. Therefore, in one embodiment, where the N-masking group is not azido, the method additionally comprises extracting the resultant nucleic acid by cleaving the chemical linker through the addition of tris(2-carboxyethyl)phosphine (TCEP) or dithiothreitol (DTT) for a disulfide linker; palladium complexes for an allyl linker; or TCEP for an azide-masked hemiaminal ether linker.
In one embodiment, the resultant nucleic acid is extracted and amplified by polymerase chain reaction using the nucleic acid bound to the solid support as a template. The initiator sequence could therefore contain an appropriate forward primer sequence and an appropriate reverse primer could be synthesised.
In one embodiment, the terminal deoxynucleotidyl transferase (TdT) of the invention is added in the presence of an extension solution comprising one or more buffers (e.g., Tris or cacodylate), one or more salts (e.g., Na+, K+, Mg2+, Mn2+, Cu2+, Zn2+, Co2+, etc. all with appropriate counterions, such as Cl) and inorganic pyrophosphatase (e.g., the Saccharomyces cerevisiae homolog). It will be understood that the choice of buffers and salts depends on the optimal enzyme activity and stability. The use of an inorganic pyrophosphatase helps to reduce the build-up of pyrophosphate due to nucleoside triphosphate hydrolysis by TdT. Therefore, the use of an inorganic pyrophosphatase has the advantage of reducing the rate of (1) backwards reaction and (2) TdT strand dismutation.
In one embodiment, step (b) is performed at a pH range between 5 and 10. Therefore, it will be understood that any buffer with a buffering range of pH 5-10 could be used, for example cacodylate, Tris, HEPES or Tricine, in particular cacodylate or Tris.
The compounds of the invention can be used on a device for nucleic acid synthesis. In one embodiment of the invention there is a solid support in the form of, for example, a planar array and further a plurality of beads onto which a plurality of immobilized initiation oligonucleotide sequences are attached. The beads may be porous and a portion of the, optionally porous, beads are selected as anchors and unselected beads are exposed to harvest solution to cleave them from their solid support to release the oligonucleotide sequences into solution. Thus the term solid support can refer to an array having a plurality of beads which may or may not be immobilised. The oligonucleotides may be attached to, or removed from beads whilst on the array. Thus the immobilised oligonucleotide may be attached to a bead, which remains in a fixed position on the array whilst other beads in other locations are subject to cleavage conditions to detach the oligonucleotides from the beads (the beads may or may not be immobilised).
The solid support can take the form of a digital microfluidic device. Digital microfluidic devices consist of a plurality of electrodes arranged on a surface. A dielectric layer (e.g., aluminum oxide) is deposited over the electrodes followed by a hydrophobic coating (e.g., perfluorinated hydrocarbon polymer) atop the dielectric layer. The electrodes may be hardwired or formed from an active matrix thin film transistor (AM-TFT).
These electrodes can be addressed in a passive manner or by active matrix methods. Passive addressing is a direct address where actuation signals are applied directly to individual electrodes (for example by means of a hard-wired connection to each electrode in a single-layer or multilayer fashion, such as a printed circuit board, PCB). However, a limitation of direct drive methods is the inability to process large numbers of droplets due to difficulties in addressing large numbers of direct drive electrodes. In active matrix addressing, MxN electrodes can be controlled by M+N pins, significantly reducing the number of control pins. However, the resolution of the electrodes (size of electrodes as compared to the size of droplets) limits the scope of droplet operations. Active matrix thin film transistor (AM-TFT) technology enables the control of large numbers of droplets by replacing patterned electrodes with a thin film transistor array, each element of which is individually addressable. The increased resolution (small size of pixels on the thin film transistor array) also increases the scope of droplet operations. An AM-TFT digital microfluidic device comprises a dielectric layer (e.g., aluminum oxide) deposited over the electrode layer on the thin-film transistor layer followed by a hydrophobic coating (e.g., perfluorinated hydrocarbon polymer) atop the dielectric layer.
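The difference in control-pin count between direct drive and active matrix addressing can be illustrated numerically. The following is a minimal sketch assuming one pin per electrode for direct drive and one pin per row plus one per column for matrix addressing; the function name and the example array sizes are hypothetical.

```python
def control_pins(rows, cols):
    """Return (direct_drive_pins, matrix_pins) for a rows x cols electrode array."""
    direct = rows * cols   # one hard-wired connection per electrode
    matrix = rows + cols   # one line per row plus one line per column
    return direct, matrix


for m, n in [(8, 8), (64, 64), (256, 256)]:
    direct, matrix = control_pins(m, n)
    print(f"{m} x {n} array: {direct} direct-drive pins vs {matrix} matrix pins")
```

The gap grows with array size, which is consistent with the stated difficulty of addressing large numbers of direct drive electrodes.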
Depending on applied voltage to a subset of the plurality of electrodes arranged on the aforementioned surface, aqueous droplets may be actuated across the surface immersed in oil, air, or another fluid. Enzymatic oligonucleotide synthesis can be deployed on a digital microfluidic device in several ways. An initiator oligonucleotide can be immobilized via the 5'-end on super paramagnetic beads or directly to the hydrophobic surface of the digital microfluidic device. A plurality of distinct positions containing immobilized initiator oligonucleotides on the digital microfluidic device may be present (henceforth named synthesis zones). Solutions required for enzymatic oligonucleotide synthesis are then dispensed from multiple reservoirs onto the device. Briefly, an addition solution containing the components necessary for the TdT-mediated incorporation of reversibly terminated nucleoside 5'-triphosphates onto immobilized initiator oligonucleotides can be dispensed from a reservoir in droplets and actuated to the aforementioned positions containing immobilized initiator oligonucleotides. During this stage, each reservoir (and thus each droplet containing addition solution) can contain a distinct nitrogenous base reversibly terminated nucleoside 5'-triphosphate identity or a mixture thereof in order to control the sequence synthesized on aforementioned positions containing immobilized initiator oligonucleotides.
Alternatively the method can be implemented on continuous flow microfluidic devices. One such device consists of a surface with a plurality of microwells each containing a bead. On said bead, an oligonucleotide initiator can be immobilized. In addition to each microwell containing a bead with immobilized initiator, each microwell can contain an electrode to perform electrochemistry. In all examples of nucleic acid synthesis, the use of the modified thymine bases improves the quality of the synthesised strands due to lowering the length required in order to obtain a hybridising strand.
EXAMPLES
Synthesis route for 2'-deoxy-3'-acetone oxime-5-(3-acetoxy-1-butynyl)-uridine triphosphate (T1).
Scheme 1: Synthesis route for 3-butyn-1-yl acetate. 3-Butyn-1-ol was acetylated with acetic anhydride in the presence of ZnCl2.
Figure imgf000022_0001
Scheme 2: Synthesis route for 2'-deoxy-3'-acetone oxime-5-(3-acetoxy-1-butynyl)-uridine triphosphate (T1). (a) TBDMSCl, pyridine; (b) BzOH, PPh3, DIAD; (c) NaOMe, MeOH; (d) N-hydroxyphthalimide, PPh3, DIAD, THF; (e) i. MeNH2, EtOH, ii. acetone; (f) 3-butyn-1-yl acetate, CuI, PdCl2(PPh3)2, Et3N; (g) 3HF.Et3N, THF; (h) i. 2-Chloro-1,3,2-benzodioxaphosphorin-4-one, pyridine, dioxane; ii. Bu3N.H4P2O7, DMF; iii. I2; iv. NH3, H2O.
Figure imgf000022_0002
Scheme 3: Oxime deprotection of T1-oxime to yield T1. Prior to use, the oxime can be removed by incubation of the triphosphate T1-oxime in a solution of 1 M sodium acetate pH 5.5, 1.5% w/v methoxylamine, and ultrapure water for 60 minutes at room temperature.
Figure imgf000023_0001
3-Butyn-1-yl Acetate
Ac2O, ZnCl2
Figure imgf000023_0002
Zinc chloride (0.49 g, 3.6 mmol) was placed in a 50 mL round bottomed flask. Acetic anhydride (8.0 mL, 86 mmol) was added, then 3-butyn-1-ol (5.4 mL, 5.0 g, 71 mmol) was added to the suspension over 20 minutes. The solution was stirred at room temperature for 30 minutes, then water (30 mL) was added over 5 minutes. The mixture was stirred vigorously for 30 minutes, when the layers were separated. The organic phase was washed with water (30 mL) and sodium bicarbonate solution (10 mL). The phases were separated, magnesium sulfate (0.5 g) was added to the organic phase, the free flowing suspension was stirred for 16 h, then filtered to give 3-butyn-1-yl acetate as a pale yellow, mobile oil (6.24 g, 78%); 1H NMR (400 MHz, CDCl3) δ 4.18 (2H, t, J = 6.8 Hz), 2.54 (2H, td, J = 6.8, 2.8 Hz), 2.09 (3H, s) and 2.01 (1H, t, J = 2.7 Hz).
5'(O)-tert-Butyldimethylsilyl-2'-deoxy-5-iodouridine
Figure imgf000023_0003
2'-Deoxy-5-iodouridine (1.0 g, 2.82 mmol) was placed in a 25 mL 2-necked flask. This was purged with nitrogen, then anhydrous pyridine (4 mL) was added. tert-Butyldimethylchlorosilane (511 mg, 3.39 mmol) was added in portions over 10 minutes while the temperature was maintained at 20-22 °C. The suspension was stirred for 18 h, then the reaction was quenched with methanol (0.23 mL, 5.65 mmol). The solvent was evaporated from the bulk using a rotary evaporator, then 1 M citric acid (10 mL) and ethyl acetate (30 mL) were added, the mixture was shaken and the layers were separated. The organic layer was washed with 1 M citric acid (2 x 10 mL) and saturated sodium bicarbonate solution (10 mL). The organic layer was dried (Na2SO4) and filtered. The solvent was removed using a rotary evaporator to give a white solid (1.31 g), which was dissolved in hot acetone (15 mL). The solution was allowed to cool to room temperature, then heptane (15 mL) was added over 90 minutes. The white suspension was stirred at room temperature for 1 h, filtered and the solid was washed with heptane-acetone (1:1, 2 x 5 mL) and dried under vacuum, to give 5'(O)-tert-butyldimethylsilyl-2'-deoxy-5-iodouridine as a fine white solid (652 mg). Additional heptane (20 mL) was added to the filtrate; this was stirred at room temperature for 30 minutes, then filtered, and the solid was washed with heptane-acetone (3:1, 2 x 5 mL) and dried under vacuum to give a second crop as a fine white solid (257 mg). Additional solid precipitated from the filtrate. This was filtered and washed with heptane-acetone (3:1, 2 x 5 mL). The solid was dried under vacuum to give a third crop as a white solid (77 mg); total yield 986 mg, 75%; m/z (ES+) 469 ([M+H], 100%), 491 ([M+Na], 31) and 507 ([M+K], 28); m/z (ES-) 467 ([M-H], 100%); 1H NMR (400 MHz, DMSO-d6) δH (ppm) 11.71 (1H, br s), 8.00 (1H, s), 6.09 (1H, dd, J = 7.7, 5.9 Hz), 5.29 (1H, d, J = 4.14 Hz), 4.18 (1H, m), 3.87 (1H, m), 3.80 (1H, dd, J = 11.5, 2.7 Hz), 3.73 (1H, dd, J = 11.5, 3.4 Hz), 2.14 (1H, ddd, J = 13.2, 5.9, 2.5 Hz), 2.04 (1H, ddd, J = 13.5, 7.7, 5.7 Hz), 0.90 (9H, s), 0.12 (3H, s) and 0.11 (3H, s).
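The combined yield reported for the three crops can be checked with a short calculation. The following is an illustrative sketch only: the molecular formula C15H25IN2O5Si is an assumption inferred from the product name (it is consistent with the observed [M+H] of 469), the atomic masses are rounded standard values, and the helper names are hypothetical.

```python
# Rounded standard atomic masses in g/mol (assumed values for illustration).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "I": 126.904,
               "N": 14.007, "O": 15.999, "Si": 28.085}


def molar_mass(formula):
    """Molar mass in g/mol from an element -> count mapping."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def percent_yield(product_mg, limiting_mmol, product_mw):
    """Isolated yield as a percentage of the theoretical mass."""
    theoretical_mg = limiting_mmol * product_mw   # mmol x g/mol gives mg
    return 100.0 * product_mg / theoretical_mg


# Assumed formula of the 5'-O-TBDMS product: C15H25IN2O5Si
product = {"C": 15, "H": 25, "I": 1, "N": 2, "O": 5, "Si": 1}
crops_mg = [652, 257, 77]            # first, second and third crops
total_mg = sum(crops_mg)
mw = molar_mass(product)
overall = percent_yield(total_mg, 2.82, mw)   # 2.82 mmol of starting nucleoside
print(f"M = {mw:.2f} g/mol, total = {total_mg} mg, yield = {overall:.1f}%")
```

The same two helpers can be reused for the other isolated intermediates by substituting the relevant formula, mass and limiting amount.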
3'-O-Phthalimido-5'(O)-tert-Butyldimethylsilyl-xylo-2'-Deoxy-5-iodouridine-3'-benzoate
Figure imgf000024_0001
5'-TBDMS-2'-5-iodo-deoxyuridine (0.986 g, 2.11 mmol), benzoic acid (386 mg, 3.16 mmol) and triphenylphosphine (828 mg, 3.16 mmol) were placed in a 100 mL 3-necked flask. This was purged with nitrogen, then anhydrous THF (10 mL) was added. The solution was cooled to 0-5 °C then diisopropyl azobisdicarboxylate (622 µL, 3.16 mmol) was added over 10 minutes. The solution was allowed to warm to room temperature and stirred for 2 h, when water (76 µL) was added. The pale yellow solution was stirred for a further 1 h, when the solvent was removed using a rotary evaporator. Toluene (10 mL) was added and the solvent was evaporated again. Toluene (20 mL) was added, and the solution was washed with saturated sodium bicarbonate solution (10 mL). The mixture was shaken and the layers were separated. The organic layer was washed with saturated sodium bicarbonate solution (10 mL), dried (Na2SO4), filtered and the solution was concentrated using a rotary evaporator to ~5 mL. The pale yellow solution was placed in the freezer (-20 °C) for 16 h, then allowed to warm to room temperature, and the solvent was evaporated again. A 1/5 portion of the sample was removed, and the remaining 4/5 was allowed to stand for 64 h, when toluene (5 mL) was added. A white precipitate formed. The suspension was cooled in an ice/water bath and stirred for 4 h, then filtered to give triphenylphosphine-diisopropyl hydrazinedicarboxylate complex as a white solid (680 mg). The filtrate was purified by flash chromatography with a silica gel cartridge (40 g) using a heptane-ethyl acetate 75:25 to ethyl acetate gradient to give 5'(O)-tert-butyldimethylsilyl-xylo-2'-deoxy-5-iodouridine-3'-benzoate as a white foam (568 mg, 47%); m/z (ES+) 573 ([M+H], 76%), 595 ([M+Na], 100) and 611 ([M+K], 45); m/z (ES-) 571 ([M-H], 100%); 1H NMR (400 MHz, CD3CN) δH (ppm) 9.16 (1H, br s), 8.14 (1H, s), 7.99 (2H, dd, J = 7.1, 2.0 Hz), 7.64 (1H, tt, J = 7.4, 1.6 Hz), 7.53 (2H, td, J = 7.1, 1.4 Hz), 6.10 (1H, dd, J = 7.8, 2.0 Hz), 5.63 (1H, dd, J = 5.1, 3.4 Hz), 4.25 (1H, J = 6.0, 3.3 Hz), 4.04 (2H, m), 2.81 (1H, ddd, J = 15.6, 7.8, 5.5 Hz), 2.28 (1H, dd, J = 15.6, 1.5 Hz), 0.82 (9H, s), 0.01 (3H, s) and -0.03 (3H, s).
5'(O)-tert-Butyldimethylsilyl-xylo-2'-deoxy-5-iodouridine
Figure imgf000025_0001
(3'R)-5'-TBDMS-5-iodo-2'-deoxyuridine-3'-benzoate (565 mg, 0.987 mmol) was dissolved in methanol (5 mL). Sodium methoxide (5.4 M in MeOH, 0.18 mL) was added. The colourless solution was stirred at room temperature for 75 minutes when additional sodium methoxide solution (0.18 mL) was added. After 30 minutes, acetic acid (0.12 mL, 1.98 mmol) was added. Most of the solvent was evaporated using a rotary evaporator, then saturated sodium bicarbonate solution (10 mL) and ethyl acetate (10 mL) were added. The mixture was shaken and the layers were separated. The organic phase was dried (Na2SO4), filtered and the solvent was removed using a rotary evaporator. The residue was purified by flash chromatography using a silica gel cartridge (12 g) eluting with a heptane-ethyl acetate gradient (70:30 to 1:1) to give 5'(O)-tert-butyldimethylsilyl-xylo-2'-deoxy-5-iodouridine as a white solid (361 mg, 78%); m/z (ES+) 469 ([M+H], 100%), 491 ([M+Na], 100) and 507 ([M+K], 17); m/z (ES-) 467 ([M-H], 100%); 1H NMR (400 MHz, CD3CN) δH (ppm) 9.15 (1H, br s), 8.33 (1H, s), 6.06 (1H, dd, J = 8.2, 2.0 Hz), 4.34 (1H, m), 4.00 (1H, dd, J = 10.9, 4.4 Hz), 3.91 (1H, dd, J = 16.7, 5.8 Hz), 3.88 (1H, m), 3.61 (1H, d, J = 3.2 Hz), 2.55 (1H, ddd, J = 14.9, 8.2, 5.2 Hz), 1.96 (1H, m), 0.90 (9H, s), 0.10 (3H, s) and 0.09 (3H, s).
3'-O-Phthalimido-5'(O)-tert-Butyldimethylsilyl-2'-deoxy-5-iodouridine
Figure imgf000026_0001
5'(O)-tert-Butyldimethylsilyl-xylo-2'-deoxy-5-iodouridine (434 mg, 0.93 mmol), N-hydroxyphthalimide (378 mg, 2.32 mmol) and triphenylphosphine (608 mg, 2.32 mmol) were placed in a 55 mL round-bottomed flask. This was purged with nitrogen, then anhydrous THF (8.5 mL) was added. The solution was cooled in an ice-water bath (0-5 °C), then diisopropyl azobisdicarboxylate (0.46 mL, 2.3 mmol) was added over 20 minutes. The suspension was stirred at 0-5 °C for 30 minutes, then allowed to warm to room temperature. After 1 h, the reaction was quenched with water (0.1 mL), and the solution was stirred for a further 30 minutes. Toluene (15 mL) was added, followed by saturated sodium bicarbonate solution (15 mL), the mixture was shaken, the layers were separated and the organic phase was washed with saturated sodium bicarbonate solution (10 mL + 2 x 5 mL), dried (Na2SO4) and filtered. The solution was concentrated using a rotary evaporator to about 10 mL, then seeded with a small quantity of triphenylphosphine oxide-diisopropylhydrazine dicarboxylate complex. The suspension was stirred at room temperature for 16 h, then filtered, and the solid triphenylphosphine oxide-diisopropylhydrazine dicarboxylate complex was washed with toluene (2 x 5 mL). The filtrate was concentrated using a rotary evaporator and the residue was purified by flash chromatography using a 24 g cartridge, eluting with hexanes-ethyl acetate 65:35 for 10 min, then a gradient to ethyl acetate followed by pure ethyl acetate, to give 3'-O-phthalimido-5'(O)-tert-butyldimethylsilyl-2'-deoxy-5-iodouridine as a white, crystalline solid (416 mg, 73%); m/z (ES+) 614 ([M+H], 100%); m/z (ES-) 612 ([M-H], 100%); 1H NMR (400 MHz, CD3CN) δH (ppm) 9.21 (1H, br s), 8.03 (1H, s), 7.83 (4H, m), 6.31 (1H, dd, J = 8.9, 5.4 Hz), 4.92 (1H, d, J = 5.5 Hz), 4.39 (1H, td, J = 2.9, 0.9 Hz), 3.90 (1H, dd, J = 11.4, 2.9 Hz), 3.86 (1H, dd, J = 11.4, 3.0 Hz), 2.65 (1H, dd, J = 14.8, 5.5 Hz), 2.14 (1H, ddd, J = 14.6, 9.0, 5.6 Hz), 0.88 (9H, s), 0.12 (3H, s) and 0.10 (3H, s).
3'-O-(N-Acetone oxime)-5'(O)-tert-Butyldimethylsilyl-xylo-2'-deoxy-5-iodouridine
Figure imgf000027_0001
3'-(O)-Phthalimido-5'-TBDMS-5-iodo-2'-deoxyuridine (416 mg, 0.68 mmol) was placed in a 50 mL round-bottomed flask. Methylamine (33 wt% in ethanol, 6.4 mL, ~54 mmol) was added. Extra ethanol (2 mL) was added, then most of the solvent was removed using a rotary evaporator. Acetone (6.23 mL) was added and the solution was stirred for 70 minutes. The solvent was removed using a rotary evaporator to give a white solid. Ethyl acetate (3 mL) was added, and the white suspension was stirred for 30 minutes. Heptane (7 mL) was added over 10 minutes in portions, then the white suspension was stirred for a further 30 minutes. The suspension was filtered. The filtrate was applied to a silica flash column packed in heptane-ethyl acetate (65:35). Elution with the same solvent gave 3'-O-(N-acetone oxime)-5'(O)-tert-butyldimethylsilyl-xylo-2'-deoxy-5-iodouridine as an extremely viscous pale yellow oil (250 mg, 70%); m/z (ES+) 524 ([M+H], 100%) and 546 ([M+Na], 12%); m/z (ES-) 522 ([M-H], 100%); 1H NMR (400 MHz, CD3CN) δ (ppm) 9.22 (1H, br s), 8.09 (1H, s), 6.10 (1H, dd, J = 8.6, 5.6 Hz), 4.66 (1H, dt, J = 6.1, 1.6 Hz), 4.16 (1H, m), 3.91 (1H, dd, J = 11.5, 2.5 Hz), 3.82 (1H, dd, J = 11.5, 2.9 Hz), 2.44 (1H, ddd, J = 13.9, 5.6, 1.3 Hz), 2.07 (1H, ddd, J = 14.3, 8.4, 5.9 Hz), 1.83 (6H, s), 0.93 (9H, s), 0.15 (3H, s) and 0.14 (3H, s).
3'-O-(N-Acetone oxime)-5'(O)-tert-Butyldimethylsilyl-xylo-2'-deoxy-5-(4-acetoxy-but-1-ynyl)uridine
Figure imgf000028_0001
3-Butyn-1-yl acetate (107 mg, 0.96 mmol) was dissolved in triethylamine (16 mL) under nitrogen. The solution was added to 3'-(O)-(N-acetone oxime)-5'-TBDMS-5-iodo-2'-deoxyuridine acetone (250 mg, 0.48 mmol) under nitrogen. trans-Dichlorobis(triphenylphosphine)palladium(II) (7 mg, 0.0096 mmol) and copper(I) iodide (7 mg, 0.053 mmol) were added. The reaction flask was purged with nitrogen and heated in a heating block at 60°C for 1 h. After cooling, most of the solvent was removed using a rotary evaporator, then ethyl acetate (5 mL) and 5% disodium EDTA solution (5 mL) were added. The layers were separated, and the organic phase was washed with disodium EDTA (5 mL). The organic phase was dried (Na2SO4), filtered and the solvent was removed using a rotary evaporator. The product was purified on a short silica column eluting with heptane-ethyl acetate (1:1). The solid product was dissolved in toluene (2 mL). Heptane (3 mL) was added while stirring over 15 minutes. The suspension was stirred for 1.5 h, then filtered, and the solid was washed with heptane-toluene (2:1, 2 x 0.5 mL) and dried to give 3'-O-(N-acetone oxime)-5'(O)-tert-butyldimethylsilyl-xylo-2'-deoxy-5-(4-acetoxy-but-1-ynyl)uridine as a white solid (149 mg, 61%); m/z (ES+) 286 (30%) and 508 ([M+H], 100%); m/z (ES-) 506 ([M-H], 100%); 1H NMR (400 MHz, CD3CN) 9.15 (1H, br s), 7.97 (1H, s), 6.15 (1H, dd, J = 8.5, 5.7 Hz), 4.68 (1H, dt, J = 6.0, 1.3 Hz), 4.17 (1H, m), 4.12 (2H, t, J = 6.7 Hz), 3.92 (1H, dd, J = 11.5, 2.3 Hz), 3.82 (1H, dd, J = 11.5, 2.5 Hz), 2.67 (2H, t, J = 6.7 Hz), 2.44 (1H, ddd, J = 13.9, 5.7, 1.2 Hz), 2.08 (1H, ddd, J = 14.3, 8.3, 5.9 Hz), 2.00 (3H, s), 0.92 (9H, s), 0.14 (3H, s) and 0.13 (3H, s).
3'-O-(N-Acetone oxime)-2'-deoxy-5-(4-acetoxy-but-1-ynyl)uridine
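The reagent stoichiometry of the coupling step can be read off the millimole amounts quoted above. This is an illustrative sketch only: the helper name is hypothetical, and the only inputs are the mmol values stated in the procedure.

```python
def equivalents(reagent_mmol: float, substrate_mmol: float) -> float:
    """Return molar equivalents of a reagent relative to the substrate."""
    return reagent_mmol / substrate_mmol


substrate_mmol = 0.48  # iodouridine substrate, as stated in the procedure

alkyne_equiv = equivalents(0.96, substrate_mmol)               # 3-butyn-1-yl acetate
pd_mol_percent = 100 * equivalents(0.0096, substrate_mmol)     # palladium catalyst
cu_mol_percent = 100 * equivalents(0.053, substrate_mmol)      # copper(I) iodide

print(f"alkyne: {alkyne_equiv:.1f} equiv")
print(f"Pd catalyst: {pd_mol_percent:.1f} mol%")
print(f"CuI: {cu_mol_percent:.1f} mol%")
```

From the stated amounts this gives roughly 2 equivalents of alkyne, 2 mol% palladium catalyst and 11 mol% copper(I) iodide.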
Figure imgf000029_0001
5'-TBDMS-5-(3-acetoxy-1-butynyl)-3'-aminooxy-2'-deoxyuridine acetone oxime (164 mg, 0.32 mmol) was dissolved in anhydrous THF (2.5 mL). Triethylamine trihydrofluoride (0.31 mL, 1.90 mmol) was added, and the solution was stirred at room temperature for 17 h. Ethoxytrimethylsilane (1.0 mL, 6.5 mmol) was added over 2 minutes, the solution was stirred for 30 minutes then saturated sodium bicarbonate solution (5 mL) and ethyl acetate (5 mL) were added. The phases were separated. The aqueous phase was extracted with ethyl acetate (2 x 5 mL), the combined organic phases were dried (Na2SO4) and filtered and the solvent was removed using a rotary evaporator to give a white solid. This was purified by flash chromatography using a 12 g silica cartridge, eluting with a gradient from dichloromethane to ethyl acetate to give 3'-O-(N-acetone oxime)-2'-deoxy-5-(4-acetoxy-but-1-ynyl)uridine as a white solid (117 mg, 92%). m/z (ES+) 394 ([M+H], 100%) and 809 ([2M+H], 74%); m/z (ES-) 392 ([M-H], 100%); 1H NMR (400 MHz, CD3CN) 9.06 (1H, br s), 8.08 (1H, s), 6.13 (1H, dd, J = 8.3, 5.9 Hz), 4.70 (1H, dt, J = 6.3, 2.0 Hz), 4.15 (1H, t, J = 6.6 Hz), 4.11 (1H, m), 3.74 (1H, m), 3.72 (1H, m), 3.31 (1H, t, J = 5.0 Hz), 2.69 (2H, t, J = 6.6 Hz), 2.41 (1H, ddd, J = 14.1, 5.9, 2.0 Hz), 2.16 (1H, ddd, J = 14.3, 8.1, 6.2 Hz), 2.01 (3H, s) and 1.83 (6H, s).
3'-O-(N-Acetone oxime)-2'-deoxy-5-(4-acetoxy-but-1-ynyl)uridine-5'-triphosphate
Figure imgf000029_0002
3'-O-(N-Acetone oxime)-2'-deoxy-5-(4-acetoxy-but-1-ynyl)uridine (116 mg, 0.228 mmol) was dried by azeotropic distillation with toluene (2 x 2 mL). The reaction flask was purged with nitrogen and anhydrous 1,4-dioxane (0.81 mL) and anhydrous pyridine (0.26 mL) were added. 2-Chloro-4H-1,3,2-benzodioxaphosphorin-4-one (56 mg, 0.274 mmol) was added in one portion. The white suspension was stirred at room temperature under nitrogen for 45 minutes. After 135 minutes, a suspension of tributylammonium pyrophosphate (163 mg, 0.297 mmol) in anhydrous DMF (1.0 mL) and tributylamine (0.29 mL, 1.24 mmol) was added. The mixture was stirred for 45 minutes, when a solution of iodine (84 mg, 0.366 mmol) in pyridine (2.16 mL) and water (0.044 mL) was added rapidly. The dark brown solution was stirred at room temperature for 30 minutes, then 10% sodium thiosulfate solution (1 mL) was added followed by water (5 mL). The colourless solution was stirred at room temperature for 30 minutes, when 30% ammonia solution (5 mL) was added. The suspension was stirred at room temperature for 1 h, when most of the solvent was removed using a rotary evaporator (bath temperature 30°C). The residue was dissolved in water (2 mL) and split between two 50 mL centrifuge tubes. 2% sodium perchlorate solution in acetone (2 x 15 mL) cooled to -80°C was added to each. A white emulsion formed. After centrifugation for 15 minutes at 4000 rpm at -10°C, the upper layer was decanted to leave the lower, immiscible oily phase. This was dissolved in water (2 x 0.5 mL). 2% sodium perchlorate solution in acetone (15 mL x 2) cooled to -80°C was added to each portion. The white emulsion was centrifuged for 20 minutes at 4000 rpm at -10°C. The upper layer was decanted to leave the lower, immiscible oily phase. This was washed with acetone cooled to -80°C (2 x 1 mL for each portion).
The oily residues were dissolved in water (total volume 2 mL) and the crude triphosphate solution obtained was purified by reverse phase HPLC using a Supelco Ascentis C18 column (25 cm x 10 mm, 5 µm), flow rate 3 mL/min, and a gradient of A: 100 mM triethylammonium bicarbonate pH 7.5, B: acetonitrile; A to 30% B over 32 minutes (8 runs). The product-containing fractions were combined and the solvent was evaporated, then methanol (10 mL) was added to the residue and the solvent was evaporated using a rotary evaporator. The residue was dissolved in water (4 mL) and the solution was lyophilised to give semi-purified triphosphate triethylammonium salt as a colourless glass (87 mg). This was purified by ion exchange chromatography using a Source 15Q column (10 x 150 mm), flow rate 3 mL/min, and gradient of 10 mM triethylammonium bicarbonate pH 7.5 to 1 M triethylammonium bicarbonate pH 7.5 over 22 minutes, followed by hold for 10 minutes (12 runs). Fractions containing 5'-di-, tri-, tetra- and pentaphosphates were obtained. After evaporation of solvent from product-containing fractions, methanol (10 mL) was added to each and the solvent was evaporated. This was repeated twice for each product sample. Samples of diphosphate, tetraphosphate and pentaphosphate were dissolved in water (2 mL) and the solvent was removed using a rotary evaporator to give colourless, extremely viscous oils; diphosphate (5 mg), tetraphosphate (9 mg) and pentaphosphate (9 mg). The triphosphate was dissolved in water (2 mL), and the sample was lyophilised to give the triphosphate as a colourless glass (81 mg).
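The ion exchange programme described above (10 mM to 1 M triethylammonium bicarbonate over 22 minutes, then a 10 minute hold) can be expressed as a simple function of run time. This is a sketch under a stated assumption: the text does not say that the ramp is linear, so linearity is assumed here, and the function name is hypothetical.

```python
def teab_concentration_mM(t_min: float,
                          start_mM: float = 10.0,
                          end_mM: float = 1000.0,
                          ramp_min: float = 22.0,
                          hold_min: float = 10.0) -> float:
    """Return the buffer concentration (mM) at time t_min of one run.

    Assumption: the gradient is linear during the ramp, then held at end_mM.
    """
    if not 0.0 <= t_min <= ramp_min + hold_min:
        raise ValueError("time is outside the gradient programme")
    if t_min >= ramp_min:
        return end_mM  # hold phase
    return start_mM + (end_mM - start_mM) * t_min / ramp_min  # linear ramp


for t in (0, 11, 22, 32):
    print(f"t = {t:>2} min: {teab_concentration_mM(t):.0f} mM")
```

A function of this kind is useful mainly for estimating the buffer strength at which a given phosphate fraction eluted, given its retention time.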
All 5'-phosphates were obtained as triethylammonium salts containing one molecule of triethylamine per phosphate group; diphosphate; m/z (ES-) 510 ([M-H], 100%); 1H NMR (400 MHz, D2O) 7.99 (1H, s), 6.12 (1H, dd, J = 8.3, 6.3 Hz), 4.76 (1H, d, J = 5.7 Hz), 4.25 (1H, m), 4.00 (2H, m), 3.58 (2H, t, J = 6.1 Hz), 3.35 (12H, q, J = 7.3 Hz), 2.48 (2H, t, J = 6.0 Hz), 2.40 (1H, dd, J = 14.4, 5.7 Hz), 2.18 (1H, ddd, J = 14.7, 8.8, 6.0 Hz), 1.76 (3H, s), 1.73 (3H, s) and 1.15 (18H, t, J = 7.3 Hz); 31P NMR (162 MHz, D2O) -10.81 (d, J = 19.9 Hz) and -11.52 (d, J = 20.0 Hz); triphosphate; m/z (ES-) 590 ([M-H], 100%); 1H NMR (400 MHz, D2O) 8.00 (1H, s), 6.12 (1H, dd, J = 9.0, 5.7 Hz), 4.78 (1H, d, J = 5.7 Hz), 4.25 (1H, m), 4.04 (2H, m), 3.58 (2H, t, J = 6.1 Hz), 3.01 (18H, q, J = 7.3 Hz), 2.48 (2H, t, J = 6.1 Hz), 2.40 (1H, dd, J = 14.2, 5.7 Hz), 2.18 (1H, ddd, J = 14.6, 8.9, 5.8 Hz), 1.76 (3H, s), 1.73 (3H, s) and 1.09 (27H, t, J = 7.3 Hz); 31P NMR (162 MHz, D2O) -10.93 (d, J = 19.8 Hz), -11.71 (d, J = 20.2 Hz) and -23.37 (t, J = 19.8 Hz); tetraphosphate; m/z (ES-) 670 ([M-H], 100%); 1H NMR (400 MHz, D2O) 8.00 (1H, s), 6.12 (1H, dd, J = 8.9, 5.6 Hz), 4.78 (1H, d, J = 4.6 Hz), 4.26 (1H, m), 4.05 (2H, m), 3.58 (2H, t, J = 6.0 Hz), 3.31 (24H, q, J = 7.0 Hz), 2.49 (2H, t, J = 5.91 Hz), 2.40 (1H, dd, J = 14.3, 5.4 Hz), 2.19 (1H, ddd, J = 14.6, 8.6, 5.8 Hz), 1.76 (3H, s), 1.73 (3H, s) and 1.13 (36H, t, J = 7.4 Hz); 31P NMR (162 MHz, D2O) -10.86 (d, J = 17.1 Hz), -11.84 (d, J = 17.4 Hz), and -23.16 (m); pentaphosphate; m/z (ES-) 510 ([M-H], 100%), 772 ([M+Na-2H], 7%) and 788 ([M+K-2H], 60%); 1H NMR (400 MHz, D2O) 8.00 (1H, s), 6.12 (1H, dd, J = 8.9, 5.8 Hz), 4.77 (1H, d, J = 5.2 Hz), 4.26 (1H, m), 4.05 (2H, m), 3.58 (2H, t, J = 6.1 Hz), 3.31 (30H, q, J = 7.2 Hz), 2.48 (2H, t, J = 6.1 Hz), 2.40 (1H, dd, J = 14.3, 5.6 Hz), 2.18 (1H, ddd, J = 14.6, 8.5, 6.0 Hz), 1.76 (3H, s), 1.73 (3H, s) and 1.13 (45H, t, J = 7.2 Hz); 31P NMR (162 MHz, D2O) -10.73 (d, J = 14.3 Hz), -11.74 (d, J = 14.6 Hz) and -22.75 (m).
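The negative-ion mass spectra of the phosphate series can be sanity-checked with nominal-mass arithmetic. The sketch below assumes that each additional phosphate adds one HPO3 unit (nominal mass 80) and starts from the reported diphosphate [M-H] value of 510; the names are illustrative.

```python
HPO3_NOMINAL_MASS = 80   # nominal mass added per extra phosphate unit
NA_MINUS_H = 23 - 1      # replacing one H by Na in a singly charged anion
K_MINUS_H = 39 - 1       # replacing one H by K in a singly charged anion


def expected_m_minus_h(n_phosphates: int, diphosphate_m_minus_h: int = 510) -> int:
    """Return the nominal [M-H] m/z expected for a 5'-oligophosphate."""
    return diphosphate_m_minus_h + HPO3_NOMINAL_MASS * (n_phosphates - 2)


series = {n: expected_m_minus_h(n) for n in (2, 3, 4, 5)}
sodium_adduct = series[5] + NA_MINUS_H      # [M+Na-2H] for the pentaphosphate
potassium_adduct = series[5] + K_MINUS_H    # [M+K-2H] for the pentaphosphate

print(series)
print(sodium_adduct, potassium_adduct)
```

On this basis the expected values of 590 and 670 match those reported for the triphosphate and tetraphosphate, and the pentaphosphate adduct peaks reported at 772 and 788 correspond to a nominal [M-H] of 750, whereas the value of 510 listed for the pentaphosphate coincides with the diphosphate value.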
Enzymatic incorporation of 2'-deoxy-3'-aminooxy-5-(3-hydroxy-1-butynyl)-uridine triphosphate (T1) into a single stranded nucleic acid initiator. An engineered terminal deoxynucleotidyl transferase (TdT) was used to incorporate T1 into six nucleic acid sequences (SEQs 1 to 6) with different 3'-terminal sequence contexts (see Figure 1). Quantitative addition efficiency was observed in all cases. This demonstrates that the T1 triphosphate is a suitable monomer for de novo enzymatic DNA synthesis.
SEQ 1: /5Cy3/CAATCAGGTGAAG
SEQ 2: /5Cy3/CAATCAGGTGAAA
SEQ 3: /5Cy3/CAATCAGGTGAAT
SEQ 4: /5Cy3/CAATCAGGTGAAC
SEQ 5: /5Cy3/CAATCAGGTGTTT
SEQ 6: /5Cy3/CAATCAGGTGTTU
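The six initiators share a common 5' portion and differ only at their 3' ends, which is what the "different 3'-terminal sequence contexts" refers to. The sketch below extracts those contexts; it assumes that /5Cy3/ is a slash-delimited modification tag (presumably a 5' cyanine 3 dye, by analogy with the /icy3/ notation used for SEQ 7) and not part of the base sequence. The helper names are hypothetical.

```python
import re

INITIATORS = {
    "SEQ 1": "/5Cy3/CAATCAGGTGAAG",
    "SEQ 2": "/5Cy3/CAATCAGGTGAAA",
    "SEQ 3": "/5Cy3/CAATCAGGTGAAT",
    "SEQ 4": "/5Cy3/CAATCAGGTGAAC",
    "SEQ 5": "/5Cy3/CAATCAGGTGTTT",
    "SEQ 6": "/5Cy3/CAATCAGGTGTTU",
}


def strip_modifications(seq: str) -> str:
    """Remove slash-delimited modification tags such as /5Cy3/ (assumed notation)."""
    return re.sub(r"/[^/]+/", "", seq)


def three_prime_context(seq: str, n: int = 3) -> str:
    """Return the last n bases at the 3' end, ignoring modification tags."""
    return strip_modifications(seq)[-n:]


contexts = {name: three_prime_context(seq) for name, seq in INITIATORS.items()}
for name, context in contexts.items():
    print(f"{name}: 3'-terminal context {context}")
```

Read this way, SEQs 1 to 4 vary only the 3'-terminal base after a common AA, while SEQs 5 and 6 present T-rich ends terminating in T and U respectively.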
Multi-cycle de novo enzymatic DNA synthesis involving 2'-deoxy-3'-aminooxy-5-(3-hydroxy-1-butynyl)-uridine triphosphate (T1). Initiator oligonucleotide of SEQ 7 was immobilised on superparamagnetic particles to facilitate retention through multiple cycles of DNA synthesis. An engineered TdT enzyme mediated the first addition of T1 to the initiator oligonucleotide. Sodium nitrite in acetate buffer was then employed to deblock the 3'-aminooxy reversible terminator, revealing a 3'-hydroxyl competent for further addition. A second addition was then performed, whereby the engineered TdT enzyme added either A, C, T, G, or T1 (all in 3'-aminooxy reversibly terminated form). A cleavage solution containing uracil DNA glycosylase (UDG) and N,N'-dimethylethylenediamine (DMED) was then used to cleave DNA from the paramagnetic particles for analysis by denaturing polyacrylamide gel electrophoresis. The DNA was visualised by virtue of the internal cyanine 3 dye (see Figure 2). Lane 1: no addition control. Lanes 2 and 3: single addition of T1 triphosphate. Lanes 4-8: addition of T1 followed by A, C, T, G, or T1 for lanes 4 to 8 respectively. The gel clearly shows quantitative addition of T1 to the solid support immobilised initiator strand. Furthermore, the gel demonstrates that multi-cycle de novo enzymatic DNA synthesis can be achieved when T1 forms part or the entirety of the synthesised sequence.
SEQ 7: T*T*T* I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I TTUTTTT/icy3/TTTTT
Note that an asterisk (*) indicates a phosphorothioate linkage and /icy3/ indicates an internal cyanine 3 dye.

Claims

1. A compound according to Formula (1c) or (1d):
Figure imgf000033_0001
wherein, R1 is a phosphate or polyphosphate group or salt thereof, optionally containing one or more sulfur atoms;
R2 is H, halo, OH, NH2, COOH, COH, C1-3 alkoxy, C1-3 alkyl optionally substituted with OH, NH2 or halo atoms;
R3 is selected from H, OH, F, OCH3 or OCH2CH2OMe; and R4 is H or D.
2. The compound according to claim 1 according to Formula (1a) or (1b):
Figure imgf000033_0002
wherein, R1 is a phosphate or polyphosphate group or salt thereof, optionally containing one or more sulfur atoms;
R2 is H, halo, OH, NH2, COOH, COH, C1-3 alkoxy, C1-3 alkyl optionally substituted with OH, NH2 or halo atoms; and
R3 is selected from H, OH, F, OCH3 or OCH2CH2OMe.
3. The compound according to claim 1 or 2, wherein R1 is a phosphate or polyphosphate group selected from -(PO3)x(PO2S)y(PO3)z where x, y and z are independently 0-5 and x+y+z is 1-5.
4. The compound according to any one of claims 1 to 3, wherein R1 is a phosphate or polyphosphate group having one or more sulfur atoms.
5. The compound according to any one of claims 1 to 4, wherein R1 is a monophosphate, diphosphate, triphosphate, tetraphosphate, pentaphosphate, or (alpha-thio)triphosphate group.
6. The compound according to claim 1, wherein R1 is a triphosphate group.
7. The compound according to any one of claims 1 to 6, wherein R2 is OH or CH2OH.
8. The compound according to any one of claims 1 to 7, wherein R3 is H.
9. The compound according to any one of claims 3 to 8, wherein R4 is D.
10. The compound according to claim 1 which is:
Figure imgf000034_0001
or a salt thereof.
11. A method of nucleic acid synthesis comprising reacting a compound according to any one of claims 1 to 10 with an oligonucleotide sequence in the presence of a terminal deoxynucleotidyl transferase (TdT) enzyme.
12. The method according to claim 11, wherein the oligonucleotide sequence is a solid-supported oligonucleotide sequence.
13. The method according to claim 11 or claim 12 further comprising a reaction step with a nitrite salt.
14. The method according to claim 13, wherein the nitrite salt is sodium nitrite.
15. A kit comprising:
(i) a compound according to any one of claims 1 to 10; (ii) a terminal deoxynucleotidyl transferase (TdT) enzyme; and optionally
(iii) a nitrite salt.
16. An oligonucleotide according to Formula (2c) or (2d):
Figure imgf000035_0001
wherein, R1 is an oligonucleotide;
R2 is H, halo, OH, NH2, COOH, COH, C1-3 alkoxy, C1-3 alkyl optionally substituted with OH, NH2 or halo atoms;
R3 is selected from H, OH, F, OCH3 or OCH2CH2OMe; and R4 is H or D.
17. The oligonucleotide according to claim 16 according to Formula (2a) or (2b):
Figure imgf000035_0002
wherein, R1 is an oligonucleotide;
R2 is H, halo, OH, NH2, COOH, COH, C1-3 alkoxy, C1-3 alkyl optionally substituted with OH, NH2 or halo atoms; and R3 is selected from H, OH, F, OCH3 or OCH2CH2OMe.
18. The oligonucleotide according to claim 16 or 17, wherein R2 is OH or CH2OH.
19. The oligonucleotide according to claims 16 to 18, wherein R3 is H.
20. The oligonucleotide according to claim 16, wherein R4 is D.
PCT/GB2021/050839 2020-04-06 2021-04-06 C5-modified thymidines WO2021205155A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GBGB2005043.1A GB202005043D0 (en) 2020-04-06 2020-04-06 C5-modified Thymidines
GB2005043.1 2020-04-06

Publications (2)

Publication Number Publication Date
WO2021205155A2 true WO2021205155A2 (en) 2021-10-14
WO2021205155A3 WO2021205155A3 (en) 2021-11-18

Family

ID=70769009

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB2021/050839 WO2021205155A2 (en) 2020-04-06 2021-04-06 C5-modified thymidines

Country Status (2)

Country Link
GB (1) GB202005043D0 (en)
WO (1) WO2021205155A2 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016128731A1 (en) 2015-02-10 2016-08-18 Nuclera Nucleics Ltd Novel use
WO2016139477A1 (en) 2015-03-03 2016-09-09 Nuclera Nucleics Ltd A process for the preparation of nucleic acid by means of 3'-o-azidomethyl nucleotide triphosphate
WO2017009663A1 (en) 2015-07-15 2017-01-19 Nuclera Nucleics Ltd Azidomethyl ether deprotection method

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8034923B1 (en) * 2009-03-27 2011-10-11 Steven Albert Benner Reagents for reversibly terminating primer extension
EP3356381A4 (en) * 2015-09-28 2019-06-12 The Trustees of Columbia University in the City of New York Design and synthesis of novel disulfide linker based nucleotides as reversible terminators for dna sequencing by synthesis
GB201718804D0 (en) * 2017-11-14 2017-12-27 Nuclera Nucleics Ltd Novel use


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LAKSHMAN ET AL., J. ORG. CHEM., vol. 75, 2010, pages 2461 - 2473

Also Published As

Publication number Publication date
WO2021205155A3 (en) 2021-11-18
GB202005043D0 (en) 2020-05-20

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21719205

Country of ref document: EP

Kind code of ref document: A2