
CN110835630A - A high-efficiency sgRNA and its application in gene editing - Google Patents


Publication number
CN110835630A
Authority
CN
China
Prior art keywords
sequence
sgrna
protein
sakkhn
rna
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911200779.0A
Other languages
Chinese (zh)
Other versions
CN110835630B (en)
Inventor
张成伟
徐雯
刘亚
赵思
杨进孝
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Academy of Agriculture and Forestry Sciences
Original Assignee
Beijing Academy of Agriculture and Forestry Sciences
Application filed by Beijing Academy of Agriculture and Forestry Sciences
Priority to CN201911200779.0A
Publication of CN110835630A
Application granted
Publication of CN110835630B
Legal status: Active

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/113 Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex-forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82 Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/8216 Methods for controlling, regulating or enhancing expression of transgenes in plant cells
    • C12N15/8218 Antisense, co-suppression, viral induced gene silencing [VIGS], post-transcriptional induced gene silencing [PTGS]
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22 Ribonucleases [RNase]; Deoxyribonucleases [DNase]
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/78 Hydrolases (3) acting on carbon to nitrogen bonds other than peptide bonds (3.5)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y305/00 Hydrolases acting on carbon-nitrogen bonds, other than peptide bonds (3.5)
    • C12Y305/04 Hydrolases acting on carbon-nitrogen bonds, other than peptide bonds (3.5) in cyclic amidines (3.5.4)
    • C12Y305/04001 Cytosine deaminase (3.5.4.1)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00 Structure or type of the nucleic acid
    • C12N2310/10 Type of nucleic acid
    • C12N2310/20 Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPR]

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Engineering & Computer Science (AREA)
  • Chemical & Material Sciences (AREA)
  • Organic Chemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Biotechnology (AREA)
  • Biochemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Microbiology (AREA)
  • Biophysics (AREA)
  • Physics & Mathematics (AREA)
  • Plant Pathology (AREA)
  • Medicinal Chemistry (AREA)
  • Virology (AREA)
  • Cell Biology (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)

Abstract

The invention provides a high-efficiency sgRNA and its application in gene editing. The sgRNA has the structure shown in formula I: RNA transcribed from the target sequence - modified sgRNA backbone (formula I). The modified sgRNA backbone is an RNA molecule obtained by inserting RNA fragment A between positions 14 and 15 of the sgRNA backbone and inserting RNA fragment B between positions 18 and 19 of the sgRNA backbone. RNA fragment A and RNA fragment B are reverse complementary, and each is 3 nt in size. The sgRNA backbone is the RNA molecule shown in sequence 9. Experiments show that the modified sgRNA significantly improves the C·T base substitution efficiency of a cytosine base editor (CBE), with the highest efficiency reaching 86.4%.

Description

A high-efficiency sgRNA and its application in gene editing

Technical field

The invention belongs to the field of biotechnology and specifically relates to a high-efficiency sgRNA and its application in gene editing.

Background

CRISPR-Cas9 technology has become a powerful genome editing tool and is widely applied in many tissues and cells. The CRISPR/Cas9 protein-RNA complex is directed to the target site by a guide RNA and cleaves it to produce a DNA double-strand break (DSB), after which the organism initiates its DNA repair mechanisms to repair the DSB. There are generally two repair mechanisms: non-homologous end joining (NHEJ) and homology-directed repair (HDR). NHEJ usually predominates, so repair produces random indels (insertions or deletions) far more often than precise repair. For precise base substitution, the use of HDR is greatly limited by its low efficiency and its need for a DNA template.

In 2016, the laboratories of David Liu and Akihiko Kondo independently reported two different types of cytosine base editor (CBE), which use two different cytidine deaminases: rAPOBEC1 (rat APOBEC1) and PmCDA1 (an activation-induced cytidine deaminase (AID) ortholog from sea lamprey). Both work by using a cytidine deaminase to edit a single cytosine (C) base directly, without generating a DSB or initiating HDR, which greatly improves the efficiency of replacing C with thymine (T). Specifically, dead Cas9 (dCas9) or Cas9 nickase (Cas9n), carrying rAPOBEC1 or PmCDA1, is directed to the target site by the sgRNA; rAPOBEC1 or PmCDA1 catalyzes deamination of C on the unpaired single-stranded DNA to give uracil (U); DNA repair then pairs U with adenine (A), and DNA replication finally pairs T with A, thereby achieving the C-to-T conversion. Among the editors tested, the SpCas9n(D10A)&rAPOBEC1/PmCDA1&UGI base editing system, which contains a uracil DNA glycosylase inhibitor (UGI), has a higher average mutation rate, for two reasons. First, UGI inhibits uracil DNA glycosylase (UDG), which would otherwise catalyze removal of U from DNA. Second, SpCas9n(D10A) nicks the non-edited strand, inducing the eukaryotic mismatch repair mechanism or the long-patch BER (base-excision repair) mechanism, which biases repair of the U:G mismatch toward U:A. To improve working efficiency and reduce cost, increasing C·T base substitution efficiency has long been a research direction for base editing systems in animal and plant genomes.

Cas9 from Staphylococcus aureus (SaCas9) is a homolog of SpCas9 and recognizes the NNGRRT PAM; the SaCas9 variant SaKKH recognizes the broader NNNRRT PAM. Both have been developed into effective CBEs, greatly expanding the range of editable C bases in animal and plant genomes. To date, no study has reported improving the C·T base substitution efficiency of SaKKH-related CBEs by modifying the structure of the corresponding SaCas9 sgRNA.
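The difference between the two PAMs can be illustrated with a small scanner. This is a minimal sketch, not part of the patent: the function names and the toy DNA string are hypothetical, only the given strand is scanned, and the IUPAC codes are read as N = any base and R = A or G.

```python
import re
from typing import List

# IUPAC codes needed for the two PAMs: N = any base, R = purine (A or G).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]", "R": "[AG]"}


def pam_regex(pam: str):
    """Translate an IUPAC PAM string such as 'NNGRRT' into a compiled regex."""
    return re.compile("".join(IUPAC[ch] for ch in pam))


def find_pam_sites(dna: str, pam: str) -> List[int]:
    """Return 0-based start positions of every PAM match, overlaps included."""
    dna = dna.upper()
    pattern = pam_regex(pam)
    width = len(pam)
    return [
        i
        for i in range(len(dna) - width + 1)
        if pattern.fullmatch(dna, i, i + width)
    ]


toy_dna = "TTGAGTCCCAATAA"  # hypothetical sequence, for illustration only
sacas9_sites = find_pam_sites(toy_dna, "NNGRRT")  # SaCas9 PAM
sakkh_sites = find_pam_sites(toy_dna, "NNNRRT")   # SaKKH PAM
```

Because NNNRRT relaxes the third position, every NNGRRT site is also an NNNRRT site, so the SaKKH list always contains the SaCas9 list.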

Summary of the invention

The purpose of the present invention is to improve the C·T base substitution efficiency of a cytosine base editor (CBE).

To achieve the above purpose, the present invention provides a reagent kit comprising an sgRNA or a biological material related to the sgRNA, a Cas9 nuclease or a biological material related to the Cas9 nuclease, and a cytosine deaminase or a biological material related to the cytosine deaminase;

the sgRNA targets a target sequence;

the sgRNA is as shown in formula I: RNA transcribed from the target sequence - modified sgRNA backbone (formula I);

the modified sgRNA backbone is an RNA molecule obtained by inserting RNA fragment A between positions 14 and 15 of the sgRNA backbone and inserting RNA fragment B between positions 18 and 19 of the sgRNA backbone;

RNA fragment A is reverse complementary to RNA fragment B;

RNA fragment A and RNA fragment B are each 3 nt in size;

the sgRNA backbone is m1), m2) or m3):

m1) the RNA molecule shown in sequence 9;

m2) an RNA molecule derived from the RNA molecule of m1) by substitution and/or deletion and/or addition of one or several nucleotides and having the same function;

m3) an RNA molecule having 75% or more identity with the nucleotide sequence defined in m1) or m2) and having the same function.
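The backbone modification described above can be sketched as a string operation. This is an illustrative sketch only: the function names are hypothetical, positions are read as 1-based, and the backbone used in the example is a labelled placeholder rather than sequence 9, which is not reproduced here.

```python
# Watson-Crick pairing for RNA.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(rna: str) -> str:
    """Return the reverse complement of an RNA string (A/C/G/U only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def modify_backbone(backbone: str, fragment_a: str) -> str:
    """Insert fragment A between positions 14 and 15 and its reverse
    complement (fragment B) between positions 18 and 19 (1-based)."""
    if len(fragment_a) != 3:
        raise ValueError("fragment A must be 3 nt long")
    if len(backbone) < 19:
        raise ValueError("backbone must be at least 19 nt long")
    fragment_b = reverse_complement_rna(fragment_a)
    # backbone[:14] is positions 1-14; backbone[14:18] is positions 15-18.
    return backbone[:14] + fragment_a + backbone[14:18] + fragment_b + backbone[18:]


# Placeholder backbone: x = positions 1-14, y = positions 15-18, z = the rest.
placeholder = "x" * 14 + "yyyy" + "z" * 6
modified = modify_backbone(placeholder, "GCA")
```

With fragment A = GCA, fragment B is UGC, so the two inserts can pair with each other on either side of positions 15-18; the modified backbone is 6 nt longer than the original.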

In the above reagent kit, the modified sgRNA backbone is n1), n2) or n3):

n1) the RNA molecule shown in sequence 10;

n2) an RNA molecule derived from the RNA molecule of n1) by substitution and/or deletion and/or addition of one or several nucleotides and having the same function;

n3) an RNA molecule having 75% or more identity with the nucleotide sequence defined in n1) or n2) and having the same function.

In the above reagent kit, the Cas9 nuclease may be a protein such as SaKKHn, SaCas9, SaKKH-HF or SaCas9-HF. In a specific embodiment of the present invention, the Cas9 nuclease is the SaKKHn protein.

The SaKKHn protein is E1), E2) or E3):

E1) a protein whose amino acid sequence is shown in sequence 2;

E2) a protein derived from the amino acid sequence shown in sequence 2 of the sequence listing by substitution and/or deletion and/or addition of one or several amino acid residues and having the same function;

E3) a fusion protein obtained by attaching a tag to the N-terminus and/or C-terminus of E1) or E2);

the biological material related to the SaKKHn protein is any one of F1) to F5):

F1) a nucleic acid molecule encoding the SaKKHn protein;

F2) an expression cassette containing the nucleic acid molecule of F1);

F3) a recombinant vector containing the nucleic acid molecule of F1), or a recombinant vector containing the expression cassette of F2);

F4) a recombinant microorganism containing the nucleic acid molecule of F1), or a recombinant microorganism containing the expression cassette of F2), or a recombinant microorganism containing the recombinant vector of F3);

F5) a transgenic cell line containing the nucleic acid molecule of F1), or a transgenic cell line containing the expression cassette of F2).

In the above reagent kit, the cytosine deaminase may be a protein such as human APOBEC3A, human AID, PmCDA1 or rAPOBEC1. In a specific embodiment of the present invention, the cytosine deaminase is the PmCDA1 protein.

The PmCDA1 protein is G1), G2) or G3):

G1) a protein whose amino acid sequence is shown in sequence 3;

G2) a protein derived from the amino acid sequence shown in sequence 3 of the sequence listing by substitution and/or deletion and/or addition of one or several amino acid residues and having the same function;

G3) a fusion protein obtained by attaching a tag to the N-terminus and/or C-terminus of G1) or G2);

the biological material related to the PmCDA1 protein is any one of H1) to H5):

H1) a nucleic acid molecule encoding the PmCDA1 protein;

H2) an expression cassette containing the nucleic acid molecule of H1);

H3) a recombinant vector containing the nucleic acid molecule of H1), or a recombinant vector containing the expression cassette of H2);

H4) a recombinant microorganism containing the nucleic acid molecule of H1), or a recombinant microorganism containing the expression cassette of H2), or a recombinant microorganism containing the recombinant vector of H3);

H5) a transgenic cell line containing the nucleic acid molecule of H1), or a transgenic cell line containing the expression cassette of H2).

In the above reagent kit, the sgRNA may be a tRNA-sgRNA;

the tRNA-sgRNA is as shown in formula I: tRNA - RNA transcribed from the target sequence - modified sgRNA backbone (formula I);

the tRNA is 1), 2) or 3):

1) the RNA molecule obtained by replacing T with U in positions 474-550 of sequence 1;

2) an RNA molecule derived from the RNA molecule of 1) by substitution and/or deletion and/or addition of one or several nucleotides and having the same function;

3) an RNA molecule having 75% or more identity with the nucleotide sequence defined in 1) or 2) and having the same function.
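Item 1) above derives the tRNA from a DNA region by replacing T with U. A minimal sketch of that step follows, assuming that the positions are 1-based and inclusive (so positions 474-550 span 77 nt); the function name and the toy input are hypothetical, and sequence 1 itself is not reproduced here.

```python
def transcribe_region(dna: str, start: int, end: int) -> str:
    """Return the RNA form (T replaced by U) of dna[start..end],
    with 1-based inclusive coordinates."""
    if not 1 <= start <= end <= len(dna):
        raise ValueError("coordinates fall outside the sequence")
    return dna[start - 1:end].upper().replace("T", "U")


# Toy input standing in for sequence 1 (hypothetical, 600 nt).
toy_sequence_1 = "ACGT" * 150
trna_like = transcribe_region(toy_sequence_1, 474, 550)
```

The `start - 1` offset converts the 1-based patent coordinates to Python's 0-based slicing; forgetting it would shift the extracted region by one nucleotide.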

The above reagent kit may further comprise a UGI protein or a biological material related to the UGI protein;

the UGI protein is I1), I2) or I3):

I1) a protein whose amino acid sequence is shown in sequence 4;

I2) a protein derived from the amino acid sequence shown in sequence 4 of the sequence listing by substitution and/or deletion and/or addition of one or several amino acid residues and having the same function;

I3) a fusion protein obtained by attaching a tag to the N-terminus and/or C-terminus of I1) or I2);

the biological material related to the UGI protein is any one of J1) to J5):

J1) a nucleic acid molecule encoding the UGI protein;

J2) an expression cassette containing the nucleic acid molecule of J1);

J3) a recombinant vector containing the nucleic acid molecule of J1), or a recombinant vector containing the expression cassette of J2);

J4) a recombinant microorganism containing the nucleic acid molecule of J1), or a recombinant microorganism containing the expression cassette of J2), or a recombinant microorganism containing the recombinant vector of J3);

J5) a transgenic cell line containing the nucleic acid molecule of J1), or a transgenic cell line containing the expression cassette of J2).

To facilitate purification of the proteins in E1), G1) and I1), a tag shown in the table below may be attached to the amino terminus or carboxyl terminus of the protein consisting of the amino acid sequence shown in sequence 2, sequence 3 or sequence 4 of the sequence listing.

Table: Tag sequences

Tag            Residues             Sequence
Poly-Arg       5-6 (usually 5)      RRRRR
Poly-His       2-10 (usually 6)     HHHHHH
FLAG           8                    DYKDDDDK
Strep-tag II   8                    WSHPQFEK
c-myc          10                   EQKLISEEDL
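The tag table can be captured as a lookup, with tagging expressed as simple concatenation at either terminus. This is a sketch under simplifying assumptions: `add_tag` and the toy protein are hypothetical, and linkers, cleavage sites and the initiator methionine are ignored.

```python
# Tag sequences as listed in the table (one-letter amino acid codes).
TAGS = {
    "Poly-Arg": "RRRRR",
    "Poly-His": "HHHHHH",
    "FLAG": "DYKDDDDK",
    "Strep-tag II": "WSHPQFEK",
    "c-myc": "EQKLISEEDL",
}


def add_tag(protein: str, tag_name: str, terminus: str = "N") -> str:
    """Attach a tag to the N-terminus ('N') or C-terminus ('C') of a protein."""
    tag = TAGS[tag_name]
    if terminus == "N":
        return tag + protein
    if terminus == "C":
        return protein + tag
    raise ValueError("terminus must be 'N' or 'C'")


toy_protein = "MKV"  # hypothetical three-residue protein, for illustration only
c_tagged = add_tag(toy_protein, "FLAG", "C")
```

The residue counts in the table (8 for FLAG and Strep-tag II, 10 for c-myc) match the lengths of the listed sequences.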

The proteins in E2), G2) and I2) above are proteins that have 75% or more identity with the amino acid sequence of the protein shown in sequence 2, sequence 3 or sequence 4 and have the same function. Having 75% or more identity means having 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99% identity.

The proteins in E2), G2) and I2) above can be synthesized artificially, or obtained by first synthesizing their coding genes and then expressing them biologically.

The coding genes of the proteins in E2), G2) and I2) above can be obtained from the DNA sequence shown in positions 3013-6225 of sequence 1 (encoding the protein shown in sequence 2), positions 6511-7134 of sequence 1 (encoding the protein shown in sequence 3) or positions 7156-7452 of sequence 1 (encoding the protein shown in sequence 4) by deleting the codons of one or several amino acid residues, and/or introducing missense mutations of one or several base pairs, and/or attaching the coding sequence of a tag shown in the table above to its 5′ end and/or 3′ end.
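As a quick consistency check on the coordinates above, the length of each coding region can be computed and tested for divisibility by three. The sketch assumes 1-based inclusive positions; the helper names are hypothetical, and whether each region includes a stop codon is not stated in the text.

```python
# Coding regions within sequence 1, as (start, end), 1-based inclusive.
REGIONS = {
    "SaKKHn": (3013, 6225),
    "PmCDA1": (6511, 7134),
    "UGI": (7156, 7452),
}


def region_length(start: int, end: int) -> int:
    """Length in nucleotides of a 1-based inclusive interval."""
    return end - start + 1


def codon_count(start: int, end: int) -> int:
    """Number of whole codons in the interval; raises if it is not in frame."""
    length = region_length(start, end)
    if length % 3 != 0:
        raise ValueError("region length is not a multiple of 3")
    return length // 3


summary = {
    name: (region_length(*span), codon_count(*span))
    for name, span in REGIONS.items()
}
```

All three regions come out as whole numbers of codons: 3213 nt (1071 codons) for SaKKHn, 624 nt (208 codons) for PmCDA1 and 297 nt (99 codons) for UGI, and the regions do not overlap.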

Further, the nucleic acid molecule of F1) is f1), f2) or f3):

f1) the cDNA molecule or DNA molecule shown in positions 3013-6225 of sequence 1 in the sequence listing;

f2) a cDNA molecule or DNA molecule that has 75% or more identity with the nucleotide sequence defined in f1) and encodes the SaKKHn;

f3) a cDNA molecule or DNA molecule that hybridizes under stringent conditions to the nucleotide sequence defined in f1) or f2) and encodes the SaKKHn;

the nucleic acid molecule of H1) is h1), h2) or h3):

h1) the cDNA molecule or DNA molecule shown in positions 6511-7134 of sequence 1 in the sequence listing;

h2) a cDNA molecule or DNA molecule that has 75% or more identity with the nucleotide sequence defined in h1) and encodes the PmCDA1;

h3) a cDNA molecule or DNA molecule that hybridizes under stringent conditions to the nucleotide sequence defined in h1) or h2) and encodes the PmCDA1;

the nucleic acid molecule of J1) is j1), j2) or j3):

j1) the cDNA molecule or DNA molecule shown in positions 7156-7452 of sequence 1 in the sequence listing;

j2) a cDNA molecule or DNA molecule that has 75% or more identity with the nucleotide sequence defined in j1) and encodes the UGI;

j3) a cDNA molecule or DNA molecule that hybridizes under stringent conditions to the nucleotide sequence defined in j1) or j2) and encodes the UGI.

The nucleic acid molecule may be DNA, such as cDNA, genomic DNA or recombinant DNA; it may also be RNA, such as mRNA or hnRNA.

Those of ordinary skill in the art can readily mutate the nucleotide sequences of the present invention encoding the SaKKHn, the PmCDA1 or the UGI by known methods such as directed evolution and point mutation. Artificially modified nucleotides that have 75% or higher identity with the nucleotide sequence of the SaKKHn, the PmCDA1 or the UGI of the present invention are derived from, and equivalent to, the sequences of the present invention, provided that they encode the SaKKHn, the PmCDA1 or the UGI and have the same function.

The term "identity" as used herein refers to sequence similarity to a natural nucleic acid sequence. "Identity" includes nucleotide sequences having 75% or higher, 85% or higher, 90% or higher, or 95% or higher identity with the nucleotide sequences of the present invention that encode the proteins consisting of the amino acid sequences shown in sequences 2, 3 and 4. Identity can be evaluated by eye or with computer software. With computer software, the identity between two or more sequences can be expressed as a percentage (%), which can be used to evaluate the identity between related sequences.
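A percentage identity of the kind described above can be computed as follows. This is a simplified sketch, not the method used in the patent: it assumes the two sequences have already been aligned to equal length with "-" as the gap character, counts identical residues over aligned columns, and leaves the alignment itself to dedicated software. The function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Columns where both sequences have a gap are ignored; a column counts
    as a match only if both residues are identical and not gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (x, y) for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if not (x == "-" and y == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 75.0) -> bool:
    """True if the identity is at or above the threshold (75% by default)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Different tools define the denominator differently (for example, alignment length versus the shorter sequence length), so a reported percentage is only comparable when the same convention is used.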

The stringent conditions are: hybridizing and washing the membrane twice, 5 min each time, at 68°C in a solution of 2×SSC and 0.1% SDS, then hybridizing and washing the membrane twice, 15 min each time, at 68°C in a solution of 0.5×SSC and 0.1% SDS; or hybridizing and washing the membrane at 65°C in a solution of 0.1×SSPE (or 0.1×SSC) and 0.1% SDS.

The above identity of 75% or more may be identity of 80%, 85%, 90% or 95% or more.

The expression cassette containing the nucleic acid molecule encoding the SaKKHn protein described in F2) (the SaKKHn gene expression cassette) refers to DNA capable of expressing the SaKKHn protein in a host cell. This DNA may include not only a promoter that initiates transcription of the SaKKHn gene but also a terminator that terminates its transcription. The expression cassette may further include an enhancer sequence. A recombinant vector containing the SaKKHn gene expression cassette can be constructed from an existing expression vector.

The expression cassette containing the nucleic acid molecule encoding the PmCDA1 protein described in H2) (the PmCDA1 gene expression cassette) refers to DNA capable of expressing the PmCDA1 protein in a host cell. This DNA may include not only a promoter that initiates transcription of the PmCDA1 gene but also a terminator that terminates its transcription. The expression cassette may further include an enhancer sequence. A recombinant vector containing the PmCDA1 gene expression cassette can be constructed from an existing expression vector.

The expression cassette containing the nucleic acid molecule encoding the UGI protein described in J2) (the UGI gene expression cassette) refers to DNA capable of expressing the UGI protein in a host cell. This DNA may include not only a promoter that initiates transcription of the UGI gene but also a terminator that terminates its transcription. The expression cassette may further include an enhancer sequence. A recombinant vector containing the UGI gene expression cassette can be constructed from an existing expression vector.

The vector may be a plasmid, cosmid, phage or viral vector. In specific embodiments of the present invention, the recombinant vector is the SaKKHn-pBE+3bp-1, SaKKHn-pBE+3bp-2, SaKKHn-pBE+3bp-3, SaKKHn-pBE+3bp-4, SaKKHn-pBE+3bp-5, SaKKHn-pBE+3bp-6 or SaKKHn-pBE+3bp-7 recombinant expression vector.

The nucleotide sequence of the SaKKHn-pBE+3bp-1 recombinant expression vector is obtained by replacing each DNA sequence of the Original sgRNA backbone in the SaKKHn-pBE-1 recombinant expression vector with the DNA sequence of the +3bp sgRNA backbone shown in sequence 6, leaving all other sequences unchanged.

The nucleotide sequence of the SaKKHn-pBE+3bp-2 recombinant expression vector is obtained by replacing each DNA sequence of the Original sgRNA backbone in the SaKKHn-pBE-2 recombinant expression vector with the DNA sequence of the +3bp sgRNA backbone shown in sequence 6, leaving all other sequences unchanged.

The nucleotide sequence of the SaKKHn-pBE+3bp-3 recombinant expression vector is obtained by replacing the DNA sequence of the Original sgRNA backbone in the SaKKHn-pBE-3 recombinant expression vector with the DNA sequence of the +3bp sgRNA backbone shown in sequence 6, leaving all other sequences unchanged.

The nucleotide sequence of the SaKKHn-pBE+3bp-4 recombinant expression vector is obtained by replacing the DNA sequence of the Original sgRNA backbone in the SaKKHn-pBE-4 recombinant expression vector with the DNA sequence of the +3bp sgRNA backbone shown in sequence 6, leaving all other sequences unchanged.

The nucleotide sequence of the SaKKHn-pBE+3bp-5 recombinant expression vector is obtained by replacing the DNA sequence of the Original sgRNA backbone in the SaKKHn-pBE-5 recombinant expression vector with the DNA sequence of the +3bp sgRNA backbone shown in sequence 6, leaving all other sequences unchanged.

The nucleotide sequence of the SaKKHn-pBE+3bp-6 recombinant expression vector is obtained by replacing the DNA sequence of the Original sgRNA backbone in the SaKKHn-pBE-6 recombinant expression vector with the DNA sequence of the +3bp sgRNA backbone shown in sequence 6, leaving all other sequences unchanged.

The nucleotide sequence of the SaKKHn-pBE+3bp-7 recombinant expression vector is obtained by replacing the DNA sequence of the Original sgRNA backbone in the SaKKHn-pBE-7 recombinant expression vector with the DNA sequence of the +3bp sgRNA backbone shown in sequence 6, leaving all other sequences unchanged.

The microorganism may be a yeast, a bacterium, an alga, or a fungus. The bacterium may be Agrobacterium, such as Agrobacterium EHA105. In specific embodiments of the present invention, the recombinant microorganism is Agrobacterium EHA105 containing the SaKKHn-pBE+3bp-1, SaKKHn-pBE+3bp-2, SaKKHn-pBE+3bp-3, SaKKHn-pBE+3bp-4, SaKKHn-pBE+3bp-5, SaKKHn-pBE+3bp-6, or SaKKHn-pBE+3bp-7 recombinant expression vector.

The transgenic cell line does not include propagation material.

The above reagent kit has the following uses:

X1) editing a genomic target sequence of an organism or a biological cell;

X2) preparing a product for editing a genomic target sequence of an organism or a biological cell;

X3) improving the editing efficiency of a genomic target sequence of an organism or a biological cell;

X4) preparing a product for improving the editing efficiency of a genomic target sequence of an organism or a biological cell.

The sgRNA and the modified sgRNA backbone in the above reagent kit also fall within the scope of protection of the present invention.

To achieve the above purpose, the present invention also provides a new use of the above reagent kit, sgRNA, or modified sgRNA backbone.

The present invention provides the use of the above reagent kit, sgRNA, or modified sgRNA backbone in any one of X1) to X4):

X1) editing a genomic target sequence of an organism or a biological cell;

X2) preparing a product for editing a genomic target sequence of an organism or a biological cell;

X3) improving the editing efficiency of a genomic target sequence of an organism or a biological cell;

X4) preparing a product for improving the editing efficiency of a genomic target sequence of an organism or a biological cell.

To achieve the above purpose, the present invention finally provides the method described in Y1) or Y2):

Y1) a method for editing a genomic target sequence, or for improving the editing efficiency of a genomic target sequence of an organism or a biological cell, comprising expressing the above sgRNA, the above Cas9 nuclease, and the above cytosine deaminase in the organism or biological cell to edit the genomic target sequence, wherein the sgRNA targets the target sequence;

Y2) a method for preparing a biological mutant, comprising the following step: editing the genome of an organism according to the method described in Y1) to obtain the biological mutant.

In the above method, in Y1), the sgRNA is the above tRNA-sgRNA. The tRNA-sgRNA transcribed from the DNA molecule that encodes it is an immature RNA precursor; the tRNA in this precursor is cleaved off by two enzymes (RNase P and RNase Z) to yield the mature RNA. A recombinant expression vector yields as many independent mature RNAs as it contains targets. Each mature RNA consists, in order, of the RNA transcribed from the target sequence and the sgRNA backbone, or, in order, of the RNA transcribed from the target sequence, the sgRNA backbone, and a few residual bases of the tRNA.

In the above method, Y1) further comprises a step of expressing UGI in the organism or biological cell, and the number of UGI copies may be one, two, or more. In specific embodiments of the present invention, the number of UGI copies is one.

Further, the above sgRNA, Cas9 nuclease, cytosine deaminase, and UGI are expressed in the organism or biological cell by introducing the gene encoding the Cas9 nuclease, the DNA molecule that transcribes the sgRNA, the gene encoding the cytosine deaminase, and the gene encoding the UGI into the organism or biological cell.

Still further, the gene encoding the Cas9 nuclease, the DNA molecule that transcribes the sgRNA, the gene encoding the cytosine deaminase, and the gene encoding the UGI are introduced into the organism or biological cell by means of a recombinant expression vector.

The gene encoding the Cas9 nuclease, the DNA molecule that transcribes the sgRNA, the gene encoding the cytosine deaminase, and the gene encoding the UGI may be introduced into the organism or biological cell on a single recombinant expression vector, or may be co-introduced on two or more recombinant expression vectors.

In specific embodiments of the present invention, the gene encoding the Cas9 nuclease, the DNA molecule that transcribes the sgRNA, the gene encoding the cytosine deaminase, and the gene encoding the UGI are introduced into the organism or biological cell on a single recombinant expression vector. The recombinant expression vector contains expression cassette A and expression cassette B; expression cassette A expresses the above sgRNA, and expression cassette B expresses a fusion protein composed of the above Cas9 nuclease, the above cytosine deaminase, and the above UGI.

The recombinant expression vector is specifically the SaKKHn-pBE+3bp-1, SaKKHn-pBE+3bp-2, SaKKHn-pBE+3bp-3, SaKKHn-pBE+3bp-4, SaKKHn-pBE+3bp-5, SaKKHn-pBE+3bp-6, or SaKKHn-pBE+3bp-7 recombinant expression vector.

In the above reagent kit, use, or method, the number of target sequences may be one, two, or more. The PAM sequence of the target sequence is NNNRRT.

In the above reagent kit, use, or method, editing of the genomic target sequence means mutating a C in the target sequence to T. The C may be a C at any position in the target sequence.
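The PAM and editing definitions above can be illustrated with a short sketch. It scans the forward strand of a DNA string for targets followed by an NNNRRT PAM (N is any base, R is A or G) and lists the C positions that are candidates for C-to-T editing. The function names, the toy input, and the 20-nt target length are assumptions made for illustration; reverse-strand sites are not considered.

```python
import re

# IUPAC codes in the PAM: N = any base, R = A or G.
PAM_REGEX = "[ACGT]{3}[AG][AG]T"  # NNNRRT
TARGET_LEN = 20                   # assumption: 20-nt target upstream of the PAM


def find_targets(dna: str):
    """Return (start, target, pam) for each forward-strand site where a
    TARGET_LEN-nt target is immediately followed by an NNNRRT PAM."""
    dna = dna.upper()
    # A lookahead is used so that overlapping sites are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{TARGET_LEN}}})({PAM_REGEX}))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]


def editable_c_positions(target: str):
    """0-based indices of every C in the target; each is a candidate C-to-T edit."""
    return [i for i, base in enumerate(target.upper()) if base == "C"]
```

For example, `find_targets` applied to a genomic fragment returns the 0-based start of each candidate target, and `editable_c_positions` can then be applied to each returned target.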

In the above reagent kit, use, or method, the organism is S1), S2), S3), or S4):

S1) a plant or an animal;

S2) a monocotyledonous plant or a dicotyledonous plant;

S3) a plant of the family Poaceae;

S4) rice;

The biological cell is T1), T2), T3), or T4):

T1) a plant cell or an animal cell;

T2) a monocotyledonous plant cell or a dicotyledonous plant cell;

T3) a Poaceae plant cell;

T4) a rice cell.

The present invention provides a modified sgRNA whose structure is shown in formula I: (RNA transcribed from the target sequence)-(modified sgRNA backbone) (formula I). The modified sgRNA backbone is the RNA molecule obtained by inserting RNA fragment A between positions 14 and 15 of the sgRNA backbone and inserting RNA fragment B between positions 18 and 19 of the sgRNA backbone. RNA fragment A and RNA fragment B are reverse complements of each other, and each is 3 nt long. The sgRNA backbone is the RNA molecule shown in sequence 9. Experiments demonstrate that the modified sgRNA of the present invention significantly improves the C·T base substitution efficiency of the cytosine base editor (CBE), reaching up to 86.4%.

Description of the Drawings

Figure 1 shows the unmodified SaCas9 sgRNA structure and the modified SaCas9 sgRNA structures.

Figure 2 is a schematic diagram of the structure of the recombinant expression vector.

Detailed Description of the Embodiments

The present invention is described in further detail below with reference to specific embodiments. The examples are given only to illustrate the present invention and not to limit its scope. Unless otherwise specified, the experimental methods in the following examples are conventional methods. Unless otherwise specified, the materials, reagents, and instruments used in the following examples are commercially available. Unless otherwise specified, in the following examples the first position of each nucleotide sequence in the sequence listing is the 5'-terminal nucleotide of the corresponding DNA/RNA, and the last position is the 3'-terminal nucleotide of the corresponding DNA/RNA.

Primer pair T1 consists of primer T1-F: 5'-ttcaaattctaatccccaatcc-3' and primer T1-R: 5'-tcgtacctgtctgcaaccttg-3', and is used to amplify target T1.

Primer pair T2 consists of primer T2-F: 5'-gctttagatgatttgttacatttcgc-3' and primer T2-R: 5'-tgagttggtatggcaagaacaag-3', and is used to amplify target T2.

Primer pair T3 consists of primer T3-F: 5'-aacacggtcaccaacttcatc-3' and primer T3-R: 5'-acaacctggcttgctatatatgc-3', and is used to amplify target T3.

Primer pair T4 consists of primer T4-F: 5'-tggatcggatatggacttctc-3' and primer T4-R: 5'-gaaatgaacaatcacctgagatctttg-3', and is used to amplify targets T4 and T7.

Primer pair T5 consists of primer T5-F: 5'-cgagctacctgaagaacaactacc-3' and primer T5-R: 5'-cctcgattgcctgaaatttg-3', and is used to amplify target T5.

Primer pair T6 consists of primer T6-F: 5'-tgcgagctcgacaacatcatg-3' and primer T6-R: 5'-gacggcccatgtggaaacc-3', and is used to amplify target T6.

Primer pair T8 consists of primer T8-F: 5'-gacgcccatagtcgaggtc-3' and primer T8-R: 5'-ctctgctggatcaatgtcaatg-3', and is used to amplify target T8.

Primer pair T9 consists of primer T9-F: 5'-cctcatccaatcgactgacac-3' and primer T9-R: 5'-gtaattgtgcttggtgatggag-3', and is used to amplify target T9.

In the following examples, C·T base substitution means that a C at any position in the target sequence is mutated to T.

C·T base substitution efficiency = (number of positive T0 seedlings carrying a C·T base substitution / total number of positive T0 seedlings analyzed) × 100%.
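The efficiency definition above can be written as a small helper. This is a minimal sketch: the function name and the input checks are assumptions added for illustration, and the counts in the usage note are hypothetical rather than data from the examples.

```python
def ct_substitution_efficiency(edited_positive: int, total_positive: int) -> float:
    """C·T base substitution efficiency in percent.

    edited_positive: positive T0 seedlings carrying a C·T base substitution
    total_positive:  all positive T0 seedlings analyzed
    """
    if total_positive <= 0:
        raise ValueError("total_positive must be greater than zero")
    if not 0 <= edited_positive <= total_positive:
        raise ValueError("edited_positive must lie between 0 and total_positive")
    return edited_positive / total_positive * 100.0
```

With hypothetical counts of 5 edited seedlings out of 20 positive seedlings analyzed, the helper returns 25.0.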

Nipponbare rice: described in Liang Weihong, Wang Gaohua, Du Jingyao, et al. Effects of sodium nitroprusside and its photolysis products on seedling growth and the expression of five hormone marker genes in Nipponbare rice [J]. Journal of Henan Normal University (Natural Science Edition), 2017(2): 48-52. Available to the public from the Beijing Academy of Agriculture and Forestry Sciences.

Recovery medium: N6 solid medium containing 200 mg/L Timentin.

Screening medium: N6 solid medium containing 50 mg/L hygromycin.

Differentiation medium: N6 solid medium containing 2 mg/L KT, 0.2 mg/L NAA, 0.5 g/L glutamic acid, and 0.5 g/L proline.

Rooting medium: N6 solid medium containing 0.2 mg/L NAA, 0.5 g/L glutamic acid, and 0.5 g/L proline.

Example 1. Modification of the sgRNA backbone structure in the SaCas9 sgRNA

The SaCas9 sgRNA has the following structure: (RNA transcribed from the target sequence)-(sgRNA backbone).

The sgRNA backbone structure within the SaCas9 sgRNA was modified; the two backbone modifications are shown in Figure 1.

Original denotes the unmodified SaCas9 sgRNA structure. The unmodified SaCas9 sgRNA is referred to as the Original sgRNA, and its sgRNA backbone is referred to as the Original sgRNA backbone. The RNA sequence of the Original sgRNA backbone is: GUUUUAGUACUCUGGAAACAGAAUCUACUAAAACAAGGCAAAAUGCCGUGUUUAUCUCGUCAACUUGUUGGCGAGAU (sequence 9). The DNA sequence of the Original sgRNA backbone is: GTTTTAGTACTCTGGAAACAGAATCTACTAAAACAAGGCAAAATGCCGTGTTTATCTCGTCAACTTGTTGGCGAGAT.

O+3bp denotes the SaCas9 sgRNA structure obtained by adding 3 base pairs to the unmodified SaCas9 sgRNA structure. This modified SaCas9 sgRNA is referred to as the +3bp sgRNA, and its sgRNA backbone is referred to as the +3bp sgRNA backbone. The RNA sequence of the +3bp sgRNA backbone is: GUUUUAGUACUCUGCUGGAAACAGCAGAAUCUACUAAAACAAGGCAAAAUGCCGUGUUUAUCUCGUCAACUUGUUGGCGAGAU (sequence 10; the added 3 base pairs, underlined in the original, are CUG at positions 15-17 and CAG at positions 22-24). The DNA sequence of the +3bp sgRNA backbone is shown in sequence 6.

O+8bp denotes the SaCas9 sgRNA structure obtained by adding 8 base pairs to the unmodified SaCas9 sgRNA structure. This modified SaCas9 sgRNA is referred to as the +8bp sgRNA, and its sgRNA backbone is referred to as the +8bp sgRNA backbone. The RNA sequence of the +8bp sgRNA backbone is: GUUUUAGUACUCUGUAAUUUUAGAAAUAAAAUUACAGAAUCUACUAAAACAAGGCAAAAUGCCGUGUUUAUCUCGUCAACUUGUUGGCGAGAU (the added 8 base pairs, underlined in the original, are UAAUUUUA at positions 15-22 and UAAAAUUA at positions 27-34). The DNA sequence of the +8bp sgRNA backbone is shown in sequence 7.
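The two backbone modifications can be reproduced from the Original sgRNA backbone with a short sketch: fragment A is inserted between positions 14 and 15, and its reverse complement (fragment B) is inserted between positions 18 and 19. The function names and default arguments are assumptions chosen for illustration; the backbone string is the sequence 9 RNA given in this example.

```python
# Original SaCas9 sgRNA backbone (sequence 9), 77 nt.
ORIGINAL_BACKBONE = (
    "GUUUUAGUACUCUGGAAACAGAAUCUACUAAAACAAGGCAAAAUGCCGUG"
    "UUUAUCUCGUCAACUUGUUGGCGAGAU"
)

_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA string written 5' to 3'."""
    return rna.translate(_COMPLEMENT)[::-1]


def extend_stem(backbone: str, fragment_a: str, left: int = 14, right: int = 18) -> str:
    """Insert fragment_a after 1-based position `left` and its reverse
    complement after 1-based position `right` of the original backbone."""
    fragment_b = reverse_complement(fragment_a)
    return (
        backbone[:left] + fragment_a
        + backbone[left:right] + fragment_b
        + backbone[right:]
    )


plus3_backbone = extend_stem(ORIGINAL_BACKBONE, "CUG")       # +3bp backbone
plus8_backbone = extend_stem(ORIGINAL_BACKBONE, "UAAUUUUA")  # +8bp backbone
```

Because fragment B is computed as the reverse complement of fragment A, the two inserts pair with each other and lengthen the stem flanking the GAAA loop by 3 or 8 base pairs.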

Example 2. Use of the modified SaCas9 sgRNAs to improve the C·T base substitution efficiency of the SaKKHn&PmCDA1&UGI base editing system

I. Construction of recombinant expression vectors

The following recombinant expression vectors were synthesized artificially; each is a circular plasmid:

Two recombinant expression vectors containing the Original sgRNA: SaKKHn-pBE-1 and SaKKHn-pBE-2;

Two recombinant expression vectors containing the +3bp sgRNA: SaKKHn-pBE+3bp-1 and SaKKHn-pBE+3bp-2;

Two recombinant expression vectors containing the +8bp sgRNA: SaKKHn-pBE+8bp-1 and SaKKHn-pBE+8bp-2.

The nucleotide sequence of the SaKKHn-pBE-1 recombinant expression vector is sequence 1 in the sequence listing. In sequence 1, positions 131-467 are the nucleotide sequence of the OsU3 promoter; positions 474-550 and 648-724 are both tRNA nucleotide sequences; positions 551-647 and 725-821 are the nucleotide sequences of the two sgRNAs targeting the OsWaxy gene, and the DNA sequence of the sgRNA backbone shared by these two sgRNAs (the Original sgRNA backbone) is positions 571-647 or positions 745-821 of sequence 1; positions 996-1286 are the nucleotide sequence of the OsU3 terminator; positions 1293-3006 are the nucleotide sequence of the OsUbq3 promoter; positions 3013-6225 are the coding sequence of the SaKKHn protein (without a stop codon), encoding the SaKKHn protein shown in sequence 2; positions 6511-7134 are the coding sequence of the PmCDA1 protein (without a stop codon), encoding the PmCDA1 protein shown in sequence 3; positions 7156-7452 are the coding sequence of the UGI protein, encoding the UGI protein shown in sequence 4; positions 7459-7653 are the nucleotide sequence of the 35S terminator; positions 7728-9720 are the nucleotide sequence of the ZmUbi1 promoter; positions 9727-10749 are the coding sequence of hygromycin phosphotransferase; and positions 10779-10994 are the nucleotide sequence of CaMV35S polyA. The two targets in the SaKKHn-pBE-1 recombinant expression vector are T1 and T2; their sequences are given in Table 1.
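The position ranges quoted for sequence 1 are 1-based and inclusive, so the target portion of each sgRNA can be derived by subtracting the backbone range from the sgRNA range. The sketch below does this arithmetic for the tRNA-sgRNA region; the dictionary keys and function names are hypothetical labels, while the coordinates are those listed above.

```python
# 1-based, inclusive coordinates in sequence 1 (SaKKHn-pBE-1), as listed above.
FEATURES = {
    "tRNA_1": (474, 550),
    "sgRNA_1": (551, 647),
    "backbone_1": (571, 647),
    "tRNA_2": (648, 724),
    "sgRNA_2": (725, 821),
    "backbone_2": (745, 821),
}


def span_length(span):
    """Length of a 1-based inclusive span."""
    start, end = span
    return end - start + 1


def to_slice(span):
    """Convert a 1-based inclusive span to a 0-based Python slice."""
    start, end = span
    return slice(start - 1, end)


def target_span(sgrna, backbone):
    """Span of the target sequence: the part of the sgRNA before its backbone."""
    return (sgrna[0], backbone[0] - 1)


target_1 = target_span(FEATURES["sgRNA_1"], FEATURES["backbone_1"])  # (551, 570)
target_2 = target_span(FEATURES["sgRNA_2"], FEATURES["backbone_2"])  # (725, 744)
```

Given the full vector sequence as a string `seq` (not reproduced here), `seq[to_slice(target_1)]` would extract the first target; each backbone span is 77 nt long, matching the length of the Original sgRNA backbone.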

The nucleotide sequence of the SaKKHn-pBE-2 recombinant expression vector is the sequence obtained by replacing positions 474-995 of sequence 1 with sequence 5, while keeping all other sequences unchanged. In sequence 5, positions 1-77 and 175-251 are both tRNA nucleotide sequences, and positions 78-174 and 252-348 are the nucleotide sequences of the two sgRNAs targeting the OsNRT1.1B gene and the OsGRF4 gene, respectively. Positions 98-174 and 272-348 are the DNA sequence of the Original sgRNA backbone. The two targets in the SaKKHn-pBE-2 recombinant expression vector are T3 and T4; their sequences are given in Table 1.

The nucleotide sequence of the SaKKHn-pBE+3bp-1 recombinant expression vector is the sequence obtained by replacing every DNA sequence of the Original sgRNA backbone in the SaKKHn-pBE-1 recombinant expression vector with the DNA sequence of the +3bp sgRNA backbone shown in sequence 6, while keeping all other sequences unchanged.

The nucleotide sequence of the SaKKHn-pBE+3bp-2 recombinant expression vector is the sequence obtained by replacing every DNA sequence of the Original sgRNA backbone in the SaKKHn-pBE-2 recombinant expression vector with the DNA sequence of the +3bp sgRNA backbone shown in sequence 6, while keeping all other sequences unchanged.

The nucleotide sequence of the SaKKHn-pBE+8bp-1 recombinant expression vector is the sequence obtained by replacing every DNA sequence of the Original sgRNA backbone in the SaKKHn-pBE-1 recombinant expression vector with the DNA sequence of the +8bp sgRNA backbone shown in sequence 7, while keeping all other sequences unchanged.

The nucleotide sequence of the SaKKHn-pBE+8bp-2 recombinant expression vector is the sequence obtained by replacing every DNA sequence of the Original sgRNA backbone in the SaKKHn-pBE-2 recombinant expression vector with the DNA sequence of the +8bp sgRNA backbone shown in sequence 7, while keeping all other sequences unchanged.

The target nucleotide sequences of each vector and the corresponding PAM sequences are shown in Table 1.

Table 1

[Table 1 is provided as an image in the original publication: Figure BDA0002295817890000101]

II. Obtaining rice positive T0 seedlings

The SaKKHn-pBE-1, SaKKHn-pBE-2, SaKKHn-pBE+3bp-1, SaKKHn-pBE+3bp-2, SaKKHn-pBE+8bp-1, and SaKKHn-pBE+8bp-2 vectors obtained in step I were each processed according to the following steps 1-9:

1. The vector was introduced into Agrobacterium EHA105 (Shanghai Weidi Biotechnology Co., Ltd., CAT#: AC1010) to obtain recombinant Agrobacterium.

2. The recombinant Agrobacterium was cultured in YEP medium containing 50 μg/ml kanamycin and 25 μg/ml rifampicin at 28°C with shaking at 150 rpm until the OD600 reached 1.0-2.0, then centrifuged at 10000 rpm for 1 min at room temperature. The cells were resuspended in infection solution (N6 liquid medium in which the sugar is replaced by glucose and sucrose, at 10 g/L and 20 g/L respectively) and diluted to an OD600 of 0.2 to obtain the Agrobacterium infection solution.

3. Mature seeds of the rice variety Nipponbare were dehulled, placed in a 100 mL Erlenmeyer flask, and soaked in 70% (v/v) aqueous ethanol for 30 sec. They were then sterilized in 25% (v/v) aqueous sodium hypochlorite with shaking at 120 rpm for 30 min, rinsed 3 times with sterile water, and blotted dry with filter paper. The seeds were placed embryo-side down on N6 solid medium and cultured in the dark at 28°C for 4-6 weeks to obtain rice callus.

4. After step 3, the rice callus was soaked for 10 min in Agrobacterium infection solution A (the liquid obtained by adding acetosyringone to the Agrobacterium infection solution at a volume ratio of acetosyringone to infection solution of 25 μl : 50 ml). The callus was then placed on a petri dish lined with two layers of sterile filter paper (containing about 200 ml of infection solution without Agrobacterium) and cultured in the dark at 21°C for 1 day.

5. The rice callus obtained in step 4 was transferred to recovery medium and cultured in the dark at 25-28°C for 3 days.

6. The rice callus obtained in step 5 was placed on screening medium and cultured in the dark at 28°C for 2 weeks.

7. The rice callus obtained in step 6 was placed on fresh screening medium and cultured in the dark at 28°C for 2 weeks to obtain resistant rice callus.

8. The resistant rice callus obtained in step 7 was transferred to differentiation medium and cultured under light at 25°C for about 1 month. The differentiated plantlets were moved to rooting medium and cultured under light at 25°C for 2 weeks to obtain rice T0 seedlings.

9. Genomic DNA was extracted from the rice T0 seedlings and used as the template for PCR amplification with the primer pair consisting of primer F (5'-attatgtagcttgtgcgtttcg-3') and primer R (5'-ctccacctcattgacattatgc-3'). The PCR product was analyzed by agarose gel electrophoresis and judged as follows: if the PCR product contains a DNA fragment of about 898 bp, the corresponding rice T0 seedling is a positive T0 seedling; if it does not, the corresponding rice T0 seedling is not a positive T0 seedling.
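The dilution to an OD600 of 0.2 in step 2 and the 25 μl : 50 ml acetosyringone ratio in step 4 are simple proportional calculations, sketched below. The function names are hypothetical, and the dilution helper assumes that OD600 scales linearly with cell density over the working range, which the text does not state.

```python
def dilution_volumes(stock_od: float, final_volume_ml: float, target_od: float = 0.2):
    """Volumes (ml) of resuspended cells and of infection solution needed to
    reach target_od, using C1*V1 = C2*V2 (assumes OD600 is linear in density)."""
    if stock_od <= 0 or final_volume_ml <= 0:
        raise ValueError("stock_od and final_volume_ml must be positive")
    if stock_od < target_od:
        raise ValueError("the suspension is already below the target OD600")
    cells_ml = target_od * final_volume_ml / stock_od
    return cells_ml, final_volume_ml - cells_ml


def acetosyringone_ul(infection_solution_ml: float) -> float:
    """Acetosyringone volume (μl) at the stated ratio of 25 μl per 50 ml."""
    return infection_solution_ml * 25.0 / 50.0
```

For instance, a resuspension measured at OD600 1.0 would need about 10 ml of cells plus 40 ml of infection solution to give 50 ml at OD600 0.2, and 50 ml of infection solution would receive 25 μl of acetosyringone.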

III. Analysis of results

1. For each vector, genomic DNA of the rice positive T0 seedlings obtained in step II was used as the template. For target T1, PCR amplification was performed with primer pair T1; for target T2, with primer pair T2; for target T3, with primer pair T3; and for target T4, with primer pair T4. A PCR product was obtained in each case.

2. The PCR products obtained in step 1 were subjected to Sanger sequencing and analysis. Only the target regions were analyzed in the sequencing results. The number of positive T0 seedlings carrying a C·T base substitution was counted for each of T1, T2, T3, and T4, and the C·T base substitution efficiency was calculated. The results are shown in Table 2.

The results show that, at all four targets, the SaKKHn&PmCDA1&UGI base editing system using the +3bp sgRNA increased the C·T base substitution efficiency relative to the system using the Original sgRNA; at the T2 target alone, the efficiency increased 3-fold. The system using the +8bp sgRNA was inconsistent: it increased the C·T base substitution efficiency to varying degrees at the T2, T3, and T4 targets but reduced it somewhat at the T1 target. In terms of overall improvement, except at the T4 target, the system using the +3bp sgRNA achieved a higher C·T base substitution efficiency than the system using the +8bp sgRNA.

Table 2

实施例3、+3bp sgRNA在提高SaKKHn&PmCDA1&UGI碱基编辑系统的C·T碱基替换效率中的应用Example 3. Application of +3bp sgRNA in improving the C·T base substitution efficiency of SaKKHn&PmCDA1&UGI base editing system

一、重组表达载体的构建1. Construction of recombinant expression vector

人工合成如下重组表达载体,各表达载体均为环状质粒:The following recombinant expression vectors were artificially synthesized, and each expression vector was a circular plasmid:

五个含有Original sgRNA的重组表达载体:SaKKHn-pBE-3、SaKKHn-pBE-4、SaKKHn-pBE-5、SaKKHn-pBE-6和SaKKHn-pBE-7;Five recombinant expression vectors containing Original sgRNA: SaKKHn-pBE-3, SaKKHn-pBE-4, SaKKHn-pBE-5, SaKKHn-pBE-6 and SaKKHn-pBE-7;

五个含有+3bp sgRNA的重组表达载体:SaKKHn-pBE+3bp-3、SaKKHn-pBE+3bp-4、SaKKHn-pBE+3bp-5、SaKKHn-pBE+3bp-6和SaKKHn-pBE+3bp-7。Five recombinant expression vectors containing +3bp sgRNA: SaKKHn-pBE+3bp-3, SaKKHn-pBE+3bp-4, SaKKHn-pBE+3bp-5, SaKKHn-pBE+3bp-6 and SaKKHn-pBE+3bp-7 .

SaKKHn-pBE-3重组表达载体的核苷酸序列为将序列1第474-995位替换为序列8,且保持其他序列不变后得到的序列。其中,序列8的第1-77位为tRNA的核苷酸序列,第78-174位为靶向OsWaxy基因的sgRNA的核苷酸序列,第98-174位为Original sgRNA骨架的DNA序列。SaKKHn-pBE-3重组表达载体中的靶点为T5,序列见表3。The nucleotide sequence of the SaKKHn-pBE-3 recombinant expression vector is the sequence obtained by replacing the 474-995th position of the sequence 1 with the sequence 8, and keeping the other sequences unchanged. Wherein, the 1-77th position of sequence 8 is the nucleotide sequence of tRNA, the 78th-174th position is the nucleotide sequence of the sgRNA targeting the OsWaxy gene, and the 98th-174th position is the DNA sequence of the Original sgRNA backbone. The target in the SaKKHn-pBE-3 recombinant expression vector is T5, and the sequence is shown in Table 3.

The nucleotide sequence of the SaKKHn-pBE-4 recombinant expression vector is obtained by replacing the T5 target sequence in the SaKKHn-pBE-3 recombinant expression vector with the T6 target sequence while keeping the rest of the sequence unchanged. The T6 target sequence is given in Table 3.

The nucleotide sequence of the SaKKHn-pBE-5 recombinant expression vector is obtained by replacing the T5 target sequence in the SaKKHn-pBE-3 recombinant expression vector with the T7 target sequence while keeping the rest of the sequence unchanged. The T7 target sequence is given in Table 3.

The nucleotide sequence of the SaKKHn-pBE-6 recombinant expression vector is obtained by replacing the T5 target sequence in the SaKKHn-pBE-3 recombinant expression vector with the T8 target sequence while keeping the rest of the sequence unchanged. The T8 target sequence is given in Table 3.

The nucleotide sequence of the SaKKHn-pBE-7 recombinant expression vector is obtained by replacing the T5 target sequence in the SaKKHn-pBE-3 recombinant expression vector with the T9 target sequence while keeping the rest of the sequence unchanged. The T9 target sequence is given in Table 3.

The nucleotide sequence of the SaKKHn-pBE+3bp-3 recombinant expression vector is obtained by replacing the DNA sequence of the Original sgRNA backbone in the SaKKHn-pBE-3 recombinant expression vector with the DNA sequence of the +3bp sgRNA backbone shown in Sequence 6 while keeping the rest of the sequence unchanged.

The nucleotide sequence of the SaKKHn-pBE+3bp-4 recombinant expression vector is obtained by replacing the DNA sequence of the Original sgRNA backbone in the SaKKHn-pBE-4 recombinant expression vector with the DNA sequence of the +3bp sgRNA backbone shown in Sequence 6 while keeping the rest of the sequence unchanged.

The nucleotide sequence of the SaKKHn-pBE+3bp-5 recombinant expression vector is obtained by replacing the DNA sequence of the Original sgRNA backbone in the SaKKHn-pBE-5 recombinant expression vector with the DNA sequence of the +3bp sgRNA backbone shown in Sequence 6 while keeping the rest of the sequence unchanged.

The nucleotide sequence of the SaKKHn-pBE+3bp-6 recombinant expression vector is obtained by replacing the DNA sequence of the Original sgRNA backbone in the SaKKHn-pBE-6 recombinant expression vector with the DNA sequence of the +3bp sgRNA backbone shown in Sequence 6 while keeping the rest of the sequence unchanged.

The nucleotide sequence of the SaKKHn-pBE+3bp-7 recombinant expression vector is obtained by replacing the DNA sequence of the Original sgRNA backbone in the SaKKHn-pBE-7 recombinant expression vector with the DNA sequence of the +3bp sgRNA backbone shown in Sequence 6 while keeping the rest of the sequence unchanged.

The target nucleotide sequence of each vector and the corresponding PAM sequence are shown in Table 3.

Table 3

Target name | Target gene | Target sequence (5′-3′) | PAM | Recombinant expression vector names
T5 | OsWaxy | tcctcggcgtagtacgggct | CACGGT | SaKKHn-pBE-3; SaKKHn-pBE+3bp-3
T6 | OsWaxy | tatccgggcaaggtgagggc | CGTGGT | SaKKHn-pBE-4; SaKKHn-pBE+3bp-4
T7 | OsGRF4 | acgccggcaccgccctggct | CTGGGT | SaKKHn-pBE-5; SaKKHn-pBE+3bp-5
T8 | OsALS | cccaagcatgcgcagggaca | ACGGGT | SaKKHn-pBE-6; SaKKHn-pBE+3bp-6
T9 | OsALS | cacgtccttcccgctcgagg | CCGGGT | SaKKHn-pBE-7; SaKKHn-pBE+3bp-7
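As a consistency check on Table 3, the sketch below verifies that each target is 20 nt long and that each PAM fits the NNNRRT pattern (R = A or G). The NNNRRT consensus is general background on the SaCas9 KKH variant and is an assumption here, since this section does not state it; the names `TABLE_3`, `PAM_PATTERN` and `check_row` are hypothetical.

```python
import re

# Rows transcribed from Table 3: name -> (target gene, target 5'-3', PAM).
TABLE_3 = {
    "T5": ("OsWaxy", "tcctcggcgtagtacgggct", "CACGGT"),
    "T6": ("OsWaxy", "tatccgggcaaggtgagggc", "CGTGGT"),
    "T7": ("OsGRF4", "acgccggcaccgccctggct", "CTGGGT"),
    "T8": ("OsALS", "cccaagcatgcgcagggaca", "ACGGGT"),
    "T9": ("OsALS", "cacgtccttcccgctcgagg", "CCGGGT"),
}

# NNNRRT with R = A or G (assumed consensus, not taken from this section).
PAM_PATTERN = re.compile(r"[ACGT]{3}[AG]{2}T")


def check_row(target: str, pam: str) -> bool:
    """True if target is 20 nt of a/c/g/t and pam matches NNNRRT."""
    is_20_nt = len(target) == 20 and set(target) <= set("acgt")
    return is_20_nt and PAM_PATTERN.fullmatch(pam) is not None


results = {name: check_row(target, pam) for name, (_, target, pam) in TABLE_3.items()}
print(results)
```

A check of this kind only confirms that the table is internally consistent; it says nothing about editing efficiency at any target.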

2. Obtaining positive T0 rice seedlings

The ten vectors constructed in step 1 (SaKKHn-pBE-3, SaKKHn-pBE-4, SaKKHn-pBE-5, SaKKHn-pBE-6, SaKKHn-pBE-7, SaKKHn-pBE+3bp-3, SaKKHn-pBE+3bp-4, SaKKHn-pBE+3bp-5, SaKKHn-pBE+3bp-6 and SaKKHn-pBE+3bp-7) were each processed according to operations 1-9 of step 2 in Example 2 to obtain positive T0 rice seedlings.

3. Analysis of results

1. For each vector, genomic DNA from the positive T0 rice seedlings obtained in step 2 was used as the template. For the T5 target, PCR amplification was performed with primer pair T5; for the T6 target, with primer pair T6; for the T7 target, with primer pair T4; for the T8 target, with primer pair T8; and for the T9 target, with primer pair T9. A PCR amplification product was obtained in each case.

2. The PCR amplification products obtained in step 1 were subjected to Sanger sequencing and analysis. Only the target regions were analyzed. For each of the T5, T6, T7, T8 and T9 targets, the number of positive T0 seedlings carrying a C·T base substitution was counted and the C·T base substitution efficiency was calculated. The results are shown in Table 4.
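The efficiency calculation in this step can be sketched as follows. This assumes that efficiency means the share of positive T0 seedlings carrying a C·T substitution, expressed as a percentage; the text states only that the seedlings were counted and an efficiency calculated. The function name and the counts used are hypothetical and are not values from Table 4.

```python
def substitution_efficiency(edited: int, total: int) -> float:
    """Percentage of positive T0 seedlings with a C-to-T substitution.

    Assumed definition: edited seedlings / positive seedlings analyzed * 100.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= edited <= total:
        raise ValueError("edited must be between 0 and total")
    return 100.0 * edited / total


# Hypothetical counts for illustration only (not data from the patent).
example_efficiency = substitution_efficiency(edited=12, total=48)
print(f"{example_efficiency:.1f}%")
```

Under this definition, a target with no edited seedlings gives 0%, which matches how the Original sgRNA result at T9 is described in the results.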

The results show that, for all five targets, the SaKKHn&PmCDA1&UGI base editing system using the +3bp sgRNA achieved a higher C·T base substitution efficiency than the same system using the Original sgRNA. For the T9 target in particular, the system using the Original sgRNA produced no C·T base substitution, whereas the system using the +3bp sgRNA achieved it.

Table 4

[Table 4 appears as an image in the original publication: Figure BDA0002295817890000131]

The present invention has been described in detail above. Those skilled in the art can practice the invention over a wide range of equivalent parameters, concentrations and conditions without departing from its spirit and scope and without undue experimentation. Although specific embodiments have been given, it should be understood that the invention can be further modified. In short, in accordance with the principles of the invention, this application is intended to cover any variations, uses or adaptations of the invention, including departures from the scope disclosed in this application that are made using conventional techniques known in the art. Some of the essential features may be applied within the scope of the appended claims below.

Sequence Listing

<110> Beijing Academy of Agriculture and Forestry Sciences

<120> A high-efficiency sgRNA and its application in gene editing

<160> 10

<170> PatentIn version 3.5

<210> 1

<211> 17400

<212> DNA

<213> Artificial Sequence

<400> 1

ggtggcagga tatattgtgg tgtaaacatg gcactagcct caccgtcttc gcagacgagg 60

ccgctaagtc gcagctacgc tctcaacggc actgactagg tagtttaaac gtgcacttaa 120

ttaaggtacc gaagcaactt aaagttatca ggcatgcatg gatcttggag gaatcagatg 180

tgcagtcagg gaccatagca caagacaggc gtcttctact ggtgctacca gcaaatgctg 240

gaagccggga acactgggta cgttggaaac cacgtgatgt gaagaagtaa gataaactgt 300

aggagaaaag catttcgtag tgggccatga agcctttcag gacatgtatt gcagtatggg 360

ccggcccatt acgcaattgg acgacaacaa agactagtat tagtaccacc tcggctatcc 420

acatagatca aagctgattt aaaagagttg tgcagatgat ccgtggcgga tccaacaaag 480

caccagtggt ctagtggtag aatagtaccc tgccacggta cagacccggg ttcgattccc 540

ggctggtgca catccaatgc gatgatcaag gttttagtac tctggaaaca gaatctacta 600

aaacaaggca aaatgccgtg tttatctcgt caacttgttg gcgagataac aaagcaccag 660

tggtctagtg gtagaatagt accctgccac ggtacagacc cgggttcgat tcccggctgg 720

tgcaaatcac cagtggaagc taaggtttta gtactctgga aacagaatct actaaaacaa 780

ggcaaaatgc cgtgtttatc tcgtcaactt gttggcgaga taacaaagca ccagtggtct 840

agtggtagaa tagtaccctg ccacggtaca gacccgggtt cgattcccgg ctggtgcaac 900

cggatttgaa cgatggacgt tttagtactc tggaaacaga atctactaaa acaaggcaaa 960

atgccgtgtt tatctcgtca acttgttggc gagatttttt tttttcgttt tgcattgagt 1020

tttctccgtc gcatgtttgc agttttattt tccgttttgc attgaaattt ctccgtctca 1080

tgtttgcagc gtgttcaaaa agtacgcagc tgtatttcac ttatttacgg cgccacattt 1140

tcatgccgtt tgtgccaact atcccgagct agtgaataca gcttggcttc acacaacact 1200

ggtgacccgc tgacctgctc gtacctcgta ccgtcgtacg gcacagcatt tggaattaaa 1260

gggtgtgatc gatactgctt gctgctaagc ttacaaattc gggtcaaggc ggaagccagc 1320

gcgccacccc acgtcagcaa atacggaggc gcggggttga cggcgtcacc cggtcctaac 1380

ggcgaccaac aaaccagcca gaagaaatta cagtaaaaaa aaagtaaatt gcactttgat 1440

ccacctttta ttacctaagt ctcaatttgg atcaccctta aacctatctt ttcaatttgg 1500

gccgggttgt ggtttggact accatgaaca acttttcgtc atgtctaact tccctttcag 1560

caaacatatg aaccatatat agaggagatc ggccgtatac tagagctgat gtgtttaagg 1620

tcgttgattg cacgagaaaa aaaaatccaa atcgcaacaa tagcaaattt atctggttca 1680

aagtgaaaag atatgtttaa aggtagtcca aagtaaaact tatagataat aaaatgtggt 1740

ccaaagcgta attcactcaa aaaaaatcaa cgagacgtgt accaaacgga gacaaacggc 1800

atcttctcga aatttcccaa ccgctcgctc gcccgcctcg tcttcccgga aaccgcggtg 1860

gtttcagcgt ggcggattct ccaagcagac ggagacgtca cggcacggga ctcctcccac 1920

cacccaaccg ccataaatac cagccccctc atctcctctc ctcgcatcag ctccaccccc 1980

gaaaaatttc tccccaatct cgcgaggctc tcgtcgtcga atcgaatcct ctcgcgtcct 2040

caaggtacgc tgcttctcct ctcctcgctt cgtttcgatt cgatttcgga cgggtgaggt 2100

tgttttgttg ctagatccga ttggtggtta gggttgtcga tgtgattatc gtgagatgtt 2160

taggggttgt agatctgatg gttgtgattt gggcacggtt ggttcgatag gtggaatcgt 2220

ggttaggttt tgggattgga tgttggttct gatgattggg gggaattttt acggttagat 2280

gaattgttgg atgattcgat tggggaaatc ggtgtagatc tgttggggaa ttgtggaact 2340

agtcatgcct gagtgattgg tgcgatttgt agcgtgttcc atcttgtagg ccttgttgcg 2400

agcatgttca gatctactgt tccgctcttg attgagttat tggtgccatg ggttggtgca 2460

aacacaggct ttaatatgtt atatctgttt tgtgtttgat gtagatctgt agggtagttc 2520

ttcttagaca tggttcaatt atgtagcttg tgcgtttcga tttgatttca tatgttcaca 2580

gattagataa tgatgaactc ttttaattaa ttgtcaatgg taaataggaa gtcttgtcgc 2640

tatatctgtc ataatgatct catgttacta tctgccagta atttatgcta agaactatat 2700

tagaatatca tgttacaatc tgtagtaata tcatgttaca atctgtagtt catctatata 2760

atctattgtg gtaatttctt tttactatct gtgtgaagat tattgccact agttcattct 2820

acttatttct gaagttcagg atacgtgtgc tgttactacc tatctgaata catgtgtgat 2880

gtgcctgtta ctatcttttt gaatacatgt atgttctgtt ggaatatgtt tgctgtttga 2940

tccgttgttg tgtccttaat cttgtgctag ttcttaccct atctgtttgg tgattatttc 3000

ttgcagtacg taatggctcc taagaagaag cggaaggttg gcatccacgg tgtcccggcg 3060

gcaaagagaa actacatcct gggtctggcc atcggtatta catcggtggg ctacggcatc 3120

atcgactacg agacaaggga tgtcatcgat gccggcgtcc ggctcttcaa ggaggccaac 3180

gtggagaata acgagggcag gcgctccaag cgcggcgcgc ggaggctgaa gcgcaggcgg 3240

aggcatcgca tccagcgggt gaagaagctc ctcttcgact acaatctgct cacggatcat 3300

tccgagctgt ctggcatcaa cccatacgag gcgcgggtga agggcctgtc ccagaagctc 3360

tcggaggagg agttctcggc ggccctgctg catctcgcga agaggcgcgg cgtgcataat 3420

gtcaatgagg tggaggagga taccggcaat gagctgtcaa ccaaggagca gatcagcagg 3480

aactccaagg cgctggagga gaagtatgtg gcggagctcc agctcgagag gctgaagaag 3540

gatggcgagg tccggggctc catcaatagg ttcaagacat cggactacgt gaaggaggcc 3600

aagcagctcc tgaaggtgca gaaggcgtac caccagctgg accagagctt catcgacacc 3660

tacatcgatc tgctcgagac acgccggacg tactacgagg gcccgggcga gggctcaccg 3720

ttcggctgga aggacatcaa ggagtggtac gagatgctga tgggccactg cacctacttc 3780

cctgaggagc tgaggagcgt gaagtacgcg tacaatgcgg acctctacaa cgccctgaac 3840

gacctcaata acctcgtgat cacgcgcgac gagaatgaga agctcgagta ctacgagaag 3900

ttccagatca tcgagaacgt gttcaagcag aagaagaagc cgaccctcaa gcagatcgcc 3960

aaggagatcc tcgtcaatga ggaggacatc aagggctaca gggtgacctc gaccggcaag 4020

ccagagttca ccaacctgaa ggtctaccac gacatcaagg atatcaccgc ccgcaaggag 4080

atcatcgaga atgcggagct cctggatcag atcgcgaaga tcctcaccat ctaccagtcc 4140

agcgaggaca tccaggagga gctcacgaac ctgaatagcg agctgaccca ggaggagatc 4200

gagcagatct ccaacctcaa gggctacacc ggcacgcaca atctgagcct caaggcgatc 4260

aatctcatcc tcgatgagct ctggcataca aatgataacc agatcgccat cttcaatcgc 4320

ctcaagctgg tcccaaagaa ggtcgatctg tcgcagcaga aggagatccc aacgacactg 4380

gtcgatgact tcatcctctc acctgtcgtg aagaggtcgt tcatccagtc gatcaaggtc 4440

atcaatgcga tcatcaagaa gtacggcctc cctaatgata tcatcatcga gctggcccgc 4500

gagaagaatt caaaggacgc gcagaagatg atcaacgaga tgcagaagag gaatcggcag 4560

acaaacgagc gcatcgagga gatcatccgc acaaccggca aggagaatgc caagtacctg 4620

atcgagaaga tcaagctgca tgacatgcag gagggcaagt gcctctactc actggaggcc 4680

atcccactcg aggacctgct gaataaccca ttcaattacg aggtcgacca tatcatcccg 4740

cgctccgtgt cgttcgacaa ttccttcaat aacaaggtcc tcgtcaagca ggaggagaac 4800

tccaagaagg gcaatcgcac cccgttccag tacctgtcct cttcggacag caagatctct 4860

tacgagacat tcaagaagca catcctcaac ctggccaagg gcaagggccg gatctccaag 4920

accaagaagg agtacctcct ggaggagagg gatatcaacc ggttcagcgt gcagaaggac 4980

ttcatcaatc gcaacctggt cgatacccgg tacgccacca ggggcctcat gaacctgctc 5040

cggtcctact tccgggtgaa caatctcgac gtgaaggtca agagcatcaa cggcggcttc 5100

acctcgttcc tcaggcggaa gtggaagttc aagaaggagc ggaacaaggg ctacaagcac 5160

catgccgagg acgccctcat catcgcgaac gcggacttca tcttcaagga gtggaagaag 5220

ctcgataagg cgaagaaggt catggagaac cagatgttcg aggagaagca ggccgagtcg 5280

atgccagaga tcgagacaga gcaggagtac aaggagatct tcatcacccc gcaccagatc 5340

aagcacatca aggacttcaa ggactacaag tactcccatc gggtcgataa gaagccaaat 5400

cggaagctca tcaatgatac cctctactcg acacgcaagg atgacaaggg caacaccctg 5460

atcgtcaata acctcaatgg cctctacgac aaggataacg acaagctgaa gaagctcatc 5520

aacaagagcc cagagaagct cctcatgtac caccacgatc cgcagacata ccagaagctc 5580

aagctgatca tggagcagta cggcgacgag aagaacccac tctacaagta ctacgaggag 5640

acaggcaact acctgaccaa gtactccaag aaggacaatg gcccagtgat caagaagatc 5700

aagtactacg gcaataagct gaacgcccac ctcgatatca cggacgatta ccctaacagc 5760

cggaataagg tggtcaagct gtccctcaag ccgtaccgct tcgacgtcta cctggataac 5820

ggcgtctaca agttcgtgac agtcaagaat ctcgacgtca tcaagaagga gaactactac 5880

gaggtcaatt ctaagtgcta cgaggaggcc aagaagctca agaagatcag caaccaggcc 5940

gagttcatcg ccagcttcta caagaacgat ctgatcaaga tcaacggcga gctctacagg 6000

gtcatcggcg tgaacaatga cctgctcaat aggatcgagg tgaacatgat cgacatcacc 6060

taccgcgagt acctcgagaa catgaacgat aagcggcctc cacacatcat caagacaatc 6120

gcctctaaga cccagtccat caagaagtac tccacggata tcctcggcaa cctctacgag 6180

gtgaagtcaa agaagcaccc gcagatcatc aagaagggct cggctggagg aggaggcacg 6240

ggaggaggag gctccgccga gtatgtgcgc gcgctcttcg acttcaacgg caatgacgag 6300

gaggatctcc ctttcaagaa gggcgacatc ctccgcatcc gcgataagcc ggaggagcag 6360

tggtggaacg cagaggactc cgagggcaag cggggcatga tcctggtgcc atacgtcgag 6420

aagtacagcg gcgattacaa ggaccacgat ggcgactaca aggatcatga catcgattac 6480

aaggacgatg acgataagtc cggcgtcgac atgacggacg cggagtatgt gcgcatccac 6540

gagaagctcg atatctacac cttcaagaag cagttcttca acaataagaa gtcggtgtcc 6600

catcggtgct acgtcctctt cgagctgaag cgcaggggag agcgccgcgc ctgcttctgg 6660

ggctacgcgg tgaataagcc gcagtcaggc acagagcgcg gcatccacgc cgagatcttc 6720

tcgatccgga aggtcgagga gtacctccgc gacaacccag gccagttcac gatcaattgg 6780

tactccagct ggtccccttg cgcagattgc gcagagaaga tcctcgagtg gtacaaccag 6840

gagctgaggg gcaatggcca taccctcaag atctgggcct gcaagctgta ctacgagaag 6900

aacgcgagga atcagatcgg cctctggaac ctgcgggata atggcgtggg cctcaacgtg 6960

atggtgtccg agcactacca gtgctgccgc aagatcttca tccagtcctc ccacaatcag 7020

ctgaacgaga ataggtggct cgaaaagacc ctgaagcgcg ccgagaagtg gaggagcgag 7080

ctgtctatca tgatccaggt caagatcctg cacaccacaa agtcaccggc ggtgggcggc 7140

ggcggcagcg aattctccgg cggcagcacg aacctcagcg acatcatcga gaaggagaca 7200

ggcaagcagc tcgtgatcca ggagtctatc ctcatgctgc ctgaggaggt ggaggaggtc 7260

atcggcaaca agccggagtc cgatatcctc gtgcacaccg cctacgacga gtcgacagat 7320

gagaatgtca tgctcctgac ctccgacgca ccagagtaca agccatgggc gctcgtgatc 7380

caggattcca acggcgagaa taagatcaag atgctgtctg gcggctcccc gaagaagaag 7440

cgcaaggtct agactagtct gaaatcacca gtctctctct acaaatctat ctctctctat 7500

aataatgtgt gagtagttcc cagataaggg aattagggtt cttatagggt ttcgctcatg 7560

tgttgagcat ataagaaacc cttagtatgt atttgtattt gtaaaatact tctatcaata 7620

aaatttctaa ttcctaaaac caaaatccag tggggcgccc gacctgtact cgcgaaggtt 7680

aacttacaga gagtgtccgg gcgcgcctgg tggatcgtcc gcctaggctg cagtgcagcg 7740

tgacccggtc gtgcccctct ctagagataa tgagcattgc atgtctaagt tataaaaaat 7800

taccacatat tttttttgtc acacttgttt gaagtgcagt ttatctatct ttatacatat 7860

atttaaactt tactctacga ataatataat ctatagtact acaataatat cagtgtttta 7920

gagaatcata taaatgaaca gttagacatg gtctaaagga caattgagta ttttgacaac 7980

aggactctac agttttatct ttttagtgtg catgtgttct cctttttttt tgcaaatagc 8040

ttcacctata taatacttca tccattttat tagtacatcc atttagggtt tagggttaat 8100

ggtttttata gactaatttt tttagtacat ctattttatt ctattttagc ctctaaatta 8160

agaaaactaa aactctattt tagttttttt atttaataat ttagatataa aatagaataa 8220

aataaagtga ctaaaaatta aacaaatacc ctttaagaaa ttaaaaaaac taaggaaaca 8280

tttttcttgt ttcgagtaga taatgccagc ctgttaaacg ccgtcgacga gtctaacgga 8340

caccaaccag cgaaccagca gcgtcgcgtc gggccaagcg aagcagacgg cacggcatct 8400

ctgtcgctgc ctctggaccc ctctcgagag ttccgctcca ccgttggact tgctccgctg 8460

tcggcatcca gaaattgcgt ggcggagcgg cagacgtgag ccggcacggc aggcggcctc 8520

ctcctcctct cacggcaccg gcagctacgg gggattcctt tcccaccgct ccttcgcttt 8580

cccttcctcg cccgccgtaa taaatagaca ccccctccac accctctttc cccaacctcg 8640

tgttgttcgg agcgcacaca cacacaacca gatctccccc aaatccaccc gtcggcacct 8700

ccgcttcaag gtacgccgct cgtcctcccc ccccccccct ctctaccttc tctagatcgg 8760

cgttccggtc catggttagg gcccggtagt tctacttctg ttcatgtttg tgttagatcc 8820

gtgtttgtgt tagatccgtg ctgctagcgt tcgtacacgg atgcgacctg tacgtcagac 8880

acgttctgat tgctaacttg ccagtgtttc tctttgggga atcctgggat ggctctagcc 8940

gttccgcaga cgggatcgat ttcatgattt tttttgtttc gttgcatagg gtttggtttg 9000

cccttttcct ttatttcaat atatgccgtg cacttgtttg tcgggtcatc ttttcatgct 9060cccttttcct ttatttcaat atatgccgtg cacttgtttg tcgggtcatc ttttcatgct 9060

tttttttgtc ttggttgtga tgatgtggtc tggttgggcg gtcgttctag atcggagtag 9120

aattctgttt caaactacct ggtggattta ttaattttgg atctgtatgt gtgtgccata 9180

catattcata gttacgaatt gaagatgatg gatggaaata tcgatctagg ataggtatac 9240

atgttgatgc gggttttact gatgcatata cagagatgct ttttgttcgc ttggttgtga 9300

tgatgtggtg tggttgggcg gtcgttcatt cgttctagat cggagtagaa tactgtttca 9360

aactacctgg tgtatttatt aattttggaa ctgtatgtgt gtgtcataca tcttcatagt 9420

tacgagttta agatggatgg aaatatcgat ctaggatagg tatacatgtt gatgtgggtt 9480

ttactgatgc atatacatga tggcatatgc agcatctatt catatgctct aaccttgagt 9540

acctatctat tataataaac aagtatgttt tataattatt ttgatcttga tatacttgga 9600

tgatggcata tgcagcagct atatgtggat ttttttagcc ctgccttcat acgctattta 9660

tttgcttggt actgtttctt ttgtcgatgc tcaccctgtt gtttggtgtt acttctgcag 9720

gagctcatga aaaagcctga actcaccgcg acgtctgtcg agaagtttct gatcgaaaag 9780

ttcgacagcg tctccgacct gatgcagctc tcggagggcg aagaatctcg tgctttcagc 9840

ttcgatgtag gagggcgtgg atatgtcctg cgggtaaata gctgcgccga tggtttctac 9900

aaagatcgtt atgtttatcg gcactttgca tcggccgcgc tcccgattcc ggaagtgctt 9960

gacattgggg agtttagcga gagcctgacc tattgcatct cccgccgttc acagggtgtc 10020

acgttgcaag acctgcctga aaccgaactg cccgctgttc tacaaccggt cgcggaggct 10080

atggatgcga tcgctgcggc cgatcttagc cagacgagcg ggttcggccc attcggaccg 10140

caaggaatcg gtcaatacac tacatggcgt gatttcatat gcgcgattgc tgatccccat 10200

gtgtatcact ggcaaactgt gatggacgac accgtcagtg cgtccgtcgc gcaggctctc 10260

gatgagctga tgctttgggc cgaggactgc cccgaagtcc ggcacctcgt gcacgcggat 10320

ttcggctcca acaatgtcct gacggacaat ggccgcataa cagcggtcat tgactggagc 10380

gaggcgatgt tcggggattc ccaatacgag gtcgccaaca tcttcttctg gaggccgtgg 10440

ttggcttgta tggagcagca gacgcgctac ttcgagcgga ggcatccgga gcttgcagga 10500

tcgccacgac tccgggcgta tatgctccgc attggtcttg accaactcta tcagagcttg 10560

gttgacggca atttcgatga tgcagcttgg gcgcagggtc gatgcgacgc aatcgtccga 10620

tccggagccg ggactgtcgg gcgtacacaa atcgcccgca gaagcgcggc cgtctggacc 10680

gatggctgtg tagaagtact cgccgatagt ggaaaccgac gccccagcac tcgtccgagg 10740

gcaaagaaat agagtagatg ccgaccggga tctgtcgatc gacaagctcg agtttctcca 10800

taataatgtg tgagtagttc ccagataagg gaattagggt tcctataggg tttcgctcat 10860

gtgttgagca tataagaaac ccttagtatg tatttgtatt tgtaaaatac ttctatcaat 10920

aaaatttcta attcctaaaa ccaaaatcca gtactaaaat ccagatcccc cgaattaatt 10980

cggcgttaat tcagcctgca ggacgcgttt aattaagtgc acgcggccgc ctacttagtc 11040

aagagcctcg cacgcgactg tcacgcggcc aggatcgcct cgtgagcctc gcaatctgta 11100

cctagtgttt aaactatcag tgtttgacag gatatattgg cgggtaaacc taagagaaaa 11160

gagcgtttat tagaataacg gatatttaaa agggcgtgaa aaggtttatc cgttcgtcca 11220

tttgtatgtg catgccaacc acagggttcc cctcgggatc aaagtacttt gatccaaccc 11280

ctccgctgct atagtgcagt cggcttctga cgttcagtgc agccgtcttc tgaaaacgac 11340

atgtcgcaca agtcctaagt tacgcgacag gctgccgccc tgcccttttc ctggcgtttt 11400

cttgtcgcgt gttttagtcg cataaagtag aatacttgcg actagaaccg gagacattac 11460

gccatgaaca agagcgccgc cgctggcctg ctgggctatg cccgcgtcag caccgacgac 11520

caggacttga ccaaccaacg ggccgaactg cacgcggccg gctgcaccaa gctgttttcc 11580

gagaagatca ccggcaccag gcgcgaccgc ccggagctgg ccaggatgct tgaccaccta 11640

cgccctggcg acgttgtgac agtgaccagg ctagaccgcc tggcccgcag cacccgcgac 11700

ctactggaca ttgccgagcg catccaggag gccggcgcgg gcctgcgtag cctggcagag 11760

ccgtgggccg acaccaccac gccggccggc cgcatggtgt tgaccgtgtt cgccggcatt 11820

gccgagttcg agcgttccct aatcatcgac cgcacccgga gcgggcgcga ggccgccaag 11880

gcccgaggcg tgaagtttgg cccccgccct accctcaccc cggcacagat cgcgcacgcc 11940

cgcgagctga tcgaccagga aggccgcacc gtgaaagagg cggctgcact gcttggcgtg 12000

catcgctcga ccctgtaccg cgcacttgag cgcagcgagg aagtgacgcc caccgaggcc 12060

aggcggcgcg gtgccttccg tgaggacgca ttgaccgagg ccgacgccct ggcggccgcc 12120

gagaatgaac gccaagagga acaagcatga aaccgcacca ggacggccag gacgaaccgt 12180

ttttcattac cgaagagatc gaggcggaga tgatcgcggc cgggtacgtg ttcgagccgc 12240

ccgcgcacgt ctcaaccgtg cggctgcatg aaatcctggc cggtttgtct gatgccaagc 12300

tggcggcctg gccggccagc ttggccgctg aagaaaccga gcgccgccgt ctaaaaaggt 12360

gatgtgtatt tgagtaaaac agcttgcgtc atgcggtcgc tgcgtatatg atgcgatgag 12420

taaataaaca aatacgcaag gggaacgcat gaaggttatc gctgtactta accagaaagg 12480

cgggtcaggc aagacgacca tcgcaaccca tctagcccgc gccctgcaac tcgccggggc 12540

cgatgttctg ttagtcgatt ccgatcccca gggcagtgcc cgcgattggg cggccgtgcg 12600

ggaagatcaa ccgctaaccg ttgtcggcat cgaccgcccg acgattgacc gcgacgtgaa 12660

ggccatcggc cggcgcgact tcgtagtgat cgacggagcg ccccaggcgg cggacttggc 12720

tgtgtccgcg atcaaggcag ccgacttcgt gctgattccg gtgcagccaa gcccttacga 12780

catatgggcc accgccgacc tggtggagct ggttaagcag cgcattgagg tcacggatgg 12840

aaggctacaa gcggcctttg tcgtgtcgcg ggcgatcaaa ggcacgcgca tcggcggtga 12900

ggttgccgag gcgctggccg ggtacgagct gcccattctt gagtcccgta tcacgcagcg 12960

cgtgagctac ccaggcactg ccgccgccgg cacaaccgtt cttgaatcag aacccgaggg 13020

cgacgctgcc cgcgaggtcc aggcgctggc cgctgaaatt aaatcaaaac tcatttgagt 13080

taatgaggta aagagaaaat gagcaaaagc acaaacacgc taagtgccgg ccgtccgagc 13140

gcacgcagca gcaaggctgc aacgttggcc agcctggcag acacgccagc catgaagcgg 13200

gtcaactttc agttgccggc ggaggatcac accaagctga agatgtacgc ggtacgccaa 13260

ggcaagacca ttaccgagct gctatctgaa tacatcgcgc agctaccaga gtaaatgagc 13320

aaatgaataa atgagtagat gaattttagc ggctaaagga ggcggcatgg aaaatcaaga 13380

acaaccaggc accgacgccg tggaatgccc catgtgtgga ggaacgggcg gttggccagg 13440

cgtaagcggc tgggttgtct gccggccctg caatggcact ggaaccccca agcccgagga 13500

atcggcgtga cggtcgcaaa ccatccggcc cggtacaaat cggcgcggcg ctgggtgatg 13560

acctggtgga gaagttgaag gccgcgcagg ccgcccagcg gcaacgcatc gaggcagaag 13620

cacgccccgg tgaatcgtgg caagcggccg ctgatcgaat ccgcaaagaa tcccggcaac 13680

cgccggcagc cggtgcgccg tcgattagga agccgcccaa gggcgacgag caaccagatt 13740

ttttcgttcc gatgctctat gacgtgggca cccgcgatag tcgcagcatc atggacgtgg 13800

ccgttttccg tctgtcgaag cgtgaccgac gagctggcga ggtgatccgc tacgagcttc 13860

cagacgggca cgtagaggtt tccgcagggc cggccggcat ggccagtgtg tgggattacg 13920

acctggtact gatggcggtt tcccatctaa ccgaatccat gaaccgatac cgggaaggga 13980

agggagacaa gcccggccgc gtgttccgtc cacacgttgc ggacgtactc aagttctgcc 14040

ggcgagccga tggcggaaag cagaaagacg acctggtaga aacctgcatt cggttaaaca 14100

ccacgcacgt tgccatgcag cgtacgaaga aggccaagaa cggccgcctg gtgacggtat 14160

ccgagggtga agccttgatt agccgctaca agatcgtaaa gagcgaaacc gggcggccgg 14220

agtacatcga gatcgagcta gctgattgga tgtaccgcga gatcacagaa ggcaagaacc 14280

cggacgtgct gacggttcac cccgattact ttttgatcga tcccggcatc ggccgttttc 14340

tctaccgcct ggcacgccgc gccgcaggca aggcagaagc cagatggttg ttcaagacga 14400

tctacgaacg cagtggcagc gccggagagt tcaagaagtt ctgtttcacc gtgcgcaagc 14460

tgatcgggtc aaatgacctg ccggagtacg atttgaagga ggaggcgggg caggctggcc 14520

cgatcctagt catgcgctac cgcaacctga tcgagggcga agcatccgcc ggttcctaat 14580

gtacggagca gatgctaggg caaattgccc tagcagggga aaaaggtcga aaaggtctct 14640

ttcctgtgga tagcacgtac attgggaacc caaagccgta cattgggaac cggaacccgt 14700

acattgggaa cccaaagccg tacattggga accggtcaca catgtaagtg actgatataa 14760

aagagaaaaa aggcgatttt tccgcctaaa actctttaaa acttattaaa actcttaaaa 14820

cccgcctggc ctgtgcataa ctgtctggcc agcgcacagc cgaagagctg caaaaagcgc 14880

ctacccttcg gtcgctgcgc tccctacgcc ccgccgcttc gcgtcggcct atcgcggccg 14940

ctggccgctc aaaaatggct ggcctacggc caggcaatct accagggcgc ggacaagccg 15000

cgccgtcgcc actcgaccgc cggcgcccac atcaaggcac cctgcctcgc gcgtttcggt 15060

gatgacggtg aaaacctctg acacatgcag ctcccggaga cggtcacagc ttgtctgtaa 15120

gcggatgccg ggagcagaca agcccgtcag ggcgcgtcag cgggtgttgg cgggtgtcgg 15180

ggcgcagcca tgacccagtc acgtagcgat agcggagtgt atactggctt aactatgcgg 15240

catcagagca gattgtactg agagtgcacc atatgcggtg tgaaataccg cacagatgcg 15300

taaggagaaa ataccgcatc aggcgctctt ccgcttcctc gctcactgac tcgctgcgct 15360

cggtcgttcg gctgcggcga gcggtatcag ctcactcaaa ggcggtaata cggttatcca 15420

cagaatcagg ggataacgca ggaaagaaca tgtgagcaaa aggccagcaa aaggccagga 15480

accgtaaaaa ggccgcgttg ctggcgtttt tccataggct ccgcccccct gacgagcatc 15540

acaaaaatcg acgctcaagt cagaggtggc gaaacccgac aggactataa agataccagg 15600

cgtttccccc tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg cttaccggat 15660

acctgtccgc ctttctccct tcgggaagcg tggcgctttc tcatagctca cgctgtaggt 15720

atctcagttc ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa ccccccgttc 15780

agcccgaccg ctgcgcctta tccggtaact atcgtcttga gtccaacccg gtaagacacg 15840

acttatcgcc actggcagca gccactggta acaggattag cagagcgagg tatgtaggcg 15900

gtgctacaga gttcttgaag tggtggccta actacggcta cactagaagg acagtatttg 15960

gtatctgcgc tctgctgaag ccagttacct tcggaaaaag agttggtagc tcttgatccg 16020

gcaaacaaac caccgctggt agcggtggtt tttttgtttg caagcagcag attacgcgca 16080

gaaaaaaagg atctcaagaa gatcctttga tcttttctac ggggtctgac gctcagtgga 16140

acgaaaactc acgttaaggg attttggtca tgcattctag gtactaaaac aattcatcca 16200

gtaaaatata atattttatt ttctcccaat caggcttgat ccccagtaag tcaaaaaata 16260

gctcgacata ctgttcttcc ccgatatcct ccctgatcga ccggacgcag aaggcaatgt 16320

cataccactt gtccgccctg ccgcttctcc caagatcaat aaagccactt actttgccat 16380

ctttcacaaa gatgttgctg tctcccaggt cgccgtggga aaagacaagt tcctcttcgg 16440

gcttttccgt ctttaaaaaa tcatacagct cgcgcggatc tttaaatgga gtgtcttctt 16500

cccagttttc gcaatccaca tcggccagat cgttattcag taagtaatcc aattcggcta 16560

agcggctgtc taagctattc gtatagggac aatccgatat gtcgatggag tgaaagagcc 16620

tgatgcactc cgcatacagc tcgataatct tttcagggct ttgttcatct tcatactctt 16680

ccgagcaaag gacgccatcg gcctcactca tgagcagatt gctccagcca tcatgccgtt 16740

caaagtgcag gacctttgga acaggcagct ttccttccag ccatagcatc atgtcctttt 16800

cccgttccac atcataggtg gtccctttat accggctgtc cgtcattttt aaatataggt 16860

tttcattttc tcccaccagc ttatatacct tagcaggaga cattccttcc gtatctttta 16920

cgcagcggta tttttcgatc agttttttca attccggtga tattctcatt ttagccattt 16980

attatttcct tcctcttttc tacagtattt aaagataccc caagaagcta attataacaa 17040

gacgaactcc aattcactgt tccttgcatt ctaaaacctt aaataccaga aaacagcttt 17100

ttcaaagttg ttttcaaagt tggcgtataa catagtatcg acggagccga ttttgaaacc 17160

gcggtgatca caggcagcaa cgctctgtca tcgttacaat caacatgcta ccctccgcga 17220

gatcatccgt gtttcaaacc cggcagctta gttgccgttc ttccgaatag catcggtaac 17280

atgagcaaag tctgccgcct tacaacggct ctcccgctga cgccgtcccg gactgatggg 17340

ctgcctgtat cgagtggtga ttttgtgccg agctgccggt cggggagctg ttggctggct 17400

<210>2

<211>1071

<212>PRT

<213>Artificial Sequence

<400>2

Met Ala Pro Lys Lys Lys Arg Lys Val Gly Ile His Gly Val Pro Ala

1 5 10 15

Ala Lys Arg Asn Tyr Ile Leu Gly Leu Ala Ile Gly Ile Thr Ser Val

20 25 30

Gly Tyr Gly Ile Ile Asp Tyr Glu Thr Arg Asp Val Ile Asp Ala Gly

35 40 45

Val Arg Leu Phe Lys Glu Ala Asn Val Glu Asn Asn Glu Gly Arg Arg

50 55 60

Ser Lys Arg Gly Ala Arg Arg Leu Lys Arg Arg Arg Arg His Arg Ile

65 70 75 80

Gln Arg Val Lys Lys Leu Leu Phe Asp Tyr Asn Leu Leu Thr Asp His

85 90 95

Ser Glu Leu Ser Gly Ile Asn Pro Tyr Glu Ala Arg Val Lys Gly Leu

100 105 110

Ser Gln Lys Leu Ser Glu Glu Glu Phe Ser Ala Ala Leu Leu His Leu

115 120 125

Ala Lys Arg Arg Gly Val His Asn Val Asn Glu Val Glu Glu Asp Thr

130 135 140

Gly Asn Glu Leu Ser Thr Lys Glu Gln Ile Ser Arg Asn Ser Lys Ala

145 150 155 160

Leu Glu Glu Lys Tyr Val Ala Glu Leu Gln Leu Glu Arg Leu Lys Lys

165 170 175

Asp Gly Glu Val Arg Gly Ser Ile Asn Arg Phe Lys Thr Ser Asp Tyr

180 185 190

Val Lys Glu Ala Lys Gln Leu Leu Lys Val Gln Lys Ala Tyr His Gln

195 200 205

Leu Asp Gln Ser Phe Ile Asp Thr Tyr Ile Asp Leu Leu Glu Thr Arg

210 215 220

Arg Thr Tyr Tyr Glu Gly Pro Gly Glu Gly Ser Pro Phe Gly Trp Lys

225 230 235 240

Asp Ile Lys Glu Trp Tyr Glu Met Leu Met Gly His Cys Thr Tyr Phe

245 250 255

Pro Glu Glu Leu Arg Ser Val Lys Tyr Ala Tyr Asn Ala Asp Leu Tyr

260 265 270

Asn Ala Leu Asn Asp Leu Asn Asn Leu Val Ile Thr Arg Asp Glu Asn

275 280 285

Glu Lys Leu Glu Tyr Tyr Glu Lys Phe Gln Ile Ile Glu Asn Val Phe

290 295 300

Lys Gln Lys Lys Lys Pro Thr Leu Lys Gln Ile Ala Lys Glu Ile Leu

305 310 315 320

Val Asn Glu Glu Asp Ile Lys Gly Tyr Arg Val Thr Ser Thr Gly Lys

325 330 335

Pro Glu Phe Thr Asn Leu Lys Val Tyr His Asp Ile Lys Asp Ile Thr

340 345 350

Ala Arg Lys Glu Ile Ile Glu Asn Ala Glu Leu Leu Asp Gln Ile Ala

355 360 365

Lys Ile Leu Thr Ile Tyr Gln Ser Ser Glu Asp Ile Gln Glu Glu Leu

370 375 380

Thr Asn Leu Asn Ser Glu Leu Thr Gln Glu Glu Ile Glu Gln Ile Ser

385 390 395 400

Asn Leu Lys Gly Tyr Thr Gly Thr His Asn Leu Ser Leu Lys Ala Ile

405 410 415

Asn Leu Ile Leu Asp Glu Leu Trp His Thr Asn Asp Asn Gln Ile Ala

420 425 430

Ile Phe Asn Arg Leu Lys Leu Val Pro Lys Lys Val Asp Leu Ser Gln

435 440 445

Gln Lys Glu Ile Pro Thr Thr Leu Val Asp Asp Phe Ile Leu Ser Pro

450 455 460

Val Val Lys Arg Ser Phe Ile Gln Ser Ile Lys Val Ile Asn Ala Ile

465 470 475 480

Ile Lys Lys Tyr Gly Leu Pro Asn Asp Ile Ile Ile Glu Leu Ala Arg

485 490 495

Glu Lys Asn Ser Lys Asp Ala Gln Lys Met Ile Asn Glu Met Gln Lys

500 505 510

Arg Asn Arg Gln Thr Asn Glu Arg Ile Glu Glu Ile Ile Arg Thr Thr

515 520 525

Gly Lys Glu Asn Ala Lys Tyr Leu Ile Glu Lys Ile Lys Leu His Asp

530 535 540

Met Gln Glu Gly Lys Cys Leu Tyr Ser Leu Glu Ala Ile Pro Leu Glu

545 550 555 560

Asp Leu Leu Asn Asn Pro Phe Asn Tyr Glu Val Asp His Ile Ile Pro

565 570 575

Arg Ser Val Ser Phe Asp Asn Ser Phe Asn Asn Lys Val Leu Val Lys

580 585 590

Gln Glu Glu Asn Ser Lys Lys Gly Asn Arg Thr Pro Phe Gln Tyr Leu

595 600 605

Ser Ser Ser Asp Ser Lys Ile Ser Tyr Glu Thr Phe Lys Lys His Ile

610 615 620

Leu Asn Leu Ala Lys Gly Lys Gly Arg Ile Ser Lys Thr Lys Lys Glu

625 630 635 640

Tyr Leu Leu Glu Glu Arg Asp Ile Asn Arg Phe Ser Val Gln Lys Asp

645 650 655

Phe Ile Asn Arg Asn Leu Val Asp Thr Arg Tyr Ala Thr Arg Gly Leu

660 665 670

Met Asn Leu Leu Arg Ser Tyr Phe Arg Val Asn Asn Leu Asp Val Lys

675 680 685

Val Lys Ser Ile Asn Gly Gly Phe Thr Ser Phe Leu Arg Arg Lys Trp

690 695 700

Lys Phe Lys Lys Glu Arg Asn Lys Gly Tyr Lys His His Ala Glu Asp

705 710 715 720

Ala Leu Ile Ile Ala Asn Ala Asp Phe Ile Phe Lys Glu Trp Lys Lys

725 730 735

Leu Asp Lys Ala Lys Lys Val Met Glu Asn Gln Met Phe Glu Glu Lys

740 745 750

Gln Ala Glu Ser Met Pro Glu Ile Glu Thr Glu Gln Glu Tyr Lys Glu

755 760 765

Ile Phe Ile Thr Pro His Gln Ile Lys His Ile Lys Asp Phe Lys Asp

770 775 780

Tyr Lys Tyr Ser His Arg Val Asp Lys Lys Pro Asn Arg Lys Leu Ile

785 790 795 800

Asn Asp Thr Leu Tyr Ser Thr Arg Lys Asp Asp Lys Gly Asn Thr Leu

805 810 815

Ile Val Asn Asn Leu Asn Gly Leu Tyr Asp Lys Asp Asn Asp Lys Leu

820 825 830

Lys Lys Leu Ile Asn Lys Ser Pro Glu Lys Leu Leu Met Tyr His His

835 840 845

Asp Pro Gln Thr Tyr Gln Lys Leu Lys Leu Ile Met Glu Gln Tyr Gly

850 855 860

Asp Glu Lys Asn Pro Leu Tyr Lys Tyr Tyr Glu Glu Thr Gly Asn Tyr

865 870 875 880

Leu Thr Lys Tyr Ser Lys Lys Asp Asn Gly Pro Val Ile Lys Lys Ile

885 890 895

Lys Tyr Tyr Gly Asn Lys Leu Asn Ala His Leu Asp Ile Thr Asp Asp

900 905 910

Tyr Pro Asn Ser Arg Asn Lys Val Val Lys Leu Ser Leu Lys Pro Tyr

915 920 925

Arg Phe Asp Val Tyr Leu Asp Asn Gly Val Tyr Lys Phe Val Thr Val

930 935 940

Lys Asn Leu Asp Val Ile Lys Lys Glu Asn Tyr Tyr Glu Val Asn Ser

945 950 955 960

Lys Cys Tyr Glu Glu Ala Lys Lys Leu Lys Lys Ile Ser Asn Gln Ala

965 970 975

Glu Phe Ile Ala Ser Phe Tyr Lys Asn Asp Leu Ile Lys Ile Asn Gly

980 985 990

Glu Leu Tyr Arg Val Ile Gly Val Asn Asn Asp Leu Leu Asn Arg Ile

995 1000 1005

Glu Val Asn Met Ile Asp Ile Thr Tyr Arg Glu Tyr Leu Glu Asn

1010 1015 1020

Met Asn Asp Lys Arg Pro Pro His Ile Ile Lys Thr Ile Ala Ser

1025 1030 1035

Lys Thr Gln Ser Ile Lys Lys Tyr Ser Thr Asp Ile Leu Gly Asn

1040 1045 1050

Leu Tyr Glu Val Lys Ser Lys Lys His Pro Gln Ile Ile Lys Lys

1055 1060 1065

Gly Ser Ala

1070

<210>3

<211>208

<212>PRT

<213>Artificial Sequence

<400>3

Met Thr Asp Ala Glu Tyr Val Arg Ile His Glu Lys Leu Asp Ile Tyr

1 5 10 15

Thr Phe Lys Lys Gln Phe Phe Asn Asn Lys Lys Ser Val Ser His Arg

20 25 30

Cys Tyr Val Leu Phe Glu Leu Lys Arg Arg Gly Glu Arg Arg Ala Cys

35 40 45

Phe Trp Gly Tyr Ala Val Asn Lys Pro Gln Ser Gly Thr Glu Arg Gly

50 55 60

Ile His Ala Glu Ile Phe Ser Ile Arg Lys Val Glu Glu Tyr Leu Arg

65 70 75 80

Asp Asn Pro Gly Gln Phe Thr Ile Asn Trp Tyr Ser Ser Trp Ser Pro

85 90 95

Cys Ala Asp Cys Ala Glu Lys Ile Leu Glu Trp Tyr Asn Gln Glu Leu

100 105 110

Arg Gly Asn Gly His Thr Leu Lys Ile Trp Ala Cys Lys Leu Tyr Tyr

115 120 125

Glu Lys Asn Ala Arg Asn Gln Ile Gly Leu Trp Asn Leu Arg Asp Asn

130 135 140

Gly Val Gly Leu Asn Val Met Val Ser Glu His Tyr Gln Cys Cys Arg

145 150 155 160

Lys Ile Phe Ile Gln Ser Ser His Asn Gln Leu Asn Glu Asn Arg Trp

165 170 175

Leu Glu Lys Thr Leu Lys Arg Ala Glu Lys Trp Arg Ser Glu Leu Ser

180 185 190

Ile Met Ile Gln Val Lys Ile Leu His Thr Thr Lys Ser Pro Ala Val

195 200 205

<210>4

<211>98

<212>PRT

<213>Artificial Sequence

<400>4

Ser Gly Gly Ser Thr Asn Leu Ser Asp Ile Ile Glu Lys Glu Thr Gly

1 5 10 15

Lys Gln Leu Val Ile Gln Glu Ser Ile Leu Met Leu Pro Glu Glu Val

20 25 30

Glu Glu Val Ile Gly Asn Lys Pro Glu Ser Asp Ile Leu Val His Thr

35 40 45

Ala Tyr Asp Glu Ser Thr Asp Glu Asn Val Met Leu Leu Thr Ser Asp

50 55 60

Ala Pro Glu Tyr Lys Pro Trp Ala Leu Val Ile Gln Asp Ser Asn Gly

65 70 75 80

Glu Asn Lys Ile Lys Met Leu Ser Gly Gly Ser Pro Lys Lys Lys Arg

85 90 95

Lys Val

<210>5

<211>522

<212>DNA

<213>Artificial Sequence

<400>5

aacaaagcac cagtggtcta gtggtagaat agtaccctgc cacggtacag acccgggttc 60

gattcccggc tggtgcagat gccacacagc aaggagtgtt ttagtactct ggaaacagaa 120

tctactaaaa caaggcaaaa tgccgtgttt atctcgtcaa cttgttggcg agataacaaa 180

gcaccagtgg tctagtggta gaatagtacc ctgccacggt acagacccgg gttcgattcc 240

cggctggtgc acagaaccga caacagatga ggttttagta ctctggaaac agaatctact 300

aaaacaaggc aaaatgccgt gtttatctcg tcaacttgtt ggcgagataa caaagcacca 360

gtggtctagt ggtagaatag taccctgcca cggtacagac ccgggttcga ttcccggctg 420

gtgcaccagc tcatttggct cggcggtttt agtactctgg aaacagaatc tactaaaaca 480

aggcaaaatg ccgtgtttat ctcgtcaact tgttggcgag at 522

<210> 6

<211> 83

<212> DNA

<213> Artificial Sequence

<400> 6

gttttagtac tctgctggaa acagcagaat ctactaaaac aaggcaaaat gccgtgttta 60

tctcgtcaac ttgttggcga gat 83

<210> 7

<211> 93

<212> DNA

<213> Artificial Sequence

<400> 7

gttttagtac tctgtaattt tagaaataaa attacagaat ctactaaaac aaggcaaaat 60

gccgtgttta tctcgtcaac ttgttggcga gat 93

<210> 8

<211> 174

<212> DNA

<213> Artificial Sequence

<400> 8

aacaaagcac cagtggtcta gtggtagaat agtaccctgc cacggtacag acccgggttc 60

gattcccggc tggtgcatcc tcggcgtagt acgggctgtt ttagtactct ggaaacagaa 120

tctactaaaa caaggcaaaa tgccgtgttt atctcgtcaa cttgttggcg agat 174

<210> 9

<211> 77

<212> RNA

<213> Artificial Sequence

<400> 9

guuuuaguac ucuggaaaca gaaucuacua aaacaaggca aaaugccgug uuuaucucgu 60

caacuuguug gcgagau 77

<210> 10

<211> 83

<212> RNA

<213> Artificial Sequence

<400> 10

guuuuaguac ucugcuggaa acagcagaau cuacuaaaac aaggcaaaau gccguguuua 60

ucucgucaac uuguuggcga gau 83
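
The lengths in the listing suggest how sequence 8 (174 nt) is assembled. The following is a minimal sketch, assuming the layout is a 77-nt tRNA, a 20-nt target sequence, and the 77-nt sgRNA backbone, in that order. The split points 77 and 97 are inferred from the sequence lengths and are not stated in the listing; the helper names `clean` and `transcribe` are illustrative only.

```python
def clean(seq: str) -> str:
    """Remove the spaces that separate the 10-nt groups in the listing."""
    return seq.replace(" ", "")


def transcribe(dna: str) -> str:
    """Write a DNA sequence as RNA by replacing t with u."""
    return dna.replace("t", "u")


# Sequence 8 (DNA, 174 nt), copied from the listing.
SEQ8 = clean(
    "aacaaagcac cagtggtcta gtggtagaat agtaccctgc cacggtacag acccgggttc "
    "gattcccggc tggtgcatcc tcggcgtagt acgggctgtt ttagtactct ggaaacagaa "
    "tctactaaaa caaggcaaaa tgccgtgttt atctcgtcaa cttgttggcg agat"
)

# Sequence 9 (RNA, 77 nt), copied from the listing.
SEQ9 = clean(
    "guuuuaguac ucuggaaaca gaaucuacua aaacaaggca aaaugccgug uuuaucucgu "
    "caacuuguug gcgagau"
)

# Assumed layout: tRNA (77 nt) + target (20 nt) + backbone (77 nt).
trna_dna = SEQ8[:77]
target_dna = SEQ8[77:97]
backbone_dna = SEQ8[97:]

print(len(SEQ8), len(trna_dna), len(target_dna), len(backbone_dna))
print(transcribe(backbone_dna) == SEQ9)
```

Under this assumed layout, the last 77 nt of sequence 8, written as RNA, can be compared directly with the sgRNA backbone of sequence 9.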

Claims (10)

1. A kit comprising an sgRNA or a biological material associated with the sgRNA, a Cas9 nuclease or a biological material associated with the Cas9 nuclease, and a cytosine deaminase or a biological material associated with the cytosine deaminase;
the sgRNA targets a target sequence;
the sgRNA is as shown in formula I: RNA transcribed from the target sequence-modified sgRNA backbone (formula I);
the modified sgRNA backbone is an RNA molecule obtained by inserting an RNA fragment A between positions 14 and 15 of the sgRNA backbone and inserting an RNA fragment B between positions 18 and 19 of the sgRNA backbone;
RNA fragment A and RNA fragment B are reverse complements of each other;
RNA fragment A and RNA fragment B are each 3 nt long;
the sgRNA backbone is m1) or m2) or m3):
m1) the RNA molecule shown in sequence 9;
m2) an RNA molecule obtained by substituting and/or deleting and/or adding one or more nucleotides in the RNA molecule shown in m1) and having the same function;
m3) an RNA molecule having identity with m1) or m2) and having the same function.
2. The kit of claim 1, wherein: the modified sgRNA backbone is n1) or n2) or n3):
n1) the RNA molecule shown in sequence 10;
n2) an RNA molecule obtained by substituting and/or deleting and/or adding one or more nucleotides in the RNA molecule shown in n1) and having the same function;
n3) an RNA molecule having identity with n1) or n2) and having the same function.
3. The kit of claim 1 or 2, wherein: the Cas9 nuclease is a SaKKHn protein;
the SaKKHn protein is E1) or E2) or E3):
E1) a protein whose amino acid sequence is shown in sequence 2;
E2) a protein obtained by substituting and/or deleting and/or adding one or more amino acid residues in the amino acid sequence shown in sequence 2 of the sequence listing and having the same function;
E3) a fusion protein obtained by attaching a tag to the N-terminus and/or the C-terminus of E1) or E2);
the biological material associated with the SaKKHn protein is any one of F1) to F5):
F1) a nucleic acid molecule encoding the SaKKHn protein;
F2) an expression cassette comprising the nucleic acid molecule of F1);
F3) a recombinant vector comprising the nucleic acid molecule of F1), or a recombinant vector comprising the expression cassette of F2);
F4) a recombinant microorganism comprising the nucleic acid molecule of F1), or a recombinant microorganism comprising the expression cassette of F2), or a recombinant microorganism comprising the recombinant vector of F3);
F5) a transgenic cell line comprising the nucleic acid molecule of F1), or a transgenic cell line comprising the expression cassette of F2).
4. The kit of claim 1 or 2, wherein: the cytosine deaminase is a PmCDA1 protein;
the PmCDA1 protein is G1) or G2) or G3):
G1) a protein whose amino acid sequence is shown in sequence 3;
G2) a protein obtained by substituting and/or deleting and/or adding one or more amino acid residues in the amino acid sequence shown in sequence 3 of the sequence listing and having the same function;
G3) a fusion protein obtained by attaching a tag to the N-terminus and/or the C-terminus of G1) or G2);
the biological material associated with the PmCDA1 protein is any one of H1) to H5):
H1) a nucleic acid molecule encoding the PmCDA1 protein;
H2) an expression cassette comprising the nucleic acid molecule of H1);
H3) a recombinant vector comprising the nucleic acid molecule of H1), or a recombinant vector comprising the expression cassette of H2);
H4) a recombinant microorganism comprising the nucleic acid molecule of H1), or a recombinant microorganism comprising the expression cassette of H2), or a recombinant microorganism comprising the recombinant vector of H3);
H5) a transgenic cell line comprising the nucleic acid molecule of H1), or a transgenic cell line comprising the expression cassette of H2).
5. The kit of any one of claims 1 to 4, wherein: the sgRNA is a tRNA-sgRNA;
the tRNA-sgRNA is as shown in formula I: tRNA-RNA transcribed from the target sequence-modified sgRNA backbone (formula I);
the tRNA is 1) or 2) or 3):
1) an RNA molecule obtained by replacing T with U in positions 474 to 550 of sequence 1;
2) an RNA molecule obtained by substituting and/or deleting and/or adding one or more nucleotides in the RNA molecule shown in 1) and having the same function;
3) an RNA molecule having 75% or more identity with the nucleotide sequence defined in 1) or 2) and having the same function.
6. The kit of any one of claims 1 to 5, wherein: the kit further comprises a UGI protein or a biological material associated with the UGI protein;
the UGI protein is I1) or I2) or I3):
I1) a protein whose amino acid sequence is shown in sequence 4;
I2) a protein obtained by substituting and/or deleting and/or adding one or more amino acid residues in the amino acid sequence shown in sequence 4 of the sequence listing and having the same function;
I3) a fusion protein obtained by attaching a tag to the N-terminus and/or the C-terminus of I1) or I2);
the biological material associated with the UGI protein is any one of J1) to J5):
J1) a nucleic acid molecule encoding the UGI protein;
J2) an expression cassette comprising the nucleic acid molecule of J1);
J3) a recombinant vector comprising the nucleic acid molecule of J1), or a recombinant vector comprising the expression cassette of J2);
J4) a recombinant microorganism comprising the nucleic acid molecule of J1), or a recombinant microorganism comprising the expression cassette of J2), or a recombinant microorganism comprising the recombinant vector of J3);
J5) a transgenic cell line comprising the nucleic acid molecule of J1), or a transgenic cell line comprising the expression cassette of J2).
7. The sgRNA of any one of claims 1-6 or the modified sgRNA backbone of any one of claims 1-6.
8. Use of the kit of any one of claims 1-6, or the sgRNA of claim 7, or the modified sgRNA backbone of claim 7, in any one of X1) to X4):
X1) editing a target sequence in the genome of an organism or a cell of an organism;
X2) preparing a product for editing a target sequence in the genome of an organism or a cell of an organism;
X3) increasing the efficiency of editing a target sequence in the genome of an organism or a cell of an organism;
X4) preparing a product for increasing the efficiency of editing a target sequence in the genome of an organism or a cell of an organism.
9. Y1) or Y2):
Y1) a method of editing, or of increasing the efficiency of editing, a genomic target sequence of an organism or a cell of an organism, comprising expressing the sgRNA of any one of claims 1 to 6, the Cas9 nuclease of any one of claims 1 to 6, and the cytosine deaminase of any one of claims 1 to 6 in the organism or the cell of the organism to effect editing of the genomic target sequence; the sgRNA targets the target sequence;
Y2) a method of preparing a biological mutant, comprising the following step: editing the genome of the organism according to the method of Y1) to obtain the biological mutant.
10. The kit of any one of claims 1 to 6 or the use of claim 8 or the method of claim 9, wherein:
the editing of the genomic target sequence mutates C in the target sequence to T;
and/or, the organism is S1) or S2) or S3) or S4):
S1) a plant or an animal;
S2) a monocotyledonous or dicotyledonous plant;
S3) a gramineous plant;
S4) rice;
and/or, the cell of the organism is T1) or T2) or T3) or T4):
T1) a plant cell or an animal cell;
T2) a monocotyledonous or dicotyledonous plant cell;
T3) a gramineous plant cell;
T4) a rice cell.
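
The backbone modification recited in claims 1 and 2 can be illustrated with a short sketch: fragment A is inserted between positions 14 and 15 of sequence 9, and its reverse complement (fragment B) between positions 18 and 19. The value `cug` for fragment A is inferred here by comparing sequences 9 and 10 in the listing and is an assumption, not a statement in the claims; the function names are illustrative only.

```python
def clean(seq: str) -> str:
    """Remove the spaces that separate the 10-nt groups in the listing."""
    return seq.replace(" ", "")


def reverse_complement_rna(rna: str) -> str:
    """Return the reverse complement of a lowercase RNA sequence."""
    pairs = {"a": "u", "u": "a", "g": "c", "c": "g"}
    return "".join(pairs[base] for base in reversed(rna))


def extend_backbone(backbone: str, fragment_a: str) -> str:
    """Insert fragment A between positions 14 and 15 and its reverse
    complement (fragment B) between positions 18 and 19 (1-based)."""
    fragment_b = reverse_complement_rna(fragment_a)
    return (
        backbone[:14] + fragment_a + backbone[14:18] + fragment_b + backbone[18:]
    )


# Sequence 9 (sgRNA backbone, 77 nt) and sequence 10 (83 nt) from the listing.
SEQ9 = clean(
    "guuuuaguac ucuggaaaca gaaucuacua aaacaaggca aaaugccgug uuuaucucgu "
    "caacuuguug gcgagau"
)
SEQ10 = clean(
    "guuuuaguac ucugcuggaa acagcagaau cuacuaaaac aaggcaaaau gccguguuua "
    "ucucgucaac uuguuggcga gau"
)

# Assumed fragment A (3 nt); fragment B is derived from it.
modified = extend_backbone(SEQ9, "cug")
print(len(modified), modified == SEQ10)
```

With this assumed fragment, the two 3-nt insertions flank the four bases at positions 15 to 18 of sequence 9, and the result can be compared with sequence 10.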
CN201911200779.0A 2019-11-29 2019-11-29 Efficient sgRNA and application thereof in gene editing Active CN110835630B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911200779.0A CN110835630B (en) 2019-11-29 2019-11-29 Efficient sgRNA and application thereof in gene editing


Publications (2)

Publication Number Publication Date
CN110835630A true CN110835630A (en) 2020-02-25
CN110835630B CN110835630B (en) 2023-01-03

Family

ID=69577858

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911200779.0A Active CN110835630B (en) 2019-11-29 2019-11-29 Efficient sgRNA and application thereof in gene editing

Country Status (1)

Country Link
CN (1) CN110835630B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023005935A1 (en) * 2021-07-30 2023-02-02 中国科学院天津工业生物技术研究所 Method for reducing editing window of base editor, base editor and use

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109652440A (en) * 2018-12-28 2019-04-19 北京市农林科学院 Application of the VQRn-Cas9&PmCDA1&UGI base editing system in plant gene editing

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
BENJAMIN P KLEINSTIVER et al.: "Broadening the targeting range of Staphylococcus aureus CRISPR-Cas9 by modifying PAM recognition", 《NATURE BIOTECHNOLOGY》 *
F. ANN RAN et al.: "In vivo genome editing using Staphylococcus aureus Cas9", 《NATURE》 *
YING WU et al.: "Increasing Cytosine Base Editing Scope and Efficiency With Engineered Cas9-PmCDA1 Fusions and the modified sgRNA in Rice", 《FRONTIERS IN GENETICS》 *

Also Published As

Publication number Publication date
CN110835630B (en) 2023-01-03

Similar Documents

Publication Publication Date Title
CN110564739B (en) Poplar PtMYB158 gene and application thereof in creating new poplar seed material
CN108085287B (en) A kind of recombinant Corynebacterium glutamicum, its preparation method and its application
CN112778405A (en) Protein related to plant flowering phase and coding gene and application thereof
CN109206496B (en) Application of protein GhFLS1 in regulation and control of plant heat resistance
CN103205458B (en) Intermediate expression carrier applicable to monocotyledon transformation and construction method thereof
CN110835630B (en) Efficient sgRNA and application thereof in gene editing
CN110835631B (en) Modified sgRNA and application thereof in improving base editing efficiency
CN108342409B (en) Plant RNAi expression vector and construction method and application thereof
CN110408646B (en) Plant genetic transformation screening vector and application thereof
CN113121662B (en) Application of cotton GhBZR3 protein and coding gene thereof in regulating plant growth and development
CN110923263B (en) Rice beta-amylase BA1 and coding gene and application thereof
CN111187787A (en) Multifunctional plant expression vector and construction method and application thereof
CN101985631B (en) Corynebacterium promoter detection vector and construction method and application thereof
CN111304242A (en) A method for preparing single mutants based on SaKKHn-pBE system
CN110878321B (en) An expression vector for gene editing of Klebsiella pneumoniae
CN112592930B (en) A method and bacterial strain for improving hyaluronic acid production
CN111154797B (en) A gene gun-mediated genetic transformation method of maize backbone inbred lines
CN109321594B (en) Method for improving artemisinin content in artemisia annua by taking artemisia annua suspension cell line as receptor through iaaM gene transfer
CN111269298B (en) Application of protein GhCCoAOMT7 in regulating plant heat tolerance
CN111154796B (en) An Agrobacterium-mediated genetic transformation method of maize backbone inbred lines
CN112575028A (en) RNAi plant expression vector for inhibiting expression of HIS1 gene and application thereof
CN115404193B (en) Recombinant microorganism and method for producing 1,5-pentanediamine
CN109694877B (en) Method for growing transgenic plants with different lignin content
CN114621977A (en) Plant expression vector and application thereof in preparation of herbicide sensitive plants
CN112458113A (en) Plant transgenic dominant suppression vector and application thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant