CN113025597B - Improved genome editing system - Google Patents

Improved genome editing system

Info

Publication number
CN113025597B
Authority
CN
China
Prior art keywords
lys
leu
glu
asp
ile
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911351725.4A
Other languages
Chinese (zh)
Other versions
CN113025597A (en
Inventor
邱金龙
张倩伟
尹康权
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Microbiology of CAS
Original Assignee
Institute of Microbiology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Microbiology of CAS filed Critical Institute of Microbiology of CAS
Priority to CN201911351725.4A priority Critical patent/CN113025597B/en
Publication of CN113025597A publication Critical patent/CN113025597A/en
Application granted granted Critical
Publication of CN113025597B publication Critical patent/CN113025597B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22 Ribonucleases RNAses, DNAses
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82 Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/8216 Methods for controlling, regulating or enhancing expression of transgenes in plant cells
    • C12N15/8218 Antisense, co-suppression, viral induced gene silencing [VIGS], post-transcriptional induced gene silencing [PTGS]
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Engineering & Computer Science (AREA)
  • Chemical & Material Sciences (AREA)
  • Zoology (AREA)
  • Organic Chemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Wood Science & Technology (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Biotechnology (AREA)
  • Microbiology (AREA)
  • Biochemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Medicinal Chemistry (AREA)
  • Virology (AREA)
  • Cell Biology (AREA)
  • Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Plant Pathology (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)
  • Breeding Of Plants And Reproduction By Means Of Culturing (AREA)

Abstract

The present invention relates to the field of genome editing. In particular, the present invention relates to an improved genome editing system and applications thereof. More specifically, the present invention provides a genome editing fusion polypeptide comprising a CRISPR nuclease and a 5'→3' exonuclease. The invention also provides polynucleotides or expression constructs encoding the polypeptides, and genome editing systems comprising the polypeptides, polynucleotides and/or constructs. The invention also provides a method for editing the genome of a cell by using the genome editing system.

Description

Improved genome editing system
Technical Field
The present invention relates to the field of genome editing. In particular, the present invention relates to an improved genome editing system and applications thereof. More specifically, the present invention provides a genome editing fusion polypeptide comprising a CRISPR nuclease and a 5'→3' exonuclease. The invention also provides polynucleotides or expression constructs encoding the polypeptides, and genome editing systems comprising the polypeptides, polynucleotides and/or constructs. The invention also provides a method for editing the genome of a cell by using the genome editing system.
Description of the background
CRISPR/Cas systems, which provide bacteria and archaea with immunity against viral infection (Wiedenheft et al., 2012), have technically simplified genome editing and are revolutionizing biology and genetic engineering. The CRISPR/Cas9 system is the most widely used for genome editing in a variety of organisms, including plants (Hsu et al., 2014; Yin et al., 2017). It consists of two parts, a Cas9 nuclease and a single guide RNA (sgRNA). Cas9 binds to the scaffold of the sgRNA, and target specificity is determined by a spacer sequence of about 20 nucleotides (nt) at the 5' end of the sgRNA (Jinek et al., 2012). Cas9 typically cleaves the target DNA about 3 bp upstream of the protospacer adjacent motif (PAM). The mutations induced by CRISPR/Cas9 in plants are mainly deletions of less than 10 bp (typically 1-3 bp) and insertions of a single base pair (bp), in particular A/T (Paul et al., 2016; Bortesi et al., 2016). Likewise, most mutations induced by Cas9 in mammalian cells are small insertions/deletions (indels) (Kim et al., 2015; Kosicki et al., 2018). Although large genomic deletions of up to 250 bp have been detected after Cas9 editing (Heckl et al., 2014; Liang et al., 2015), their frequency was very low. Thus, CRISPR/Cas9 has been widely used to edit genomic coding regions, because small indels in coding genes often cause frameshift mutations that result in loss of function. However, editing regulatory and non-coding genomic sequences with Cas9 remains a challenge, since the small indels induced by a single sgRNA are unlikely to produce loss-of-function mutations in such sequences.
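The target-site geometry described above (a spacer of about 20 nt immediately 5' of the PAM, with cleavage about 3 bp upstream of the PAM) can be illustrated with a short Python sketch. It is an illustration only and not part of the disclosed method: the function name find_cas9_sites is hypothetical, the NGG PAM is an assumption corresponding to SpCas9, and only the strand given as input is scanned.

```python
import re


def find_cas9_sites(seq, spacer_len=20):
    """List candidate Cas9 sites on the given strand.

    Assumes an NGG PAM (SpCas9). Returns tuples of
    (spacer_start, spacer, pam, cut_index), where the blunt cut falls
    between positions cut_index - 1 and cut_index (0-based).
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % spacer_len)
    sites = []
    for m in pattern.finditer(seq):
        start = m.start()
        pam_start = start + spacer_len
        cut_index = pam_start - 3  # about 3 bp upstream of the PAM
        sites.append((start, m.group(1), m.group(2), cut_index))
    return sites


if __name__ == "__main__":
    demo = "TT" + "ACGT" * 5 + "TGG" + "AA"
    for site in find_cas9_sites(demo):
        print(site)
```

A real design tool would also scan the reverse complement and score off-target risk; both are omitted here to keep the sketch short.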
Two guide RNAs flanking the fragment to be deleted, known as paired guide RNAs (pgRNAs), have been used to generate larger deletions of non-coding DNA (Han et al., 2014; Yin et al., 2015; Zhu et al., 2016) and of regulatory elements (Diao et al., 2017). However, the need for two sgRNAs limits this approach. First, the PAM sequence requirement restricts the broad application of this strategy. Second, pgRNAs still tend to produce a single editing event, especially when the two sgRNA target sites are distant from each other (Zhu et al., 2016). In addition, introducing two sgRNAs is more laborious and may increase the frequency of off-target effects.
Cas12a (formerly Cpf1) has also been used as a genome editing tool (Zetsche et al., 2015; Koonin et al., 2017). Like Cas9, Cas12a is an RNA-guided class 2 Cas enzyme. However, the guide RNA used by Cas12a is shorter than the sgRNA of Cas9 (Li et al., 2017; Dang et al., 2015), and Cas12a recognizes a T-rich PAM, in contrast to the G-rich PAM of Cas9 (Zetsche et al., 2015; Jinek et al., 2012). Furthermore, unlike Cas9, Cas12a produces a double-strand break (DSB) with staggered ends carrying 4-5 nt overhangs at a PAM-distal position (Zetsche et al., 2015). Accordingly, the mutations induced by Cas12a in plants are mainly deletions of up to 44 bp (typically 6-13 bp), with insertions being rare (Tang et al., 2017).
There is a need in the art to provide further methods that can produce larger genomic deletions at a particular target site with only one guide RNA, without the need for paired guide RNAs.
Brief description of the invention
In one aspect, the invention provides an isolated fusion polypeptide comprising a CRISPR nuclease and a 5'→3' exonuclease.
In another aspect, the present invention also provides a genome editing system comprising at least one of the following i) to v):
i) a fusion polypeptide of the invention, and a guide RNA;
ii) an expression construct comprising a nucleotide sequence encoding a fusion polypeptide of the invention, and a guide RNA;
iii) a fusion polypeptide of the invention, and an expression construct comprising a nucleotide sequence encoding a guide RNA;
iv) an expression construct comprising a nucleotide sequence encoding a fusion polypeptide of the invention, and an expression construct comprising a nucleotide sequence encoding a guide RNA;
v) an expression construct comprising a nucleotide sequence encoding a fusion polypeptide of the invention and a nucleotide sequence encoding a guide RNA.
In another aspect, the invention also provides a method of genetically modifying a cell, comprising introducing into the cell, preferably a plant cell, a genome editing system of the invention.
Brief Description of Drawings
Fig. 1: Fusion of T5 exonuclease to Cas9 alters the indel characteristics of genome editing. a) Schematic representation of the Cas9 and T5exo-Cas9 constructs. b) Ratio of deletions and insertions induced by Cas9 or T5exo-Cas9 at the OsMKK5 target site in rice protoplasts. Mutations induced by Cas9 and T5exo-Cas9 were first enriched by PCR amplification of protoplast genomic DNA pre-digested with HindIII, and the PCR products were cloned for Sanger sequencing. c) Size distribution of deletions produced by Cas9 and T5exo-Cas9. d) Representative deletions induced by T5exo-Cas9 at the OsMKK5 locus. Black line, deleted genomic region; rectangle, protospacer sequence; triangle, Cas9 cleavage site; arrows on either side, forward and reverse primers for PCR amplification.
Fig. 2: Fusing T5 exonuclease to Cas9 increases the frequency and size of genomic deletions at the guide RNA target locus. a) Indel patterns induced by Cas9 and T5exo-Cas9 at the OsMPK16, oscc 48, OsALS and OsXa target sites in rice protoplasts. All experiments were repeated three times with similar results. b) Size distribution of deletions created by Cas9 and T5exo-Cas9 at four target sites in rice protoplasts. "D" indicates the deletion length. All experiments were repeated three times with similar results. c) Genome editing efficiency of Cas9 and T5exo-Cas9 at four target sites in rice protoplasts. Untreated protoplast samples were used as controls. Data are mean ± s.e.m. (n = 3). P-values were calculated by two-way ANOVA. * P < 0.01, P < 0.001.
Fig. 3: The T5exo-Cas9 fusion facilitates genomic deletion in transgenic rice plants. a) Summary of genotyping results for T0 transgenic rice lines obtained by transformation with sgRNA OsXa-T1 and Cas9 or T5exo-Cas9, respectively. b) Indel patterns generated by Cas9 and T5exo-Cas9 at the OsXa promoter in transgenic rice lines. "D" indicates the deletion length. c) Resistance of the indicated rice mutants to Xanthomonas oryzae pv. oryzae (Xoo) strain PXO99. Leaves were inoculated and lesion length was measured 12 days after inoculation. Data (mean ± s.d.) were analyzed by one-way ANOVA. Significant differences between means were determined by Fisher's protected LSD test (P ≤ 0.05), and significantly different groups are indicated by different lowercase letters. d) Sequences of the insertion or deletion mutations shown in panel c. The upper panel shows the structure of the OsXa gene. The lower panel shows the sequence of the OsXa target site. The UPT PthXo1 sequence is shown in grey, the sgRNA target sequence is underlined, the PAM is framed by a rectangle, dashed lines indicate deleted nucleotides, and triangles indicate inserted nucleotides. WT, wild type.
Fig. 4: Fusing T5 exonuclease to Cas12a increases the frequency and size of genomic deletions at the guide RNA target locus. a) Schematic representation of the Cas12a and T5exo-Cas12a constructs. b) Indel patterns induced by Cas12a and T5exo-Cas12a at the OsBADH, OsEPSPs and OsPDS target sites in rice protoplasts. All experiments were repeated three times with similar results. c) Size distribution of deletions generated by Cas12a and T5exo-Cas12a at three target sites in rice protoplasts. "D" indicates the deletion length. All experiments were repeated three times with similar results. d) Genome editing efficiency of Cas12a and T5exo-Cas12a at three target sites in rice protoplasts. Untreated protoplast samples were used as controls. Data are mean ± s.e.m. (n = 3). P-values were calculated by two-way ANOVA. * P < 0.01.
Fig. 5: The T5exo-Cas12a fusion produces larger genomic deletions in transgenic rice plants. a) Summary of genotyping results for T0 transgenic rice lines obtained by transformation with guide RNA OsPDS-T1 and Cas12a or T5exo-Cas12a, respectively. b) Indel patterns generated by Cas12a and T5exo-Cas12a at the OsPDS gene in transgenic rice lines. "D" indicates the deletion length.
Detailed Description
1. Definitions
In the present invention, unless otherwise indicated, scientific and technical terms used herein have the meanings commonly understood by one of ordinary skill in the art. Likewise, the terms and laboratory procedures relating to protein and nucleic acid chemistry, molecular biology, cell and tissue culture, microbiology and immunology as used herein are terms and conventional procedures that are widely used in the corresponding arts. For example, standard recombinant DNA and molecular cloning techniques for use in the present invention are well known to those skilled in the art and are more fully described in the following document: Sambrook, J., Fritsch, E.F., and Maniatis, T., Molecular Cloning: A Laboratory Manual; Cold Spring Harbor Laboratory Press: Cold Spring Harbor, 1989 (hereinafter "Sambrook"). In order to better understand the present invention, definitions and explanations of related terms are provided below.
As used herein, the term "and/or" encompasses all combinations of the items connected by the term, and each combination should be regarded as having been individually listed herein. For example, "A and/or B" encompasses "A", "A and B", and "B". For example, "A, B and/or C" encompasses "A", "B", "C", "A and B", "A and C", "B and C" and "A and B and C".
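The reading of "and/or" given above corresponds to enumerating every non-empty combination of the listed items. The following Python sketch illustrates this; the function name and_or is hypothetical and is used only for illustration.

```python
from itertools import combinations


def and_or(items):
    """Return every non-empty combination of items, smallest combinations first."""
    return [combo
            for size in range(1, len(items) + 1)
            for combo in combinations(items, size)]


if __name__ == "__main__":
    for combo in and_or(["A", "B", "C"]):
        print(" and ".join(combo))
```

For n items this yields 2^n - 1 combinations, which matches the seven combinations listed for "A, B and/or C".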
The term "comprising" is used herein to describe a sequence of a protein or nucleic acid, which may consist of the sequence, or may have additional amino acids or nucleotides at one or both ends of the protein or nucleic acid, but still have the activity described herein. Furthermore, it will be clear to those skilled in the art that the methionine encoded by the start codon at the N-terminus of a polypeptide may be retained in some practical situations (e.g., when expressed in a particular expression system) without substantially affecting the function of the polypeptide. Thus, in describing a particular polypeptide amino acid sequence in the present specification and claims, although it may not comprise a methionine encoded at the N-terminus by the initiation codon, a sequence comprising such methionine is also contemplated at this time, and accordingly, the encoding nucleotide sequence may also comprise the initiation codon; and vice versa.
As used herein, the term "CRISPR nuclease" generally refers to nucleases found in naturally occurring CRISPR systems, as well as modified forms thereof, variants thereof, catalytically active fragments thereof, and the like. The term encompasses any effector protein based on a CRISPR system that is capable of achieving gene targeting (e.g., gene editing, gene targeting regulation, etc.) within a cell.
Examples of "CRISPR nucleases" include Cas9 nucleases or variants thereof. The Cas9 nuclease may be a Cas9 nuclease from a different species, such as SpCas9 from Streptococcus pyogenes (S. pyogenes) or SaCas9 from Staphylococcus aureus (S. aureus). "Cas9 nuclease" and "Cas9" are used interchangeably herein to refer to an RNA-guided nuclease comprising a Cas9 protein or a fragment thereof (e.g., a protein comprising the active DNA cleavage domain of Cas9 and/or the gRNA binding domain of Cas9). Cas9 is a component of the CRISPR/Cas (clustered regularly interspaced short palindromic repeats and associated systems) genome editing system, and can target and cleave DNA target sequences to form DNA double-strand breaks (DSBs) under the direction of a guide RNA.
Examples of "CRISPR nucleases" may also include Cas12a nucleases or variants thereof, e.g., high-specificity variants. The Cas12a nuclease may be a Cas12a nuclease from a different species, e.g., the Cas12a nucleases from Francisella novicida U112, Acidaminococcus sp. BV3L6 and Lachnospiraceae bacterium ND2006.
As used herein, the term "5'→3' exonuclease" refers to an exonuclease that degrades DNA from the 5' end, i.e., in the 5' to 3' direction. The 5'→3' exonuclease of interest can remove nucleotides from the 5' end of a dsDNA strand at a blunt end and, in certain embodiments, at a 3' and/or 5' overhang.
As used herein, "gRNA" and "guide RNA" are used interchangeably to refer to an RNA molecule that is capable of forming a complex with a CRISPR nuclease and of targeting the complex to a target sequence by virtue of some complementarity to the target sequence. For example, in Cas9-based gene editing systems, the gRNA is typically composed of crRNA and tracrRNA molecules that are partially complementary and form a complex, wherein the crRNA comprises a sequence that has sufficient identity to a target sequence and directs the CRISPR complex (Cas9 + crRNA + tracrRNA) to specifically bind to the target sequence. However, it is known in the art that single guide RNAs (sgRNAs) can be designed that contain the features of both crRNA and tracrRNA. In Cas12a-based genome editing systems, by contrast, the gRNA is typically composed of only a mature crRNA molecule, wherein the crRNA contains a sequence that has sufficient identity to the target sequence and directs specific binding of the complex (Cas12a + crRNA) to the target sequence. It is within the ability of the person skilled in the art to design a suitable gRNA sequence based on the CRISPR nuclease used and the target sequence to be edited.
As used herein, "genome" encompasses not only chromosomal DNA present in the nucleus of a cell, but also organelle DNA present in subcellular components of the cell (e.g., mitochondria, plastids).
As used herein, "cell" includes cells of any organism suitable for genome editing. Examples of organisms include, but are not limited to, mammals such as humans, mice, rats, monkeys, dogs, pigs, sheep, cows and cats; poultry such as chickens, ducks and geese; and plants, including monocots and dicots, such as rice, maize, wheat, sorghum, barley, soybean, peanut, Arabidopsis, and the like.
By "genetically modified organism" or "genetically modified cell" is meant an organism or cell comprising within its genome an exogenous polynucleotide or a modified gene or expression control sequence. For example, an exogenous polynucleotide can be stably integrated into the genome of an organism or cell and inherited over successive generations. The exogenous polynucleotide may be integrated into the genome alone or as part of a recombinant DNA construct. Modified genes or expression control sequences are those whose sequence in the genome of the organism or cell comprises single or multiple deoxynucleotide substitutions, deletions and additions.
"Exogenous" with respect to a sequence means a sequence from a foreign species, or if from the same species, a sequence that has undergone significant alteration in composition and/or locus from its native form by deliberate human intervention.
"Polynucleotide", "nucleic acid sequence", "nucleotide sequence" or "nucleic acid fragment" are used interchangeably and refer to single- or double-stranded RNA or DNA polymers, optionally containing synthetic, unnatural or altered nucleotide bases. Nucleotides are referred to by their single-letter designations as follows: "A" represents adenosine or deoxyadenosine (for RNA or DNA, respectively), "C" represents cytidine or deoxycytidine, "G" represents guanosine or deoxyguanosine, "U" represents uridine, "T" represents deoxythymidine, "R" represents purine (A or G), "Y" represents pyrimidine (C or T), "K" represents G or T, "H" represents A or C or T, "I" represents inosine, and "N" represents any nucleotide.
"Polypeptide", "peptide", and "protein" are used interchangeably herein to refer to a polymer of amino acid residues. The term applies to amino acid polymers in which one or more amino acid residues are artificial chemical analogues of the corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers. The terms "polypeptide", "peptide", "amino acid sequence" and "protein" may also include modified forms including, but not limited to, glycosylation, lipid attachment, sulfation, gamma carboxylation of glutamic acid residues, hydroxylation and ADP-ribosylation.
As used herein, an "expression construct" refers to a vector, such as a recombinant vector, suitable for expression of a nucleotide sequence of interest in an organism. "expression" refers to the production of a functional product. For example, expression of a nucleotide sequence may refer to transcription of the nucleotide sequence (e.g., transcription into mRNA or functional RNA) and/or translation of RNA into a precursor or mature protein.
The "expression construct" of the invention may be a linear nucleic acid fragment, a circular plasmid, a viral vector, or in some embodiments, may be an RNA (e.g., mRNA) capable of translation.
The "expression construct" of the invention may comprise regulatory sequences of different origin and nucleotide sequences of interest, or regulatory sequences and nucleotide sequences of interest of the same origin but arranged in a manner different from that normally found in nature.
"Regulatory sequence" and "regulatory element" are used interchangeably and refer to a nucleotide sequence that is located upstream (5 'non-coding sequence), intermediate or downstream (3' non-coding sequence) of a coding sequence and affects transcription, RNA processing or stability, or translation of the relevant coding sequence. Regulatory sequences may include, but are not limited to, promoters, translation leader sequences, introns, and polyadenylation recognition sequences.
"Promoter" refers to a nucleic acid fragment capable of controlling transcription of another nucleic acid fragment. In some embodiments of the invention, the promoter is a promoter capable of controlling transcription of a gene in a cell, whether or not it is derived from the cell. The promoter may be a constitutive or tissue specific or developmentally regulated or inducible promoter.
"Constitutive promoter" refers to a promoter that will generally cause a gene to be expressed in most cases in most cell types. "tissue-specific promoter" and "tissue-preferred promoter" are used interchangeably and refer to promoters that are expressed primarily, but not necessarily exclusively, in one tissue or organ, but also in one particular cell or cell type. "developmentally regulated promoter" refers to a promoter whose activity is determined by developmental events. An "inducible promoter" selectively expresses an operably linked DNA sequence in response to an endogenous or exogenous stimulus (environmental, hormonal, chemical signal, etc.).
As used herein, the term "operably linked" refers to a regulatory element (e.g., without limitation, a promoter sequence, a transcription termination sequence, etc.) linked to a nucleic acid sequence (e.g., a coding sequence or an open reading frame) such that transcription of the nucleotide sequence is controlled and regulated by the transcription regulatory element. Techniques for operably linking a regulatory element region to a nucleic acid molecule are known in the art.
"Introducing" a nucleic acid molecule (e.g., plasmid, linear nucleic acid fragment, RNA, etc.) or protein into an organism refers to transforming a cell of the organism with the nucleic acid or protein such that the nucleic acid or protein is capable of functioning in the cell. "transformation" as used herein includes both stable transformation and transient transformation. "Stable transformation" refers to the introduction of an exogenous nucleotide sequence into the genome, resulting in stable inheritance of an exogenous gene. Once stably transformed, the exogenous nucleic acid sequence is stably integrated into the genome of the organism and any successive generation thereof. "transient transformation" refers to the introduction of a nucleic acid molecule or protein into a cell to perform a function without stable inheritance of an exogenous gene. In transient transformation, the exogenous nucleic acid sequence is not integrated into the genome.
2. Genome editing fusion polypeptides
The present invention provides an isolated fusion polypeptide, wherein the fusion polypeptide comprises a CRISPR nuclease and a 5'→3' exonuclease.
The CRISPR nuclease of the present invention can be any CRISPR nuclease capable of genome editing. In some embodiments, the CRISPR nuclease is Cas9 or an active fragment thereof, such as Cas9 from Streptococcus pyogenes (SpCas9), Cas9 from Staphylococcus aureus (SaCas9), Cas9 from Francisella novicida (FnCas9), Cas9 from Campylobacter jejuni (CjCas9), and Cas9 from Neisseria cinerea (NcCas9). In some embodiments, the CRISPR nuclease is Cas12a or an active fragment thereof, e.g., Cas12a from Francisella novicida U112 (FnCas12a), Cas12a from Acidaminococcus sp. BV3L6, and Cas12a from Lachnospiraceae bacterium ND2006 (LbCas12a). In one embodiment, the amino acid sequence of the CRISPR nuclease is selected from SEQ ID NO: 8 or 15. In one embodiment, the nucleotide sequence encoding the CRISPR nuclease is selected from SEQ ID NO: 9 or 16.
The 5'→3' exonuclease according to the present invention may be an exonuclease that degrades DNA from the 5' end, i.e., in the 5' to 3' direction. In one embodiment, the exonuclease can digest double-stranded DNA (dsDNA). In some embodiments, the exonuclease can digest single-stranded DNA (ssDNA). In some embodiments, the exonuclease can digest both dsDNA and ssDNA. In some embodiments, the exonuclease is not a 3'→5' exonuclease. In some embodiments, the 5'→3' exonuclease is a T5 exonuclease, e.g., the product of phage T5 gene D15. In some embodiments, the T5 exonuclease comprises an amino acid sequence having at least 80%, at least 90%, at least 95%, at least 99% or 100% sequence identity to SEQ ID NO: 3. In some embodiments, the T5 exonuclease is encoded by a nucleotide sequence having at least 80%, at least 90%, at least 95%, at least 99% or 100% sequence identity to SEQ ID NO: 4. In some preferred embodiments, the T5 exonuclease comprises the amino acid sequence of SEQ ID NO: 3. In a preferred embodiment, the T5 exonuclease is encoded by the nucleotide sequence of SEQ ID NO: 4.
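The sequence identity thresholds above presuppose some way of computing percent identity, which the specification does not prescribe. The Python sketch below shows one simple convention and rests on stated assumptions: the two sequences have already been aligned to equal length, "-" marks a gap, and identity is the number of identical columns divided by the number of aligned columns. Other conventions (for example, dividing by the length of the shorter sequence) give different values. The function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences of equal length.

    Columns in which both sequences have a gap are ignored; a column with
    a gap in only one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        return 0.0
    identical = sum(1 for a, b in columns if a == b)
    return 100.0 * identical / len(columns)


def meets_threshold(aligned_a, aligned_b, threshold=80.0):
    """Return True if the identity is at least the given percentage."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    print(percent_identity("MKLEDIAAAA", "MKLEDIAAGG"))
    print(meets_threshold("MKLEDIAAAA", "MKLEDIAAGG"))
```

In practice the alignment itself would come from an established alignment tool; this sketch covers only the final counting step.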
In the polypeptide of the invention, the 5'→3' exonuclease and the CRISPR nuclease may be fused directly or indirectly. In some embodiments, the 5'→3' exonuclease is fused directly to the CRISPR nuclease. In some embodiments, the 5'→3' exonuclease and the CRISPR nuclease are fused indirectly, e.g., via a linker. The linker may be a non-functional amino acid sequence of 1-50 (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 20-25 or 25-50) or more amino acids in length, without secondary or higher-order structure. For example, the linker may be a flexible linker. In some embodiments, the amino acid sequence of the linker is selected from SEQ ID NO: 5 or 14.
In the polypeptide of the invention, the 5'→3' exonuclease is located at the N-terminus and/or the C-terminus of the CRISPR nuclease. In some embodiments, the 5'→3' exonuclease is located at the N-terminus of the CRISPR nuclease. In some embodiments, the 5'→3' exonuclease is located at the C-terminus of the CRISPR nuclease.
In some embodiments, the isolated fusion polypeptide further comprises a nuclear localization sequence (NLS). In general, the one or more NLSs in the polypeptide should be of sufficient strength to drive accumulation of the polypeptide in the cell nucleus in an amount that allows it to perform its genome editing function. In general, the strength of nuclear localization activity is determined by the number and position of the NLSs in the polypeptide, the specific NLS or NLSs used, or a combination of these factors.
In some embodiments of the invention, the NLS of the polypeptide of the invention may be located at the N-terminus and/or the C-terminus. In some embodiments of the invention, the NLS of the polypeptide of the invention may be located between the 5'→3' exonuclease and the CRISPR nuclease. In some embodiments, the polypeptide comprises about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more NLSs. In some embodiments, the polypeptide comprises about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more NLSs at or near the N-terminus. In some embodiments, the polypeptide comprises about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more NLSs at or near the C-terminus. In some embodiments, the polypeptide comprises a combination of these, such as one or more NLSs at the N-terminus and one or more NLSs at the C-terminus. When more than one NLS is present, each may be selected independently of the others.
Generally, an NLS consists of one or more short sequences of positively charged lysines or arginines exposed on the protein surface, but other types of NLS are also known. In some embodiments, the amino acid sequence of the NLS of the invention is selected from SEQ ID NO: 6, 7, 12 or 13.
In addition, the isolated fusion polypeptides of the invention may also include other targeting sequences, such as cytoplasmic targeting sequences, chloroplast targeting sequences, mitochondrial targeting sequences, etc., depending on the location of the DNA to be edited.
In some embodiments, the isolated fusion polypeptide comprises, from N-terminus to C-terminus: the 5'→3' exonuclease, an NLS, the CRISPR nuclease, and another NLS. In some embodiments, the isolated fusion polypeptide comprises, from N-terminus to C-terminus: an NLS, the 5'→3' exonuclease, the CRISPR nuclease, and another NLS.
In some preferred embodiments, the isolated fusion polypeptide of the invention comprises the amino acid sequence of SEQ ID NO:1 or 10.
The invention also provides isolated polynucleotides encoding the fusion polypeptides of the invention. In some embodiments, the polynucleotide comprises SEQ ID NO:2 or 11 or a degenerate variant thereof.
In order to obtain efficient expression, in some embodiments, the polynucleotide is codon optimized for the organism being edited, e.g., a plant.
Codon optimization refers to a method of modifying a nucleic acid sequence to enhance expression in a host cell of interest by replacing at least one codon of a native sequence (e.g., about or more than about 1, 2, 3, 4, 5, 10, 15, 20, 25, 50 or more codons) with a codon that is more frequently or most frequently used in the genes of the host cell, while maintaining the native amino acid sequence. Different species exhibit specific preferences for certain codons of a particular amino acid. Codon preference (differences in codon usage between organisms) is often correlated with the translation efficiency of messenger RNA (mRNA), which is believed to depend on the nature of the codons being translated and the availability of particular transfer RNA (tRNA) molecules. The predominance of selected tRNAs within the cell generally reflects the codons most frequently used for peptide synthesis. Thus, genes can be tailored for optimal gene expression in a given organism based on codon optimization. Codon usage tables are readily available, e.g., in the Codon Usage Database at www.kazusa.or.jp/codon/, and these tables can be adapted in a number of ways. See, e.g., Nakamura, Y. et al., "Codon usage tabulated from the international DNA sequence databases: status for the year 2000." Nucl. Acids Res. 28:292 (2000).
In some embodiments, for the isolated fusion polypeptide of the invention, the coding sequence of the 5'→3' exonuclease and/or the coding sequence of the CRISPR nuclease is codon optimized for the organism being edited. In some embodiments, for the isolated fusion polypeptide of the invention, the coding sequence of the 5'→3' exonuclease and/or the coding sequence of the CRISPR nuclease is codon optimized for rice (Oryza sativa).
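By way of illustration only, the codon replacement described above can be sketched as follows. The sketch assumes a codon usage table is available as a mapping from each amino acid to its synonymous codons and their relative frequencies. The table `TOY_USAGE` and the function names are hypothetical, and the frequencies are placeholders rather than actual rice codon usage values.

```python
# Toy usage table: amino acid -> {codon: relative frequency}.
# The numbers are illustrative placeholders, NOT real codon usage data.
TOY_USAGE = {
    "M": {"ATG": 1.0},
    "K": {"AAG": 0.7, "AAA": 0.3},
    "L": {"CTC": 0.4, "CTG": 0.3, "TTG": 0.2, "CTT": 0.1},
    "*": {"TGA": 0.5, "TAA": 0.3, "TAG": 0.2},
}


def _codon_to_aa(usage):
    """Invert the usage table into a codon -> amino acid lookup."""
    return {codon: aa for aa, codons in usage.items() for codon in codons}


def translate(cds, usage=TOY_USAGE):
    """Translate a coding sequence using the codons present in the table."""
    lookup = _codon_to_aa(usage)
    cds = cds.upper()
    return "".join(lookup[cds[i:i + 3]] for i in range(0, len(cds), 3))


def codon_optimize(cds, usage=TOY_USAGE):
    """Replace every codon with the most frequent synonymous codon."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    lookup = _codon_to_aa(usage)
    # Most frequently used codon for each amino acid in the host table.
    preferred = {aa: max(codons, key=codons.get) for aa, codons in usage.items()}
    return "".join(preferred[lookup[cds[i:i + 3]]] for i in range(0, len(cds), 3))


native = "ATGAAATTGTAA"
optimized = codon_optimize(native)
# The encoded amino acid sequence is unchanged by the replacement.
assert translate(native) == translate(optimized)
```

Always choosing the single most frequent codon is only one possible strategy; the text above equally covers replacing a subset of codons with more frequently used ones.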
3. Improved genome editing system
The inventors surprisingly found that fusion of T5 exonuclease with a CRISPR nuclease such as Cas9 or Cas12a can produce larger deletions at a specific target site using only one gRNA and greatly improve editing efficiency. More unexpectedly, no cytotoxicity was observed when cells were transformed with the fusion polypeptide of T5 exonuclease and CRISPR nuclease, which makes the fusion polypeptide particularly suitable for genome editing of cells.
Thus, in a further aspect the invention provides the use of an isolated fusion polypeptide of the invention for genome editing of a cell.
In another aspect, the present invention provides a genome editing system comprising at least one of the following i) to v):
i) an isolated fusion polypeptide of the invention, and a guide RNA;
ii) an expression construct comprising a nucleotide sequence encoding a fusion polypeptide of the invention, and a guide RNA;
iii) an isolated fusion polypeptide of the invention, and an expression construct comprising a nucleotide sequence encoding a guide RNA;
iv) an expression construct comprising a nucleotide sequence encoding a fusion polypeptide of the invention, and an expression construct comprising a nucleotide sequence encoding a guide RNA;
v) an expression construct comprising a nucleotide sequence encoding a fusion polypeptide of the invention and a nucleotide sequence encoding a guide RNA.
As used herein, a "genome editing system" refers to a combination of components required for editing the genome within a cell. The individual components of the system, e.g., the fusion polypeptide, the guide RNA, etc., may each be present independently or may be present in any combination as a composition.
In some embodiments, the guide RNA is an sgRNA. In some embodiments, the guide RNA is an sgRNA and the sgRNAs are not paired. Methods for constructing suitable sgRNAs according to a given target sequence are known in the art. See, for example: Wang, Y. et al. Simultaneous editing of three homoeoalleles in hexaploid bread wheat confers heritable resistance to powdery mildew. Nat. Biotechnol. 32, 947-951 (2014); Shan, Q. et al. Targeted genome modification of crop plants using a CRISPR-Cas system. Nat. Biotechnol. 31, 686-688 (2013); Liang, Z. et al. Targeted mutagenesis in Zea mays using TALENs and the CRISPR/Cas system. J Genet Genomics. 41, 63-68 (2014).
The design of target sequences that can be recognized and targeted by the complex of a CRISPR nuclease and a guide RNA is within the skill of one of ordinary skill in the art. In general, for Cas9, the target sequence is a sequence complementary to a guide sequence of about 20 nucleotides contained in the guide RNA, and its 3' end is immediately adjacent to a protospacer adjacent motif (PAM), e.g., 5'-NGG. For Cas12a, by contrast, the PAM typically needs to be located at the 5' end of the target sequence and may be, for example, 5'-TTTN.
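As a non-limiting sketch of the PAM rules just described, candidate target sites on one strand of a DNA sequence could be listed as below. The function names are hypothetical. The 20-nucleotide length follows the "about 20 nucleotides" stated above for Cas9, whereas the 23-nucleotide default for Cas12a is an assumption made for illustration only. Only the given strand is scanned; a practical design tool would also scan the reverse complement and assess specificity.

```python
import re


def find_cas9_targets(seq, guide_len=20):
    """Return (start, protospacer, PAM) for sites followed 3' by 5'-NGG."""
    seq = seq.upper()
    # Lookahead allows overlapping candidate sites to be reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


def find_cas12a_targets(seq, guide_len=23):
    """Return (start, PAM, protospacer) for sites preceded 5' by 5'-TTTN.

    guide_len=23 is an assumed value, not taken from the description.
    """
    seq = seq.upper()
    pattern = re.compile(r"(?=(TTT[ACGT])([ACGT]{%d}))" % guide_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


# Synthetic example sequences (not real genomic loci).
cas9_hits = find_cas9_targets("A" * 20 + "TGG")
cas12a_hits = find_cas12a_targets("TTTC" + "ACGT" * 6)
print(cas9_hits)
print(cas12a_hits)
```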
In some embodiments, the CRISPR system of the present invention comprises at least one of ii) to v) above. In some embodiments, the nucleotide sequence encoding the fusion polypeptide of the invention and/or the nucleotide sequence encoding the guide RNA is operably linked to an expression control sequence, preferably a plant expression control sequence, such as a promoter.
Examples of promoters that can be used in the present invention include, but are not limited to: the cauliflower mosaic virus 35S promoter (Odell et al. (1985) Nature 313:810-812), the maize Ubi-1 promoter, the wheat U6 promoter, the rice U3 promoter, the maize U3 promoter, the rice actin promoter, the trpPro promoter (U.S. patent application Ser. No.10/377,318; 16. 2005), the pEMU promoter (Last et al. (1991) Theor. Appl. Genet. 81:581-588), the MAS promoter (Velten et al. (1984) EMBO J. 3:2723-2730), the maize H3 histone promoter (Lepetit et al. (1992) Mol. Gen. Genet. 231:276-285 and Atanassova et al. (1992) Plant J. 2(3):300) and the Brassica napus ALS3 promoter (PCT application WO 97/41228). Promoters useful in the present invention also include those reviewed in Moore et al. (2006) Plant J. 45(4):651-683.
In an exemplary embodiment, the construct of the invention comprises the maize Ubi-1 promoter.
4. Method for modifying target sequence in cell genome
In another aspect, the invention provides a method of modifying a target sequence in the genome of a cell, comprising introducing into the cell a genome editing system of the invention.
In some embodiments, the modification results in the deletion of one or more nucleotides, preferably a plurality of consecutive nucleotides, in the target sequence. In some embodiments, the deletion comprises 1-500 or even more consecutive nucleotides.
In some embodiments, the deletion is within the target sequence. In some embodiments, the modification does not include an insertion and/or substitution mutation.
In another aspect, the invention also provides a method of producing a genetically modified cell, comprising introducing into the cell a genome editing system of the invention.
In the present invention, the target sequence to be modified may be located at any position of the genome, for example, within a functional gene such as a protein-encoding gene, or may be located, for example, in a gene expression regulatory region such as a promoter region or an enhancer region, thereby effecting functional modification of the gene or modification of gene expression. Modifications in the cellular target sequence may be detected by T7EI, PCR/RE, or sequencing methods. The genome editing system of the present invention is particularly suitable for modification of regulatory sequences such as promoters or non-coding sequences and the like.
In the methods of the invention, the genome editing system may be introduced into cells by various methods well known to those skilled in the art.
Methods useful for introducing the genome editing system of the invention into cells include, but are not limited to: calcium phosphate transfection, protoplast fusion, electroporation, liposome transfection, microinjection, viral infection (e.g., baculovirus, vaccinia virus, adenovirus, and other viruses), the gene gun method, PEG-mediated protoplast transformation, and Agrobacterium-mediated transformation.
Cells that can be genome edited by the methods of the invention can be from, for example, mammals such as humans, mice, rats, monkeys, dogs, pigs, sheep, cattle, cats; poultry such as chickens, ducks, geese; plants, including monocots and dicots, such as rice, maize, wheat, sorghum, barley, soybean, peanut, Arabidopsis, and the like. Preferably, the cell is a plant cell, such as a rice cell.
In some embodiments, the methods of the invention are performed in vitro. For example, the cell is an isolated cell. In other embodiments, the methods of the invention may also be performed in vivo. For example, the cell is a cell in an organism, into which the system of the invention can be introduced in vivo, for example by a virus-mediated method. In some embodiments, the cell is a germ cell. In some embodiments, the cell is a somatic cell.
In another aspect, the invention also provides a genetically modified organism comprising a genetically modified cell produced by the method of the invention.
Such organisms include, but are not limited to, mammals such as humans, mice, rats, monkeys, dogs, pigs, sheep, cattle, cats; poultry such as chickens, ducks, geese; plants, including monocots and dicots, such as rice, maize, wheat, sorghum, barley, soybean, peanut, Arabidopsis, and the like. Preferably, the organism is a plant, preferably rice.
5. Kit
In yet another aspect, the scope of the invention also includes a kit for use in the methods of the invention, the kit comprising the genome editing system of the invention and instructions for use. Kits generally include a label indicating the intended use and/or method of use of the kit contents. The term "label" includes any written or recorded material provided on or with the kit, or otherwise accompanying the kit.
Examples
In order that the invention may be readily understood, a more particular description of the invention will be rendered by reference to specific embodiments that are illustrated in the appended drawings. Preferred embodiments of the present invention are shown in the drawings. This invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete.
Methods
Plasmid construction
The T5 exonuclease coding sequence was codon optimized for rice (Oryza sativa) and commercially synthesized (GenScript, Nanjing, China). The T5 coding sequence was fused in-frame to the 5' end of Cas9 or Cas12a by Gibson assembly to produce p163-T5exo-Cas9 or p163-T5exo-Cas12a, respectively. To construct the pH-T5exo-Cas9-sgRNA binary vector, the T5exo-Cas9 expression cassette was cloned into the pHUE411 backbone (Xing et al., 2014). To construct the pCambia-T5exo-Cas12a-crRNA binary vector, the T5exo-Cas12a and crRNA expression cassettes were cloned into the pCambia2300 backbone.
Protoplast transfection
Seedlings of yellow japonica rice grown on medium were used to prepare protoplasts. Isolation and transformation of protoplasts were performed as described previously (Shan et al., 2013; Wang et al., 2014). Plasmid DNA (10 μg of each construct) was delivered into the protoplasts by PEG-mediated transfection, and the transfected protoplasts were then incubated at 28 ℃. At 48 hours after transfection, protoplasts were collected to extract genomic DNA for restriction enzyme assays or amplicon deep sequencing.
Agrobacterium-mediated transformation of rice
The binary vector was introduced into Agrobacterium tumefaciens strain AGL1 by electroporation. Agrobacterium-mediated transformation of the rice cultivar Nipponbare and regeneration of rice plants were performed as previously reported (Hiei et al., 1994). Rice calli edited by Cas9/T5exo-Cas9 were selected on medium containing hygromycin (50 μg/ml), and calli edited by Cas12a/T5exo-Cas12a were selected on medium containing G418 (60 μg/ml).
Plant genomic DNA extraction and next-generation sequencing of targeted amplicons
Genomic DNA was extracted from protoplasts and seedlings using the CTAB method (Murray et al., 1980) and then used as a template for PCR amplification. In the first round of PCR, the target region was amplified using site-specific primers. In the second round of PCR, forward and reverse barcodes were added to the ends of the PCR products for library construction. Equal amounts of PCR products were pooled, and the samples were commercially sequenced by paired-end sequencing on the Illumina NextSeq 500 platform (GENEWIZ, Suzhou, China). The sequencing reads were examined for indels at the sgRNA target site. Amplicon sequencing was repeated three times for each target site, using genomic DNA extracted from three independent protoplast samples.
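The examination of sequencing reads for indels can be illustrated with the following minimal sketch. It assumes that the reads have already been aligned to the reference amplicon by an aligner that reports CIGAR strings, which is not stated in the methods above; the function names are hypothetical. A practical analysis would additionally restrict the call to a window around the target site.

```python
import re

# One CIGAR operation: a length followed by an operation code (SAM format).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def indel_sizes(cigar):
    """Return (insertion_sizes, deletion_sizes) found in one CIGAR string."""
    insertions, deletions = [], []
    for length, op in CIGAR_OP.findall(cigar):
        if op == "I":
            insertions.append(int(length))
        elif op == "D":
            deletions.append(int(length))
    return insertions, deletions


def classify_read(cigar):
    """Label a read as unedited, insertion, deletion, or both."""
    insertions, deletions = indel_sizes(cigar)
    if insertions and deletions:
        return "insertion+deletion"
    if deletions:
        return "deletion"
    if insertions:
        return "insertion"
    return "unedited"


# Made-up CIGAR strings for illustration only.
for cigar in ("100M", "60M1I39M", "50M12D38M"):
    print(cigar, classify_read(cigar), indel_sizes(cigar))
```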
Off-target detection
The inventors examined the potential off-target effects of Cas or T5exo-Cas in the OsXa13 and OsPDS rice mutants, respectively. Potential off-target sites in the Nipponbare genome were predicted with the online tool Cas-OFFinder (Bae et al., 2014). Four potential off-target sites with 3-4 nucleotide mismatches were selected for the OsXa13 mutants, and five potential off-target sites with 4-5 nucleotide mismatches were selected for the OsPDS mutants. Locus-specific primers flanking these off-target sites were designed. Amplicons of the potential off-target sites (about 700 to 1000 bp) were sequenced by Sanger sequencing.
Pathogen inoculation and virulence determination
The Xoo strain PXO99 was inoculated onto the two most recently fully developed leaves of rice seedlings at the six-leaf stage by the leaf-clipping method, as previously described (Yang et al., 2013). Disease symptoms were scored by measuring lesion length.
Example 1: Fusion of T5 exonuclease to Cas9 alters the indel characteristics of genome editing
To generate larger genomic deletions using the CRISPR/Cas system in higher plants, the inventors selected T5, a well-studied exonuclease that degrades DNA in the 5'→3' direction (Kaliman et al., 1986), and fused it in frame to the N-terminus of Cas9 (FIG. 1a). To test whether the T5exo-Cas9 fusion protein can alter indel characteristics, the inventors transfected the T5exo-Cas9 and Cas9 plasmids, respectively, into rice protoplasts together with the sgRNA OsMKK5-T1 plasmid. Genomic DNA extracted from protoplasts 48 h after transfection was first digested with HindIII to reduce unedited DNA and was then used as the template for targeted PCR amplification. To identify indels, the purified PCR products were cloned and sequenced by Sanger sequencing. The resulting indel profile shows that the percentage of deletions among the indels produced by the T5exo-Cas9 fusion was greatly increased, while the proportion of insertions was reduced, relative to Cas9 (FIG. 1b). The inventors further found that Cas9-induced deletions were predominantly 1-2 bp in size (FIG. 1d), which is consistent with previous reports (Zhang et al., 2014). In contrast, deletions induced by the T5exo-Cas9 fusion were variable and larger, up to 446 bp (FIG. 1c). These results indicate that fusion of T5 exonuclease with Cas9 promotes deletions during genome editing.
Example 2: T5exo-Cas9 fusion induces a higher frequency and increased size of genomic deletions
To examine more thoroughly the indels generated by the T5exo-Cas9 fusion, the inventors designed four sgRNAs targeting different genomic loci of rice (OsMPK16-T1, OsCDC48-T1, OsALS-T1, OsXa13-T1) (Table 1). These sgRNAs were transformed into rice protoplasts together with the T5exo-Cas9 fusion or Cas9, respectively. Indels produced at the four target sites were analyzed by targeted deep sequencing. Consistently, the T5exo-Cas9 fusion induced significantly more deletions than Cas9. For OsMPK16-T1, the deletion rate increased from 20.1 to 86.5 (4.3 fold); for OsCDC48-T1, the deletion rate increased from 71.8 to 95.6 (1.3 fold); for OsALS-T1, the deletion rate increased from 76.4 to 97.4 (1.3 fold); and for OsXa13-T1, the deletion rate increased from 22.8 to 90.6 (4.0 fold) (FIG. 2a). These results demonstrate that, during genome editing, T5exo-Cas9 induced deletions more frequently than Cas9.
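The fold changes quoted above follow directly from the reported deletion rates, as the short sketch below illustrates. The values are those given in the preceding paragraph; the dictionary and function names are hypothetical, and the last target site is labeled OsXa13-T1 as in Example 3.

```python
# (deletion rate with Cas9, deletion rate with T5exo-Cas9) per target site,
# copied from the paragraph above.
DELETION_RATES = {
    "OsMPK16-T1": (20.1, 86.5),
    "OsCDC48-T1": (71.8, 95.6),
    "OsALS-T1": (76.4, 97.4),
    "OsXa13-T1": (22.8, 90.6),
}


def fold_change(before, after):
    """Ratio of the rate with the fusion to the rate with Cas9 alone."""
    return after / before


for site, (cas9_rate, fusion_rate) in DELETION_RATES.items():
    print(f"{site}: {fold_change(cas9_rate, fusion_rate):.1f} fold")
```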
Table 1: summary of sgRNA target sites and corresponding oligonucleotides for vector construction.
* PAM motifs in each target sequence are shown in bold
Next, the inventors analyzed the sizes of all deletions made by the T5exo-Cas9 fusion and Cas9 at the four target sites. For Cas9, most deletions were smaller than 10 bp (96.5%-100%), with only a small portion (0-3.5%) larger than 10 bp. The deletions at OsXa13-T1 and OsALS-T1 were predominantly around 1-3 bp (FIG. 2b). The T5exo-Cas9 fusion produced larger deletions: about 16.1%-35.8% of its deletions were greater than 10 bp, and the average deletion size was 33-44 bp (FIG. 2b). Interestingly, the genome editing efficiency of Cas9 appeared to be enhanced by the T5 fusion (FIG. 2c).
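The two summary statistics used above, namely the proportion of deletions larger than a size threshold and the average deletion size, can be computed as in the following sketch. The function name is hypothetical, and the example list of deletion sizes is invented solely to show the calculation; it is not experimental data.

```python
def summarize_deletions(sizes, threshold=10):
    """Return the fraction of deletions > threshold (bp) and the mean size."""
    if not sizes:
        raise ValueError("no deletions to summarize")
    large = sum(1 for size in sizes if size > threshold)
    return {
        "fraction_large": large / len(sizes),
        "mean_size": sum(sizes) / len(sizes),
    }


# Invented deletion sizes in bp, for illustration only.
example_sizes = [1, 2, 3, 12, 40]
print(summarize_deletions(example_sizes))
```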
Example 3: T5exo-Cas9 fusion produces larger genomic deletions in transgenic rice plants
To generate rice mutant plants using the T5exo-Cas9 fusion, the inventors performed Agrobacterium-mediated rice transformation with binary vectors expressing T5exo-Cas9 or Cas9, respectively, together with OsXa13-T1, which targets the UPT PthXo1 box of the OsXa13 gene promoter (upregulated by the transcription activator-like effector PthXo1). For Cas9, the inventors obtained 42 T0 transformants, 36 of which were edited, giving an editing efficiency of 85.7%. Of the edited lines, 12 were homozygous mutants and 24 were biallelic mutants (FIG. 3a). 82% of the indels were 1 bp insertions (FIG. 3b). For T5exo-Cas9, 46 T0 transformants were obtained, 42 of which were edited, giving an editing efficiency of 91.3%. Of the edited lines, 3 lines were homozygous mutants of a single allele and 35 lines were biallelic mutants (FIG. 3a). The frequency of deletions was 72%, and 35% of the deletion mutants had deletions of more than 3 bp at the target site (FIG. 3b). Overall, in transgenic rice plants, the T5exo-Cas9 fusion induced a higher frequency of deletions and larger genomic deletions than Cas9, consistent with the results observed in rice protoplasts (FIG. 2). In addition, fusion of T5 with Cas9 appears to enhance genome editing efficiency in transgenic rice plants, similar to what was seen in protoplasts.
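The editing efficiencies stated above are the number of edited T0 transformants divided by the total number of T0 transformants obtained. A minimal check using the counts given in the preceding paragraph is shown below; the function name is hypothetical.

```python
def editing_efficiency(edited, total):
    """Percentage of T0 transformants carrying an edit at the target site."""
    if total <= 0:
        raise ValueError("total number of transformants must be positive")
    return 100.0 * edited / total


# Counts taken from the paragraph above.
print(f"Cas9:       {editing_efficiency(36, 42):.1f}%")
print(f"T5exo-Cas9: {editing_efficiency(42, 46):.1f}%")
```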
The UPT PthXo1 box (25 bp) in the OsXa13 gene promoter is the only Xoo-responsive cis-acting element (Yuan et al., 2011). Naturally occurring deletions in the UPT PthXo1 box of OsXa13 result in recessive resistance to Xanthomonas oryzae pv. oryzae (Xoo), including the PXO99 strain (Chu et al., 2006). Thus, the inventors examined the PXO99 resistance phenotypes of various homozygous deletion mutants produced by T5exo-Cas9 and Cas9 using the leaf-clipping method. The inventors found that the average lesion length formed on wild-type leaves was about 13 cm. Lesions were about 7-8 cm in length on mutants carrying 1 bp insertions or deletions of no more than 2 bp on the alleles. The -4/-12 bp biallelic mutant exhibited the strongest resistance, with lesion lengths of only about 3 cm (FIGS. 3c and 3d). This result suggests that the T5exo-Cas9 fusion may promote loss-of-function mutations of the cis-regulatory element.
The inventors further examined the effect of the T5 fusion on Cas9 off-target activity by measuring the frequency of indels at putative off-target sites. For OsXa13-T1, four potential off-target sites with three to four mismatches were identified using the online tool Cas-OFFinder (Bae et al., 2014) (Table 2). The inventors amplified DNA fragments covering these potential off-target sites from the mutants generated by T5exo-Cas9 and Cas9. Sequencing of the targeted amplicons indicated that no mutations were detected at these potential off-target sites in the mutants generated by Cas9 and T5exo-Cas9, indicating that fusion of the T5 exonuclease did not alter the off-target activity of Cas9.
TABLE 2 potential off-target sites in rice
* PAM sequences in each target sequence are shown in bold. Positions that do not match the preselected target are shown underlined
Example 4: fusion of T5 exonuclease with Cas12a increases deletion frequency and enlarges deletion size
To test whether T5 fusions are suitable for other Cas nucleases, the inventors also fused T5 exonuclease in frame to the N-terminus of Cas12a using an XTEN linker (Schellenberger et al., 2009). The fusion gene was driven by the maize Ubiquitin-1 promoter (Ubi-1) (FIG. 4a). Three sgRNAs (OsBADH-T1, OsEPSPs-T1, OsPDS-T1) were designed to target different genomic loci in rice (Table 1). Each sgRNA was transformed into rice protoplasts together with T5exo-Cas12a or Cas12a, and editing of each gene was assessed by targeted amplicon deep sequencing. Similar to what was observed with T5exo-Cas9, T5exo-Cas12a also induced a higher deletion frequency relative to Cas12a. For OsBADH-T1, the insertion rate was drastically reduced from 6.2 to 0.2 (31.0 fold); for OsEPSPs-T1, the insertion rate was reduced from 11.2 to 1.1 (10.2 fold); and for OsPDS-T1, from 7.7 to 1.4 (5.5 fold) (FIG. 4b). The inventors then analyzed the sizes of all deletions made by the T5exo-Cas12a fusion and Cas12a at the three target sites. For Cas12a, the deletions were mostly smaller than 15 bp at all three target sites and concentrated around 6-10 bp. As expected, the T5exo-Cas12a fusion induced larger deletions at each site, and the proportion of deletions greater than 15 bp induced by the T5exo-Cas12a fusion at these target sites was on average 8.6 times higher than that induced by Cas12a (FIG. 4c). Taken together, these results support that fusion of T5 exonuclease to Cas12a increases the frequency and size of genomic deletions at the guide RNA target locus. In addition, the genome editing efficiency of the T5exo-Cas12a fusion was higher (1.34-1.47 fold) than that of Cas12a at all three target sites (FIG. 4d).
Example 5: T5exo-Cas12a fusion produces larger genomic deletions in transgenic rice plants
The inventors also performed Agrobacterium-mediated rice transformation with binary vectors expressing T5exo-Cas12a or Cas12a together with a guide RNA targeting the OsPDS gene (Table 1). For Cas12a, 128 T0 transformants were obtained, 21 of which were edited, giving an editing efficiency of 16.4%. At the OsPDS locus, all edits were deletions, most of which were 1-15 bp, with only 11.5% of the deletions greater than 15 bp (FIG. 5a). For T5exo-Cas12a, the inventors obtained 150 T0 transformants with a mutation frequency of 28.7%, about 1.8 times that of Cas12a. Of the deletions generated by T5exo-Cas12a, 46.8% were greater than 15 bp (FIG. 5a). Among them, 11.3% of the deletions were greater than 30 bp (FIG. 5b). This result supports that the T5 fusion enhances the genomic deletion and genome editing efficiency of Cas12a in transgenic rice plants.
The inventors also examined the off-target effect of T5exo-Cas12a with the sgRNA targeting the OsPDS gene. Five potential off-target sites with 5 to 6 mismatches were identified using the online tool Cas-OFFinder (Table 2). DNA fragments covering the potential off-target sites were amplified from the mutants produced by T5exo-Cas12a and Cas12a. Sequencing of the targeted amplicons indicated that no mutations were detected at these potential off-target sites in the mutants produced by T5exo-Cas12a and Cas12a, suggesting that fusion of the T5 exonuclease did not alter the off-target activity of Cas12a.
Provided herein is a novel method by which Cas9 or Cas12a can be fused to a T5 exonuclease to create larger genomic deletions with one guide RNA at a given target site. As shown by experiments in rice protoplasts and seedlings, the T5exo-Cas fusion caused an increase in both the frequency and the size of deletions at the target genomic site. Furthermore, the genome editing efficiency of Cas9 and Cas12a is improved by fusion with T5 exonuclease. The T5exo-Cas fusion expands the CRISPR toolkit and facilitates knockout of regulatory and non-coding DNA. More broadly, the results of the present invention suggest a general strategy for generating larger deletions with other Cas nucleases.
Without being bound by any theory, it is speculated that T5 exonuclease can degrade the 5' ends of the DSBs generated by Cas9 or Cas12a, resulting in an increased frequency and size of deletions when NHEJ repairs the DSBs. The different deletion sizes at one genomic site may be due to different durations of binding of the T5exo-Cas9 or T5exo-Cas12a fusion protein to DNA, which determines the activity of the T5 exonuclease at the DNA ends. Interestingly, the T5exo-Cas12a fusion produced larger deletions than T5exo-Cas9. This may be because Cas12a produces sticky ends whereas Cas9 produces blunt ends. T5 exonuclease is reported to bind more strongly to DNA duplexes with 5' overhangs than to DNA duplexes with blunt ends (Garforth et al., 1997).
The larger genomic deletions generated by the T5exo-Cas fusion will greatly facilitate functional analysis of regulatory sequences and non-coding sequences (such as lncRNAs, miRNAs and cis elements), because small indels in these regions are highly likely not to generate a loss-of-function phenotype. In this study, the inventors also observed that the small indels (+1/-2) generated in the UPT PthXo1 box of the OsXa13 promoter failed to knock out its function, whereas the larger deletions (-4/-12) induced by the T5exo-Cas9 fusion disrupted the function of the UPT PthXo1 box. Recently, rice mutants with a 149 bp deletion covering the UPT PthXo1 box were obtained using paired sgRNAs, and they showed strong Xoo resistance without affecting fertility (Li et al., 2019). However, the 149 bp deletion is much larger than the UPT PthXo1 box (25 bp) and may affect other regulatory sequences in this region. In contrast, most deletions made by the T5exo-Cas fusion are within the UPT PthXo1 box, suggesting that the T5exo-Cas fusion provides a more precise strategy for knocking out short regulatory sequences and non-coding sequences. Furthermore, it is not easy to design two sgRNAs targeting such short sequences, so a new tool that uses only one sgRNA, as provided by the present invention, is useful.
One concern with the T5 fusion is the potential toxicity of T5 when expressed in heterologous cells, which was originally reported in bacteria (Kaliman et al., 1986). However, no visible phenotype or growth defect was observed in transgenic rice expressing T5exo-Cas9 or T5exo-Cas12a, indicating that the T5 fusions did not affect plant growth and development.
In summary, the inventors developed a new and efficient strategy that creates larger deletions with one guide RNA by fusing T5 exonuclease to Cas9 or Cas12a.
Sequence listing
>SEQ ID NO:1 T5exo-Cas9
>SEQ ID NO:2 T5exo-Cas9
T5exo-Linker-NLS1-Cas9-NLS2
The NLS, Cas9, T5 exonuclease and Linker are highlighted in gray, purple, blue and orange, respectively.
> SEQ ID NO:3 T5 exonuclease
> SEQ ID NO:4 T5 exonuclease
>SEQ ID NO:5 Linker
>SEQ ID NO:6 NLS1
>SEQ ID NO:7 NLS2
>SEQ ID NO:8 Cas9
>SEQ ID NO:9 Cas9
>SEQ ID NO:10 T5exo-Cas12a
>SEQ ID NO:11 T5exo-Cas12a
NLS3-T5exo-Linker-Cas12a-NLS4
The NLS, Cas12a, T5 exonuclease and XTEN linker are highlighted in gray, green, blue and yellow, respectively.
>SEQ ID NO:12 NLS3
>SEQ ID NO:13 NLS4
>SEQ ID NO:14 XTEN linker
>SEQ ID NO:15 Cas12a
>SEQ ID NO:16 Cas12a
Sequence listing
<110> Institute of Microbiology, Chinese Academy of Sciences
<120> Improved genome editing system
<130> I2019TC3889CB
<160> 16
<170> PatentIn version 3.5
<210> 1
<211> 1709
<212> PRT
<213> ARTIFICIAL SEQUENCE
<220>
<223> ARTIFICIAL SEQUENCE
<400> 1
Met Ser Lys Ser Trp Gly Lys Phe Ile Glu Glu Glu Glu Ala Glu Met
1 5 10 15
Ala Ser Arg Arg Asn Leu Met Ile Val Asp Gly Thr Asn Leu Gly Phe
20 25 30
Arg Phe Lys His Asn Asn Ser Lys Lys Pro Phe Ala Ser Ser Tyr Val
35 40 45
Ser Thr Ile Gln Ser Leu Ala Lys Ser Tyr Ser Ala Arg Thr Thr Ile
50 55 60
Val Leu Gly Asp Lys Gly Lys Ser Val Phe Arg Leu Glu His Leu Pro
65 70 75 80
Glu Tyr Lys Gly Asn Arg Asp Glu Lys Tyr Ala Gln Arg Thr Glu Glu
85 90 95
Glu Lys Ala Leu Asp Glu Gln Phe Phe Glu Tyr Leu Lys Asp Ala Phe
100 105 110
Glu Leu Cys Lys Thr Thr Phe Pro Thr Phe Thr Ile Arg Gly Val Glu
115 120 125
Ala Asp Asp Met Ala Ala Tyr Ile Val Lys Leu Ile Gly His Leu Tyr
130 135 140
Asp His Val Trp Leu Ile Ser Thr Asp Gly Asp Trp Asp Thr Leu Leu
145 150 155 160
Thr Asp Lys Val Ser Arg Phe Ser Phe Thr Thr Arg Arg Glu Tyr His
165 170 175
Leu Arg Asp Met Tyr Glu His His Asn Val Asp Asp Val Glu Gln Phe
180 185 190
Ile Ser Leu Lys Ala Ile Met Gly Asp Leu Gly Asp Asn Ile Arg Gly
195 200 205
Val Glu Gly Ile Gly Ala Lys Arg Gly Tyr Asn Ile Ile Arg Glu Phe
210 215 220
Gly Asn Val Leu Asp Ile Ile Asp Gln Leu Pro Leu Pro Gly Lys Gln
225 230 235 240
Lys Tyr Ile Gln Asn Leu Asn Ala Ser Glu Glu Leu Leu Phe Arg Asn
245 250 255
Leu Ile Leu Val Asp Leu Pro Thr Tyr Cys Val Asp Ala Ile Ala Ala
260 265 270
Val Gly Gln Asp Val Leu Asp Lys Phe Thr Lys Asp Ile Leu Glu Ile
275 280 285
Ala Glu Gln Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly
290 295 300
Gly Ser Gly Ser Met Ala Pro Lys Lys Lys Arg Lys Val Gly Ile His
305 310 315 320
Gly Val Pro Ala Ala Met Asp Lys Lys Tyr Ser Ile Gly Leu Asp Ile
325 330 335
Gly Thr Asn Ser Val Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val
340 345 350
Pro Ser Lys Lys Phe Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile
355 360 365
Lys Lys Asn Leu Ile Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala
370 375 380
Glu Ala Thr Arg Leu Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg
385 390 395 400
Lys Asn Arg Ile Cys Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala
405 410 415
Lys Val Asp Asp Ser Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val
420 425 430
Glu Glu Asp Lys Lys His Glu Arg His Pro Ile Phe Gly Asn Ile Val
435 440 445
Asp Glu Val Ala Tyr His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg
450 455 460
Lys Lys Leu Val Asp Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr
465 470 475 480
Leu Ala Leu Ala His Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu
485 490 495
Gly Asp Leu Asn Pro Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln
500 505 510
Leu Val Gln Thr Tyr Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala
515 520 525
Ser Gly Val Asp Ala Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser
530 535 540
Arg Arg Leu Glu Asn Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn
545 550 555 560
Gly Leu Phe Gly Asn Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn
565 570 575
Phe Lys Ser Asn Phe Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser
580 585 590
Lys Asp Thr Tyr Asp Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly
595 600 605
Asp Gln Tyr Ala Asp Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala
610 615 620
Ile Leu Leu Ser Asp Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala
625 630 635 640
Pro Leu Ser Ala Ser Met Ile Lys Arg Tyr Asp Glu His His Gln Asp
645 650 655
Leu Thr Leu Leu Lys Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr
660 665 670
Lys Glu Ile Phe Phe Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile
675 680 685
Asp Gly Gly Ala Ser Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile
690 695 700
Leu Glu Lys Met Asp Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg
705 710 715 720
Glu Asp Leu Leu Arg Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro
725 730 735
His Gln Ile His Leu Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu
740 745 750
Asp Phe Tyr Pro Phe Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile
755 760 765
Leu Thr Phe Arg Ile Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn
770 775 780
Ser Arg Phe Ala Trp Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro
785 790 795 800
Trp Asn Phe Glu Glu Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe
805 810 815
Ile Glu Arg Met Thr Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val
820 825 830
Leu Pro Lys His Ser Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu
835 840 845
Leu Thr Lys Val Lys Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe
850 855 860
Leu Ser Gly Glu Gln Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr
865 870 875 880
Asn Arg Lys Val Thr Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys
885 890 895
Ile Glu Cys Phe Asp Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe
900 905 910
Asn Ala Ser Leu Gly Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp
915 920 925
Lys Asp Phe Leu Asp Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile
930 935 940
Val Leu Thr Leu Thr Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg
945 950 955 960
Leu Lys Thr Tyr Ala His Leu Phe Asp Asp Lys Val Met Lys Gln Leu
965 970 975
Lys Arg Arg Arg Tyr Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile
980 985 990
Asn Gly Ile Arg Asp Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu
995 1000 1005
Lys Ser Asp Gly Phe Ala Asn Arg Asn Phe Met Gln Leu Ile His
1010 1015 1020
Asp Asp Ser Leu Thr Phe Lys Glu Asp Ile Gln Lys Ala Gln Val
1025 1030 1035
Ser Gly Gln Gly Asp Ser Leu His Glu His Ile Ala Asn Leu Ala
1040 1045 1050
Gly Ser Pro Ala Ile Lys Lys Gly Ile Leu Gln Thr Val Lys Val
1055 1060 1065
Val Asp Glu Leu Val Lys Val Met Gly Arg His Lys Pro Glu Asn
1070 1075 1080
Ile Val Ile Glu Met Ala Arg Glu Asn Gln Thr Thr Gln Lys Gly
1085 1090 1095
Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile Glu Glu Gly Ile
1100 1105 1110
Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro Val Glu Asn
1115 1120 1125
Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu Gln Asn
1130 1135 1140
Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg Leu
1145 1150 1155
Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys
1160 1165 1170
Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn
1175 1180 1185
Arg Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys
1190 1195 1200
Met Lys Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr
1205 1210 1215
Gln Arg Lys Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu
1220 1225 1230
Ser Glu Leu Asp Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu
1235 1240 1245
Thr Arg Gln Ile Thr Lys His Val Ala Gln Ile Leu Asp Ser Arg
1250 1255 1260
Met Asn Thr Lys Tyr Asp Glu Asn Asp Lys Leu Ile Arg Glu Val
1265 1270 1275
Lys Val Ile Thr Leu Lys Ser Lys Leu Val Ser Asp Phe Arg Lys
1280 1285 1290
Asp Phe Gln Phe Tyr Lys Val Arg Glu Ile Asn Asn Tyr His His
1295 1300 1305
Ala His Asp Ala Tyr Leu Asn Ala Val Val Gly Thr Ala Leu Ile
1310 1315 1320
Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe Val Tyr Gly Asp Tyr
1325 1330 1335
Lys Val Tyr Asp Val Arg Lys Met Ile Ala Lys Ser Glu Gln Glu
1340 1345 1350
Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe Tyr Ser Asn Ile Met
1355 1360 1365
Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala Asn Gly Glu Ile Arg
1370 1375 1380
Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu Thr Gly Glu Ile Val
1385 1390 1395
Trp Asp Lys Gly Arg Asp Phe Ala Thr Val Arg Lys Val Leu Ser
1400 1405 1410
Met Pro Gln Val Asn Ile Val Lys Lys Thr Glu Val Gln Thr Gly
1415 1420 1425
Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys Arg Asn Ser Asp Lys
1430 1435 1440
Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro Lys Lys Tyr Gly Gly
1445 1450 1455
Phe Asp Ser Pro Thr Val Ala Tyr Ser Val Leu Val Val Ala Lys
1460 1465 1470
Val Glu Lys Gly Lys Ser Lys Lys Leu Lys Ser Val Lys Glu Leu
1475 1480 1485
Leu Gly Ile Thr Ile Met Glu Arg Ser Ser Phe Glu Lys Asn Pro
1490 1495 1500
Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys Glu Val Lys Lys Asp
1505 1510 1515
Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu Phe Glu Leu Glu Asn
1520 1525 1530
Gly Arg Lys Arg Met Leu Ala Ser Ala Gly Glu Leu Gln Lys Gly
1535 1540 1545
Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val Asn Phe Leu Tyr Leu
1550 1555 1560
Ala Ser His Tyr Glu Lys Leu Lys Gly Ser Pro Glu Asp Asn Glu
1565 1570 1575
Gln Lys Gln Leu Phe Val Glu Gln His Lys His Tyr Leu Asp Glu
1580 1585 1590
Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys Arg Val Ile Leu Ala
1595 1600 1605
Asp Ala Asn Leu Asp Lys Val Leu Ser Ala Tyr Asn Lys His Arg
1610 1615 1620
Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn Ile Ile His Leu Phe
1625 1630 1635
Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala Phe Lys Tyr Phe Asp
1640 1645 1650
Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser Thr Lys Glu Val Leu
1655 1660 1665
Asp Ala Thr Leu Ile His Gln Ser Ile Thr Gly Leu Tyr Glu Thr
1670 1675 1680
Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp Lys Arg Pro Ala Ala
1685 1690 1695
Thr Lys Lys Ala Gly Gln Ala Lys Lys Lys Lys
1700 1705
<210> 2
<211> 5130
<212> DNA
<213> ARTIFICIAL SEQUENCE
<220>
<223> T5exo-Linker-NLS1-Cas9-NLS2
<400> 2
atgtcaaagt cttggggcaa gttcatcgag gaggaggagg ccgagatggc gtcaaggcgc 60
aacctcatga ttgtcgacgg caccaatctg ggcttccggt tcaagcacaa caattctaag 120
aagcctttcg cctccagcta cgtgtccaca atccagagcc tcgccaagtc ctacagcgcg 180
cgcaccacaa ttgtgctggg cgacaagggc aagtcagtct tccggctgga gcatctgccg 240
gagtacaagg gcaacaggga tgagaagtac gcacagagga ccgaggagga gaaggcactc 300
gatgagcagt tcttcgagta cctcaaggac gccttcgagc tgtgcaagac cacattccca 360
accttcacaa tcaggggagt ggaggcagac gatatggcag cgtacatcgt caagctcatt 420
ggccacctgt acgatcatgt gtggctcatt tccacagacg gcgattggga caccctcctg 480
acagacaagg tctcacggtt ctctttcacc acacggaggg agtaccacct gagggatatg 540
tacgagcacc ataacgtgga cgatgtcgag cagttcatca gcctcaaggc cattatgggc 600
gatctgggcg acaatatcag gggagtcgag ggaattggag caaagagggg ctacaacatc 660
attcgggagt tcggcaatgt gctcgatatc attgaccagc tcccgctgcc aggcaagcag 720
aagtacatcc agaacctcaa tgcgtccgag gagctcctgt tccgcaatct catcctggtg 780
gatctgccga cctactgcgt cgacgcaatt gcagcagtgg gacaggatgt cctcgacaag 840
ttcacaaagg atatcctgga gattgcggag cagggtggag gcggaagtgg aggtggcggg 900
tcagggggtg gcggatctgg atccatggcc cctaagaaga agagaaaggt cggtattcac 960
ggcgttcctg cggcgatgga caagaagtat agtattggtc tggacattgg gacgaattcc 1020
gttggctggg ccgtgatcac cgatgagtac aaggtccctt ccaagaagtt taaggttctg 1080
gggaacaccg atcggcacag catcaagaag aatctcattg gagccctcct gttcgactca 1140
ggcgagaccg ccgaagcaac aaggctcaag agaaccgcaa ggagacggta tacaagaagg 1200
aagaatagga tctgctacct gcaggagatt ttcagcaacg aaatggcgaa ggtggacgat 1260
tcgttctttc atagattgga ggagagtttc ctcgtcgagg aagataagaa gcacgagagg 1320
catcctatct ttggcaacat tgtcgacgag gttgcctatc acgaaaagta ccccacaatc 1380
tatcatctgc ggaagaagct tgtggactcg actgataagg cggaccttag attgatctac 1440
ctcgctctgg cacacatgat taagttcagg ggccattttc tgatcgaggg ggatcttaac 1500
ccggacaata gcgatgtgga caagttgttc atccagctcg tccaaaccta caatcagctc 1560
tttgaggaaa acccaattaa tgcttcaggc gtcgacgcca aggcgatcct gtctgcacgc 1620
ctttcaaagt ctcgccggct tgagaacttg atcgctcaac tcccgggcga aaagaagaac 1680
ggcttgttcg ggaatctcat tgcactttcg ttggggctca caccaaactt caagagtaat 1740
tttgatctcg ctgaggacgc aaagctgcag ctttccaagg acacttatga cgatgacctg 1800
gataaccttt tggcccaaat cggcgatcag tacgcggact tgttcctcgc cgcgaagaat 1860
ttgtcggacg cgatcctcct gagtgatatt ctccgcgtga acaccgagat tacaaaggcc 1920
ccgctctcgg cgagtatgat caagcgctat gacgagcacc atcaggatct gacccttttg 1980
aaggctttgg tccggcagca actcccagag aagtacaagg aaatcttctt tgatcaatcc 2040
aagaacggct acgctggtta tattgacggc ggggcatcgc aggaggaatt ctacaagttt 2100
atcaagccaa ttctggagaa gatggatggc acagaggaac tcctggtgaa gctcaatagg 2160
gaggaccttt tgcggaagca aagaactttc gataacggca gcatccctca ccagattcat 2220
ctcggggagc tgcacgccat cctgagaagg caggaagact tctacccctt tcttaaggat 2280
aaccgggaga agatcgaaaa gattctgacg ttcagaattc cgtactatgt cggaccactc 2340
gcccggggta attccagatt tgcgtggatg accagaaaga gcgaggaaac catcacacct 2400
tggaacttcg aggaagtggt cgataagggc gcttccgcac agagcttcat tgagcgcatg 2460
acaaattttg acaagaacct gcctaatgag aaggtccttc ccaagcattc cctcctgtac 2520
gagtatttca ctgtttataa cgaactcacg aaggtgaagt atgtgaccga gggaatgcgc 2580
aagcccgcct tcctgagcgg cgagcaaaag aaggcgatcg tggacctttt gtttaagacc 2640
aatcggaagg tcacagttaa gcagctcaag gaggactact tcaagaagat tgaatgcttc 2700
gattccgttg agatcagcgg cgtggaagac aggtttaacg cgtcactggg gacttaccac 2760
gatctcctga agatcattaa ggataaggac ttcttggaca acgaggaaaa tgaggatatc 2820
ctcgaagaca ttgtcctgac tcttacgttg tttgaggata gggaaatgat cgaggaacgc 2880
ttgaagacgt atgcccatct cttcgatgac aaggttatga agcagctcaa gagaagaaga 2940
tacaccggat ggggaaggct gtcccgcaag cttatcaatg gcattagaga caagcaatca 3000
gggaagacaa tccttgactt tttgaagtct gatggcttcg cgaacaggaa ttttatgcag 3060
ctgattcacg atgactcact tactttcaag gaggatatcc agaaggctca agtgtcggga 3120
caaggtgaca gtctgcacga gcatatcgcc aaccttgcgg gatctcctgc aatcaagaag 3180
ggtattctgc agacagtcaa ggttgtggat gagcttgtga aggtcatggg acggcataag 3240
cccgagaaca tcgttattga gatggccaga gaaaatcaga ccacacaaaa gggtcagaag 3300
aactcgaggg agcgcatgaa gcgcatcgag gaaggcatta aggagctggg gagtcagatc 3360
cttaaggagc acccggtgga aaacacgcag ttgcaaaatg agaagctcta tctgtactat 3420
ctgcaaaatg gcagggatat gtatgtggac caggagttgg atattaaccg cctctcggat 3480
tacgacgtcg atcatatcgt tcctcagtcc ttccttaagg atgacagcat tgacaataag 3540
gttctcacca ggtccgacaa gaaccgcggg aagtccgata atgtgcccag cgaggaagtc 3600
gttaagaaga tgaagaacta ctggaggcaa cttttgaatg ccaagttgat cacacagagg 3660
aagtttgata acctcactaa ggccgagcgc ggaggtctca gcgaactgga caaggcgggc 3720
ttcattaagc ggcaactggt tgagactaga cagatcacga agcacgtggc gcagattctc 3780
gattcacgca tgaacacgaa gtacgatgag aatgacaagc tgatccggga agtgaaggtc 3840
atcaccttga agtcaaagct cgtttctgac ttcaggaagg atttccaatt ttataaggtg 3900
cgcgagatca acaattatca ccatgctcat gacgcatacc tcaacgctgt ggtcggaaca 3960
gcattgatta agaagtaccc gaagctcgag tccgaattcg tgtacggtga ctataaggtt 4020
tacgatgtgc gcaagatgat cgccaagtca gagcaggaaa ttggcaaggc cactgcgaag 4080
tatttctttt actctaacat tatgaatttc tttaagactg agatcacgct ggctaatggc 4140
gaaatccgga agagaccact tattgagacc aacggcgaga caggggaaat cgtgtgggac 4200
aaggggaggg atttcgccac agtccgcaag gttctctcta tgcctcaagt gaatattgtc 4260
aagaagactg aagtccagac gggcgggttc tcaaaggaat ctattctgcc caagcggaac 4320
tcggataagc ttatcgccag aaagaaggac tgggacccga agaagtatgg aggtttcgac 4380
tcaccaacgg tggcttactc tgtcctggtt gtggcaaagg tggagaaggg aaagtcaaag 4440
aagctcaagt ctgtcaagga gctcctgggt atcaccatta tggagaggtc cagcttcgaa 4500
aagaatccga tcgattttct cgaggcgaag ggatataagg aagtgaagaa ggacctgatc 4560
attaagcttc caaagtacag tcttttcgag ttggaaaacg gcaggaagcg catgttggct 4620
tccgcaggag agctccagaa gggtaacgag cttgctttgc cgtccaagta tgtgaacttc 4680
ctctatctgg catcccacta cgagaagctc aagggcagcc cagaggataa cgaacagaag 4740
caactgtttg tggagcaaca caagcattat cttgacgaga tcattgaaca gatttcggag 4800
ttcagtaagc gcgtcatcct cgccgacgcg aatttggata aggttctctc agcctacaac 4860
aagcaccggg acaagcctat cagagagcag gcggaaaata tcattcatct cttcaccctg 4920
acaaaccttg gggctcccgc tgcattcaag tattttgaca ctacgattga tcggaagaga 4980
tacacttcta cgaaggaggt gctggatgca acccttatcc accaatcgat tactggcctc 5040
tacgagacgc ggatcgactt gagtcagctc gggggggata agagaccagc ggcaaccaag 5100
aaggcaggac aagcgaagaa gaagaagtag 5130
<210> 3
<211> 290
<212> PRT
<213> ARTIFICIAL SEQUENCE
<220>
<223> ARTIFICIAL SEQUENCE
<400> 3
Ser Lys Ser Trp Gly Lys Phe Ile Glu Glu Glu Glu Ala Glu Met Ala
1 5 10 15
Ser Arg Arg Asn Leu Met Ile Val Asp Gly Thr Asn Leu Gly Phe Arg
20 25 30
Phe Lys His Asn Asn Ser Lys Lys Pro Phe Ala Ser Ser Tyr Val Ser
35 40 45
Thr Ile Gln Ser Leu Ala Lys Ser Tyr Ser Ala Arg Thr Thr Ile Val
50 55 60
Leu Gly Asp Lys Gly Lys Ser Val Phe Arg Leu Glu His Leu Pro Glu
65 70 75 80
Tyr Lys Gly Asn Arg Asp Glu Lys Tyr Ala Gln Arg Thr Glu Glu Glu
85 90 95
Lys Ala Leu Asp Glu Gln Phe Phe Glu Tyr Leu Lys Asp Ala Phe Glu
100 105 110
Leu Cys Lys Thr Thr Phe Pro Thr Phe Thr Ile Arg Gly Val Glu Ala
115 120 125
Asp Asp Met Ala Ala Tyr Ile Val Lys Leu Ile Gly His Leu Tyr Asp
130 135 140
His Val Trp Leu Ile Ser Thr Asp Gly Asp Trp Asp Thr Leu Leu Thr
145 150 155 160
Asp Lys Val Ser Arg Phe Ser Phe Thr Thr Arg Arg Glu Tyr His Leu
165 170 175
Arg Asp Met Tyr Glu His His Asn Val Asp Asp Val Glu Gln Phe Ile
180 185 190
Ser Leu Lys Ala Ile Met Gly Asp Leu Gly Asp Asn Ile Arg Gly Val
195 200 205
Glu Gly Ile Gly Ala Lys Arg Gly Tyr Asn Ile Ile Arg Glu Phe Gly
210 215 220
Asn Val Leu Asp Ile Ile Asp Gln Leu Pro Leu Pro Gly Lys Gln Lys
225 230 235 240
Tyr Ile Gln Asn Leu Asn Ala Ser Glu Glu Leu Leu Phe Arg Asn Leu
245 250 255
Ile Leu Val Asp Leu Pro Thr Tyr Cys Val Asp Ala Ile Ala Ala Val
260 265 270
Gly Gln Asp Val Leu Asp Lys Phe Thr Lys Asp Ile Leu Glu Ile Ala
275 280 285
Glu Gln
290
<210> 4
<211> 870
<212> DNA
<213> ARTIFICIAL SEQUENCE
<220>
<223> ARTIFICIAL SEQUENCE
<400> 4
tcaaagtctt ggggcaagtt catcgaggag gaggaggccg agatggcgtc aaggcgcaac 60
ctcatgattg tcgacggcac caatctgggc ttccggttca agcacaacaa ttctaagaag 120
cctttcgcct ccagctacgt gtccacaatc cagagcctcg ccaagtccta cagcgcgcgc 180
accacaattg tgctgggcga caagggcaag tcagtcttcc ggctggagca tctgccggag 240
tacaagggca acagggatga gaagtacgca cagaggaccg aggaggagaa ggcactcgat 300
gagcagttct tcgagtacct caaggacgcc ttcgagctgt gcaagaccac attcccaacc 360
ttcacaatca ggggagtgga ggcagacgat atggcagcgt acatcgtcaa gctcattggc 420
cacctgtacg atcatgtgtg gctcatttcc acagacggcg attgggacac cctcctgaca 480
gacaaggtct cacggttctc tttcaccaca cggagggagt accacctgag ggatatgtac 540
gagcaccata acgtggacga tgtcgagcag ttcatcagcc tcaaggccat tatgggcgat 600
ctgggcgaca atatcagggg agtcgaggga attggagcaa agaggggcta caacatcatt 660
cgggagttcg gcaatgtgct cgatatcatt gaccagctcc cgctgccagg caagcagaag 720
tacatccaga acctcaatgc gtccgaggag ctcctgttcc gcaatctcat cctggtggat 780
ctgccgacct actgcgtcga cgcaattgca gcagtgggac aggatgtcct cgacaagttc 840
acaaaggata tcctggagat tgcggagcag 870
<210> 5
<211> 15
<212> PRT
<213> ARTIFICIAL SEQUENCE
<220>
<223> ARTIFICIAL SEQUENCE
<400> 5
Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser
1 5 10 15
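As a cross-check, the 15-residue linker of SEQ ID NO: 5 can be found in the coding sequence of SEQ ID NO: 2. The Python sketch below is a minimal illustration, not part of the listing. It assumes the standard genetic code and assumes that the linker is encoded by nucleotides 874-918 of SEQ ID NO: 2, a position inferred by reading the listing rather than stated in it. The names `translate`, `CODON_TABLE`, `ONE_TO_THREE` and `LINKER_DNA` are hypothetical.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons enumerated in t, c, a, g order.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Only the residues needed for this linker (assumption: it contains just Gly and Ser).
ONE_TO_THREE = {"G": "Gly", "S": "Ser"}


def translate(dna: str) -> str:
    """Translate DNA to one-letter amino acids, ignoring spaces and position numbers."""
    seq = "".join(ch for ch in dna.lower() if ch in BASES)
    usable = len(seq) - len(seq) % 3  # drop any incomplete trailing codon
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# Assumed linker-coding fragment, copied from nucleotides 874-918 of SEQ ID NO: 2.
LINKER_DNA = "ggtggaggcggaagtggaggtggcgggtcagggggtggcggatct"

linker_protein = translate(LINKER_DNA)
linker_three_letter = " ".join(ONE_TO_THREE[aa] for aa in linker_protein)
print(linker_three_letter)
```

The printed residues can be compared by eye with the <400> 5 entry above. The same `translate` helper could be applied to other fragments if `ONE_TO_THREE` is extended to all twenty amino acids.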
<210> 6
<211> 17
<212> PRT
<213> ARTIFICIAL SEQUENCE
<220>
<223> ARTIFICIAL SEQUENCE
<400> 6
Met Ala Pro Lys Lys Lys Arg Lys Val Gly Ile His Gly Val Pro Ala
1 5 10 15
Ala
<210> 7
<211> 16
<212> PRT
<213> ARTIFICIAL SEQUENCE
<220>
<223> ARTIFICIAL SEQUENCE
<400> 7
Lys Arg Pro Ala Ala Thr Lys Lys Ala Gly Gln Ala Lys Lys Lys Lys
1 5 10 15
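The <211> length of a short protein entry can be checked against its residue lines with a few lines of Python. The sketch below is illustrative only. It assumes that residue lines hold three-letter codes and that lines made up entirely of integers are position markers. The helper name `count_residues` and the constants `SEQ6` and `SEQ7` are hypothetical.

```python
import re


def count_residues(block: str) -> int:
    """Count three-letter residue codes in a protein block, skipping numbering lines."""
    total = 0
    for line in block.splitlines():
        tokens = line.split()
        if tokens and all(t.isdigit() for t in tokens):
            continue  # position-marker line such as "1 5 10 15"
        total += sum(1 for t in tokens if re.fullmatch(r"[A-Z][a-z]{2}", t))
    return total


# Residue and numbering lines copied from the <400> 6 and <400> 7 entries.
SEQ6 = """Met Ala Pro Lys Lys Lys Arg Lys Val Gly Ile His Gly Val Pro Ala
1 5 10 15
Ala"""
SEQ7 = """Lys Arg Pro Ala Ala Thr Lys Lys Ala Gly Gln Ala Lys Lys Lys Lys
1 5 10 15"""

seq6_len = count_residues(SEQ6)
seq7_len = count_residues(SEQ7)
print(seq6_len, seq7_len)  # prints "17 16", matching <211> 17 and <211> 16
```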
<210> 8
<211> 1368
<212> PRT
<213> ARTIFICIAL SEQUENCE
<220>
<223> ARTIFICIAL SEQUENCE
<400> 8
Met Asp Lys Lys Tyr Ser Ile Gly Leu Asp Ile Gly Thr Asn Ser Val
1 5 10 15
Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
20 25 30
Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile
35 40 45
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu
50 55 60
Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys
65 70 75 80
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
85 90 95
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
100 105 110
His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr
115 120 125
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
130 135 140
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
145 150 155 160
Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
165 170 175
Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr
180 185 190
Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala
195 200 205
Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn
210 215 220
Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn
225 230 235 240
Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
245 250 255
Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp
260 265 270
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp
275 280 285
Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
290 295 300
Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser
305 310 315 320
Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys
325 330 335
Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
340 345 350
Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser
355 360 365
Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
370 375 380
Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg
385 390 395 400
Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu
405 410 415
Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe
420 425 430
Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
435 440 445
Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp
450 455 460
Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu
465 470 475 480
Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
485 490 495
Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser
500 505 510
Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys
515 520 525
Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
530 535 540
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr
545 550 555 560
Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
565 570 575
Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
580 585 590
Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
595 600 605
Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
610 615 620
Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala
625 630 635 640
His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr
645 650 655
Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp
660 665 670
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
675 680 685
Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
690 695 700
Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
705 710 715 720
His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly
725 730 735
Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly
740 745 750
Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln
755 760 765
Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
770 775 780
Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro
785 790 795 800
Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
805 810 815
Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg
820 825 830
Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys
835 840 845
Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg
850 855 860
Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys
865 870 875 880
Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys
885 890 895
Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp
900 905 910
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr
915 920 925
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp
930 935 940
Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
945 950 955 960
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
965 970 975
Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
980 985 990
Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe
995 1000 1005
Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala
1010 1015 1020
Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe
1025 1030 1035
Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala
1040 1045 1050
Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu
1055 1060 1065
Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val
1070 1075 1080
Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr
1085 1090 1095
Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys
1100 1105 1110
Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro
1115 1120 1125
Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val
1130 1135 1140
Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys
1145 1150 1155
Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser
1160 1165 1170
Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys
1175 1180 1185
Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu
1190 1195 1200
Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Gly
1205 1210 1215
Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val
1220 1225 1230
Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser
1235 1240 1245
Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys
1250 1255 1260
His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys
1265 1270 1275
Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala
1280 1285 1290
Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn
1295 1300 1305
Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala
1310 1315 1320
Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser
1325 1330 1335
Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr
1340 1345 1350
Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp
1355 1360 1365
<210> 9
<211> 4104
<212> DNA
<213> ARTIFICIAL SEQUENCE
<220>
<223> ARTIFICIAL SEQUENCE
<400> 9
atggacaaga agtatagtat tggtctggac attgggacga attccgttgg ctgggccgtg 60
atcaccgatg agtacaaggt cccttccaag aagtttaagg ttctggggaa caccgatcgg 120
cacagcatca agaagaatct cattggagcc ctcctgttcg actcaggcga gaccgccgaa 180
gcaacaaggc tcaagagaac cgcaaggaga cggtatacaa gaaggaagaa taggatctgc 240
tacctgcagg agattttcag caacgaaatg gcgaaggtgg acgattcgtt ctttcataga 300
ttggaggaga gtttcctcgt cgaggaagat aagaagcacg agaggcatcc tatctttggc 360
aacattgtcg acgaggttgc ctatcacgaa aagtacccca caatctatca tctgcggaag 420
aagcttgtgg actcgactga taaggcggac cttagattga tctacctcgc tctggcacac 480
atgattaagt tcaggggcca ttttctgatc gagggggatc ttaacccgga caatagcgat 540
gtggacaagt tgttcatcca gctcgtccaa acctacaatc agctctttga ggaaaaccca 600
attaatgctt caggcgtcga cgccaaggcg atcctgtctg cacgcctttc aaagtctcgc 660
cggcttgaga acttgatcgc tcaactcccg ggcgaaaaga agaacggctt gttcgggaat 720
ctcattgcac tttcgttggg gctcacacca aacttcaaga gtaattttga tctcgctgag 780
gacgcaaagc tgcagctttc caaggacact tatgacgatg acctggataa ccttttggcc 840
caaatcggcg atcagtacgc ggacttgttc ctcgccgcga agaatttgtc ggacgcgatc 900
ctcctgagtg atattctccg cgtgaacacc gagattacaa aggccccgct ctcggcgagt 960
atgatcaagc gctatgacga gcaccatcag gatctgaccc ttttgaaggc tttggtccgg 1020
cagcaactcc cagagaagta caaggaaatc ttctttgatc aatccaagaa cggctacgct 1080
ggttatattg acggcggggc atcgcaggag gaattctaca agtttatcaa gccaattctg 1140
gagaagatgg atggcacaga ggaactcctg gtgaagctca atagggagga ccttttgcgg 1200
aagcaaagaa ctttcgataa cggcagcatc cctcaccaga ttcatctcgg ggagctgcac 1260
gccatcctga gaaggcagga agacttctac ccctttctta aggataaccg ggagaagatc 1320
gaaaagattc tgacgttcag aattccgtac tatgtcggac cactcgcccg gggtaattcc 1380
agatttgcgt ggatgaccag aaagagcgag gaaaccatca caccttggaa cttcgaggaa 1440
gtggtcgata agggcgcttc cgcacagagc ttcattgagc gcatgacaaa ttttgacaag 1500
aacctgccta atgagaaggt ccttcccaag cattccctcc tgtacgagta tttcactgtt 1560
tataacgaac tcacgaaggt gaagtatgtg accgagggaa tgcgcaagcc cgccttcctg 1620
agcggcgagc aaaagaaggc gatcgtggac cttttgttta agaccaatcg gaaggtcaca 1680
gttaagcagc tcaaggagga ctacttcaag aagattgaat gcttcgattc cgttgagatc 1740
agcggcgtgg aagacaggtt taacgcgtca ctggggactt accacgatct cctgaagatc 1800
attaaggata aggacttctt ggacaacgag gaaaatgagg atatcctcga agacattgtc 1860
ctgactctta cgttgtttga ggatagggaa atgatcgagg aacgcttgaa gacgtatgcc 1920
catctcttcg atgacaaggt tatgaagcag ctcaagagaa gaagatacac cggatgggga 1980
aggctgtccc gcaagcttat caatggcatt agagacaagc aatcagggaa gacaatcctt 2040
gactttttga agtctgatgg cttcgcgaac aggaatttta tgcagctgat tcacgatgac 2100
tcacttactt tcaaggagga tatccagaag gctcaagtgt cgggacaagg tgacagtctg 2160
cacgagcata tcgccaacct tgcgggatct cctgcaatca agaagggtat tctgcagaca 2220
gtcaaggttg tggatgagct tgtgaaggtc atgggacggc ataagcccga gaacatcgtt 2280
attgagatgg ccagagaaaa tcagaccaca caaaagggtc agaagaactc gagggagcgc 2340
atgaagcgca tcgaggaagg cattaaggag ctggggagtc agatccttaa ggagcacccg 2400
gtggaaaaca cgcagttgca aaatgagaag ctctatctgt actatctgca aaatggcagg 2460
gatatgtatg tggaccagga gttggatatt aaccgcctct cggattacga cgtcgatcat 2520
atcgttcctc agtccttcct taaggatgac agcattgaca ataaggttct caccaggtcc 2580
gacaagaacc gcgggaagtc cgataatgtg cccagcgagg aagtcgttaa gaagatgaag 2640
aactactgga ggcaactttt gaatgccaag ttgatcacac agaggaagtt tgataacctc 2700
actaaggccg agcgcggagg tctcagcgaa ctggacaagg cgggcttcat taagcggcaa 2760
ctggttgaga ctagacagat cacgaagcac gtggcgcaga ttctcgattc acgcatgaac 2820
acgaagtacg atgagaatga caagctgatc cgggaagtga aggtcatcac cttgaagtca 2880
aagctcgttt ctgacttcag gaaggatttc caattttata aggtgcgcga gatcaacaat 2940
tatcaccatg ctcatgacgc atacctcaac gctgtggtcg gaacagcatt gattaagaag 3000
tacccgaagc tcgagtccga attcgtgtac ggtgactata aggtttacga tgtgcgcaag 3060
atgatcgcca agtcagagca ggaaattggc aaggccactg cgaagtattt cttttactct 3120
aacattatga atttctttaa gactgagatc acgctggcta atggcgaaat ccggaagaga 3180
ccacttattg agaccaacgg cgagacaggg gaaatcgtgt gggacaaggg gagggatttc 3240
gccacagtcc gcaaggttct ctctatgcct caagtgaata ttgtcaagaa gactgaagtc 3300
cagacgggcg ggttctcaaa ggaatctatt ctgcccaagc ggaactcgga taagcttatc 3360
gccagaaaga aggactggga cccgaagaag tatggaggtt tcgactcacc aacggtggct 3420
tactctgtcc tggttgtggc aaaggtggag aagggaaagt caaagaagct caagtctgtc 3480
aaggagctcc tgggtatcac cattatggag aggtccagct tcgaaaagaa tccgatcgat 3540
tttctcgagg cgaagggata taaggaagtg aagaaggacc tgatcattaa gcttccaaag 3600
tacagtcttt tcgagttgga aaacggcagg aagcgcatgt tggcttccgc aggagagctc 3660
cagaagggta acgagcttgc tttgccgtcc aagtatgtga acttcctcta tctggcatcc 3720
cactacgaga agctcaaggg cagcccagag gataacgaac agaagcaact gtttgtggag 3780
caacacaagc attatcttga cgagatcatt gaacagattt cggagttcag taagcgcgtc 3840
atcctcgccg acgcgaattt ggataaggtt ctctcagcct acaacaagca ccgggacaag 3900
cctatcagag agcaggcgga aaatatcatt catctcttca ccctgacaaa ccttggggct 3960
cccgctgcat tcaagtattt tgacactacg attgatcgga agagatacac ttctacgaag 4020
gaggtgctgg atgcaaccct tatccaccaa tcgattactg gcctctacga gacgcggatc 4080
gacttgagtc agctcggggg ggat 4104
<210> 10
<211> 1566
<212> PRT
<213> ARTIFICIAL SEQUENCE
<220>
<223> ARTIFICIAL SEQUENCE
<400> 10
Met Ala Pro Lys Lys Lys Arg Lys Val Gly Ile His Gly Val Pro Ala
1 5 10 15
Ala Ser Lys Ser Trp Gly Lys Phe Ile Glu Glu Glu Glu Ala Glu Met
20 25 30
Ala Ser Arg Arg Asn Leu Met Ile Val Asp Gly Thr Asn Leu Gly Phe
35 40 45
Arg Phe Lys His Asn Asn Ser Lys Lys Pro Phe Ala Ser Ser Tyr Val
50 55 60
Ser Thr Ile Gln Ser Leu Ala Lys Ser Tyr Ser Ala Arg Thr Thr Ile
65 70 75 80
Val Leu Gly Asp Lys Gly Lys Ser Val Phe Arg Leu Glu His Leu Pro
85 90 95
Glu Tyr Lys Gly Asn Arg Asp Glu Lys Tyr Ala Gln Arg Thr Glu Glu
100 105 110
Glu Lys Ala Leu Asp Glu Gln Phe Phe Glu Tyr Leu Lys Asp Ala Phe
115 120 125
Glu Leu Cys Lys Thr Thr Phe Pro Thr Phe Thr Ile Arg Gly Val Glu
130 135 140
Ala Asp Asp Met Ala Ala Tyr Ile Val Lys Leu Ile Gly His Leu Tyr
145 150 155 160
Asp His Val Trp Leu Ile Ser Thr Asp Gly Asp Trp Asp Thr Leu Leu
165 170 175
Thr Asp Lys Val Ser Arg Phe Ser Phe Thr Thr Arg Arg Glu Tyr His
180 185 190
Leu Arg Asp Met Tyr Glu His His Asn Val Asp Asp Val Glu Gln Phe
195 200 205
Ile Ser Leu Lys Ala Ile Met Gly Asp Leu Gly Asp Asn Ile Arg Gly
210 215 220
Val Glu Gly Ile Gly Ala Lys Arg Gly Tyr Asn Ile Ile Arg Glu Phe
225 230 235 240
Gly Asn Val Leu Asp Ile Ile Asp Gln Leu Pro Leu Pro Gly Lys Gln
245 250 255
Lys Tyr Ile Gln Asn Leu Asn Ala Ser Glu Glu Leu Leu Phe Arg Asn
260 265 270
Leu Ile Leu Val Asp Leu Pro Thr Tyr Cys Val Asp Ala Ile Ala Ala
275 280 285
Val Gly Gln Asp Val Leu Asp Lys Phe Thr Lys Asp Ile Leu Glu Ile
290 295 300
Ala Glu Gln Ser Gly Ser Glu Thr Pro Gly Thr Ser Glu Ser Ala Thr
305 310 315 320
Pro Glu Ser Ser Lys Leu Glu Lys Phe Thr Asn Cys Tyr Ser Leu Ser
325 330 335
Lys Thr Leu Arg Phe Lys Ala Ile Pro Val Gly Lys Thr Gln Glu Asn
340 345 350
Ile Asp Asn Lys Arg Leu Leu Val Glu Asp Glu Lys Arg Ala Glu Asp
355 360 365
Tyr Lys Gly Val Lys Lys Leu Leu Asp Arg Tyr Tyr Leu Ser Phe Ile
370 375 380
Asn Asp Val Leu His Ser Ile Lys Leu Lys Asn Leu Asn Asn Tyr Ile
385 390 395 400
Ser Leu Phe Arg Lys Lys Thr Arg Thr Glu Lys Glu Asn Lys Glu Leu
405 410 415
Glu Asn Leu Glu Ile Asn Leu Arg Lys Glu Ile Ala Lys Ala Phe Lys
420 425 430
Gly Asn Glu Gly Tyr Lys Ser Leu Phe Lys Lys Asp Ile Ile Glu Thr
435 440 445
Ile Leu Pro Glu Phe Leu Asp Asp Lys Asp Glu Ile Ala Leu Val Asn
450 455 460
Ser Phe Asn Gly Phe Thr Thr Ala Phe Thr Gly Phe Phe Asp Asn Arg
465 470 475 480
Glu Asn Met Phe Ser Glu Glu Ala Lys Ser Thr Ser Ile Ala Phe Arg
485 490 495
Cys Ile Asn Glu Asn Leu Thr Arg Tyr Ile Ser Asn Met Asp Ile Phe
500 505 510
Glu Lys Val Asp Ala Ile Phe Asp Lys His Glu Val Gln Glu Ile Lys
515 520 525
Glu Lys Ile Leu Asn Ser Asp Tyr Asp Val Glu Asp Phe Phe Glu Gly
530 535 540
Glu Phe Phe Asn Phe Val Leu Thr Gln Glu Gly Ile Asp Val Tyr Asn
545 550 555 560
Ala Ile Ile Gly Gly Phe Val Thr Glu Ser Gly Glu Lys Ile Lys Gly
565 570 575
Leu Asn Glu Tyr Ile Asn Leu Tyr Asn Gln Lys Thr Lys Gln Lys Leu
580 585 590
Pro Lys Phe Lys Pro Leu Tyr Lys Gln Val Leu Ser Asp Arg Glu Ser
595 600 605
Leu Ser Phe Tyr Gly Glu Gly Tyr Thr Ser Asp Glu Glu Val Leu Glu
610 615 620
Val Phe Arg Asn Thr Leu Asn Lys Asn Ser Glu Ile Phe Ser Ser Ile
625 630 635 640
Lys Lys Leu Glu Lys Leu Phe Lys Asn Phe Asp Glu Tyr Ser Ser Ala
645 650 655
Gly Ile Phe Val Lys Asn Gly Pro Ala Ile Ser Thr Ile Ser Lys Asp
660 665 670
Ile Phe Gly Glu Trp Asn Val Ile Arg Asp Lys Trp Asn Ala Glu Tyr
675 680 685
Asp Asp Ile His Leu Lys Lys Lys Ala Val Val Thr Glu Lys Tyr Glu
690 695 700
Asp Asp Arg Arg Lys Ser Phe Lys Lys Ile Gly Ser Phe Ser Leu Glu
705 710 715 720
Gln Leu Gln Glu Tyr Ala Asp Ala Asp Leu Ser Val Val Glu Lys Leu
725 730 735
Lys Glu Ile Ile Ile Gln Lys Val Asp Glu Ile Tyr Lys Val Tyr Gly
740 745 750
Ser Ser Glu Lys Leu Phe Asp Ala Asp Phe Val Leu Glu Lys Ser Leu
755 760 765
Lys Lys Asn Asp Ala Val Val Ala Ile Met Lys Asp Leu Leu Asp Ser
770 775 780
Val Lys Ser Phe Glu Asn Tyr Ile Lys Ala Phe Phe Gly Glu Gly Lys
785 790 795 800
Glu Thr Asn Arg Asp Glu Ser Phe Tyr Gly Asp Phe Val Leu Ala Tyr
805 810 815
Asp Ile Leu Leu Lys Val Asp His Ile Tyr Asp Ala Ile Arg Asn Tyr
820 825 830
Val Thr Gln Lys Pro Tyr Ser Lys Asp Lys Phe Lys Leu Tyr Phe Gln
835 840 845
Asn Pro Gln Phe Met Gly Gly Trp Asp Lys Asp Lys Glu Thr Asp Tyr
850 855 860
Arg Ala Thr Ile Leu Arg Tyr Gly Ser Lys Tyr Tyr Leu Ala Ile Met
865 870 875 880
Asp Lys Lys Tyr Ala Lys Cys Leu Gln Lys Ile Asp Lys Asp Asp Val
885 890 895
Asn Gly Asn Tyr Glu Lys Ile Asn Tyr Lys Leu Leu Pro Gly Pro Asn
900 905 910
Lys Met Leu Pro Lys Val Phe Phe Ser Lys Lys Trp Met Ala Tyr Tyr
915 920 925
Asn Pro Ser Glu Asp Ile Gln Lys Ile Tyr Lys Asn Gly Thr Phe Lys
930 935 940
Lys Gly Asp Met Phe Asn Leu Asn Asp Cys His Lys Leu Ile Asp Phe
945 950 955 960
Phe Lys Asp Ser Ile Ser Arg Tyr Pro Lys Trp Ser Asn Ala Tyr Asp
965 970 975
Phe Asn Phe Ser Glu Thr Glu Lys Tyr Lys Asp Ile Ala Gly Phe Tyr
980 985 990
Arg Glu Val Glu Glu Gln Gly Tyr Lys Val Ser Phe Glu Ser Ala Ser
995 1000 1005
Lys Lys Glu Val Asp Lys Leu Val Glu Glu Gly Lys Leu Tyr Met
1010 1015 1020
Phe Gln Ile Tyr Asn Lys Asp Phe Ser Asp Lys Ser His Gly Thr
1025 1030 1035
Pro Asn Leu His Thr Met Tyr Phe Lys Leu Leu Phe Asp Glu Asn
1040 1045 1050
Asn His Gly Gln Ile Arg Leu Ser Gly Gly Ala Glu Leu Phe Met
1055 1060 1065
Arg Arg Ala Ser Leu Lys Lys Glu Glu Leu Val Val His Pro Ala
1070 1075 1080
Asn Ser Pro Ile Ala Asn Lys Asn Pro Asp Asn Pro Lys Lys Thr
1085 1090 1095
Thr Thr Leu Ser Tyr Asp Val Tyr Lys Asp Lys Arg Phe Ser Glu
1100 1105 1110
Asp Gln Tyr Glu Leu His Ile Pro Ile Ala Ile Asn Lys Cys Pro
1115 1120 1125
Lys Asn Ile Phe Lys Ile Asn Thr Glu Val Arg Val Leu Leu Lys
1130 1135 1140
His Asp Asp Asn Pro Tyr Val Ile Gly Ile Asp Arg Gly Glu Arg
1145 1150 1155
Asn Leu Leu Tyr Ile Val Val Val Asp Gly Lys Gly Asn Ile Val
1160 1165 1170
Glu Gln Tyr Ser Leu Asn Glu Ile Ile Asn Asn Phe Asn Gly Ile
1175 1180 1185
Arg Ile Lys Thr Asp Tyr His Ser Leu Leu Asp Lys Lys Glu Lys
1190 1195 1200
Glu Arg Phe Glu Ala Arg Gln Asn Trp Thr Ser Ile Glu Asn Ile
1205 1210 1215
Lys Glu Leu Lys Ala Gly Tyr Ile Ser Gln Val Val His Lys Ile
1220 1225 1230
Cys Glu Leu Val Glu Lys Tyr Asp Ala Val Ile Ala Leu Glu Asp
1235 1240 1245
Leu Asn Ser Gly Phe Lys Asn Ser Arg Val Lys Val Glu Lys Gln
1250 1255 1260
Val Tyr Gln Lys Phe Glu Lys Met Leu Ile Asp Lys Leu Asn Tyr
1265 1270 1275
Met Val Asp Lys Lys Ser Asn Pro Cys Ala Thr Gly Gly Ala Leu
1280 1285 1290
Lys Gly Tyr Gln Ile Thr Asn Lys Phe Glu Ser Phe Lys Ser Met
1295 1300 1305
Ser Thr Gln Asn Gly Phe Ile Phe Tyr Ile Pro Ala Trp Leu Thr
1310 1315 1320
Ser Lys Ile Asp Pro Ser Thr Gly Phe Val Asn Leu Leu Lys Thr
1325 1330 1335
Lys Tyr Thr Ser Ile Ala Asp Ser Lys Lys Phe Ile Ser Ser Phe
1340 1345 1350
Asp Arg Ile Met Tyr Val Pro Glu Glu Asp Leu Phe Glu Phe Ala
1355 1360 1365
Leu Asp Tyr Lys Asn Phe Ser Arg Thr Asp Ala Asp Tyr Ile Lys
1370 1375 1380
Lys Trp Lys Leu Tyr Ser Tyr Gly Asn Arg Ile Arg Ile Phe Arg
1385 1390 1395
Asn Pro Lys Lys Asn Asn Val Phe Asp Trp Glu Glu Val Cys Leu
1400 1405 1410
Thr Ser Ala Tyr Lys Glu Leu Phe Asn Lys Tyr Gly Ile Asn Tyr
1415 1420 1425
Gln Gln Gly Asp Ile Arg Ala Leu Leu Cys Glu Gln Ser Asp Lys
1430 1435 1440
Ala Phe Tyr Ser Ser Phe Met Ala Leu Met Ser Leu Met Leu Gln
1445 1450 1455
Met Arg Asn Ser Ile Thr Gly Arg Thr Asp Val Asp Phe Leu Ile
1460 1465 1470
Ser Pro Val Lys Asn Ser Asp Gly Ile Phe Tyr Asp Ser Arg Asn
1475 1480 1485
Tyr Glu Ala Gln Glu Asn Ala Ile Leu Pro Lys Asn Ala Asp Ala
1490 1495 1500
Asn Gly Ala Tyr Asn Ile Ala Arg Lys Val Leu Trp Ala Ile Gly
1505 1510 1515
Gln Phe Lys Lys Ala Glu Asp Glu Lys Leu Asp Lys Val Lys Ile
1520 1525 1530
Ala Ile Ser Asn Lys Glu Trp Leu Glu Tyr Ala Gln Thr Ser Val
1535 1540 1545
Lys His Lys Arg Pro Ala Ala Thr Lys Lys Ala Gly Gln Ala Lys
1550 1555 1560
Lys Lys Lys
1565
<210> 11
<211> 4701
<212> DNA
<213> ARTIFICIAL SEQUENCE
<220>
<223> ARTIFICIAL SEQUENCE
<400> 11
atggctccta agaagaagcg gaaggttggt attcacgggg tgcctgcggc ttcaaagtct 60
tggggcaagt tcatcgagga ggaggaggcc gagatggcgt caaggcgcaa cctcatgatt 120
gtcgacggca ccaatctggg cttccggttc aagcacaaca attctaagaa gcctttcgcc 180
tccagctacg tgtccacaat ccagagcctc gccaagtcct acagcgcgcg caccacaatt 240
gtgctgggcg acaagggcaa gtcagtcttc cggctggagc atctgccgga gtacaagggc 300
aacagggatg agaagtacgc acagaggacc gaggaggaga aggcactcga tgagcagttc 360
ttcgagtacc tcaaggacgc cttcgagctg tgcaagacca cattcccaac cttcacaatc 420
aggggagtgg aggcagacga tatggcagcg tacatcgtca agctcattgg ccacctgtac 480
gatcatgtgt ggctcatttc cacagacggc gattgggaca ccctcctgac agacaaggtc 540
tcacggttct ctttcaccac acggagggag taccacctga gggatatgta cgagcaccat 600
aacgtggacg atgtcgagca gttcatcagc ctcaaggcca ttatgggcga tctgggcgac 660
aatatcaggg gagtcgaggg aattggagca aagaggggct acaacatcat tcgggagttc 720
ggcaatgtgc tcgatatcat tgaccagctc ccgctgccag gcaagcagaa gtacatccag 780
aacctcaatg cgtccgagga gctcctgttc cgcaatctca tcctggtgga tctgccgacc 840
tactgcgtcg acgcaattgc agcagtggga caggatgtcc tcgacaagtt cacaaaggat 900
atcctggaga ttgcggagca gtccggcagc gagacgccag gcacctccga gagcgctacg 960
cctgaatcgt caaagctcga gaaattcacc aactgttatt cgttgagcaa aacactgcgg 1020
tttaaagcga ttccagtcgg caagactcaa gagaatatag acaataagcg gctgttggtg 1080
gaagatgaaa agcgcgcgga agactacaaa ggggtgaaga agttgttgga cagatactac 1140
ctctctttta tcaatgatgt cttgcactca atcaaattga agaatctgaa caactacatc 1200
tccctcttca gaaagaaaac aaggacagaa aaggagaata aggaacttga aaatttggag 1260
atcaatctga ggaaagagat cgcgaaagcc tttaaaggca acgaaggata caaaagtctg 1320
ttcaagaagg atataattga gacaattttg ccagagttcc tcgatgacaa ggacgagatt 1380
gcgctggtca attcgttcaa cggattcaca acagcattca caggcttctt tgataatcgg 1440
gaaaatatgt tctctgagga ggcaaagtcc acttctattg cgttcaggtg tatcaatgag 1500
aatctcacta ggtacatttc caacatggat atctttgaga aggttgacgc aatttttgac 1560
aagcacgaag ttcaggagat taaggagaag atcctcaatt ccgattatga cgttgaggac 1620
ttcttcgaag gtgagttttt taatttcgtg ctcactcaag agggtatcga cgtgtataat 1680
gcgatcatcg gtgggttcgt gactgagtcc ggtgaaaaga ttaagggatt gaacgagtat 1740
atcaaccttt acaaccaaaa gacgaaacag aagctgccaa agttcaagcc tctttacaaa 1800
caggttcttt cagaccgcga gtcactctcg ttctatgggg agggctacac ttcggatgag 1860
gaagtcctgg aggtgttcag gaatactctc aataagaatt cggagatttt ctcttctata 1920
aaaaaactgg aaaagttgtt taagaatttt gacgaatact ctagcgccgg catatttgtg 1980
aaaaacggcc cggccatatc aacgataagt aaagatatct tcggcgaatg gaacgtgatc 2040
agagacaaat ggaacgcgga gtatgacgat attcacctga agaagaaggc tgtcgtaacg 2100
gagaagtacg aggatgatcg caggaaaagc ttcaaaaaga tcggaagttt cagcctggaa 2160
cagttgcagg agtatgctga cgccgatctt agcgtcgtcg agaagttgaa ggagataatc 2220
atccaaaagg tcgacgagat atataaagtc tatggatcaa gtgaaaaact gttcgacgcc 2280
gacttcgttt tggagaagtc cctgaagaag aacgacgctg ttgttgccat tatgaaggat 2340
ctgctcgaca gcgtgaagag tttcgagaac tatattaagg cttttttcgg ggaggggaag 2400
gagactaaca gagatgagtc cttctacgga gacttcgtcc tcgcgtacga tatactcctt 2460
aaggtagacc acatctacga cgcaatcaga aattacgtga cacaaaagcc gtacagcaag 2520
gacaagttca aactctactt ccagaacccc cagttcatgg gcggctggga caaggacaag 2580
gaaacggatt acagggctac gatcctgagg tatggttcaa aatactactt ggcgattatg 2640
gacaagaagt acgccaagtg tctccagaag attgacaaag acgatgtcaa tggcaattat 2700
gagaagatca actacaagct gcttccgggt ccgaacaaga tgctcccaaa ggttttcttc 2760
agcaagaaat ggatggccta ctataaccca agcgaggaca tccagaagat ttataagaac 2820
ggtacgttca agaagggcga catgttcaat cttaacgact gtcacaagct gatcgacttc 2880
ttcaaagact caattagccg gtacccaaag tggtctaacg cctatgactt caacttttcg 2940
gaaaccgaga agtacaagga tatagccgga ttttatagag aggtggaaga gcagggctac 3000
aaggtgtcat tcgagtccgc cagcaagaag gaagtggaca agctcgtgga agagggtaag 3060
ctctacatgt tccagattta taataaagac tttagcgata agagccacgg gacacctaat 3120
ctccacacaa tgtatttcaa gctgctcttc gacgagaata accacggcca aatcaggttg 3180
tcaggagggg ctgaactctt catgcggcgc gctagcctta agaaggagga gcttgtagtc 3240
caccctgcga atagtccaat tgcgaataag aacccggaca atcctaaaaa gactacaaca 3300
ttgagctacg acgtgtacaa ggataagagg ttttccgagg atcagtacga gctccacatc 3360
ccgattgcga tcaacaagtg cccaaagaat attttcaaga taaacacaga ggtgcgtgta 3420
ctcctgaagc atgacgacaa tccttacgtc attgggattg atcggggcga gaggaacctc 3480
ctctatattg tggtggtgga cgggaagggg aacatagtcg aacagtactc ccttaacgaa 3540
ataattaaca atttcaacgg catccgtatc aagaccgact accattcgtt gctggacaag 3600
aaggagaagg agagatttga ggcgcggcaa aattggacaa gtatcgagaa catcaaggaa 3660
ctcaaagcag gttatatctc tcaagttgtg cataagatat gcgagctggt tgagaagtat 3720
gacgcagtga tcgctcttga ggacctcaac tcgggcttta agaattctag agttaaagtg 3780
gagaagcagg tctatcaaaa gttcgagaag atgcttatag ataagctcaa ctacatggtc 3840
gataagaaat cgaacccatg tgccaccggc ggcgcactca aaggttacca aataacaaac 3900
aaattcgagt ccttcaaatc gatgagtact cagaatgggt tcatatttta tataccggcg 3960
tggcttacgt ctaagatcga cccgtcaact ggttttgtca acctgttgaa gacgaaatac 4020
acgtccattg ccgattcgaa aaagttcata tctagttttg atcgtattat gtacgtccca 4080
gaggaagatc ttttcgagtt tgctctcgac tacaaaaact tttcgcggac cgatgcggat 4140
tacattaaaa aatggaaact ctattcgtac ggcaacagaa tcaggatttt tcgcaaccct 4200
aagaagaata acgtctttga ttgggaggaa gtttgcttga ctagcgcgta caaggagctc 4260
tttaataagt atggcattaa ctaccaacag ggtgatatca gagcactgct ttgcgaacaa 4320
tctgacaagg ctttctactc atccttcatg gctttgatga gcctgatgct ccagatgaga 4380
aattcaatta caggcagaac cgacgtggat ttcttgatct ccccggttaa aaattctgat 4440
ggcatctttt acgatagcag gaactatgaa gcgcaagaga atgcgattct gccaaaaaat 4500
gcagacgcca acggtgccta taacatcgcc aggaaagtcc tgtgggcgat cggccagttc 4560
aaaaaggccg aagacgaaaa attggacaag gtcaaaatcg ctatcagcaa caaagagtgg 4620
ctggagtatg ctcagacatc cgtaaagcat aagcgtcctg ctgccaccaa aaaggccgga 4680
caggctaaga aaaagaagtg a 4701
<210> 12
<211> 7
<212> PRT
<213> ARTIFICIAL SEQUENCE
<220>
<223> ARTIFICIAL SEQUENCE
<400> 12
Pro Lys Lys Lys Arg Lys Val
1 5
<210> 13
<211> 16
<212> PRT
<213> ARTIFICIAL SEQUENCE
<220>
<223> ARTIFICIAL SEQUENCE
<400> 13
Lys Arg Pro Ala Ala Thr Lys Lys Ala Gly Gln Ala Lys Lys Lys Lys
1 5 10 15
<210> 14
<211> 16
<212> PRT
<213> ARTIFICIAL SEQUENCE
<220>
<223> ARTIFICIAL SEQUENCE
<400> 14
Ser Gly Ser Glu Thr Pro Gly Thr Ser Glu Ser Ala Thr Pro Glu Ser
1 5 10 15
<210> 15
<211> 1227
<212> PRT
<213> ARTIFICIAL SEQUENCE
<220>
<223> ARTIFICIAL SEQUENCE
<400> 15
Ser Lys Leu Glu Lys Phe Thr Asn Cys Tyr Ser Leu Ser Lys Thr Leu
1 5 10 15
Arg Phe Lys Ala Ile Pro Val Gly Lys Thr Gln Glu Asn Ile Asp Asn
20 25 30
Lys Arg Leu Leu Val Glu Asp Glu Lys Arg Ala Glu Asp Tyr Lys Gly
35 40 45
Val Lys Lys Leu Leu Asp Arg Tyr Tyr Leu Ser Phe Ile Asn Asp Val
50 55 60
Leu His Ser Ile Lys Leu Lys Asn Leu Asn Asn Tyr Ile Ser Leu Phe
65 70 75 80
Arg Lys Lys Thr Arg Thr Glu Lys Glu Asn Lys Glu Leu Glu Asn Leu
85 90 95
Glu Ile Asn Leu Arg Lys Glu Ile Ala Lys Ala Phe Lys Gly Asn Glu
100 105 110
Gly Tyr Lys Ser Leu Phe Lys Lys Asp Ile Ile Glu Thr Ile Leu Pro
115 120 125
Glu Phe Leu Asp Asp Lys Asp Glu Ile Ala Leu Val Asn Ser Phe Asn
130 135 140
Gly Phe Thr Thr Ala Phe Thr Gly Phe Phe Asp Asn Arg Glu Asn Met
145 150 155 160
Phe Ser Glu Glu Ala Lys Ser Thr Ser Ile Ala Phe Arg Cys Ile Asn
165 170 175
Glu Asn Leu Thr Arg Tyr Ile Ser Asn Met Asp Ile Phe Glu Lys Val
180 185 190
Asp Ala Ile Phe Asp Lys His Glu Val Gln Glu Ile Lys Glu Lys Ile
195 200 205
Leu Asn Ser Asp Tyr Asp Val Glu Asp Phe Phe Glu Gly Glu Phe Phe
210 215 220
Asn Phe Val Leu Thr Gln Glu Gly Ile Asp Val Tyr Asn Ala Ile Ile
225 230 235 240
Gly Gly Phe Val Thr Glu Ser Gly Glu Lys Ile Lys Gly Leu Asn Glu
245 250 255
Tyr Ile Asn Leu Tyr Asn Gln Lys Thr Lys Gln Lys Leu Pro Lys Phe
260 265 270
Lys Pro Leu Tyr Lys Gln Val Leu Ser Asp Arg Glu Ser Leu Ser Phe
275 280 285
Tyr Gly Glu Gly Tyr Thr Ser Asp Glu Glu Val Leu Glu Val Phe Arg
290 295 300
Asn Thr Leu Asn Lys Asn Ser Glu Ile Phe Ser Ser Ile Lys Lys Leu
305 310 315 320
Glu Lys Leu Phe Lys Asn Phe Asp Glu Tyr Ser Ser Ala Gly Ile Phe
325 330 335
Val Lys Asn Gly Pro Ala Ile Ser Thr Ile Ser Lys Asp Ile Phe Gly
340 345 350
Glu Trp Asn Val Ile Arg Asp Lys Trp Asn Ala Glu Tyr Asp Asp Ile
355 360 365
His Leu Lys Lys Lys Ala Val Val Thr Glu Lys Tyr Glu Asp Asp Arg
370 375 380
Arg Lys Ser Phe Lys Lys Ile Gly Ser Phe Ser Leu Glu Gln Leu Gln
385 390 395 400
Glu Tyr Ala Asp Ala Asp Leu Ser Val Val Glu Lys Leu Lys Glu Ile
405 410 415
Ile Ile Gln Lys Val Asp Glu Ile Tyr Lys Val Tyr Gly Ser Ser Glu
420 425 430
Lys Leu Phe Asp Ala Asp Phe Val Leu Glu Lys Ser Leu Lys Lys Asn
435 440 445
Asp Ala Val Val Ala Ile Met Lys Asp Leu Leu Asp Ser Val Lys Ser
450 455 460
Phe Glu Asn Tyr Ile Lys Ala Phe Phe Gly Glu Gly Lys Glu Thr Asn
465 470 475 480
Arg Asp Glu Ser Phe Tyr Gly Asp Phe Val Leu Ala Tyr Asp Ile Leu
485 490 495
Leu Lys Val Asp His Ile Tyr Asp Ala Ile Arg Asn Tyr Val Thr Gln
500 505 510
Lys Pro Tyr Ser Lys Asp Lys Phe Lys Leu Tyr Phe Gln Asn Pro Gln
515 520 525
Phe Met Gly Gly Trp Asp Lys Asp Lys Glu Thr Asp Tyr Arg Ala Thr
530 535 540
Ile Leu Arg Tyr Gly Ser Lys Tyr Tyr Leu Ala Ile Met Asp Lys Lys
545 550 555 560
Tyr Ala Lys Cys Leu Gln Lys Ile Asp Lys Asp Asp Val Asn Gly Asn
565 570 575
Tyr Glu Lys Ile Asn Tyr Lys Leu Leu Pro Gly Pro Asn Lys Met Leu
580 585 590
Pro Lys Val Phe Phe Ser Lys Lys Trp Met Ala Tyr Tyr Asn Pro Ser
595 600 605
Glu Asp Ile Gln Lys Ile Tyr Lys Asn Gly Thr Phe Lys Lys Gly Asp
610 615 620
Met Phe Asn Leu Asn Asp Cys His Lys Leu Ile Asp Phe Phe Lys Asp
625 630 635 640
Ser Ile Ser Arg Tyr Pro Lys Trp Ser Asn Ala Tyr Asp Phe Asn Phe
645 650 655
Ser Glu Thr Glu Lys Tyr Lys Asp Ile Ala Gly Phe Tyr Arg Glu Val
660 665 670
Glu Glu Gln Gly Tyr Lys Val Ser Phe Glu Ser Ala Ser Lys Lys Glu
675 680 685
Val Asp Lys Leu Val Glu Glu Gly Lys Leu Tyr Met Phe Gln Ile Tyr
690 695 700
Asn Lys Asp Phe Ser Asp Lys Ser His Gly Thr Pro Asn Leu His Thr
705 710 715 720
Met Tyr Phe Lys Leu Leu Phe Asp Glu Asn Asn His Gly Gln Ile Arg
725 730 735
Leu Ser Gly Gly Ala Glu Leu Phe Met Arg Arg Ala Ser Leu Lys Lys
740 745 750
Glu Glu Leu Val Val His Pro Ala Asn Ser Pro Ile Ala Asn Lys Asn
755 760 765
Pro Asp Asn Pro Lys Lys Thr Thr Thr Leu Ser Tyr Asp Val Tyr Lys
770 775 780
Asp Lys Arg Phe Ser Glu Asp Gln Tyr Glu Leu His Ile Pro Ile Ala
785 790 795 800
Ile Asn Lys Cys Pro Lys Asn Ile Phe Lys Ile Asn Thr Glu Val Arg
805 810 815
Val Leu Leu Lys His Asp Asp Asn Pro Tyr Val Ile Gly Ile Asp Arg
820 825 830
Gly Glu Arg Asn Leu Leu Tyr Ile Val Val Val Asp Gly Lys Gly Asn
835 840 845
Ile Val Glu Gln Tyr Ser Leu Asn Glu Ile Ile Asn Asn Phe Asn Gly
850 855 860
Ile Arg Ile Lys Thr Asp Tyr His Ser Leu Leu Asp Lys Lys Glu Lys
865 870 875 880
Glu Arg Phe Glu Ala Arg Gln Asn Trp Thr Ser Ile Glu Asn Ile Lys
885 890 895
Glu Leu Lys Ala Gly Tyr Ile Ser Gln Val Val His Lys Ile Cys Glu
900 905 910
Leu Val Glu Lys Tyr Asp Ala Val Ile Ala Leu Glu Asp Leu Asn Ser
915 920 925
Gly Phe Lys Asn Ser Arg Val Lys Val Glu Lys Gln Val Tyr Gln Lys
930 935 940
Phe Glu Lys Met Leu Ile Asp Lys Leu Asn Tyr Met Val Asp Lys Lys
945 950 955 960
Ser Asn Pro Cys Ala Thr Gly Gly Ala Leu Lys Gly Tyr Gln Ile Thr
965 970 975
Asn Lys Phe Glu Ser Phe Lys Ser Met Ser Thr Gln Asn Gly Phe Ile
980 985 990
Phe Tyr Ile Pro Ala Trp Leu Thr Ser Lys Ile Asp Pro Ser Thr Gly
995 1000 1005
Phe Val Asn Leu Leu Lys Thr Lys Tyr Thr Ser Ile Ala Asp Ser
1010 1015 1020
Lys Lys Phe Ile Ser Ser Phe Asp Arg Ile Met Tyr Val Pro Glu
1025 1030 1035
Glu Asp Leu Phe Glu Phe Ala Leu Asp Tyr Lys Asn Phe Ser Arg
1040 1045 1050
Thr Asp Ala Asp Tyr Ile Lys Lys Trp Lys Leu Tyr Ser Tyr Gly
1055 1060 1065
Asn Arg Ile Arg Ile Phe Arg Asn Pro Lys Lys Asn Asn Val Phe
1070 1075 1080
Asp Trp Glu Glu Val Cys Leu Thr Ser Ala Tyr Lys Glu Leu Phe
1085 1090 1095
Asn Lys Tyr Gly Ile Asn Tyr Gln Gln Gly Asp Ile Arg Ala Leu
1100 1105 1110
Leu Cys Glu Gln Ser Asp Lys Ala Phe Tyr Ser Ser Phe Met Ala
1115 1120 1125
Leu Met Ser Leu Met Leu Gln Met Arg Asn Ser Ile Thr Gly Arg
1130 1135 1140
Thr Asp Val Asp Phe Leu Ile Ser Pro Val Lys Asn Ser Asp Gly
1145 1150 1155
Ile Phe Tyr Asp Ser Arg Asn Tyr Glu Ala Gln Glu Asn Ala Ile
1160 1165 1170
Leu Pro Lys Asn Ala Asp Ala Asn Gly Ala Tyr Asn Ile Ala Arg
1175 1180 1185
Lys Val Leu Trp Ala Ile Gly Gln Phe Lys Lys Ala Glu Asp Glu
1190 1195 1200
Lys Leu Asp Lys Val Lys Ile Ala Ile Ser Asn Lys Glu Trp Leu
1205 1210 1215
Glu Tyr Ala Gln Thr Ser Val Lys His
1220 1225
<210> 16
<211> 3681
<212> DNA
<213> ARTIFICIAL SEQUENCE
<220>
<223> ARTIFICIAL SEQUENCE
<400> 16
tcaaagctcg agaaattcac caactgttat tcgttgagca aaacactgcg gtttaaagcg 60
attccagtcg gcaagactca agagaatata gacaataagc ggctgttggt ggaagatgaa 120
aagcgcgcgg aagactacaa aggggtgaag aagttgttgg acagatacta cctctctttt 180
atcaatgatg tcttgcactc aatcaaattg aagaatctga acaactacat ctccctcttc 240
agaaagaaaa caaggacaga aaaggagaat aaggaacttg aaaatttgga gatcaatctg 300
aggaaagaga tcgcgaaagc ctttaaaggc aacgaaggat acaaaagtct gttcaagaag 360
gatataattg agacaatttt gccagagttc ctcgatgaca aggacgagat tgcgctggtc 420
aattcgttca acggattcac aacagcattc acaggcttct ttgataatcg ggaaaatatg 480
ttctctgagg aggcaaagtc cacttctatt gcgttcaggt gtatcaatga gaatctcact 540
aggtacattt ccaacatgga tatctttgag aaggttgacg caatttttga caagcacgaa 600
gttcaggaga ttaaggagaa gatcctcaat tccgattatg acgttgagga cttcttcgaa 660
ggtgagtttt ttaatttcgt gctcactcaa gagggtatcg acgtgtataa tgcgatcatc 720
ggtgggttcg tgactgagtc cggtgaaaag attaagggat tgaacgagta tatcaacctt 780
tacaaccaaa agacgaaaca gaagctgcca aagttcaagc ctctttacaa acaggttctt 840
tcagaccgcg agtcactctc gttctatggg gagggctaca cttcggatga ggaagtcctg 900
gaggtgttca ggaatactct caataagaat tcggagattt tctcttctat aaaaaaactg 960
gaaaagttgt ttaagaattt tgacgaatac tctagcgccg gcatatttgt gaaaaacggc 1020
ccggccatat caacgataag taaagatatc ttcggcgaat ggaacgtgat cagagacaaa 1080
tggaacgcgg agtatgacga tattcacctg aagaagaagg ctgtcgtaac ggagaagtac 1140
gaggatgatc gcaggaaaag cttcaaaaag atcggaagtt tcagcctgga acagttgcag 1200
gagtatgctg acgccgatct tagcgtcgtc gagaagttga aggagataat catccaaaag 1260
gtcgacgaga tatataaagt ctatggatca agtgaaaaac tgttcgacgc cgacttcgtt 1320
ttggagaagt ccctgaagaa gaacgacgct gttgttgcca ttatgaagga tctgctcgac 1380
agcgtgaaga gtttcgagaa ctatattaag gcttttttcg gggaggggaa ggagactaac 1440
agagatgagt ccttctacgg agacttcgtc ctcgcgtacg atatactcct taaggtagac 1500
cacatctacg acgcaatcag aaattacgtg acacaaaagc cgtacagcaa ggacaagttc 1560
aaactctact tccagaaccc ccagttcatg ggcggctggg acaaggacaa ggaaacggat 1620
tacagggcta cgatcctgag gtatggttca aaatactact tggcgattat ggacaagaag 1680
tacgccaagt gtctccagaa gattgacaaa gacgatgtca atggcaatta tgagaagatc 1740
aactacaagc tgcttccggg tccgaacaag atgctcccaa aggttttctt cagcaagaaa 1800
tggatggcct actataaccc aagcgaggac atccagaaga tttataagaa cggtacgttc 1860
aagaagggcg acatgttcaa tcttaacgac tgtcacaagc tgatcgactt cttcaaagac 1920
tcaattagcc ggtacccaaa gtggtctaac gcctatgact tcaacttttc ggaaaccgag 1980
aagtacaagg atatagccgg attttataga gaggtggaag agcagggcta caaggtgtca 2040
ttcgagtccg ccagcaagaa ggaagtggac aagctcgtgg aagagggtaa gctctacatg 2100
ttccagattt ataataaaga ctttagcgat aagagccacg ggacacctaa tctccacaca 2160
atgtatttca agctgctctt cgacgagaat aaccacggcc aaatcaggtt gtcaggaggg 2220
gctgaactct tcatgcggcg cgctagcctt aagaaggagg agcttgtagt ccaccctgcg 2280
aatagtccaa ttgcgaataa gaacccggac aatcctaaaa agactacaac attgagctac 2340
gacgtgtaca aggataagag gttttccgag gatcagtacg agctccacat cccgattgcg 2400
atcaacaagt gcccaaagaa tattttcaag ataaacacag aggtgcgtgt actcctgaag 2460
catgacgaca atccttacgt cattgggatt gatcggggcg agaggaacct cctctatatt 2520
gtggtggtgg acgggaaggg gaacatagtc gaacagtact cccttaacga aataattaac 2580
aatttcaacg gcatccgtat caagaccgac taccattcgt tgctggacaa gaaggagaag 2640
gagagatttg aggcgcggca aaattggaca agtatcgaga acatcaagga actcaaagca 2700
ggttatatct ctcaagttgt gcataagata tgcgagctgg ttgagaagta tgacgcagtg 2760
atcgctcttg aggacctcaa ctcgggcttt aagaattcta gagttaaagt ggagaagcag 2820
gtctatcaaa agttcgagaa gatgcttata gataagctca actacatggt cgataagaaa 2880
tcgaacccat gtgccaccgg cggcgcactc aaaggttacc aaataacaaa caaattcgag 2940
tccttcaaat cgatgagtac tcagaatggg ttcatatttt atataccggc gtggcttacg 3000
tctaagatcg acccgtcaac tggttttgtc aacctgttga agacgaaata cacgtccatt 3060
gccgattcga aaaagttcat atctagtttt gatcgtatta tgtacgtccc agaggaagat 3120
cttttcgagt ttgctctcga ctacaaaaac ttttcgcgga ccgatgcgga ttacattaaa 3180
aaatggaaac tctattcgta cggcaacaga atcaggattt ttcgcaaccc taagaagaat 3240
aacgtctttg attgggagga agtttgcttg actagcgcgt acaaggagct ctttaataag 3300
tatggcatta actaccaaca gggtgatatc agagcactgc tttgcgaaca atctgacaag 3360
gctttctact catccttcat ggctttgatg agcctgatgc tccagatgag aaattcaatt 3420
acaggcagaa ccgacgtgga tttcttgatc tccccggtta aaaattctga tggcatcttt 3480
tacgatagca ggaactatga agcgcaagag aatgcgattc tgccaaaaaa tgcagacgcc 3540
aacggtgcct ataacatcgc caggaaagtc ctgtgggcga tcggccagtt caaaaaggcc 3600
gaagacgaaa aattggacaa ggtcaaaatc gctatcagca acaaagagtg gctggagtat 3660
gctcagacat ccgtaaagca t 3681
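
As a consistency check on the listing, the nucleotide sequences can be translated with the standard genetic code and compared with the amino acid sequences. The sketch below is illustrative only and is not part of the patent. The function name `translate` and the two short fragments (the first 60 nucleotides of SEQ ID NO: 16 and the last 21 nucleotides of SEQ ID NO: 11) are chosen here as assumptions for the example.

```python
from itertools import product

# Standard genetic code in TCAG order; '*' marks a stop codon.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter residue names, as used in the listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def translate(dna: str) -> list[str]:
    """Translate an in-frame coding sequence, stopping at the first stop codon."""
    seq = "".join(dna.lower().split())  # drop the spaces between 10-nt groups
    residues = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        amino = CODON_TABLE[seq[i:i + 3]]
        if amino == "*":
            break
        residues.append(THREE_LETTER[amino])
    return residues


# Illustrative fragments copied from the listing (not the full sequences).
SEQ16_START = "tcaaagctcg agaaattcac caactgttat tcgttgagca aaacactgcg gtttaaagcg"
SEQ11_END = "caggctaaga aaaagaagtg a"

print(" ".join(translate(SEQ16_START)))
print(" ".join(translate(SEQ11_END)))
```

The first fragment translates to the opening residues of SEQ ID NO: 15, and the second ends in the stop codon tga. The same arithmetic holds for the full lengths: 3681 = 3 × 1227, so SEQ ID NO: 16 encodes the 1227 residues of SEQ ID NO: 15 without a stop codon, while 4701 = 3 × 1567 for SEQ ID NO: 11, that is, 1566 codons plus the terminal stop.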

Claims (5)

1. An isolated fusion polypeptide, wherein the fusion polypeptide comprises a CRISPR nuclease and a 5'→3' exonuclease, and wherein the fusion polypeptide consists of the amino acid sequence of SEQ ID NO: 1 or SEQ ID NO: 10.
2. An isolated polynucleotide encoding the fusion polypeptide of claim 1 and consisting of the nucleotide sequence of SEQ ID NO: 2 or SEQ ID NO: 11.
3. A genome editing system comprising at least one of the following i) to v):
i) the fusion polypeptide of claim 1, and a guide RNA;
ii) an expression construct comprising a nucleotide sequence encoding the fusion polypeptide of claim 1, and a guide RNA;
iii) the fusion polypeptide of claim 1, and an expression construct comprising a nucleotide sequence encoding a guide RNA;
iv) an expression construct comprising a nucleotide sequence encoding the fusion polypeptide of claim 1, and an expression construct comprising a nucleotide sequence encoding a guide RNA;
v) an expression construct comprising a nucleotide sequence encoding the fusion polypeptide of claim 1 and a nucleotide sequence encoding a guide RNA.
4. The genome editing system of claim 3, wherein the guide RNA is an sgRNA.
5. A method of genetically modifying a cell, comprising introducing the genome editing system of claim 3 or 4 into a cell, wherein the cell is a plant cell.
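
The five alternatives i) to v) of claim 3 differ only in whether each component is supplied directly or encoded on an expression construct, and in whether the two coding sequences share one construct. The sketch below tabulates that reading. It is an illustration under stated assumptions, not part of the claims: the class `Delivery`, its field names, and the helper `constructs_needed` are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Delivery:
    """How each component of the editing system is supplied (hypothetical model)."""
    polypeptide: str               # "protein" or "construct"
    guide_rna: str                 # "rna" or "construct"
    shared_construct: bool = False  # True when one construct encodes both


# Items i) to v) of claim 3, expressed in the model above.
CLAIM_3_OPTIONS = {
    "i": Delivery("protein", "rna"),
    "ii": Delivery("construct", "rna"),
    "iii": Delivery("protein", "construct"),
    "iv": Delivery("construct", "construct"),
    "v": Delivery("construct", "construct", shared_construct=True),
}


def constructs_needed(option: Delivery) -> int:
    """Count the expression constructs a given option requires."""
    if option.shared_construct:
        return 1
    return (option.polypeptide == "construct") + (option.guide_rna == "construct")


for label, option in CLAIM_3_OPTIONS.items():
    print(label, option, constructs_needed(option))
```

Under this reading, option iv) needs two constructs, options ii), iii) and v) need one, and option i) needs none.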
CN201911351725.4A 2019-12-24 2019-12-24 Improved genome editing system Active CN113025597B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911351725.4A CN113025597B (en) 2019-12-24 2019-12-24 Improved genome editing system


Publications (2)

Publication Number Publication Date
CN113025597A CN113025597A (en) 2021-06-25
CN113025597B true CN113025597B (en) 2024-07-05

Family

ID=76452485

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911351725.4A Active CN113025597B (en) 2019-12-24 2019-12-24 Improved genome editing system

Country Status (1)

Country Link
CN (1) CN113025597B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114317492A (en) * 2021-12-06 2022-04-12 Peking University Modified artificial nuclease system and application thereof
CN116262927B (en) * 2021-12-13 2024-04-26 Institute of Microbiology of CAS Method for regulating gene expression based on CRISPR/Cas system and application thereof
CN116286741A (en) * 2022-03-03 2023-06-23 Tsinghua University Use of 5'→3' exonuclease in gene editing system, gene editing system and editing method thereof

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Predicting the mutations generated by repair of Cas9-induced double-strand breaks; Felicity Allen et al.; Nature Biotechnology; 2019-01-31; Vol. 37, No. 1; pp. 64-82 *

Also Published As

Publication number Publication date
CN113025597A (en) 2021-06-25

Similar Documents

Publication Publication Date Title
JP7429057B2 (en) Methods and compositions for sequences that guide CAS9 targeting
US11753651B2 (en) Cas9 proteins and guiding features for DNA targeting and genome editing
US20230227857A1 (en) Class ii, type v crispr systems
CN106164272B (en) Modified plants
CN107267527B (en) Method for maintaining male fertility and application thereof
Raizada et al. Somatic and germinal mobility of the RescueMu transposon in transgenic maize
US20180273961A1 (en) A CRISPR/Cas9 SYSTEM FOR HIGH EFFICIENT SITE-DIRECTED ALTERING OF PLANT GENOMES
JP2020508046A (en) Genome editing system and method
CN111263810A (en) Organelle genome modification using polynucleotide directed endonucleases
CN111742051A (en) Extended single guide RNA and uses thereof
CA3130135A1 (en) Enzymes with ruvc domains
CN113025597B (en) Improved genome editing system
CN110891965A (en) Methods and compositions for anti-CRISPR proteins for use in plants
CN107164401A (en) A kind of method and application that rice Os PIL15 mutant is prepared based on CRISPR/Cas9 technologies
CN110300802A (en) Composition and base edit methods for animal embryo base editor
CN107759676B (en) Plant amylose synthesis related protein Du15, and coding gene and application thereof
JP7361109B2 (en) Systems and methods for C2c1 nuclease-based genome editing
CN112805385B (en) Base editor based on human APOBEC3A deaminase and application thereof
WO2021004456A1 (en) Improved genome editing system and use thereof
KR20220150363A (en) Improved Cytosine Base Editing System
CN116529376A (en) Fertility-related gene and application thereof in cross breeding
CN114340656A (en) Methods and compositions for facilitating targeted genome modification using HUH endonucleases
CN107446031B (en) Plant glutelin transport and storage related protein OsVHA-E1, and coding gene and application thereof
CN110818784A (en) Application of rice gene OsATL15 in regulation of absorption and transportation of pesticides
CN112980839B (en) Method for creating new high-amylose rice germplasm and application thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant